e-Science

Advanced Scientific Computing
Modelling and Simulation
Large Scale Data Analysis
Numeric Methods and Algorithms

Advanced Scientific Computing (Sem. 1)

Learning outcomes

On successful completion of the curricular unit, students will be able to demonstrate the following knowledge,
abilities and skills:
- to characterize and evaluate in qualitative and quantitative terms the architectures of HPC systems that are
relevant to the scientific community
- to analyse, measure and evaluate the performance of computer systems in the execution of compute- and
data-intensive applications
- to develop and/or modify science/engineering computer applications aiming to improve performance and
scalability.

Syllabus

1. Sequential Computing
1.1 The Von Neumann architecture
1.2 Numerical data representation
1.3 Memory Hierarchies
1.4 Multicore architectures
1.5 Locality and data reuse
1.6 Programming strategies for high performance
1.7 Performance metrics and measurement techniques
2. Parallel Computing
2.1 Instruction Level Parallelism
2.2 Parallel computer architectures
2.3 Different types of memory access
2.4 Interconnection topologies
2.5 Granularity of parallelism
2.6 Parallel programming
2.7 Performance evaluation
2.8 Efficiency of parallel computing
2.9 Multi-threaded architectures
2.10 GPU computing
3. Case study analysis
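
As an illustration of the "Efficiency of parallel computing" topic, the sketch below evaluates speedup and efficiency for a fixed problem size. The syllabus does not name a specific model; the use of Amdahl's law here, and the function names amdahl_speedup and parallel_efficiency, are assumptions made for this example only.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Speedup predicted by Amdahl's law (an assumed model, not named in the syllabus)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be >= 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part is not accelerated; the parallel part is divided among the workers.
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def parallel_efficiency(parallel_fraction: float, workers: int) -> float:
    """Efficiency = speedup divided by the number of workers."""
    return amdahl_speedup(parallel_fraction, workers) / workers


if __name__ == "__main__":
    # Hypothetical application in which 90% of the run time can be parallelised.
    for workers in (1, 2, 4, 8, 16, 64):
        s = amdahl_speedup(0.9, workers)
        e = parallel_efficiency(0.9, workers)
        print(f"workers={workers:3d}  speedup={s:6.2f}  efficiency={e:5.2f}")
```

Under this model the speedup is bounded by the inverse of the serial fraction, which is why efficiency drops as more workers are added.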

Teaching methodologies and evaluation

Teaching methodologies:
- lectures with presentation of key concepts with examples
- class discussion of scientific papers
- lectures/talks by computational science/engineering researchers
- lab classes to analyse and solve case studies, with later open presentation and discussion
- participation in an international research internship at the University of Texas at Austin
Evaluation elements:
- a written test with several development or problem-solving questions (weight: ~40%)
- written essays on complementary subjects (weight: ~20%)
- practical lab work, including report writing and oral discussion of results (weight: ~40%).

Bibliography

• John L. Hennessy, David A. Patterson, Computer Architecture: A Quantitative Approach, 5th Ed., Morgan Kaufmann,
2011
• David Kirk and Wen-mei Hwu, Programming Massively Parallel Processors: A Hands-on Approach, Morgan
Kaufmann, 2010

Modelling and Simulation (Sem. 2)

Learning outcomes

- Explain, in their proper context, the concepts of Modelling and Simulation.
- Apply numerical methods and algorithms to a representative set of computational physics problems.
- Discuss the complexity of the algorithms and their efficient implementation in a parallel programming environment,
using heterogeneous platforms (multicore CPU, multicore CPU-GPU, etc.).
- Solve three case studies of controlled complexity, using Monte Carlo methods, numerical solution of PDEs
and Fourier analysis.
- Understand and discuss the complexity of the algorithms, from the point of view of an efficient implementation, in an e-Science perspective.
- Understand and discuss the issues that Modelling and Simulation seek to solve, from the viewpoint of the
physical meaning of the solutions.
- Acquire, through example, the ability to work in a team in a multidisciplinary environment.

Syllabus

1. Computational Physics and e-Science: definitions and issues.
Models, simulation and computational experiments; dimensionless variables; consequences of the internal
representation and of the algorithms.
2. Monte Carlo: random number generation. Deterministic randomness. Applications with a thermodynamics
model. Sampling methods. Applications in diffusion problems.
3. Partial Differential Equations (PDEs): electrostatic potentials and wave propagation. Maxwell:
EM waves and the FDTD algorithm.
4. Diffusion: numerical integration of the diffusion equation and boundary conditions.
5. Fourier analysis of nonlinear oscillations. FFT algorithm. Fourier series solution of the Laplace equation.
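
The Monte Carlo topics above (deterministic randomness, applications in diffusion problems) can be illustrated with a one-dimensional random walk. This is a minimal sketch and not a course case study: the function name mean_squared_displacement, the default seed and the unit step length are assumptions. A seeded pseudo-random generator makes the "random" experiment exactly reproducible, and the mean squared displacement of unbiased walkers grows linearly with the number of steps, as expected for diffusion.

```python
import random


def mean_squared_displacement(steps: int, walkers: int, seed: int = 12345) -> float:
    """Average squared end position of unbiased 1D random walkers (unit steps)."""
    rng = random.Random(seed)  # seeded generator: the same seed gives the same sequence
    total = 0
    for _ in range(walkers):
        position = 0
        for _ in range(steps):
            # Step left or right with equal probability.
            position += 1 if rng.random() < 0.5 else -1
        total += position * position
    return total / walkers


if __name__ == "__main__":
    for steps in (10, 100, 1000):
        msd = mean_squared_displacement(steps, walkers=2000)
        print(f"steps={steps:5d}  <x^2>={msd:8.1f}  (theory: {steps})")
```

Running the script twice prints identical numbers, which is the practical meaning of deterministic randomness; changing the seed gives a different but statistically equivalent sample.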

Teaching methodologies and evaluation

Contents will be taught in lectures, based on oral and written communication, and through the solution of at least three
case studies, carried out in teams of 3 to 4 students. Support material will be provided through the e-learning platform.
Student assessment will include public presentation and private discussion of the teamwork case
studies.

Bibliography

[1] R.H. Landau, M.J. Paez, C.C. Bordeianu, A Survey of Computational Physics: Introductory Computational
Science, Princeton University Press, 2008.
[2] P.O.J. Scherer, Computational Physics: Simulation of Classical and Quantum Systems, Springer, 2010.
[3] D. Frenkel, B. Smit, Understanding Molecular Simulation: From Algorithms to Applications, Academic Press,
2002.
[5] M. Quinn, Parallel Programming in C with MPI and OpenMP, McGraw-Hill, 2003.
[6] D.B. Kirk, W.W. Hwu, Programming Massively Parallel Processors: A Hands-on Approach, NVIDIA, Morgan
Kaufmann, 2010.
[7] William H. Press, Saul A. Teukolsky, William T. Vetterling, Brian P. Flannery, Numerical Recipes: The Art of
Scientific Computing, 3rd Ed., Cambridge University Press, 2007.

Large Scale Data Analysis (Sem. 1)

Learning outcomes

A student who successfully completes this curricular unit (UC) must be able to demonstrate the following
competences:
• Identify existing techniques for storing and processing data at large scale
• Apply knowledge extraction techniques to continuous data streams
• Use tools / libraries for common processing and analysis of large-scale and continuous data
• Develop / implement algorithms for (parallel) processing of distributed data

Syllabus

• Concepts of knowledge extraction, machine learning and data mining: clustering, association rules,
classification and regression models, decision tree and rule induction algorithms, artificial neural networks, support
vector machines, evaluation and error estimation of learned models, meta-learning: model selection, model
ensembles.
• Mining data streams: stream processing, data aggregation techniques, stream clustering, detection of
changes in streams; applications to continuous data (e.g. financial markets)
• Mining graphs and networks
• Systems for large-scale data storage: storage of unstructured data, data grids, cloud computing and storage;
parallel / distributed data processing
• Case study analysis of large-scale scientific data: data analysis in computational biology; financial data
analysis
• Data visualization: principles and applications

Teaching methodologies and evaluation

The UC is taught in a weekly 2-hour theoretical session and a 1-hour laboratory session.
The theoretical sessions cover the fundamentals and concepts, primarily through theoretical
exposition.
The laboratory sessions consolidate the knowledge acquired in the theoretical sessions through
exercises, in which students design and implement the most common algorithms for the analysis
and processing of scientific data. The laboratory sessions also serve to answer students'
questions.
Evaluation is through a practical assignment.

Bibliography

Data Mining: Practical Machine Learning Tools and Techniques, 3rd Ed., Ian Witten & Eibe Frank, Morgan
Kaufmann, 2011.
Knowledge Discovery from Data Streams, J. Gama, Chapman & Hall/CRC Press, 2010.
Scientific Data Mining and Knowledge Discovery: Principles and Foundations, Mohamed Medhat Gaber,
Springer, 2009.

Numeric Methods and Algorithms (Sem. 2)

Learning outcomes

A student who successfully completes this curricular unit (UC) must be able to demonstrate the following
competences:
• Identify the theoretical mathematical and numerical methods studied
• Identify the strengths and weaknesses of the methods at different levels (complexity, numerical robustness, etc.)
• Develop sequential and parallel implementations, use numerical libraries and discuss the performance and
efficiency obtained
• Identify, in parallel numerical algorithms, possible problems of load balancing and / or high cost of
communication between the computing elements

Syllabus

• Monte Carlo methods: sequential and parallel random number generators, applications of the Monte
Carlo method
• Matrix multiplication: sequential and parallel implementations in the message-passing model
• Systems of linear equations: Gaussian elimination, iterative methods, convergence analysis, sequential and
parallel implementations
• Finite difference method: numerical solution of differential equations that arise in classic
problems (vibration of a string, heat diffusion)
• Fast Fourier Transform: Fourier analysis, the discrete Fourier transform (DFT) and the FFT, sequential and parallel
algorithms.
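
The finite difference topic can be illustrated with the heat diffusion problem mentioned above. The sketch below is a sequential example under stated assumptions: it uses the explicit forward-time, centred-space (FTCS) scheme with fixed boundary values, which the syllabus does not prescribe, and the function name ftcs_heat and the grid parameters are hypothetical.

```python
import math


def ftcs_heat(u0, diffusivity, dx, dt, steps):
    """Advance u_t = D * u_xx with the explicit FTCS scheme; end values stay fixed."""
    r = diffusivity * dt / (dx * dx)
    if r > 0.5:
        raise ValueError("unstable time step: need D*dt/dx^2 <= 0.5")
    u = list(u0)
    for _ in range(steps):
        new = u[:]  # copy so the boundary values are carried over unchanged
        for i in range(1, len(u) - 1):
            # Centred second difference in space, forward difference in time.
            new[i] = u[i] + r * (u[i + 1] - 2.0 * u[i] + u[i - 1])
        u = new
    return u


if __name__ == "__main__":
    n, diffusivity = 51, 1.0
    dx = 1.0 / (n - 1)
    dt = 0.4 * dx * dx
    steps = 500
    # Initial profile sin(pi x) on [0, 1]; the exact solution decays as exp(-pi^2 D t).
    u0 = [math.sin(math.pi * i * dx) for i in range(n)]
    u = ftcs_heat(u0, diffusivity, dx, dt, steps)
    exact = math.exp(-math.pi ** 2 * diffusivity * steps * dt)
    print(f"numerical midpoint: {u[n // 2]:.5f}   exact: {exact:.5f}")
```

The inner loop updates each grid point from its two neighbours only, so in a parallel version each process would need just the boundary values of adjacent subdomains at every step.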

Teaching methodologies and evaluation

The UC is taught in a weekly 2-hour theoretical session and a 1-hour laboratory session.
The theoretical sessions cover the fundamentals and concepts, primarily through theoretical
exposition.
The laboratory sessions consolidate the knowledge acquired in the theoretical sessions through
exercises, in which students implement common numeric methods. The laboratory sessions also
serve to answer students' questions.
Evaluation is through a practical assignment and an exam.

Bibliography

Parallel Scientific Computing in C++ and MPI: A Seamless Approach to Parallel Algorithms and their
Implementation, George Em Karniadakis and Robert M. Kirby II, Cambridge University Press, 2003
Matrix Computations, G. Golub, C. F. Van Loan, 3rd. Ed., John Wiley & Sons, 1996