# MAC Summer Workshop 2012

## General

The MAC Summer Workshop 2012 will take place in conjunction with the 11th Symposium on Parallel and Distributed Computing (ISPDC 2012) in Munich/Garching on June 27, 2012. It features presentations and demos given to a broader audience by MAC students, with a focus on research done in MAC-affiliated projects. It is also a place for exchanging ideas and strengthening collaboration within the MAC community.

## Program

## Talk Abstracts

A flexible visualization environment is an excellent place to bring researchers together in the context of the Virtual Arabia project. Multi-screen, high-resolution visualization spaces already enable users to immerse themselves in worlds of artificial data, either pre-recorded or simulated. Flexibility in such a system makes it possible to adjust the presentation space to the specific needs of researchers: single users can configure it as a CAVE-like setup, larger groups can use a power-wall-like setup, and decoupled building blocks support different views of the same data set. The FRAVE, a flexibly reconfigurable CAVE, is such a system. As researchers in user interfaces, we use it as a basis for developing new interaction metaphors for virtual environments. A major part of our work in this field is the development of integrated navigation techniques. We present our ongoing work towards a maneuvering system that smoothly integrates three different metaphors: an airplane, a steering-wheel, and a window-frame metaphor.

### P. Maier, E. Artinger: Inspecting simulation data with Augmented Reality and multi-touch devices

In most disciplines, simulation data is essential for solving hard problems. To get the most out of simulation data, it has to be inspected in various ways. Changing visualization parameters is cumbersome, however, because one first has to set the values and can only then see the resulting visualization. Both Augmented Reality and multi-touch make it possible to change the values online while viewing the data; a slice plane, for example, could be placed directly using either technique. The presentation shows work on how to manipulate and inspect 3D objects using multi-touch and Augmented Reality.

### M. Treib: Interactive Editing of GigaSample Terrain Fields

Previous terrain rendering approaches have addressed data compression and fast decoding for rendering, but applications where the terrain is repeatedly modified and needs to be buffered on disk have not been considered so far. Such applications require both decoding and encoding to be faster than disk transfer. We present a novel approach for editing gigasample terrain fields at interactive rates and high quality. To achieve high decoding and encoding throughput, we employ a compression scheme for height and pixel maps based on a sparse wavelet representation. On recent GPUs it can encode and decode up to 270 and 730 MPix/s of color data, respectively, at compression rates and quality superior to JPEG, and it achieves more than twice these rates for lossless height field compression. The construction and rendering of a height field triangulation is avoided by using GPU ray-casting directly on the regular grid underlying the compression scheme. We show the efficiency of our method for interactive editing and continuous level-of-detail rendering of terrain fields comprising several hundred gigasamples.
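
To illustrate why a sparse wavelet representation can compress height data, the following minimal sketch transforms a 1D height profile with a Haar wavelet, drops small detail coefficients, and reconstructs the profile. It is not the GPU codec described in the abstract: the Haar basis, the function names, the threshold, and the sample heights are all assumptions chosen for illustration.

```python
def haar_forward(signal):
    """Multi-level Haar transform of a list whose length is a power of two."""
    coeffs = [float(v) for v in signal]
    n = len(coeffs)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    while n > 1:
        half = n // 2
        # Pairwise averages carry the coarse shape, differences the detail.
        averages = [(coeffs[2 * i] + coeffs[2 * i + 1]) / 2 for i in range(half)]
        details = [(coeffs[2 * i] - coeffs[2 * i + 1]) / 2 for i in range(half)]
        coeffs[:n] = averages + details
        n = half
    return coeffs


def haar_inverse(coeffs):
    """Invert haar_forward by rebuilding each finer level from averages and details."""
    out = list(coeffs)
    n = 1
    while n < len(out):
        merged = []
        for avg, det in zip(out[:n], out[n:2 * n]):
            merged.extend((avg + det, avg - det))
        out[:2 * n] = merged
        n *= 2
    return out


def sparsify(coeffs, threshold):
    """Zero every detail coefficient below the threshold (lossy step).

    coeffs[0] is the overall average and is always kept.
    """
    return [coeffs[0]] + [c if abs(c) >= threshold else 0.0 for c in coeffs[1:]]


# Hypothetical height samples: two plateaus with small local variation.
heights = [10, 12, 11, 13, 50, 52, 51, 49]
sparse = sparsify(haar_forward(heights), threshold=2.0)
approx = haar_inverse(sparse)
nonzero = sum(1 for c in sparse if c != 0.0)
max_error = max(abs(a - h) for a, h in zip(approx, heights))
```

With a threshold of zero the transform is lossless, which mirrors the lossless height field case in spirit; raising the threshold trades reconstruction error for fewer nonzero coefficients to store.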

### S. Auer, D. Butnaru, J. Kneževic, N. Narayan, M. Hamman de Vaal: A Scalable Infrastructure for Computational Steering

To fully harvest the potential of advanced computing technologies, one increasingly tries to put domain experts and their capabilities into the centre of computational analysis and simulation. The most efficient workflow for these experts would allow them to interactively make changes to a running simulation and subsequently obtain immediate feedback, i.e., to use computational steering. This instantaneous feedback is crucial, since it enables the expert to intuitively follow the exact outcomes of a specific intervention. Especially as complex simulations with run-times of days or weeks become more efficient, it is important to develop sophisticated computational steering strategies that require only minimal increases to the existing computational effort and integrate seamlessly into existing codes. Approaches that reuse information from already computed simulations are also of interest. We highlight some specific problem areas in computational steering, present some generic concepts for addressing these problems, and show application examples that demonstrate the current state of our research on this topic as well as future challenges.

### A. Matveev, A. Nikodem, M. Roderus, T. Soini, N. Rösch: Efficient parallel strategies for the density functional method in computational chemistry

Current implementations of first-principles electronic structure methods for molecules, clusters, and surface models do not admit a homogeneous parallelisation strategy due to the intrinsic diversity of structures and algorithms involved. We designed a specialized scheduler for the blocked eigenvalue problem and a generic task scheduling framework for use in parallel algorithms that admit dynamic load balancing. We reformulated the scheduling problem in the real-space grid integration and in the evaluation of analytic integrals involving two or three atomic shells. The same framework is also used in the newly developed library for two-electron integrals. For the derivatives of the density functionals, forward-mode automatic differentiation is used. For applications of relativistic quantum chemistry, we designed a programming interface for distributed dense and diagonal matrices with a set of linear algebra operators and domain-specific functions that make use of parallel architectures. To explore potential energy surfaces, we implemented a Python framework for property calculations and for path and transition state search algorithms that processes distinct molecular geometries in parallel. We extended a Scheme interpreter with MPI-based communication and process-group management primitives and embedded it into a quantum chemical application to allow for coarse-grained parallelization.
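
Forward-mode automatic differentiation, mentioned above for the derivatives of density functionals, can be sketched with dual numbers: each value carries its derivative along, and every arithmetic operation applies the corresponding differentiation rule. The `Dual` class, the `derivative` helper, and the choice of the local-density exchange energy density as a test function are assumptions made for this illustration and do not describe the authors' implementation.

```python
import math


class Dual:
    """A value together with its derivative with respect to one input."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def _lift(x):
        # Plain numbers are constants, so their derivative is zero.
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        other = Dual._lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual._lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__

    def __pow__(self, exponent):
        # Power rule for a constant real exponent.
        return Dual(self.value ** exponent,
                    exponent * self.value ** (exponent - 1) * self.deriv)


def derivative(f, x):
    """Evaluate df/dx at x by seeding the input derivative with 1."""
    return f(Dual(x, 1.0)).deriv


# Illustrative functional: exchange energy density of the local density
# approximation, e_x(rho) = -C_X * rho**(4/3).
C_X = 0.75 * (3.0 / math.pi) ** (1.0 / 3.0)


def lda_exchange(rho):
    return -C_X * rho ** (4.0 / 3.0)


rho0 = 0.5
potential = derivative(lda_exchange, rho0)            # d e_x / d rho via AD
analytic = -(4.0 / 3.0) * C_X * rho0 ** (1.0 / 3.0)   # hand-derived check
```

The appeal of this approach is that the derivative comes from the same code that evaluates the functional, so no separate hand-derived expression has to be maintained.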

### D. Büttner: Overlapping Communication and Computation in MPI+OpenMP

With the number of cores growing faster than memory per node, hybrid programming models become a requirement for efficient use of HPC systems. In this scenario, achieving efficient communication is challenging, even when using asynchronous communication, because most MPI implementations can only advance communication inside library calls. In the presented work, we propose to move communication into a new type of OpenMP task, which gets scheduled as part of the regular OpenMP work pool. On two large HPC systems, we show for compute-intensive iterative stencil algorithms that this provides truly asynchronous communication and reduces waiting times drastically. Without complicating the programming interface, our results show excellent performance independent of the communication-to-computation ratio. In contrast, MPI and the classical MPI+OpenMP approaches fail if no optimized domain decomposition is used.

### S. Wallentowitz: OpTiMSoC - Platform for Tiled Manycore Architectures

Tiled manycore architectures are becoming increasingly popular in research and industry. In such platforms, a Network-on-Chip with a regular mesh topology connects so-called tiles of processing elements, memory, I/O interfaces, or hardware accelerators. Such platforms allow building blocks to be replicated and therefore increase design productivity. Research on hardware optimizations of such platforms depends on the availability of implementations of all necessary elements or is bound to software simulations. OpTiMSoC aims to provide a basic library of open source hardware and a tool set to efficiently prototype different platform layouts. This talk sketches the basic goals of the project, gives an overview of the current status and prototyping platforms, and finally presents the roadmap for the near future of OpTiMSoC.

### M. Simon, M. Ulbrich: Adjoint Based Optimal Control of Partially Miscible Two-Phase Flow in Porous Media with Applications to CO2 Sequestration in Underground Reservoirs

With the target of optimizing CO2 sequestration in underground reservoirs, we investigate constrained optimal control problems with partially miscible two-phase flow in porous media. The objective is to maximize the amount of trapped CO2 in an underground reservoir after a fixed period of CO2 injection, while time-dependent injection rates in multiple wells are used as control parameters. The optimal control problem is addressed using an adjoint approach. For the discretization of the multiphase Darcy flow system we use a variant of the BOX method, a locally conservative control-volume FE method. The timestep-wise Lagrangian of the control problem is implemented as a variational form in Sundance, a toolbox for rapid development of parallel FE simulations, part of the HPC framework Trilinos. We discuss the advantages of our Sundance implementation strategies. The MPI parallelized state and adjoint solvers are linked to the interior point optimization package IPOPT. Finally, we present several optimal control results and discuss the optimal injection strategies.

The micro-scale simulation of CO2 sequestration involves complex, porous-like geometries. For the generation of such geometries, we have developed two approaches: in 2D, we construct a fractured domain from channel networks; in 3D, we approximate sand-grain-like scenarios by dense sphere packings. The flow through these structures is simulated with the incompressible Navier-Stokes solver of the PDE framework Peano. Using an upscaling scheme, the micro-scale results are used as input data for a Darcy solver on the macro-scale. Recently, we have parallelized the fluid solver and extended the overall framework such that we are able to compute the full permeability tensor of a given domain. Currently, we are setting up a number of scenarios to investigate the parallel performance of our code on an HPC scale and to study several flow phenomena in porous structures.
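
For orientation, the link between the micro-scale flow and the macro-scale Darcy solver can be written as Darcy's law with a full permeability tensor. The notation below is an assumption introduced here, not taken from the abstract: $\langle \mathbf{u} \rangle$ is the volume-averaged velocity, $\mu$ the dynamic viscosity, $\mathbf{K}$ the permeability tensor, and $\langle p \rangle$ the averaged pressure.

```latex
\langle \mathbf{u} \rangle = -\frac{1}{\mu}\, \mathbf{K}\, \nabla \langle p \rangle ,
\qquad
\mathbf{K} =
\begin{pmatrix}
K_{xx} & K_{xy} & K_{xz} \\
K_{yx} & K_{yy} & K_{yz} \\
K_{zx} & K_{zy} & K_{zz}
\end{pmatrix}
```

One common way to obtain all entries, which may differ from the upscaling scheme used in this work, is to impose a mean pressure gradient along each coordinate axis in turn and read one column of $\mathbf{K}$ from the resulting averaged velocity.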

### Quanji Cai, S. Kooshapur, M. Manhart, E. Rank, A. Springer, B. Vexler: Numerical simulation of transport in porous media: from micro to macro scale

Transport in porous media is governed by three processes: advective transport by the macroscopic flow field, molecular diffusion, and mechanical dispersion due to the randomness of the individual streamlines through the pore space. A direct numerical simulation of the flow and transport on the pore scale can predict all of these processes directly without any modelling. Due to computational limitations, this approach is feasible only for very small elementary volumes. On larger scales, the underlying equations are upscaled, which involves models for mechanical dispersion. While the underlying equations for mass transport in porous media have similar forms on the micro- and macro-scales, the numerical challenges are different. On the micro-scale, the main difficulties are the representation of the complicated pore space and the solution of the Navier-Stokes equations in it. On the macro-scale, the spatially averaged concentration profiles show strong spatial gradients, which have to be resolved by suitable numerical schemes. Large variations in the parameters of the equations, such as conductivity, pose additional problems for the numerical solver. In some cases, the parameters are even unknown. The presentation shows some examples of how these difficulties are treated in pore space and macro-scale computations.
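
The three processes named above appear as separate terms in the standard advection-dispersion equation, given here as a reference form rather than as the exact model used in the talk. The symbols are assumptions of this note: $c$ is the concentration, $\mathbf{u}$ the velocity field, $D_m$ the molecular diffusion coefficient, and $\mathbf{D}_{\mathrm{mech}}$ the mechanical dispersion tensor.

```latex
\frac{\partial c}{\partial t}
+ \nabla \cdot (\mathbf{u}\, c)
= \nabla \cdot \left( \mathbf{D}\, \nabla c \right),
\qquad
\mathbf{D} = D_m\, \mathbf{I} + \mathbf{D}_{\mathrm{mech}}
```

On the pore scale, $\mathbf{u}$ is the resolved Navier-Stokes velocity and $\mathbf{D}_{\mathrm{mech}}$ vanishes, because dispersion emerges from the resolved streamlines. On the macro-scale, $c$ and $\mathbf{u}$ are spatial averages and $\mathbf{D}_{\mathrm{mech}}$ has to be modelled.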

### C. Böhm, M. Ulbrich: An Adaptive Semismooth Newton-CG Method for Constrained Parameter Identification in Seismic Tomography

Seismic tomography infers the material structure of the Earth's subsurface from observations of earthquakes. The propagation of seismic waves is governed by a coupled system of the acoustic and the elastic wave equation. We present a semismooth Newton-CG method for 3D full-waveform seismic inversion. We apply Moreau-Yosida regularization to include box constraints on the parameters and use a trust-region globalization for the penalized problem. The ill-posedness of the problem is addressed by inverting sequentially for increasing frequencies, combined with a goal-oriented adaptive refinement of the parameter grid. The implementation relies on the adjoint-based computation of the gradient and of Hessian-vector products of the reduced problem, a preconditioned CG method to solve the Newton system in a matrix-free fashion, and MPI parallelization to tackle large-scale inverse problems. We show numerical results for the application of our method to a dataset from marine geophysical exploration in the North Sea.
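
In its usual textbook form, Moreau-Yosida regularization replaces box constraints $m_a \le m \le m_b$ on the parameters by quadratic penalty terms. The formulation below is that generic form, with notation assumed for this note ($J$ for the reduced misfit functional, $m$ for the material parameters, $\gamma > 0$ for the penalty parameter); it is not copied from the authors' work.

```latex
\min_{m} \; J_\gamma(m)
= J(m)
+ \frac{\gamma}{2} \left\| \max(0,\, m - m_b) \right\|_{L^2}^2
+ \frac{\gamma}{2} \left\| \max(0,\, m_a - m) \right\|_{L^2}^2
```

The gradient of the penalty terms contains the pointwise $\max$ operator, which is not differentiable in the classical sense but is semismooth. This is what makes a semismooth Newton method a natural fit for the penalized problem.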

### J. Benk: Immersed Boundary Methods for Fluid-Structure Interaction and Shape Optimization within a FEM-based PDE Toolbox

One of the main challenges in a classical mesh-based FEM approach is the representation of complex boundary geometries. This challenge is often tackled by a computationally costly mesh generation process, where the facets of the resulting mesh represent the boundary. An alternative, which we employ here, is the immersed boundary (IB) approach. It instead uses a computationally cheaper structured adaptive Cartesian mesh and an explicit boundary representation, so that the challenge mainly lies in imposing the boundary conditions (BCs) on the mesh cells intersected by the geometry. One such IB method is Nitsche's method, which we employ here for fluid-structure interaction (FSI) and shape optimization problems.

The simulation of such complex physical systems modeled by PDEs requires a combination of sophisticated numerical methods. Implementing FEM-based simulation software that computes the solution of a particular PDE often requires the reuse of existing methods. In order to make our approach publicly available and to demonstrate its modularity, we integrated our IB methods into Sundance, an existing FEM-based PDE toolbox within the Trilinos project.

### Yang Li: Requirements Engineering for Scientific Computing

Requirements engineering is crucial to the success of software development. It supports downstream software development activities such as design and implementation. However, in many scientific computing projects, traditional requirements engineering practices are ignored. We claim that there is a need for methodologies that help capture and manage requirements for these projects, so that scientific software can be developed collaboratively and with greater interoperability. We propose a model-based approach to elicit and manage requirements in scientific computing projects. The proposed approach is based on a meta-model in order to deal with the high complexity and frequent change in scientific software development. The meta-model also provides abstractions and notations targeted at scientific computing projects. The approach supports requirements engineering in these projects by making it easy to manage requirements versioning, traceability, and communication across disciplinary boundaries.