ESO - European Southern Observatory

Miscellaneous Information

Abstract Reference: 30798
Identifier: O12.3
Presentation: Oral communication
Key Theme: 3 New Trends in HPC and Distributed Computing 

HPC Development for the ALMA Pipeline

Castro Sandra, Gonzalez Villalba Justo, Taylor Julian, Bhatnagar Sanjay, Caillat Michel, Ford Pam, Kumar Golap, Jacobs Jim, Kern Jeff, Loveland Susan, Mehringer David, Moellenbrock George, Petry Dirk, Pokorny Martin, Rao Urvashi, Schiebel Darrell, Suoranta Ville, Tsutsumi Takahiro, Sugimoto Kanako, Kawasaki Wataru

CASA, the Common Astronomy Software Applications package, has the primary goal of supporting the data processing needs of ALMA and the VLA. The parallelisation framework implemented in CASA uses MPI, the Message Passing Interface, which is accessible at run time through a wrapper of the MPI executor called “mpicasa”. We use MPI Python bindings to control the parallelisation of high-level CASA tasks and will soon start to use MPI C bindings for specific low-level C++ parts of CASA.

Parallelisation in CASA is achieved by partitioning the input MeasurementSet (MS) into several pieces that are virtually concatenated. Once the data is partitioned into a so-called Multi-MS, the parallelised CASA tasks detect it automatically and, if a cluster is available, send sub-tasks to the cluster nodes using MPI.
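The partition-then-dispatch pattern described above can be illustrated with a minimal Python sketch. This is not CASA code and does not use MPI: a thread pool from the standard library stands in for the cluster nodes so that the example runs anywhere, and the names `partition`, `process_sub_ms`, and `run_parallel` are hypothetical, as is the choice of a list of row indices to represent the input MS.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n_parts):
    """Split a list of rows into at most n_parts contiguous, near-equal chunks."""
    if n_parts < 1:
        raise ValueError("n_parts must be >= 1")
    size, extra = divmod(len(rows), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        # The first `extra` chunks take one additional row each.
        stop = start + size + (1 if i < extra else 0)
        if stop > start:  # skip empty chunks when rows < n_parts
            chunks.append(rows[start:stop])
        start = stop
    return chunks


def process_sub_ms(chunk):
    """Hypothetical sub-task standing in for a task run on one piece of the data."""
    return sum(chunk)


def run_parallel(rows, n_workers):
    """Send one sub-task per chunk to the workers and gather results in chunk order."""
    chunks = partition(rows, n_workers)
    # Fall back to serial execution when no parallel resources are requested
    # or there is at most one chunk (assumed analogue of "no cluster available").
    if n_workers <= 1 or len(chunks) <= 1:
        return [process_sub_ms(c) for c in chunks]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(process_sub_ms, chunks))


if __name__ == "__main__":
    print(run_parallel(list(range(10)), n_workers=3))
```

In this sketch the caller never handles the individual chunks, which loosely mirrors how a partitioned data set can still be treated as a single logical input while the work is distributed behind the scenes.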

The ALMA pipeline will soon start using the CASA parallelisation framework to create science-ready data products for users. In this talk, I will show details of the Tier approaches used in the parallelisation of the pipeline and give preliminary performance numbers for parallel processing in the ALMA pipeline.