Charm++ Explained

Charm++
Paradigm:	Message-driven parallel programming, migratable objects, object-oriented, asynchronous many-tasking
Designer:	Laxmikant Kale
Developer:	Parallel Programming Laboratory
Latest Release Version:	7.0.0
Programming Language:	C++, Python
Platform:	Cray XC, XK, XE, IBM Blue Gene/Q, InfiniBand, TCP, UDP, MPI, OFI
Operating System:	Linux, Windows, macOS
Website:	http://charmplusplus.org

Charm++ is a parallel object-oriented programming paradigm based on C++ and developed in the Parallel Programming Laboratory at the University of Illinois at Urbana–Champaign. Charm++ is designed with the goal of enhancing programmer productivity by providing a high-level abstraction of a parallel program while at the same time delivering good performance on a wide variety of underlying hardware platforms. Programs written in Charm++ are decomposed into a number of cooperating message-driven objects called chares. When a programmer invokes a method on an object, the Charm++ runtime system sends a message to the invoked object, which may reside on the local processor or on a remote processor in a parallel computation. This message triggers the execution of code within the chare to handle the message asynchronously.

Chares may be organized into indexed collections called chare arrays and messages may be sent to individual chares within a chare array or to the entire chare array simultaneously.
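The message-driven model described above can be illustrated without the Charm++ toolchain. The following is a minimal sketch in standard C++ only, not Charm++ code: `Greeter`, `ToyRuntime`, `send`, `broadcast`, and `run` are hypothetical names invented for this illustration. It shows the one idea that matters here: invoking a method on an element of an indexed collection only enqueues a message, and a scheduler later delivers it to the target object.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// A toy stand-in for a chare: an object that only reacts to delivered messages.
struct Greeter {
    int index = 0;
    std::vector<std::string> log;  // records every message this object handled

    void sayHi(int from) {
        log.push_back("hi from " + std::to_string(from));
    }
};

// A toy single-threaded scheduler standing in for the runtime system.
// Invoking a method only enqueues a message; run() delivers messages later.
class ToyRuntime {
public:
    explicit ToyRuntime(std::size_t n) : elements_(n) {
        for (std::size_t i = 0; i < n; ++i) {
            elements_[i].index = static_cast<int>(i);
        }
    }

    // Asynchronous send to a single element of the indexed collection.
    void send(std::size_t target, int from) {
        queue_.push_back([this, target, from] { elements_.at(target).sayHi(from); });
    }

    // Broadcast: enqueue one message for every element of the collection.
    void broadcast(int from) {
        for (std::size_t i = 0; i < elements_.size(); ++i) {
            send(i, from);
        }
    }

    // Deliver queued messages in FIFO order; returns how many were delivered.
    std::size_t run() {
        std::size_t delivered = 0;
        while (!queue_.empty()) {
            std::function<void()> msg = std::move(queue_.front());
            queue_.pop_front();
            msg();
            ++delivered;
        }
        return delivered;
    }

    const Greeter& element(std::size_t i) const { return elements_.at(i); }
    std::size_t pending() const { return queue_.size(); }

private:
    std::vector<Greeter> elements_;
    std::deque<std::function<void()>> queue_;
};
```

In this sketch, calling `send(2, 0)` followed by `broadcast(7)` on a four-element collection leaves five messages pending and no handler executed until `run()` is called. A real runtime would additionally route each message to whichever processor currently hosts the target object.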

The chares in a program are mapped to physical processors by an adaptive runtime system. The mapping of chares to processors is transparent to the programmer, and this transparency permits the runtime system to dynamically change the assignment of chares to processors during program execution to support capabilities such as measurement-based load balancing, fault tolerance, automatic checkpointing, and the ability to shrink and expand the set of processors used by a parallel program.
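To make measurement-based load balancing concrete, here is a minimal sketch of one simple strategy, a greedy reassignment, written in standard C++. It is an illustration under stated assumptions rather than the algorithm or API used by the Charm++ runtime: the function names `greedyAssign` and `maxProcLoad` are hypothetical, and the per-object loads are assumed to have been measured during earlier execution.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Assign each object to a processor, heaviest objects first, always choosing
// the processor with the least load assigned so far.
// loads[i] is the measured cost of object i; result[i] is its processor.
std::vector<std::size_t> greedyAssign(const std::vector<double>& loads,
                                      std::size_t numProcs) {
    assert(numProcs > 0 && "need at least one processor");

    // Visit objects in order of decreasing measured load.
    std::vector<std::size_t> order(loads.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::stable_sort(order.begin(), order.end(),
                     [&](std::size_t a, std::size_t b) { return loads[a] > loads[b]; });

    std::vector<std::size_t> placement(loads.size(), 0);
    std::vector<double> procLoad(numProcs, 0.0);
    for (std::size_t obj : order) {
        // min_element returns the first least-loaded processor on ties.
        auto least = std::min_element(procLoad.begin(), procLoad.end());
        std::size_t p = static_cast<std::size_t>(least - procLoad.begin());
        placement[obj] = p;
        procLoad[p] += loads[obj];
    }
    return placement;
}

// Largest total load on any single processor under a given placement.
double maxProcLoad(const std::vector<double>& loads,
                   const std::vector<std::size_t>& placement,
                   std::size_t numProcs) {
    std::vector<double> procLoad(numProcs, 0.0);
    for (std::size_t i = 0; i < loads.size(); ++i) {
        procLoad.at(placement[i]) += loads[i];
    }
    return procLoad.empty() ? 0.0 : *std::max_element(procLoad.begin(), procLoad.end());
}
```

Because the programmer never names a processor when sending a message, a runtime is free to apply a new placement like this between phases of the computation and migrate the affected objects.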

Applications implemented using Charm++ include NAMD (molecular dynamics), OpenAtom (quantum chemistry), ChaNGa and SpECTRE (astronomy), EpiSimdemics (epidemiology), Cello/Enzo-E (adaptive mesh refinement), and ROSS (parallel discrete event simulation). All of these applications have scaled up to a hundred thousand cores or more on petascale systems.

Adaptive MPI (AMPI)^[1] is an implementation of the Message Passing Interface standard on top of the Charm++ runtime system and provides the capabilities of Charm++ in a more traditional MPI programming model. AMPI encapsulates each MPI process within a user-level migratable thread that is bound within a Charm++ object. By embedding each thread in a chare, AMPI programs can automatically take advantage of the features of the Charm++ runtime system with few or no changes to the MPI program.

Charm4py allows writing Charm++ applications in Python, supporting migratable Python objects and asynchronous remote method invocation.

Example

Here is some Charm++ code for demonstration purposes:^[2]

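Header file (hello.h)

class Hello : public CBase_Hello {
 public:
  Hello();               // constructor
  void sayHi(int from);  // entry method invoked through messages
};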

Charm++ Interface file (hello.ci)

module hello {
  array [1D] Hello {
    entry Hello();
    entry void sayHi(int);
  };
};

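Source file

#include "hello.decl.h"
#include "hello.h"

// Globals defined elsewhere, in the main module.
extern CProxy_Main mainProxy;
extern int numElements;

Hello::Hello() {}

void Hello::sayHi(int from) {
  // Announce this chare, then pass the greeting to the next array element.
  CkPrintf("Hello from chare %d on processor %d (told by %d)\n",
           thisIndex, CkMyPe(), from);
  if (thisIndex < numElements - 1) {
    thisProxy[thisIndex + 1].sayHi(thisIndex);
  } else {
    mainProxy.done();  // assumes the main chare declares an entry method done()
  }
}

#include "hello.def.h"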

Adaptive MPI (AMPI)

Adaptive MPI is an implementation of MPI (like MPICH, Open MPI, MVAPICH, etc.) on top of Charm++'s runtime system. Users can take existing MPI applications, recompile them with AMPI's compiler wrappers, and begin experimenting with process virtualization, dynamic load balancing, and fault tolerance. AMPI implements MPI "ranks" as user-level threads rather than operating system processes. Context switches between these threads are fast, so several ranks can be co-scheduled on the same core, each running when messages are available for it. AMPI ranks, and all the data they own, can also migrate at runtime across the cores and nodes of a job, which is useful for load balancing and for checkpoint/restart-based fault tolerance schemes. For more information on AMPI, see the manual: http://charm.cs.illinois.edu/manuals/html/ampi/manual.html

Charm4py

Charm4py^[3] is a Python parallel computing framework built on top of the Charm++ C++ runtime, which it uses as a shared library. Charm4py simplifies the development of Charm++ applications and streamlines parts of the programming model. For example, there is no need to write interface files (.ci files) or to use Structured Dagger (SDAG) notation, and programs do not have to be compiled. Users are still free to accelerate their application-level code with technologies like Numba. Standard ready-to-use binary versions can be installed on Linux, macOS and Windows with pip.

It is also possible to write hybrid Charm4py and MPI programs.^[4] An example of a supported scenario is a Charm4py program using mpi4py libraries for specific parts of the computation.

Notes and References

[1] Parallel Programming Laboratory. charm.cs.illinois.edu. 2018-12-12.
[2] Array "Hello World": A Slightly More Advanced "Hello World" Program: Array "Hello World" Code. PPL - UIUC Parallel Programming Laboratory. 2017-05-08.
[3] Charm4py — Charm4py 1.0.0 documentation. charm4py.readthedocs.io. 2019-09-11.
[4] Running hybrid mpi4py and Charm4py programs (mpi interop). Charm++ and Charm4py Forum. 2018-11-30. en. 2018-12-11.