The Los Alamos Message Passing Interface

What's New: Open MPI

LA-MPI is no longer in active development, but is being maintained for use on production systems at LANL, and we welcome other users.

Our future development is focused on the Open MPI project, a new component-based, extensible implementation of MPI-2.

Introduction

LA-MPI is an implementation of the Message Passing Interface (MPI) motivated by a growing need for fault tolerance at the software level in large high-performance computing (HPC) systems.

This need is caused by the vast number of components present in modern HPC systems, particularly clusters. The individual components -- processors, memory modules, network interface cards (NICs), etc. -- are typically manufactured to tolerances adequate for small or desktop systems. When aggregated into a large HPC system, however, system-wide error rates may be too great to successfully complete a long application run. For example, a network device may have an error rate that is perfectly acceptable for a desktop system, but not for a cluster of thousands of nodes, which must run error-free for many hours or even days to complete a scientific calculation.
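As a rough illustration of this aggregation effect, the following C sketch computes the probability of at least one error over a run, assuming errors are independent and occur with a fixed probability per node-hour. The function name and the figure of 1e-5 errors per node-hour used below are hypothetical and are chosen only to make the arithmetic concrete; they are not measurements of any real system.

```c
#include <assert.h>

/* Probability that at least one of `trials` independent events occurs
 * when each occurs with probability p. Computed as 1 - (1 - p)^trials
 * by repeated multiplication, so no math library is required.
 * Illustrative model only: real hardware errors need not be independent. */
double prob_at_least_one_error(double p, long trials)
{
    assert(p >= 0.0 && p <= 1.0 && trials >= 0);

    double none = 1.0;                  /* probability of zero errors so far */
    for (long i = 0; i < trials; i++)
        none *= 1.0 - p;
    return 1.0 - none;
}
```

With the hypothetical rate of 1e-5 per node-hour, prob_at_least_one_error(1e-5, 48) gives roughly 0.05% for a single machine running 48 hours, while prob_at_least_one_error(1e-5, 4000L * 48) gives roughly 85% for 4,000 nodes over the same period.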

LA-MPI has two primary goals: network fault tolerance and high performance.

Network fault tolerance is achieved by implementing a highly efficient checksum/retransmission protocol. The integrity of delivered data is (optionally) verified at the user level using a checksum or CRC. Data that is corrupt (or never delivered) is retransmitted.
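The general idea of checking data at the user level and retransmitting on a mismatch can be sketched as follows. This is a minimal illustration, not LA-MPI's actual code: the types fragment_t and link_t, the functions crc32_bytes, link_transmit and send_reliable, and the simulated link that flips a bit in the first few transmissions are all hypothetical. The sketch covers only corrupted data; detecting data that is never delivered would additionally need acknowledgements and timeouts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320). */
uint32_t crc32_bytes(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Hypothetical message fragment: payload plus the sender's checksum. */
typedef struct {
    uint8_t  payload[64];
    size_t   len;
    uint32_t crc;
} fragment_t;

/* Hypothetical unreliable link: corrupts the next `corrupt_remaining`
 * transmissions by flipping one bit of the payload. */
typedef struct {
    int corrupt_remaining;
} link_t;

static void link_transmit(link_t *link, const fragment_t *in, fragment_t *out)
{
    *out = *in;
    if (link->corrupt_remaining > 0 && out->len > 0) {
        out->payload[0] ^= 0x01;
        link->corrupt_remaining--;
    }
}

/* Sends `data` over `link`, verifying the CRC on the receiving side and
 * retransmitting on mismatch. Returns the number of transmissions used,
 * or -1 if the data could not be delivered intact in max_attempts tries. */
int send_reliable(link_t *link, const uint8_t *data, size_t len,
                  uint8_t *dest, int max_attempts)
{
    fragment_t frag = {0}, received;

    assert(len <= sizeof frag.payload);
    memcpy(frag.payload, data, len);
    frag.len = len;
    frag.crc = crc32_bytes(data, len);      /* sender computes the checksum */

    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        link_transmit(link, &frag, &received);
        /* Receiver recomputes the checksum over what actually arrived. */
        if (crc32_bytes(received.payload, received.len) == received.crc) {
            memcpy(dest, received.payload, received.len);
            return attempt;
        }
        /* Mismatch: the fragment is discarded and sent again. */
    }
    return -1;
}
```

For example, with a link_t initialised to corrupt two transmissions, send_reliable delivers the data intact on the third attempt.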

As for high performance, LA-MPI's lightweight checksum/retransmission protocol allows us to achieve low-latency messaging. Furthermore, the flexible approach taken to the use of redundant data paths in a network-device-rich system leads to high network bandwidth, since different messages and/or message fragments can be sent in parallel along different paths. Also, since LA-MPI is developed for use on the large systems at Los Alamos National Laboratory, we have verified that LA-MPI is scalable to over 3,500 processes.
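To illustrate how sending message fragments along different paths spreads the load, the sketch below splits a message into fixed-size fragments and assigns them to paths in round-robin order. The round-robin policy, the function name stripe_message and its parameters are assumptions made for this example; the scheduling policy LA-MPI actually uses is not described here.

```c
#include <assert.h>
#include <stddef.h>

/* Splits a message of msg_len bytes into fragments of at most frag_size
 * bytes and assigns them to num_paths network paths in round-robin order.
 * On return, bytes_per_path[p] holds the number of bytes assigned to
 * path p. Returns the number of fragments.
 * Hypothetical policy for illustration only. */
size_t stripe_message(size_t msg_len, size_t frag_size, size_t num_paths,
                      size_t *bytes_per_path)
{
    assert(frag_size > 0 && num_paths > 0);

    for (size_t p = 0; p < num_paths; p++)
        bytes_per_path[p] = 0;

    size_t nfrags = 0;
    for (size_t offset = 0; offset < msg_len; offset += frag_size) {
        size_t remaining = msg_len - offset;
        size_t this_len = remaining < frag_size ? remaining : frag_size;

        bytes_per_path[nfrags % num_paths] += this_len;   /* next path in turn */
        nfrags++;
    }
    return nfrags;
}
```

For a 10,000-byte message, 4,096-byte fragments and two paths, this yields three fragments, with 5,904 bytes on the first path and 4,096 bytes on the second.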

An alternative solution to the network fault tolerance problem is to use the TCP/IP protocol. We believe, however, that this protocol -- developed to handle unreliable, inhomogeneous and oversubscribed networks -- performs poorly and is overly complex for HPC system messaging, and that LA-MPI's lightweight checksum/retransmission protocol is a more appropriate choice.

Features

Standard compliant (MPI version 1.2 integrated with ROMIO for MPI-IO)
Highly portable
Open source (LGPL)
Thread safe
Optimized for SMP systems, including NUMA architectures
Network fault tolerant (data integrity checked at user level)
Message-fragment striping across multiple network devices

Platforms

Processors: Intel IA32, Intel IA64, AMD Opteron, PowerPC (G4, G5), Alpha, MIPS
Operating systems: Linux, Linux/Clustermatic, MacOS X, Tru64, IRIX
Interconnects: Shared memory, Ethernet (TCP, UDP), Myrinet (GM), QSNet (Quadrics Elan3), InfiniBand (VAPI), HIPPI-800

Download

The current release of LA-MPI is

lampi-1.5.16.tar.gz

Some earlier releases are also available.

LA-MPI is installed in the usual way:

  ./configure [OPTIONS]
  make
  make install

where configure options include

  --enable-debug          enable debugging
  --enable-lsf            use LSF
  --enable-rms            use RMS
  --enable-bproc          use BPROC
  --enable-udp            enable UDP path
  --enable-tcp            enable TCP path
  --enable-qsnet          enable QSNET path
  --enable-gm             enable Myrinet GM path
  --enable-ib             enable InfiniBand path
  --with-romio            include MPI-IO support

Research

LA-MPI is developed by the Application Communications and Performance Research Team of the Advanced Computing Laboratory at LANL. We are actively investigating other aspects of fault tolerance and performance optimization. Topics of current interest include

Scalable, fault-tolerant runtime systems
Process fault tolerance
Process migration
Asynchronous progress thread
Improved collective algorithms and library support
On-NIC optimizations

Many of these ideas will be explored as part of the Open MPI project.

Papers

A Network-Failure-Tolerant Message-Passing System for Terascale Clusters, International Journal of Parallel Programming 31 (4): 285-303, August 2003.
Network Fault Tolerance in LA-MPI, Euro PVM/MPI 2003.
Architecture of LA-MPI, a Network-Fault-Tolerant MPI, IPDPS 2004.

Also see our Open MPI papers.

Contact

LA-MPI is developed at the Advanced Computing Laboratory of Los Alamos National Laboratory. For more information, contact lampi-support@lanl.gov