Molecular Science Software Suite - MS3

Ecce - NWChem - ParSoft

Winner of a 1999 R&D 100 Award from R&D Magazine

MS3 is a comprehensive, integrated set of tools that enables scientists to understand complex chemical systems at the molecular level by coupling the power of advanced computational chemistry techniques with existing and rapidly evolving high-performance, massively parallel computing systems.


Page Contents: Description - Software Improvements - Principal Applications - Future Developments - Potential Applications - Summary - Additional Information


MS3 Description

The Molecular Science Software Suite (MS3) is a unique, comprehensive, integrated suite of software that enables computational chemists to focus their advanced techniques on finding solutions to complex issues involving chemical systems. It is the first general-purpose software that provides access to high-performance, massively parallel computers for a broad range of chemists on a broad range of applications. MS3 lets chemists easily couple the power of advanced computational chemistry techniques with existing and rapidly evolving high-performance, massively parallel computing systems. A multidisciplinary team of scientists and computer experts at Pacific Northwest National Laboratory's Environmental Molecular Sciences Laboratory (EMSL) developed MS3.

Background. To address the complex environmental issues facing the nation, new scientific understanding is needed of the fundamental chemical, physical, and biological processes underlying these issues. We need to obtain a fundamental understanding of these complex phenomena at the basic, molecular level to develop new, timely, and cost-effective solutions. By using modeling and simulation techniques, scientists can now begin to perform computations with the required accuracy on the molecular systems involved in these complex issues. Computational chemistry can provide fundamental information about molecules and their behavior, so it is a key component of any modeling and simulation program used to address these issues.

High-performance, massively parallel computing systems give computational chemists the computing power they need to model and simulate ever more complex chemical systems. However, such computers are extremely difficult to use efficiently and effectively without the right computational methods. By providing access to high-performance, massively parallel computers for a broad range of applications, MS3 can be used to address environmental problems. It can also be applied to the "Grand Challenge" problems in computational chemistry identified by the chemical industry's Vision 2020 subcommittee on computational chemistry. In addition, it will provide unique insights into our world at the molecular level.

About MS3. MS3 consists of three components: 1) the Extensible Computational Chemistry Environment (Ecce), 2) the Northwest Computational Chemistry Software (NWChem), and 3) Parallel Software Development Tools (ParSoft).

Ecce is the first comprehensive, integrated, problem-solving environment developed for computational chemistry. Based on an object-oriented data model developed at EMSL, Ecce is a suite of distributed client/server applications that enable scientists to easily use computational software such as NWChem to perform complex molecular modeling and analysis tasks by accessing networked, high-performance computers from their desktop workstations. Ecce combines automated metadata and database management, modern "intelligent" graphical user interfaces, automated calculation initiation and monitoring, scientific visualization, analysis tools, and access to a hierarchical mass storage system. This interactive environment allows the user ready access to computational resources, both hardware and software, on highly sophisticated parallel computing systems.

Key components of the Ecce environment are:

An outline of MS3, with its distributed client/server model, is shown in Figure 1.

Figure 1. MS3: Ecce - NWChem - ParSoft Distributed Client/Server Model

NWChem is a new generation of high-performance molecular modeling software that runs on parallel computing systems ranging from clusters of workstations to the emerging teraflops class of massively parallel computers. NWChem scales with both problem size and computer size and is portable across different high-performance computing systems. It provides a broad range of capabilities for solving sophisticated mathematical models of chemical systems from first principles at both the molecular orbital and density functional theory levels. These capabilities enable theoretical chemists to predict the fundamental characteristics of chemical systems at a level of accuracy that is otherwise obtainable only from the most sophisticated experimental approaches. NWChem also supports molecular dynamics calculations, using either a variety of empirical force fields or quantum mechanical force fields, to simulate macromolecular and solution systems. The software is modular, so that even though it has more than 500,000 lines of code, fewer than 10,000 lines (under 2 percent) must be modified to run at high-performance levels on any new parallel computer architecture.

The current version of NWChem is 3.2.1, and its capabilities include:

Molecular electronic structure

Periodic system electronic structure

Classical mechanics

Combined quantum mechanics and classical mechanics

ParSoft provides the high-performance, efficient, and portable computing libraries and tools that enable NWChem to run on a wide variety of parallel computing systems with leading-edge performance and scalability. ParSoft is targeted at both common and specific research requirements. The parallel software includes the Global Array toolkit, which provides an efficient and portable "shared-memory" programming interface for distributed-memory computers; the Parallel Eigensolver (PeIGS) Library for solving linear algebra problems on parallel architectures; and Chem I/O, a parallel input/output library.
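To illustrate what a "shared-memory" programming interface on distributed-memory hardware means, the following minimal Python sketch presents one logical array whose storage is split into blocks held by different owners. It is not the Global Array toolkit API: the class ToyGlobalArray, its methods, and the even block distribution are hypothetical, and the "owners" are simulated as plain lists inside a single process rather than as separate nodes.

```python
from bisect import bisect_right


class ToyGlobalArray:
    """One logical 1-D array whose storage is split into per-owner blocks.

    Hypothetical illustration only: this is not the Global Array toolkit API,
    and the "owners" are plain lists inside one Python process.
    """

    def __init__(self, length, n_owners):
        if length < 0 or n_owners <= 0:
            raise ValueError("length must be >= 0 and n_owners must be > 0")
        self.length = length
        self.starts = []  # starts[i] = global index of owner i's first element
        self.blocks = []  # blocks[i] = storage held by owner i
        base, extra = divmod(length, n_owners)
        offset = 0
        for owner in range(n_owners):
            # Spread any remainder over the first `extra` owners.
            size = base + (1 if owner < extra else 0)
            self.starts.append(offset)
            self.blocks.append([0.0] * size)
            offset += size

    def locate(self, index):
        """Map a global index to (owner, local offset within that owner)."""
        if not 0 <= index < self.length:
            raise IndexError(index)
        owner = bisect_right(self.starts, index) - 1
        return owner, index - self.starts[owner]

    def put(self, lo, values):
        """Write values into global positions lo, lo + 1, ..."""
        for i, value in enumerate(values):
            owner, local = self.locate(lo + i)
            self.blocks[owner][local] = value

    def get(self, lo, hi):
        """Read global positions lo .. hi - 1, wherever they are stored."""
        result = []
        for index in range(lo, hi):
            owner, local = self.locate(index)
            result.append(self.blocks[owner][local])
        return result


if __name__ == "__main__":
    ga = ToyGlobalArray(length=10, n_owners=3)
    ga.put(2, [1.0, 2.0, 3.0, 4.0, 5.0, 6.0])  # this write spans all three owners
    print(ga.get(2, 8))
    print(ga.blocks)
```

The caller works only with global indices; the translation to an owner and a local offset stays inside the array object. In a real distributed-memory setting each block would live on a different node and the loop bodies would become data transfers.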

Available ParSoft capabilities include:

MS3 is available to users at no charge through EMSL (http://www.emsl.pnl.gov). The ParSoft tools can be downloaded from the EMSL web site, and license requests for Ecce and NWChem can be made there as well.

MS3 is the only comprehensive, integrated computational chemistry software suite that provides ease-of-use; portability on high-performance, massively parallel computing systems; and scalability to the problem and the computing system size. It is also unique in its ability to handle all levels of quantum chemical calculations and classical molecular dynamics simulations. Some of MS3's outstanding capabilities were recognized by selection of the paper "An Out-of-Core Implementation of the Massively Parallel Multi-Reference Configuration Interaction Program" as the Best Overall Paper at Supercomputing '98 in Orlando, Florida, November 7-13, 1998 (see http://www.supercomp.org/sc98/TechPapers/sc98_FullAbstracts/Dachsel897/).

MS3 Software Improvements

A new software development paradigm was used to develop MS3. This new paradigm is based on the realization that a key component of any successful modern software development program for scientific applications on massively parallel computers is the use of teams of computer scientists, applied mathematicians, application developers, and users to design and implement the software. The synergy of such efforts allows the development of the highest performing software with the best algorithms and the longest in-use lifetime. Such teams help minimize long-term development costs by developing software that is, to the maximum extent possible, portable and readily maintained and updated. This is especially true when tackling "Grand Challenge" computational problems where changes in computer architecture occur on a regular basis and new algorithms are constantly being developed.

The most notable features of MS3 are the integration of its three major components and its ability to allow a wide range of users to easily access high-performance, massively parallel computers to solve complex chemical problems.

A model of the unique software architecture of MS3, with its modular, integrated coupling of NWChem with ParSoft and Ecce, is shown in Figure 2.

Figure 2. MS3 Infrastructure

MS3 Principal Applications

MS3 was developed to support the modeling and simulation of chemical systems relevant to U.S. Department of Energy environmental cleanup efforts, but it will also support research relevant to the other national issues described below. Originally developed to run on the 512-processor IBM-SP massively parallel computing system in the EMSL Molecular Science Computing Facility, MS3 has since been ported to other high-performance computing systems, including the Cray T3D and T3E, the Intel Paragon, the KSR-1 and -2, the Silicon Graphics Origin 2000, and clusters of workstations.

MS3 is currently in use at many of the national supercomputer centers, national laboratories, and universities. In most cases these sites are not single users but large computer centers. NWChem has been distributed to over 100 institutions, and Ecce is used by researchers from over 10 institutions. Software from the ParSoft suite of tools has been distributed through the World Wide Web at http://www.emsl.pnl.gov/docs/parsoft/. It is being used by the computer industry, financial service companies, national laboratories, and many universities. There are about 20 downloads of the ParSoft software per month from the web site.

As noted above, there are many applications for MS3. Calculations done with the software include:

MS3 will help researchers address the computational "Grand Challenge" problems in computational chemistry as addressed by the Chemical Industry's Vision 2020 subcommittee on computational chemistry:

MS3 Future Developments

For Ecce, future developments will focus on developing a three-tiered architecture to support

These modifications will provide a problem-solving environment that addresses the needs of computational scientists working at different locations while using a variety of data sources.

For NWChem, future developments will include expansion of functionality to allow

We are adding methods/algorithms to incorporate relativistic corrections for systems containing lanthanides/actinides and advanced computational chemistry techniques (e.g., molecular dynamics techniques). We will also investigate methods and algorithms needed to exploit the next generation of massively parallel computing systems.

ParSoft will be extended to support the needs of more application areas. Future capabilities will include support for multidimensional and sparse array data structures. A compression module will be added to parallel I/O to provide effective utilization of resources for out-of-core algorithms with sparse/compressed data. A new portable communication library called the Aggregate Remote Memory Copy Interface (ARMCI) will be included as a building block for new tools and libraries that require high-performance, one-sided asynchronous communication.
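The idea of one-sided asynchronous communication can be sketched as follows. In this minimal Python illustration, an initiator writes directly into a memory region without the region's owner taking part in the transfer, and each write returns a handle that is waited on later. This is not the ARMCI interface: RemoteRegion, its methods, and demo are hypothetical names, and threads in one process stand in for separate nodes.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class RemoteRegion:
    """A memory region that other workers may read or write directly.

    Hypothetical illustration only: this is not the ARMCI interface.
    """

    def __init__(self, size):
        self._data = [0] * size
        self._lock = threading.Lock()

    def _check(self, offset, count):
        if offset < 0 or count < 0 or offset + count > len(self._data):
            raise IndexError("access outside the region")

    def put(self, offset, values):
        """Copy values into the region; the owner does not participate."""
        values = list(values)
        self._check(offset, len(values))
        with self._lock:
            self._data[offset:offset + len(values)] = values

    def get(self, offset, count):
        """Return a copy of count elements starting at offset."""
        self._check(offset, count)
        with self._lock:
            return self._data[offset:offset + count]


def demo():
    region = RemoteRegion(8)  # its "owner" never posts a matching receive
    with ThreadPoolExecutor(max_workers=4) as pool:
        # submit() returns at once with a Future, which stands in for the
        # handle of a nonblocking put; each initiator targets its own slot.
        handles = [
            pool.submit(region.put, 2 * rank, [rank, rank])
            for rank in range(4)
        ]
        for handle in handles:
            handle.result()  # wait for completion and surface any error
    return region.get(0, 8)


if __name__ == "__main__":
    print(demo())
```

Because each initiator writes to a disjoint slot, the final contents do not depend on the order in which the writes complete. In two-sided message passing, by contrast, each transfer needs a matching receive on the target.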

MS3 Potential Applications

There are an enormous number of potential applications for MS3 as new, faster, and more powerful high-performance computer systems are developed. Examples include

An important result will be reliable, highly accurate predictions of chemical phenomena, with error bars, that can replace difficult and expensive experimental approaches for a broad range of molecules. We are just starting to do this today for small molecules, and continued development of the software and growth in computer power will enable us to continue revolutionizing the field of chemistry. Such computations will become even more important as we lose experimental capabilities in the measurement of many chemical phenomena.

MS3 Summary

Our nation is facing many technologically challenging issues as we try to sustain and improve our quality of life. Among the most important of these issues is our stewardship of the environment. We must clean up problems caused by past activities, and we must seek new ways of conducting our activities so we don't create new problems for future generations to deal with. The application of modern science and technology will greatly enhance our ability to solve these problems in a timely and cost-effective manner.

MS3 is a revolutionary suite of computational chemistry software that enables chemists to effectively use large-scale, high-performance, massively parallel computers to help solve complex national problems. The software is based on a new development paradigm and provides scalable performance across a wide range of computer architectures. It enables the scientific community to solve complex environmental problems in the atmosphere, in aquatic systems, and in the subterranean environment. In addition, it will be used in the search for new drugs to help us pursue longer, healthier lives; to improve our agricultural productivity; and to provide insights into how organisms work at the molecular level. Finally, it can be used to develop new products and processes that will enable both us and future generations to lead better lives.

MS3 Additional Information

Additional information on the MS3 components, including how to obtain the software, can be found at the following web pages.