Parallel netCDF: A Parallel I/O Library for NetCDF File Access
Parallel netCDF (PnetCDF) is jointly developed by Northwestern University and Argonne National Laboratory.
PnetCDF is a library that provides high-performance parallel I/O while maintaining file-format compatibility with Unidata's NetCDF. NetCDF gives scientific programmers a self-describing and portable means for storing data. However, prior to version 4, netCDF did so in a serial manner. By making some small changes to the netCDF APIs, PnetCDF can use MPI-IO to achieve high-performance parallel I/O.
- Downloads: latest and previous software releases, as well as the development source repository
- Documentation: a QuickTutorial, plus papers, presentations, articles, and other resources
- Benchmarking: tools and suggestions for evaluating PnetCDF performance
- November 17, 2013: PnetCDF 1.4.0 released. See ReleaseNotes-1.4.0 for more details.
- Fortran 90 APIs are now available in 1.4.0.
- New APIs, ncmpi_get/put_varn_<type>, for reading or writing a list of sub-requests to a single variable. Also available for F77 and F90.
- Interoperability with netCDF-4: netCDF-4 programs can now access CDF-1 and CDF-2 files in parallel through PnetCDF. See example programs.
- FLASH-IO benchmark using PnetCDF is now part of the source code release.
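To make the "list of sub-requests" idea behind the varn APIs more concrete, here is a minimal sketch in plain C. It does not call PnetCDF. It assumes each sub-request is described by a per-dimension count vector, as in the vara-style calls, and it only computes how many array elements the whole list covers. The `sketch_offset` type and the function name are hypothetical; `sketch_offset` stands in for `MPI_Offset` so the code compiles without MPI.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for MPI_Offset (assumption: lets the sketch build without MPI). */
typedef long long sketch_offset;

/* Hypothetical helper, not a PnetCDF function.
 * counts[i][d] is the extent of sub-request i along dimension d.
 * Returns the total number of array elements covered by all sub-requests.
 * No overflow checking is done in this sketch. */
sketch_offset varn_total_elements(int num, int ndims,
                                  sketch_offset *const *counts)
{
    assert(num == 0 || counts != NULL);

    sketch_offset total = 0;
    for (int i = 0; i < num; i++) {
        sketch_offset n = 1;              /* elements in sub-request i */
        for (int d = 0; d < ndims; d++)
            n *= counts[i][d];
        total += n;
    }
    return total;
}
```

For example, two sub-requests on a 2-D variable with counts {1, 2} and {3, 1} cover 2 + 3 = 5 elements. Whether a data buffer of that many elements is what the library expects should be checked against the PnetCDF documentation.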
A Note About Large File Support
The CDF (or CDF-1) file format was used by the NetCDF library through version 3.5.1.
Starting from 3.6.0, the serial NetCDF library added support for the CDF-2 format. With this format, even 32-bit platforms can create NetCDF files larger than 2 GB. CDF-2 also allows more special characters in the names of defined dimensions, variables, and attributes. The support was based largely on work by Greg Sjaardema.
As of PnetCDF 0.9.2, the library ships with support for the large file sizes permitted by the CDF-2 format. See README.large_files in the source tree for more information.
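The classic file formats can be told apart by the first four bytes of a file: the characters 'C', 'D', 'F' followed by a version byte, where 1 marks CDF-1, 2 marks CDF-2, and 5 marks the CDF-5 format. The following sketch in plain C classifies a header on that basis. The function name is hypothetical and is not part of PnetCDF or NetCDF.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not a PnetCDF or NetCDF function.
 * hdr points to the first len bytes of a file.
 * Returns 1, 2, or 5 for CDF-1, CDF-2, or CDF-5,
 * and 0 if the bytes are not a recognized classic CDF signature. */
int cdf_format_from_magic(const unsigned char *hdr, size_t len)
{
    assert(hdr != NULL);

    if (len < 4)
        return 0;                         /* too short to hold the signature */
    if (hdr[0] != 'C' || hdr[1] != 'D' || hdr[2] != 'F')
        return 0;                         /* not a classic CDF file */
    if (hdr[3] == 1 || hdr[3] == 2 || hdr[3] == 5)
        return hdr[3];                    /* version byte selects the format */
    return 0;                             /* unknown version byte */
}
```

A caller would read the first four bytes of a file (for example with `fread`) and pass them in. A return value of 0 means the file is not in a classic CDF format, as is the case for an HDF5-based netCDF-4 file.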
Starting from 1.3.0, PnetCDF supports the CDF-5 file format, which adds unsigned and 64-bit integer data types and allows variables with more than 2^32 array elements.
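An element count above 2^32 does not fit in a 32-bit integer, so application code that sizes such variables needs 64-bit arithmetic. The sketch below, in plain C with no PnetCDF calls, multiplies dimension lengths as `uint64_t` and reports whether a variable shape has more than 2^32 elements. Both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: product of the dimension lengths in 64-bit arithmetic.
 * Returns 0 and stores the product in *total on success,
 * or -1 if the product would overflow uint64_t. */
int count_elements(const uint64_t *dims, size_t ndims, uint64_t *total)
{
    assert(dims != NULL && total != NULL);

    /* Any zero-length dimension makes the whole product zero. */
    for (size_t i = 0; i < ndims; i++) {
        if (dims[i] == 0) {
            *total = 0;
            return 0;
        }
    }

    uint64_t n = 1;
    for (size_t i = 0; i < ndims; i++) {
        if (n > UINT64_MAX / dims[i])
            return -1;                    /* next multiply would overflow */
        n *= dims[i];
    }
    *total = n;
    return 0;
}

/* Returns 1 if the shape has more than 2^32 elements, 0 otherwise. */
int exceeds_2_32_elements(const uint64_t *dims, size_t ndims)
{
    uint64_t total;
    if (count_elements(dims, ndims, &total) != 0)
        return 1;                         /* overflowed 64 bits, so above 2^32 */
    return total > ((uint64_t)1 << 32);
}
```

For instance, a 65536 x 65536 variable has exactly 2^32 elements and is not above the threshold, while a 65536 x 65537 variable is.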
File and Variable Limits
Both PnetCDF and NetCDF share limitations on file and variable sizes. More information can be found on the FileLimits page.
PnetCDF requires an MPI implementation with MPI-IO support, which most MPI libraries now provide. A parallel file system also goes a long way toward achieving the highest performance.
PnetCDF makes use of several other technologies.
- ROMIO, an implementation of MPI-IO, provides optimized collective and noncontiguous operations. It also provides an abstract interface for a large number of parallel file systems.
- One of the file systems ROMIO supports is PVFS, a high-performance parallel file system for Linux clusters.
Today, there are several options for high level I/O libraries. Here are some discussions on the role of PnetCDF in this ecosystem:
We discuss the design and use of the PnetCDF library on the email@example.com mailing list. Anyone interested in developing or using PnetCDF is encouraged to join. Visit the list information page for details.
The URL for the list archive is http://lists.mcs.anl.gov/pipermail/parallel-netcdf/. Earlier messages can be browsed in the older mailing list archives.
- Rob Latham, Rob Ross, and Rajeev Thakur (Argonne National Lab)
- Wei-keng Liao, Seung Woo Son, and Alok Choudhary (Northwestern University)
- Kui Gao (formerly a postdoc at Northwestern, now at Dassault Systèmes Simulia Corp.)
- Jianwei Li (Northwestern, graduated in 2006)
- Bill Gropp (formerly ANL, now UIUC)
When referring to the Parallel netCDF project, please use the following URLs:
- www.mcs.anl.gov/parallel-netcdf (the 'trac' or 'www-unix' URLs could change)
- http://cucis.ece.northwestern.edu/projects/PnetCDF/ (a page maintained by Northwestern University)
If you are looking for a reference to use in a published paper, please cite our SC2003 paper below.
- Jianwei Li, Wei-keng Liao, Alok Choudhary, Robert Ross, Rajeev Thakur, William Gropp, Rob Latham, Andrew Siegel, Brad Gallagher, and Michael Zingale. Parallel netCDF: A Scientific High-Performance I/O Interface. In the Proceedings of the Supercomputing Conference, November 2003.
Original Parallel netCDF development was sponsored by the Scientific Data Management Center (SDM) under the DOE program of Scientific Discovery through Advanced Computing (SciDAC). It was also supported in part by the National Science Foundation under SDCI HPC program award number OCI-0724599 and HECURA program award number CCF-0938000. Ongoing maintenance is funded by the Scientific Data, Analysis, and Visualization (SDAV) Institute under the SciDAC program.