Parallel netCDF: A Parallel I/O Library for NetCDF File Access
Parallel netCDF (PnetCDF) is jointly developed by Northwestern University and Argonne National Laboratory.
PnetCDF is a library providing high-performance parallel I/O while still maintaining file-format compatibility with Unidata's NetCDF, specifically the formats of CDF-1 and CDF-2. Although NetCDF supports parallel I/O starting from version 4, the files must be in HDF5 format. PnetCDF is currently the only choice for carrying out parallel I/O on files that are in classic formats (CDF-1 and 2).
In addition, PnetCDF supports the CDF-5 file format, an extension of CDF-2, that supports more data types and allows users to define large dimensions, attributes, and variables (>2B elements).
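The formats named above can be told apart by the first bytes of a file. The following is a minimal sketch, not part of PnetCDF: the function names `detect_format` and `detect_file_format` are hypothetical, and it assumes the HDF5 signature sits at byte offset 0 (HDF5 also permits a user block before it, which this sketch ignores).

```python
# Leading bytes that identify each classic-family format: "CDF" plus a version byte.
CDF_MAGIC = {
    b"CDF\x01": "CDF-1 (classic)",
    b"CDF\x02": "CDF-2 (64-bit offset)",
    b"CDF\x05": "CDF-5 (64-bit data)",
}

# Signature at the start of an HDF5 file, the container used by netCDF-4.
# Assumption: no HDF5 user block precedes the signature.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def detect_format(header: bytes) -> str:
    """Classify a file from its leading bytes (pass at least 8 bytes)."""
    if header[:8] == HDF5_SIGNATURE:
        return "HDF5 (netCDF-4)"
    return CDF_MAGIC.get(header[:4], "unknown")


def detect_file_format(path: str) -> str:
    """Hypothetical convenience wrapper: read the header from a file on disk."""
    with open(path, "rb") as f:
        return detect_format(f.read(8))
```

For example, `detect_format(b"CDF\x02" + bytes(4))` classifies the header as CDF-2, one of the formats PnetCDF can access in parallel.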
NetCDF gives scientific programmers a self-describing and portable means for storing data. However, prior to version 4, netCDF did so only in a serial manner. By making some small changes to the netCDF APIs, PnetCDF can use MPI-IO to achieve high-performance parallel I/O.
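In a parallel write, each MPI process typically supplies the starting offset and element count of its own piece of a variable to a collective call such as `ncmpi_put_vara_int_all`. The sketch below uses plain Python with no MPI and shows only the arithmetic of one common choice, a one-dimensional block decomposition. The function name `block_partition` is hypothetical, and applications are free to partition their data differently.

```python
from typing import Tuple


def block_partition(length: int, nprocs: int, rank: int) -> Tuple[int, int]:
    """Return (start, count) of the contiguous block owned by `rank`.

    The dimension of `length` elements is split as evenly as possible;
    the first `length % nprocs` ranks each receive one extra element.
    """
    if nprocs <= 0 or not 0 <= rank < nprocs:
        raise ValueError("rank must satisfy 0 <= rank < nprocs")
    base, extra = divmod(length, nprocs)
    count = base + (1 if rank < extra else 0)
    # Ranks below `rank` contribute `base` elements each, plus one extra
    # element for every such rank that is among the first `extra` ranks.
    start = rank * base + min(rank, extra)
    return start, count
```

With 10 elements and 4 processes, ranks 0 through 3 own `(0, 3)`, `(3, 3)`, `(6, 2)`, and `(8, 2)`. The blocks do not overlap and together cover the whole dimension, which lets the underlying MPI-IO layer combine the per-process requests.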
- Downloads: latest and previous software releases, as well as development source repository.
- Documentation: a QuickTutorial, plus papers, presentations, articles, and other resources
- Benchmarking: tools and suggestions for evaluating PnetCDF performance
- July 8, 2014: PnetCDF 1.5.0 is released. See ReleaseNotes-1.5.0.
- May 16, 2014: a test release of PnetCDF 1.5.0.pre1 is available. See ReleaseNotes-1.5.0.pre1 for more details.
- PnetCDF C interface guide is now available.
- PnetCDF Q&A contains a few tips for achieving better I/O performance.
- December 23, 2013: PnetCDF 1.4.1 is released. See ReleaseNotes-1.4.1 for more details.
- The Fortran header file, pnetcdf.inc, can now be included in both fixed-form and free-form Fortran programs.
- Initial subfiling feature has been added to 1.4.1.
- November 17, 2013: PnetCDF 1.4.0 is released. See ReleaseNotes-1.4.0 for more details.
- Fortran 90 APIs are now available in 1.4.0.
- New APIs, ncmpi_get/put_varn_<type>, read or write a list of sub-requests to a single variable. They are available for F77 and F90 as well.
- Interoperability with netCDF-4: netCDF-4 programs can now access CDF-1 and CDF-2 files in parallel through PnetCDF. See example programs.
- FLASH-IO benchmark using PnetCDF is now part of the source code release.
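The ncmpi_get/put_varn_<type> item in the list above refers to a list of sub-requests to a single variable. As a rough illustration of that idea only, the sketch below treats each sub-request as a (start, count) pair per dimension and lists the row-major element positions the requests touch, in request order. It is plain Python, does not call PnetCDF, and the helper name `subrequest_indices` is hypothetical.

```python
from itertools import product
from typing import List, Sequence


def subrequest_indices(shape: Sequence[int],
                       starts: Sequence[Sequence[int]],
                       counts: Sequence[Sequence[int]]) -> List[int]:
    """Row-major flat indices covered by a list of (start, count) sub-requests."""
    ndims = len(shape)
    # Row-major strides: the last dimension varies fastest.
    strides = [1] * ndims
    for d in range(ndims - 2, -1, -1):
        strides[d] = strides[d + 1] * shape[d + 1]

    flat: List[int] = []
    for start, count in zip(starts, counts):
        if len(start) != ndims or len(count) != ndims:
            raise ValueError("each sub-request needs one start and count per dimension")
        for d in range(ndims):
            if start[d] < 0 or count[d] < 0 or start[d] + count[d] > shape[d]:
                raise ValueError("sub-request falls outside the variable")
        # Visit every element of this sub-array in row-major order.
        ranges = [range(s, s + c) for s, c in zip(start, count)]
        for index in product(*ranges):
            flat.append(sum(i * st for i, st in zip(index, strides)))
    return flat
```

For a 4 x 5 variable, the two sub-requests with starts `[(0, 1), (2, 3)]` and counts `[(1, 2), (2, 2)]` cover flat positions `[1, 2, 13, 14, 18, 19]`. Grouping such scattered pieces into one call is what distinguishes this style of request from issuing one call per sub-array.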
A Note About Large File Support
The classic CDF file format (now obsolete) was used by the NetCDF library through version 3.5.1. The classic format was later updated with support for a 64-bit offset file format, as described in the NASA ESDS community standard. See NetCDF Classic and 64-bit Offset File Formats.
Starting from 3.6.0, the serial NetCDF library added support for the 64-bit offset format (also referred to as the CDF-2 format). With this format, even 32-bit platforms can create NetCDF files larger than 2 GB. CDF-2 also allows more special characters in the names of dimensions, variables, and attributes. The support was based largely on work from Greg Sjaardema.
As of PnetCDF 0.9.2, PnetCDF ships with support for the large file sizes specified by the CDF-2 format. See README.large_files in the source tree for more information.
Starting from 1.3.0, PnetCDF supports the CDF-5 file format, which adds unsigned and 64-bit integer data types and allows variables with more than 2^32 array elements.
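The sizes mentioned in this section can be checked with simple arithmetic. The sketch below assumes the 2 GB figure corresponds to the largest value of a signed 32-bit byte offset, and it tests only the element-count condition stated for CDF-5 (more than 2^32, that is 4,294,967,296, elements). The names are hypothetical, and the real formats carry further limits that this does not model.

```python
from typing import Sequence

# Largest byte offset a signed 32-bit integer can hold: just under 2 GiB.
MAX_SIGNED_32BIT_OFFSET = 2**31 - 1

# Element-count threshold mentioned for CDF-5 variables.
ELEMENTS_2_32 = 2**32


def element_count(shape: Sequence[int]) -> int:
    """Total number of array elements in a variable of the given shape."""
    total = 1
    for dim in shape:
        total *= dim
    return total


def exceeds_2_32_elements(shape: Sequence[int]) -> bool:
    """True if a variable of this shape has more than 2^32 array elements."""
    return element_count(shape) > ELEMENTS_2_32
```

For instance, a 65536 x 65536 variable holds exactly 2^32 elements, so `exceeds_2_32_elements((65536, 65536))` is false, while a 65536 x 65537 variable crosses the threshold.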
File and Variable Limits
Both PnetCDF and NetCDF share limitations on file and variable sizes. More information can be found on the FileLimits page.
PnetCDF requires an MPI implementation with MPI-IO support, which most MPI libraries now provide. A parallel file system also goes a long way toward achieving the highest performance.
PnetCDF makes use of several other technologies.
- ROMIO, an implementation of MPI-IO, provides optimized collective and noncontiguous operations. It also provides an abstract interface for a large number of parallel file systems.
- One of the file systems ROMIO supports is PVFS, a high-performance parallel file system for Linux clusters.
Today, there are several options for high level I/O libraries. Here are some discussions on the role of PnetCDF in this ecosystem:
We discuss the design and use of the PnetCDF library on the parallel-netcdf mailing list. Anyone interested in developing or using PnetCDF is encouraged to join. Visit the list information page for details.
The URL for the list archive is http://lists.mcs.anl.gov/pipermail/parallel-netcdf/. You can browse even older messages in the older mailing list archives.
- Rob Latham, Rob Ross, and Rajeev Thakur (Argonne National Lab)
- Wei-keng Liao and Alok Choudhary (Northwestern University)
- Seung Woo Son (formerly a postdoc at ANL and then Northwestern, now an Assistant Professor at UMass Lowell)
- Kui Gao (formerly a postdoc at Northwestern, now at Dassault Systèmes Simulia Corp.)
- Jianwei Li (Northwestern, graduated in 2006)
- Bill Gropp (formerly ANL, now UIUC)
When referring to the Parallel netCDF project, please use the following URLs:
- www.mcs.anl.gov/parallel-netcdf (the 'trac' or 'www-unix' URLs could change)
- http://cucis.ece.northwestern.edu/projects/PnetCDF/ (a page maintained by Northwestern University)
If you are looking for a reference to use in a published paper, please cite our SC2003 paper below.
- Jianwei Li, Wei-keng Liao, Alok Choudhary, Robert Ross, Rajeev Thakur, William Gropp, Rob Latham, Andrew Siegel, Brad Gallagher, and Michael Zingale. Parallel netCDF: A Scientific High-Performance I/O Interface. In the Proceedings of the ACM/IEEE Conference on Supercomputing, pp. 39, November 2003.
Original Parallel netCDF development was sponsored by the Scientific Data Management Center (SDM) under the DOE program of Scientific Discovery through Advanced Computing (SciDAC). It was also supported in part by the National Science Foundation under the SDCI HPC program, award number OCI-0724599, and the HECURA program, award number CCF-0938000. Ongoing maintenance is funded by the Scientific Data, Analysis, and Visualization (SDAV) Institute under the SciDAC program.