Parallel netCDF: A Parallel I/O Library for NetCDF File Access
Parallel netCDF (PnetCDF) is jointly developed by Northwestern University and Argonne National Laboratory.
PnetCDF is a library that provides high-performance parallel I/O while maintaining file-format compatibility with Unidata's NetCDF, specifically the CDF-1, CDF-2, and CDF-5 formats. The CDF-5 file format, an extension of CDF-2, supports more data types and allows users to use 64-bit integers to define large dimensions, attributes, and variables (more than 2 billion array elements).
NetCDF has supported parallel I/O since version 4. Prior to version 4.1, parallel I/O was restricted to files in the HDF5 format. Starting with release 4.1, NetCDF users can also perform parallel I/O on files in the classic formats (CDF-1, 2, and 5), with the PnetCDF library used underneath.
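Because a NetCDF file may be stored in one of the classic formats or in HDF5, it can be useful to check which one a given file uses. The following is a minimal sketch, not part of PnetCDF or NetCDF: the function name detect_format is hypothetical, and the code assumes that classic files begin with the bytes "CDF" followed by a version byte (1, 2, or 5) and that HDF5 files begin with the standard 8-byte HDF5 signature at offset 0. HDF5 files that carry a user block place the signature at a later offset, which this sketch does not handle.

```python
import os
import tempfile

# Leading bytes assumed to identify each classic format: "CDF" + version byte.
CLASSIC_MAGIC = {
    b"CDF\x01": "CDF-1",
    b"CDF\x02": "CDF-2",
    b"CDF\x05": "CDF-5",
}
# Standard HDF5 signature (checked at offset 0 only in this sketch).
HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"


def detect_format(path):
    """Return 'CDF-1', 'CDF-2', 'CDF-5', 'HDF5', or 'unknown' (hypothetical helper)."""
    with open(path, "rb") as f:
        head = f.read(8)
    if head.startswith(HDF5_MAGIC):
        return "HDF5"
    return CLASSIC_MAGIC.get(head[:4], "unknown")


def demo():
    """Write a fake header to a temporary file and report the detected format."""
    with tempfile.TemporaryDirectory() as tmp:
        sample = os.path.join(tmp, "sample.nc")
        with open(sample, "wb") as f:
            f.write(b"CDF\x05" + bytes(28))  # magic bytes followed by zero padding
        return detect_format(sample)
```

The sketch reads only the first eight bytes, so it says nothing about whether the rest of the file is valid.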
NetCDF gives scientific programmers a self-describing and portable means for storing data. However, prior to version 4, netCDF does so in a serial manner. By making some small changes to the netCDF APIs, PnetCDF can use MPI-IO to achieve high-performance parallel I/O.
- Downloads: latest and previous software releases, as well as development source repository.
- Documentation: a QuickTutorial, plus papers, presentations, articles, and other resources
- Benchmarking: tools and suggestions for evaluating PnetCDF performance
- March 3, 2016: PnetCDF 1.7.0 is released (the latest stable version). See ReleaseNotes-1.7.0.
- Starting from 4.4.0, netCDF-4 officially supports the CDF-5 file format for both sequential and parallel access. See its release note.
- A set of NetCDF-4 example programs shows how to access files in parallel through PnetCDF or HDF5.
- We continue working with the netCDF team at Unidata to improve CDF-5 and PnetCDF features in netCDF-4.
- The PnetCDF C interface guide is now available.
- PnetCDF Q&A contains a few tips for achieving better I/O performance.
A Note About Large File Support
The original classic CDF file format (now obsolete) was used by the NetCDF library through version 3.5.1. The classic format was later updated in the NASA ESDS community standard, which added support for the 64-bit offset file format. See NetCDF Classic and 64-bit Offset File Formats.
Starting from 3.6.0, the serial NetCDF library added support for the 64-bit offset format (also referred to as the CDF-2 format). With this format, even 32-bit platforms can create NetCDF files greater than 2 GB in size. CDF-2 also allows more special characters in the names of defined dimensions, variables, and attributes. The support was based largely on work from Greg Sjaardema.
As of PnetCDF 0.9.2, PnetCDF ships with support for the large file sizes allowed by the CDF-2 format. See README.large_files in the source tree for more information.
Starting from 1.3.0, PnetCDF supports CDF-5 file format: adding unsigned and 64-bit integer data types and allowing variables with more than 2^32 array elements.
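To get a feel for the thresholds mentioned above (files larger than 2 GB, variables with more than 2^32 array elements), the arithmetic can be sketched as follows. This is an illustration only: the function describe and the constant names are hypothetical, the 2 GB boundary is taken as 2^31 bytes, and the actual per-format rules have more nuances than this simple comparison.

```python
from math import prod

# Thresholds taken from the text above; names are hypothetical.
TWO_GIB_BYTES = 2**31        # where signed 32-bit file offsets run out
ELEMENTS_32BIT = 2**32       # element count beyond which CDF-5 is described as needed


def describe(shape, bytes_per_element):
    """Summarize the size of one array variable against the two thresholds."""
    n_elements = prod(shape)
    n_bytes = n_elements * bytes_per_element
    return {
        "elements": n_elements,
        "bytes": n_bytes,
        "larger_than_2GiB": n_bytes > TWO_GIB_BYTES,
        "more_than_2^32_elements": n_elements > ELEMENTS_32BIT,
    }


# A 1024^3 array of 4-byte floats: 2^30 elements, 4 GiB of data.
medium = describe((1024, 1024, 1024), 4)

# A 2048^3 array of 8-byte doubles: 2^33 elements, 64 GiB of data.
large = describe((2048, 2048, 2048), 8)
```

Under these assumptions, the first array already pushes a file past the 2 GB mark that motivated the 64-bit offset format, while only the second exceeds 2^32 elements.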
File and Variable Limits
Both PnetCDF and NetCDF share limitations on file and variable sizes. More information can be found on the FileLimits page.
PnetCDF requires an MPI implementation with MPI-IO support. Most MPI libraries provide it nowadays. A parallel file system also goes a long way toward achieving the highest performance.
PnetCDF makes use of several other technologies.
- ROMIO, an implementation of MPI-IO, provides optimized collective and noncontiguous operations. It also provides an abstract interface for a large number of parallel file systems.
- One of the file systems ROMIO supports is PVFS, a high-performance parallel file system for Linux clusters.
Today, there are several options for high level I/O libraries. Here are some discussions on the role of PnetCDF in this ecosystem:
We discuss the design and use of the PnetCDF library on the parallel-netcdf mailing list. Anyone interested in developing or using PnetCDF is encouraged to join. Visit the list information page for details.
The URL for the list archive is http://lists.mcs.anl.gov/pipermail/parallel-netcdf/. Even older messages can be browsed in the older mailing list archives.
- Rob Latham, Rob Ross, and Rajeev Thakur (Argonne National Lab)
- Wei-keng Liao and Alok Choudhary (Northwestern University)
- Seung Woo Son (formerly a postdoc at ANL and then Northwestern, now an Assistant Professor at UMass Lowell)
- Kui Gao (formerly a postdoc at Northwestern, now at Dassault Systèmes Simulia Corp.)
- Jianwei Li (Northwestern, graduated in 2006)
- Bill Gropp (formerly ANL, now UIUC)
When referring to the Parallel netCDF project, please use the following URLs:
- www.mcs.anl.gov/parallel-netcdf (the 'trac' or 'www-unix' URLs could change)
- http://cucis.ece.northwestern.edu/projects/PnetCDF/ (a page maintained by Northwestern University)
If you are looking for a reference to use in a published paper, please cite our SC2003 paper below.
- Jianwei Li, Wei-keng Liao, Alok Choudhary, Robert Ross, Rajeev Thakur, William Gropp, Rob Latham, Andrew Siegel, Brad Gallagher, and Michael Zingale. Parallel netCDF: A Scientific High-Performance I/O Interface. In Proceedings of the ACM/IEEE Conference on Supercomputing, p. 39, November 2003.
Original Parallel netCDF development was sponsored by the Scientific Data Management Center (SDM) under the DOE program of Scientific Discovery through Advanced Computing (SciDAC). It was also supported in part by the National Science Foundation under the SDCI HPC program award number OCI-0724599 and the HECURA program award number CCF-0938000. Ongoing maintenance is funded by the Scientific Data, Analysis, and Visualization (SDAV) Institute under the SciDAC program.