Parallel netCDF: A Parallel I/O Library for NetCDF File Access

Parallel netCDF (PnetCDF) is jointly developed by Northwestern University and Argonne National Laboratory.

Overview

PnetCDF is a library providing high-performance I/O while still maintaining file-format compatibility with Unidata's NetCDF. NetCDF gives scientific programmers a self-describing and portable means for storing data. However, prior to version 4, NetCDF did so in a serial manner. By making some small changes to the NetCDF APIs, PnetCDF can use MPI-IO to achieve high-performance parallel I/O.
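As a rough illustration of what parallel access looks like from the application side: each MPI process typically reads or writes its own portion of a shared variable, described by a start index and a count. The sketch below is plain Python with no MPI or PnetCDF calls, and `block_partition` is a hypothetical helper, not part of the library. It shows one common way to split a one-dimensional variable evenly among processes.

```python
def block_partition(global_len, nprocs, rank):
    """Return (start, count) for the block of a 1-D variable owned by `rank`.

    Hypothetical helper for illustration only. Elements are divided as
    evenly as possible; the first `global_len % nprocs` ranks get one extra.
    """
    base, extra = divmod(global_len, nprocs)
    count = base + (1 if rank < extra else 0)
    start = rank * base + min(rank, extra)
    return start, count


# Each rank would pass its own (start, count) to the I/O call.
for rank in range(4):
    print(rank, block_partition(10, 4, rank))
```

In a real program, the rank and process count would come from the MPI library, and the resulting start and count would be handed to a collective read or write call.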

  • Downloads: latest and previous software releases, as well as the development source repository
  • Documentation: a QuickTutorial, plus papers, presentations, articles, and other resources
  • Benchmarking: tools and suggestions for evaluating PnetCDF performance

News

  • The PnetCDF C interface guide is now available.
  • The PnetCDF FAQ contains a few tips for achieving better I/O performance.
  • December 23, 2013: PnetCDF 1.4.1 released. See ReleaseNotes-1.4.1 for more details.
  • The Fortran header file, pnetcdf.inc, can now be included in both fixed-form and free-form Fortran programs.
  • An initial subfiling feature has been added in 1.4.1.
  • November 17, 2013: PnetCDF 1.4.0 released. See ReleaseNotes-1.4.0 for more details.
  • Fortran 90 APIs are now available in 1.4.0.
  • New APIs, ncmpi_get/put_varn_<type>, read/write a list of sub-requests to a single variable. They are available for F77 and F90 as well.
  • Interoperability with netCDF-4: netCDF-4 programs can now access CDF-1 and CDF-2 files in parallel through PnetCDF. See example programs.
  • FLASH-IO benchmark using PnetCDF is now part of the source code release.
  • NewsArchive
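
To make the ncmpi_get/put_varn_<type> item above more concrete, here is a minimal sketch of the idea of writing a list of sub-requests to a single variable. It is a pure-Python emulation on a nested list, not the library API: `put_varn` is a hypothetical name, and the assumption that the buffer is consumed contiguously, in sub-request order and row-major within each sub-request, reflects our reading of the interface rather than a guarantee.

```python
def put_varn(var, starts, counts, buf):
    """Scatter `buf` into the 2-D list `var`, one (start, count) pair at a time.

    Hypothetical emulation for illustration. Each sub-request is a
    rectangular block: `starts[i]` is its (row, col) origin and
    `counts[i]` is its (rows, cols) extent. Returns the number of
    elements taken from `buf`.
    """
    pos = 0
    for (row0, col0), (nrows, ncols) in zip(starts, counts):
        for r in range(row0, row0 + nrows):
            for c in range(col0, col0 + ncols):
                var[r][c] = buf[pos]
                pos += 1
    return pos


# Two sub-requests against one 4x4 variable, served from a single buffer.
var = [[0] * 4 for _ in range(4)]
written = put_varn(var, starts=[(0, 0), (2, 1)], counts=[(1, 2), (2, 2)],
                   buf=[1, 2, 3, 4, 5, 6])
```

The point of such an interface is that several noncontiguous blocks can be expressed in one call instead of one call per block.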

A Note About Large File Support

The CDF (or CDF-1) file format was used by the NetCDF library through version 3.5.1.

Starting from 3.6.0, the serial NetCDF library added support for the CDF-2 format. With this format, even 32-bit platforms can create NetCDF files greater than 2GB in size. CDF-2 also allows more special characters in the name strings of defined dimensions, variables, and attributes. The support was based largely on work from Greg Sjaardema.
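
The 2GB boundary follows from the width of the file offsets stored in the header. As a small sketch, assuming (per the netCDF classic format description) that CDF-1 stores each variable's starting offset as a signed 32-bit big-endian integer and CDF-2 widens it to 64 bits, the following shows which offsets each width can represent. The function names are hypothetical.

```python
import struct


def fits_32bit_offset(offset):
    """True if `offset` can be packed as a signed 32-bit big-endian integer."""
    try:
        struct.pack(">i", offset)
        return True
    except struct.error:
        return False


def fits_64bit_offset(offset):
    """True if `offset` can be packed as a signed 64-bit big-endian integer."""
    try:
        struct.pack(">q", offset)
        return True
    except struct.error:
        return False


TWO_GIB = 2**31  # first byte offset that no longer fits in 32 signed bits
print(fits_32bit_offset(TWO_GIB - 1), fits_32bit_offset(TWO_GIB))
print(fits_64bit_offset(TWO_GIB))
```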

As of PnetCDF 0.9.2, we ship with support for the large file sizes specified in the CDF-2 format. See README.large_files in the source tree for more information.

Starting from 1.3.0, PnetCDF supports the CDF-5 file format, which adds unsigned and 64-bit integer data types and allows variables with more than 2^32 array elements.
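
To tell which of the formats described in this section a given file uses, one can inspect its first four bytes: the ASCII letters "CDF" followed by a version byte of 1, 2, or 5. The sketch below assumes that layout; `cdf_format` is a hypothetical helper, not a PnetCDF function. It returns None for anything else, including HDF5-based netCDF-4 files.

```python
CDF_FORMATS = {1: "CDF-1", 2: "CDF-2", 5: "CDF-5"}


def cdf_format(header):
    """Map the first bytes of a file to a CDF format name, or None.

    `header` is a bytes object holding at least the first four bytes
    of the file. Hypothetical helper for illustration only.
    """
    if len(header) < 4 or header[:3] != b"CDF":
        return None
    return CDF_FORMATS.get(header[3])


def cdf_format_of_file(path):
    """Read the first four bytes of `path` and classify them."""
    with open(path, "rb") as f:
        return cdf_format(f.read(4))
```

For example, `cdf_format(b"CDF\x02")` classifies the header of a CDF-2 file without needing any netCDF library.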

File and Variable Limits

Both PnetCDF and NetCDF share limitations on file and variable sizes. More information can be found on the FileLimits page.

Required Software

PnetCDF requires an MPI implementation with MPI-IO support, which most MPI libraries now provide. A parallel file system also goes a long way toward achieving the highest performance.

Related Projects

PnetCDF makes use of several other technologies.

  • ROMIO, an implementation of MPI-IO, provides optimized collective and noncontiguous operations. It also provides an abstract interface for a large number of parallel file systems.
  • One of the file systems ROMIO supports is PVFS, a high-performance parallel file system for Linux clusters.

Today, there are several options for high level I/O libraries. Here are some discussions on the role of PnetCDF in this ecosystem:

Mailing List

We discuss the design and use of the PnetCDF library on the parallel-netcdf@mcs.anl.gov mailing list. Anyone interested in developing or using PnetCDF is encouraged to join. Visit the list information page for details.

The URL for the list archive is http://lists.mcs.anl.gov/pipermail/parallel-netcdf/. Even older messages can be browsed in the older mailing list archives.

Project Members

  • Rob Latham, Rob Ross, and Rajeev Thakur (Argonne National Lab)
  • Wei-keng Liao, Seung Woo Son, and Alok Choudhary (Northwestern University)
  • Kui Gao (formerly a postdoc at Northwestern, now at Dassault Systèmes Simulia Corp.)
  • Jianwei Li (Northwestern, graduated in 2006)
  • Bill Gropp (formerly ANL, now UIUC)

Citations

When referring to the Parallel netCDF project, please use the following URLs:

If you are looking for a reference to use in a published paper, please cite our SC2003 paper below.

Acknowledgements

Original Parallel netCDF development was sponsored by the Scientific Data Management Center (SDM) under the DOE program of Scientific Discovery through Advanced Computing (SciDAC). It was also supported in part by the National Science Foundation under the SDCI HPC program award number OCI-0724599 and the HECURA program award number CCF-0938000. Ongoing maintenance is funded by the Scientific Data, Analysis, and Visualization (SDAV) Institute under the SciDAC program.