Showing posts from July, 2017

What is NetCDF, and When Should You Use It

NetCDF is a data format for science and engineering data. It is also a set of free and open software libraries and tools that support the format. NetCDF works better than a database for most science and engineering data, which usually does not fit well into the relational database model. The Earth science community has used NetCDF for decades to store and share weather and climate data, and other science and engineering communities use it as well.

Unique Needs for Science Data

Science data needs are different from the data needs of commercial entities, like Google or Amazon.

Why Not Use Databases?

Databases are all about tables of data. Each table is a two-dimensional array of fields. For example, a table of customer data at Amazon will have your customer ID number, which allows them to look up your record, and then fields like "first name", "last name", "street address 1", "street address 2", e

Building NetCDF for HPC

Image
Building netCDF from scratch on a High Performance Computing (HPC) platform is a challenge. There are a lot of other libraries involved. This diagram shows all the possible third-party libraries in a netCDF build (the yellow boxes):

Why Build from Source?

Most users do not need to build the netCDF library from source. On most HPC systems, netCDF will already be installed somewhere. Contact your sysadmins and ask how to use it. Sometimes you need to build from source, because:

You need the latest version, and the sysadmins haven't installed it.
You need some combination of tools and versions that is not already installed.
You are the sysadmin, and you have to build the libraries so that your users don't have to.
You want to have full control and understanding of the build process.

Building with Autotools

The libraries we need to build all have standard "autotools" build systems. This means that the developers use autoconf