From Ocean Teacher Library
Self-Describing Formats
Background
- These formats are the operational formats in use today by the world's meteorological community (GRIB, BUFR), the satellite community (HDF) and the newly-developed ocean observing systems (NetCDF). Three of them are suitable for gridded or raster data (GRIB, HDF, NetCDF) and two of them are suited for data reports (BUFR, NetCDF). They contain extensive internal metadata, hence the group name, providing user systems with all the information needed for both data discovery and practical usage. Recent advancements that indicate a fusion of these technologies are noted below.
- WMO calls BUFR and GRIB table-driven code forms, because they require the use of many standard code tables (see the WMO Codes reference below). The global meteorological community has led the development of data standards, such as code tables, and in recent months the ocean community has begun to look toward these same principles.
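The idea of a table-driven code form can be sketched in a few lines of Python. The message carries only a descriptor and a raw integer; the name, unit and scaling come from a shared table. This is a minimal illustration, not a BUFR decoder: the one-entry table, the `TableEntry` type and the `decode` function are hypothetical, and real entries must be taken from the official WMO code tables.

```python
from typing import NamedTuple, Optional, Tuple


class TableEntry(NamedTuple):
    name: str
    unit: str
    scale: int       # decimal scale: value = (raw + reference) / 10**scale
    reference: int   # reference value added to the raw integer
    width_bits: int  # number of bits the raw value occupies in the message


# Hypothetical one-entry table keyed by an (F, X, Y) descriptor.
# Illustrative only; check real entries against the WMO tables.
TABLE_B = {
    (0, 12, 1): TableEntry("temperature", "K", 1, 0, 12),
}


def decode(descriptor: Tuple[int, int, int], raw: int) -> Tuple[str, Optional[float], str]:
    """Turn a raw integer into a physical value using the shared table."""
    entry = TABLE_B[descriptor]
    if raw == (1 << entry.width_bits) - 1:
        # All bits set is treated as "missing" in this sketch.
        return entry.name, None, entry.unit
    value = (raw + entry.reference) / 10 ** entry.scale
    return entry.name, value, entry.unit


print(decode((0, 12, 1), 2932))
```

Because both the producer and the user hold the same table, the data stream itself can stay compact while remaining fully interpretable.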
Binary Universal Form for the Representation of Meteorological Data (BUFR)
Please read the sub-article BUFR and GRIB Formats
Character Form for the Representation and Exchange of Data (CREX)
ASCII analog to BUFR. [Further information needed]
Gridded Binary (GRIB, GRB, GRB1, GRB2)
Please read the sub-article BUFR and GRIB Formats
Hierarchical Data Format (HDF, HDF4, HD4, HDF5, HD5)
Due to its extremely widespread and long-term use within the remote sensing community, HDF has experienced evolution in form, resulting in some issues about format and use that must be addressed. Many thanks to the HDF Group for the material below on Format Issues and for some of the resources cited.
HDF Format Issues
- HDF was originally developed as a robust, standard format for gridded data ranging in scales from planetary surface scans down to electron microscope scenes. It remains one of the principal formats for distribution of Earth Observing System (EOS) data from US NASA.
- There are two different versions of HDF: HDF4 and HDF5.
- HDF4 is the original HDF format and HDF5 is a completely new HDF format.
- Some software programs can accommodate both HDF4 and HDF5, but in general the switch to HDF5 involves adopting new systems.
- Both are very general and can be used for almost any kind of data.
- HDF4 has been the most widely used version during the past two decades for data publications from NASA. Apparently it is still used exclusively by NASA's Ocean Color Web.
- HDF-EOS
- In order to standardize their use of HDF for a particular kind of data, it is common for users to specify just how that data should be organized in either HDF4 or HDF5, and to produce software that understands that organization and hides it from the user.
- This has been done by EOS for earth science data.
- EOS has defined a data model called HDF-EOS, which defines certain kinds of earth science data objects, and specifies how to organize them in HDF4 and HDF5.
- So, you can think of HDF-EOS as a collection of earth science data objects, and there are many tools for accessing HDF-EOS files.
- These Earth Observing Systems (EOS) extensions are supposed to be adopted by all US NASA systems, but there are unfortunately some hold-outs.
- HDF-EOS2 and HDF-EOS5
- There are two implementations of HDF-EOS: HDF-EOS2 (which uses HDF4) and HDF-EOS5 (which uses HDF5).
- When you receive an HDF-EOS file, you usually do not need to worry about which format it uses. The software that is available for working with HDF-EOS files usually works with both kinds.
- HDF-EOS2 is used operationally by MODIS, MISR, ASTER, Landsat, AIRS and other EOS instruments.
- HDF-EOS5 is used only for EOS Aura instruments at present.
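Because HDF4 and HDF5 are distinct on-disk formats, the first bytes of a file usually reveal which one it is. The following Python sketch checks the well-known leading signatures; the function names are hypothetical. It only looks at offset 0, so an HDF5 file that begins with a user block would be reported as unknown, and it cannot tell plain HDF from HDF-EOS.

```python
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"  # 8-byte HDF5 signature
HDF4_SIGNATURE = b"\x0e\x03\x13\x01"   # 4-byte HDF4 magic number


def sniff_format(head: bytes) -> str:
    """Guess the container format from the first bytes of a file."""
    if head.startswith(HDF5_SIGNATURE):
        # NetCDF-4 files are stored as HDF5, so they match here too.
        return "HDF5"
    if head.startswith(HDF4_SIGNATURE):
        return "HDF4"
    if head[:3] == b"CDF" and head[3:4] in (b"\x01", b"\x02", b"\x05"):
        return "NetCDF classic"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first 8 bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return sniff_format(handle.read(8))


print(sniff_format(b"\x0e\x03\x13\x01\x00\x10"))
```

A call such as `sniff_file("granule.hdf")` (a hypothetical file name) can be run before choosing between HDF4 and HDF5 tools.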
HDF Usage Issues
- The current status of HDF use is complicated by these factors:
- Many software programs do not state specifically which version of HDF they can accommodate, and conversely many data sites don't clearly state which version their files use.
- Possible misunderstandings and disagreements about exact format specifications (resulting in incorrect/hybrid forms).
- Georeferencing methods for Level 1 and 2 data differ from those for Level 3 and 4 data.
- HDF Use Recommendation
- HDF use is a critical skill in the toolkit of marine data managers, but due to the above factors it is never easy, particularly so if a PC/Windows system is the only available computer platform.
- When HDF use is necessary, due to the desirability of the data, it is usually possible to use HDFView to convert regular HDF grids (i.e. L3 and L4) to TXT, which should then be converted to a widely used grid format, such as either the ASCII or the binary version of the ESRI gridded data format. Swath data (L2) may be accommodated by the software program Panoply, and/or HDF-EOS data may be accommodated by the software program HEG. Otherwise, specific software recommendations given with the data products may be useful.
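The final step of that route, writing a regular grid as an ESRI ASCII grid, can be sketched as follows. This is only an illustration under stated assumptions: the grid has square, evenly spaced cells, the input rows run from south to north, the coordinates given are cell centers, and the function name, arguments and sample values are hypothetical.

```python
def to_esri_ascii(rows, west_center, south_center, cellsize,
                  missing=-1.0e34, nodata=-9999):
    """Format a south-to-north list of rows as ESRI ASCII grid text."""
    header = [
        f"ncols {len(rows[0])}",
        f"nrows {len(rows)}",
        # The header wants the lower-left corner, not the cell center.
        f"xllcorner {west_center - cellsize / 2}",
        f"yllcorner {south_center - cellsize / 2}",
        f"cellsize {cellsize}",
        f"NODATA_value {nodata}",
    ]
    body = []
    for row in reversed(rows):  # the format lists the northernmost row first
        body.append(" ".join(str(nodata if v == missing else v) for v in row))
    return "\n".join(header + body) + "\n"


# Tiny made-up grid: two rows (south first), three columns.
sample = [
    [17.2, 17.1, -1.0e34],
    [18.9, 18.4, 18.4],
]
text = to_esri_ascii(sample, west_center=359.0, south_center=-37.0, cellsize=2.0)
print(text)
```

The row reversal matters: many text dumps list the southernmost row first, while the ESRI ASCII format expects the northernmost row first, so skipping it produces a grid flipped north to south.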
Network Common Data Form (NetCDF, NC, NC4, NCML)
- NetCDF was developed principally for array data (i.e. grids), but it has been extended to measurement data, much as BUFR is used. It is widely used in the climate, weather and marine communities, and there are indications that it will play a large role in the emerging global ocean observing systems. Recently NetCDF 4.0 was released, incorporating HDF5, representing the first union of major formats. NetCDF has an ASCII analog format, CDL, that can be easily "compiled" to NetCDF.
- Apparently NetCDF is now being routinely used in some global remote sensing programs, for example the Group for High Resolution Sea Surface Temperature (GHRSST). Because NetCDF development has not experienced quite so many "version" problems as HDF (although there have been some issues), its use greatly furthers compatibility between data products and applications.
- In development is an ASCII variant of NetCDF, similar to the CDL format (below) but written with XML syntax, called NetCDF Markup Language (NCML). An introductory level reference is provided below.
- NetCDF Use Recommendation
- Well-formed NetCDF grid files present very few difficulties when used with a wide variety of visualization and analysis programs. Capture of the basic grid within the file can be accomplished by exporting a CDL file (from ncBrowse) or by simple cut and paste from the data view in Panoply (using the displayed geographic coordinates). Either route enables easy creation of floating point TIF files for a GIS system, i.e. for WMS, after simple conversions in Saga. Exactly subsetted images can now be created with Panoply, but the only export mode for the geo-registered images is KMZ, unfortunately. The entire page (image plus labeling) can be saved and geo-registered with the Georeferencing Tool.
- Available files:
- None at this time
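The "compiling" of CDL to NetCDF mentioned above is normally done with the `ncgen` utility from the NetCDF distribution, and the reverse dump with `ncdump`. The Python sketch below only wraps those two command lines. The helper names and file names are hypothetical, and it assumes `ncgen` is installed and on the PATH.

```python
import shutil
import subprocess


def ncgen_command(cdl_path: str, nc_path: str) -> list:
    """Command line that compiles a CDL text file into a NetCDF file."""
    return ["ncgen", "-o", nc_path, cdl_path]


def ncdump_command(nc_path: str) -> list:
    """Command line that prints a NetCDF file as CDL text on stdout."""
    return ["ncdump", nc_path]


def compile_cdl(cdl_path: str, nc_path: str) -> None:
    """Run ncgen if it is available; otherwise raise a clear error."""
    if shutil.which("ncgen") is None:
        raise RuntimeError("ncgen was not found on the PATH")
    subprocess.run(ncgen_command(cdl_path, nc_path), check=True)


print(" ".join(ncgen_command("example.cdl", "example.nc")))
```

Calling `compile_cdl("example.cdl", "example.nc")` would then produce the binary file, provided the CDL text is complete and well formed.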
Common Data Language (CDL)
The CDL format is the ASCII analog to NetCDF (above). Both are designed primarily to hold grids, although recently they have been extended to hold measurement data. When a CDL file contains a grid, the grid dimensions are not necessarily Cartesian, so the coordinates of the cell values are given in separate longitude (COADSX) and latitude (COADSY) lists. Notice in this example file of air temperature offshore Namibia, that there is a large header containing useful metadata, a feature CDL shares with NetCDF.
netcdf coads_airT_annu_namib {
dimensions:
TIME = UNLIMITED ; // (1 currently)
COADSY27_38 = 12 ;
COADSX170_181 = 12 ;
variables:
double TIME(TIME);
TIME:units = "hour since 0000-01-01 00:00:00";
TIME:time_origin = "01-JAN-0000 00:00:00";
TIME:modulo = " ";
TIME:axis = "T";
double COADSY27_38(COADSY27_38);
COADSY27_38:units = "degrees_north";
COADSY27_38:point_spacing = "even";
COADSY27_38:axis = "Y";
double COADSX170_181(COADSX170_181);
COADSX170_181:units = "degrees_east";
COADSX170_181:modulo = " ";
COADSX170_181:point_spacing = "even";
COADSX170_181:axis = "X";
float AIRT(TIME, COADSY27_38, COADSX170_181);
AIRT:missing_value = -1.0E34; // float
AIRT:_FillValue = -1.0E34; // float
AIRT:long_name = "AIR TEMPERATURE";
AIRT:history = "From coads_climatology";
AIRT:units = "DEG C";
data:
TIME = 366.0 ;
COADSY27_38 = -37.0, -35.0, -33.0, -31.0, -29.0, -27.0, -25.0, -23.0, -21.0,
-19.0, -17.0, -15.0 ;
COADSX170_181 = 359.0, 361.0, 363.0, 365.0, 367.0, 369.0, 371.0, 373.0,
375.0, 377.0, 379.0, 381.0 ;
AIRT = 17.228333, 17.065, 17.455263, 16.346666, 17.512499, 16.987143,
17.545, 17.392857, 18.278461, 18.636896, 19.393158, 20.12606, 18.900278,
18.434546, 18.449444, 18.503714, 18.595135, 18.457222, 18.675499,
18.710697, 19.071627, 19.72925, 19.780909, 20.680454, 20.247097,
20.205555, 20.416842, 19.726, 19.536154, 19.536154, 19.85093, 19.870714,
19.926363, 19.161818, 18.026363, -1.0E34, 21.402308, 21.224167, 21.257647,
21.004103, 20.88439, 20.502619, 20.328604, 20.34159, 20.045227, 19.30814,
18.785713, -1.0E34, 22.426786, 22.085554, 21.621315, 21.5655, 21.184048,
20.894545, 20.68186, 20.453863, 19.682499, 17.732187, -1.0E34, -1.0E34,
22.565641, 22.434633, 22.128809, 21.716743, 21.435226, 20.857273,
20.561363, 20.263409, 17.732925, -1.0E34, -1.0E34, -1.0E34, 22.782927,
22.277618, 22.16744, 21.9075, 21.535814, 21.021135, 20.62659, 18.51375,
16.666666, -1.0E34, -1.0E34, -1.0E34, 22.719025, 22.728636, 22.302273,
22.170513, 21.755814, 21.232044, 20.676285, 18.70317, 17.25389, -1.0E34,
-1.0E34, -1.0E34, 22.673489, 22.640232, 22.737429, 22.063095, 21.708635,
21.38128, 20.5005, 19.43775, 20.286999, -1.0E34, -1.0E34, -1.0E34,
22.922045, 22.576841, 22.574652, 22.175226, 21.924318, 21.463783,
20.031794, 19.62697, -1.0E34, -1.0E34, -1.0E34, -1.0E34, 23.318485,
22.988647, 22.597273, 22.291136, 21.959486, 21.70975, 20.270811, -1.0E34,
-1.0E34, -1.0E34, -1.0E34, -1.0E34, 23.486755, 22.993954, 22.83909,
22.902895, 22.845121, 22.681786, 22.34054, 23.177826, -1.0E34, -1.0E34,
-1.0E34, -1.0E34 ;
}
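Since CDL is plain text, the values in the data section can be recovered with very little code. The sketch below is a rough illustration rather than a full CDL parser: it assumes a simple numeric data section like the one above, with no comments or quoted strings. The function names and the shortened sample text are hypothetical. It relies on the CDL convention that the last dimension (here longitude) varies fastest.

```python
import re


def parse_cdl_data(cdl_text):
    """Return {variable name: list of floats} from the data: section."""
    _, _, data = cdl_text.partition("data:")
    values = {}
    # Each assignment looks like "NAME = v1, v2, ... ;" and may span lines.
    for name, body in re.findall(r"(\w+)\s*=\s*([^;]*);", data):
        values[name] = [float(tok) for tok in body.replace(",", " ").split()]
    return values


def to_rows(flat, ncols, missing=-1.0e34):
    """Split a flat value list into rows, replacing the fill value with None."""
    cells = [None if v == missing else v for v in flat]
    return [cells[i:i + ncols] for i in range(0, len(cells), ncols)]


# Shortened, made-up sample in the same layout as the file above.
SAMPLE = """
netcdf toy {
dimensions:
  Y = 2 ;
  X = 3 ;
variables:
  double Y(Y);
  double X(X);
  float AIRT(Y, X);
data:
 Y = -37.0, -35.0 ;
 X = 359.0, 361.0, 363.0 ;
 AIRT = 17.2, 17.1, -1.0E34,
   18.9, 18.4, 18.4 ;
}
"""

data = parse_cdl_data(SAMPLE)
grid = to_rows(data["AIRT"], ncols=len(data["X"]))
for lat, row in zip(data["Y"], grid):
    for lon, value in zip(data["X"], row):
        print(lat, lon, value)
```

Pairing each row with its latitude and each cell with its longitude, as in the final loop, is what turns the flat list of numbers back into georeferenced values.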
ENVISAT Format
The ENVISAT format is actually a family of closely related formats, developed within a common schema for representation of data from the eponymous satellite platform. ENVISAT products all follow a generalized structure consisting of:
- A Main Product Header (MPH); inspection of sample files indicates the MPH is often ASCII
- A Specific Product Header (SPH) containing information specific to the whole product plus one or more Data Set Descriptors (DSDs) which describe individual Data Sets; often ASCII
- One or more Data Sets (DSs), each consisting of one or more Data Set Records (DSRs); often binary.
Consult the references below for detailed information.
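Where a header is ASCII, it can be inspected with ordinary text handling. The Python sketch below assumes the header is a series of KEYWORD=VALUE lines, which should be verified against the product specifications. The function name is hypothetical, and the sample text and its keywords are made up for illustration, not taken from a real product.

```python
def parse_ascii_header(text):
    """Collect KEYWORD=VALUE lines into a dictionary of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep or not key.strip():
            continue  # skip blank or spare lines
        fields[key.strip()] = value.strip().strip('"')
    return fields


# Made-up sample header; real keywords and layouts are in the specifications.
SAMPLE_HEADER = (
    'PRODUCT="EXAMPLE_PRODUCT_NAME"\n'
    'NUM_DSD=+0000000003\n'
    '\n'
)

header = parse_ascii_header(SAMPLE_HEADER)
print(header["PRODUCT"], int(header["NUM_DSD"]))
```

The binary Data Sets that follow the headers cannot be read this way and need the record layouts given in the specifications.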
Additional Resources
- Wikipedia: BUFR
- Wikipedia: GRIB - This article is patently out of date, and actually misleading (when it refers to a GRIB1 subversion 2, a possible misreading of the headings on the WMO page listed below). It is listed here only for the reader's information, and in hopes that it will be updated/corrected eventually.
- Wikipedia: HDF
- Wikipedia: NetCDF
- WMO GRIB1 and GRIB2 Information Page - Links to principal documentation
- WMO Codes: Operational Codes TAC - BUFR - CREX - GRIB
- GRIB2 Use at the (US) National Centers for Environmental Prediction - Extensive documentation on GRIB2 implementation in the US
- The HDF Group - Homepage for the HDF4 format
- Home Page for HDF5
- NetCDF (home page)
- What versions of HDF and HDF-EOS are available and how are they different?
- Software for Manipulating or Displaying NetCDF Data
- HDF4 and HDF5 Tools and Utilities
- NetCDF Markup Language - UNIDATA's description of the new development
- Transition from HDF5 for HDF4 Users: status and goals - Differences between HDF4 and HDF5
- How is HDF5 different than HDF4?
- HDF5 for HDF4 Users: a short guide
- HDF vs. HDF5
- ENVISAT Product Specifications - Overall program document
- ENVISAT Product Structures - Volume 5 in the above reference, covering physical data structures
Subsections of this Article
| Pagename | Short title | Description | |
|---|---|---|---|
| BUFR and GRIB Formats | BUFR and GRIB Formats | BUFR and GRIB | none |
Information about this article
Short title: Self-Describing Formats
Description: These formats contain extensive internal metadata, which provides user systems with all the information needed for both use and discovery. Station data, grids and rasters can be accommodated in these formats.
Expertise level: beginner
Author: Murray.Brown
Approval status: approved
Approved by: Murray.Brown
Last change: 2012-1-13
Subsection of: Marine Data Format Types