From Ocean Teacher Library

Gridded Data

Background

Gridded data (or raster data) are the result of converting scattered individual data points (from a single observed surface) into a regular "grid" (or "raster") of calculated, hypothetical values. This conversion process is called gridding, and there are many methods and algorithms to grid marine data. The resulting grid is easier to analyze and display than the original scattered points, and grids (or more properly their graphical displays) are universally expected by scientists viewing earth science surveys. There are many different Raster and Grid Formats to publish and store gridded data, but all methods allow for specifying exactly where the points would lie on the real earth or a surface within the ocean, atmosphere or solid rock. This specification is called mapping, and there are many ways to accomplish it, based on the data format itself. Many environmental data grids are found in very complex Self-Describing Formats requiring specialized software for reading and analysis.

Gridding

There are many mathematical methods to create the grid, and the topic is beyond the scope of this article. In short, various algorithms are available to examine data points near ("in the neighborhood of") the desired fixed points of the regular grid, and to calculate the hypothetical value for each gridpoint. These algorithms often employ weighting methods to emphasize the data near the point, compared to more distant values. Most modern gridding programs provide a set of gridding algorithm choices. Grids can be Cartesian or rectilinear.
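
As an illustration of neighborhood weighting, the following is a minimal sketch of inverse-distance weighting, one of the simpler gridding algorithms. It is not the method of any particular gridding program; the function name `idw_grid`, the search `radius` and the weighting `power` are assumptions chosen for this example.

```python
import math

def idw_grid(points, x0, y0, dx, dy, ncols, nrows, radius, power=2.0):
    """Grid scattered (x, y, value) points by inverse-distance weighting.

    Returns a list of rows of gridpoint values. A gridpoint with no data
    within `radius` is set to None rather than given an invented value.
    """
    grid = []
    for j in range(nrows):
        row = []
        for i in range(ncols):
            gx, gy = x0 + i * dx, y0 + j * dy      # location of this gridpoint
            weight_sum = value_sum = 0.0
            exact = None
            for px, py, val in points:
                d = math.hypot(px - gx, py - gy)
                if d == 0.0:                       # data point sits on the gridpoint
                    exact = val
                    break
                if d <= radius:                    # only the neighborhood contributes
                    w = 1.0 / d ** power           # nearer points weigh more
                    weight_sum += w
                    value_sum += w * val
            if exact is not None:
                row.append(exact)
            elif weight_sum > 0.0:
                row.append(value_sum / weight_sum)
            else:
                row.append(None)
        grid.append(row)
    return grid

# Two observations gridded onto three gridpoints along a line.
points = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
grid = idw_grid(points, x0=0.0, y0=0.0, dx=1.0, dy=1.0,
                ncols=3, nrows=1, radius=5.0)
```

The middle gridpoint is equally distant from both observations, so it receives their plain average. Shrinking `radius` below 1.0 would leave that gridpoint empty, which is usually preferable to extrapolating.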

Data Values Within the Grid

A grid file stores its number values either as ASCII text or in a binary number format. Either kind of file commonly begins with a header of ASCII text information.
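
A short sketch of reading an ASCII grid that begins with a text header is given below. The header layout ("keyword value" on each line, with `ncols` and `nrows` keywords) is a hypothetical one assumed for this example, and the function name `parse_ascii_grid` is likewise an assumption; real formats differ and must be checked against their documentation.

```python
def parse_ascii_grid(text, header_lines):
    """Split an ASCII grid into a header dictionary and rows of values.

    Assumes (hypothetically) that each header line is 'keyword value'
    and that the header contains 'ncols' and 'nrows'.
    """
    lines = text.splitlines()          # accepts DOS (\r\n) and UNIX (\n) endings
    header = {}
    for line in lines[:header_lines]:
        key, value = line.split()
        header[key.lower()] = float(value)
    ncols, nrows = int(header["ncols"]), int(header["nrows"])

    # Everything after the header is data, read row-by-row.
    numbers = [float(tok) for line in lines[header_lines:] for tok in line.split()]
    if len(numbers) != ncols * nrows:
        raise ValueError("header dimensions do not match the number of values")
    rows = [numbers[r * ncols:(r + 1) * ncols] for r in range(nrows)]
    return header, rows

sample = "ncols 3\r\nnrows 2\r\n1 2 3\r\n4 5 6\r\n"
header, rows = parse_ascii_grid(sample, header_lines=2)
```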

Grid Resolution

This term refers to the "fineness" of the grid, or how small the grid cells are specified. All types of earth science data have different inherent spatial scales, depending on the phenomenon of interest. And different real-world datasets may or may not contain enough data (in the spatial sense) to support gridding that accurately portrays features. There are no hard rules that cover the selection of grid scales for each and every type of earth science data, but some general rules of thumb may be helpful:

  • Features of known scale, such as meso-scale eddies in the ocean or synoptic atmospheric weather patterns, cannot be well resolved when the grid cell size is more than about 1/4 of the physical scale. In other words, you must have a minimum of 3-5 grid cells within a feature to display it reasonably accurately.
  • For general gridding purposes, regardless of the inherent physical scales of the natural features, the total number of grid cells should not be more than about 4 times the total number of data points when the points are evenly distributed; unevenly distributed data points call for fewer grid cells.

When data are limited, these "rules" tend to move the grid cells toward larger sizes, which is the preferred direction. Specifying very small cells is sometimes proposed as a substitute for sufficient good data, which is poor analysis practice.
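
The two rules of thumb can be turned into rough numbers, as in the sketch below. It assumes square cells and evenly distributed points, and the function names are invented for this example; the factors of 4 come from the rules of thumb above and are guidance, not hard limits.

```python
import math

def largest_cell_for_feature(feature_scale, cells_across=4):
    """Largest cell size that still puts about `cells_across` cells in a feature."""
    return feature_scale / cells_across

def smallest_cell_for_data(area, n_points, cells_per_point=4.0):
    """Smallest square cell size that keeps the cell count within
    about `cells_per_point` times the number of data points."""
    max_cells = cells_per_point * n_points
    return math.sqrt(area / max_cells)

# A 100 km x 100 km survey with 400 evenly spread points and 100 km eddies.
upper = largest_cell_for_feature(100.0)              # km
lower = smallest_cell_for_data(100.0 * 100.0, 400)   # km
data_support_feature = lower <= upper
```

If the smallest cell the data can support is larger than the largest cell that resolves the feature, the dataset is too sparse for that feature, and a finer grid will not fix it.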

Grid Scale

The size of the grid cells is also sometimes referred to as the "scale of the grid," so it is useful to keep a careful eye on the stated scales of data grids you receive or create. The following extremely rough table is provided for general guidance. It holds for north-south distances at any latitude, but for east-west distances it is only nearly correct at the equator and becomes progressively more incorrect as you approach the poles, where a degree of longitude shrinks toward zero:

  • 1 degree = 60 nautical miles ≈ 111 km (often rounded to 100 km)
  • 0.1 degree ≈ 10 km
  • 0.05 degree ≈ 5 km
  • 0.01 degree ≈ 1 km

For example, the commonly encountered US NASA MODIS imagery at stated grid scales of "9 km" and "4 km" is most similar to grids with resolutions of 0.1 degree and 0.05 degree, respectively.
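
The rough conversion and its dependence on latitude can be sketched as follows. The sketch assumes a spherical earth on which one minute of latitude is one nautical mile (1.852 km); the function name `degrees_to_km` and its arguments are assumptions made for this example.

```python
import math

KM_PER_NAUTICAL_MILE = 1.852   # definition of the international nautical mile

def degrees_to_km(deg, latitude=0.0, east_west=False):
    """Approximate ground distance spanned by `deg` degrees.

    North-south distances do not depend on latitude in this simple model;
    east-west distances shrink with the cosine of the latitude.
    """
    km = deg * 60.0 * KM_PER_NAUTICAL_MILE
    if east_west:
        km *= math.cos(math.radians(latitude))
    return km

north_south = degrees_to_km(1.0)                                # about 111 km
at_60_north = degrees_to_km(1.0, latitude=60.0, east_west=True) # about half that
```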

Multiple Grids

Gridded data files commonly contain more than a single grid; in fact the term Scientific Data Set (SDS) typically means large, multi-parameter files containing more than one grid. In such cases, the grid for Parameter 1 can completely precede the grid for Parameter 2, and so on. An alternative structure places all the data values for Parameter 1 through Parameter N in sequence for a single XY grid location, then moves on to the next grid location, where the Parameter 1 through Parameter N sequence is repeated, and so on.

If separate XY grids (for any number of parameters, as described above) are available for different Z (depth) levels, then these grids can follow one another within the file, going for example from top to bottom. This scheme can be combined with the multiple-parameter case in many ways, using sequences of XYZ dimensions and parameter dimensions that suit the user's software and offer a logical method to analyze the data.

There is no "correct" way to construct files of multiple data grids, just many options. It is therefore extremely important to document the sequence in which the dimensions (XYZ location, time, parameters) are "read." Software programs must be directed to input these data in looping fashion, going from the fastest-changing dimension (usually, but not always, the parameters) to the slowest-changing dimension (often time).
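
The difference between the two layouts comes down to which dimension changes fastest. The sketch below computes the position of one value in a flat sequence of values for each layout; the function names are invented for this example, indices are counted from 0, and rows are assumed to be stored one after another.

```python
def offset_parameter_sequential(p, row, col, nrows, ncols):
    """Whole grid for parameter 0, then the whole grid for parameter 1, ...
    Column changes fastest, then row, then parameter."""
    return (p * nrows + row) * ncols + col

def offset_parameter_interleaved(p, row, col, ncols, nparams):
    """All parameters for one grid location, then the next location, ...
    Parameter changes fastest, then column, then row."""
    return (row * ncols + col) * nparams + p

# Two parameters on a 2-row by 3-column grid.
# Parameter 0 holds 0..5 and parameter 1 holds 100..105.
sequential = [0, 1, 2, 3, 4, 5, 100, 101, 102, 103, 104, 105]
interleaved = [0, 100, 1, 101, 2, 102, 3, 103, 4, 104, 5, 105]

# The same value (parameter 1, row 1, column 0) found in each layout.
a = sequential[offset_parameter_sequential(1, 1, 0, nrows=2, ncols=3)]
b = interleaved[offset_parameter_interleaved(1, 1, 0, ncols=3, nparams=2)]
```

Reading a file with the wrong layout raises no error; it simply produces scrambled grids, which is why the dimension order has to be documented.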

Vector Grids

To represent vectors (literally arrows showing the direction of flow) in ocean and meteorological datasets, two methods have been devised, both requiring two separate grids:

  • Provide the U (eastward) and V (northward) vector components of the wind or current.
  • Provide the direction and magnitude (speed) of the wind or current.

The grids can be contained in separate files, or sequentially listed in the same file.
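
Converting between the two representations is a matter of simple trigonometry, sketched below for a single grid cell. The sketch assumes U is positive eastward, V is positive northward, and direction is a compass bearing in degrees clockwise from north giving where the flow is heading. Wind directions are often reported instead as the direction the wind blows from, which differs by 180 degrees, so the convention of any given dataset must be checked. The function names are invented for this example.

```python
import math

def uv_to_speed_dir(u, v):
    """Components to (speed, direction), direction in degrees clockwise from north."""
    speed = math.hypot(u, v)
    direction = math.degrees(math.atan2(u, v)) % 360.0   # note the (u, v) order
    return speed, direction

def speed_dir_to_uv(speed, direction):
    """(speed, direction) back to eastward and northward components."""
    rad = math.radians(direction)
    return speed * math.sin(rad), speed * math.cos(rad)

# A current flowing due east at 1 unit of speed has a bearing of 90 degrees.
speed, direction = uv_to_speed_dir(1.0, 0.0)
```

Applying either function cell by cell to a pair of grids yields the other pair of grids.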

Using Grids, or The 13 Questions

Before any grid can be analyzed (or even mapped to the real world), you must understand what it contains and where it lies on the earth. Any or all of the following questions may need to be answered, either by the datafile format itself or by ancillary metadata accompanying the grid:

  1. Binary or ASCII?
  2. If ASCII, then DOS or UNIX line terminators?
  3. If Binary, then Endianness?
  4. If Binary, then Number Type?
  5. Number of columns?
  6. Number of rows?
  7. Spacing between columns (delta-X)?
  8. Spacing between rows (delta-Y)?
  9. XY coordinates of first point?
  10. Corner location of first point: northwest or southwest?
    1. Both starting points are commonly used, but NW is normal.
    2. Some grids are defined by their SW corner coordinates, but the data values begin in the NW corner!
  11. Reading order: row-by-row, or column-by-column?
    1. Row-by-row is extremely common
    2. Column-by-column is almost never used
  12. Factor by which the data may have been multiplied prior to storage in the file?
    1. Sometimes done to make the values easier to store, read or analyze
    2. If, for instance, you have a data value of 1.234 and multiply it by 1000, then the value stored in the grid would be 1234 (an integer value) and the factor would be recorded as 1000
  13. The order in which the dimensions vary in multiple grids (see above)?
    1. Usually from most-frequently repeating to least-frequently repeating.
    2. For a gridded file of a single parameter, the most-frequently repeating value, the data parameter value, is referred to as Order 0; the next value (usually the longitude) is Order 1, and the next value (usually the latitude) is Order 2. For a multi-parameter file, the data values could be Order 0 through Order N, the longitude would be Order N+1 and the latitude would be Order N+2. Etc.
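
To show how the answers to these questions become inputs to software, here is a minimal sketch of a reader for a single, headerless, row-by-row binary grid, using Python's standard `struct` module. The function name `read_binary_grid` and its parameter names are assumptions made for this example; the answers themselves must come from the format or its metadata, since none can be discovered reliably from the bytes alone.

```python
import struct

def read_binary_grid(raw, ncols, nrows, x0, y0, dx, dy,
                     byte_order="<", number_type="h",
                     first_row_is_north=True, scale_factor=1.0):
    """Decode one row-by-row binary grid from a bytes object.

    byte_order  : "<" little-endian or ">" big-endian          (question 3)
    number_type : struct code, e.g. "h" int16, "f" float32     (question 4)
    x0, y0      : coordinates of the first stored point        (question 9)
    scale_factor: factor the values were multiplied by         (question 12)
    """
    count = ncols * nrows
    fmt = f"{byte_order}{count}{number_type}"
    flat = struct.unpack(fmt, raw[:struct.calcsize(fmt)])

    # Undo the storage factor and cut the flat sequence into rows.
    values = [[flat[r * ncols + c] / scale_factor for c in range(ncols)]
              for r in range(nrows)]

    xs = [x0 + c * dx for c in range(ncols)]
    step = -dy if first_row_is_north else dy     # NW start reads southward
    ys = [y0 + r * step for r in range(nrows)]
    return xs, ys, values

# A 3-column by 2-row big-endian int16 grid, values stored multiplied by 1000.
raw = struct.pack(">6h", 1234, 2000, 3000, 4000, 5000, 6000)
xs, ys, values = read_binary_grid(raw, ncols=3, nrows=2, x0=-10.0, y0=50.0,
                                  dx=0.5, dy=0.5, byte_order=">",
                                  scale_factor=1000.0)
```

Supplying the wrong byte order or number type usually still returns numbers, just meaningless ones, so a quick check of the value range against what is physically plausible is worthwhile.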


Information about this article

Short title: Gridded Data

Description: What are data grids, and what are their physical/mathematical characteristics?

Expertise level: beginner

Author: Murray.Brown

Approval status: approved

Approved by: Murray.Brown

Last change: 2010-6-8

Subsection of: Numerical Data

This page was last modified on 8 June 2010, at 14:35. This page has been accessed 5,491 times.

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License