From Ocean Teacher Library
Raster and Grid Formats
Background
- In the earth sciences, a gridded data file is usually thought of as the set of numbers making up a rectilinear array (i.e. rows and columns) of parameter values, and the raster is sometimes thought of as a visualization of the grid.
- In computer graphics, a raster graphics image or bitmap, is a data structure representing a generally rectangular grid of pixels, or points of color, viewable via a monitor, paper, or other display medium. Raster images are stored in image files with varying formats. [From Wikipedia: Raster graphics]
- Basic units:
- Of the grid is the "cell" containing one numerical value of the measured parameter
- Of the raster is the "pixel" containing a specification for a particular color. The color value is derived from the corresponding cell value by a formula or a look-up table.
- Vector data files and raster/grid data files, when they are appropriately georeferenced (see below), constitute the basic data for Geographic Information Systems.
- Earth scientists are familiar with grid data covering a wide range of scales, from a simple grid that analyzes a set of scattered data points in a small region of the sea, to global satellite images. The shift from pure numbers to color values is seamless and often unrecognized, because most analysis programs have excellent automatic color-generating algorithms that obscure the process.
- Because it is almost impossible to separate the concepts involved here, both grids and rasters will be considered in this article.
- Except in the case of CDL, all these grids are constructed of evenly spaced rows and columns. This means that if you know the row spacing and the column spacing, the number of rows and columns, and the point of origin of the grid, then you can work out the exact final dimensions of the whole grid. This assumption is not always true today, especially with exotic model grids and projected data, so the CDL format allows the row and column locations to be specified individually.
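The relationship between origin, spacing and final dimensions can be sketched in a few lines of Python. This is only an illustration: the function name `grid_extent` is hypothetical, and it assumes the origin is given as the center of the lower-left cell. The numbers are taken from the WOCE header shown later in this article.

```python
def grid_extent(x_origin, y_origin, dx, dy, ncols, nrows):
    """Return (x_min, x_max, y_min, y_max) of the cell centers of a regular grid.

    Assumption: (x_origin, y_origin) is the center of the lower-left cell.
    """
    x_max = x_origin + (ncols - 1) * dx
    y_max = y_origin + (nrows - 1) * dy
    return (x_origin, x_max, y_origin, y_max)


# 720 x 360 cells, 0.5 degree spacing, first cell center at (0.25, -89.75)
extent = grid_extent(0.25, -89.75, 0.5, 0.5, 720, 360)
print(extent)
```

The result reproduces the longitude and latitude limits stated in that header (0.25 to 359.75 and -89.75 to 89.75).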
Grid Formats
Raw Binary
These grids are the simplest, because they consist of only a sequence of binary numbers, usually reading from west to east (in the geographic sense), and either from north to south or from south to north (as specified in the format description). Raw binary grids can also be encountered embedded within more complex structures, such as those described below and in Self-Describing Formats.
Raw Binary with Embedded ASCII Header
Prior to the widespread use of Self-Describing Formats, which contain formally prescribed internal descriptive tags, many important datasets were published containing informal internal ASCII descriptions at the beginning of the file. Users are expected to use ASCII editors to view the beginning of the file in order to find necessary usage information (e.g. number of rows, number of columns, etc.). One very important datum is the exact number of ASCII bytes to skip over before reading the binary grid values. Here is a typical example from an old WOCE windstress dataset. Just after the ASCII header, the binary part appears as meaningless ASCII characters:
    DATA SET : SSM/I WIND COMPONENTS, SPEED AND STRESS ;
    TIME AVERAGE : FIVE DAY ;
    TIME MIN MAX : 19900101 19900105 0001 0005 ;
    ARRAY SIZES : 720 360 ;
    LONGITUDE MIN MAX DELTA : 0.25 359.75 0.5 ;
    LATITUDE MIN MAX DELTA : -89.75 89.75 0.5 ;
    FORMAT EAST-COMP ARRAY : 720x360 1-BYTE UNSIGNED ;
    FORMAT NORTH-COMP ARRAY : 720x360 1-BYTE UNSIGNED ;
    FORMAT WIND SPEED ARRAY : 720x360 1-BYTE UNSIGNED ;
    FORMAT EAST-STRESS ARRAY : 720x360 2-BYTE SIGNED INTEGER (SGI) ;
    FORMAT NORTH-STRESS ARRAY : 720x360 2-BYTE SIGNED INTEGER (SGI) ;
    FORMAT COUNTER ARRAY : 720x360 1-BYTE UNSIGNED ;
    SCALING FOR COMP ARRAYS : WIND COMPONENT (M/S) = (X/4)-30 ;
    SCALING FOR WIND SPEED : WIND SPEED (M/S) = X/8 ;
    SCALING FOR STRESS ARRAYS : WIND PSEUDOSTRESS (M/S)^2 = ABS(X)*X/10000 ;
    VERSION : WOCE CD-ROM NASA JPL PODAAC V1-ATLAS2 199804 ;
    END OF HEADER : 1360 byte header ;
    s;onv;snv;r68of;9j;of;;;vn;sh.l;0o583473753700pppyup3hp3hjp38gvpvpghjgvpp8jgphgh
    ogppe0707aout;70an09'at[0ut[3j-nu-t3tnu[3[93j[g03jug[03j[mt[3u[0u[t3mu[3mu[3u[tu
    pj3nt037696e6[86;jg;e;ee;9898tutuut;n988-85--v'v-'n'tu'au'tu'tnu'9nu9nut9ut2ut2x
    ;otu3nu[386nau93u'93u'9u'-3'-93'-9u3'9u3'nu'39u'3ub93u'9u39'u3[u'3tu'3u'9'93u'tu
    [rest of file omitted for brevity]
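As a rough illustration of how such a file might be read, the following Python sketch skips the 1360-byte header and applies the scaling formulas quoted in it. Several points are assumptions rather than facts from the dataset documentation: that the arrays are stored in the order the header lists them, and that the 2-byte "SGI" integers are big-endian. The function names are hypothetical.

```python
import struct

HEADER_BYTES = 1360        # from "END OF HEADER : 1360 byte header"
NCOLS, NROWS = 720, 360    # from "ARRAY SIZES : 720 360"
CELLS = NCOLS * NROWS


def wind_component(x):
    """WIND COMPONENT (M/S) = (X/4)-30, where X is a 1-byte unsigned value."""
    return x / 4 - 30


def wind_speed(x):
    """WIND SPEED (M/S) = X/8."""
    return x / 8


def pseudostress(x):
    """WIND PSEUDOSTRESS (M/S)^2 = ABS(X)*X/10000, X is a 2-byte signed value."""
    return abs(x) * x / 10000


def read_east_component(data):
    """Decode the first array after the header from the raw file bytes.

    Assumption: the east-component array comes first, as listed in the header.
    """
    raw = data[HEADER_BYTES:HEADER_BYTES + CELLS]
    return [wind_component(b) for b in raw]


def read_int16_array(data, offset, count):
    """Read `count` 2-byte signed integers starting at `offset`.

    Assumption: big-endian byte order for the values labeled "(SGI)".
    """
    return struct.unpack(f">{count}h", data[offset:offset + 2 * count])
```

In practice `data` would be the whole file read in binary mode; the stress arrays would then start after the three 1-byte arrays, if the listed order is indeed the storage order.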
Surfer Grids
These grids are some of the most widely used formats in the earth sciences, due to the enormous popularity of the Surfer gridding and contouring program. There are two basic types:
Surfer ASCII Grid (GRD)
This example is a sea surface temperature grid offshore Namibia. Examining the small header, we find these values: after DSAA (the identifier string for this format), the number of columns, number of rows, minimum longitude (X), maximum longitude (X), minimum latitude (Y), maximum latitude (Y), minimum grid value and maximum grid value. In this example there are 10 columns and 14 rows. The value 1.70141E+038 marks blank cells with no data. NOTE: Blank rows between the data grid rows have been removed for clarity in this wiki.
    DSAA
    10 14
    7 17
    -29 -15
    11.243040968085 19.431364974874
    16.824741012143 16.745461989843 16.587686781674 16.253122497206 15.949882499559 15.870442610423 14.956398046048 14.164092764509 13.348231715184 1.70141E+038
    16.864519096148 16.717397259789 16.573352923603 16.365521649285 15.861864168512 16.459582282843 15.828388339189 15.60694014831 1.70141E+038 1.70141E+038
    16.832513910695 16.499913594476 16.681316331512 16.355517896677 15.844839501661 15.839318520163 16.100996884231 12.483563432934 1.70141E+038 1.70141E+038
    16.912755112345 16.763857427221 16.365525179991 15.566948873345 16.140160100238 15.720448187103 14.727189252088 11.243040968085 1.70141E+038 1.70141E+038
    17.143251105689 16.927550070385 15.490304247905 16.874353192752 16.153333513854 15.582507876994 14.260412739458 1.70141E+038 1.70141E+038 1.70141E+038
    16.969848863574 16.764922793326 17.378931675059 16.620479695448 16.041638762935 14.784318533978 13.878714698118 1.70141E+038 1.70141E+038 1.70141E+038
    17.111157870444 17.249487339579 17.076440987238 16.919177533376 15.815051098758 14.544624256724 13.902985069789 1.70141E+038 1.70141E+038 1.70141E+038
    17.320508806953 17.010005719576 17.098764770367 16.320446004906 14.907059718288 16.016383730942 14.418986018669 1.70141E+038 1.70141E+038 1.70141E+038
    17.417493510903 17.317982986081 17.360892518646 15.582766505059 14.907139087959 14.430006480511 1.70141E+038 1.70141E+038 1.70141E+038 1.70141E+038
    17.430786856401 17.325174703284 16.320565414238 15.641073581356 15.132948057917 13.75622553285 1.70141E+038 1.70141E+038 1.70141E+038 1.70141E+038
    17.509270321069 17.013905429559 16.159659152898 14.968126979467 15.354255678655 1.70141E+038 1.70141E+038 1.70141E+038 1.70141E+038 1.70141E+038
    17.701093237905 17.354570863884 17.312330513577 15.541605807285 14.679367258808 1.70141E+038 1.70141E+038 1.70141E+038 1.70141E+038 1.70141E+038
    19.241305214076 18.092511307966 17.146115195515 16.619967330077 18.668326557116 1.70141E+038 1.70141E+038 1.70141E+038 1.70141E+038 1.70141E+038
    19.431364974874 18.797263012583 17.485058447247 17.114769871276 17.388525398727 1.70141E+038 1.70141E+038 1.70141E+038 1.70141E+038 1.70141E+038
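A Surfer ASCII grid of this kind can be read with very little code, because the header and the data are simply whitespace-separated numbers. The sketch below is a minimal, hypothetical parser (the name `parse_surfer_ascii` and the returned dictionary layout are inventions for illustration). It assumes the two integers after DSAA are the column and row counts in that order, and it treats the 1.70141E+038 value seen in the example as the blank marker.

```python
SURFER_BLANK = 1.70141e38  # blank marker seen in the example above


def parse_surfer_ascii(text):
    """Parse the text of a Surfer ASCII grid into a dictionary.

    Blank cells are returned as None. Rows are kept in file order.
    """
    tokens = text.split()
    if not tokens or tokens[0] != "DSAA":
        raise ValueError("not a Surfer ASCII grid (missing DSAA)")
    ncols, nrows = int(tokens[1]), int(tokens[2])
    xmin, xmax, ymin, ymax, zmin, zmax = (float(t) for t in tokens[3:9])
    values = [float(t) for t in tokens[9:9 + ncols * nrows]]
    if len(values) != ncols * nrows:
        raise ValueError("grid is shorter than the header promises")
    rows = []
    for r in range(nrows):
        row = values[r * ncols:(r + 1) * ncols]
        rows.append([None if v >= SURFER_BLANK else v for v in row])
    return {
        "ncols": ncols, "nrows": nrows,
        "xmin": xmin, "xmax": xmax,
        "ymin": ymin, "ymax": ymax,
        "zmin": zmin, "zmax": zmax,
        "rows": rows,
    }


sample = "DSAA\n3 2\n0 2\n10 11\n1 5\n1 2 3\n4 5 1.70141E+038\n"
grid = parse_surfer_ascii(sample)
print(grid["rows"])
```

Because the parser splits on any whitespace, it does not matter whether blank lines separate the data rows in the file.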
Surfer Binary Grid (GRD)
Contains the same information as the ASCII version, but stored entirely in binary, so it cannot be read in a text editor.
ArcView Gridded
Also known as the Arc Grid format and several other names. There are two versions of this format, ASCII and binary. They both have exactly the same explanatory header: for the ASCII version, the header is contained within the file, at the beginning; for the binary version, it is contained in a separate HDR file.
ASCII Version (*.ASC, *.TXT)
- The Wikipedia article cited below is correct as it concerns the ASCII version.
- The ASCII version of this format is illustrated by this example, a sea surface temperature (SST) analysis from the World Ocean Database. The first 6 lines are a header containing basic usage information. The first 17 values in the grid are blank values, probably because the locations are over land. After a few example SST values the file has been cut short for brevity.
- The items NCOLS, NROWS and CELLSIZE are self-explanatory.
- The item XLLCORNER refers to the x coordinate of the left edge of the lower left (LL) grid cell
- An optional syntax is XLLCENTER, which refers to the x coordinate of the center of the lower left cell
- The item YLLCORNER refers to the y coordinate of the bottom edge of the lower left cell
- An optional syntax is YLLCENTER, which refers to the y coordinate of the center of the lower left cell
- Regardless of whether CORNER- or CENTER-syntax is used, the universally recognized interpretation of grid values is that they represent the expected (or analyzed) value at the center of the grid cell. If you are converting a grid/raster to XYZ format, then the CENTER-syntax values can be used directly. If the CORNER-syntax is used, then the values must be slightly adjusted before exporting as XYZ; see Grids/Rasters and XYZ Files for further information.
- The item NODATA_VALUE refers to a special number in the grid that indicates no valid data value is available for that cell.
- The bizarre, long values often used by US NASA and other servers may exceed the precision capabilities of some PCs, and not be correctly recognized during data processing. If this occurs, then an ASCII editor should be used to convert all instances to the standard Arc Grid value of -9999.
ncols 25 nrows 17 xllcorner -98 yllcorner 15 cellsize 1 nodata_value -9.9999998E+33 -9.9999998E+33 -9.9999998E+33 -9.9999998E+33 -9.9999998E+33 -9.9999998E+33 -9.9999998E+33 -9.9999998E+33 -9.9999998E+33 -9.9999998E+33 -9.9999998E+33 -9.9999998E+33 -9.9999998E+33 -9.9999998E+33 -9.9999998E+33 -9.9999998E+33 -9.9999998E+33 -9.9999998E+33 23.73070 24.34900 24.77740 24.91330 24.82520 24.58710 24.27690 23.97770 -9.9999998E+33 -9.9999998E+33 -9.9999998E+33 -9.9999998E+33 -9.9999998E+33 -9.9999998E+33 (remainder of the file omitted for brevity)
Note that ESRI ASCII grids are "anchored" at the southwest corner by the X and Y values (of either the corner or center of that pixel, as appropriate), but the data values in the grid actually begin in the northwest corner.
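The header keywords and the southwest anchoring described above can be combined to locate the center of any cell. The following Python sketch is illustrative only; the function names are hypothetical, and it assumes the header uses either the CORNER pair or the CENTER pair of keywords, not a mixture.

```python
def parse_arc_ascii_header(lines):
    """Parse 'keyword value' header lines into a dict with lowercase keys."""
    header = {}
    for line in lines:
        key, value = line.split()
        header[key.lower()] = float(value)
    return header


def cell_center(header, row, col):
    """Return (x, y) of the center of the cell at (row, col).

    Row 0 is the first row in the file, i.e. the northernmost row.
    """
    size = header["cellsize"]
    nrows = int(header["nrows"])
    if "xllcenter" in header:
        x0, y0 = header["xllcenter"], header["yllcenter"]
    else:
        # CORNER syntax: shift by half a cell to reach the cell center
        x0 = header["xllcorner"] + size / 2
        y0 = header["yllcorner"] + size / 2
    x = x0 + col * size
    y = y0 + (nrows - 1 - row) * size
    return (x, y)


header = parse_arc_ascii_header([
    "ncols 25", "nrows 17", "xllcorner -98",
    "yllcorner 15", "cellsize 1", "nodata_value -9.9999998E+33",
])
print(cell_center(header, 0, 0))    # first value in the file (northwest cell)
print(cell_center(header, 16, 24))  # last value in the file (southeast cell)
```

For the example header, the first value in the file sits at (-97.5, 31.5) and the last at (-73.5, 15.5).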
Binary Version (FLT + HDR)
- The Wikipedia article cited below, entitled "ESRI Grid," really refers to the Arc/Info Grid (see next section).
- The binary version of the ArcView gridded format consists of two files, with the same filename:
- The grid data, as single precision binary values, in a file with the extension FLT.
- The header lines of the ASCII version (see above) placed in a separate file, with the extension HDR. In the case of the example above, the HDR file would contain only these lines:
    ncols 25
    nrows 17
    xllcorner -98
    yllcorner 15
    cellsize 1
    nodata_value -9.9999998E+33
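Reading the FLT half of such a pair amounts to unpacking a run of 4-byte floats and cutting it into rows. The sketch below is a hedged illustration: the function name `read_flt` is hypothetical, and the byte order is left as a parameter because the header example above does not state it (some HDR files carry a byte-order entry; little-endian is only the default assumed here).

```python
import struct


def read_flt(data, ncols, nrows, little_endian=True):
    """Unpack single-precision values from FLT file bytes into a list of rows.

    `ncols` and `nrows` come from the companion HDR file.
    """
    count = ncols * nrows
    prefix = "<" if little_endian else ">"
    values = struct.unpack(f"{prefix}{count}f", data[:4 * count])
    return [list(values[r * ncols:(r + 1) * ncols]) for r in range(nrows)]


# Synthetic 3 x 2 grid standing in for the contents of a real FLT file
demo_bytes = struct.pack("<6f", 1.0, 2.0, 3.0, 4.0, 5.0, 6.0)
demo_rows = read_flt(demo_bytes, ncols=3, nrows=2)
print(demo_rows)
```

If the values come out wildly wrong, trying the other byte order is usually the first thing to check.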
ESRI Coverage Format
The ESRI coverage format (see documentation link below) is an older format that can accommodate both vectors and rasters. Apparently it can accommodate about 15 different types of mapping concepts, corresponding exactly to the E00 interchange (export) format, also developed by ESRI. Coverages are often associated with the older ArcInfo software from ESRI, and there are indications that both are slowly losing importance in the GIS field. Coverages can also contain numerous separate files, and are often found packaged in the E00 format.
Arc/INFO Grid (ADF)
This type of grid is probably the most familiar of the grid-holding components of ESRI coverages, although there are apparently others. It is described in the ESRI reference below.
Raster Formats
Image File Formats
This topic is well described in the Wikipedia reference below. As found here without georeferencing, none of these simple images can be used directly in GIS systems. The major formats of interest to the marine science data manager are these:
- JPEG or JPG (Joint Photographic Experts Group) - Uses 8 bits per color (red, green, blue) for a 24-bit total. This results in the commonly seen 3-number color specifications, for example YELLOW = (R=255, G=255, B=0). Sometimes a single, much longer number is used. Because each 8-bit channel has 256 levels, the channels are combined with multipliers of 256: YELLOW = R + G*256 + B*256*256 = 255 + 255*256 + 0*256*256 = 255 + 65280 + 0 = 65535. The minimum number in this scheme is 0 + 0 + 0 = 0 for pure black, and the maximum is 255 + 255*256 + 255*256*256 = 255 + 65280 + 16711680 = 16777215 for pure bright white.
- JPEG2000 or JP2 - A wavelet-based image compression standard that reduces images to mathematical expressions of spectral curves across the image; size reductions are possible down to 4 pixels per bit for simple cases. The JP2 specification allows for internal georeferencing, but most GIS systems still require world files.
- TIF or TIFF (Tagged Image File Format) - When used for images, TIF uses 8 bits or 16 bits per color (red, green, blue) for 24-bit and 48-bit totals. TIF files can also contain integer or floating-point number arrays. The TIF specification allows for internal georeferencing, but most GIS systems still require world files.
- PNG (Portable Network Graphics) - Uses truecolor (16 million colours)
- GIF (Graphics Interchange Format) - Uses an 8-bit palette, or 256 colors; patent restrictions on its LZW compression (since expired) prompted the development of PNG
- GIF images are often used as an easy way to rasterize (i.e. visualize) simple data grids. The data grid values are simply scaled to the range 0-255 (the range of GIF colors) and then these numbers are assigned to the GIF pixels.
- BMP (Windows bitmap) - Uncompressed and usually quite large
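Two pieces of arithmetic from the list above lend themselves to a short Python sketch: scaling grid values into the 0-255 palette range, and combining three 8-bit color channels into a single number. The function names are hypothetical. Each 8-bit channel has 256 levels, so the sketch uses multipliers of 256 and 65536, with red in the lowest position to follow the channel order used in the text; other software may order the channels differently.

```python
def scale_to_byte(value, vmin, vmax):
    """Linearly map a grid value in [vmin, vmax] to an integer in 0..255."""
    if vmax == vmin:
        return 0
    fraction = (value - vmin) / (vmax - vmin)
    fraction = min(max(fraction, 0.0), 1.0)  # clamp out-of-range values
    return round(fraction * 255)


def pack_rgb(r, g, b):
    """Combine three 0..255 channels into one integer (red in the low byte)."""
    return r + g * 256 + b * 256 * 256


def unpack_rgb(n):
    """Split a packed integer back into (r, g, b)."""
    return (n % 256, (n // 256) % 256, n // 65536)


# Minimum and maximum values from the Surfer example earlier in this article
zmin, zmax = 11.243040968085, 19.431364974874
print(scale_to_byte(zmin, zmin, zmax), scale_to_byte(zmax, zmin, zmax))
print(pack_rgb(255, 255, 0), pack_rgb(255, 255, 255))
```

A blank or no-data cell would need separate handling before scaling, for example by reserving one palette index for it.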
Georeferenced Images
With georeferencing, all of the image formats above can be used in GIS systems and be mapped correctly onto the earth's surface. If the image is projected, then complex methods in full GIS systems are required. If the image is not projected (i.e. it is in plain Cartesian coordinates) then it can easily be georeferenced by either of the following two methods:
- Images with World Files - If a world file is present alongside any of the above formats, then the image is recognized as georeferenced by GIS programs. Most Cartesian images can be georeferenced with the Georeferencing Tool, which writes the applicable world file after 3 reference points have been identified on the image.
- GeoTIFF - In the case of TIFF/TIF files (and now also JP2), another option is available for georeferencing. The program GeoTIFF Examiner can be used to insert internal georeferencing tags into the file; when this is the case, the file can correctly also be called a "GeoTIFF." The TIF internal tags use a slightly different physical referencing point from the world file convention, which is obvious during the use of the program. The importance of GeoTIFF is debatable, because nearly all GIS programs require the world file and ignore the internal tags. When both are present, but differ in content, only the world file is used.
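This article does not spell out what a world file contains, so the following Python sketch rests on an assumption: the common six-line layout, in which the lines hold, in order, the x pixel size, two rotation terms, the (usually negative) y pixel size, and the x and y map coordinates of the center of the upper-left pixel. The function names and the sample values are hypothetical.

```python
def parse_world_file(text):
    """Return the six world file parameters in file order: (a, d, b, e, c, f)."""
    a, d, b, e, c, f = (float(t) for t in text.split())
    return (a, d, b, e, c, f)


def pixel_to_map(params, col, row):
    """Map a pixel (col, row), counted from the upper-left, to map coordinates."""
    a, d, b, e, c, f = params
    x = a * col + b * row + c
    y = d * col + e * row + f
    return (x, y)


# Hypothetical world file for a 25 x 17 image with 1-degree pixels whose
# upper-left pixel center is at longitude -97.5, latitude 31.5
world_text = "1.0\n0.0\n0.0\n-1.0\n-97.5\n31.5\n"
params = parse_world_file(world_text)
print(pixel_to_map(params, 0, 0))
print(pixel_to_map(params, 24, 16))
```

Note that the image is anchored at its upper-left pixel, whereas the Arc ASCII grid header is anchored at the lower-left cell.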
Raster and Grid Data in Self-Describing Formats
In addition to the formats described here, raster and grid data can be entirely contained within more complex formats of the Self-Describing Formats type. In such cases the data values of the cells or the color values of the pixels are moved into an entirely different structure, often also containing metadata. The most notable of these formats are NetCDF, HDF and GRIB. NetCDF is unique in that it has an ASCII analog format called CDL, mentioned above.
Raster and Grid Data in Vector Formats
Gridded data can be placed into a special type of shapefile, called a "points shape," wherein each grid point is represented by an individual point on the map, with individual properties. The most obvious property for the point, of course, would be the original measured parameter value from the grid (i.e. the "z" value). When a large number of these points are viewed in a GIS system, they often appear to cover the entire map exactly like a colored raster image. But when the map is zoomed, each point appears separately. The value of points shapes is that they offer a very easy method to transfer data from complex grid systems (e.g. GRIB format for meteorological data) to GIS systems.
XYZ Tables
The XYZ spreadsheet/table format, where three columns of data represent (usually) longitude, latitude and a parameter value, is very closely related to geo-referenced rasters and grids. They are the simplest and most unambiguous means of transferring the contents of grids between programs. The relation between an XYZ file and the source raster/grid is not simple, however, as you can see in the related article Grids/Rasters and XYZ Files.
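A minimal Python sketch of the conversion from a grid to XYZ rows is given here for orientation. It assumes the rows are ordered north to south (as in the Arc ASCII format), that the coordinates of the lower-left cell center are already known, and that cells equal to the no-data value are to be skipped. The function name `grid_to_xyz` is hypothetical.

```python
def grid_to_xyz(rows, x_center_ll, y_center_ll, cellsize, nodata=None):
    """Return a list of (x, y, z) tuples, one per valid grid cell.

    `rows` is a list of rows, northernmost first; (x_center_ll, y_center_ll)
    is the center of the lower-left cell.
    """
    nrows = len(rows)
    xyz = []
    for r, row in enumerate(rows):
        y = y_center_ll + (nrows - 1 - r) * cellsize
        for c, z in enumerate(row):
            if nodata is not None and z == nodata:
                continue  # skip cells without a valid value
            xyz.append((x_center_ll + c * cellsize, y, z))
    return xyz


demo_grid = [[1.0, -9999.0],
             [3.0, 4.0]]
demo_xyz = grid_to_xyz(demo_grid, -97.5, 15.5, 1.0, nodata=-9999.0)
for x, y, z in demo_xyz:
    print(x, y, z)
```

If the grid header gives corner coordinates instead, half a cell size must be added to each before calling the function.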
Additional Resources
- Wikipedia: ESRI grid
- What is the file structure of an Arc/INFO Grid?
- Arc/Info Binary Coverage Format Analysis - Unofficial description of the coverage format, which has never been publicly documented by ESRI.
- Wikipedia: Image file formats
- Wikipedia: JPEG
- Wikipedia: Tagged Image File Format
- Wikipedia: JPEG 2000
- Wikipedia: Portable Network Graphics
- Wikipedia: Graphics Interchange Format
- Wikipedia: BMP file format
- Wikipedia: Georeference - Very brief definition of the term
Subsections of this Article
| | Pagename | Short title | Description |
|---|---|---|---|
| Grids/Rasters and XYZ Files | Grids/Rasters and XYZ Files | Grids/Rasters and XYZ | The relationship between the grid cell values and geography is such that the parameter value is assumed to exist at the geographic centre of the grid cell |
| Vertex-Centered and Cell-Centered Grids | Vertex-Centered and Cell-Centered Grids | Vertex-Centered/Cell-Centered | The mathematical process of gridding data results in calculated values that must be mapped onto the earth's surface. |
Information about this article
Short title: Raster and Grid Formats
Description: In the earth sciences, a gridded data file is usually thought of as a set of numbers making up a rectilinear array (i.e. rows and columns) of parameter values, and the raster is sometimes thought of as a visualization of the grid. Both are essential inputs to geographic information systems.
Expertise level: expert
Author: Murray Brown
Approval status: approved
Approved by: Greg Reed
Last change: 2012-1-30
Subsection of: Marine Data Format Types



