From Ocean Teacher Library

Document Formats

Background

Although most marine data are archived and published in dedicated data formats, some data are available only in formats normally used for documents, including proprietary formats and elaborately formatted ASCII text. In most cases a program can easily be written to extract or convert ASCII text, but proprietary formats (e.g. MS Word) can be extremely difficult to work with unless regular blocks of material can be copied and pasted into an ASCII editor. That manual approach is not practical for very large resources. A further risk is that the proprietary format becomes obsolete. To the maximum practicable extent, data held in proprietary formats should be exported or stripped out to ASCII tables, documented, and stored separately for safety.

ASCII Documents

Just about every possible way that data can be expressed visually on the page has been used at some time. Great effort sometimes went into creating tables with row and column headings, and attractive headers. These were usually published with little or no thought given to eventual use in a computer.

Shipdrift Data Tables from a US NODC CD-ROM

This CD-ROM contained ASCII text documents of data, laid out to look good when printed. Notice that the table below even has a vertical label (SPEED CLASS) running down the left side. Actually using these data required a program that read each line and extracted the desired numbers. There were literally thousands of these tables in a few huge files.

              NODC SURFACE CURRENT (SHIPS DRIFT) LONG SUMMARY
10-DEGREE SQUARE 1400 1-DEGREE SQUARE 35                    MONTH 7
 RESULTANT DIRECTION   0  TOTAL OBSERVATIONS      1  NORTH COMPONENT    .0
 RESULTANT SPEED      .0                              EAST COMPONENT    .0
                  DISTRIBUTION OF INDIVIDUAL OBSERVATIONS             PER-
KNOTS(CM/SEC)    N    NE     E    SE     S    SW     W    NW     SUM  CENT
   CALM                                                            1 100.0
   0.1   (5)     0     0     0     0     0     0     0     0       0    .0
S  0.3  (15)     0     0     0     0     0     0     0     0       0    .0
P  0.5  (26)     0     0     0     0     0     0     0     0       0    .0
E  0.7  (36)     0     0     0     0     0     0     0     0       0    .0
E  0.9  (46)     0     0     0     0     0     0     0     0       0    .0
D  1.1  (57)     0     0     0     0     0     0     0     0       0    .0
   1.3  (67)     0     0     0     0     0     0     0     0       0    .0
C  1.5  (77)     0     0     0     0     0     0     0     0       0    .0
L  1.7  (88)     0     0     0     0     0     0     0     0       0    .0
A  1.9  (98)     0     0     0     0     0     0     0     0       0    .0
S  2.5 (129)     0     0     0     0     0     0     0     0       0    .0
S  3.0 (154)     0     0     0     0     0     0     0     0       0    .0
   3.5 (180)     0     0     0     0     0     0     0     0       0    .0
   4.0 (206)     0     0     0     0     0     0     0     0       0    .0
  >4.0(>206)     0     0     0     0     0     0     0     0       0    .0
SUM OF OBS.      0     0     0     0     0     0     0     0
PERCENT OBS.    .0    .0    .0    .0    .0    .0    .0    .0         100.0
MEAN SPEED      .0    .0    .0    .0    .0    .0    .0    .0
MAX. SPEED      .0    .0    .0    .0    .0    .0    .0    .0
STD DEVIATION   .0    .0    .0    .0    .0    .0    .0    .0
10-DEGREE SQUARE 1400 1-DEGREE SQUARE 42                    MONTH 9
 RESULTANT DIRECTION 224  TOTAL OBSERVATIONS      1  NORTH COMPONENT   -.1
 RESULTANT SPEED      .2                              EAST COMPONENT   -.1
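
As an illustration of the kind of extraction program described above, here is a minimal Python sketch that pulls the values out of the three header lines that open each table. It is not the program originally used with the CD-ROM. The function name, the dictionary keys, and the assumption that every table starts with exactly these three labelled lines are ours, based only on the excerpt shown.

```python
import re

# Patterns for the three labelled header lines seen in the excerpt above.
SQUARE_RE = re.compile(
    r"10-DEGREE SQUARE\s+(\d+)\s+1-DEGREE SQUARE\s+(\d+)\s+MONTH\s+(\d+)"
)
DIRECTION_RE = re.compile(
    r"RESULTANT DIRECTION\s+(\d+)\s+TOTAL OBSERVATIONS\s+(\d+)"
    r"\s+NORTH COMPONENT\s+(-?[\d.]+)"
)
SPEED_RE = re.compile(
    r"RESULTANT SPEED\s+(-?[\d.]+)\s+EAST COMPONENT\s+(-?[\d.]+)"
)


def parse_shipdrift_headers(lines):
    """Return one dict per table, built from its header lines.

    Key names are hypothetical; all other lines are ignored.
    """
    records = []
    current = None
    for line in lines:
        m = SQUARE_RE.search(line)
        if m:
            # A new "10-DEGREE SQUARE" line starts a new table.
            current = {
                "ten_degree_square": int(m.group(1)),
                "one_degree_square": int(m.group(2)),
                "month": int(m.group(3)),
            }
            records.append(current)
            continue
        if current is None:
            continue  # skip anything before the first table
        m = DIRECTION_RE.search(line)
        if m:
            current["direction"] = int(m.group(1))
            current["n_obs"] = int(m.group(2))
            current["north"] = float(m.group(3))
            continue
        m = SPEED_RE.search(line)
        if m:
            current["speed"] = float(m.group(1))
            current["east"] = float(m.group(2))
    return records
```

Because the function takes any iterable of lines, an open file object can be passed directly, so a very large file need not be loaded into memory at once. The distribution rows in the body of each table would need a similar, column-aware treatment.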

ASCII Formatted Table from the NVODS Live Access Server (LAS)

The National Virtual Ocean Data System (NVODS), an OPeNDAP application using the Live Access Server (LAS) interface, can deliver several different data formats that are crafted to look good when printed, as shown by this example. Note that the columns and rows are neatly labeled with their geographic coordinates. The use of -1.E+34 as the blank (missing) value is a common choice; the BAD FLAG value in the header, -9.9999998E+33, appears in the table body in this rounded form. Note that the link in the third row of the header may not be correct or active at this time.

            VARIABLE : ZONAL SURFACE CURRENT (CM/S)
            FILENAME : shipdrift_moncl.nc
            FILEPATH : http://ferret.pmel.noaa.gov/thredds/dodsC/data/PMEL/
            BAD FLAG : -9.9999998E+33
            SUBSET   : 11 by 12 points (LONGITUDE-LATITUDE)
            TIME     : 16-JAN 06:00
      6.5E   7.5E   8.5E   9.5E   10.5E  11.5E   12.5E   13.5E   14.5E   15.5E   16.5E
29.5S -13.07 -10.97 -8.455 -9.023 -17.87 11.69   -12.21  -10.4   -7.275  -24.7   29
28.5S -0.45  -25.24 -11.38 -12.99 -5.944 -5.1    -4.437  -4.928  -3.55   -15.55  -1.E+34
27.5S -20.25 -10.60 -11.98 -15.17 -11.33 -12.34  -6.765  -8.083  2.33    -10.6   -1.E+34
26.5S -10.63 -19.46 -8.239 -19.46 -13    -8.787  -5.7    -10.7   -35.4   -1.E+34 -1.E+34
25.5S -17.78 -13.42 -10.89 -10.81 -5.036 -9.609  23.1    -4.4    -3.6    -1.E+34 -1.E+34
24.5S -20.48 -16.05 -7.433 -5.867 -4.063 -1.E+34 -11.96  -1.E+34 2.817   -1.E+34 -1.E+34
23.5S -12.56  2.614 -10.04 1.785  5.525  0       6.6     1.7     -0.567  -1.E+34 -1.E+34
22.5S -12.96 -6.531 -12.12 4.4    13.75  -12.8   -14.83  -2.467  -5.333  -1.E+34 -1.E+34
21.5S -9.536 -1.171 7.97   59.9   -11.15 4.4     -22.68  -1.E+34 -1.E+34 -1.E+34 -1.E+34
20.5S -7.088 -6.9   53.1  -24.05  -7.825 -13.63  -4.243  -1.E+34 -1.E+34 -1.E+34 -1.E+34
19.5S -19.71 -16.25 18    -23.1   3.5    2.1     -8.7    -1.E+34 -1.E+34 -1.E+34 -1.E+34
18.5S -13.7  -4.4   1.667 6.425   -15.1  13.2    -1.E+34 -1.E+34 -1.E+34 -1.E+34 -1.E+34
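
A table like this one can be read back into numbers with a few lines of code. The following Python sketch is one possible approach, not part of NVODS or LAS. It assumes the layout shown above: "KEY : value" header lines, then one row of longitude labels, then one row per latitude. The function name and the missing-value threshold are our own choices.

```python
import math

# Assumption: any value at or below this threshold is the blank flag.
# It catches both -1.E+34 (table body) and -9.9999998E+33 (header).
BAD_THRESHOLD = -9.0e33


def parse_las_table(text):
    """Split an LAS-style ASCII table into header, longitudes, and rows.

    Returns (header dict, list of longitude labels,
             dict mapping latitude label -> list of floats).
    Blank-flag values are converted to NaN.
    """
    header = {}
    lons = []
    rows = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        if not lons and " : " in line:
            # Header line such as "VARIABLE : ZONAL SURFACE CURRENT (CM/S)".
            key, _, value = line.partition(" : ")
            header[key.strip()] = value.strip()
        elif not lons:
            # First non-header line holds the longitude labels.
            lons = line.split()
        else:
            # Data row: latitude label followed by one value per longitude.
            tokens = line.split()
            values = [float(tok) for tok in tokens[1:]]
            rows[tokens[0]] = [
                math.nan if v <= BAD_THRESHOLD else v for v in values
            ]
    return header, lons, rows
```

Comparing numerically rather than matching the string "-1.E+34" is a deliberate choice here, since the flag is printed with different precision in the header and in the table body.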

Microsoft DOC

A very widely used format, native to the Microsoft Word program.

  • Binary (mainly) with occasional bits of recognizable ASCII
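
When a DOC file cannot be opened in its native program, the recognizable ASCII fragments can sometimes be salvaged directly from the bytes, much as the Unix strings utility does. The Python sketch below is a rough illustration of that idea only; the function name and the default minimum run length of 6 characters are arbitrary assumptions. Text that the file stores in a 16-bit encoding will not be found this way, and table structure is not recovered.

```python
import re


def ascii_runs(data, min_length=6):
    """Return runs of printable ASCII found in a bytes object.

    A run is min_length or more consecutive bytes in the range
    0x20 (space) to 0x7E (tilde).
    """
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]
```

In use, the file would be read in binary mode, for example with open(path, "rb"), and its contents passed to ascii_runs; the resulting fragments still need manual review before they can be treated as data.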

Acrobat/Adobe Reader PDF

Highly valuable and widely used for distributing scanned hard-copy documents.

  • Files available:
    • SkagexReport.pdf - Technical report with valuable intercalibration data; available only as a very worn old paper copy. Provided by the Swedish Oceanographic Data Centre.

Information about this article

Short title: Document Formats

Description: The data are contained in formats usually concerned with digital documents, including proprietary formats (e.g. DOC) or elaborately formatted ASCII text.

Expertise level: beginner

Author: Murray.Brown

Approval status: approved

Approved by: Murray.Brown

Last change: 2011-12-2

Subsection of: Marine Data Format Types

This page was last modified on 2 December 2011, at 18:12.

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License