BRIEF SUMMARY OF PDS LABELS AND DATA FORMAT DESCRIPTIONS

This document briefly describes the characteristics of the PDS labels
which document the files included on this disk.

PDS Label Components.
---------------------

The PDS labels for data files on this disk can be logically grouped
into several categories:

SFDU Registration Group.
------------------------

Registers the file as conforming to PDS data definition rules. All PDS
labelled files on this disk use the following SFDU registration label:

  NJPL1I00PDS0nnnnnnnn = PDS_SFDU_LABEL

  NJPL      is the JPL control authority.
  1         is the version id; it indicates that the SFDU length is
            represented as an ASCII string.
  I         is the class id; it indicates that this is an information
            or data object class of SFDU.
  00        are reserved characters filled with ASCII zeros.
  PDS0      is the data definition record identifier; it identifies
            the SFDU format as a STANDARD PDS LABELLED file, with
            embedded format specification statements.
  nnnnnnnn  is the length of the file minus 20 bytes (the length of
            the SFDU label itself). The length is expressed in ASCII
            numerals; if this value is unknown, use 00000000.

File Identification Group.
--------------------------

This group of information identifies the source, type and format of a
particular data file. Special file types which have been defined for
PDS are described in the PDS File Formats section of this document.
Examples of some potential file identification labels are as follows:

  FILE_TYPE        = TEXT, TABLE, BINARY_TABLE, IMAGE, CUBE, QUBE
  RECORD_TYPE      = FIXED_LENGTH, VARIABLE_LENGTH, STREAM
  HARDWARE_TYPE    = VAX, PDP, UNIVAC, IBM_370, IBMPC
  OPERATING_SYSTEM = VMS, MS_DOS, UNIX, OS/MVS
  FILE_ENCODING    = NONE, HUFFMAN, KERMIT, RUN_LENGTH

Record Identification Group.
----------------------------

The record identification group describes the contents of the logical
records in the data file. The major descriptive terms for all files
are RECORD_TYPE, RECORD_BYTES and FILE_RECORDS.
Descriptions for individual logical records within a data file vary
depending on file type. Record format information shall be expressed
in the simplest terms for common data formats (text, tables, images),
but shall also provide the capability to describe or define complex
record formats using either FORTRAN77 format statements or a simple
record format table compatible with PDS data dictionary standards.

File Description Group.
-----------------------

This section describes the keywords used for the majority of files on
this disk: the text, table and image file types.

Text file type (extension ".TXT").
----------------------------------

The default text file consists of ASCII text in STREAM format with
each line separated by a carriage return/line feed pair. Lines should
be 71 characters or fewer, and there should be no embedded control
characters other than the form feed (Control-L) and tab (Control-I)
characters.

Table file type (extension ".TAB")
----------------------------------

The default PDS table file type is a uniform collection of records
containing ASCII values. The values may contain any legal value field
as specified in the label formats descriptive material. Value fields
are delimited with space, comma or tab characters or, for fixed-length
tables, are specifically defined. The default record type for a table
is STREAM format with carriage return/line feed pairs delimiting each
record. FIXED_LENGTH record types are also allowed. Parameters used to
define table contents are as follows:

  TABLE_RECORDS        The number of physical records. Usually the
                       same as TABLE_ROWS.
  TABLE_ROWS           The number of logical table entries.
  ROW_COLUMNS          The number of items of information in each
                       table row.
  COLUMN_NAME          The name of each data item or column in a table
                       row. This field is normally represented as a
                       parenthesized list of names (e.g.
                       "COLUMN_NAME = (TIME,LATITUDE,LONGITUDE,VALUE)").
  COLUMN_TYPE          The data type of the data item: INTEGER, REAL,
                       DOUBLE, LITERAL, TIME, STRING.
  COLUMN_FORMAT        A FORTRAN representation of the format
                       statement needed to read or write the data
                       item.
  COLUMN_START_BYTE    The byte position (counting from 1) of the
                       beginning of the data item within the row.
  COLUMN_BYTES         The number of bytes containing the data item.
  COLUMN_UNITS         The units of measure of the data item.
  COLUMN_NOTE          Descriptive notes about the data item.

Binary_Table file type (extension ".TAB").
------------------------------------------

The BINARY_TABLE differs from the TABLE type in that it contains data
items stored in binary (or mixed binary and ASCII) format. The default
RECORD_TYPE of a binary table is FIXED_LENGTH. The parameters which
define a BINARY_TABLE are identical to those for the TABLE, except
that the default meaning of the column type refers to a binary storage
format. For example, the column type INTEGER in a BINARY_TABLE would
indicate a signed 4 byte (32 bit) binary value. These data types are
discussed later in this document.

Image file type (extension ".IMG").
-----------------------------------

The image file format is designed for simple two-dimensional arrays of
regular instrument sample values from imaging and other instruments
(cameras, radar, etc.). For fixed format image files there may be a
label group, a header group, a history group, an array group (called
the image for this file type) and a trailer group, each of which will
require a separate definition. In addition, it is common for the data
records to have either prefix or suffix bytes with each record of
data, representing time tags, line numbers or engineering parameters
specific to a certain line of data. The physical and logical structure
of any of these files can be defined with the following label
parameters:

  RECORD_BYTES         The record length parameter represents the
                       physical length of each record in the file. It
                       also represents the DEFAULT logical record
                       length for components of the file.
  FILE_RECORDS         The file records parameter represents the
                       number of physical records within the file,
                       with each record having a length equal to the
                       RECORD_BYTES value. RECORD_BYTES * FILE_RECORDS
                       should always be equal to the file size. If
                       defaults are necessary for these parameters,
                       the logical choice would be 512.

These two parameters provide a framework within which the logical file
structure is built. The most common situation is that the record
components have logical length values which are equal to the physical
values; however, the physical length values can be overridden for any
component of an image file by specifying a "component_RECORD_BYTES"
parameter. The data format keywords for the IMAGE file type are:

  LABEL_RECORDS        Number of records containing PDS text labels.
                       Generally the label area will be filled so that
                       the labels consume a multiple of the
                       RECORD_LENGTH parameter.
  LABEL_RECORD_BYTES   Length of each label record. This parameter
                       defaults to the RECORD_LENGTH if no value is
                       provided.
  HEADER_RECORDS       Number of records containing non-PDS labels or
                       binary header records which precede the image
                       data.
  HEADER_RECORD_BYTES  Length of each header record. This parameter
                       defaults to the RECORD_LENGTH if no value is
                       provided.
  HISTORY_RECORDS      Number of records in the history group.
  IMAGE_RECORDS        Number of records containing image data.
  IMAGE_RECORD_BYTES   Length of each line record. This parameter
                       defaults to the RECORD_LENGTH if no value is
                       provided. This value must be an integral number
                       of 8-bit bytes.
  IMAGE_LINES          Number of lines in the image. Normally equal to
                       the IMAGE_RECORDS value.
  LINE_PREFIX_BYTES    Number of bytes of data which precede the image
                       data in each line record.
  LINE_SAMPLES         Number of sample values contained in each line
                       record.
  SAMPLE_BITS          Number of bits of data comprising one sample
                       value. Common values are 1 (bit), 4 (nibble),
                       8 (byte), 16 (halfword), 32 (fullword).
  LINE_SUFFIX_BYTES    Number of bytes of data which follow the last
                       sample value in a line record.
  TRAILER_RECORDS      Number of records which follow the last image
                       line record in a file.
  TRAILER_RECORD_BYTES Length of each trailer record. This parameter
                       defaults to the RECORD_LENGTH if no value is
                       provided.

DATA TYPE AND FORMAT SPECIFICATIONS
-----------------------------------

DATA TYPE SPECIFICATIONS.

The data type definition segments of the PDS label structure use a
simple syntax to define the data types of stored data values. The data
format specifications indicate to the parser the type of data in a
field, the starting byte location and the length in bytes. An optional
format specification can be given for use in displaying data item
values.

Data values may be represented within data files in ASCII or BINARY
format. The ASCII storage format is much simpler to transfer between
different hardware systems and even between different languages on the
same computer. On the other hand, all numerics are stored and
manipulated internally in binary numeric types; thus ASCII data values
must be converted to internal formats before they can be processed.
Also, the ASCII representation of most numeric values requires more
storage space than does the binary format. For example, each 8 bit
pixel value in an image file would require 3 bytes if stored in ASCII
format (four bytes if it were a signed quantity).

The current specification uses the same set of data types for both
ASCII and binary data values. Whether a data value is stored in ASCII
or BINARY format within the data record is derived from the file type.
For TEXT and TABLE files the default representation is ASCII. For the
BINARY_TABLE, IMAGE, CUBE and QUBE file types the default
representation is binary.

The following data types can be specified:

  INTEGER - A signed integer value [default = 4 bytes].
  UNSIGNED_INTEGER - An unsigned integer value.
  REAL - A real (floating point) value [default = 4 bytes].
  BIT - A binary value. Usage not currently defined.
  COMPLEX - A complex value. Usage not currently defined.
  LITERAL - A text string of 30 characters or less.
  CHARACTER (or STRING) - A text string.
  TIME - Time value. Must be stored in ASCII format.

The interpretation of these values is determined by the field length
(bytes) parameter. Thus an integer field specified with a field_bytes
value of 2 would be interpreted as a 16 bit signed binary value. The
following type specifiers can optionally be used to define the data
type:

  INTEGER_1 - A signed 1 byte integer value.
  INTEGER_2 - A signed 2 byte integer value.
  INTEGER_4 - A signed 4 byte integer value.
  UNSIGNED_INTEGER_1 - An unsigned 1 byte integer value.
  UNSIGNED_INTEGER_2 - An unsigned 2 byte integer value.
  UNSIGNED_INTEGER_4 - An unsigned 4 byte integer value.
  DOUBLE (or REAL_8) - An 8 byte double precision floating point
    value. Doubles should be used if the precision required for a
    numeric value exceeds 7 digits.
  DOUBLE_G - A special format to handle the VAX G-type double
    precision type.

DATA FORMAT SPECIFICATIONS.

This discussion is a great simplification of the data format
specification question. It only addresses fundamental types.

The data format specification is used to determine the format for
display of a data value. A 4 byte binary integer can store values in
the range of -2,147,483,648 to 2,147,483,647; however, the actual
values stored in the field may only range from -9999 to 9999. In this
case it is convenient to specify the output length with a format
statement "I5", where I indicates that the value is an integer and 5
indicates the number of display positions (one for the sign and 4 for
the numeric value). The following data format specifications will be
used, where:

  w = Total number of positions in the output field (including sign,
      decimal point or "E").
  d = Number of positions to the right of the decimal point.
  e = Number of positions in exponent length field.

  Aw        - Character (alphanumeric) data value.
  Iw        - Integer value.
  Fw.d      - Floating point value, displayed in decimal format.
  Ew.d[Ee]  - Floating point value, displayed in exponential format.
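
As an illustration of how the format specifications listed above might
be applied when displaying values, the following Python sketch parses
a specification and renders a value into a field of w positions. The
function names parse_format and display are hypothetical and are not
part of any PDS software. Two behaviours are assumptions borrowed from
FORTRAN output conventions rather than from this document: values are
right-justified in the field, and a value too wide for the field is
shown as asterisks. The exponential form uses Python's normalisation
(one digit before the decimal point), which differs from the leading
zero form that FORTRAN E editing produces.

```python
import re

# Hypothetical helpers for illustration; not part of any PDS software.
# Accepts Aw, Iw, Fw.d and Ew.d[Ee] as described in the text above.
_SPEC = re.compile(r"^([AIFE])(\d+)(?:\.(\d+))?(?:E(\d+))?$")


def parse_format(spec):
    """Split a specification such as 'I5' or 'E12.4E3' into (kind, w, d, e)."""
    match = _SPEC.match(spec.strip().upper())
    if match is None:
        raise ValueError(f"unsupported format specification: {spec!r}")
    kind, w, d, e = match.groups()
    # F and E require ".d"; A and I must not have it; only E may carry "Ee".
    if (kind in "FE") != (d is not None) or (e is not None and kind != "E"):
        raise ValueError(f"malformed format specification: {spec!r}")
    return (kind, int(w),
            None if d is None else int(d),
            None if e is None else int(e))


def display(value, spec):
    """Render value in a field of w positions according to spec."""
    kind, w, d, e = parse_format(spec)
    if kind == "A":
        text = str(value)[:w]              # keep the leftmost w characters
    elif kind == "I":
        text = str(int(value))
    elif kind == "F":
        text = format(float(value), f".{d}f")
    else:
        mantissa, exponent = format(float(value), f".{d}E").split("E")
        width = (e if e is not None else 2) + 1   # exponent digits plus sign
        text = f"{mantissa}E{int(exponent):+0{width}d}"
    # Assumed FORTRAN-like behaviour: right-justify, or fill with '*'
    # when the value does not fit in w positions.
    return text.rjust(w) if len(text) <= w else "*" * w
```

With these definitions, display(-9999, "I5") returns "-9999", which
matches the I5 example in the text: one position for the sign and four
for the numeric value.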