-
- Notes on Coding of BBC BASIC and GW-BASIC Files - Martin Carradus July 1993
- ---------------------------------------------------------------------------
-
- *** I do not guarantee the accuracy of the following information ***
-
- These notes give details of the format of files produced with BBC BASIC
- and GW-BASIC when the BASIC program is saved to disc.
-
- (BASIC command SAVE"filename", where 'filename' is the name of the file
- to be saved to).
-
- Introduction
- ------------
-
- In order to understand these notes, you will need to know something
- about the difference between compilers and interpreters.
-
- All computer languages need to be translated into the fundamental
- instructions of the particular computer (machine code). A compiler is
- presented with the complete program and translates the whole thing
- into machine code. The machine code is then run on the computer in
- a separate stage. An interpreter, however, translates the program and
- obeys it as it goes along. The advantage is that you do not need a
- preceding stage before you run the program. The disadvantage is that,
- since the whole program has not been 'seen', the final code is usually
- less efficient and runs more slowly than compiled code.
-
- For an interpreter to obey the program efficiently, the program is
- usually stored in a coded form that makes each part quicker to work
- out. The main 'trick' is to code the keywords (words like GOTO, IF or
- PRINT) as one or two bytes, known as a token. Tokens can be recognised
- easily every time the interpreter runs across them in the program.
-
- At the beginning of each line of the program, the interpreter will
- need to know the line number of that line and where the next line is.
- Both are coded for greater efficiency.
-
- Both BBC BASIC and GW-BASIC use an interpreter.
-
- In these notes Hex is hexadecimal (counting to base 16), Octal is
- counting to base 8, and Binary is counting to base 2. A byte is enough
- storage to hold one character and is 8 bits long (a bit is a 0 or 1).
- The low byte is the least significant part of a 16-bit binary number
- (the lower 8 bits) and the high byte is the most significant (the
- higher 8 bits). The file is assumed to be read from left to right, so
- the earlier a byte comes in the file, the further to the left it is.
- ASCII is a standard code for representing characters as bytes.
-
-
- A. Coding of BBC BASIC files
- -------------------------
-
- This is basically pure ASCII text, apart from line numbers and tokens.
-
- The program file begins with a return symbol and ends with ASCII code
- 255 (hex FF).
-
- Each line has the following format:-
-
- <Three bytes> <coded BASIC line> <return symbol - ASCII code 13>
-
- The starting three bytes have the line number in the first two bytes
- (high byte first) and the length of the line as stored in the file in
- the third byte. This length counts the three starting bytes, the coded
- BASIC line and the return symbol.
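The line format above can be walked with a few lines of C. This is only a sketch: the names bbc_line and bbc_next_line are made up for illustration, and it assumes that the length byte counts the whole stored line (the return symbol, the three starting bytes and the coded line), which is how the Acorn format is usually documented. It also assumes the high byte of a line number is never 255, so that a return symbol followed by 255 can only be the end of the program.

```c
#include <stddef.h>

/* One decoded line header (hypothetical helper type). */
struct bbc_line {
    unsigned int number;          /* line number */
    const unsigned char *text;    /* first byte of the coded BASIC line */
    size_t text_len;              /* number of coded bytes in the line */
};

/*
 * Reads the line whose return symbol (13) is at buf[pos].
 * Returns the position of the next line's return symbol, or 0 when the
 * end marker (255) or malformed data is met.
 * ASSUMPTION: length byte = return + 3 starting bytes + coded line.
 */
size_t bbc_next_line(const unsigned char *buf, size_t size, size_t pos,
                     struct bbc_line *out)
{
    unsigned int len;

    if (pos + 1 >= size || buf[pos] != 0x0d || buf[pos + 1] == 0xff)
        return 0;                              /* end of program */
    if (pos + 3 >= size)
        return 0;                              /* truncated header */
    len = buf[pos + 3];
    if (len < 4 || pos + len > size)
        return 0;                              /* malformed length */
    out->number = ((unsigned int)buf[pos + 1] << 8) | buf[pos + 2];
    out->text = buf + pos + 4;
    out->text_len = len - 4;
    return pos + len;
}
```

Start with pos set to 0 and keep calling with the returned position until 0 comes back. Using the length byte rather than searching for the next return symbol matters because a byte with value 13 can also turn up inside a line number.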
-
- All the key words in BBC BASIC are coded in either one or two bytes,
- always starting with a byte with an ASCII value greater than or equal
- to 127 (decimal). Hence all the tokens apart from that for OTHERWISE
- (see below) have the top bit set and are unprintable ASCII characters.
-
- Apart from that, the line numbers after GOTOs and GOSUBs are preceded
- by a byte with ASCII code 141 decimal (215 octal - 8D hex) and are
- coded in rather a strange way:-
-
- Line Number: (After ASCII character 141)
- -----------
-
- Consists of three bytes, the bits of which are as follows (the right
- most bit is bit 0; the 'is' bits refer to the actual line number in
- binary).
-
- Lowest Byte: (Right most)
-
- Bit 0 is Bit 8, Bit 1 is Bit 9, Bit 2 is Bit 10, Bit 3 is Bit 11.
- Bit 4 is Bit 12, Bit 5 is Bit 13, Bit 6 is always 1, Bit 7 is always 0.
-
- Middle Byte:
-
- Bit 0 is Bit 0, Bit 1 is Bit 1, Bit 2 is Bit 2, Bit 3 is Bit 3.
- Bit 4 is Bit 4, Bit 5 is Bit 5, Bit 6 is always 1, Bit 7 is
- always 0.
-
- Highest Byte: (Left Most)
-
- Bit 0 is always 0, Bit 1 is always 0, Bit 2 is inverted Bit 14, Bit 3
- is Bit 15.
- Bit 4 is inverted Bit 6, Bit 5 is Bit 7, Bit 6 is always 1, Bit 7 is
- always 0.
-
- So the middle byte codes the lower six bits of the low byte of the
- line number, and the lowest byte codes the lower six bits of the high
- byte. The top two bits of each are held in the highest byte of the
- coding, two of them inverted, and have to be added back in.
-
- I coded this conversion in C as follows:
-
- If c[0] is the highest byte of the coding (left most), c[1] is the
- middle byte of the coding, c[2] is the lowest byte of the coding
- (right most), all of type unsigned char, and lineno is the integer
- that is to hold the final decoded line number:
-
- c[0]=c[0]^0x14; /* Inverts the two inverted bits (Bits 6 and 14) */
-
- c[1]=c[1]&0x3f; /* Keeps the low six bits of the low byte of lineno */
-
- c[2]=c[2]&0x3f; /* Keeps the low six bits of the high byte of lineno */
-
- c[1]=c[1] | ((c[0] & 0x30)<<2); /* Adds in Bits 6 and 7 to low byte */
-
- c[2]=c[2] | ((c[0] & 0x0c)<<4); /* Adds in Bits 14 and 15 to high byte */
-
- lineno=(int)c[1] +((int)c[2]<<8); /* Combines low and high bytes of lineno */
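The same conversion can be wrapped up as a function, together with its inverse. This is a sketch rather than a definitive version: the function names are invented here, and the encoder assumes that bit 6 is set and bit 7 is clear in each of the three bytes, which is how this coding is usually documented (so that GOTO 10 is stored as hex 8D 54 4A 40). The decoder ignores those two bits, so it does not depend on that assumption.

```c
/* Decodes the three bytes that follow the 141 (hex 8D) marker.
 * c[0] is the highest (left most) byte, c[2] the lowest (right most). */
unsigned int bbc_decode_lineno(const unsigned char c[3])
{
    unsigned int top = c[0] ^ 0x14u;                       /* undo the two inverted bits */
    unsigned int lo = (c[1] & 0x3fu) | ((top & 0x30u) << 2); /* bits 0-5, then 6-7 */
    unsigned int hi = (c[2] & 0x3fu) | ((top & 0x0cu) << 4); /* bits 8-13, then 14-15 */

    return lo | (hi << 8);
}

/* Inverse of the above.
 * ASSUMPTION: bit 6 set and bit 7 clear in every coded byte. */
void bbc_encode_lineno(unsigned int lineno, unsigned char c[3])
{
    unsigned int lo = lineno & 0xffu;
    unsigned int hi = (lineno >> 8) & 0xffu;

    /* top two bits of each half, with bits 6 and 14 inverted and the marker bit set */
    c[0] = (unsigned char)((((lo & 0xc0u) >> 2) | ((hi & 0xc0u) >> 4)) ^ 0x54u);
    c[1] = (unsigned char)((lo & 0x3fu) | 0x40u);
    c[2] = (unsigned char)((hi & 0x3fu) | 0x40u);
}
```

Working on local unsigned values avoids the sign problems that can appear if the bytes are held in plain char, and it leaves the caller's buffer untouched.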
-
- Tokens
- ------
-
- Herewith a complete list of tokens, both in alphabetical order and ASCII
- code order as held in a C structure:
-
- struct list
- {
- char bbc_keyword[10];
- char bbc_token_hex[3];
- char bbc_token_octal[10];
- };
-
- /* Alphabetical Order */
-
- struct list token_list[] = {
-
- "Line No", "\x8d", "\215",
- "ABS", "\x94", "\224",
- "ACS", "\x95", "\225",
- "ADVAL", "\x96", "\226",
- "AND", "\x80", "\200",
- "APPEND", "\xc7\x8e", "\307\216",
- "ASC", "\x97", "\227",
- "ASN", "\x98", "\230",
- "ATN", "\x99", "\231",
- "AUTO", "\xc7\x8f", "\307\217",
- "BEAT", "\xc6\x8f", "\306\217",
- "BEATS", "\xc8\x9e", "\310\236",
- "BGET", "\x9a", "\232",
- "BPUT", "\xd5", "\325",
- "CALL", "\xd6", "\326",
- "CASE", "\xc8\x8e", "\310\216",
- "CHAIN", "\xd7", "\327",
- "CHR$", "\xbd", "\275",
- "CIRCLE", "\xc8\x8f", "\310\217",
- "CLEAR", "\xd8", "\330",
- "CLG", "\xda", "\332",
- "CLOSE", "\xd9", "\331",
- "CLS", "\xdb", "\333",
- "COLOUR", "\xfb", "\373",
- "COS", "\x9b", "\233",
- "COUNT", "\x9c", "\234",
- "DATA", "\xdc", "\334",
- "DEF", "\xdd", "\335",
- "DEG", "\x9d", "\235",
- "DELETE", "\xc7\x90", "\307\220",
- "DIM", "\xde", "\336",
- "DIV", "\x81", "\201",
- "DRAW", "\xdf", "\337",
- "EDIT", "\xc7\x91", "\307\221",
- "ELLIPSE", "\xc8\x9d", "\310\235",
- "ELSE", "\xcc", "\314",
- "ELSE", "\x8b", "\213",
- "END", "\xe0", "\340",
- "ENDCASE", "\xcb", "\313",
- "ENDIF", "\xcd", "\315",
- "ENDPROC", "\xe1", "\341",
- "ENDWHILE", "\xce", "\316",
- "EOF", "\xc5", "\305",
- "EOR", "\x82", "\202",
- "ERL", "\x9e", "\236",
- "ERR", "\x9f", "\237",
- "ERROR", "\x85", "\205",
- "EVAL", "\xa0", "\240",
- "EXP", "\xa1", "\241",
- "EXT", "\xa2", "\242",
- "FALSE", "\xa3", "\243",
- "FILL", "\xc8\x90", "\310\220",
- "FN", "\xa4", "\244",
- "FOR", "\xe3", "\343",
- "GCOL", "\xe6", "\346",
- "GET", "\xa5", "\245",
- "GET$", "\xbe", "\276",
- "GOSUB", "\xe4", "\344",
- "GOTO", "\xe5", "\345",
- "HELP", "\xc7\x92", "\307\222",
- "HIMEM", "\xd3", "\323",
- "HIMEM", "\x93", "\223",
- "IF", "\xe7", "\347",
- "INKEY", "\xa6", "\246",
- "INKEY$", "\xbf", "\277",
- "INPUT", "\xe8", "\350",
- "INSTALL", "\xc8\x9a", "\310\232",
- "INSTR(", "\xa7", "\247",
- "INT", "\xa8", "\250",
- "LEFT$(", "\xc0", "\300",
- "LEN", "\xa9", "\251",
- "LET", "\xe9", "\351",
- "LIBRARY", "\xc8\x9b", "\310\233",
- "LINE", "\x86", "\206",
- "LIST", "\xc7\x93", "\307\223",
- "LN", "\xaa", "\252",
- "LOAD", "\xc7\x94", "\307\224",
- "LOCAL", "\xea", "\352",
- "LOG", "\xab", "\253",
- "LOMEM", "\xd2", "\322",
- "LOMEM", "\x92", "\222",
- "LVAR", "\xc7\x95", "\307\225",
- "MID$(", "\xc1", "\301",
- "MOD", "\x83", "\203",
- "MODE", "\xeb", "\353",
- "MOUSE", "\xc8\x97", "\310\227",
- "MOVE", "\xec", "\354",
- "NEW", "\xc7\x96", "\307\226",
- "NEXT", "\xed", "\355",
- "NOT", "\xac", "\254",
- "OF", "\xca", "\312",
- "OFF", "\x87", "\207",
- "OLD", "\xc7\x97", "\307\227",
- "ON", "\xee", "\356",
- "OPENIN", "\x8e", "\216",
- "OPENOUT", "\xae", "\256",
- "OPENUP", "\xad", "\255",
- "OR", "\x84", "\204",
- "ORIGIN", "\xc8\x91", "\310\221",
- "OSCLI", "\xff", "\377",
- "OTHERWISE","\x7f", "\177",
- "OVERLAY", "\xc8\xa3", "\310\243",
- "PAGE", "\xd0", "\320",
- "PAGE", "\x90", "\220",
- "PI", "\xaf", "\257",
- "PLOT", "\xf0", "\360",
- "POINT", "\xc8\x92", "\310\222",
- "POINT(", "\xb0", "\260",
- "POS", "\xb1", "\261",
- "PRINT", "\xf1", "\361",
- "PROC", "\xf2", "\362",
- "PTR", "\xcf", "\317",
- "PTR", "\x8f", "\217",
- "QUIT", "\xc8\x98", "\310\230",
- "RAD", "\xb2", "\262",
- "READ", "\xf3", "\363",
- "RECTANGLE","\xc8\x93", "\310\223",
- "REM", "\xf4", "\364",
- "RENUMBER", "\xc7\x98", "\307\230",
- "REPEAT", "\xf5", "\365",
- "REPORT", "\xf6", "\366",
- "RESTORE", "\xf7", "\367",
- "RETURN", "\xf8", "\370",
- "RIGHT$(", "\xc2", "\302",
- "RND", "\xb3", "\263",
- "RUN", "\xf9", "\371",
- "SAVE", "\xc7\x99", "\307\231",
- "SGN", "\xb4", "\264",
- "SIN", "\xb5", "\265",
- "SOUND", "\xd4", "\324",
- "SPC", "\x89", "\211",
- "SQR", "\xb6", "\266",
- "STEP", "\x88", "\210",
- "STEREO", "\xc8\xa2", "\310\242",
- "STOP", "\xfa", "\372",
- "STR$", "\xc3", "\303",
- "STRING$(", "\xc4", "\304",
- "SUM", "\xc6\x8e", "\306\216",
- "SWAP", "\xc8\x94", "\310\224",
- "SYS", "\xc8\x99", "\310\231",
- "TAB(", "\x8a", "\212",
- "TAN", "\xb7", "\267",
- "TEMPO", "\xc8\x9f", "\310\237",
- "THEN", "\x8c", "\214",
- "TIME", "\xd1", "\321",
- "TIME", "\x91", "\221",
- "TINT", "\xc8\x9c", "\310\234",
- "TO", "\xb8", "\270",
- "TOP", "\xb8P", "\270P",
- "TRACE", "\xfc", "\374",
- "TRUE", "\xb9", "\271",
- "TWIN", "\xc7\x9a", "\307\232",
- "TWINO", "\xc7\x9b", "\307\233",
- "UNTIL", "\xfd", "\375",
- "USR", "\xba", "\272",
- "VAL", "\xbb", "\273",
- "VDU", "\xef", "\357",
- "VOICE", "\xc8\xa1", "\310\241",
- "VOICES", "\xc8\xa0", "\310\240",
- "VPOS", "\xbc", "\274",
- "WAIT", "\xc8\x96", "\310\226",
- "WHEN", "\xc9", "\311",
- "WHILE", "\xc8\x95", "\310\225",
- "WIDTH", "\xfe", "\376",
- "", "", ""};
-
- /* Sorted token value order */
-
- struct list sort_token_list[] = {
-
- "OTHERWISE","\x7f", "\177",
- "AND", "\x80", "\200",
- "DIV", "\x81", "\201",
- "EOR", "\x82", "\202",
- "MOD", "\x83", "\203",
- "OR", "\x84", "\204",
- "ERROR", "\x85", "\205",
- "LINE", "\x86", "\206",
- "OFF", "\x87", "\207",
- "STEP", "\x88", "\210",
- "SPC", "\x89", "\211",
- "TAB(", "\x8a", "\212",
- "ELSE", "\x8b", "\213",
- "THEN", "\x8c", "\214",
- "Line No", "\x8d", "\215",
- "OPENIN", "\x8e", "\216",
- "PTR", "\x8f", "\217",
- "PAGE", "\x90", "\220",
- "TIME", "\x91", "\221",
- "LOMEM", "\x92", "\222",
- "HIMEM", "\x93", "\223",
- "ABS", "\x94", "\224",
- "ACS", "\x95", "\225",
- "ADVAL", "\x96", "\226",
- "ASC", "\x97", "\227",
- "ASN", "\x98", "\230",
- "ATN", "\x99", "\231",
- "BGET", "\x9a", "\232",
- "COS", "\x9b", "\233",
- "COUNT", "\x9c", "\234",
- "DEG", "\x9d", "\235",
- "ERL", "\x9e", "\236",
- "ERR", "\x9f", "\237",
- "EVAL", "\xa0", "\240",
- "EXP", "\xa1", "\241",
- "EXT", "\xa2", "\242",
- "FALSE", "\xa3", "\243",
- "FN", "\xa4", "\244",
- "GET", "\xa5", "\245",
- "INKEY", "\xa6", "\246",
- "INSTR(", "\xa7", "\247",
- "INT", "\xa8", "\250",
- "LEN", "\xa9", "\251",
- "LN", "\xaa", "\252",
- "LOG", "\xab", "\253",
- "NOT", "\xac", "\254",
- "OPENUP", "\xad", "\255",
- "OPENOUT", "\xae", "\256",
- "PI", "\xaf", "\257",
- "POINT(", "\xb0", "\260",
- "POS", "\xb1", "\261",
- "RAD", "\xb2", "\262",
- "RND", "\xb3", "\263",
- "SGN", "\xb4", "\264",
- "SIN", "\xb5", "\265",
- "SQR", "\xb6", "\266",
- "TAN", "\xb7", "\267",
- "TO", "\xb8", "\270",
- "TOP", "\xb8P", "\270P",
- "TRUE", "\xb9", "\271",
- "USR", "\xba", "\272",
- "VAL", "\xbb", "\273",
- "VPOS", "\xbc", "\274",
- "CHR$", "\xbd", "\275",
- "GET$", "\xbe", "\276",
- "INKEY$", "\xbf", "\277",
- "LEFT$(", "\xc0", "\300",
- "MID$(", "\xc1", "\301",
- "RIGHT$(", "\xc2", "\302",
- "STR$", "\xc3", "\303",
- "STRING$(", "\xc4", "\304",
- "EOF", "\xc5", "\305",
- "SUM", "\xc6\x8e", "\306\216",
- "BEAT", "\xc6\x8f", "\306\217",
- "APPEND", "\xc7\x8e", "\307\216",
- "AUTO", "\xc7\x8f", "\307\217",
- "DELETE", "\xc7\x90", "\307\220",
- "EDIT", "\xc7\x91", "\307\221",
- "HELP", "\xc7\x92", "\307\222",
- "LIST", "\xc7\x93", "\307\223",
- "LOAD", "\xc7\x94", "\307\224",
- "LVAR", "\xc7\x95", "\307\225",
- "NEW", "\xc7\x96", "\307\226",
- "OLD", "\xc7\x97", "\307\227",
- "RENUMBER", "\xc7\x98", "\307\230",
- "SAVE", "\xc7\x99", "\307\231",
- "TWIN", "\xc7\x9a", "\307\232",
- "TWINO", "\xc7\x9b", "\307\233",
- "CASE", "\xc8\x8e", "\310\216",
- "CIRCLE", "\xc8\x8f", "\310\217",
- "FILL", "\xc8\x90", "\310\220",
- "ORIGIN", "\xc8\x91", "\310\221",
- "POINT", "\xc8\x92", "\310\222",
- "RECTANGLE","\xc8\x93", "\310\223",
- "SWAP", "\xc8\x94", "\310\224",
- "WHILE", "\xc8\x95", "\310\225",
- "WAIT", "\xc8\x96", "\310\226",
- "MOUSE", "\xc8\x97", "\310\227",
- "QUIT", "\xc8\x98", "\310\230",
- "SYS", "\xc8\x99", "\310\231",
- "INSTALL", "\xc8\x9a", "\310\232",
- "LIBRARY", "\xc8\x9b", "\310\233",
- "TINT", "\xc8\x9c", "\310\234",
- "ELLIPSE", "\xc8\x9d", "\310\235",
- "BEATS", "\xc8\x9e", "\310\236",
- "TEMPO", "\xc8\x9f", "\310\237",
- "VOICES", "\xc8\xa0", "\310\240",
- "VOICE", "\xc8\xa1", "\310\241",
- "STEREO", "\xc8\xa2", "\310\242",
- "OVERLAY", "\xc8\xa3", "\310\243",
- "WHEN", "\xc9", "\311",
- "OF", "\xca", "\312",
- "ENDCASE", "\xcb", "\313",
- "ELSE", "\xcc", "\314",
- "ENDIF", "\xcd", "\315",
- "ENDWHILE", "\xce", "\316",
- "PTR", "\xcf", "\317",
- "PAGE", "\xd0", "\320",
- "TIME", "\xd1", "\321",
- "LOMEM", "\xd2", "\322",
- "HIMEM", "\xd3", "\323",
- "SOUND", "\xd4", "\324",
- "BPUT", "\xd5", "\325",
- "CALL", "\xd6", "\326",
- "CHAIN", "\xd7", "\327",
- "CLEAR", "\xd8", "\330",
- "CLOSE", "\xd9", "\331",
- "CLG", "\xda", "\332",
- "CLS", "\xdb", "\333",
- "DATA", "\xdc", "\334",
- "DEF", "\xdd", "\335",
- "DIM", "\xde", "\336",
- "DRAW", "\xdf", "\337",
- "END", "\xe0", "\340",
- "ENDPROC", "\xe1", "\341",
- "FOR", "\xe3", "\343",
- "GOSUB", "\xe4", "\344",
- "GOTO", "\xe5", "\345",
- "GCOL", "\xe6", "\346",
- "IF", "\xe7", "\347",
- "INPUT", "\xe8", "\350",
- "LET", "\xe9", "\351",
- "LOCAL", "\xea", "\352",
- "MODE", "\xeb", "\353",
- "MOVE", "\xec", "\354",
- "NEXT", "\xed", "\355",
- "ON", "\xee", "\356",
- "VDU", "\xef", "\357",
- "PLOT", "\xf0", "\360",
- "PRINT", "\xf1", "\361",
- "PROC", "\xf2", "\362",
- "READ", "\xf3", "\363",
- "REM", "\xf4", "\364",
- "REPEAT", "\xf5", "\365",
- "REPORT", "\xf6", "\366",
- "RESTORE", "\xf7", "\367",
- "RETURN", "\xf8", "\370",
- "RUN", "\xf9", "\371",
- "STOP", "\xfa", "\372",
- "COLOUR", "\xfb", "\373",
- "TRACE", "\xfc", "\374",
- "UNTIL", "\xfd", "\375",
- "WIDTH", "\xfe", "\376",
- "OSCLI", "\xff", "\377",
- "", "", ""};
-
- NB: LOMEM, HIMEM, PAGE, PTR and TIME have different tokens depending
- on whether they are statement or function tokens (whether they
- are given or give a value). The token for ELSE has different values
- depending on whether it is part of an IF statement on one line or
- spread over several lines. Also the keyword 'BY' is not tokenised.
- Note the token for 'TOP', which is that for 'TO' with a 'P' after it.
- The lists above have no entry for hex E2, which is the token for
- ENVELOPE.
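As an illustration of how the token list might be used, here is a sketch of a lookup that copes with both one-byte tokens and the two-byte tokens that start with hex C6, C7 or C8. The table is only a small excerpt of the list above, and the names bbc_token, demo_tokens, bbc_lookup_token and bbc_expand are invented for this example. It does not handle coded line numbers (after byte 141), and it makes no attempt to leave the contents of strings or REM lines alone.

```c
#include <stddef.h>
#include <string.h>

struct bbc_token {
    const char *keyword;
    unsigned char first;    /* first token byte */
    unsigned char second;   /* second token byte, 0 for one-byte tokens */
};

/* Excerpt of the token list, enough for a demonstration. */
static const struct bbc_token demo_tokens[] = {
    { "THEN",  0x8c, 0 },    { "TO",    0xb8, 0 },
    { "GOTO",  0xe5, 0 },    { "IF",    0xe7, 0 },
    { "PRINT", 0xf1, 0 },    { "CASE",  0xc8, 0x8e },
    { "WHILE", 0xc8, 0x95 }, { NULL, 0, 0 }
};

/* Returns the keyword for the token at p (n bytes available) or NULL.
 * On success *used is the number of bytes taken by the token. */
const char *bbc_lookup_token(const unsigned char *p, size_t n, size_t *used)
{
    const struct bbc_token *t;
    int two_byte;

    if (n == 0)
        return NULL;
    two_byte = (p[0] == 0xc6 || p[0] == 0xc7 || p[0] == 0xc8);
    if (two_byte && n < 2)
        return NULL;
    for (t = demo_tokens; t->keyword != NULL; t++) {
        if (t->first == p[0] && (!two_byte || t->second == p[1])) {
            *used = two_byte ? 2 : 1;
            return t->keyword;
        }
    }
    return NULL;
}

/* Expands a coded line into out (capacity cap); returns the text length.
 * Bytes that are not known tokens are copied unchanged. */
size_t bbc_expand(const unsigned char *in, size_t n, char *out, size_t cap)
{
    size_t i = 0, o = 0, used = 0;

    if (cap == 0)
        return 0;
    while (i < n) {
        const char *kw = bbc_lookup_token(in + i, n - i, &used);
        if (kw != NULL) {
            size_t k = strlen(kw);
            if (o + k >= cap)
                break;                  /* out of room, stop early */
            memcpy(out + o, kw, k);
            o += k;
            i += used;
        } else {
            if (o + 1 >= cap)
                break;
            out[o++] = (char)in[i++];
        }
    }
    out[o] = '\0';
    return o;
}
```

Because ordinary characters are copied through unchanged, the odd token for TOP needs no special case: the TO token expands to "TO" and the following 'P' is copied after it.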
-
-
- B. Coding of GW-BASIC Files
- ------------------------
-
- The format of a GW-BASIC file is as follows:
-
- <Hex FF> lines <Hex 00 00 FF FF 1A>
-
- or sometimes terminated with <Hex 00 00 FF 1A>
-
- The lines are coded as follows:
-
- <Four bytes> coded line <Hex 00>
-
- The first four bytes hold the following information:
-
- First two bytes: The position of the next line in the file.
-
- Second two bytes: The line number.
-
- The line number is coded as a binary number, the lowest byte first.
-
- The position of the next line is stored with an added offset. If the
- first byte of the file is regarded as byte 0, then the position given
- is that of the first of the next line's four bytes plus 4686 (decimal)
- (124E Hex).
- The coded position is given low byte first, high byte second.
-
- Thus if the first line is line number ten and the second line begins
- at file byte position 7, then the coded position is 7 + 4686 = 4693
- (1255 Hex) and the first four bytes of the file after the leading FF
- are:-
-
- Hex 55 12 0A 00
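The four header bytes can be decoded as in the sketch below. The names gw_header, gw_read_header and GW_POSITION_OFFSET are made up for the example, and the offset of 4686 is simply taken from the description above; whether every file uses the same offset is not something these notes establish.

```c
#define GW_POSITION_OFFSET 4686u   /* 124E hex, the offset quoted above */

struct gw_header {
    unsigned int next_pos;   /* file position of the next line's four bytes */
    unsigned int lineno;     /* line number of this line */
};

/*
 * Decodes the four bytes at the start of a line.
 * Returns 1 on success, or 0 if the coded position is zero (the 00 00
 * that ends the program) or smaller than the assumed offset.
 */
int gw_read_header(const unsigned char h[4], struct gw_header *out)
{
    unsigned int coded = h[0] | ((unsigned int)h[1] << 8);  /* low byte first */

    if (coded < GW_POSITION_OFFSET)
        return 0;
    out->next_pos = coded - GW_POSITION_OFFSET;
    out->lineno = h[2] | ((unsigned int)h[3] << 8);         /* low byte first */
    return 1;
}
```

Following next_pos from line to line is safer than searching for the 00 that ends each line, since a zero byte can also occur inside a coded number.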
-
- The Line
- --------
- The coded line has the following components:
-
- Line Numbers, Keyword Tokens, String Literals, Operators, Integers,
- Hex Literals, Octal Literals and Floating Point Literals.
-
- 1. Line Numbers (after GOTO or GOSUB)
- ------------
-
- Are 'announced' by the character with ASCII value 14 (Hex 0E, Octal 16).
-
- The line number is simply its binary value in two bytes, low byte
- first.
-
- 2. String Literals (Strings of characters enclosed in double quotes)
- ---------------
-
- Are entirely uncoded. Even characters that might be regarded as
- tokens are ignored by the BASIC interpreter.
-
- 3. Operators (i.e. for addition, subtraction etc.)
- ---------
-
- These are coded (they do not take their ASCII character value).
-
- They are:
-
- Operator Hex Token Octal Token
- -------- --------- -----------
-
- ">", "\xe6", "\346",
- "=", "\xe7", "\347",
- "<", "\xe8", "\350",
- "+", "\xe9", "\351",
- "-", "\xea", "\352",
- "*", "\xeb", "\353",
- "/", "\xec", "\354",
- "^", "\xed", "\355",
- "AND", "\xee", "\356",
- "OR", "\xef", "\357",
- "XOR", "\xf0", "\360",
- "EQV", "\xf1", "\361",
- "IMP", "\xf2", "\362",
- "MOD", "\xf3", "\363",
- "\\", "\xf4", "\364", (backslash)
- "NOT", "\xd3", "\323"
-
- 4. Integer Literals (a sequence of numeric characters e.g. 259 )
- ----------------
-
- These are coded in a different way depending on whether the number
- is less than ten, greater than nine and less than 256, or greater
- than 255.
-
- Less than ten.
- -------------
-
- Coded as single byte which has the ASCII value of 17 + the numeric value,
- hence:
-
- 0 is coded as a byte with ASCII code 17 (decimal).
- 1 " " " " " " " " 18
- 2 " " " " " " " " 19
- .....
- 9 " " " " " " " " 26
-
- Greater than 9 and less than 256
- --------------------------------
-
- Coded as a single byte preceded by a byte with ASCII code 15
- (Hex 0F Octal 17).
-
- The byte simply holds the numeric value.
-
- Greater than 255
- ----------------
-
- Coded in two bytes preceded by a byte with ASCII code 28 (decimal)
- (Hex 1C, Octal 34).
-
- The two bytes simply hold the numeric value, low byte first.
-
- 5. Hex Literals (a string of hex characters preceded by &H e.g. &HF1)
- ------------
-
- Converted to their binary representation, then coded in a similar
- way to integers greater than 255 and preceded by a character with
- ASCII code 12 (Hex 0C, Octal 14).
-
- 6. Octal Literals (a string of octal characters preceded by & or &O
- -------------- e.g. &O701 )
-
- Converted to their binary representation, then as for hex literals
- preceded by a character with ASCII value 11 (decimal) (Hex 0B,
- Octal 13).
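The codings in sections 1, 4, 5 and 6 all come down to a marker byte followed by nought, one or two value bytes, so they can be read by one small routine. The sketch below uses the marker values given above; the function name gw_read_integer is invented, and it does not tell the caller which kind of literal was found, nor does it say anything about how negative values are stored.

```c
#include <stddef.h>

/*
 * Decodes the number that starts at p (n bytes available).
 * On success stores the value in *value and returns how many bytes were
 * used (marker included). Returns 0 if p does not start with one of the
 * markers described above or the data is cut short.
 */
size_t gw_read_integer(const unsigned char *p, size_t n, unsigned int *value)
{
    if (n == 0)
        return 0;
    if (p[0] >= 17 && p[0] <= 26) {            /* single digit 0 to 9 */
        *value = p[0] - 17u;
        return 1;
    }
    if (p[0] == 15 && n >= 2) {                /* one value byte */
        *value = p[1];
        return 2;
    }
    /* 28 integer, 12 hex literal, 11 octal literal, 14 line number:
       two value bytes, low byte first */
    if ((p[0] == 28 || p[0] == 12 || p[0] == 11 || p[0] == 14) && n >= 3) {
        *value = p[1] | ((unsigned int)p[2] << 8);
        return 3;
    }
    return 0;
}
```

A decoder for a whole line would call this before trying the token table, because the value bytes can take any value, including values that look like tokens.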
-
- 7. Tokens (substituted for the key words of the language)
- ------
-
- A complete list of tokens in a C structure follows:
-
- struct list
- {
- char gw_keyword[10];
- char gw_token_hex[3];
- char gw_token_octal[3];
- };
-
-
- struct list token_list[]={
-
- "END", "\x81", "\201",
- "FOR", "\x82", "\202",
- "NEXT", "\x83", "\203",
- "DATA", "\x84", "\204",
- "INPUT", "\x85", "\205",
- "DIM", "\x86", "\206",
- "READ", "\x87", "\207",
- "LET", "\x88", "\210",
- "GOTO", "\x89", "\211",
- "RUN", "\x8a", "\212",
- "IF", "\x8b", "\213",
- "RESTORE", "\x8c", "\214",
- "GOSUB", "\x8d", "\215",
- "RETURN", "\x8e", "\216",
- "REM", "\x8f", "\217",
- "STOP", "\x90", "\220",
- "PRINT", "\x91", "\221",
- "CLEAR", "\x92", "\222",
- "LIST", "\x93", "\223",
- "NEW", "\x94", "\224",
- "ON", "\x95", "\225",
- "WAIT", "\x96", "\226",
- "DEF", "\x97", "\227",
- "POKE", "\x98", "\230",
- "CONT", "\x99", "\231",
- "OUT", "\x9c", "\234",
- "LPRINT", "\x9d", "\235",
- "LLIST", "\x9e", "\236",
- "WIDTH", "\xa0", "\240",
- "ELSE", ":\xa1", ":\241",
- "TRON", "\xa2", "\242",
- "TROFF", "\xa3", "\243",
- "SWAP", "\xa4", "\244",
- "ERASE", "\xa5", "\245",
- "EDIT", "\xa6", "\246",
- "ERROR", "\xa7", "\247",
- "RESUME", "\xa8", "\250",
- "DELETE", "\xa9", "\251",
- "AUTO", "\xaa", "\252",
- "RENUM", "\xab", "\253",
- "DEFSTR", "\xac", "\254",
- "DEFINT", "\xad", "\255",
- "DEFSNG", "\xae", "\256",
- "DEFDBL", "\xaf", "\257",
- "LINE", "\xb0", "\260",
- "WHILE", "\xb1\xe9", "\261\351",
- "WEND", "\xb2", "\262",
- "CALL", "\xb3", "\263",
- "WRITE", "\xb7", "\267",
- "OPTION", "\xb8", "\270",
- "RANDOMIZE","\xb9", "\271",
- "OPEN", "\xba", "\272",
- "CLOSE", "\xbb", "\273",
- "LOAD", "\xbc", "\274",
- "MERGE", "\xbd", "\275",
- "SAVE", "\xbe", "\276",
- "COLOR", "\xbf", "\277",
- "CLS", "\xc0", "\300",
- "MOTOR", "\xc1", "\301",
- "BSAVE", "\xc2", "\302",
- "BLOAD", "\xc3", "\303",
- "SOUND", "\xc4", "\304",
- "BEEP", "\xc5", "\305",
- "PSET", "\xc6", "\306",
- "PRESET", "\xc7", "\307",
- "SCREEN", "\xc8", "\310",
- "KEY", "\xc9", "\311",
- "LOCATE", "\xca", "\312",
- "TO", "\xcc", "\314",
- "THEN", "\xcd", "\315",
- "TAB(", "\xce", "\316",
- "STEP", "\xcf", "\317",
- "USR", "\xd0", "\320",
- "FN", "\xd1", "\321",
- "NOT", "\xd3", "\323",
- "ERL", "\xd4", "\324",
- "ERR", "\xd5", "\325",
- "STRING$", "\xd6", "\326",
- "USING", "\xd7", "\327",
- "INSTR", "\xd8", "\330",
- "VARPTR", "\xda", "\332",
- "CSRLIN", "\xdb", "\333",
- "POINT", "\xdc", "\334",
- "OFF", "\xdd", "\335",
- "INKEY$", "\xde", "\336",
- ">", "\xe6", "\346",
- "=", "\xe7", "\347",
- "<", "\xe8", "\350",
- "+", "\xe9", "\351",
- "-", "\xea", "\352",
- "*", "\xeb", "\353",
- "/", "\xec", "\354",
- "^", "\xed", "\355",
- "AND", "\xee", "\356",
- "OR", "\xef", "\357",
- "XOR", "\xf0", "\360",
- "EQV", "\xf1", "\361",
- "IMP", "\xf2", "\362",
- "MOD", "\xf3", "\363",
- "\\", "\xf4", "\364",
- "CVI", "\xfd\x81", "\375\201",
- "CVS", "\xfd\x82", "\375\202",
- "CVD", "\xfd\x83", "\375\203",
- "MKI$", "\xfd\x84", "\375\204",
- "MKS$", "\xfd\x85", "\375\205",
- "MKD$", "\xfd\x86", "\375\206",
- "FILES", "\xfe\x81", "\376\201",
- "FIELD", "\xfe\x82", "\376\202",
- "SYSTEM", "\xfe\x83", "\376\203",
- "NAME", "\xfe\x84", "\376\204",
- "LSET", "\xfe\x85", "\376\205",
- "RSET", "\xfe\x86", "\376\206",
- "KILL", "\xfe\x87", "\376\207",
- "PUT", "\xfe\x88", "\376\210",
- "GET", "\xfe\x89", "\376\211",
- "RESET", "\xfe\x8a", "\376\212",
- "COMMON", "\xfe\x8b", "\376\213",
- "CHAIN", "\xfe\x8c", "\376\214",
- "DATE$", "\xfe\x8d", "\376\215",
- "TIME$", "\xfe\x8e", "\376\216",
- "PAINT", "\xfe\x8f", "\376\217",
- "COM", "\xfe\x90", "\376\220",
- "CIRCLE", "\xfe\x91", "\376\221",
- "DRAW", "\xfe\x92", "\376\222",
- "PLAY", "\xfe\x93", "\376\223",
- "TIMER", "\xfe\x94", "\376\224",
- "ERDEV", "\xfe\x95", "\376\225",
- "IOCTL", "\xfe\x96", "\376\226",
- "CHDIR", "\xfe\x97", "\376\227",
- "MKDIR", "\xfe\x98", "\376\230",
- "RMDIR", "\xfe\x99", "\376\231",
- "SHELL", "\xfe\x9a", "\376\232",
- "ENVIRON", "\xfe\x9b", "\376\233",
- "VIEW", "\xfe\x9c", "\376\234",
- "WINDOW", "\xfe\x9d", "\376\235",
- "PMAP", "\xfe\x9e", "\376\236",
- "PALETTE", "\xfe\x9f", "\376\237",
- "LCOPY", "\xfe\xa0", "\376\240",
- "CALLS", "\xfe\xa1", "\376\241",
- "PCOPY", "\xfe\xa5", "\376\245",
- "LOCK", "\xfe\xa7", "\376\247",
- "UNLOCK", "\xfe\xa8", "\376\250",
- "LEFT$", "\xff\x81", "\377\201",
- "RIGHT$", "\xff\x82", "\377\202",
- "MID$", "\xff\x83", "\377\203",
- "SGN", "\xff\x84", "\377\204",
- "INT", "\xff\x85", "\377\205",
- "ABS", "\xff\x86", "\377\206",
- "SQR", "\xff\x87", "\377\207",
- "RND", "\xff\x88", "\377\210",
- "SIN", "\xff\x89", "\377\211",
- "LOG", "\xff\x8a", "\377\212",
- "EXP", "\xff\x8b", "\377\213",
- "COS", "\xff\x8c", "\377\214",
- "TAN", "\xff\x8d", "\377\215",
- "ATN", "\xff\x8e", "\377\216",
- "FRE", "\xff\x8f", "\377\217",
- "INP", "\xff\x90", "\377\220",
- "POS", "\xff\x91", "\377\221",
- "LEN", "\xff\x92", "\377\222",
- "STR$", "\xff\x93", "\377\223",
- "VAL", "\xff\x94", "\377\224",
- "ASC", "\xff\x95", "\377\225",
- "CHR$", "\xff\x96", "\377\226",
- "PEEK", "\xff\x97", "\377\227",
- "SPACE$", "\xff\x98", "\377\230",
- "OCT$", "\xff\x99", "\377\231",
- "HEX$", "\xff\x9a", "\377\232",
- "LPOS", "\xff\x9b", "\377\233",
- "CINT", "\xff\x9c", "\377\234",
- "CSNG", "\xff\x9d", "\377\235",
- "CDBL", "\xff\x9e", "\377\236",
- "FIX", "\xff\x9f", "\377\237",
- "PEN", "\xff\xa0", "\377\240",
- "STICK", "\xff\xa1", "\377\241",
- "STRIG", "\xff\xa2", "\377\242",
- "EOF", "\xff\xa3", "\377\243",
- "LOC", "\xff\xa4", "\377\244",
- "LOF", "\xff\xa5", "\377\245",
- "","",""};
-
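A table held as strings, as above, can be searched directly by comparing the token bytes with the start of the input. The sketch below does this on a short excerpt of the list and picks the longest match, so that the two-byte sequences (those starting with hex FD, FE or FF, and the special cases for ELSE and WHILE) are found as a whole. The names demo_list and gw_lookup are invented for the example, and the octal column is left out to keep it short.

```c
#include <stddef.h>
#include <string.h>

struct list
{
    char gw_keyword[10];
    char gw_token_hex[3];
};

/* Excerpt of the token list, enough for a demonstration. */
static const struct list demo_list[] = {
    { "PRINT", "\x91" },
    { "ELSE",  ":\xa1" },
    { "WHILE", "\xb1\xe9" },
    { "+",     "\xe9" },
    { "KILL",  "\xfe\x87" },
    { "CHR$",  "\xff\x96" },
    { "",      "" }
};

/* Returns the keyword whose token bytes start at p (n bytes available),
 * preferring the longest token, or NULL if nothing matches.
 * On success *used is the number of bytes the token occupies. */
const char *gw_lookup(const unsigned char *p, size_t n, size_t *used)
{
    const struct list *e;
    const struct list *best = NULL;
    size_t best_len = 0;

    for (e = demo_list; e->gw_keyword[0] != '\0'; e++) {
        size_t len = strlen(e->gw_token_hex);
        if (len <= n && len > best_len &&
            memcmp(p, e->gw_token_hex, len) == 0) {
            best = e;
            best_len = len;
        }
    }
    if (best == NULL)
        return NULL;
    *used = best_len;
    return best->gw_keyword;
}
```

This should only be applied outside string literals and outside the value bytes of coded numbers, since any byte value can occur there.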
-
- 8. Floating Point Literals (a fractional number e.g. 62.8 )
- -----------------------
-
- The most difficult one of the lot!
-
- Coded in exponent and mantissa format, preceded by a character with ASCII
- code 29 (Hex 1D, Octal 35).
-
- Consists of four bytes, the first three the mantissa, the last the
- exponent. The mantissa is held lowest byte first, middle byte
- second, highest byte third.
-
- The mantissa is as would be expected, the successive bits of the
- floating point number with the highest bit removed (it is always
- 1).
-
- The exponent is the power of two needed in order to shift the number
- down until it is less than 1, however in the coding the highest bit
- is inverted.
-
- Thus the floating point number 4.5 has an exponent of 3 (or -3 depending
- on how you look at it) and a mantissa of 1001 (in binary). Hence the
- coding would be:
-
- Hex 00 00 10 83 (Preceded by Hex 1D).
-
- I coded this in C as follows:
-
- /* If dumnum is the double representation of the number (it must be
- greater than zero) then charexp is the coded exponent and mant0
- is the highest byte, mant1 is the middle byte and mant2 is the
- lowest byte of the mantissa. In the file the order is mant2,
- mant1, mant0, charexp. Needs <math.h>. */
-
-
- /* exponent; floor() so that numbers below 1 are rounded downwards */
- iexp=(int)floor(log(dumnum)/log(2.0));
- /* mantissa in lower three bytes */
- imant=(int)(dumnum*pow(2.0,23.0-(double)iexp));
- imant=imant&0x007fffff;
- /* convert to GW-BASIC representation */
- iexp++;
- charexp=iexp^0x80;
- mant0=(imant>>16)&0xff;
- mant1=(imant>>8)&0xff;
- mant2=imant&0xff;
-
- I have not investigated what happens for negative numbers, presumably
- the top bit of the mantissa is set to 1. And note that if dumnum is
- negative the log(dumnum) will blow up!!
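Going the other way, from the four coded bytes back to a number, might look like the sketch below. The function name gw_decode_single is invented. Two points in it are assumptions that the notes above do not confirm: that an exponent byte of zero means the value zero, and that the top bit of the highest mantissa byte is the sign. Both follow the usual description of the Microsoft floating point format.

```c
/*
 * Converts the four bytes that follow the 29 (hex 1D) marker to a double.
 * b[0], b[1], b[2] are the mantissa, lowest byte first; b[3] is the
 * coded exponent.
 */
double gw_decode_single(const unsigned char b[4])
{
    unsigned long mant;
    int shift;
    double value;

    if (b[3] == 0)
        return 0.0;             /* ASSUMPTION: exponent 0 means zero */

    /* Put back the leading 1 bit that the coding leaves out. */
    mant = 0x800000ul | ((unsigned long)(b[2] & 0x7f) << 16)
         | ((unsigned long)b[1] << 8) | b[0];
    value = (double)mant;

    /* mant is the fraction times 2^24 and the exponent has 128 added,
       so the value is mant * 2^(b[3] - 128 - 24). */
    shift = (int)b[3] - 128 - 24;
    while (shift > 0) { value *= 2.0; shift--; }
    while (shift < 0) { value /= 2.0; shift++; }

    /* ASSUMPTION: top bit of the highest mantissa byte is the sign. */
    return (b[2] & 0x80) ? -value : value;
}
```

Scaling by repeated doubling or halving is exact for these values and avoids any need for the maths library. For the example above, the bytes 00 00 10 83 give a mantissa of hex 900000 and a shift of -21, which comes to 4.5.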
-
- That's the lot!
-
- Comments, queries, curses, praise to:-
-
- Martin Carradus,
- 3 Connaught Road,
- Ilkley,
- West Yorkshire,
- LS29 8QW.
- U.K.
-
- If you write to me, please enclose an s.a.e.
-
- P.S. On an Acorn Archimedes there is the operating system command:
-
- *DUMP filename
-
- which gives a hex and character dump of the file with the name 'filename'.
-
- There is usually a similar command on other computers.
-
-