as68 - 68000 Assembler, version 1.02
(c) copyright 1982 Steve Passe all rights reserved
Modified to support CC68K C Compiler by Brian Brown, Nov 1989
TABLE OF CONTENTS
Chapter 1 Introduction
Chapter 2 Usage
Chapter 3 Pseudo-ops
Chapter 4 Mnemonics
Chapter 5 Expressions
Chapter 6 S File Format
Chapter 7 Error Messages
Chapter 8 Differences
CHAPTER 1
INTRODUCTION
The as68 assembler is a disk-to-disk assembler for the Motorola 68000
microprocessor. Written in the C programming language, it may be used as
a cross assembler on any machine supporting C, or as a native assembler if
compiled with a C compiler that produces 68000 output. Its directives and
mnemonic set closely follow those of the Motorola Resident Structured
Assembler. It has been altered to accept assembler output from the 68000 C
Compiler. This modification was performed by B Brown at the Central Institute
of Technology, Heretaunga, New Zealand in 1989.
SOURCE PROGRAM
The input to the assembler is an ASCII text file, consisting of a series of
statements written in the assembly language. Each statement consists of one or
more fields within a line.
The assembler is free format within each line, i.e. there is no need to start
a specific field of a statement in a particular column. Fields are separated
from one another with whitespace (tabs or spaces).
STATEMENTS
There are 3 basic statement types. The most common is an assembly language
instruction or mnemonic. It is a command to the assembler to produce a machine
operation code to carry out a specific action.
The second type of statement is called an assembly directive or pseudo-op.
Pseudo-ops tell the assembler how to assemble the program.
The third statement type is called a comment. It is ignored by the assembler;
its purpose is to allow the programmer to insert descriptions of what the
code is doing within the text of the source program. Comments may also appear
as the final field of the other two statement types.
INSTRUCTION STATEMENTS
An instruction statement consists of from one to four fields:
[label] <mnemonic> [operand] [comment]
LABEL FIELD
The first field, the label field, is optional. It is used to create a symbolic
name for the address of the code generated by the following assembler
mnemonic. This label is stored in the symbol table and any references to it
evaluate to the associated address.
The label field may be the only field of a statement, and multiple label-only
statements may follow one another. In all cases the label(s) will evaluate to
the address of the first mnemonic to be assembled after the label(s) is
specified.
Labels are composed of alphanumeric characters and may be up to 30 characters
long. All characters of a label are significant, as is the case of alphabetic
characters (i.e. "Foo" is different from "foo").
The first character of a label must be either alphabetic or the character '.'
(period). Following characters may also include the underscore (_), dollar sign
($), and the digits '0' thru '9'.
Labels starting in any column other than the first must be terminated with a
colon (:). Certain symbols are reserved for the use of the assembler and thus
may not be used as labels. These include "SP", "USP", "SR", "CCR", "A0"
through "A7" and "D0" through "D7".
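The label rules above can be restated as a small check. The following Python sketch is illustrative only and is not taken from the as68 source. The names RESERVED and is_valid_label are hypothetical. Two points are assumptions: that '.' is also accepted after the first character, and that reserved names are matched without regard to case.

```python
import re

# Reserved names listed in the manual: SP, USP, SR, CCR, A0-A7, D0-D7.
RESERVED = {"SP", "USP", "SR", "CCR"} | {f"{r}{n}" for r in "AD" for n in range(8)}

# First character: a letter or '.'. Later characters may also be digits,
# '_' or '$' (and, by assumption, '.'). Total length is at most 30.
_LABEL_RE = re.compile(r"[A-Za-z.][A-Za-z0-9_$.]{0,29}")

def is_valid_label(name):
    """Return True if name satisfies the label rules described above."""
    if _LABEL_RE.fullmatch(name) is None:
        return False
    # Assumption: reserved names are matched without regard to case.
    return name.upper() not in RESERVED
```

Under these assumptions "Foo", ".L1" and "a_b$1" are accepted, while "1abc" (leading digit) and "D0" (reserved) are rejected.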
MNEMONIC FIELD
The second field is the mnemonic or assembly instruction field. It will always
be present in a statement except in the case of a label only statement (label
only statements might more properly be described as assembler directives). If
the line is unlabeled, the mnemonic field must be preceded by whitespace.
A mnemonic will consist of from 2 to 5 ASCII characters, the case of which is
not significant. This assembler recognizes the standard Motorola instruction
set. The complete mnemonic instruction set is described in chapter 4,
"Mnemonics". Many 68000 instructions may work on different data sizes.
The desired data size is specified by appending a length modifier or data size
code to the mnemonic. A '.b' extension specifies a data size of byte (8 bits)
while '.l' will cause the data size to be a long word (32 bits). No extension
will cause the data size to be a word (16 bits). A '.w' extension may be used
for data sizes of word, although this size is the default and as such the '.w'
modifier is unnecessary.
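As an illustration of the size modifiers just described, the Python sketch below splits a mnemonic field into its name and data size. It is not as68's own parser, and the names SIZE_BITS and split_mnemonic are hypothetical.

```python
# Data size in bits for each length modifier; no modifier means word.
SIZE_BITS = {"b": 8, "w": 16, "l": 32}

def split_mnemonic(field):
    """Split e.g. 'move.l' into ('MOVE', 32). Case is not significant."""
    name, dot, ext = field.partition(".")
    if not dot:
        return name.upper(), 16          # default data size is word
    try:
        return name.upper(), SIZE_BITS[ext.lower()]
    except KeyError:
        raise ValueError("unknown size modifier: .%s" % ext) from None
```

For example, split_mnemonic("move.l") gives ("MOVE", 32), and both "tst" and "tst.w" give a size of 16.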
OPERAND FIELD
The operand field is necessary only for those statements whose mnemonic
requires an operand(s). It will contain one or two operands. When two operands
are present they must be separated with a comma (no whitespace allowed between
operands).
The first of two operands is referred to as the source operand while the second
is the destination operand.
COMMENT FIELD
The comment field is optional and consists of all text following the above
fields.
DIRECTIVES
Label Field - Labels used with directive statements follow the general rules
of those used in assembly statements with one important
exception: they may only be used with the following directives:
1. EQU
2. SET
3. DC
4. DS
Directive Field - The directive field contains an instruction to the assembler
as to how the program should be assembled. This includes such
things as the base address of the program, setting of symbol
values, allocation of program memory storage, conditional
assembly, etc. The complete list of available assembly
directives is given in chapter 3, "Pseudo-ops".
Operand Field - The operand field of a directive statement will consist of
zero or more operands, as needed by the pseudo-op in question.
Multiple operands are separated with a comma (,). No whitespace
may exist between operands.
Comment Field - The comment field is identical to that used in instruction
statements and is optional.
Comments - Comments may exist alone as separate statements. In such cases
an asterisk (*) must be the first character on the line.
CHAPTER 2
USAGE
Command Line Format - The command line format is:
as68 <sourcefile>[.ext] [option[ option]]
where:
sourcefile is the source file name.
ext is an optional file extension identifier. By default the
assembler expects source files to have an ext of ".asm".
option is one of several possible options in assembly. Whitespace
must separate multiple options when they occur. Individual
options are described below.
OPTIONS
The following options are available:
e destination of error messages. If absent all error messages will go
to the console by default. If present, one or more of the following
destinations may be specified:
    c error messages to the console.
    e error messages to a file named "sourcefile.err".
    f errors reported in listing.
l destination of assembly listing. If absent no listing is made. One or
more of the following option extensions are available:
    c listing to console.
    f listing to a file named "sourcefile.ls"
o type and destination of object file. If not present the object file
will be in Motorola 'S' FILE format.
    s object to a file named "sourcefile.s19"
    x no object file to be made.
s set the symbol table size. The symbol table requires 8 bytes plus
the length of the symbol for each entry. The argument should be in
decimal bytes. The symbol table defaults to 2000 bytes (decimal). To
reserve 3500 bytes the option would be: 's3500'.
t truncate source code lines in listing. This option will cause the
source code lines sent to any open list channel to be truncated at
the normal wrap position (see option 'w' below). It defaults to
being off, a 't' in the command line will turn it on.
w set the value of wrap. Source code lines in the listing(s) will
normally be 'wrapped' to the next line if they extend beyond the
column number specified by this option. If the 't' option is active
this option specifies the column beyond which source lines are
truncated. The default value of 'w' is 80, but it can be set to 60 or any
larger reasonable number of columns. Note that this number should be set to
the width of your list device; it is not the number of columns of source code
per line (i.e. a value of 80 allows 40 characters of source per line after
accounting for the 40 columns used by the line/loc/code fields).
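The arithmetic behind the 's' and 'w' options can be sketched in Python as follows. The figures (8 bytes of overhead per symbol, 40 listing columns used by the line/loc/code fields) come from the descriptions above. The function names are hypothetical and the code is not part of as68.

```python
LISTING_OVERHEAD = 40      # columns used by the line/loc/code fields
ENTRY_OVERHEAD = 8         # bytes per symbol table entry, excluding the name

def symbol_table_bytes(symbols):
    """Bytes needed to hold the given symbol names."""
    return sum(ENTRY_OVERHEAD + len(name) for name in symbols)

def source_columns(wrap=80):
    """Columns of source text that fit on one listing line."""
    return wrap - LISTING_OVERHEAD

def size_option(symbols):
    """Build an 's' option just large enough for the given symbols."""
    return "s%d" % symbol_table_bytes(symbols)
```

By this estimate the default table of 2000 bytes holds 125 symbols of 8 characters each, and a wrap value of 132 leaves 92 columns for source text.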
OTHER SYSTEMS
The code presently supports the MSDOS operating system. Some work would have
to be done in the command line parser to bring it up on other operating
systems.
CHAPTER 3
PSEUDO-OPS
Assembly Control
The following assembler directives are used to control the assembly:
ORG is used to specify the absolute memory origin of the code to be
assembled. The operand is an expression that evaluates to an
address within the first 64 kilobytes of memory space (0 thru
$FFFF). Any memory references outside this range will cause an
assembly error. Be aware that the 68000 chip sign extends
absolute short addresses. Thus address references at or above 32k
($8000) will access hardware memory in the range of $FF8000
thru $FFFFFF. This pseudo-op will cause the assembler to
generate code using absolute short addressing.
ORG.L is also used to specify the absolute memory origin. However, in
this case the entire address range of the 68000 is usable (0 thru
$FFFFFF). The assembler will generate code using absolute long
addressing.
RORG causes the assembler to generate program counter relative code.
The memory range restrictions of the "ORG" directive apply.
RORG.L causes the assembler to generate program counter relative code as
above, however the expression in the operand field may evaluate
to any value within the 68000's address range.
END signals the end of the assembly language program.
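The sign extension of absolute short addresses mentioned under ORG can be illustrated with a short Python sketch. It models the address seen on the 68000's 24 bit address bus and is not code from as68. The function names are hypothetical.

```python
def short_to_effective(addr16):
    """Sign extend a 16 bit absolute short address to a 24 bit address."""
    if not 0 <= addr16 <= 0xFFFF:
        raise ValueError("absolute short address must be 0 thru $FFFF")
    if addr16 & 0x8000:                 # bit 15 set: upper bits become ones
        return 0xFF0000 | addr16
    return addr16

def fits_short(addr24):
    """True if a 24 bit address can be reached with short addressing."""
    return 0 <= addr24 < 0x8000 or 0xFF8000 <= addr24 <= 0xFFFFFF
```

A short reference to $8000 therefore reaches $FF8000, while an address such as $FFA000 in the last 32k can be reached with a short reference and $8000 itself cannot.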
SYMBOL DEFINITION
The following directives control the definition of symbols:
EQU defines a symbol and sets its value to that of the operand. This
symbol value is permanent, i.e. it cannot be changed later in a
program. The operand may be a complex expression but cannot
make forward references.
SET defines or redefines a symbol and sets its value to that of the
operand. The value is temporary and may be reset with another
'SET' directive. Again, forward references are not allowed.
MEMORY ALLOCATION
These directives are used to reserve and/or initialize memory:
DC fills memory locations with constant value(s). An extension of '.B'
causes individual bytes to be filled with the value of the operand(s).
An extension of '.L' will cause the operands to be evaluated as 32
bit values, which are placed in 4 byte blocks, one for each operand.
No extension or '.W' signifies that the operand(s) are to be
evaluated as 16 bit values, each being stored in consecutive 2 byte
locations. Word and long word values are aligned on even address
boundaries.
DC.B directives causing the location counter to end on an odd
address will pad 1 byte with a zero value unless the next source
statement is another DC.B.
DS reserves memory locations. Again, a data size extension may be
appended to 'DS' to specify either byte or long word allocation.
The operand specifies the number of data cells to reserve, i.e. if
the extension is '.L' and the operand evaluates to 5, twenty bytes
(5 * 4 bytes for a long word) are reserved. Word alignment is
not automatic as in the DC directive. To force alignment after a
DS.B statement use a "ds 0" statement.
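The storage arithmetic for DS can be written out as a Python sketch. The cell sizes follow the description above, with word taken as the size when no extension is given (an assumption based on the DC directive). The names ds_bytes and align_even are hypothetical.

```python
CELL_BYTES = {"B": 1, "W": 2, "L": 4}   # bytes per data cell

def ds_bytes(count, ext="W"):
    """Bytes reserved by DS for the given cell count and size extension."""
    return count * CELL_BYTES[ext.upper()]

def align_even(loc):
    """Round a location counter up to the next even address."""
    return loc + (loc & 1)
```

Here ds_bytes(5, "L") is 20. After a DS.B that leaves the location counter at an odd address such as $1001, a "ds 0" reserves no cells but moves the counter to $1002, which is what align_even models.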
LIST CONTROLS
These directives are used to control the listing output:
LIST causes listing output (if enabled from the command line) to be
sent to each of the open list channels. LIST is active by default
and remains so until a NOLIST pseudo is encountered.
NOLIST causes listing output to be turned off until a LIST pseudo is
encountered.
CHAPTER 4
MNEMONICS
The mnemonics used by this assembler follow the standard mnemonic instruction
set as defined by Motorola. Mnemonics may exist in either upper or lower case
in the source file; the assembler makes no distinction.
ABCD add binary coded decimal
ADD add binary
ADDQ add quick binary, operand in range of 1 thru 8
ADDX add binary with extend
AND logical and
ASL arithmetic shift left
ASR arithmetic shift right
Bcc branch conditionally
BCHG bit test and change
BCLR bit test and clear
BRA branch unconditional
BSET bit test and set
BSR branch to subroutine
BTST bit test
CHK check register against boundaries
CLR clear operand
CMP compare
CMPM compare memory
DBcc test condition, decrement and branch
DIVS signed divide
DIVU unsigned divide
EOR exclusive or
EXG exchange registers
EXT sign extend
JMP jump to address
JSR jump to subroutine
LEA load effective address
LINK link and allocate
LSL logical shift left
LSR logical shift right
MOVE move
MOVEM move multiple registers
MOVEP move peripheral data
MOVEQ move quick, operand in range of -128 thru 127
MULS signed multiply
MULU unsigned multiply
NBCD negate binary coded decimal
NEG negate
NEGX negate with extend
NOP no operation
NOT bitwise complement
OR logical or
PEA push effective address
RESET reset external devices
ROL rotate left
ROR rotate right
ROXL rotate left with extend
ROXR rotate right with extend
RTE return from exception
RTR return and restore condition codes
RTS return from subroutine
SBCD subtract binary coded decimal
Scc set conditional
STOP stop
SUB subtract binary
SUBQ subtract quick binary, operand in range of 1 thru 8
SUBX subtract binary with extend
SWAP swap register halves
TAS test and set operand
TRAP trap
TRAPV trap on overflow
TST test an operand
UNLK unlink
CHAPTER 5
EXPRESSIONS
Expressions consist of one or more symbols combined by binary and/or unary
(algebraic) operators. Possible symbols include:
- Symbols defined with the EQU and SET directives.
- Program labels.
- Numeric values.
- The asterisk ('*'), which evaluates to the present value of the program
location counter.
ALGEBRAIC OPERATORS INCLUDE:
- arithmetic operators:
* multiplication: '*'
* division: '/'
* addition: '+'
* subtraction: '-'
- logical operators:
* logical (bitwise) AND: '&'
* logical (bitwise) OR: '!'
* left shift: '<<'
* right shift: '>>'
- unary operators:
* unary minus: '-'
* location counter value: '*'
SYMBOLS
Symbols are composed of alphanumeric characters and may be up to 30 characters
long. All characters of a symbol are significant, as is the case of alphabetic
characters (i.e. "Foo" is different from "foo"). The first character of a
symbol must be either alphabetic or the character '.' (period). Following
characters may also include the underscore (_), dollar sign ($), and the digits
'0' thru '9'.
Numbers may be represented as decimal, hexadecimal, or binary:
- decimal numbers are represented by the normal ASCII digits '0' thru '9'.
- hexadecimal numbers start with a dollar sign ('$') followed by the ASCII
digits '0' thru '9' and the ASCII characters 'A' thru 'F'.
- binary numbers start with a percent sign ('%') followed by the ASCII digits
'0' and '1'.
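The three number formats can be summarized with a small Python sketch. This is an illustration, not the assembler's own scanner, and parse_number is a hypothetical name. Accepting lowercase hex digits is an assumption, based on the lowercase "$ffa000" used in an example in chapter 8.

```python
def parse_number(text):
    """Convert a numeric token to an int: decimal, $hex, or %binary."""
    if text.startswith("$"):
        digits, base = text[1:], 16
    elif text.startswith("%"):
        digits, base = text[1:], 2
    else:
        digits, base = text, 10
    # int() alone would accept signs, underscores and surrounding spaces,
    # so check the digit set explicitly first.
    allowed = "0123456789abcdef"[:base]
    if not digits or any(c not in allowed for c in digits.lower()):
        raise ValueError("bad numeric token: %r" % text)
    return int(digits, base)
```

With this sketch "255", "$FF" and "%11111111" all evaluate to the same value.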
CHAPTER 6
S FILE FORMAT
A Motorola "S file" is similar in structure to an Intel hex file. It consists
of a series of ASCII records, each in the following format:
- The record start character, an uppercase ASCII 'S', followed by an ASCII
numeral, '0' thru '9':
* a '0' for the file header record.
* a '1' for records with short (16 bit) addresses.
* a '2' for records with medium (24 bit) addresses.
* a '3' for records with long (32 bit) addresses.
* ......
* a '9' for the tail record.
- The third and fourth characters form an ASCII hex representation of the
number of bytes that follow in the record (address, data, and checksum).
- An address field consisting of 4, 6, or 8 ASCII characters representing
the load address of the record (record types S1, S2, S3, respectively).
- The body of the record, consisting of up to 16 bytes of data, each byte
represented as 2 ASCII hex characters.
- A checksum byte, again represented by two ASCII characters:
* The checksum for each record (except the last, S9) is the least
significant byte of the 1's complement of the sum of the byte
values of the count, address, and data fields (i.e. everything in
the record except 'Sx').
* The checksum on the 'S9' record is undefined/non-existent.
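The record layout and checksum rule can be sketched in Python as below. This is a minimal illustration for S1, S2 and S3 data records, not code from as68. The name s_record is hypothetical, and the count byte follows the standard Motorola convention of covering the address, data and checksum bytes.

```python
ADDRESS_BYTES = {1: 2, 2: 3, 3: 4}     # address width for S1, S2, S3

def s_record(rectype, address, data):
    """Format one S1/S2/S3 data record as an ASCII string."""
    addr = address.to_bytes(ADDRESS_BYTES[rectype], "big")
    count = len(addr) + len(data) + 1          # +1 for the checksum byte
    payload = bytes([count]) + addr + bytes(data)
    checksum = ~sum(payload) & 0xFF            # low byte of 1's complement
    return "S%d%s%02X" % (rectype, payload.hex().upper(), checksum)
```

For a 16 byte S1 record at address $7AF0 whose data is $0A, $0A, $0D followed by 13 zero bytes, the count is $13 and the sum of count, address and data is $19E, so the checksum is the complement of $9E, which is $61.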
CHAPTER 7
ERROR MESSAGES
- 1: statement parsing error. The occurrence of this error indicates that the
line totally confuses the parser and no further diagnostic comments
(legitimate comments anyway) can be made.
- 2: bad character in mnemonic-psdo field. An illegal character exists in the
word present in the mnemonic field; see chapter 1, introduction,
instruction statements.
- 3: instruction or pseudo not found in tables. Unless error 2 is also
generated the instr/pseudo word is properly formed (i.e. no illegal
characters are in it) but not a recognized instruction.
- 4: bad character in macro field. Version 1.xx does not recognize macros!
- 5: macro not found in macro table(s). Again, macros are not yet recognized.
- 6: improper use of label. Most often this error flags the use of a label on
a pseudo op line that does not allow the use of labels. See chapter 1,
introduction, directives:label field.
- 7: can't evaluate operand. The assembler cannot evaluate the value of an
operand. This may be caused by a variety of reasons such as embedded
spaces, unrecognized operators, unbalanced parentheses, etc. Additional
errors will usually be reported that will help clarify the problem.
- 8: can't evaluate equ operand. As above, generated in the case of an EQU
statement.
- 9: can't evaluate set operand. As above, generated in the case of a SET
statement.
- 10: attempt to redefine a permanent symbol. The symbol was previously
defined via an EQU statement and thus cannot be changed.
- 11: symbol table full. This is a fatal error and will cause the assembly to
abort. You may try again setting an optional symbol table size from the
command line. See chapter 2, usage, options:s.
- 12: unrecognized operand. A legal operand cannot be formed from the data in
the operand field.
- 13: symbol not defined in symbol table. The symbol was not encountered in
the program up to this point. Remember that forward references are not
allowed in EQU and SET pseudo statements.
- 14: label out of range for current addressing mode. The value of the label
is outside the limits of the current statement. This usually is an
attempt to reference an address beyond the 32k bytes range imposed in
the short addressing mode. See chapter 3, pseudo-ops, assembly control.
- 15: operand 1 is not valid for instruction type. The first operand
encountered is not a valid operand for this particular
instruction/addressing mode.
- 16: operand 2 is not valid for instruction type. The second operand
encountered is not a valid operand for this particular
instruction/addressing mode.
- 17: operand 1 is not correctly formed. This may be caused by illegal use of
operators, undefined labels, etc. The assembler may or may not attempt
to evaluate the second operand. If it does you are not assured that it
will do so correctly. For more on this see error #18.
- 18: operand 2 is not correctly formed. This may be for any of the reasons
stated above for error 17. Remember that it could be incorrectly
evaluated if error 17 was also generated and that the second operand
may be incorrectly formed but not reported as such after error 17
occurs. This is dependent on the assembler correctly determining where
the poorly formed operand 1 ends and operand 2 starts.
- 19: code building function failed. The binary code building function has
failed to properly construct a sequence of code for the statement. This
error will usually be followed with additional error #s pinpointing the
problem.
- 20: A 3 bit immediate data value out of bounds, i.e. it is greater than 7
or less than 0.
- 21: 8 bit bit field specifier out of range.
- 22: 32 bit bit field specifier out of range.
- 23: attempt to generate a bit field specifier failed.
- 24: count operand out of range (1-8).
- 25: destination register specifier illegal or out of range.
- 26: source or destination register specifier illegal or out of range.
- 27: attempt to generate register mask list from operand failed.
- 28: operand failed to evaluate to a proper source or destination effective
address.
- 29: illegal destination effective address, probably label or label with
index reference.
- 30: illegal destination effective address.
- 31: illegal multiple destination effective address, either label, label
with index, or address register indirect with predecrement or
postincrement.
- 32: illegal jump effective address, usually address register indirect with
predecrement or postincrement.
- 33: 4 bit vector out of range.
- 34: expected address displacement failed to evaluate correctly.
- 35: 8 bit displacement out of range (-128 thru 127)
- 36: 16 bit displacement out of range (-32768 thru 32766)
- 37: extension word of operand failed to evaluate correctly.
- 38: 8 bit operand out of range (-128 thru 255, sign is responsibility of
programmer).
- 39: 8 bit extension word value out of range (-128 thru 255).
- 40: 16 bit extension word out of range (-32768 thru 65535).
- 41: 32 bit extension word value out of range (overflow).
- 42: attempt to generate a 16 bit displacement failed, probably out of
range.
- 43: attempt to generate an 8 bit displacement failed, check range.
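Several of the errors above are simple range checks. The Python sketch below collects the limits quoted in the list into one table. It illustrates the checks only and does not reproduce how as68 performs them. The names RANGES and check_range are hypothetical.

```python
# Inclusive (low, high) limits quoted in the error list above.
RANGES = {
    20: (0, 7),            # 3 bit immediate data
    24: (1, 8),            # count operand
    35: (-128, 127),       # 8 bit displacement
    38: (-128, 255),       # 8 bit operand
    39: (-128, 255),       # 8 bit extension word
    40: (-32768, 65535),   # 16 bit extension word
}

def check_range(error_number, value):
    """Return error_number if value is out of range, else None."""
    low, high = RANGES[error_number]
    return None if low <= value <= high else error_number
```

For instance, a count operand of 0 yields error 24, while an 8 bit operand of 255 is accepted because the sign is left to the programmer.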
CHAPTER 8
DIFFERENCES
This list must be considered incomplete and will be added to as the facts are
brought to my attention.
In General
- As68 allows addresses assembled under an rorg.l or org.l condition to be
coerced into a short address by enclosing the entire expression in a set of
parentheses followed by '.s'. This allows references to the first/last 32k
with a code sequence that is 2 bytes shorter than the long address mode. As
an example:
        org.l   $8000
outport equ     $ffa000         * port mapped into last page of address space
        move.b  d0,outport      * will generate 6 bytes of code while
        move.b  d0,(outport).s  * will only generate 4 bytes of code
Modified
- The format of the movem instruction has been modified:
        movem /d4/d5/, (a7)+
Note the trailing '/'.