- disassemble.library
-
-
- General
-
- 'disassemble.library' is a shareable AmigaDOS library which is a
- disassembler for the MC68000 family of processors. It disassembles code
- for the MC68000, MC68010, MC68020 and MC68030 processors, for the
- MC68851 memory management unit and for the MC68881 and MC68882 floating
- point coprocessors. It is capable of symbolic disassembly, will
- generate labels at referenced locations, and is highly controllable
- through a set of style flags.
-
- The library's single entry point, Disassemble, will attempt to
- disassemble one instruction per call. It communicates with its caller
- through a passed information vector, which includes pointers to
- routines to call to process text output, access symbolic information,
- record label locations, etc.
-
- There are two main reasons why I separated this functionality into a
- shareable library. One is that I wanted to share the code (which is
- fairly bulky) between a file disassembler/dumper and a debugger. The
- second is that I plan to write an entire set of such shared libraries,
- and this one has given me experience in how to go about it, and some of
- the consequences of doing it.
-
- To use this library you must copy the file 'disassemble.library'
- to your LIBS: directory. This is where AmigaDOS looks when it needs to
- load the library in response to an OpenLibrary system call.
-
- To use the library with your own programs, you will need a set of
- interface stubs or definitions, depending on the language and compiler
- you use. The needed information is in the accompanying 'fd' file
- (disassemble_lib.fd) and in this document. I have included a defining
- include file and an interface library for Draco users.
-
- I have tested the library, as used by my disassembler/dumper, Dis,
- fairly extensively. There are bound to be some bugs left, however.
- Please let me know at one of the following electronic mail addresses if
- you find any:
-
- Chris Gray
- usenet: {uunet,alberta}!myrias!ami-cg!cg
- CIS: 74007,1165
-
- Sending me physical mail works, but I am VERY slow at answering (up to
- 6 months on one occasion!). Trying to telephone me can be expensive -
- you are more likely to get my modem.
-
-
- Interfacing to the Library
-
- All communication to and from the library is done through an
- information structure, the address of which is passed in register A0.
- The structure is declared (in Draco) as follows:
-
- type DisassemblerState_t = struct {
-     proc(/* ulong address(d0) */)uint ds_readWord;
-     proc(/* char ch(d0) */)void ds_putChar;
-     proc(/* ulong addr(d0) */)*char ds_findLabel;
-     proc(/* ulong addr(d0), refAt(d1); *ulong pTrueAddr(a0) */)*char
-         ds_findAbsSymbol;
-     proc(/* long offset(d0); ulong refAt(d1) */)*char ds_findRelCode;
-     proc(/* long offset(d0); ulong refAt(d1);*long pTrueOffset(a0) */)*char
-         ds_findRelData;
-     proc(/* ulong addr(d0) */)void ds_labelAt;
-     proc(/* ulong addr(d0) */)void ds_branchTo;
-     proc(/* ulong addr(d0) */)bool ds_isLabel;
-     ulong ds_address;
-     ulong ds_relativeBase;
-     *char ds_errorMessage;
-     uint ds_operandColumn;
-     uint ds_column;
-     uint ds_extraWord;
-     bool ds_putPosition;
-     bool ds_absoluteAddress;
-     bool ds_putErrors;
-     bool ds_capExtended;
-     bool ds_putAddress;
-     bool ds_putRelForm;
-     bool ds_extended;
-     bool ds_extendedNow;
-     bool ds_illegal;
-     bool ds_hadExtraWord;
- };
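-
- For C users, the same layout can be sketched as below. This is an
- illustration only, not the library's own header: it assumes the fields
- are laid out in declaration order with no padding, and it stores each
- 32 bit routine address as a plain 32 bit integer so that the offsets
- can be checked on any host. The type name 'DisassemblerState' is
- hypothetical.
-
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout sketch only (hypothetical C rendering of the Draco declaration).
   Routine addresses and the message pointer are held as 32 bit integers;
   'uint' fields are 16 bits and 'bool' fields are 8 bits, as described
   in this document. Declaration order without padding is an assumption. */
typedef struct DisassemblerState {
    uint32_t ds_readWord;        /* routine addresses, nil (0) if absent */
    uint32_t ds_putChar;
    uint32_t ds_findLabel;
    uint32_t ds_findAbsSymbol;
    uint32_t ds_findRelCode;
    uint32_t ds_findRelData;
    uint32_t ds_labelAt;
    uint32_t ds_branchTo;
    uint32_t ds_isLabel;
    uint32_t ds_address;
    uint32_t ds_relativeBase;
    uint32_t ds_errorMessage;    /* address of a message, or 0 */
    uint16_t ds_operandColumn;
    uint16_t ds_column;
    uint16_t ds_extraWord;
    uint8_t  ds_putPosition;     /* flags: 0 is false, anything else true */
    uint8_t  ds_absoluteAddress;
    uint8_t  ds_putErrors;
    uint8_t  ds_capExtended;
    uint8_t  ds_putAddress;
    uint8_t  ds_putRelForm;
    uint8_t  ds_extended;
    uint8_t  ds_extendedNow;
    uint8_t  ds_illegal;
    uint8_t  ds_hadExtraWord;
} DisassemblerState;

/* 12 longs + 3 words + 10 bytes = 64 bytes under the stated assumptions. */
static_assert(sizeof(DisassemblerState) == 64, "unexpected padding");
```
-
- Under those assumptions the structure occupies 64 bytes, with
- 'ds_address' at offset 36 and the first flag field at offset 54.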
-
- The first few fields are the addresses of functions which the library
- can call to perform various needed operations. All such addresses are
- 32 bit values. Fields of type 'ulong' are 32 bit unsigned integers.
- Fields of type 'uint' are 16 bit unsigned integers. Fields of type
- 'bool' are 8 bit 1/0 true/false values. In more detail:
-
- ds_readWord - this function is passed a 32 bit address or offset in
- register D0. It should return the 16 bit contents of that location
- in register D0. The addresses passed are all based on the value
- given in field 'ds_address', thus they can be real addresses or
- offsets into a buffer or hunk, depending on what the caller does.
- This routine MUST be supplied. The library does not try to
- reference any memory directly - all references will be through this
- function.
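-
- A 'ds_readWord' routine for code held in a buffer might look like
- the following C sketch. The names 'CodeBuffer' and 'read_word' are
- hypothetical. A real callback receives only the offset in D0, so the
- buffer would have to be reached through a global or similar; the
- sketch passes it explicitly to stay portable. Returning 0 for an
- out-of-range read is also an assumption, not library behaviour.
-
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for the bytes being disassembled. */
typedef struct CodeBuffer {
    const uint8_t *bytes;
    size_t length;
} CodeBuffer;

/* Return the 16 bit word at 'offset', assembled high byte first as the
   MC68000 stores it. Reads that would run off the end yield 0. */
static uint16_t read_word(const CodeBuffer *buf, uint32_t offset)
{
    assert(buf != NULL && buf->bytes != NULL);
    if (offset >= buf->length || buf->length - offset < 2) {
        return 0;
    }
    return (uint16_t)((buf->bytes[offset] << 8) | buf->bytes[offset + 1]);
}
```
-
- Reading byte by byte keeps the result independent of the byte order
- of the machine running the sketch.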
-
- ds_putChar - this function is passed a character in the low 8 bits of
- register D0. That character is part of the disassembled
- instruction. All output from the library will go through this
- function. If this function is not present (value is nil, a 32 bit,
- 0 value), then no output is done. This mode of operation runs
- slightly faster, and can be used to simply check for valid
- instructions or for a pre-scan to find label references.
-
- ds_findLabel - this function is passed a 32 bit address in register D0,
- and should return nil or the address of a symbol which is a
- symbolic label for that address. If no symbolic information is
- being used, this routine can be omitted. Any pointer returned must
- be valid until this call of Disassemble returns, but not beyond.
-
- ds_findAbsSymbol - this function is used to find symbolic names for
- addresses that are referenced as 32 bit absolute addresses. The
- address in question is passed in register D0. The address or offset
- within the code being disassembled (based on ds_address) at which
- the reference occurs is passed in register D1. This information can
- be used with relocation information supplied in AmigaDOS object
- files. Register A0 contains the address of a 32 bit value which
- should be filled in with the true address of the symbolic value.
- The pointer returned in D0 should be nil if no appropriate symbol
- was found or the address of a null-terminated string. As an
- example, suppose that label 'Fred' represents offset 0x208 in the
- code being disassembled, and a call to 'ds_findAbsSymbol' is made
- by the library with the following parameters:
-
- D0 - 0x20d
- D1 - 0x32
- A0 - ????
-
- It would be appropriate to return the string 'Fred', and to store
- the value 0x208 into the region pointed to by A0. The library would
- then show a reference like 'Fred+0x5'. As usual, this routine can
- be omitted if no symbolic information is available.
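-
- The 'Fred' example can be sketched in C as a search for the nearest
- symbol at or below the referenced address. The 'Symbol' type and the
- 'find_abs_symbol' name are hypothetical, the table is passed
- explicitly rather than found through globals, and the 'refAt' value
- from D1 is ignored here; a real routine would probably consult
- relocation information and limit how far past a symbol it will match.
-
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical symbol table entry. */
typedef struct Symbol {
    const char *name;
    uint32_t address;
} Symbol;

/* Return the name of the symbol with the highest address not above
   'addr', storing that symbol's own address in *true_addr. Returns NULL
   (nil) and leaves *true_addr untouched when no symbol qualifies. */
static const char *find_abs_symbol(const Symbol *table, size_t count,
                                   uint32_t addr, uint32_t *true_addr)
{
    const Symbol *best = NULL;
    size_t i;

    assert(true_addr != NULL);
    assert(table != NULL || count == 0);
    for (i = 0; i < count; i++) {
        if (table[i].address <= addr &&
            (best == NULL || table[i].address > best->address)) {
            best = &table[i];
        }
    }
    if (best == NULL) {
        return NULL;
    }
    *true_addr = best->address;
    return best->name;
}
```
-
- With 'Fred' at 0x208, a lookup of 0x20d returns 'Fred' and stores
- 0x208, leaving the difference of 5 to be shown as the '+0x5' part.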
-
- ds_findRelCode - this function is used for references that are
- PC-relative, so it should return labels within the code. Unlike
- 'ds_findAbsSymbol', it has no way to report the closest label -
- most code doesn't branch to just past a label.
-
- ds_findRelData - this function is used for references that are relative
- to register A4. This allows symbolic disassembly of small-model
- data references generated by the Lattice and Aztec C compilers.
-
- ds_labelAt - this function is called when a PC-relative data
- reference is found in the code. A user program would supply an
- address here if it wanted to keep track of where labels should be.
- A bitmap is a good way of doing this. Keeping track of labels this
- way is generally only of use if a two-pass disassembly is going to
- be used.
-
- ds_branchTo - this function is similar to 'ds_labelAt' except that it
- is called only for branch and jump targets. In other words, the
- address given must be a code address, since it is a branch target.
-
- ds_isLabel - this function, if present, is called to determine if there
- was a reference to the given address. This is used to know whether
- or not to generate a label in front of the instruction being
- disassembled.
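-
- The bitmap suggested for 'ds_labelAt' can be sketched in C as
- follows, with the same map answering 'ds_isLabel' on the second
- pass. The names 'label_at', 'is_label' and 'CODE_SIZE' are
- hypothetical, the fixed size is an arbitrary choice, and one bit per
- byte offset is an assumption made so that odd data addresses can be
- marked too.
-
```c
#include <assert.h>
#include <stdint.h>

#define CODE_SIZE 4096u   /* hypothetical size of the code, in bytes */

static_assert(CODE_SIZE % 8 == 0, "bitmap needs a whole number of bytes");

/* One bit per byte offset into the code being disassembled. */
static uint8_t label_bits[CODE_SIZE / 8];

/* First pass: remember that 'offset' was referenced. Offsets outside
   the code are ignored. */
static void label_at(uint32_t offset)
{
    if (offset < CODE_SIZE) {
        label_bits[offset >> 3] |= (uint8_t)(1u << (offset & 7u));
    }
}

/* Second pass: report whether 'offset' was referenced earlier. */
static int is_label(uint32_t offset)
{
    return offset < CODE_SIZE &&
           ((label_bits[offset >> 3] >> (offset & 7u)) & 1u) != 0;
}
```
-
- A 'ds_branchTo' routine could mark the same map, or a second one if
- code and data labels need to be told apart.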
-
- ds_address - this is the address or offset that disassembly is
- occurring at. It is the value which will be given to 'ds_readWord'
- to get the first word of the instruction. The field is properly
- updated as disassembly occurs, so it need only be set before the
- first call to Disassemble. If multi-pass disassembly is being used
- (e.g. to produce labels), it should be reset before each pass.
-
- ds_relativeBase - this is the current base address for disassembly.
- Labels will be relative to this base. E.g. if an instruction 8
- bytes past this address needed a label, the label would be either
- 'L008' or 'L00000008', depending on label size. It will be updated
- by the library to the current value of 'ds_address' if
- 'ds_findLabel' yields a label for the current value of
- 'ds_address'. Even though the field is maintained, it is not always
- used. See the description of 'ds_absoluteAddress'.
-
- ds_errorMessage - this field is occasionally filled in by the library
- with a specific error message concerning the disassembly. It is
- cleared at the start of each call, so if the field is non-null when
- Disassemble returns, it points to an error message. The message is
- not dynamically allocated, so it should not be freed or modified by
- the caller.
-
- ds_operandColumn - this 16 bit field should be filled in with the
- column at which the caller wants instruction operands to start.
- Spacing with blanks will be used to pad out to the desired column.
- If the instruction field, etc. already extends past the target
- column, no spacing will be used. A reasonable value for this field
- is 20 if initial addresses are not enabled, or 31 if they are.
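-
- The padding rule can be sketched in C as below. This is not the
- library's code: 'pad_to_column' is a hypothetical name, and the
- sketch writes into a caller-supplied line buffer where the library
- would instead pass blanks to 'ds_putChar'.
-
```c
#include <assert.h>
#include <stddef.h>

/* Append blanks to 'line', whose current length is *column, until
   'operand_column' is reached. If the text already extends to or past
   the target column, nothing is added. The line is kept terminated. */
static void pad_to_column(char *line, size_t capacity,
                          unsigned *column, unsigned operand_column)
{
    assert(line != NULL && column != NULL);
    assert(*column < capacity);
    while (*column < operand_column && (size_t)*column + 1 < capacity) {
        line[*column] = ' ';
        (*column)++;
    }
    line[*column] = '\0';
}
```
-
- Whether columns count from 0 or 1 does not matter to the sketch, as
- long as the current and target columns use the same origin.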
-
- ds_column - this 16 bit field is used internally to count columns.
-
- ds_extraWord - this 16 bit field is used internally to remember a
- second word of an invalid instruction, so that it can be dumped in
- hexadecimal.
-
- ds_putPosition - this 8 bit flag field controls whether or not the
- library will display hexadecimal addresses at the beginning of the
- output lines. As with the other flag fields, a value of 0 is
- treated as 'false', and any other value as 'true'. The addresses
- will either be 32 bit absolute ones (the value of 'ds_address') or
- will be 16 bit relative ones ('ds_address' - 'ds_relativeBase')
- depending on whether or not 'ds_absoluteAddress' is set.
-
- ds_absoluteAddress - this flag field controls the form of labels and of
- the position display. If it is set, they are 32 bit values taken
- directly from 'ds_address'. If not set, they are 16 bit relative
- values computed as (address - 'ds_relativeBase'). For most
- purposes, the relative form is tidier.
-
- ds_putErrors - this flag controls whether or not the library will
- output error messages that are returned in 'ds_errorMessage'.
- Tighter formatting control can be obtained if this option is not
- used.
-
- ds_capExtended - this flag controls whether or not the library will
- capitalize instructions and modes that are not available on the
- MC68000. This is useful to make the non-68000 instructions stand
- out.
-
- ds_putAddress - this flag controls whether or not the hex address is
- displayed along with a symbolic or label form. It is useful if the
- symbolic or label forms are confused for some reason, and would be
- of value to a debugger, where all addresses are real.
-
- ds_putRelForm - this flag controls whether or not the relative form of
- PC-relative and A4-relative addressing is displayed along with any
- symbolic or label form. This is useful for those who wish to see
- the actual encoded form of the instructions, or if the symbolic or
- label forms are confused.
-
- ds_extended - this flag is initially cleared by the library and is set
- whenever a non-68000 instruction or mode is seen. Thus, after each
- call to Disassemble, this flag can be checked to see if a non-68000
- form was seen.
-
- ds_extendedNow - this flag is used internally to know whether or not
- output should be capitalized. Note that symbolic names are never
- capitalized.
-
- ds_illegal - this flag, initially cleared, is set whenever any illegal
- instruction or mode is encountered. There will not always be an
- accompanying error message. Note also that I have not gone to the
- trouble of checking each addressing mode for each instruction, thus
- there are instruction forms which will not cause 'ds_illegal' to be
- set but which the actual processor will not execute. Also, the
- specific 'illegal' instruction, opcode 0x4afc, will not cause this
- flag to be set.
-
- ds_hadExtraWord - this flag is used internally to indicate that an
- illegal instruction encountered had a second or extended opcode
- word that should also be printed in hex.
-
- As an example, here is a simple one-pass disassembly of a small hunk of
- code:
-
- #drinc:disassemble.g
-
- uint
-     R_D0 = 0,
-     R_FP = 6,
-
-     OP_MOVEB = 0x1000,
-     OP_MOVEL = 0x2000,
-
-     M_DDIR = 0,
-     M_DISP = 5;
-
- /* Called by the library with the address to read in D0. */
- proc readWord(/* ulong address */)uint:
-     ulong address;
-
-     /* move.l d0,address(a6) - copy the register parameter to a local */
-     code(
-         OP_MOVEL | R_FP << 9 | M_DISP << 6 | M_DDIR << 3 | R_D0,
-         address
-     );
-     pretend(address, *uint)*
- corp;
-
- /* Called by the library with each output character in D0. */
- proc putChar(/* char ch */)void:
-     char ch;
-
-     /* move.b d0,ch(a6) - copy the register parameter to a local */
-     code(
-         OP_MOVEB | R_FP << 9 | M_DISP << 6 | M_DDIR << 3 | R_D0,
-         ch
-     );
-     if ch = '\n' then
-         writeln();
-     else
-         write(ch);
-     fi;
- corp;
-
- proc main()void:
-     extern tail()void;
-     DisassemblerState_t ds;
-
-     if OpenDisassembleLibrary(0) ~= nil then
-         /* only the two required callbacks are supplied */
-         ds.ds_readWord := readWord;
-         ds.ds_putChar := putChar;
-         ds.ds_findLabel := nil;
-         ds.ds_findAbsSymbol := nil;
-         ds.ds_findRelCode := nil;
-         ds.ds_findRelData := nil;
-         ds.ds_labelAt := nil;
-         ds.ds_branchTo := nil;
-         ds.ds_isLabel := nil;
-         /* disassemble from the start of readWord up to tail */
-         ds.ds_address := pretend(readWord, ulong);
-         ds.ds_relativeBase := 0;
-         ds.ds_operandColumn := 31;
-         ds.ds_putPosition := true;
-         ds.ds_absoluteAddress := true;
-         ds.ds_putErrors := true;
-         ds.ds_capExtended := true;
-         ds.ds_putAddress := false;
-         ds.ds_putRelForm := false;
-         /* each call advances ds_address past one instruction */
-         while ds.ds_address < pretend(tail, ulong) do
-             ignore Disassemble(&ds);
-         od;
-         CloseDisassembleLibrary();
-     else
-         writeln("Can't open disassemble.library");
-     fi;
- corp;
-
- /* Empty procedure marking the end of the code to disassemble. */
- proc tail()void:
- corp;
-
- Note the use of the 'code' construct to retrieve parameters passed in
- registers. Slightly different tricks would be needed to do this in
- other languages/compilers. For an example of using the library for full
- symbolic disassembly with label generation, see the source to the 'Dis'
- file disassembler/dumper, which is included in this archive.
-