home *** CD-ROM | disk | FTP | other *** search
- .\"/*% nroff -cm -rL72 %|epson|spr -f plain.a -h uncdoc -w
- .nr Hb 7
- .nr Hs 3
- .ds HF 3 3 3 3 3 3 3
- .nr Hu 5
- .nr Hc 1
- .SA 1
- .PH "''A Disassembler''"
- .PF "'Issue %I%'- Page \\\\nP -'%G%'"
- .H 1 "Introduction"
- This document describes the first release of a disassembler for UNIX
- executable files.
- The key features are:
- .AL
- .LI
- For object files the output can be assembled to generate the same
- object module, (apart from minor variations in symbol table ordering) as the
- input.
- .LI
- For stripped executable files object modules and libraries may be scanned,
- modules in the main input identified and the appropriate names automatically
- inserted into the output.
- .LI
- An option is available to convert most non-global names into local symbols,
- which cuts down the symbols in the generated assembler file.
- .LI
- The disassembler copes reasonably with modules merged with the
- .B "-r"
- option to
- .B "ld" ,
- generating a warning message as to the number of modules involved.
- .LE
- .P
- At present this is available for certain Motorola 68000 ports of UNIX
- System III and System V. Dependencies on
- .AL a
- .LI
- Instruction set.
- .LI
- Object module format.
- .LI
- Library module format.
- .LI
- Assembler output format.
- .LE
- .P
- are hopefully sufficiently localised to make the product useful as a
- basis for other disassemblers for other versions of UNIX.
- .P
- The product is thus distributed in source form at present.
- .H 1 "Use"
- The disassembler is run by entering:
- .DS I
- unc mainfile lib1 lib2 ...
- .DE
- .P
- The first named file is the file to be disassembled, which should be
- a single file, either an object module, a (possibly stripped) executable
- file, or a library member. Library members are designated using a
- parenthesis notation, thus:
- .DS I
- unc '/lib/libc.a(printf.o)'
- .DE
- .P
- It is usually necessary to escape the arguments in this case to prevent
- misinterpretation by the shell. Libraries in standard places such as
- .I "/lib"
- and
- .I "/usr/lib"
- may be specified in the same way as to
- .B "ld" ,
- thus
- .DS I
- unc '-lc(printf.o)'
- unc '-lcurses(wmove.o)'
- .DE
- .P
- As an additional facility, the list of directories searched for
- libraries may be varied by setting the environment variable
- .B "LDPATH" ,
- which is interpreted similarly to the shell
- .B "PATH"
- variable, and of course defaults to
- .DS I
- LDPATH=/lib:/usr/lib
- .DE
- .P
- As a further facility, the insertion of
- .B "lib"
- before and
- .B ".a"
- after the argument may be suppressed by using a capital
- .B "-L"
- argument, thus to print out the assembler for
- .I "/lib/crt0.o" ,
- then the command
- .DS I
- unc -Lcrt0.o
- .DE
- .P
- should have the desired effect.
- .P
- Second and subsequent file arguments are only referenced for stripped
- executable files, and may consist of single object files and library
- members, using the same syntax as before, or whole libraries of object
- files, thus:
- .DS I
- unc strippedfile -Lcrt0.o -lcurses -ltermcap '-lm(sqrt.o)' -lc
- .DE
- .P
- It is advisable to make some effort to put the libraries to be searched
- in the order in which they were originally loaded. This is because the
- search for each module starts where the previously matched module ended.
- However, no harm is done if this rule is not adhered to apart from
- increased execution time except in the rare cases where the disassembler
- is confused by object modules which are very nearly similar.
- .H 1 "Additional options"
- The following options are available to modify the behaviour of the
- disassembler.
- .VL 15 2
- .LI "-o file"
- Causes output to be sent to the specified file instead of the standard
- output.
- .LI "-t prefix"
- Causes temporary files to be created with the given prefix. The default
- prefix is
- .B "split" ,
- thus causing two temporary files to be created with this prefix in the
- current directory. If it is desired, for example, to create the files as
- .B "/tmp/xx*" ,
- then the argument
- .B "-t /tmp/xx"
- should be given. Note that the temporary files may be very large as a
- complete map of the text and data segments is generated.
- .LI "-a"
- Suppresses the generation of non-global absolute symbols from the
- output. This saves output from C compilations without any obvious
- problems, but the symbols are by default included in the name of
- producing as nearly identical output as possible to the original source.
- .LI "-s"
- Causes an additional scan to take place where all possible labels are
- replaced by local symbols. The local symbols are inserted in strictly
- ascending order, starting at 1.
- .LI "-v"
- Causes a blow-by-blow account of activities to be output on the standard
- error.
- .LE
- .H 1 "Diagnostics etc"
- Truncated or garbled object and library files usually cause processing
- to stop with an explanatory message.
- .P
- The only other kinds of message are some passing warnings concerning
- obscure constructs not handled, such as the relocation of byte fields,
- or the relocation of overlapping fields. Occasionally a message
- .DS I
- Library clash: message
- .DE
- .P
- may appear and processing cease. This message is found where at a late
- stage in processing libraries, the program discovers that due to the
- extreme similarity of two or more library members, it has come to the
- wrong conclusion about which one to use. The remedy here is to spell out
- to the program which members to take in which order.
- .H 1 "Future development"
- In the future it is hoped to devise ways of making the disassembler
- independent of all the above-mentioned version dependencies, by first
- reading a files defining these things. This will probably be applied
- after the Common Object Format becomes more standard.
- .P
- In the long term it would be desirable and useful to enhance the product
- to produce compilable C in addition to assemblable assembler. Stages in
- the process are seen as follows:
- .AL
- .LI
- Better identification of basic blocks in the code. Switch statements are
- a major problem here, as are constant data held in the text segment.
- .LI
- Marrying of data to the corresponding text. It is in various places hard
- to divorce static references "on the fly" (e.g. strings, and switch
- lists in some implementations) from static at the head of a module. This
- is part of the problem of identifying basic blocks.
- .LI
- Compilation of header files to work out structure references within the
- text. At this stage some interaction may be needed.
- .LE
- .P
- Meanwhile the product is one which is a useful tool to the author in its
- present form. Comments and suggestions as to the most practical method
- of improving the product in the ways suggested or in other ways would be
- gratefully considered.
-