OS/2 Shareware BBS: 10 Tools

Text File | 1993-10-29 | 16.2 KB | 367 lines

-*-Text-*-

        Installation Notes for Liar version 4.9

Liar, the CScheme compiler, is available for the following computers:

    Sun 3
    HP 9000 series 300 (except model 310)

These are 68020 based machines. Ports for 68000/68010 machines and the Vax will be available in the future.

For bug reports send computer mail to

    BUG-LIAR@ZURICH.AI.MIT.EDU (on the Arpanet/Internet)

or US Snail to

    Scheme Team
    c/o Prof. Hal Abelson
    545 Technology Sq. rm 410
    Cambridge MA 02139

* The compiler is distributed as four compressed tar files, as follows:

** "dist6.2.1-tar.Z" is release 6.2.1 of CScheme. This is required for using the compiler. It is installed in the usual way except for one small change to the microcode needed to support compiled code. This tar file contains about 5.1 Mbyte of data when unloaded.

** "liar4.9b-tar.Z" contains the binary files for the compiler. This includes a ".bin" file (SCode binary, for the interpreter) and a ".com" file (native code compiler output) for each source file in the compiler. It also contains a few other files used to construct the compiler from the binary files. This tar file contains about 3 Mbyte of data when unloaded.

** "liar4.9s-tar.Z" contains the source files for the compiler. It also includes a TAGS table. This tar file contains about 1.2 Mbyte of data when unloaded.

** "liar4.9d-tar.Z" contains some debugging files. There is one ".binf" file corresponding to each ".com" file in the compiler. Given both of these files, the compiler can generate a symbolic assembly language listing of the compiled code. In future releases, these debugging files will also support debugging tools for parsing the stack and examining compiled code environment structures. This tar file contains about 4.5 Mbyte of data when unloaded.

* Installation of the compiler.

Installation requires about 17-20 Mbyte of disk space. This is conservative and could be reduced with some knowledge of what is needed and what is not.

** The first step in installation is building CScheme. Follow the instructions included in the release, except that the file "makefiles/sun" or "makefiles/hp200" (as appropriate) must be edited as follows. Look for the following lines in that file:

    # Compiled code interface files.
    # These defaults are just stubs.
    CSRC = compiler.c
    CFILE = compiler.oo
    D_CFILE = compiler.do
    F_CFILE = compiler.fo
    CFLAG =
    GC_HEAD_FILES= gccode.h

edit these lines to read as follows:

    # Compiled code interface files.
    CSRC = cmp68020.s
    CFILE = cmp68020.o
    D_CFILE = cmp68020.o
    F_CFILE = cmp68020.o
    CFLAG = -DCMPGCFILE=\"cmp68kgc.h\"
    GC_HEAD_FILES= gccode.h cmp68kgc.h
    .s.o: ; as -o $*.o $*.s

After this is done, connect to the microcode subdirectory and execute the following

    cp cmp68020.s-<sys> cmp68020.s

where <sys> is "sun" if you are running on a Sun 3, or "hp" if you are running on an HP 9000 series 300.

NOTE: the file "cmp68020.s-src" is the source file from which the other two were built. It was processed by m4 on an HP machine to create "cmp68020.s-hp", then that file was processed by a custom conversion program (courtesy of the butterfly-lisp hackers at BBN) to produce "cmp68020.s-sun".

Once these changes have been made, finish the installation process in the normal way.

**** Note that on Sun workstations, assembling "cmp68020.s" will produce the following harmless warning messages:

    as: error (cmp68020.s:1432): Unqualified forward reference
    as: error (cmp68020.s:1435): Unqualified forward reference
    as: error (cmp68020.s:1444): Unqualified forward reference

Also, on older versions of Sun software (before release 3.4) you may not be able to assemble this file at all. For that case, we have included the file "cmp68020.o-sun" which is the output of the assembler on a 3.4 system. Copy that file to "cmp68020.o" and touch it to make sure it is newer than the source file.

** The next step in installation is unloading the Liar tar files. The tar files may be unloaded wherever you like.
When unloaded, they will create a directory "liar4.9" under the directory to which you are connected. Note that only "liar4.9b-tar.Z" need be unloaded in order to perform the rest of the installation.

In what follows, let $LIAR stand for the name of the directory in which the compiler is loaded, and let $SCHEME stand for the name of the directory in which the interpreter is loaded.

** After having unloaded the files, and after CScheme has been built and installed, do the following:

    cd $SCHEME
    mv $LIAR/runtime/* runtime
    mv $LIAR/sf/* sf
    cd runtime
    scheme -fasl cmp-runmd.bin < $LIAR/etc/mkrun.scm

This transfers a number of compiled files to the Scheme runtime system directory, and constructs a new version of the runtime system, named "scheme.com", which is partially compiled. After this has been done, you may discard all of the ".com" files in the runtime system directory. If you want the new runtime system to be the default, rename it to "scheme.bin".

**** Note: because this is a beta release, the compiled runtime system "scheme.com" is likely to have bugs. If you intend to use it by default, we suggest you retain the original (interpreted) runtime system "scheme.bin" by renaming it to something else.

** Next, do the following:

    scheme -constant 510 -heap 500 -band $SCHEME/runtime/scheme.com

This starts up the Scheme interpreter with a large constant space and heap, using the partially compiled runtime system. After the interpreter has started, type the following expression at it:

    (begin (%cd "$LIAR")
           (load "machines/bobcat/make" system-global-environment)
           (disk-save "$SCHEME/runtime/compiler.com"))

It will load two files, then ask the question "Load compiled?". Type Y, which means to build the compiler using compiled code. If you type N, the compiler will be run interpretively, which is about a factor of 10 slower than the compiled version. After you answer the question, it will load and evaluate approximately 100 files. This will take several minutes.
When it is done, you are returned to the interpreter. At this point, a new band will have been created, called "$SCHEME/runtime/compiler.com", which contains the compiler. All the other files in the $LIAR directory may be discarded, if you wish, since only "compiler.com" is needed to run the compiler.

* Using the compiler.

** Loading.

The compiler band, "compiler.com", is used by starting Scheme and specifying that file using the "-band" option. You must also use the "-constant" option to specify a constant space of at least 510, and it is recommended that "-heap" be at least 500. For medium to large compilations, a heap size of 700 or more may be needed; at MIT we typically use 1000 to be safe. Alternatively, the switch "-compiler" specifies constant 510, heap 500, and the compiler band.

** Memory usage.

Note that the total memory used by Scheme in this configuration is substantial! With a heap of 1000 and a constant space of 510, the memory used is (* 4 (+ 510 (* 2 1000))), or about 10 Mbyte. For many computers this is a ridiculous figure, and Scheme will die a slow death due to paging. Using a heap of 500 reduces this to about 6 Mbyte, but that is still quite a lot.

For machines with small memories, using the `bchscheme' version of the microcode will be helpful. This program, which is made by connecting to "$SCHEME/microcode" and typing "make bchscheme", does its garbage collection to a disk file, thus requiring only one heap in the virtual address space. This reduces the overall memory requirements for the above examples to 6 Mbyte and 4 Mbyte, respectively. The savings of 4 and 2 Mbyte (respectively) will be allocated in the file system rather than in virtual memory.

This may seem like a complicated way of doing virtual memory management, but in fact it performs significantly better than paging on machines with small amounts of RAM. This is because the GC algorithm uses the disk much more efficiently than the paging system will be able to.
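
The arithmetic in the memory usage notes can be restated as a small procedure. This is only a sketch and not part of the distribution: MEMORY-KBYTES and its parameter names are hypothetical, and it assumes (as the formula above implies) that the "-constant" and "-heap" values count blocks of 1024 four-byte words. The third argument is the number of heaps held in the virtual address space: 2 for the standard microcode, 1 for `bchscheme'.

```scheme
;; Hypothetical helper, not part of Liar or CScheme.
;; Assumes CONSTANT and HEAP are in blocks of 1024 words of 4 bytes each,
;; so the result is in Kbytes.
(define (memory-kbytes constant heap heaps-in-memory)
  (* 4 (+ constant (* heaps-in-memory heap))))

(memory-kbytes 510 1000 2)   ; => 10040, standard microcode, about 10 Mbyte
(memory-kbytes 510 500 2)    ; => 6040,  standard microcode, about 6 Mbyte
(memory-kbytes 510 1000 1)   ; => 6040,  bchscheme, about 6 Mbyte
(memory-kbytes 510 500 1)    ; => 4040,  bchscheme, about 4 Mbyte
```

The differences between the corresponding rows (4000 and 2000 Kbytes) are the amounts that `bchscheme' moves from virtual memory into the file system.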

** Compilation.

The following global definitions are available for calling the compiler:

(COMPILE-BIN-FILE FILENAME #!OPTIONAL OUTPUT-FILENAME)

Compiles a binary SCode file, producing a native code file. FILENAME should refer to a file which is the output of the SF program (see "$SCHEME/documentation/user.txt" for a description of SF). The type of the input file defaults to ".bin".

OUTPUT-FILENAME, if given, is where to put the output file. If no output filename is given, the output filename defaults to the input filename, except with type ".com". If it is a directory specification (on unix, this means if it has a trailing "/"), then the output filename defaults as usual, except that it goes in that directory. This is similar to the operation of SF.

Also, like SF, the input filename may be a list of filenames, in which case they are all compiled in order.

(COMPILE-PROCEDURE PROCEDURE)

Compiles a compound procedure, given as its argument, and returns a new procedure which is the compiled form. This does not perform side effects on the environment, so if one wished to compile MAP, for example, and install the compiled form, it would be necessary to say

    (set! map (compile-procedure map))

(COMPILER:WRITE-LAP-FILE FILENAME)

This procedure generates a "LAP" disassembly file (LAP stands for Lisp Assembly Program, a traditional name for assembly language written in a list notation) from the output of COMPILE-BIN-FILE. If filename is "foo", then it looks for "foo.com" and disassembles that, producing a file "foo.lap". If, in addition, the file "foo.binf" exists, it will use that information to produce a disassembly which contains symbolic names for all of the labels. This second form is extremely useful for debugging.

(COMPILE-DIRECTORY DIRECTORY #!OPTIONAL OUTPUT-DIRECTORY FORCE?)

Finds all of the ".bin" files in DIRECTORY whose corresponding ".com" files either do not exist or are older, and recompiles them.
OUTPUT-DIRECTORY, if given, specifies a different directory to look in for the ".com" files. FORCE?, if given and not #F, means recompile even if the output files appear up to date.

* Debugging compiled code.

At present the debugging tools are practically nonexistent. What follows is a description of the lowest level support, which is clumsy to use but which is adequate if you have a moderate understanding of the compiled code. This is one of the prices of beta test! Before release we will have user-level debugging tools.

There are two basic kinds of errors: fatal and non-fatal. Fatal errors are things like segmentation violations and bus errors, and when these occur the only method of debugging is to use an assembly language debugger such as `adb' or `gdb'. Debugging these errors is complicated and will not be described here.

** Non-fatal errors can be debugged from Scheme. Here is the method: the file "$LIAR/etc/stackp.bin" contains a simple stack parser that will allow you to display the Scheme stack, and refer to any of the items in the stack by offset number. Loading this file (into the global environment, for example) defines two useful procedures:

(RCD FILENAME) writes a file containing a description of the current stack. When an error has occurred, the current stack contains the continuation of the error, which is the information you want to see. Each line of the file contains an offset number and the printed representation of an object (the latter is truncated to fit on one line).

(RCR OFFSET) returns the object corresponding to OFFSET from the current stack. Thus, after using RCD to see the stack, RCR will get you pointers to any of the objects.

Given these procedures, you can look at the compiled code stack frames, and possibly (with some skill) figure out what is happening.

** Compiled code object manipulators.

Another set of useful procedures, built into the runtime system and defined in the file "$SCHEME/runtime/ustruc.scm", will allow you to manipulate various compiled code objects:

(COMPILED-PROCEDURE-ENTRY PROCEDURE) returns the entry point of the compiled procedure PROCEDURE. This entry point is an object whose type is COMPILED-EXPRESSION.

(COMPILED-CODE-ADDRESS? OBJECT) is true of both COMPILED-EXPRESSION objects and COMPILER-RETURN-ADDRESS objects.

(COMPILED-CODE-ADDRESS->BLOCK COMPILED-CODE-ADDRESS) returns the compiled code block to which that address refers. The procedure COMPILED-CODE-BLOCK/DEBUGGING-INFO will tell you the name of the ".binf" file corresponding to that compiled code block, if the compiled code was generated by COMPILE-BIN-FILE.

(COMPILED-CODE-ADDRESS->OFFSET COMPILED-CODE-ADDRESS) returns the offset, in bytes, of that address from the beginning of the compiled code block. NOTE: this offset is the SAME offset as that shown in the disassembly listing!

Thus, given any compiled code address, you can figure out both what file it corresponds to and what label in the disassembly file it points at. This is the basic information you need to understand the stack.

There are several other procedures defined for manipulating these objects -- see the source code for details. What follows is a brief description of the object formats to aid debugging.

** Compiled Code Blocks.

Compiled code blocks are "partially marked" vectors. The first part of a compiled code block is "non-marked", which means that the GC copies it but does not look through it for pointers. This part is used to hold the compiled code. The second part is "marked", and contains constants that are referred to by the compiled code. These constants are ordinary Scheme objects and must be traced by the GC in the usual way.

The disassembly listing shows the compiled code block in the same format that it is laid out in memory, with offsets in bytes from the beginning of the block.
The header of the block is 8 bytes, so the disassembly listing starts at offset 8. The code and constants sections are displayed separately, in slightly different formats.

** Procedure Entry Points.

The entry point of a procedure can be found in the LAP file by looking for a label with the same name as the procedure, concatenated with some positive integer. Unnamed lambda expressions will be lambda-<n> for some <n>.

Closed procedures (i.e. those procedures which have an external representation) have two entry points, whose labels differ only in the concatenated integer. The first entry point is responsible for checking the number of arguments, and transfers control to the second entry point if that is correct.

** Stack Frames.

The normal stack frame for a closed procedure is constructed by pushing the return address, then all the arguments right to left, then the procedure. If the procedure has internal definitions, then these are pushed on the stack on top of that in some unspecified order.

Internal procedures, when invoked, may either extend the closure's frame or create new frames. The rules for this are complicated and far beyond the scope of this document. However, two special types of stack pointers may be used when the closure's frame is extended.

The first of these is a "static link". This is a pointer into the stack which is used when a sub-frame needs to refer to bindings in some parent frame, but the compiler was unable to determine where that parent frame was at compile time.

The other type is a "dynamic link", which points to where the return address for the current procedure is located in the stack. Because of tail recursion considerations, the compiler cannot always determine this at compile time, and in those cases dynamic links are used. The dynamic link is normally kept in register A4, and pushed and popped off the stack at appropriate times.

Note that internal procedures evaluate and push their arguments in a completely unspecified order.
Thus if your program depends on the fact that the interpreter evaluates arguments from right to left, you might be screwed, since the compiler chooses whatever order seems most efficient or convenient.
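
The argument-order caveat can be illustrated with a small sketch. None of the names below (EVALUATION-LOG, NOTE!, FRAGILE, ROBUST) belong to Liar or the runtime system; they are hypothetical, and the sketch assumes LET* behaves as in standard Scheme. FRAGILE relies on the order in which the two operands of LIST are evaluated, so its result may differ between interpreted and compiled code. ROBUST sequences the two evaluations explicitly and so does not depend on that order.

```scheme
;; Hypothetical log of the order in which operands were evaluated.
(define evaluation-log '())

;; Records TAG in the log as a side effect, then returns VALUE.
(define (note! tag value)
  (set! evaluation-log (cons tag evaluation-log))
  value)

;; Order-dependent: the two NOTE! calls are operands of one combination,
;; so which runs first is up to the interpreter or compiler.
(define (fragile)
  (set! evaluation-log '())
  (list (note! 'a 1) (note! 'b 2))
  (reverse evaluation-log))

;; Order-independent: LET* evaluates its bindings one after another,
;; so A is always logged before B.
(define (robust)
  (set! evaluation-log '())
  (let* ((x (note! 'a 1))
         (y (note! 'b 2)))
    (list x y))
  (reverse evaluation-log))
```

Calling (ROBUST) should return the list (a b) under either evaluation strategy, whereas (FRAGILE) may return (a b) or (b a). Code whose side effects must happen in a particular order is safer written in the second style before it is compiled.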