home *** CD-ROM | disk | FTP | other *** search
- "code.doc", documentation for scm4e0.
- Copyright (C) 1990, 1991, 1992, 1993, 1994 Aubrey Jaffer.
-
- This program is free software; you can redistribute it and/or modify
- it under the terms of the GNU General Public License as published by
- the Free Software Foundation; either version 1, or (at your option)
- any later version.
-
- This program is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- GNU General Public License for more details.
-
- You should have received a copy of the GNU General Public License
- along with this program; if not, write to the Free Software
- Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
-
- The author can be reached at jaffer@ai.mit.edu or
- Aubrey Jaffer, 84 Pleasant St., Wakefield MA 01880
-
- Scm is a portable Scheme implementation written in C. Scm provides a
- machine independent platform for JACAL, a symbolic algebra system.
- SCM runs under VMS, MS-DOS, MacOS, Unix and similar systems.
-
- Scm conforms to Revised^4 Report on the Algorithmic Language Scheme
- and the IEEE P1178 specification. Scm is interpreted and implements
- tail recursion for interpreted code. Scm has inexacts, 30 bit
- immediate integers and large precision integers. Scm uses and garbage
- collects off the C-stack. This allows routines to be written in C
- without regard to GC visibility. call-with-current-continuation is
- fully are supported. ASCII and EBCDIC are supported.
-
- PROJECT HISTORY
-
- Siod, written by George Carrette, was the starting point for scm.
- Here is the Siod notice:
- /* Scheme In One Defun, but in C this time.
-
- * COPYRIGHT (c) 1989 BY *
- * PARADIGM ASSOCIATES INCORPORATED, CAMBRIDGE, MASSACHUSETTS. *
- * ALL RIGHTS RESERVED *
-
- Permission to use, copy, modify, distribute and sell this software
- and its documentation for any purpose and without fee is hereby
- granted, provided that the above copyright notice appear in all copies
- and that both that copyright notice and this permission notice appear
- in supporting documentation, and that the name of Paradigm Associates
- Inc not be used in advertising or publicity pertaining to distribution
- of the software without specific, written prior permission.
-
- PARADIGM DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING
- ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL
- PARADIGM BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR
- ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
- WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
- ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS
- SOFTWARE.
-
- gjc@paradigm.com
-
- Paradigm Associates Inc Phone: 617-492-6079
- 29 Putnam Ave, Suite 6
- Cambridge, MA 02138
- */
-
- The innovation from Siod which scm uses is being able to garbage
- collect off the c-stack. All the code has been rewritten. See the
- file "ChangeLog" for a log of recent changes that have been made to
- scm.
-
- SCM DATA TYPES
-
- IMMEDIATEs:
- pointer: pointer to a cell
- inum: immediate 30 bit integers
- ichr: immediate characters
- iflags: #t, #f, (), #<eof>, #<undefined>, and #<unspecified>
- isym: `and', `begin', `case', `cond', `define', `do', `if',
- `lambda', `let', `let*', `letrec', `or', `quote', `set!',
- `#f', `#t', `#<undefined>', `#<eof>', `()', `#<unspecified>'
- IMCAR IMMEDIATEs: (only in car of evaluated code)
- ispcsym: special symbols: syntax-checked versions of first NUM_ISYMS
- isyms
- iloc: indexes to a variable's location in environment
- gloc: pointer to a symbol's value cell
-
- CELLs:
- Cells represent all SCM objects besides the immediates.
- A cell has a car and a cdr. Low-order bits in car identify
- the type of object. Rest of car and cdr hold object data.
- SIMPLE:
- cons: scheme cons-cell returned by (cons arg1 arg2)
- closure: applicable object returned by (lambda (args) ...)
- MALLOCs:
- spare: spare tc7 type code
- vector: scheme vector
- ssymbol: static scheme symbol (part of initial system)
- msymbol: malloced scheme symbol (can be GCed)
- string: scheme string
- bvect: uniform vector of booleans (bit-vector)
- ivect: uniform vector of integers
- uvect: uniform vector of non-negative integers
- fvect: uniform vector of short inexact real numbers
- dvect: uniform vector of double precision inexact real numbers
- cvect: uniform vector of double precision inexact complex numbers
- contin: applicable object produced by call-with-current-continuation
- cclo: SUBR and environment for compiled closure
-
- SUBRs:
- asubr: associative C function of 2 arguments.
-
- subr_0: C function of no arguments.
- subr_1: C function of one argument.
- cxr: car, cdr, cadr, cddr, ...
- subr_3: C function of 3 arguments.
-
- subr_2: C function of 2 arguments.
- rpsubr: transitive relational predicate C function of 2 arguments.
- subr_1o: C function of one optional argument.
- subr_2o: C function of 1 required and 1 optional argument.
- LSUBRs:
- lsubr_2: C function of 2 arguments and a list of arguments.
- lsubr: C function of list of arguments.
- PTOBs:
- inport: input port.
- outport: output port.
- ioport: input-output port.
- inpipe: input pipe.
- outpipe: output pipe.
- SMOBs:
- free_cell: unused cell on the freelist.
- flo: single-precision float.
- dblr: double-precision float.
- dblc: double-precision complex.
- bigpos: positive bignum.
- bigneg: negative bignum.
- promise: made by DELAY.
- arbiter: synchronization object.
- macro: macro expanding function.
- array: multi-dimensional array.
-
-
- DATA TYPE REPRESENTATIONS
- IMMEDIATE: B,D,E,F=data bit, C=flag code, P=pointer address bit
- q ................................
- pointer PPPPPPPPPPPPPPPPPPPPPPPPPPPPP000
- inum BBBBBBBBBBBBBBBBBBBBBBBBBBBBBB10
- ichr BBBBBBBBBBBBBBBBBBBBBBBB11110100
- iflag CCCCCCC101110100
- isym CCCCCCC001110100
-
- IMCAR: only in car of evaluated code, cdr has cell's GC bit
- ispcsym 000CCCC00CCCC100
- iloc 0DDDDDDDDDDDDDDDEFFFFFFF11111100
- gloc PPPPPPPPPPPPPPPPPPPPPPPPPPPPP001
-
-
-
- ( jtype . data ) is a java handle
-
- HEAP CELL: G=gc_mark; 1 during mark, 0 other times.
- 1s and 0s here indicate type. G missing means sys (not GC'd)
- SIMPLE:
- cons ..........SCM car..............0 ...........SCM cdr.............G
- closure ..........SCM code...........011 ...........SCM env.............G
- MALLOCs:
- ssymbol .........long length....G0000101 ..........char *chars...........
- msymbol .........long length....G0000111 ..........char *chars...........
- string .........long length....G0001101 ..........char *chars...........
- vector .........long length....G0001111 ...........SCM **elts...........
- bvect .........long length....G0010101 ..........long *words...........
- spare G0010111
- ivect .........long length....G0011101 ..........long *words...........
- uvect .........long length....G0011111 ......unsigned long *words......
- spare G0100101
- spare G0100111
- fvect .........long length....G0101101 .........float *words...........
- dvect .........long length....G0101111 ........double *words...........
- cvect .........long length....G0110101 ........double *words...........
-
- contin .........long length....G0111101 .............*regs..............
- cclo .........long length....G0111111 ...........SCM **elts...........
- SUBRs:
- spare 010001x1
- spare 010011x1
- subr_0 ..........int hpoff.....01010101 ...........SCM (*f)()...........
- subr_1 ..........int hpoff.....01010111 ...........SCM (*f)()...........
- cxr ..........int hpoff.....01011101 ...........SCM (*f)()...........
- subr_3 ..........int hpoff.....01011111 ...........SCM (*f)()...........
- subr_2 ..........int hpoff.....01100101 ...........SCM (*f)()...........
- asubr ..........int hpoff.....01100111 ...........SCM (*f)()...........
- subr_1o ..........int hpoff.....01101101 ...........SCM (*f)()...........
- subr_2o ..........int hpoff.....01101111 ...........SCM (*f)()...........
- lsubr_2 ..........int hpoff.....01110101 ...........SCM (*f)()...........
- lsubr_2n..........int hpoff.....01110111 ...........SCM (*f)()...........
- rpsubr ..........int hpoff.....01111101 ...........SCM (*f)()...........
- PTOBs:
- port 00wroxxxxxxxxG1110111 ..........FILE *stream..........
- inport uuuuuuuuuuU00011xxxxxxxxG1110111 ..........FILE *stream..........
- outport 0000000000000101xxxxxxxxG1110111 ..........FILE *stream..........
- ioport uuuuuuuuuuU00111xxxxxxxxG1110111 ..........FILE *stream..........
- fport 00 00000000G1110111 ..........FILE *stream..........
- pipe 00 00000001G1110111 ..........FILE *stream..........
- SMOBs:
- free_cell
- 000000000000000000000000G1111111 ...........*free_cell........000
- flo 000000000000000000000001G1111111 ...........float num............
- dblr 000000000000000100000001G1111111 ..........double *real..........
- dblc 000000000000001100000001G1111111 .........complex *cmpx..........
- bignum ...int length...0000001 G1111111 .........short *digits..........
- bigpos ...int length...00000010G1111111 .........short *digits..........
- bigneg ...int length...00000011G1111111 .........short *digits..........
- xxxxxxxx = code assigned by newsmob();
- promise 000000000000000fxxxxxxxxG1111111 ...........SCM val..............
- arbiter 000000000000000lxxxxxxxxG1111111 ...........SCM name.............
- macro 000000000000000mxxxxxxxxG1111111 ...........SCM name.............
- array ...short rank..cxxxxxxxxG1111111 ............*array..............
-
-
-
-
- SMOBs
-
- SMOBs are a collection of miscellaneous types. The type code and
- GCMARK bit occupy the lower order 16 bits of the CAR half of the cell.
- The rest of the CAR can be used for sub-type or other information.
- The CDR contains data of size long.
-
- Inexact data types are subtypes of type tc16_flo. If the sub-type is:
- 0 - a single precision float is contained in the CDR.
- 1 - CDR is a pointer to a malloced double.
- 3 - CDR is a pointer to a malloced pair of doubles.
-
- To add a new type to scm:
- [1] long tc16_???;
- The type code which will be used to identify the new type.
- [2] static smobfuns ???smob = {mark???,free???,print???,equalp???};
- smobfuns is a structure composed of 4 functions:
- typedef struct {
- SCM (*mark)P((SCM));
- sizet (*free)P((CELLPTR));
- int (*print)P((SCM exp, SCM port, int writing));
- SCM (*equalp)P((SCM, SCM));
- } smobfuns;
- smob.mark is a function of one argument of type SCM (the cell
- to mark) and returns type SCM which will then be marked. If
- no further objects need to be marked then return an
- immediate object such as BOOL_F. 2 functions are provided:
- markcdr(ptr) which marks the current cell and
- returns the CDR.
- mark0(ptr) which marks the current cell and returns
- BOOL_F.
- smob.free is a function of one argument of type CELLPTR (the
- cell to collected) and returns type sizet which is the
- number of malloced bytes which were freed. Smob.free should
- free any malloced storage associated with this object. The
- function free0(ptr) is provided which does not free any
- storage and returns 0.
- smob.print is 0 or a function of 3 arguments. The first, of
- type SCM, is the smob object. The second, of type SCM, is
- the stream on which to write the result. The third, of type
- int, is 1 if the object should be WRITEn, 0 if it should be
- DISPLAYed. This function should return non-zero if it
- printed, and zero otherwise (in which case a hexadecimal
- number will be printed).
- smob.equalp is 0 or a function of 2 SCM arguments. Both of
- these arguments will be of type tc16???. This function
- should return BOOL_T if the SMOBs are equal, BOOL_F if they
- are not. If smob.equalp is 0, EQUAL? will return BOOL_F if
- they are not EQ?.
- [3] tc16_??? = newsmob(&???smob);
- Allocates the new type with the functions from [2].
-
- Promises and macros in eval.c and arbiters in repl.c provide examples
- of SMOBs. There are a maximum of 256 SMOBs.
-
- PTOBs
-
- PTOBs are similar to SMOBs but define new types of port to which SCM
- can read or write. The following functions are defined in the ptobfuns:
-
- typedef struct {
- SCM (*mark)P((SCM ptr));
- int (*free)P((FILE *p));
- int (*print)P((SCM exp, SCM port, int writing));
- SCM (*equalp)P((SCM, SCM));
- int (*fputc)P((int c, FILE *p));
- int (*fputs)P((char *s, FILE *p));
- sizet (*fwrite)P((char *s, sizet siz, sizet num, FILE *p));
- int (*fflush)P((FILE *stream));
- int (*fgetc)P((FILE *p));
- int (*fclose)P((FILE *p));
- } ptobfuns;
-
- The "free" component to the structure takes a FILE * or other C
- construct as its argument, unlike "free" in in a SMOB which takes the
- whole SMOB cell. Often, "free" and "fclose" can be the same function.
- See fptob and pipob in sys.c for examples of how to define PTOBs.
-
- GARBAGE COLLECTION
-
- The garbage collector is in the latter half of sys.c. Immediates
- always appear as parts of other objects, so they are not subject to
- explicit garbage collection. There is a heap (composed of heap
- segments) in which all cells reside. The storage for strings,
- vectors, continuations, doubles, complexes, and bignums is managed by
- malloc. There is only one pointer to each malloc object from its
- type-header cell in the heap. This allows malloc objects to be freed
- when the associated heap object is garbage collected.
-
- To garbage collect, first certain protected objects are marked (such
- as symhash). Then the stack (and marked continuations) are traversed.
- Each longword in the stack is tried to see if it is a valid SCM
- pointer into the heap. If it is, the object itself and any objects it
- points to are marked. If the stack is word rather than longword
- aligned (#define WORD_ALIGN), both alignments are tried. This
- arrangement will occasionally mark an object which is no longer used.
- This has not been a problem in practice and the advantage of using the
- c-stack far outweighs it.
-
- The heap is then swept. If a type-header cell pointing to malloc
- space is collected the malloc object is then freed. If the type
- header of smob is collected, the smob's free procedure is called to
- free its storage.
-
- INTERRUPTS
-
- If they are supported by the C implementation, init_signals() in scm.c
- sets up handlers for SIGINT and SIGALRM. The low level handlers for
- SIGINT and SIGALRM are int_signal() and alrm_signal(). All of the
- signal handlers immediately reestablish themselves by a call to
- signal().
-
- If an interrupt handler is defined when the interrupt is received, the
- code is interpreted. If the code returns, execution resumes from
- where the interrupt happened. Call-with-current-continuation allows
- the stack to be saved and restored.
-
- SCM does not use any signal masking system calls. These are not a
- portable feature. However, code can run uninterrupted by use of the C
- macros DEFER_INTS and ALLOW_INTS. DEFER_INTS sets the global variable
- ints_disabled to 1. If an interrupt occurs during a time when
- ints_disabled is 1 one of the global variables sig_deferred or
- alrm_deferred is set to 1 and the handler returns. When ALLOW_INTS is
- executed the deferred variables are checked and if set the appropriate
- handler is called.
-
- DEFER_INTS can not be nested. An ALLOW_INTS must happen before
- another DEFER_INTS can be done. In order to check that this
- constraint is satisfied #define CAREFUL_INTS in scmfig.h.
-
- CHANGING SCM
-
- When writing C-code a precaution is recommended. If your routine
- allocates a malloc type header from the heap make sure that a local
- SCM variable in your routine points to the type-header cell of the
- malloc object as long as you are using the malloced storage. This
- will prevent the malloc object from being freed before you are done
- with it.
-
- Also, if you maintain a static pointer to some (non-immediate) SCM
- object, you must either make your pointer be the value cell of a
- symbol (see errobj for an example) or make your pointer be one of the
- sys_protects (see symhash for an example). The former method is
- prefered since it does not require any changes to the SCM distribution.
-
- The macro ASSERT(_cond,_arg,_pos,_subr) signals an error if the
- expression (_cond) is 0. _arg is the offending object, _subr is the
- string naming the subr, and _pos indicates the position or type of
- error. _pos can be one of
- `ARG1',
- `ARG2',
- `ARG3',
- `ARG4',
- `ARG5',
- `WNA' (wrong number of args),
- `OVFLOW'
- `OUTOFRANGE'
- `NALLOC'
- `EXIT'
- `HUP_SIGNAL'
- `INT_SIGNAL'
- `FPE_SIGNAL'
- `BUS_SIGNAL'
- `SEGV_SIGNAL'
- `ALRM_SIGNAL'
- or a C string (char *).
-
- Error checking is not done by ASSERT if the flag RECKLESS
- is defined. An error condition can still be signaled in this case
- with a call to wta(_arg,_pos,_subr).
-
- To add a C routine to scm:
- [1] choose the appropriate subr type from the type list.
- [2] write the code and put into scm.c.
- [3] add a make_subr call to init_scm. Or put an entry into the
- appropriate iproc structure.
-
- To add a package of new procedures to scm (see crs.c for example):
- [1] create a new C file (foo.c).
- [2] at the front of foo.c put declarations for strings for your
- procedure names.
- static char s_twiddle_bits[]="twiddle-bits!";
- static char s_bitsp[]="bits?";
- [3] choose the appropriate subr types from the type list in code.doc.
- [4] write the code for the procedures and put into foo.c
- [5] create one iproc structure for each subr type used in foo.c
- static iproc subr3s[]={
- {s_twiddle-bits,twiddle-bits},
- {s_bitsp,bitsp},
- {0,0}};
- [6] create an init_<name of file> routine at the end of the
- file which calls init_iprocs with the correct type for each
- of the iprocs created in step 5.
- void init_foo()
- {
- init_iprocs(subr1s, tc7_subr_1);
- init_iprocs(subr3s, tc7_subr_3);
- }
- If your package needs to have a "finalization" routine called to
- free up storage, close files, etc, then also have a line in
- init_foo like:
- add_final(final_foo);
- final_foo should be a (void) procedure of no arguments.
- The finals will be called in opposite order from their definition.
- The line:
- add_feature("foo");
- will append a symbol 'foo to the (list) value of *features*.
- [7] put any scheme code which needs to be run as part of your
- package into Ifoo.scm.
- [8] put an IF into Init.scm which calls Ifoo.scm if your
- package is included:
- (if (defined? twiddle-bits!)
- (load (in-vicinity (implementation-vicinity)
- "Ifoo"
- (scheme-file-suffix))))
- or use (PROVIDED? 'FOO) instead of (DEFINED? TWIDDLE-BITS!) if
- you have added the feature.
- [9] put documentation of the new procedures into foo.doc
- [10] add lines to your makefile to compile and link SCM with your
- object file. Add a init_foo\(\)\; to the INITS=... line at the
- beginning of the makefile.
-
- These steps should allow your package to be linked into SCM with a
- minimum of difficulty. Your package should also work with dynamic
- linking if your SCM has this capability.
-
- Special forms (new syntax) can be added to scm.
- [1] define a new MAKISYM in scm.h and increment NUM_ISYMS.
- [2] add a string with the new name in the corresponding place
- in isymnames in repl.c.
- [3] add case clause to ceval near i_quasiquote (in eval.c).
-
- New syntax can now be added without recompiling SCM by the use of the
- PROCEDURE->SYNTAX, PROCEDURE->MACRO, PROCEDURE->MEMOIZING-MACRO, and
- DEFMACRO. See MANUAL for details.
-
- To use scm from another program call init_scm or run_scm as is done in
- main() in "scm.c".
-
- CONTINUATIONS
-
- The scm procedure call-with-current-continuation calls it's argument
- with an object of type `contin'.
-
- If CHEAP_CONTINUATIONS is #defined (in "scmfig.h") the contin just
- contains a jmp_buf. When the contin is applied, a longjmp of the
- jmp_buf is done.
-
- If CHEAP_CONTINUATIONS is not #defined the contin contains the jmp_buf
- and a copy of the C stack between the call_cc stack frame and
- BASE(rootcont). When the contin is applied:
- [1] the stack is grown larger than the saved stack, if neccessary.
- [2] the saved stack is copied back into it's original position.
- [3] longjmp of the jmp_buf is called.
-
- On systems with nonlinear stack disciplines (multiple stacks or
- non-contiguous stack frames) copying the stack will not work properly.
- These systems need to #define CHEAP_CONTINUATIONS in "scmfig.h".
-
- INTEGERS
-
- Scm has 30 bit immediate signed numbers called INUMs. An INUM instead
- of a pointer to a cell is flagged by a `1' in the second to low order
- bit position. Since cells are always 8 byte aligned a pointer to a
- cell has the low order 3 bits `0'. The high order 30 bits are used
- for the integer's value.
-
- Computations on INUMs are performed by converting the arguments to C
- integers (by a shift), operating on the integers, and converting the
- result to an INUM. The result is checked for overflow by converting
- back to integer and checking the reverse operation.
-
- The shifts used for conversion need to be signed shifts. If the C
- implementation does not support signed right shift this fact is
- detected in a #if statement in scmfig.h and a signed right shift (SRS)
- is constructed in terms of unsigned right shift.
-
- Scm also has large precision integers called bignums. They are stored
- as sign-magnitude with the sign occuring in the type code of the SMOBs
- bigpos and bigneg. The magnitude is stored as a malloced array of
- type BIGDIG which must be an unsigned integral type with size smaller
- than long. BIGRAD is the radix associated with BIGDIG.
-
- EVALUATION
-
- Top level symbol values are stored in the symhash table. Symhash is
- an array of lists of ISYMs and pairs of symbols and values.
-
- Whenever a symbol's value is found in the local environment the
- pointer to the symbol in the code is replaced with an immediate object
- (ILOC) which specifies how many environment frames down and how far in
- to go for the value. When this immediate object is subsequently
- encountered, the value can be retrieved quickly.
-
- Pointers to symbols not defined in local environments are changed to
- one plus the value cell address in symhash. This incremented pointer
- is called a GLOC. The low order bit is normally reserved for GCmark;
- But, since references to variables in the code always occur in the CAR
- position and the GCmark is in the CDR, there is no conflict.
-
- If the compile FLAG `CAUTIOUS' is #defined then the number of
- arguments is always checked for application of closures. If the
- compile FLAG `RECKLESS' is #defined then they are not checked.
- Otherwise, number of argument checks for closures are made only when
- the function position (whose value is the closure) of a combination is
- not an ILOC or GLOC. When the function position of a combination is a
- symbol it will be checked only the first time it is evaluated because
- it will then be replaced with an ILOC or GLOC.
-
- IMPROVEMENTS TO MAKE
-
- Convert MANUAL into texinfo format.
-
- Prefix and make more uniform all C function, variable, and constant
- names. Provide a file full of #define's to provide backward
- compatability.
-
- lgcd() needs to generate at most one bignum.
-
- divide() could use shifts instead of multiply and divide when scaling.
-
- If an open fails because there are no unused file handles, GC should
- be done so that file handles which are no longer used can be
- collected.
-
- If the symhash array is specially marked in garbage collection,
- msymbols with value #[undefined] which have no pointers to them can be
- collected. In Maclisp this was called GCTWA.
-
- Compaction could be done to malloced objects by freeing and reallocing
- all the malloc objects encountered in a scan of the heap. Whether
- compactions would actually occur is system depenedent.
-
- Copying all of the stack is wasteful of storage. Any time a
- call-with-current-continuation is called the stack could be re-rooted
- with a frame which calls the contin just created. This in combination
- with checking stack depth could also be used to allow stacks deeper
- than 64K on the IBM PC.
-
- DYNAMIC LINKING
-
- Scott Schwartz <schwartz@galapagos.cse.psu.edu> suggests: One way to
- tidy up the dynamic loading stuff would be to grab the code from
- perl5.
- ====
- This note should help with porting to Windows NT and finishing the
- port for dynamic linking under VMS.
- ================================================================
- Return-Path: <gjc@newjak.mitech.com>
- Date: Fri, 3 Sep 93 10:38:18 EDT
- From: gjc@newjak.mitech.com
- To: jaffer@ai.mit.edu
- Subject: dynamic linking.
-
- George Carrette
- GJC@MITECH.COM
- 3-SEP-1993
-
-
- Dear Aubrey:
-
- On the subject of shared libraries and dynamic linking for VMS and WINDOWS NT,
- and SunOs. Let us take VMS first.
-
- (1) Say you have this main.c program:
-
- main()
- {init_lisp();
- lisp_repl();}
-
- (2) and you have your lisp in files repl.c,gc.c,eval.c
- and there are some toplevel non-static variables
- in use called the_heap,the_environment, and some read-only
- toplevel structures, such as the_subr_table.
-
- $ LINK/SHARE=LISPRTL.EXE/DEBUG REPL.OBJ,GC.OBJ,EVAL.OBJ,LISPRTL.OPT/OPT
-
- (3) where LISPRTL.OPT must contain at least this:
-
- SYS$LIBRARY:VAXCRTL/SHARE
- UNIVERSAL=init_lisp
- UNIVERSAL=lisp_repl
- PSECT_ATTR=the_subr_table,SHR,NOWRT,LCL
- PSECT_ATTR=the_heap,NOSHR,LCL
- PSECT_ATTR=the_environment,NOSHR,LCL
-
- Notice: The "psect" (Program Section) attributes.
- LCL means to keep the name local to the shared library. You almost always want
- to do that for a good clean library.
- SHR,NOWRT means shared-read-only. Which is the default for code, and is also
- good for efficiency of some data structures.
- NOSHR,LCL is what you want for everything else.
-
- Note: If you do not have a handy list of all these toplevel variables,
- do not dispair. Just do your link with the /MAP=LISPRTL.MAP/FULL
- and then search the map file,
-
- $SEARCH/OUT=LISPRTL.LOSERS LISPRTL.MAP ", SHR,NOEXE, RD, WRT"
-
- And use an emacs keyboard macro to muck the result into the proper form.
- Of course only the programmer can tell if things can be made read-only.
- I have a DCL command procedure to do this if you want it.
-
-
- (4) Now MAIN.EXE would be linked thusly:
-
- $ DEFINE LISPRTL USER$DISK:[JAFFER]LISPRTL.EXE
-
- $LINK MAIN.OBJ,SYS$INPUT:/OPT
- SYS$LIBRARY:VAXCRTL/SHARE
- LISPRTL/SHARE
-
- Note the definition of the LISPRTL logical name. Without such a definition
- you will need to copy LISPRTL.EXE over to SYS$SHARE: (aka SYS$LIBRARY:)
- in order to invoke the main program once it is linked.
-
- (5) Now say you have a file of optional subrs. MYSUBRS.C
- And there is a routine INIT_MYSUBRS that must be called before using it.
-
- $ CC MYSUBRS.C
- $ LINK/SHARE=MYSUBRS.EXE MYSUBRS.OBJ,SYS$INPUT:/OPT
- SYS$LIBRARY:VAXCRTL/SHARE
- LISPRTL/SHARE
- UNIVERSAL=INIT_MYSUBRS
-
- Ok. Another hint is that you can avoid having to add the PSECT
- declaration of NOSHR,LCL by declaring variables 'status' in the
- C language source. That works great for most things.
-
- (6) Then the dynamic loader would have to do this:
-
- {void (*init_fcn)();
- long retval;
- retval = lib$find_image_symbol("MYSUBRS","INIT_MYSUBRS",&init_fcn,
- "SYS$DISK:[].EXE");
- if (retval != SS$_NORMAL) error(...);
- (*init_fcn)();}
-
- But of course all string arguments must be (struct dsc$descriptor *)
- and the last argument is optional if MYSUBRS is defined as a logical
- name or if MYSUBRS.EXE has been copied over to SYS$SHARE.
- The other consideration is that you will want to turn off CONTROL-C
- or other interrupt handling while you are inside most lib$ calls.
-
- As far as the generation of all the UNIVERSAL=... declarations.
- Well, you could do well to have that automatically generated from
- the public LISPRTL.H file, of course.
-
- VMS has a good manual called the "Guide to Writing Modular Procedures"
- or something like that, which covers this whole area rather well,
- and also talks about advanced techniques, such as a way to declare
- a program section with a pointer to a procedure that will be automatically
- invoked whenever any shared image is dynamically activated. Also,
- how to set up a handler for normal or abnormal program exit so that
- you can clean up side effects (such as opening a database).
- But for use with LISPRTL you probably don't need that hair.
-
- One fancier option that is useful under VMS for LISPLIB.EXE is to
- define all your exported procedures through an CALL VECTOR instead
- of having them just be pointers into random places in the image,
- which is what you get by using UNIVERSAL.
-
- If you set up the call vector thing correctly it will allow you to
- modify and relink LISPLIB.EXE without having to relink programs
- that have been linked against it.
-
- WINDOWS NT:
-
- The Software Developers Kit has a sample called SIMPLDLL.
- Here is the gist of it, following along the lines of the VMS description
- above (contents of a makefile for the SDK NMAKE)
-
- LISPLIB.exp:
- LISPLIB.lib: LISPLIB.def
- $(implib) -machine:$(CPU) -def:LISPLIB.def -out:LISPLIB.lib
-
- LISPLIB.DLL : $(LISPLIB_OBJS) LISPLIB.EXP
- $(link) $(linkdebug) \
- -dll \
- -out:LISPLIB.DLL \
- LISPLIB.EXP $(LISPLIB_OBJS) $(conlibsdll)
-
- The LISPDEF.DEF file has this:
-
- LIBRARY lisplib
- EXPORT
- init_lisp
- init_repl
-
- And MAIN.EXE using:
-
- CLINK = $(link) $(ldebug) $(conflags) -out:$*.exe $** $(conlibsdll)
-
- MAIN.EXE : MAIN.OBJ LISPLIB.LIB
- $(CLINK)
-
- And MYSUBRS.DLL is produced using:
-
- mysubrs.exp:
- mysubrs.lib: mysubrs.def
- $(implib) -machine:$(CPU) -def:MYSUBRS.def -out:MYSUBRS.lib
-
- mysubrs.dll : mysubrs.obj mysubrs.exp mysubrs.lib
- $(link) $(linkdebug) \
- -dll \
- -out:mysubrs.dll \
- MYSUBRS.OBJ MYSUBRS.EXP LISPLIB.LIB $(conlibsdll)
-
- Where MYSUBRS.DEF has
-
- LIBRARY mysubrs
- EXPORT
- INIT_MYSUBRS
-
- And the dynamic loader looks something like this, calling
- the two procedures LoadLibrary and GetProcAddress.
-
- LISP share_image_load(LISP fname)
- {long iflag;
- LISP retval,(*fcn)(void);
- HANDLE hLib;
- DWORD err;
- char *libname,fcnname[64];
- iflag = nointerrupt(1);
- libname = c_string(fname);
- _snprintf(fcnname,sizeof(fcnname),"INIT_%s",libname);
- if (!(hLib = LoadLibrary(libname)))
- {err = GetLastError();
- retval = list2(fname,LSPNUM(err));
- serror1("library failed to load",retval);}
- if (!(fcn = (LISP (*)(void)) GetProcAddress(hLib,fcnname)))
- {err = GetLastError();
- retval = list2(fname,LSPNUM(err));
- serror1("could not find library init procedure",retval);}
- retval = (*fcn)();
- nointerrupt(iflag);
- return(retval);}
-
- Note: in VMS the linker and dynamic loader is case sensitive,
- but all the language compilers, including C, will by default
- upper-case external symbols for use by the linker, although the
- debugger gets its own symbols and case sensitivity is language
- mode dependant. In Windows NT things are case sensitive generally
- except for file and device names, which are case canonicalizing
- like in the Symbolics filesystem.
-
- Also: All this WINDOWS NT stuff will work in MS-DOS MS-Windows 3.1
- too, by a method of compiling and linking under Windows NT,
- and then copying various files over to MS-DOS/WINDOWS.
-