Text File | 1989-05-24 | 43KB | 799 lines

SmallC/Z 05/07/89

SmallC/Z Compiler - Usage Documentation and Miscellaneous Notes

The following is some preliminary documentation on SmallC/Z. It covers whatever information it seemed would be useful to get one started using the compiler and any other inanities that came to mind while writing it. As such, it will hit on quite a variety of topics, hopefully with some semblance of coherence. If this isn't the case, I accept responsibility, and can only say do your best with it, and maybe somebody will add to or improve on it.

Topics covered:

    Running the compiler
    Topics I'd like opinions on
    ZCPR Support and the Configuration Header
    Program installation and use of debuggers.
    Use of assembler code with SmallC/Z
    Argument counts
    The Entry and Exit Vectors
    Using CZVLIB
    The new memory management scheme
    Basic memory layout
    Installing the compiler (optional in most cases.)
    Compiler and library version numbers

copyright (c) Al Grabauskas, 1989

---===*===---

Topics I'd like opinions on

I'd like to hear opinions on anything you'd like to discuss regarding SmallC/Z. The following are a couple things I'd in particular like to get some opinions on. Most are further discussed elsewhere in this file, so if something doesn't make sense or ring a bell now, read on.

- should the option switches stay of the form "-s", or should they go to the more standard zcpr "/s" format?

- how does the "assembler configuration header" idea sit with folks? pros? cons?

- should the entry and exit vectors be handled differently? one could, at the expense of code space, make the exit vector a stack, for instance...

- is the dynamic memory management scheme reasonable? would a first fit allocation scheme be better than best fit?

- are the existing CZVLIB functions reasonable in terms of programmer interface, or could their usage be simplified?

- does the external file control block contain disk and user in versions of zcpr prior to 3.4? this is important to the way du: is handled currently in argv[0].
---===*===---

Running the compiler

The syntax for invoking the compiler is briefly as follows:

    scz [<input-redirector or filelist...] [>output-redirector] [options]

where: input-redirector or output-redirector is a file name and file type, a file list is a list of comma delimited input files, and options are described below. All elements of the command tail are optional, and indeed, the compiler can be run without any command tail at all.

Like its precursor, SmallC/Z supports redirection of input and output, and this has a significant impact on the way the user interacts with the compiler and programs created with it. We'll just look at this as it relates to using the compiler, and you can draw further conclusions based on that.

One useful side effect is this: by starting the compiler with nothing except perhaps option switches in the command tail, you instruct it to take its input from stdin (the keyboard) and write its output to stdout (the terminal). In other words, if you need to quickly see what assembler code the compiler will generate for a given set of statements, just crank it up and type them in, and it'll spit the assembler source to the screen.

When typing input at the keyboard a ^S pauses screen output, a ^C aborts execution, and a ^Z (the cp/m and zcpr "end of text file" byte) indicates end of file.

Redirection can also be used to embed the original C source code as comments in the assembler source output, so you can study the compilation of various statements at your leisure. This is described below under the "-l#" option switch.

By default, input is obtained from stdin, but if an input redirector, a filename, or a list of filenames is given, then the file(s) named will be used as input. A filetype of "C" is defaulted to if none is specified for names in an input file list. Output normally goes to the screen, unless an output redirector or input file(s) are specified. The action for redirection is self-explanatory.
In the case that one or more input files are given, then the output will be written to disk using the first filename in the list with a type of Z80. Redirection specifications get no default types.

Options must each begin with a dash ("-"), and can occur anywhere in the command tail. Short descriptions of the option switches follow:

-m    Monitors compiler progress by echoing each function declaration to stderr (screen, non-redirectable).

-a    Alarm - rings the console bell on errors.

-p    Pause on each error. A carriage return from the keyboard continues processing.

-l#   Listing control, where # is a file descriptor number for the file to which the listing should be sent. Standard file descriptors are as follows:

          0=stdin (not applicable here)
          1=stdout
          2=stderr

      These are also listed in the stdio.h #include file. If you specify 1 (stdout), the listing is intermixed with the output. SmallC/Z then precedes each line with a semicolon so the assembler sees it as a comment. Stderr (2), not being redirectable, always goes to the screen. No listing is produced if the -l switch isn't specified.

-o    Optimize for size, even at the expense of speed.

-z#   This one is an oddball, and will require a bit of explanation. By default, at the end of any compile SmallC/Z checks to see if main() is defined. If it is, the compiler goes back to the front of its output file and writes a (patchable) string. The default is "$INCLUDE SYSCCFG1.CCZ", and causes the assembler source to pull in a file of that name (the default one supplied has a zcpr "Z3ENV" header and some linkage information). In order to provide some flexibility here, the character following the -z switch is taken and inserted in the position occupied by '1' in the default file name (so -z3, for example, yields SYSCCFG3.CCZ). Both the default string and the position the -z switch affects are patchable. More on this under "ZCPR Support", below.
      See also "Installing the Compiler".

-     The null switch, or any unrecognized or erroneous option will display a (very) short help on the screen.

---===*===---

The stuff that follows may or may not be more involved with compiler library and the runtime code internals than many folks care about. If so, don't bother with it too much, but you may want to scan it somewhat in order to pick up some of the basic things about the compiled code's link to zcpr and general operation.

---===*===---

ZCPR Support and the Configuration Header

The -z switch originally told the compiler to include zcpr support by writing a header that was built into the compiler to the output file. This approach was dropped for several reasons. Firstly, it meant the switch had to be specified for modules containing main() that were intended to run under zcpr, and this was easy to forget, which would require a recompile. Secondly, it chiselled the header that was output in stone, so to speak.

Outputting an assembler include directive allows the programmer control over the configuration header used, and allows the header to be customized or added to without need to change the compiler. If you do need a header other than the standard one for a particular program, and you forgot to specify the switch, you can modify the output assembler source with a file patch utility - you don't even need to crank up an editor (which is nice, since the assembler code can be quite large).

Admittedly, you could have done something similar with a simple #include in C, or even just a #asm...#endasm sequence, but this seemed more elegant, especially since 99% of the programs probably won't need or use anything other than a standard header - you can just forget about it and write your code.

Once I decided to go this route, I realized that the header include file could serve a couple of purposes.
The net result was this: the included assembler configuration header file contains any operating system specific program base code, plus data that provides linkage between your main() and the runtime support code. This is done by making references to certain names the linker will find in two small arrays: the "entry and exit vectors". More on these later, but the gist of this is that the compiler no longer contains any names assumed to be in the run-time support library except the arithmetic and logical operations routines in the CALL module. Those it simply can't do without - they provide the virtual machine that the compiler is written for.

All other runtime support (that performed on entry prior to executing main() and that performed on exit, in particular) is connected to your program through the starting jump and the entry and exit vectors (which are processed by Uentry() and exit(), respectively) in the configuration header. (You can offload the entry and exit vectors to a linkable module, if you like, but the initial linkage between your program and the runtime support is still provided by the configuration header file.)

The implications of this approach are that you are not necessarily tied to the standard library if you want to use this compiler - you just need CALL and some replacements or modified versions, as the case may be, of Uentry() and exit() as a minimum. Some of the stuff in the CSYSLIB module would likely also be desirable. You could, for example, completely replace the i/o subsystem fairly straightforwardly, or set up a runtime package that supports some Z80 based controller board that doesn't have anything resembling a cp/m or zcpr environment.

You can also fairly straightforwardly add initialization and de-initialization code at points in the execution of the program that are at a "system level", i.e. before execution of and after leaving main(), with a minimum of hassle. More on this in the next section.
Three very similar configuration headers are supplied:

    SYSCCFG0.CCZ - vanilla cp/m header
    SYSCCFG1.CCZ - zcpr type 1 header
    SYSCCFG3.CCZ - zcpr type 3 header

The compiler is set up to put the assembler code

    $INCLUDE SYSCCFG1.CCZ

into the front of assembler output files containing main(). This default is patchable, so you can change the $INCLUDE to whatever your assembler supports (I think most assemblers capable of handling the code output by SmallC/Z should like $INCLUDE just fine), the file name and/or the byte affected by the -z option flag.

SYSCCFG1.CCZ is deemed the most generic, since it will run fine on cp/m and zcpr systems of various revision levels, and will supply you with the ability to use the zcpr based functions by default. The cp/m header is supplied mainly for completeness. The type 3 header works as type 3 should for zcpr 3.3 or better. (It may, however, be desirable to relocate free memory - stack and heap space - UNDER the program if you link it high in core - more on this later too). You don't want to use that last one on systems that don't support type 3 zcpr programs.

By the way, you can do custom headers with overlay areas for programs, if you like. I suspect that sort of thing is better done by the #include approach however, since its use is limited to a particular program. Better to save use of configuration header include files for run-time support related things, if only to maintain a logical consistency.

One note on the standard configuration header. Since part of its intent is to support type 3 programs, it cannot be left to the linker to place an initial jump at 100h and start the cseg at 103h, since you may want a type 3 program to run elsewhere in memory. As a result, it contains a jump to Uentry(), which is the initial runtime setup routine in CLIB, and you'll have to supply the linker with an address to put the program at (/p: linker switch for L80, and /a: is recommended for SLRLNK).
However, if you always plan to compile for execution at 100h, you can remove or comment out the jump, perhaps make sure the program environment type is 1, and NOT specify an address to the linker. Uentry() is the only routine in CLIB or CZVLIB whose end card contains its name, so the linker will know where to jump to.

The important thing here is, if the jump is in the include file, specify an address, because otherwise the linker will generate an extra jump, and even though it will go to the right place, the Z3ENV header will be displaced and as a result the environment descriptor pointer won't be installed properly. Conversely, if the jump is not in the header file, don't specify an address, because then the linker will omit generating a jump and the first instruction you execute will be 'Z3ENV', which is highly unlikely to yield the desired results.

I personally prefer the more generic solution, and since I use aliases, scripts and the MAKE rcp, it is really no extra hassle to specify the execution address at link time, but whatever suits your fancy and works is what you should do. That's at least partly what we do this stuff for, eh?

---===*===---

Program installation and use of debuggers.

If you are running zcpr 3.0, you'll have to install the programs you compile with z3ins.com. Alternatively, you can edit the header files and put in a pointer to the environment for your system. This has an advantage even for zcpr 3.3 or later command processors, because the program is already installed when you go into it with a debugger like z8e.com. The default headers supplied have the environment pointer set to zero.

Assuming you know Z80 assembler, you'll find that once you know the kind of code put out by SmallC/Z and the function of library code (and what the arithmetic and logical routines in the CALL module do in particular), that you can do reasonably meaningful debugging in assembler debuggers like z8e.
If you generate *.sym files from the link, your symbol set will include the entry points of all functions (yours and the libraries'), and quite a few runtime system variable addresses. Non-global tags for specific functions you may be interested in can be obtained from assembler listings. Read the documentation for z8e - it is incredibly flexible in its handling of symbols, and can use multiple files to build its symbol table for a given run. If you understand the basic structure of the runtime support in the libraries, you can track down almost any bug this way.

---===*===---

Use of assembler code with SmallC/Z

You can write C callable routines in assembler and vice versa, either as separate modules or in C source files. You can also throw in-line assembler code into C source files anyplace, but this requires a good understanding of the kind of code produced by SmallC/Z, and assumes that it won't change. I wouldn't generally recommend this practice unless there is no alternative.

Putting in-line assembler in C source files is as simple as surrounding it with #asm and #endasm compiler directives. The code between them is copied verbatim by the compiler to its output file. If you do this inside function definition braces, the compiler will add the function name to its symbol table; otherwise you may have problems with it generating spurious extrn statements. The closing brace will generate a "ret" instruction, so you can omit that if your function exits from the bottom of your assembler code.

SmallC/Z places function arguments on the stack. They are pushed in the order they occur in the call, so on entry to your assembler based function, the stack contains the return address and the arguments of the call in "reverse" order. In general, if you are recovering arguments further in than 2 or 3 pops, you are probably better off getting them with a

    ld   hl,stkoffset   ; bytes to offset into the stack
    add  hl,sp          ; pointer to argument

sequence of instructions.
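The value to use for stkoffset follows from the push order described above. The following is a small sketch in standard C (not Small-C) of that arithmetic. It assumes a two-byte return address on top of the stack at entry and two bytes per argument; arg_stack_offset is a name made up for this illustration and is not part of any SmallC/Z library.

```c
#include <assert.h>

/*
 * Byte offset from SP, at function entry, of one argument.
 *   count - number of arguments in the call
 *   index - which argument is wanted, 1 = first written in the call
 * Assumptions (not taken from the SmallC/Z sources): the return address
 * occupies the two bytes at SP, and every argument occupies two bytes.
 * Arguments are pushed in call order, so the last one sits nearest SP.
 */
int arg_stack_offset(int count, int index)
{
    assert(count >= 1 && index >= 1 && index <= count);
    return 2 + 2 * (count - index);
}
```

Under these assumptions, for a three argument call the last argument is at offset 2 and the first at offset 6.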
Your assembler function should return the stack in the same condition it got it (at least the stack pointer should be the same, allowing for the return address popped by the "ret" instruction). The contents of the hl register are considered the return value by C. Other registers are ignored and need not be preserved.

---===*===---

Argument counts

If you are writing a function that needs to know how many arguments are on the stack for it, it will be in the accumulator IF the calling C function didn't have the symbol NOCCARGC #defined at the beginning of its source. #defining NOCCARGC is a compiler directive that shuts this feature off (and as a result generates slightly smaller and faster code). If you need this count in C code, make the first executable line of your function something like:

    argcount=CCARGC();

This needs to be first because the count in the accumulator would soon be lost through normal operations.

Be careful with NOCCARGC. Currently four CLIB functions and two CZVLIB functions will not operate properly if called from routines compiled with argument counts suppressed:

    CLIB   - printf(), fprintf(), scanf(), fscanf().
    CZVLIB - Vwrite(), Vrcwrite()

---===*===---

The Entry and Exit Vectors

These are processed by the Uentry() and exit() routines, respectively. The motivation for them was to decouple the library from the compiler and subsystem chunks of it from the library. The idea was originally developed to allow using the older, more primitive memory management routines for older, existing code that makes assumptions the new routines can't live with. I needed a means to do different initializations early in the runtime code, depending which routines were in use. While that by itself is straightforward enough, a more generic solution seemed desirable because I expected to run into this sort of thing again, and didn't want to have to keep going into the library source to accommodate it.
After a little thought, I realized that I could provide a means for sensibly adding and/or substituting what I've since come to regard as runtime support subsystems, without having to abuse the standard library source. It was only a tiny step from that to modularizing and pretty much decoupling the standard library entirely (with the exception of the arithmetic and logical module CALL, as I mentioned above).

The entry and exit vectors are just two small linear arrays of pointers to callable routines. The vector bases are named with the global tags Uinvec and Uexvec respectively. The last routine in each of them should never return to its caller.

The basic operation goes like this. At program startup, the initial jump goes to Uentry(), which does some basic things - it saves the initial value of the HL register and the address on the stack that returns to the system in runtime variables that can be referenced globally. It then determines the top (under the CCP or any RSX's beneath it) and bottom of memory (initially at the global tag Upgmend), and likewise saves these in global system variables.

Uentry() then loads HL with a pointer to Uinvec and proceeds to loop, emulating calls to each routine pointed to by Uinvec with a "push" of the return address and "jp (hl)" instructions. Other than the push of the return address, it stores nothing on the stack, since at least one of those routines will set up the runtime stack.

The routines on Uinvec in the default configuration are for memory management and o/s interface initialization, followed by a call to Umain(), the main routine in CSYSLIB, which sets up a couple other things like the command tail in argv[] and initializes argc, then executes your main() function.

A couple related points:

- the i/o subsystem doesn't need any initialization, so no vector is included for it. It does, however, have a vector in Uexvec, so it can close any files you neglected to prior to returning to the o/s.
- Uentry(), because of the way its code is written, doesn't care what order the routines in Uinvec are in, or even how long it is. However, I will point out that the memory management subsystem should typically be first (as it is in the default setup), since it should decide where the stack goes, and since other subsystems (i/o in particular) are logically likely to use it. Similarly, i/o de-initialization in the exit vector should be last, just prior to the "return to o/s" routine pointer, because the other routines in the exit vector (the exit code handler, for instance) may want to use console or other i/o, and typical i/o de-initializers include closing the stdin, stdout and stderr streams. (Yes, it is necessary to close these, because they might be files - remember that redirection is supported). The "exit trap" vector (described below) should be first, for reasons given there.

- As long as the topic of the length of entry and exit vectors came up, I may as well clarify the point made earlier, that the last routine in each should never return. That's the only thing that defines how long they are. Uentry() will quite contentedly keep churning through and calling whatever is pointed to as long as it returns. The reason that Uinvec can be considered "terminated" is that it calls Umain(), whose last statement is a call to the exit() function, and hence will NEVER return to Uentry(). By the way, zero is a perfectly valid value in these vectors, and results in a warm boot. That's why Uexvec is followed by zero - in the case that some program has a dysfunctional exit trap or some such, it's an attempt to cover for a worst case situation (though the likelihood there is that exit() would never regain control to execute that warm boot).

The operation of exit() as regards Uexvec is similar. exit() takes the value passed to it as an argument, stores it in a global variable, then proceeds to emulate calls to the routines on the exit vector.
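The vector walk described above can be modelled in a few lines of standard C. This is only a sketch of the idea, not the library code: the real runtime does the walk in Z80 assembler with a pushed return address and "jp (hl)", and every name below (exit_vector, set_trap, sketch_exit and so on) is invented for the illustration. A longjmp stands in for the "return to o/s" routine that never comes back, and the warm boot taken on a zero entry is not modelled.

```c
#include <assert.h>
#include <setjmp.h>

typedef void (*vec_fn)(void);

static jmp_buf os_return;  /* stands in for the saved system return address */
static int saved_code;     /* exit code stored where the handlers can see it */
static int calls_made;     /* bookkeeping for this sketch only */

static void null_trap(void)    { calls_made++; }  /* default trap: a bare "ret" */
static void code_handler(void) { calls_made++; }  /* would report saved_code */
static void io_cleanup(void)   { calls_made++; }  /* would close open files */
static void return_to_os(void) { calls_made++; longjmp(os_return, 1); }

/* A sample user trap, so the effect of replacing slot 0 is visible. */
void sample_trap(void) { calls_made += 10; }

/* No length is stored anywhere: the walk ends only because the last
   entry never returns to the loop. */
static vec_fn exit_vector[] = { null_trap, code_handler, io_cleanup, return_to_os };

/* Install a user trap in the first slot; a null pointer restores the default. */
void set_trap(vec_fn trap) { exit_vector[0] = trap ? trap : null_trap; }

/* Call each entry in turn, for as long as entries keep returning. */
static void run_vector(vec_fn *v)
{
    for (;;) {
        vec_fn f = *v++;
        assert(f != 0);  /* zero means warm boot in the real runtime */
        f();
    }
}

/* Store the code, walk the vector, and report how much bookkeeping the
   entries did before control left the vector. */
int sketch_exit(int code)
{
    saved_code = code;
    calls_made = 0;
    if (setjmp(os_return) == 0)
        run_vector(exit_vector);
    return calls_made;
}
```

With the default entries, sketch_exit() reports four calls. After set_trap(sample_trap) it reports thirteen, because the sample trap adds ten instead of one, and set_trap(0) puts the null routine back.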
By default, the first routine on the exit vector is a user exit trap, which is a null routine - a simple "ret" instruction. A call to the CLIB function setxtrap(addr) inserts addr, which should be a pointer to an exit trap routine, into this word, and setxtrap(0) restores the pointer to the null routine. The reason it's first is straightforward: the user's exit trap routine might be coded to abort the exit under certain circumstances, in which case you wouldn't want to have done any other exit processing. One hopes that the exit trap routine allows SOME circumstance that completes exit processing, of course, since otherwise there would be no way to terminate the program.

The purpose of the user exit trap is to allow you to perform any cleanup you need prior to terminating the program, even if it was being exited from a library routine that isn't really under your control, and to selectively abort exits if you so desire. Executing function exabrt() in your exit trap will quietly and cleanly abort the exit (continue the program from the point exit() was called). It's up to you to make sure that there are no side effects to using this - most code isn't written expecting a return from a call to exit(). As far as the library exits go, I feel you can safely abort ^C exits, but no others. There are very few, so that shouldn't present a problem.

A C #include file called SYSEXITS.H has been provided, with #defines that give the values of exit() codes used by routines in CLIB. These are all unique, so that you can determine the context of the exit. For example, there are two CLIB routines that respond to a ^C at the keyboard by calling exit() - Uconin() and poll(). Each has a distinct code in SYSEXITS.H, so that if you wish, you could abort all exits due to ^C, or just those from one but not the other, or you could allow them, but perform some cleanup prior to doing so.
(Incidentally, the standard exit code handler supplied in CLIB and invoked AFTER this, equalizes the two, so that for zcpr "if" purposes there is only one code for user aborts - 255).

The convention I've sort of adopted is that a 0 exit code means no error, 1-127 are for programs to use as they see fit, and 128-255 (actually -1 to -128 if defined as character due to sign extension) are reserved for runtime system codes. That should be enough for everybody concerned, with room to spare, and should help avoid confusion.

The user exit trap vector is followed by an exit code interpreter. The CLIB exit code interpreter simply kicks out a message to the console for the exits defined in SYSEXITS.H. If you linked with CZVLIB preceding CLIB, the error code interpreter will do the same, and if it's running in a zcpr environment, will also set the program error flag to the value of the code passed to exit(). (Note that the zcpr program error flag is one byte, which is why there are only 255 codes described above).

The exit code interpreter is followed by an i/o subsystem exit routine, which consists of closing all unclosed files.

These are finally followed by the return to system vector (which, as its description implies, doesn't return to exit()). This points to a routine that jumps to the address that Uentry() took from the stack as a system return address when the program started up. If, at any time during execution of your program, some condition arises that warrants a warm boot (such as a call to mrelease() the CCP for use as free core by the memory management routines), you should call the CLIB function setwbt(), which will zero this word, causing a warm boot at exit(). It is often a good idea to setwbt() while debugging programs still under development.

A little thought will show just how flexible this scheme is. There are likely more possibilities here than anybody, or at least most folks including myself, will probably fully utilize.
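The exit code convention given earlier in this section (0, 1-127, 128-255) and the remark about sign extension can be illustrated with a short sketch in standard C. The enum and function names are invented for the example and do not come from SYSEXITS.H or CLIB.

```c
#include <assert.h>

/* Hypothetical labels for the three ranges of the convention. */
enum exit_class { EXIT_CLASS_OK, EXIT_CLASS_PROGRAM, EXIT_CLASS_RUNTIME };

/* Classify a code by the single byte that would end up in a one-byte
   error flag: 0 = no error, 1-127 = program's own, 128-255 = runtime. */
enum exit_class classify_exit(int code)
{
    unsigned byte = (unsigned)code & 0xFFu;
    if (byte == 0)
        return EXIT_CLASS_OK;
    if (byte <= 127)
        return EXIT_CLASS_PROGRAM;
    return EXIT_CLASS_RUNTIME;
}

/* The value seen when that byte is read back as a signed character:
   128-255 come out as -128 to -1. */
int as_signed_char(int code)
{
    int byte = (int)((unsigned)code & 0xFFu);
    return byte >= 128 ? byte - 256 : byte;
}
```

In this sketch a code of 255 and a code of -1 land in the same runtime class, which is the sign extension effect mentioned in the convention.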
The scheme makes many things that would otherwise be a pain quite simple, in particular the tight coupling of code packages to the runtime support, including opportunities for self-initialization and exit housekeeping. One could, for instance, if one needed to use the SYSLIB routines for some reason, set up a set of C callable front ends for them, an initialization and de-initialization routine, and a proper configuration header include file, then be off and running (assuming, of course that there are no name clashes with the standard library at link time, which I'm almost sure there are). Likewise, if you were writing code for some Z80 based controller board, you could write your own runtime support package and corresponding header, even still using parts of CLIB as apropos, develop and compile controller code on your system, then move it to the controller, essentially constructing a cross-compiler for the other environment. The possibilities are limited mainly by the imagination.

---===*===---

Using CZVLIB

CZVLIB contains zcpr dependent functions. One rule to observe with it (or any modules to be found outside CLIB itself) is that you must link everything you'll need PRIOR to searching CLIB for its modules. That's because CLIB contains the Upgmend module, which the runtime uses to find the end of the program, and hence must be linked last.

Given use of the default configuration header or a similar one, and that the program is linked with CZVLIB being searched before CLIB, the Usysinit() routine in the entry vector will be taken from CZVLIB rather than CLIB, and certain things in the runtime o/s interface initialization will be different. These will be described here.

In general, Usysinit() and the exit vector based routines in CZVLIB will check to see that they are running in a zcpr environment before executing zcpr dependent code.
However, this is not true of the other routines in CZVLIB like the TCAP-based console output functions and so forth in most cases, because it would have bloated and slowed down the code too much. Therefore, the rule of thumb is that code that gets executed automatically before or after your code has control will check to see if it is in a zcpr environment, and won't perform any zcpr based code if not. However, once control is passed to your code, you should explicitly check in the beginning if the environment is zcpr if you need it, and perform some appropriate action such as terminating with a message if not.

You can do this by calling Zenvck(), which returns 0 if the environment pointer in the Z3ENV header doesn't point to a valid environment descriptor and returns a pointer to the env if it does. Zenvck() also sets or resets the zero flag based on the value it's returning to facilitate simpler calls from assembler. This test will of course fail if the program is not installed for your system.

Note that you can do different initialization processing (and thus modify anything described here) by either supplying your own Usysinit() prior to searching either of the libs or by modifying the configuration header to call a different name and supplying your routine with that name. The latter is probably better, since then you can also call Usysinit() from within it, and simply do additional initialization of your own. That way you'll know any setup the library routines need has been performed and you won't have to worry about doing it yourself.

Some other differences:

Argv[0] for programs linked to run under zcpr is taken from the external fcb, and as such, reflects the name the program was actually invoked under. Programs linked to run under cp/m simply have 'CMDNAME' for argv[0]. So will programs linked with CZVLIB, but run under vanilla cp/m.

The code passed to exit(code) will be placed in the zcpr program error flag. Zero means "no error".
This brings up the point that you should always pass a code to exit() as a parameter, even if it is only zero, since exit() will take whatever is on the stack as a code, and if it isn't, you may be turning on the o/s program error flag when you don't intend to.

See the CZVLIB docs for more about currently available functions.

---===*===---

The new memory management scheme

One of the major deficiencies of Small-C was its lack of a flexible approach to memory management. The scheme it had, while adequate for many purposes, definitely made it difficult to use complex dynamic memory structures like trees. What was needed was a flexible approach to dynamic use of ram, without regard to the order that it was allocated and freed in. This led to a complete redesign of the memory management routines in the library.

The old scheme is still available as a separate linkable module for backward compatibility and situations where the couple hundred bytes are more crucial than the extended functions. It can be accessed by simply linking SMALLALC.REL to your code prior to searching and linking CLIB.

The new scheme is both far more flexible and presents some very interesting possibilities, I think. Some of its features are:

- Allocations and deallocations can be performed in a completely arbitrary order. The currently free core is kept in a linked list, so later allocations are not lost in the process of freeing earlier ones.

- The free list is kept as defragmented as possible, i.e. as allocated ram is freed, adjacent nodes in the free list are merged to keep the available memory chunks as large as possible.

- The runtime stack now grows downward from the heap, so that it is easier to make sure it isn't corrupting anything, and so that it will be possible to do things like write RSX loaders in SmallC/Z.
- Allocations in the heap are performed downwards from the top, so that the largest piece of free core tends to stay just above the stack, which allows relocation of the stack upwards as well as downwards for purposes of adjusting the relative sizes of the two. Note: the routines for resizing the stack vs. heap aren't yet written, but should be simple, and will happen soon. Additional functions like checking if the stack has overrun its bounds, and one additional system pointer that will make it possible to independently relocate either or both anyplace are planned too.

- The functions xalloc(), xfree(), and xainit() now provide the ability to use the same basic memory management primitives temporarily within any allocated block (as long as it is large enough to contain the several words of control information necessary), so that sporadic very small allocations can be specifically taken from one area without fragmenting the overall core space. This "in block" memory management has the full functionality of the overall scheme (free nodes kept in a list, defragmentation on the fly, etc). This is highly recommended for such small requests that are made at arbitrary points during the program's execution, because otherwise a best fit allocation scheme will tend to fragment your core.

- Since the free list is now available, an mrelease() function can now be provided to release the space taken by code used only once in programs for consideration by the memory management routines to satisfy requests. Similarly, since the CCP is not overwritten now by default, mrelease() can be used to add it to the free memory pool. (Be sure to execute a setwbt() function if you do this, so that the program warm boots at exit, instead of executing a return to your data areas...)

---===*===---

Basic memory layout

There is a stack, used for function call frames and parameters, and a heap, used for dynamic allocation.
The heap takes the form of a singly linked list ABOVE the stack, with free nodes occurring in descending order by address. The first two bytes of each node are an unsigned size of the node itself, and if the node is in the free list, this is followed by a two byte pointer to the next node, or zero. If the node is currently allocated, the size bytes are still there, and the pointer returned to the user points to the byte following the size bytes. When the node is freed, the size is used to merge it with any adjoining nodes, and the resulting node is linked into the list (if not already in it as an effect of merging adjacent nodes.) Granted, there is some space overhead involved here, but the flexibility afforded is worthwhile in cases where complex data structures are being built dynamically.

The algorithm for allocation is to search for a best fit, highest address node, then to allocate the minimum fragment of the node necessary. Since no node smaller than four bytes can exist in the free list (two for size, two for pointer to next), this means that if the remainder of the node is three bytes or less, the entire node is allocated. This will only occur if the node was created using the mrelease() function or in the case where the entire remaining core will not be enough to constitute a free node in itself. These are the only situations that would lead to this, since the normal low level allocate function always rounds the node size up to the next multiple of four in anticipation of memory fragmentation.

In general, for any request of "n" bytes, the actual memory usage if granted is:

    (n+2) + ((4 - ((n+2) mod 4)) mod 4)

that is, n+2 rounded up to the next multiple of four, unless the free core remaining in the node after the above would be three bytes or less, in which case the entire node is allocated.

Stack and heap are separated by a pointer to a "fence" location. The stack itself bottoms out at the top of the program, and the top of the heap is set to just below the CCP or an RSX if one or more are present.
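
The allocation rule described above (best fit, highest address, whole node taken when the leftover is too small) can be sketched in a few lines of standard C. This is only an illustration and not the library's code: the struct free_node type and the functions node_size(), best_fit() and bytes_taken() are hypothetical names made up for the example, the real nodes live in raw memory as a two byte size followed by a two byte pointer, and the sketch assumes the node size is n+2 rounded up to the next multiple of four, as the text states.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one free-list node (not the library's layout). */
struct free_node {
    unsigned size;            /* total node size in bytes, size word included */
    unsigned addr;            /* node address, for illustration only */
    struct free_node *next;   /* next free node (lower address), or NULL */
};

/* Bytes needed for a request of n bytes: two size bytes plus the payload,
   rounded up to the next multiple of four (assumption taken from the text). */
unsigned node_size(unsigned n)
{
    return (n + 2u + 3u) & ~3u;
}

/* Best fit: the smallest free node that is large enough. The list runs from
   high to low addresses, so keeping the first of several equal-sized
   candidates picks the highest address. Returns NULL if nothing fits. */
struct free_node *best_fit(struct free_node *head, unsigned n)
{
    unsigned need = node_size(n);
    struct free_node *best = NULL;
    struct free_node *p;

    for (p = head; p != NULL; p = p->next) {
        if (p->size >= need && (best == NULL || p->size < best->size))
            best = p;
    }
    return best;
}

/* Bytes actually taken from the chosen node: if the leftover could not hold
   a free node of its own (less than four bytes), the whole node is used. */
unsigned bytes_taken(const struct free_node *node, unsigned n)
{
    unsigned need = node_size(n);

    assert(node->size >= need);   /* caller must pass a node that fits */
    return (node->size - need < 4u) ? node->size : need;
}
```

Under these assumptions a request for 5 bytes needs an 8 byte node. Taken from a 12 byte free node it leaves a 4 byte free node behind, while taken from a 10 byte node (one created by mrelease(), say) it uses up all 10 bytes.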
SmallC/Z avoids overwriting the CCP so that it can terminate by executing a return rather than a warm boot. The setwbt() function is provided to "set warm boot on" if the programmer wishes to do so, which could be necessitated by a variety of things, including mrelease()'ing the CCP itself.

-----------------------------------------------------------------------------
Umemtop ------> CCP or RSX base -------------------------------------
Uheaptop -----> Top of heap  |  space for the heap to live. initially half
            ^                |  of free core.
            |                |
Ufrechn ------>              |  pointer to free node list
            |                |  this is initialized to one node the size of
            v                |  free core, then varies as memory fragments.
                             |  the last node is free core above fence.
                             |
                             v
Uheapbot --/--> fence address --------------------------------------
Ustktop --/                  ^  The stack grows down from here, and the heap
                             |  is placed above it.
                             |
Ustkbot ------> Bottom of stack
Umembot ------> Top of program -------------------------------------
Upgmtop:                     |
                             |
Uloadpt:        Bottom of program ----------------------------------
-----------------------------------------------------------------------------

Figure: A rough layout of the memory structure and its associated pointers in the default setup. Customization is possible for unusual situations.

Both the stack and heap have separate sets of top and bottom pointers so that they can be independently placed anywhere in core. By default they occupy the area between program top and either the bottom of the CCP or the lowest memory protected by an RCP, with the heap above the stack and the core split evenly between the two. However, each can be relocated independently of the other.

In general, one wouldn't want to relocate the heap once it is in use, since the program has pointers into it. One could do it through the entry vector routine, however, since it executes before almost anything else. It's fine to relocate the stack any time, if one is sensible about it.
-----------------------------------------------------------------------------

Initialization of the above scheme is done in runtime routine Umeminit(), which is called by Uentry after it determines and stores the top and bottom of usable core. The Umeminit() routine is responsible for any initialization of its associated memory management scheme, and in particular, it MUST set the stack pointer. The older scheme has its own Umeminit() routine which initializes that system more according to the original Small C's style.

Module UPGMTOP contains the global tag Upgmtop:, and should be the last routine linked, i.e. CLIB should be the last searched library.

---===*===---

Installing the compiler

The compiler should be usable as is on most systems. However, you may want to change some modifiable defaults. You can do so with an appropriate patching tool. Once the compiler has stabilized, I'll release the source, and you'll also have the option of recompiling the compiler.

Current patch point(s):

011dh   offset into header include string affected by -z flag

        This is a two byte value, low order byte first. It is an offset (C array index), so the first byte is zero, etc. Default is 0010h (decimal 16), stored as the bytes 10h 00h.

011fh   header include string.

        This is written to the beginning of assembler source files output by SmallC/Z that contain a main() function, and is affected by the above patch. Default is "$INCLUDE SYSCCFG1.CCZ". The character affected by -z# is "1". You have 25 bytes, including the necessary trailing null.

---===*===---

Compiler and library version numbers

The compiler and libraries "bury" their version numbers in programs you compile and link with them, so that you can determine the versions of the compiler and libraries used to create any program. The version numbers are preceded by eye-catchers so you can find them in binary executables, and are given global tags and terminated by nulls so you can refer to the version number strings in program code if you like.
For any *.COM file created with SmallC/Z, just go into a patch utility and do repeated searches for the string 'SCZ'. Some hits may be spurious, but some will count. The version number is an ascii string of the form "#.#", possibly followed by some character, then a null terminator, so it's displayable from within the program with standard library functions.

                   eye-catcher    global tag    current version
                   -----------    ----------    ---------------
Compiler version   'SCZ'          Usczver:      0.1b
CLIB version       'SCZCLB'       Uclibver:     0.1
CZVLIB version     'SCZCZV'       Uczvlver:     0.1
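
If you would rather not hunt through a patch utility by hand, the same search is easy to sketch in standard C. The function below is a hypothetical helper, not part of CLIB or CZVLIB. It assumes the version string sits immediately after its eye-catcher and begins with a digit (the "#.#" form described above), which is how it skips spurious hits such as the 'SCZ' at the front of 'SCZCLB'. Reading the *.COM file into the buffer is left to the caller.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: scan a program image for an eye-catcher and return a
   pointer to the null terminated version string that follows it, or NULL if
   no plausible hit is found. */
const char *find_version(const unsigned char *buf, size_t len, const char *tag)
{
    size_t tlen;
    size_t i;

    assert(buf != NULL && tag != NULL);
    tlen = strlen(tag);
    if (tlen == 0 || len < tlen)
        return NULL;

    for (i = 0; i + tlen <= len; i++) {
        if (memcmp(buf + i, tag, tlen) == 0) {
            const unsigned char *ver = buf + i + tlen;
            size_t rest = len - (i + tlen);

            /* Accept the hit only if a digit follows the eye-catcher and a
               null terminator exists inside the buffer; otherwise treat it
               as spurious and keep searching. */
            if (rest > 0 && isdigit(ver[0]) && memchr(ver, '\0', rest) != NULL)
                return (const char *)ver;
        }
    }
    return NULL;
}
```

Call it once per eye-catcher from the table above ("SCZ", "SCZCLB", "SCZCZV") and print whatever comes back. A NULL result just means that eye-catcher was not found in the image in the form this sketch assumes.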