Text File | 1997-06-14 | 42.0 KB | 1,267 lines |
- 1.1) General introduction
- 1.2) Liability and authorship
- 1.3) General setup issues
- 1.4) Command line switches
- 1.5) Credits
- 2.1) ANSI compatibility
- 2.2) Missing features
- 2.3) C++ style features
- 2.4) Known bugs
- 3.1) Standard run-time libraries.
- 3.2) DOS libraries
- 4.1) Implementation-dependent keywords
- 4.2) Implementation-dependent preprocessor functions
- 4.3) Phi-text compatibility
- 5.1) Errors
- 6.1) Stack frames
- 6.2) ASM interface
- 6.3) Segmentation
- 6.4) Optimizations
- 7.1) Description of directory tree
- 7.2) Porting
-
- 1.1) General introduction
-
- CC386 is a generic 386 DOS C compiler. Every effort has been made to
- have it recognize standard ANSI syntax; however, it should NOT be
- expected to produce code conforming to the ANSI standards, especially
- in regard to floating point. CC386 outputs assembly language code
- suitable for TASM and NASM, and it may also work with MASM.
-
- This package includes various support programs and libraries required
- to build code that will run under an MSDOS DPMI server. TRAN's PMODE
- is used for the server, so programs generated by this package will
- still work even if no memory management software is loaded.
-
- You need TASM and TLINK to use this package. You can get by with
- WLINK; however, TASM is still a requirement. The compiler itself will
- generate NASM-compatible code; however, this package is not sufficient
- to have that code actually run under DOS.
-
- This package consists of the compiler, some Borland DPMI stubs
- currently needed for the compiler, run-time libraries for DOS, and
- header files. Separate packages have the source for the compiler and
- the run-time libraries.
-
- Additionally, the package includes a program 'CL386' which will
- call the compiler, TASM, and TLINK to create programs for you.
-
- Warning: you must have a version of TASM earlier than version 4.1;
- TASM 4.1 has some bugs. The code was tested with 4.0; it will probably
- work with 3.0 and maybe even 2.0.
-
- Part of the run-time libraries is a debug-style debugger that can be
- linked into the code. Note that this debugger will NOT work inside
- a win95 DOS box; however the rest of the package will.
-
- New features and bug fixes:
-
- revamp of floating point to make it at least work a little. Now that I
- have a coprocessor :).
-
- exception handling that works, including a floating point exception
- handler that will catch it if you use floating point when there is
- no coprocessor.
-
- revamp of all the implicit cast operations to make them work properly
-
- fix to code generation for pointers to functions
-
- fix to the bit-size variables used in structures
-
- inline assembler recognizes ALL 486 opcodes (32-bit addressing modes only)
-
- various fixes to the debugger
-
- addition of SPAWN functions to the run-time library
-
-
- This program IS capable of compiling itself and having the image run;
- I do not distribute this latter image because it will not run on a 386
- when there is no FP coprocessor.
-
- 1.2) Liability and authorship
-
- This compiler is presented on an 'as-is' basis without any guarantee of
- usability or fitness for any given application. Risks associated with
- using it, including financial loss or loss of life, are not the
- responsibility of the authors. However, this compiler is intended as
- an educational tool, and is not to be used commercially in any case
- without the express written consent of the authors.
-
- The original author is Matthew Brandt. As he left it, it was a K&R
- style compiler with no floating point and minimal preprocessor support,
- targeted only for the m68k. Much of the work was done on a Unix
- machine and later ported to DOS. You can find his version on one of
- the Motorola file sites if you wish to compare. The current version
- has been updated extensively to support a variety of ANSI constructs as
- well as i386 support and a better preprocessor. However, parts of
- the program still reflect Matthew's work.
-
- I have done my part of the coding with 16 and 32 bit MSDOS compilers.
- This version of the code has NO DOS-specific features in it and should
- be portable to any 16 or 32 bit ANSI compiler.
-
-
- 1.3) General setup issues
-
- To install the version which creates DOS executables, run the install.bat
- file. You need to give it a destination drive:
-
- install C:
-
- will install the necessary files on drive C:. A directory tree will
- be made under \CC386 and all necessary files will be copied there.
- Read the file INTRO.DOS for a brief overview of how to create programs
- for DOS.
-
-
- The only thing really needed for the compiler to work is a pointer
- to the include directories. These may be specified in the environment
- variable 'CCINCL' or with the /I command line switch. I normally set
- CCINCL=\cc386\include to get at the ANSI headers and then use /I if any
- other directories are required. If you use the install.bat program
- CL386 will take care of the include and library issues via its configuration
- file and there is no other setup required other than to run the install
- program.
-
- 1.4) Command line switches
-
- Switches prefixed with a '+' or '-' may be turned on or off. The
- last occurrence of the switch determines the state. For these switches
- '/' is equivalent to '-'. Note that codegen parameters must generally
- be the same for all modules in a program, or unpredictable results will
- occur.
-
-
- +e - make error file
-
- +i - make preprocessed file
- default is -i
-
- /ffile - process arguments in file 'file'
- +l - make LST file
- default is -l
- /w-all - no warnings, errors only
- warnings may also be suppressed individually. See ERROR.DOC
- -A - disable ANSI compatibility and enable some non-standard features
- default is +A
- /C - codegen params
- /C+d - display internal diagnostics
- /C-b - no BSS
- /C-l - don't put C source in the ASM file
- /C-m - don't mangle symbols with a leading underscore
- /C+p - pack variables for space. On a 68020+ minimize at word
- alignment
- /C+r - reverse order of bit ops
- /C+F - (386) force the TASM .MODEL directive to use FLAT mode
- This may become the default in some future release.
- /C+N - generate NASM code
- /C-R - use stack pointer rather than link registers
-
- default is /C+blmR-prFN
- /Dxxx - define a macro 'xxx'
- /E## - max number of errors to generate
- /Idirs - specify include directories. Use a semicolon to separate multiple
- directory specifications. The directories specified by the
- environment variable "CCINCL" are always searched first.
- /O - Optimizer params
- /O-Rxxx Turn off register optimizations. In place of the xxx
- put any combination of:
-
- a - turn off address register optimizations
- f - turn off floating point register optimizations
- d - turn off data register optimizations
-
- default is all register optimizations enabled
-
- +S - reserved, no use (yet)
- default is -S
-
- Compiler will look for the symbol CC386 (or CC68K) in the environment.
- If it finds it, it will evaluate any command line arguments in it
- prior to evaluating the command line. Note that command line
- parameters will override the environment variable; in particular
- specifying a search path both in the environment var and on the command
- line will result in loss of the search-path environment. There is an
- alternate environment variable CCINCL which specifies include paths
- which will be appended to the command line specification.
-
- 1.5) Credits
-
- The following people contributed source code to this program.
-
- Matthew Brandt: original K&R C compiler
- Thomas Pytel (TRAN): DPMI extender for DOS
- Kirill Joss: CL386 compiler shell
- David Lindauer: Ansification, preprocessor, run-time libraries,
- 386 code gen, miscellaneous enhancements to original compiler
-
-
- Many people were instrumental in locating bugs; I'd like to acknowledge
- two who were especially helpful with *lots* of testing:
-
- Johann Klockars
- Kirill Joss
-
- And thanks to David Gurevich and Kirill Joss for helpful suggestions on
- packaging.
-
- 2.1) ANSI compatibility
-
- This compiler is meant to be ANSI compatible at the source level. However,
- I have never seen the ANSI documentation for what that means; if
- you find something it doesn't do, let me know!
-
- However, there is no guarantee that code generated will meet ANSI
- runtime requirements in terms of evaluation ordering, especially with
- casts. Floating point is done using the host coprocessor, and is NOT
- adjusted for ANSI/IEEE compatibility.
-
- The run-time library is designed to act like an ANSI library; however,
- the internals are most likely somewhat different, especially since I
- did away with static buffers where possible.
-
-
- 2.2) Missing features
-
- The following are known to be missing:
-
- a) libraries don't handle any kind of floating point
- b) expressions of the form:
- (T)
-
- are not handled correctly when T is a typedef
-
- 2.3) C++ style features
-
- The C compiler has some rudimentary C++ support. It recognizes:
-
- 1) Overloaded functions (but not the overload keyword)
- 2) Variable declarations anywhere
- 3) Reference variables
- 4) Function parameter defaults
- 5) Stricter type checking
- 6) Improved init of static pointers and reference variables
- 7) Detailed C++ error messages
-
- Classes and C++ keywords aren't yet supported.
-
- To enable these features, use the extension .CPP on your input file.
-
- 2.4) Known bugs
-
- The following known bugs exist:
-
- a) Expression evaluation is recursive. With a 4K compiler stack
- the limit is approximately:
-
- a = (b()+(c()+(d()+(e()+f()))));
-
- Beyond this, unpredictable results will occur. Raise the stack limit
- or rearrange the expression with higher order parentheses to the left.
- Notice this would not be a problem without the grouping parentheses
- because the compiler wouldn't have to maintain so many contexts. I
- compile the compiler with a 20K stack.
-
- b) Floating point may or may not work. A floating point library will
- be added later and this will be checked out
-
- d) Expressions such as:
-
- a = b = c;
-
- may not return the correct value to anything other than the rightmost
- assignment. In general it will work, but if there are multiple
- implicit casts going on from one assignment to the next it may not work
- correctly.
-
- e) % may not work properly for signed divisions. The sign may be
- wrong but the value will be correct. This may or may not be a problem,
- I haven't analyzed it.
-
- f) long and unsigned constants will not be optimized or evaluated
- correctly when there are two or more of them in an expression (type may
- not propagate)
-
- g) The identifier 'pascal' is a standard keyword rather than something
- a user may redefine.
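-
- As a rough illustration of how items a), d) and e) can be sidestepped
- in source code, here is a minimal sketch. The function names and the
- term_b() .. term_f() stand-ins are hypothetical, and the remainder
- helper assumes that the result you want carries the sign of the
- dividend. Whether each rewrite actually avoids the bug on a given
- build has not been verified here.

```c
/* Hypothetical stand-ins for the calls b() .. f() in item a). */
static int term_b(void) { return 1; }
static int term_c(void) { return 2; }
static int term_d(void) { return 3; }
static int term_e(void) { return 4; }
static int term_f(void) { return 5; }

/* Item a): build the nested sum one level at a time so that no single
   expression needs deep recursion in the compiler. */
int flat_sum(void)
{
    int t = term_e() + term_f();
    t = term_d() + t;
    t = term_c() + t;
    return term_b() + t;
}

/* Item d): split "a = b = c" into separate statements with explicit
   casts, so each conversion stands on its own. */
void assign_chain(long c, short *b, long *a)
{
    *b = (short)c;
    *a = (long)*b;   /* a receives the converted value of b */
}

/* Item e): remainder computed on magnitudes, then given the sign of
   the dividend. The caller must make sure b is not zero. */
long rem_dividend_sign(long a, long b)
{
    unsigned long ua = (a < 0) ? 0UL - (unsigned long)a : (unsigned long)a;
    unsigned long ub = (b < 0) ? 0UL - (unsigned long)b : (unsigned long)b;
    unsigned long r = ua % ub;

    return (a < 0) ? -(long)r : (long)r;
}
```

- With these helpers, rem_dividend_sign(-7, 3) yields -1 and
- rem_dividend_sign(7, -3) yields 1.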
-
- 3.1) Standard run-time libraries.
-
- The libraries were implemented according to 'The Waite Group's Essential
- Guide to ANSI C', ISBN 0-672-22673-1. My copy is circa 1989.
-
- Floating point library functions are not supported at this time, as I
- have no way to test them. This includes things like atof and difftime
- as well as most of the math libraries.
-
- All the functions in this book were implemented except floating point.
- However, process control stuff is kind of sketchy at this time. Most
- of the IO library, part of the time library, and the malloc library
- functions require operating system support. Documentation for this
- is provided with the run-time library sources; however this
- package contains sufficient code to use DPMI as the operating system.
-
- There may be a variety of cases where things don't work as expected.
- For example scanf will only read one line no matter what... when a
- function such as strftime requires a buffer length to be given the
- results are undefined if the text length exceeds the buffer length.
- Also, I just found out that opening a file with the 'a' attribute is
- supposed to override any attempt to set the position for write in the
- file... in this implementation all it does is position to the end of
- the file at open time.
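-
- Code that has to behave the same with this library and with others can
- be written defensively around the two quirks above. The following is
- only a sketch; format_date and append_line are hypothetical helper
- names, and the date format is an arbitrary choice.

```c
#include <stdio.h>
#include <time.h>

/* Refuse to call strftime unless the buffer is known to be large
   enough for "YYYY-MM-DD" plus the terminating NUL. */
size_t format_date(char *buf, size_t size, const struct tm *t)
{
    if (size < 11)
        return 0;
    return strftime(buf, size, "%Y-%m-%d", t);
}

/* Seek to the end before every write instead of relying on the 'a'
   open mode to do it. Returns 0 on success, -1 on failure. */
int append_line(FILE *fp, const char *text)
{
    if (fseek(fp, 0L, SEEK_END) != 0)
        return -1;
    if (fputs(text, fp) == EOF)
        return -1;
    return 0;
}
```

- Sizing the buffer check by hand only works for fixed-width formats
- like this one; formats with month or day names need a more generous
- bound.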
-
- The libraries were originally designed in a reentrant fashion; however
- this breaks much standard code and the version of the libraries
- included here has static buffers where called for.
-
- ERRNO isn't supported at this time.
-
- Many of the library functions depend on having the startup/rundown code
- included. This code initializes a few global variables and executes
- any startup/rundown functions the libraries need for initialization
- and cleanup.
-
- To use the libraries two files must be included in your link. First is
- the startup module (c0dos or c0dosd) and second is the library itself (cldos).
- An example build if _main is defined in q.c:
-
- cc386 q.c
- tasm /ml /m2 q.asm
- tlink c0dos q,q,q,cldos
-
- will build q.exe. Note that the startup module MUST be the first object
- module specified as it defines the segmentation setup required.
-
- Two startup modules are provided; c0dos is a standard C startup module.
- c0dosd is the same module but it will draw in a debugger from the library
- (approximately 16K) and call it rather than execute your code. The debugger
- is somewhat similar to DEBUG. When the debugger starts up the EIP and
- registers will be set to the values they would have at the beginning
- of your _main function. The debugger traps several exceptions including
- int 3 so you can put int 3 in your code at places you want to debug.
- Warning! The debugger will NOT work in a DOS box in Windows 95, as I
- could not get appropriate access to exceptions.
-
- The startup modules use TRAN's PMODE to manage pmode resources. I
- use version 3.07... I had to modify the class names in his segment
- declarations to make them different from the 32-bit code segments but other
- than that they are his release. I have included the sources as per his
- licensing in msdos\pmode307.
-
- I manage several exceptions; traps 6, 13, and 14 are all routed through
- the signal-handling code. By default they print a general protection
- fault message and exit, but you can trap them using the signal mechanism
- if you want, unless you are in a DOS box. Likewise, traps 7, 8, and 16
- relating to floating point are routed through the signal handling code
- unless you are in a DOS box. The default signal handling code just
- prints a message and jumps to the program exit point.
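-
- Trapping these faults from a program would look roughly like the
- sketch below, which uses only the standard signal() interface. Which
- signal number each trap is delivered as is not stated in this document;
- the mapping in the comments (SIGILL for trap 6, SIGSEGV for traps 13
- and 14, SIGFPE for the floating point traps) is an assumption, and the
- function names are hypothetical.

```c
#include <signal.h>

static volatile sig_atomic_t fault_seen = 0;

/* Record that a fault happened; a real handler would usually clean up
   and leave the program. */
static void on_fault(int sig)
{
    (void)sig;
    fault_seen = 1;
}

/* Install one handler for the assumed trap-to-signal mapping.
   Returns 0 on success, -1 if any signal() call fails. */
int install_fault_handlers(void)
{
    if (signal(SIGILL, on_fault) == SIG_ERR)    /* assumed: trap 6 */
        return -1;
    if (signal(SIGSEGV, on_fault) == SIG_ERR)   /* assumed: traps 13, 14 */
        return -1;
    if (signal(SIGFPE, on_fault) == SIG_ERR)    /* assumed: traps 7, 8, 16 */
        return -1;
    return 0;
}

int fault_was_seen(void)
{
    return fault_seen != 0;
}
```

- Call install_fault_handlers() early in main; the handler can be
- exercised without a real fault by calling raise(SIGFPE).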
-
- I also manage ctrl-c interrupt from DOS (but not ctrl-brk from BIOS, that
- is handled via the DOS interrupt) and exit the program cleanly if ctrl-c
- is pressed. Note that ctrl-c will even exit the debugger! I should
- probably fix that...
-
- 4.1) Implementation-dependent keywords
-
- a) The following implementation-dependent keywords have been added
-
- i386           use
- _interrupt     Generate a function which may be used
-                as a trap/interrupt.
- _genbyte       Generate data in the code segment.
- _absolute      Allocate a global variable at
-                an absolute address. Such variables
-                will be directly addressable.
- pascal         Force the function declaration to use
-                pascal calling conventions.
-
- b) The following implementation variables have been added. These
- variables directly access the assembly language registers they name.
- Note they should be used with caution and may change periodically at
- the compiler's discretion. Also, casts of them or assignments to
- them may change the machine state functionally... and wreck the
- code the compiler has generated.
-
- i386
- _EAX _EBX _ECX _EDX _ESP _EBP _ESI _EDI
-
- The 386 compiler also knows the keyword 'asm', which is an escape to
- allow inline assembly. The syntax is:
-
- asm my_instruction;
-
- or
-
- asm {
- my list of instructions;
- }
-
- The compiler catches most errors in inline assembly code at compile
- time. It will also translate the names of local variables into
- proper stack-based addressing modes.
-
- 4.2) Implementation-dependent preprocessor functions
-
- The preprocessor is more or less ANSI compatible.
-
- The following #pragma statements are supported:
-
- #pragma regopt xxx
- now obsolete
-
- #pragma startup xxx #
- xxx may be any function name
- # may be a priority value from 20 - 90 (other values are used
- by the run-time library) Higher priority functions get run
- first.
-
- This option tells the compiler to inform the startup routines
- that this function should be run prior to calling main.
-
- #pragma rundown xxx #
- xxx may be any function name
- # may be an integer value from 20-90 (other values are used by
- the run-time library). Higher priority functions get run
- first.
-
- This option tells the compiler to inform the startup routines
- that this function should be run after main exits.
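-
- A source file using both pragmas might look like the sketch below.
- The function names and the priority value 50 are arbitrary choices
- within the documented 20 - 90 range. The #ifdef guard on the _i386_
- macro is an addition of this example so that other compilers skip the
- pragmas; it is not required by this compiler.

```c
#include <stdio.h>

static int table_ready = 0;

/* Meant to run before main() when registered as a startup function. */
void init_tables(void)
{
    table_ready = 1;
}

/* Meant to run after main() exits when registered as a rundown function. */
void flush_log(void)
{
    puts("shutting down");
}

int tables_ready(void)
{
    return table_ready;
}

/* Both functions are declared above, so the pragmas can name them. */
#ifdef _i386_
#pragma startup init_tables 50
#pragma rundown flush_log 50
#endif
```

- On a compiler that ignores the pragmas, call init_tables() and
- flush_log() explicitly from main to get the same effect.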
-
- The following macros are predefined:
-
- _i386_ (386 only) compiler is generating 386 code
- _m68k_ (68k only) compiler is generating 68K code
-
- __cplusplus if the compiler is allowing C++ extensions
- __FILE__ the file name of the source file as a string
- __DATE__ The date as a string
- __TIME__ The time as a string
- __LINE__ the line number as a number
-
- #if macros can use defined(xxx) to determine if a macro is defined.
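-
- The predefined macros can be combined with defined() as in the sketch
- below. target_name, report_target and the TRACE macro are hypothetical
- names used only for illustration; nested #if/#else is used instead of
- #elif as a conservative choice.

```c
#include <stdio.h>

/* Print a message tagged with the source file and line number. */
#define TRACE(msg) printf("%s(%d): %s\n", __FILE__, __LINE__, (msg))

/* Report which code generator is active, based on the predefined
   target macros. */
const char *target_name(void)
{
#if defined(_i386_)
    return "386";
#else
#if defined(_m68k_)
    return "68K";
#else
    return "unknown";
#endif
#endif
}

void report_target(void)
{
    TRACE(target_name());
}
```

- On a compiler that defines neither target macro, target_name()
- returns "unknown".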
-
- 4.3) Phi-text compatibility
-
- This compiler is capable of understanding 'phi-text' which is an
- extended text-based character set. It is somewhat preferable to
- UNICODE for western programmers as it does not encompass thousands of
- characters that are little used by main-stream westerners.
-
- Phi text is a banked character set. Each character in its full
- form is 32-bits; this encompasses the following information:
-
- cwb: a number from 32 to 127 describing the character
- bank: a bank from 0 to 15. BANK 0 is the ASCII character set with
- some modifications to control characters.
-
- basic attributes: BOLD, UNDERLINE, ITALIC, HIDDEN, and REVERSED
- attributes. BLINKING may be substituted for ITALIC however we
- normally use ITALIC.
-
- color: 16 color renditions. The colors have been chosen to reflect
- complementary colors. Foreground and background may be
- specified for each character.
-
- size: 16 size attribute
- font: 16 font attribute
-
-
- 32 bits per character is a bit much for some applications; an
- application may elect to ignore certain fields. This compiler ignores
- ALL fields but the bank and the CWB (although it may look briefly
- at attributes, I don't remember). Internally, the characters are thus
- represented with 16-bit fields in this compiler.
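-
- To make the reduced 16-bit form concrete, here is a small sketch that
- packs a bank and a CWB into one 16-bit value. The bit layout (bank in
- the high byte, CWB in the low byte) and all names are assumptions made
- for this example; the real phi-text layout is not given in this
- document. The ranges come from the field list above: 16 banks, and
- CWB values 32 to 127, which is 96 characters per bank.

```c
#define PHI_BANKS     16u
#define PHI_CWB_MIN   32u
#define PHI_CWB_MAX   127u
#define PHI_PER_BANK  (PHI_CWB_MAX - PHI_CWB_MIN + 1u)   /* 96 */
#define PHI_TOTAL     (PHI_BANKS * PHI_PER_BANK)         /* 1536 */

/* Non-zero if the bank/CWB pair is inside the documented ranges. */
int phi_valid(unsigned bank, unsigned cwb)
{
    return bank < PHI_BANKS && cwb >= PHI_CWB_MIN && cwb <= PHI_CWB_MAX;
}

/* Assumed layout: bank in bits 8-15, CWB in bits 0-7. */
unsigned short phi_pack(unsigned bank, unsigned cwb)
{
    return (unsigned short)((bank << 8) | cwb);
}

unsigned phi_bank(unsigned short ch)
{
    return (unsigned)ch >> 8;
}

unsigned phi_cwb(unsigned short ch)
{
    return (unsigned)ch & 0xFFu;
}

/* Position of a character in a flat table of all PHI_TOTAL characters. */
unsigned phi_index(unsigned bank, unsigned cwb)
{
    return bank * PHI_PER_BANK + (cwb - PHI_CWB_MIN);
}
```

- With this layout a bank 0 character packs to its plain ASCII code, so
- phi_pack(0, 'A') equals 'A'.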
-
- To ease the storage requirements of such a character set, there exists
- a 'streamed' form of phi-text. This takes advantage of the notion that
- attributes are not likely to change as rapidly as the character
- information. Basically, if the high bit of a streamed byte is set
- it indicates that control information is embedded which indicates the
- new attributes. There is also a 'repeat' code so that long strings
- of repeating characters (for example spaces) get packed together. This
- is not quite as efficient as tabbing but in the long run it works out
- better because there are several situations where such strings may
- occur in phi-text and they do NOT always involve spaces. In this
- compiler, the incoming text is in streamed format (which defaults to
- ASCII unless an appropriate editor is used). The streamed format is
- converted to a flat format and all information is stripped except that
- essential to detecting the character. Preprocessing is done on the
- flat version... but when the scanner starts looking for tokens it then
- converts the flat version back to stream (minus colors and attributes)
- for more efficiency in the parser and back end. If the source file
- is streamed phi-text the list and assembly files will also be streamed
- phi-text; color information is added to the list file just to
- make it a little flashy, although I have a monochrome monitor so the
- colors I picked may be awkward.
-
- One problem exists with streamed formats: in case of an error situation
- it is possible to lose important synchronization and so wreck more than
- a single character. For this reason streamed phi-text is partially
- synchronizing; at the beginning of each line all attribute information
- defaults back to a standard default. In this way one never loses more
- than a complete line in the presence of simple errors. And even at
- that a smart editor could be designed to help one recover from such
- simple errors... provided that errors occurred often enough to be worth
- the effort.
-
- We have seen that phi-text is composed of 16 banks with 96 characters
- per bank, for a total of 1536 characters. The first bank is pure
- ASCII, with a few modifications, but what are the other banks?
- About half of them are currently unused. Of those some have been
- deliberately reserved for application-specific and system-specific
- uses by the designer of phi-text. The defined characters can be broken
- roughly into the following groups:
-
- 1) ASCII characters
- 2) European extensions (accented characters)
- 3) Greek characters
- 4) Cyrillic characters
- 5) line drawing characters
- 6) mathematics characters
- 7) miscellaneous characters
-
- While it IS possible to extend certain C operators with more compact
- character representations in a compiler like this one, use of phi-text
- has been limited to allowing Greek and Cyrillic characters in variable
- names, and to allowing things in boxes to be treated as comments. The
- primary editor for phi-text has an extension that allows usage of the
- arrow keys to draw lines on the screen and this makes beautifying code
- a snap.
-
- For more information about phi-text, contact:
-
- Paul McKneely
- P.O. BOX 5641
- Pasadena, TX, 77508
-
- email: gecko@onramp.net
-
- 5.1) Errors
-
- This is a list of possible errors. There are two types of errors: 'Errors'
- and 'Warnings'. An 'Error' signifies an event which the compiler cannot
- handle, whereas a 'Warning' is a diagnostic which indicates that something
- is possibly wrong but the compiler will make assumptions about it.
-
- This list is slightly outdated; it is missing new errors which the
- inline assembler can generate.
-
- Each 'Warning' will have a value in parentheses; this value may be used
- on the command line to suppress the warning. The value 'all' may be used
- to suppress all warnings. Errors may not be suppressed.
-
- Example
-
- cc -w-ieq a.c ; Suppress the 'Possibly incorrect assignment' warning.
- cc -w-all a.c ; Suppress ALL warnings
-
- Some of these errors result when the compiler is in C++ mode.
-
-
-
-
-
- Error: _int keyword not allowed in Pascal declarations
-
- Pascal declarations may not be used as traps or interrupts.
-
- Error: Ambiguity between %s and %s
-
- C++. Compiler cannot choose between two almost equivalent
- overloaded definitions.
-
- Warning: ('cln') Argument list too long %s
-
- Argument list for the function call specified is too long. Compiler
- ignores the extra args.
-
- Error: Argument list too long in redeclaration of function '%s'
-
- A prototyped function has been redeclared with a different argument list
-
- Error: Argument list too short %s
-
- Too few parameters have been supplied in a function call.
-
- Error: Argument list too short in redeclaration of function '%s'
-
- A prototyped function has been redeclared with a different argument list
-
- Error: Bit field must be signed or unsigned int
-
- ANSI C requires a bit field to be of one of these types.
- If extensions are allowed bit fields can be of any integer
- type.
-
- Error: Bit field only allowed on scalar types
-
- Bit fields can only be used on integral types.
- This error will occur if in non-ansi mode and you use any
- non-integral type as the basis for a bit field.
-
- Error: Bit field too big
-
- Bit fields must fit within the processor word size.
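-
- A declaration that stays clear of the three bit field errors above
- uses only signed or unsigned int as the base type and keeps every
- width within the word size. The sketch below is one such layout; the
- structure, its field names and widths, and status_init are made up
- for the example.

```c
/* Every field is based on int or unsigned int, and the widths add up
   to 20 bits, well inside a 32-bit word. */
struct status {
    unsigned int ready : 1;
    unsigned int mode  : 3;
    unsigned int count : 12;
    signed int   delta : 4;
};

/* Fill in a status word, masking each value to its field width. */
void status_init(struct status *s, unsigned mode, unsigned count)
{
    s->ready = 1u;
    s->mode  = mode & 0x7u;
    s->count = count & 0xFFFu;
    s->delta = 0;
}
```

- Values wider than a field are masked here rather than left to
- whatever truncation the compiler applies on assignment.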
-
- Warning: ('pro') Call to function '%s' with no prototype
-
- A function call has been made to a function that has not been
- previously declared. Compiler guesses at argument types.
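-
- The usual way to keep this warning quiet is to put a prototype ahead
- of the first call, so the compiler does not have to guess at argument
- types. A minimal sketch, with hypothetical function names:

```c
/* Prototype seen before the first call. */
long scale(long value, int factor);

long scaled_sum(long a, long b)
{
    return scale(a, 2) + scale(b, 3);
}

/* The definition may follow later in the file or live in another module. */
long scale(long value, int factor)
{
    return value * factor;
}
```

- For functions shared between modules, the prototype normally goes in
- a header file included by both the caller and the definition.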
-
- Error: Cannot cast %s
-
- C++. Some casting of classes is not allowed.
-
- Error: Cannot define a pointer or reference to a reference
-
- C++. Reference variables are treated specially in this regard
-
- Error: Cannot initialize '%s'
-
- An error occurred while trying to process a variable initialization
-
- Error: Cannot modify a const val
-
- A const value may not be modified.
-
- Error: Cannot open file \"%s\" for read access
-
- An include file was not found
-
- Error: Cannot overload 'main'
-
- C++. main() must not be overloaded
-
- Error: Cannot take address of bit field
-
- Pointers to bit fields not allowed
-
- Error: Cannot use bit field as a non-member
-
- Only structure members may have a bit field qualifier.
-
- Warning: ('cno') Code has no effect
-
- This line of code compiled to nothing
-
- Error: Constant value expected
-
- In general initializers must be constant values. Some others
- must as well
-
- Error: Constructor/destructor must be untyped
-
- C++ can't type constructors/destructors
-
- Error: Continue not allowed
-
- Not in scope where a continue makes sense
-
- Warning: ('cnv') Conversion may truncate significant digits
-
- An implicit cast may result in loss of significant digits. This
- warning is NOT produced for explicit casts.
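-
- Since explicit casts do not produce the warning, a narrowing
- conversion that is intended can be written with a cast once the value
- is known to fit. The helper below is a sketch; to_short is a
- hypothetical name and clamping is just one possible policy for values
- that are out of range.

```c
#include <limits.h>

/* Narrow a long to a short, clamping out-of-range values instead of
   letting them truncate. Every narrowing step uses an explicit cast. */
short to_short(long v)
{
    if (v > (long)SHRT_MAX)
        return (short)SHRT_MAX;
    if (v < (long)SHRT_MIN)
        return (short)SHRT_MIN;
    return (short)v;
}
```

- A plain "short s = v;" with v declared long is the implicit form that
- the warning is aimed at.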
-
- Error: Could not find a match for '%s'
-
- C++. This function call is not prototyped either directly or
- with an overload or defaulted function prototype
-
- Warning: ('dpc') Dangerous pointer cast
-
- If you get this, it will happen when the size of the pointer is
- not the same as the size of the (scalar) type you are using
- with it.
-
- Error: Declaration expected
-
- Parser got a statement or other value when it was expecting a
- declaration.
-
- Error: Declaration not allowed here
-
- Parser found a declaration when it was expecting a statement
-
- Error: Default missing after parameter '%s'
-
- C++... this parameter was assumed to have a default which is missing.
-
- Error: Destructor for class '%s' expected
-
- C++. A destructor was expected.
-
- Error: Duplicate case %d
-
- Two case statements evaluate to the same value
-
- Error: Duplicate label '%s'
-
- The label occurs twice in the same procedure.
-
- Error: Duplicate symbol '%s'
-
- The symbol is being redefined.
-
- Error: Ellipse (...) not allowed in Pascal declarations
-
- Pascal-style declarations may not have variable arguments.
-
- Error: Expected '%c'
-
- The compiler expected a specific character or token.
-
- Error: Expression expected
-
- The compiler was ready to parse an expression but found something
- else
-
- Error: File ended with comment in progress
-
- Comments must have an ending point within the same file or
- include file.
-
- Error: File name expected in #include directive
-
- #include directive must have a file name
-
-
- Error: Function declaration not allowed here
-
- A function declaration was attempted in an invalid place, for
- example inside a structure or inside another function.
-
- Warning: ('ret') Function should return a value
-
- This warning occurs when a function is not of type 'void'
- and you exit without returning a value.
-
- Error: Identifier expected
-
- The parser was expecting a variable/function name.
-
- Error: Illegal call to main() from within program
-
- C++. C++ programs may not call main()
-
- Error: Illegal character '%c'
-
- The parser detected an illegal character sequence.
-
- Error: Illegal pointer
-
- An attempt was made to use a non-pointer in a pointer context
-
- Error: Illegal pure declaration syntax of '%s'
-
- C++. Virtual declaration syntax is wrong
-
- Warning: ('irg') Illegal register var '%s'
-
- The size of the variable was too big for it to fit in a
- register
-
- Error: Illegal storage class specifier '%s'
-
- Conflicting or illegal specifier on a declaration.
-
- Error: Illegal storage class specifier on '%s'
-
- Conflicting or illegal specifier on a declaration.
-
- Error: Illegal typedef of '%s'
-
- Attempt to reuse a symbol name as a typedef.
-
- Error: Illegal use of reference operator
-
- Attempt to use '&' in a context where it is not permitted.
-
- Error: Illegal use of void pointer
-
- Cannot take the size of a void pointer.
-
- Error: Inserted '%c'
-
- The parser guessed at a symbol to insert.
-
- Error: Invalid '&' on register var '%s'
-
- Cannot take the address of a register
-
- Error: Invalid floating point
-
- Cannot use floating point in certain types of math functions
- (e.g. logic functions)
-
- Error: Invalid preprocessor directive '%s'
-
- Preprocessor directive is unknown
-
- Error: Invalid trap id
-
- CPU-specific. Indicates a cpu operation (int or trap) was called
- with an identifier that is too large
-
- Error: '%s' is not a function
-
- Cannot call non-functions.
-
- Error: '%s' is not a label
-
- Cannot jump to non-labels.
-
- Error: Local class functions not supported
-
- C++. Cannot support class definitions as local variables
-
- Error: Local variables may not be used as parameter defaults
-
- C++. Parameter defaults must be in scope prior to calling the function
-
- Warning: ('lli') long long int type not supported, defaulting to long int
-
- long long int type will parse correctly but it is unsupported
-
- Error: Lvalue expected
-
- Cannot assign to the address of a variable.
-
- Error: Macro substitution error
-
- Macro expansions are limited to 4096 characters
-
- Error: Misplaced else
-
- Unexpected else found in input stream
-
- Error: '%s' must be a predefined class or struct
-
- C++. Cannot work with this structure/class because it has not
- been fully defined.
-
- Warning: ('zer') No memory allocated for '%s'
-
- An unsized array has no initializers either.
-
- Warning: ('nsf') Nonexistant static func '%s'
-
- A static function was prototyped but never defined.
-
- Warning: ('npo') Nonportable pointer conversion
-
- An implicit pointer conversion may result in code that compiles
- incorrectly with other C compilers
-
- Error: Non-scalar array index
-
- Array indexes must be of integral type
-
- Error: Numeric constant is too large
-
- An integer or hex constant was too large for the base type,
- or the non-fractional part of a floating-point number could
- not fit in a long-integer.
-
- Error: Pointer type expected
-
- A pointer was expected.
-
- Warning: ('ieq') Possible incorrect assignment
-
- The symbol '=' was used at the outer scope in an if statement
- expression.
- This could be intended, but it is often a mistyping of the symbol
- '==', so the compiler warns you.
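-
- When the assignment really is intended, wrapping it in an explicit
- comparison moves the '=' out of the outer level of the if expression.
- Going by the description above, that form should not draw the
- warning, though this has not been checked against the compiler. The
- function name below is hypothetical.

```c
#include <string.h>

/* Return the text after the last '.' in name, or "" if there is none. */
const char *find_ext(const char *name)
{
    const char *dot;

    /* "if (dot = strrchr(name, '.'))" is the form the warning targets;
       here the assignment sits inside an explicit comparison. */
    if ((dot = strrchr(name, '.')) != NULL)
        return dot + 1;
    return "";
}
```

- Assigning on a separate line before the if statement is an equally
- valid way to express the same thing.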
-
- Warning: ('san') Possible superfluous &
-
- & isn't needed when taking the address of an array. This is a junk
- message; ansi C doesn't care either way.
-
- Warning: ('sud') Possible use of '%s' before assignment
-
- A variable has been used but it possibly has not been initialized
- with a value
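-
- The simplest cure is to give the variable a value on every path
- before its first use. In the sketch below (largest is a hypothetical
- name) the result variable is initialized at its declaration, so it is
- defined even when the array is empty.

```c
/* Return the largest of the n values in v, or 0 when n is not positive. */
int largest(const int *v, int n)
{
    int best = 0;   /* assigned before any use, on every path */
    int i;

    if (n > 0)
        best = v[0];
    for (i = 1; i < n; i++) {
        if (v[i] > best)
            best = v[i];
    }
    return best;
}
```

- Leaving out the "= 0" and assigning best only inside the if is the
- kind of pattern the warning is meant to catch.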
-
- Error: Reference initialization needs lvalue
-
- C++. Reference syntax calls for something whose address can be
- taken.
-
- Error: Reference member '%s' in a class with no constructors
-
- C++. The reference variable cannot be initted at class startup because
- the constructor is supposed to do it.
-
- Error: Reference variable '%s' must be initialized
-
- C++. Cannot change what a reference variable equates to
- at run-time.
-
- Error: Return type is void
-
- Attempt to return a value from a void function
-
- Error: Size is unknown or zero
-
- Attempt to use the size of a variable with a type that has
- been forward declared.
-
- Error: Size of '%s' is unknown or zero
-
- Attempt to use the size of a variable with a type that has
- been forward declared.
-
- Error: Startup/rundown function '%s' is unknown or not a function
-
- A function named in the '#pragma startup' or '#pragma rundown'
- is either not a function or is not defined
-
- Error: String constant too long
-
- A multi-line string is too long.
-
- Warning: ('spc') Suspicious pointer conversion
-
- A pointer operation is being performed on pointers which have
- different base types.
-
- Warning: ('fun') Static function '%s' is declared but never used
-
- This static function is just a space waster.
-
- Warning: ('sud') Static variable '%s' is declared but never used
-
- This static variable is just a space waster.
-
- Warning: ('fsu') Structure '%s' is undefined
-
- The compile completed with a structure whose type was never
- defined.
-
- Error: Switch argument must be of integral type
-
- Switch arguments must be integers.
-
- Warning: ('tua') Temporary used for parameter %s
-
- C++. A constant was passed in a reference parameter and the compiler
- automatically made a variable so the called function would be
- happy.
-
- Warning: ('tui') Temporary used to initialize %s
-
- The reference variable is initialized with a constant; extra storage
- had to be created for it.
-
- Error: Too many initializers
-
- A structure or array has too many initializers.
-
- Error: Type expected in sizeof
-
- The sizeof argument was not a type or variable.
-
- Error: Type mismatch
-
- Generic type mismatch.
-
- Error: Type mismatch in arg '%s'
-
- Type mismatch in a function call argument.
-
- Error: Type mismatch in redeclaration of '%s'
-
- A variable has been redeclared with a different type from
- before.
-
- Error: Type mismatch in return
-
- The value being returned does not match the function type.
-
- Error: Unbalanced preprocessor directives
-
- #if/#endif directives were not balanced.
-
- Warning: Undefined label '%s'
-
- The label should appear somewhere as there is a goto to it.
-
- Error: Undefined symbol '%s'
-
- This is an unknown symbol
-
- Error: Unexpected '%s'
-
- This keyword was unexpected.
-
- Warning: ('urc') Unreachable code
-
- Code stream can never get here.
-
- Warning: ('lun') Unused label '%s'
-
- A label was declared but never used.
-
- Error: User error: %s
-
- The #error directive results in this message.
-
- Warning: ('sas') Variable '%s' is assigned a value which is never used
-
- After the assignment to the variable, there is no subsequent use of
- the value.
-
- Error: Variable '%s' cannot have a type qualifier
-
- C++. ???
-
- Error: Variable '%s' is not a class instance
-
- C++. A class instance was expected.
-
- Warning: ('sun') Variable '%s' is declared but never used
-
- This variable was declared but nothing ever referenced it. Space
- waster.
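-
- As an illustration of two of the warnings above ('ieq' and 'sud'),
- here is a minimal sketch in plain C. The function names are made up
- for this example, and exactly which forms CC386 warns about is an
- assumption based on the descriptions above, not something checked
- against the compiler.
-
```c
/* Would probably draw the 'ieq' warning, because '=' is the outermost
 * operator of the if expression:
 *
 *     if (n = limit) ...
 *
 * That assigns limit to n and then tests the result for non-zero.
 */

/* What was most likely meant: a comparison. */
int at_limit(int n, int limit)
{
    if (n == limit)
        return 1;
    return 0;
}

/* A deliberate assignment, nested inside a comparison so that '=' is
 * no longer at the outer scope of the condition. */
int count_chars(const char *s)
{
    int n = 0;
    char c;

    while ((c = *s++) != '\0')
        n++;
    return n;
}

/* 'sud': without the "= 0" below, total would be read by "+=" before
 * anything had been stored in it. */
int sum_to(int n)
{
    int total = 0;
    int i;

    for (i = 1; i <= n; i++)
        total += i;
    return total;
}
```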
-
- 6.1) Stack frames
-
- There are a variety of options for stack frames.
-
- a) Standard C-style stack frames. An index register (EBP or A6) is
- used to point at a value between the parameters and the function local
- variables; all local variables and function parameters are indexed from
- this base register. This is the default.
-
- b) The compiler can free the link register and index all local
- variables and function parameters off the stack pointer.
-
- c) (68K only) parameter lists may be located anywhere in memory;
- the parameter list pointer is passed in A0. A0 is then transferred to
- A6 and parameters are indexed off A6. Meanwhile local variables are
- indexed from the stack pointer.
-
- On the 68K, several codegen options are available. By default the
- 68K compiler generates PIC code based around a 32K memory model. A
- 68020 mode is available for specifying extended 68020 features such as
- enhanced addressing modes and specialized instructions. Another option
- is to generate 68000 code in such a way that a data section greater
- than 32K can be used. The final option allows one to disable PIC mode
- and generate code that will be placed at absolute addresses in memory.
-
- 386 code is fairly straightforward. It is a little more complex than
- need be because some operations, like multiplies and shifts, must use
- specific registers. 386 code will be a little bulky because of this
- need.
-
- 68K code is position-independent. All global data is accessed off of
- register A5 or A6; function arguments are indexed off of A6 or A7; and
- the stack is indexed off A7. String constants are indexed off the PC.
- Because of this, the total data size may only be 32K unless either the
- /C+2 or the /C+L or the /C+A options are used.
-
- 6.2) ASM interface
-
- The assembly language program must not modify any registers except the
- scratch registers:
-
- 386: EAX,ECX,EDX
-
- Parameters are passed on the stack, with the leftmost parameter at the
- lowest address.
-
- In all assembly situations it is convenient to use an index register
- to index the parameters. The index register must be loaded with the
- address of the first parameter (which will be the stack pointer + 4
- if you don't push the index register, or the stack pointer +8 if you
- do). Parameters normally take four bytes for the standard data types;
- however double and long double types take eight and twelve bytes
- respectively. If you pass a structure by value the amount of stack
- space used is dependent on the size of the structure.
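-
- The layout just described can be worked out with a few lines of C.
- This is only a sketch of the arithmetic: the names param_offsets and
- the PARAM_* constants are invented for this example, and the sizes
- are the ones quoted above (4 for the standard types, 8 for double,
- 12 for long double). The size of a structure passed by value has to
- be supplied by the caller, since the text does not say how it is
- computed.
-
```c
#include <stddef.h>

/* Stack space taken by one parameter, per the text above. */
enum { PARAM_STD = 4, PARAM_DOUBLE = 8, PARAM_LONG_DOUBLE = 12 };

/* Fill offsets[i] with the distance from the stack pointer to
 * parameter i, leftmost parameter first (lowest address).
 * pushed_index_reg is non-zero if the routine has pushed its index
 * register, which moves everything up by four bytes. */
void param_offsets(const int *sizes, size_t count,
                   int pushed_index_reg, int *offsets)
{
    int offset = pushed_index_reg ? 8 : 4;  /* skip return address (and saved register) */
    size_t i;

    for (i = 0; i < count; i++) {
        offsets[i] = offset;
        offset += sizes[i];
    }
}
```
-
- For a routine taking (int, double, char *) that pushes the index
- register, this gives offsets 8, 12 and 20.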
-
- 6.3) Segmentation
-
- The following segments or sections may appear in the output file:
-
- 386 name    use
- .CODE       Code and string constants
- .DATA       Initialized global data
- .?DATA      Uninitialized global data
- INITDATA    #pragma startup links
- EXITDATA    #pragma rundown links
- CPPDATA     C++ static initializations
-
- The following switches affect code generation:
-
- /C-b    combine the BSS with the DATA
- /C-l    don't put line numbers in ASM file
- /C-m    don't mangle with underscores
- /C+p    pack variables
- /C+r    use reverse order for bit fields.
-         Note that this option reverses the allocation order
-         but does not reverse the value in the field.
-
- The following #pragma statements affect code generation:
-
- #pragma regopt - enable/disable register allocations
- #pragma startup - name a routine to be executed on startup
- #pragma rundown - name a routine to be executed on rundown
-
- 6.4) Optimizations
-
- The compiler performs the following optimizations:
-
- a) Constant folding.
-
- When common math is done with constants, the compiler will evaluate
- the expression and replace it with a constant.
-
- b) Reduction in strength
-
- Multiplies and divides are turned into shifts when appropriate. Mods
- are turned into ands when appropriate.
-
- c) Target optimization
-
- When the target for an assignment is known, a temp register will not
- be allocated, but the target will be used directly. This keeps us
- from generating dead temp registers that will later have to be
- optimized out of the icode.
-
- d) Dead code elimination
-
- Delete jumps to jumps, jumps to the next statement, and dead code.
- Also delete any temporaries created by the earlier steps that are now
- unused.
-
- Note that the SETJMP libraries for example will NOT save the state of
- floating point registers. So there is a switch to disable optimization
- into floating point registers in case you need to setjmp to a routine
- that uses floating point. Address and data register optimizations can
- also be turned off. See switches.doc.
-
- e) Reordering expressions
-
- In some cases the compiler generates better code if expressions are
- reordered; for example:
-
- a = a + 10;
-
- can be turned into a += 10 and better code gets generated. Also,
- a lot of work has been put into optimizing usage of based/indexed
- modes of the processors when it can be done. The present version
- will even use index register scaling when possible!
-
- f) base + index addressing modes
-
- This compiler goes to some length to identify when base + index
- addressing modes may be used to generate an address.
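-
- Items a), b) and f) above can be pictured at the source level with
- the following sketch. It shows the kind of C a compiler can improve,
- not the code CC386 actually emits; the function names are invented
- for the example. The shift and mask forms are written out with
- unsigned operands and power-of-two constants because those are the
- cases where the two forms always agree (a signed divide of a
- negative number rounds differently from a shift). What CC386 itself
- counts as "appropriate" is not stated here.
-
```c
/* a) Constant folding: the product can be computed at compile time
 *    and replaced by the single constant 86400. */
unsigned long seconds_per_day(void)
{
    return 60UL * 60UL * 24UL;
}

/* b) Reduction in strength: each pair computes the same value. */
unsigned times8(unsigned x)       { return x * 8u; }
unsigned times8_shift(unsigned x) { return x << 3; }

unsigned div4(unsigned x)         { return x / 4u; }
unsigned div4_shift(unsigned x)   { return x >> 2; }

unsigned mod16(unsigned x)        { return x % 16u; }
unsigned mod16_and(unsigned x)    { return x & 15u; }

/* f) base + index: base[i] is the address base + i * sizeof(int),
 *    a natural candidate for a base register plus a scaled index. */
long sum_ints(const int *base, int count)
{
    long total = 0;
    int i;

    for (i = 0; i < count; i++)
        total += base[i];
    return total;
}
```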
-
- 7.1) Description of directory tree
-
- Sources to this compiler are included in separate packages.
-
- Sources should be generic; that is they should work on any architecture
- where the byte size is 8 bits. However, I use weird tab settings in my
- editor. If you want to comprehend the sources get a beautifier of some
- sort and run the sources through it first; or set your editor tab
- setting to 2 to see what I see.
-
-
- The directory structure is:
-
- CLIBS
- various sources for run-time library
- DOC
- documentation
- EXAMP
- a simple example
- (there is a more complex one in clibs\startup\test)
- INCLUDE
- compiler header files
- OBJECT
- compiler make/object files
- SOURCE
- compiler source files
-
- There are two groups of sources:
- 1) the compiler
- in the SOURCE, INCLUDE, OBJECT directories
- 2) libraries you can use in conjunction
- with the compiler to generate programs (target run-time libraries)
- in the CLIBS directory
-
- I often use the set of triple directories:
- SOURCE
- OBJECT
- INCLUDE
-
- for a given project so I won't clutter up a single directory with dozens
- of files. When this triple comes up, sources are in the SOURCE directory,
- headers the sources depend on are in the INCLUDE directory, and you can
- expect me to chdir to the OBJECT directory to compile the program...
- thus you will find the make file there.
-
- For these triples you thus have to use an include path which consists of
- the INCLUDE directory when you compile the source files.
-
- proto.bat generates the file INCLUDE\CC.P, which is a prototype file
- I'm using to keep the compiler honest with me. You shouldn't have
- to change that unless you make major changes to the sources... but you can
- edit CC.P directly and put new prototypes in if you want. I often do.
- I only use proto.bat when I'm making major changes to the compiler.
-
- 7.2) Porting
-
- This version of the compiler is intended to be portable; one need only
- rewrite the back end for the given target. This portability probably
- extends only to processors with a 'byte' architecture.
-
- The following symbols have to be defined on the command line:
-
- -DPROGNAME="CC386" ; Name of the program which will
- appear in the banner
-
- -DENVNAME="CC386" ; Name of the environment variable to
- consult for command line parameters
-
- -DGLBDEFINE="_i386_" ; Symbol to define in the source; can
- be used to identify processor-specific needs
-
- -DSOURCEXT=".ASM" ; Extension to use on the output file
-
- These definitions are imported by CMAIN.C to define the program
- environment. I have shown you the definitions used by the 386
- compiler; change them as necessary for your target.
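-
- To show how symbols like these are typically consumed, here is a
- small sketch. It is NOT taken from CMAIN.C; the fallback #defines and
- the helpers banner_name and output_name are hypothetical, written
- only to illustrate a driver picking up command-line definitions.
-
```c
#include <string.h>

/* Fall back to the 386 values shown above if nothing was passed
 * with -D on the command line. */
#ifndef PROGNAME
#define PROGNAME "CC386"
#endif
#ifndef SOURCEXT
#define SOURCEXT ".ASM"
#endif

/* Hypothetical helper: the name to print in the banner. */
const char *banner_name(void)
{
    return PROGNAME;
}

/* Hypothetical helper: build "<stem><SOURCEXT>" in dest.
 * Returns 0 on success, -1 if dest is too small. */
int output_name(char *dest, size_t size, const char *stem)
{
    size_t need = strlen(stem) + strlen(SOURCEXT) + 1;

    if (need > size)
        return -1;
    strcpy(dest, stem);
    strcat(dest, SOURCEXT);
    return 0;
}
```
-
- A port to another target would pass its own values for these symbols
- and the helpers would follow without any change to the source.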
-
- The following files comprise the 386 backend. They should be all you
- have to change to port the compiler to a new processor. I suggest you
- rename them to something else before changing them:
-
- an386.c - Register optimization
- reg386.c - Register allocation for expressions
- conf386.c - configuration; int sizes and free registers and such
- outas386.c - outputs ASM code
- gexpr386.c - turn the expression parse trees into code
- gstmt386.c - turn the stmt parse trees into code
- peep386.c - Peephole analysis for this processor
-
- For more information on porting contact the author of the code.
-
- David Lindauer (gclind01@starbase.spd.louisville.edu)
-