home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
Club Amiga de Montreal - CAM
/
CAM_CD_1.iso
/
files
/
379a.lha
/
p2c1_13a
/
src
/
src.zoo
/
NOTES
< prev
next >
Wrap
Text File
|
1990-03-09
|
33KB
|
752 lines
THE GRAND P2C NOTES FILE:
This file contains notes to myself recording bugs, flaws, and suggested
improvements to p2c. They have roughly been separated into "old", "older",
and "oldest" groups. I can't guarantee I'll do any of these. If you do,
please mail me the diffs so I can incorporate them into the next release.
Thanks!
-- Dave Gillespie
daveg@csvax.caltech.edu
-----------------------------------------------------------------------------
Technically speaking, "for byte := min to max do" is legal even
if min > 255, i.e., the limits only need to be in range if the
body of the loop executes. Thus, FOR loops should really use a
shadow parameter not just when max is the constant 255, but
whenever min or max are not provably in byte range.
-----------------------------------------------------------------------------
Have a "-M"-like mode in which FREE is suppressed altogether, useful
in case p2c crashes because bugs have corrupted malloc's free list.
-----------------------------------------------------------------------------
For expressions building small-sets whose maximum element is <= 15,
use "1 << x" instead of "1L << x".
-----------------------------------------------------------------------------
Handle VAX Pascal VARYING OF CHARs used as the arguments of WITH
statements.
-----------------------------------------------------------------------------
Have a p2crc feature which identifies a given named constant as
being a character, not a string of length 1. Have an option in
which this is the default interpretation.
-----------------------------------------------------------------------------
StringTruncLimit would be helped by expr.c:strmax() interpreting
sprintf control strings. For %s, use strmax of the corresponding
argument. For %d, use 11, etc. For %.10s, use min(10, strmax(arg)).
For %.*s, just use strmax(arg), I guess.
Have a mode in which such assignments are automatically truncated.
Perform truncation testing for non-VAR procedure arguments, too.
-----------------------------------------------------------------------------
In cref.p, the "strappend(buf,#0)" statement translates into
"strcpy(STR1,buf); strcpy(buf,STR1)" with a warning about a null
character in a sprintf control string!
-----------------------------------------------------------------------------
Still having problems where the opening comment of an imported
module's interface text is copied into the program.
-----------------------------------------------------------------------------
VAX Pascal features not yet handled:
[UNSAFE] attribute is only implemented in a few situations.
[UNBOUND] attribute on a procedure says it doesn't need a static link.
[TRUNCATE] attribute on a parameter allows optional params w/no default.
[LIST] attribute on a parameter is like &rest in Lisp.
Support types like, e.g., [LONG] BOOLEAN.
Can assign "structurally compatible" but different record types.
File intrinsics need serious work, especially OPEN.
If a copy param is [READONLY], don't need to copy it.
If a procedure is [ASYNCHRONOUS], make all its variables volatile.
If a procedure is [NOOPTIMIZE], make all its variables volatile.
Provide a real implementation of BIN function and :BIN read format.
BIT_OFFSET and CARD intrinsics are not supported.
-----------------------------------------------------------------------------
Modula-2 features not yet handled:
Local modules are faked up in a pretty superficial way.
WORD is compatible with both pointers and CARDINALs.
WORD parameters are compatible with any word-sized object.
ARRAY OF WORD parameters are compatible with absolutely anything.
Improve treatment of character strings.
Find manuals for real implementations of Modula-2 and implement
any common language extensions.
Fix p2c to read system.m2 instead of system.imp automatically.
-----------------------------------------------------------------------------
Oregon Software Pascal features not yet handled:
procedure noioerror(var f:file);
Built-in. Sets a flag on an already-open file so that
I/O errors record an error code rather than crashing.
Each file has its own error code.
function ioerror(var f:file) : boolean;
True if an error has occurred in the file.
function iostatus(var f:file) : integer;
The error code, when ioerror was true.
reset and rewrite ignore the third parameter, and allow a fourth
param which is an integer variable that receives a status code.
Without this param, open errors are fatal. An optional param
may be omitted as in reset(f,'foo',,v);
-----------------------------------------------------------------------------
In p_search, if a file contains const/var/type/procedure/function
declarations without any module declaration, surround the entire
file with an implicit "module <generated-name>; {PERMANENT}" ... "end.".
This would help the Oregon Software dialect considerably.
-----------------------------------------------------------------------------
Provide an explicit IncludeFrom syntax for "no include file".
E.g., "IncludeFrom dos 0".
-----------------------------------------------------------------------------
In docast, smallsets are converted to large sets of the requested type.
Wouldn't it be better to convert to a set of 0..31 of the base type?
This would keep foo([]), where the argument is "set of char", from
allocating a full 256-bit array for the temporary.
-----------------------------------------------------------------------------
When initializing a constant variant record or array of same in which
non-first variants are initialized, create a function to do the
initialization, plus, for modules w/o initializers, a note to call
this function. Another possibility: Initialize the array as well as
possible, but leave zeros in the variant parts. Then the function
has only to fix up the non-first variant fields.
-----------------------------------------------------------------------------
Figure out some way to initialize packed array constants, e.g., a short
macro PACK4(x,y)=(((x)<<4)+(y)) which is used inside the C initializer.
Alternatively, implement initializer functions as above and use those.
-----------------------------------------------------------------------------
How about declaring Volatile any variables local to a function which
are used after the first RECOVER? GNU also suggests writing the
statement: "&foo;" which will have no side effect except to make
foo essentially volatile, without relying on ANSI features.
-----------------------------------------------------------------------------
Test the macros for GET, PUT, etc.
-----------------------------------------------------------------------------
Can the #if 0'd code for strinsert in funcs.c be changed to test
strcpyleft?
-----------------------------------------------------------------------------
Even in Ansi mode, p2c seems to be casting Anyptrs into other pointer
types explicitly. This is an automatic conversion in Ansi C.
-----------------------------------------------------------------------------
A Turbo typed constant or VAX initialized variable with a VarMacro
loses its initializer!
-----------------------------------------------------------------------------
Test the ability of the parser to recover from common problems such
as too many/few arguments to a procedure, missing/extra semicolon, etc.
One major problem has been with undeclared identifiers being used as
type names.
-----------------------------------------------------------------------------
Line breaker still needs considerable tuning!
-----------------------------------------------------------------------------
How about indenting trailing comments analogously to the code:
Try to indent to column C+(X-Y), where C=original column number,
X=output indentation, Y=original input indentation.
Even fancier would be to study all the comment indentations in the
function or struct decl to discover if most comments are at the same
absolute indentation; if so, compute the average or minimum amount of
space preceding the comments and indent the C comments to an analogous
position.
-----------------------------------------------------------------------------
After "type foo = bar;" variables declared as type foo are translated
as type bar. Ought to assume the user has two names for a reason,
and copy the distinction into the C code.
-----------------------------------------------------------------------------
Warn if address is taken of an arithmetic expression like "v1+1".
Allow user to declare certain bicalls as l-values, e.g., so that
LSC's topLeft and botRight macros won't generate complaints.
-----------------------------------------------------------------------------
Consider changing the "language" modes into a set of p2crc files
which can be included to support the various modes.
-----------------------------------------------------------------------------
If we exchange the THEN and ELSE parts of an IF statement, be sure
to exchange their comments as well!
-----------------------------------------------------------------------------
How about checking for a ".p2crc" file in the user's home directory.
-----------------------------------------------------------------------------
Store comments in the following situations:
On the first line of a record decl.
On the default clause of a CASE statement
(use same trick as for ELSE clauses).
On the "end" of a CASE statement.
On null statements.
Use stealcomments for, e.g., decl_comments and others.
-----------------------------------------------------------------------------
Think of other formatting options for format_gen to support.
-----------------------------------------------------------------------------
Consider converting gratuitous BEGIN/END pairs into gratuitous
{ } pairs.
-----------------------------------------------------------------------------
The construction "s := copy(s, 1, 3)" converts to a big mess that
could be simplified to "s[3] = 0".
-----------------------------------------------------------------------------
Have a mode (and make it the default!) in which declarations are mixed
if and only if the original Pascal decls were mixed. Simply store
a flag in each meaning to mark "mixed-with-preceding-meaning".
-----------------------------------------------------------------------------
Have a column number at which to put names in variable and typedef
declarations. Have another option to choose whether a '*' preceding
a name should be left- or right-justified within the skipped space:
int *foo; or
int * foo;
-----------------------------------------------------------------------------
Support the /*
*
*/ form for multi-line comments.
-----------------------------------------------------------------------------
Have an indentation parameter for the word "else" by itself on a line.
-----------------------------------------------------------------------------
Have an option to use C++'s "//" comments when possible.
(0=never, 1=always, def=only for trailing comments.)
-----------------------------------------------------------------------------
Allow real comments to come before top-of-file comments like {Language}.
-----------------------------------------------------------------------------
Teach the line breaker to remove spaces around innermost operators
if in a crunch.
-----------------------------------------------------------------------------
Is it possible that the line breaker is losing counts? A line that
included lots of invisible parens converted to visible ones was
allowed to be suspiciously long.
-----------------------------------------------------------------------------
The notation t^ where t is a text file should convert \n's to
spaces if necessary.
-----------------------------------------------------------------------------
The assignment and type cast "f4 := tf4(i)" where type
"tf4 = function (i, j : integer) : str255"
generates something really weird.
-----------------------------------------------------------------------------
The conditional expression strsub(s,1,4) = 'Spam'
could be translated as strncmp(s, "Spam", 4)
-----------------------------------------------------------------------------
Consider an option which generates a "file.p2c" or "module.p2c"
file, that will in the future be read in by p2c as another p2crc
type of file, both when the module is re-translated later and when
it is imported. This file would contain commands like "NoSideEffects"
for functions which are found to have this property, etc.
-----------------------------------------------------------------------------
Extend the "file.log" or "module.log" file to contain a more detailed
account of the translation, including all notes and warnings which were
even considered. For example, ALL calls to na_lsl with non-constant
shifts would be noted, even if regular notes in this case were not
requested. Also, funny transformations along the lines of
"str[0] := chr(len)" and "ch >= #128" should be mentioned in the log.
How about a summary of non-default p2crc options and command-line args?
-----------------------------------------------------------------------------
Create a TypeMacro analogous to FuncMacro, VarMacro, and ConstMacro.
Should the definition be expressed in C or Pascal notation? (Yuck---not
a C type parser!)
-----------------------------------------------------------------------------
In argument type promotions, should "unsigned char" be promoted to
"unsigned int"?
-----------------------------------------------------------------------------
Turbo's FExpand translation is really weird.
-----------------------------------------------------------------------------
Can we translate Erase(x) to unlink(x)? (This could just be a FuncMacro.)
-----------------------------------------------------------------------------
There should be an option that causes a type to be explicitly named,
even if it would not otherwise have had a typedef name. Have a mode
that does this for all pointer types.
-----------------------------------------------------------------------------
Make sure that the construction: if blah then {comment} else other
does not rewrite to if (!blah) other; i.e., a comment in this situation
should generate an actual placeholder statement. Or perhaps, a null
statement written explicitly by the Pascal programmer should always
produce a placeholder.
-----------------------------------------------------------------------------
Allow the line breaker to treat a \003 as if it were a \010. The penalty
should be enough less than SameIndentPenalty that same-indent cases will
cause the introduction of parentheses.
-----------------------------------------------------------------------------
A comment of the form "{------}" where the whole comment is 78, 79 or 80
columns wide, should be reduced by two to take the larger C comment
brackets into account. Also, "{*****}", etc.
-----------------------------------------------------------------------------
There should be a mode that translates "halt" as "exit(0)", and another
that translates it as "exit(1)".
-----------------------------------------------------------------------------
There should be a mode in which strread's "j" parameter is completely
ignored. Also, in this mode, don't make a copy of the string being
read.
-----------------------------------------------------------------------------
Is there an option that generates an fflush(stdout) after every write
(not writeln) statement? It should be easy to do---the code is already
there to support the prompt statement.
-----------------------------------------------------------------------------
Check out the Size_T_Long option; size_t appears to be int on most
machines, not long.
-----------------------------------------------------------------------------
The type "size_t" should really be made into a separate type, with a
function to cast to type "size_t". This function would always do
the cast unless sizeof(int) == sizeof(long), or unless the expression
only involves constants and objects or functions of type "size_t".
-----------------------------------------------------------------------------
Finish the Turbo Pascal features (in the file turbo.imp).
-----------------------------------------------------------------------------
Are there any ways to take advantage of "x ?: y" in GCC?
Is it worth using GCC constructor expressions for procedure variables?
How about generating "volatile" and "const" for suitable functions?
(doing this in the .h file would be very difficult...)
Use the "asm" notation of 5.17 to implement var x ['y'] declarations.
-----------------------------------------------------------------------------
Recognize GCC extensions in pc_expr(). (By the way, remember
to implement += and friends in pc_expr(), too!)
-----------------------------------------------------------------------------
Lightspeed C can't handle "typedef char foo[];" which arises from a
MAXINT-sized array type declaration.
-----------------------------------------------------------------------------
"Return" and friends are only introduced once. In code of the form:
if (!done) { foo(); } if (!done) { bar(); }
p2c should, after patching up bar(), check if the foo() branch is
now also ripe for rearranging.
-----------------------------------------------------------------------------
Have a global "paranoia" flag. Default=use current defaults for other
options. 1=conservative defaults for other options. 0=sloppy defaults
for other options.
-----------------------------------------------------------------------------
Rather than just generating a note, have writes of attribute characters
convert into calls to a "set attribute" procedure, such as nc_sethighlight.
Is there any way of generalizing this into something useful for
non-HP-Pascal-workstation users?
-----------------------------------------------------------------------------
Warn when character constants which are control codes are produced.
(E.g., arrow keys, etc.) Also, have an option which deletes all
highlighting codes from strings being output.
-----------------------------------------------------------------------------
Think how nice things would be if the arithmetic routines actually
maintained the distinction between tp_int and tp_integer themselves,
so that makeexpr_longcast didn't have to second-guess them.
-----------------------------------------------------------------------------
Importing FS *still* copies its "file support" comment into the importing
program!
-----------------------------------------------------------------------------
Should parameterize those last few hard-wired names, such as "P_eoln",
"LONG_MAX", ... ?
-----------------------------------------------------------------------------
Check if we need to cache away any more options' values, as we did for
VarStrings. How about FoldConstants, SetBits, CopyStructs?
=============================================================================
Support the "CSignif" option (by not generating C identifiers which
would not be unique if only that many characters were significant).
-----------------------------------------------------------------------------
What if a procedure accesses strmax of a var-string parameter of a
parent procedure? (Right now this generates a note.)
-----------------------------------------------------------------------------
Handle full constructors for strings.
Handle small-array constants.
-----------------------------------------------------------------------------
Have an option that causes ANYVAR's to be translated to void *'s. In
this mode, all uses of ANYVAR variables will need to be cast to the
proper type, in the function body rather than in calls to the function.
-----------------------------------------------------------------------------
Handle reading enums. Add full error checking for reading booleans.
(And integer subranges?)
-----------------------------------------------------------------------------
Support the "BigSetConst" option by creating constant arrays just as the
Pascal compiler does.
-----------------------------------------------------------------------------
The 2^(N+1) - 2^M method for generating [M..N] is not safe if N is 31.
If the small-sets we are dealing with encompass the value 31 (== setbits-1)
then we must use the bitwise construction instead. (Currently, the
translator just issues a note.)
(If N is 31, 2^32 will most likely evaluate to 0 on most machines, which
is the correct value. So this is only a minor problem.)
-----------------------------------------------------------------------------
Big-set constants right now are always folded. Provide a mechanism
for defined set constants, say by having a #define with an argument
which is the name of the temporary variable to use for the set.
-----------------------------------------------------------------------------
Should we convert NA_LONGWORD-type variants into C casts?
-----------------------------------------------------------------------------
Are there implementations of strcpy that do not return their first
argument? If so, should have an option that says so.
-----------------------------------------------------------------------------
Handle absolute-addressed variables better. For var a[12345]:integer,
create an initialized int *. For var a['foo']:integer, create an int *
which is initialized to NULL and accessed by a macro which locates the
symbol the first time the variable is used.
-----------------------------------------------------------------------------
Handle the idiom, "reset(f, name); open(f);"
-----------------------------------------------------------------------------
Should have an option that lowercases all file names used in "reset",
"fixfname", etc. This should be on by default.
-----------------------------------------------------------------------------
Add more complete support for conformant arrays. Specifically, support
non-GNU compilers by converting variable-sized array declarations into
pointer declarations with calls to alloca or malloc/free (what if the
function uses free and contains return statements)? Also convert
variable-array references into explicit index arithmetic.
-----------------------------------------------------------------------------
Have a mode in which the body of a TRY-RECOVER is moved out into
a sub-procedure all its own, communicating with the parent through
varstructs as usual, so that the ANSI C warning about what longjmp
can do to the local variables is avoided. Alternatively, have an
option in which all necessary locals are declared volatile when
setjmps are present.
-----------------------------------------------------------------------------
If a sub-procedure refers to a parent's variable with the VAX Pascal
[STATIC] attribute, that variable is declared "static" inside the
varstruct! Need to convert it into a varref to a static variable in
the parent.
-----------------------------------------------------------------------------
When comparing records and arrays a la UCSD Pascal, should expand
into an "&&" expression comparing each field individually. (What about
variants? Maybe just compare the first variant, or the tagged
variant.) Probably best to write a struct-comparison function the
first time a given type of struct is compared; that way, the function
can include a complete if tree or switch statement in the case of
tagged unions.
-----------------------------------------------------------------------------
In the checkvarchanged code, take aliasing of VAR parameters into account.
For example, in "procedure p(s1 : string; var s2 : string)" p2c now avoids
copying s1 if s1 is not changed within p, but probably should also require
that s2 not change, or at least that s1 have been read and used before s2
is changed.
-----------------------------------------------------------------------------
Provide an option that tells the code generator to provide helpful
comments of its own when generated code may be obscure.
=============================================================================
Compact the various data structures. In particular, typical runs
show the majority of memory is used by Meanings and Symbols for
global and imported objects.
-----------------------------------------------------------------------------
The program wastes memory. Find ways to reduce memory usage, and to
avoid leaving dead records on the heap. (Garbage collection? Yuck!)
(Maybe GC between each function declaration would be okay.)
-----------------------------------------------------------------------------
Assign better names to temporaries. Also, could avoid making redundant
temporaries by generating a unique temporary every time one is needed,
then crunching them down at the end just before the declarations are
written out. Each temporary would maintain a list of all the other
temporaries (of the same type) with which it would conflict. (This
would avoid the current method's waste when several temps are created,
then most are cancelled.) Also, note that char STR1[10], STR2[20] can
be considered type-compatible and merged into STR[20].
-----------------------------------------------------------------------------
Don't generate _STR_xxx structure names if they aren't forward-referenced.
-----------------------------------------------------------------------------
Can optimize, e.g., "strpos(a,b) = 0" to a call to strstr() or strchr(),
even though these ANSI functions are not Pascal-like enough to use in
the general case.
-----------------------------------------------------------------------------
Complete the handling of "usecommas=0" mode.
-----------------------------------------------------------------------------
Optimize "s := strltrim(s)", "s := strrtrim(s)", and both together.
-----------------------------------------------------------------------------
Convert "(ch < 'a') or (ch > 'z')" to "!islower(ch)", and so on.
Also do "!islower(ch) && !isupper(ch)" => "!isalpha(ch)", etc.
-----------------------------------------------------------------------------
Find other cases in which to call mixassignments().
-----------------------------------------------------------------------------
The sequence: sprintf(buf + strlen(buf), "...", ...);
sprintf(buf + strlen(buf), "...", ...);
could be changed to a single sprintf.
Also, "sprintf(temp, "...%s", ..., buf); strcpy(buf, temp); (above);"
could be crunched down. (This comes from strinsert, then strappend.)
-----------------------------------------------------------------------------
If there is only one assignment to a structured function's return
variable, and that assignment is at the very end and assigns a local
variable to the return variable, then merge the variables.
(Example: name_list in netcmp's datastruct module. RET_name_list
should be renamed to namestr.)
-----------------------------------------------------------------------------
Have an option that causes if-then-else's to be replaced by ? :'s in
certain cases. If the branches of the if are either both returns or
both assignments to the same variable, which has no side effects, and
if the whole conditional will be simple enough to fit on one line when
printed, then the ? : transformation is okay.
-----------------------------------------------------------------------------
Have an option that makes limited use of variable initialization.
If the first statement of a function is an assignment of a constant
to a local variable, then the assignment is changed to an initialization.
(Non-constant initializers are probably too hard to check for safety.)
Should variables with initializers still be mixed?
(Example: valid_node_class in netcmp's datastructs module.)
File variable initialization is an especially good application for this.
-----------------------------------------------------------------------------
Have an option that finds cases of multiple assignment. For example:
a := x; b := x; => a = b = x;
a := x; b := a; => a = b = x;
(provided the objects in question have the same type).
-----------------------------------------------------------------------------
Need an option that causes $if$'s to change to #if's, instead of being
evaluated at translation-time. (This is *really hard* in general;
do it for some common cases, such as entire statements being commented
out, or fields being commented out of records.)
-----------------------------------------------------------------------------
Have an option that prevents generation and inclusion of .h files for
modules; instead, each module would contain extern declarations for
the things it imports. (Only declare names that are actually used---
this means the declarations will have to be emitted on a procedure-by-
procedure basis.)
-----------------------------------------------------------------------------
Extend the ExpandIncludes option to compile include files as separate
stand-alone C modules. The hard part is warning when the new module
uses a procedure which was declared static in an earlier C module.
Remember to re-emit all the proper #include's at the beginning.
Anything else?
-----------------------------------------------------------------------------
Consider an option where the variables in a varStruct are made into
globals instead, or into parameters if there are few of them.
-----------------------------------------------------------------------------
Perform a flow analysis after everything else is done; use this to
eliminate redundant checks, e.g., for nil pointers or non-open or
already-open files. Also eliminate redundant "j" variables for
strwrite statements.
-----------------------------------------------------------------------------
Need a method for simple pattern matching predicates in FuncMacros.
For example, allow special translations of functions where a given
argument is a known constant, or known to be positive, or two
arguments are the same, etc.
-----------------------------------------------------------------------------
Have some way to provide run-time templates for fixblock/fixexpr-like
applications. The user enters two C expressions A and B, possibly including
Prolog-like logical variables. If fixexpr sees an expression matching A,
it rewrites it into the form of B.
-----------------------------------------------------------------------------
Have an option to cause selected Pascal procedures or functions to be
expanded in-line. Do this either by generating the keyword "inline",
or by doing the expansion in the translator.
-----------------------------------------------------------------------------
Technically speaking, strcmp shouldn't be used to compute < and > for
strings on a machine with signed chars. Should we care?
-----------------------------------------------------------------------------
Have an option for creating a "display" of LINK pointers local to a
function. Should only create such pointers for static levels which are
referred to in the function body.
-----------------------------------------------------------------------------