Current BUG List
December 1, 1992
(1) Only lexical class START could have over 100 tokens.
(2) Ambiguity messages were printing junk when the grammar did not
end up in lexclass START at the end of the file. (fix is too big
for this posting)
(3) This bug causes antlr to suffer segmentation faults/bus errors
when processing a grammar that has multiple lexical classes (say
7) and over 200 token definitions. This is the same as bug (1).
The bug is in misc.c at lines 58-64, where the inner loop reuses the
outer loop index i:
for (i=0; i<NumLexClasses; i++)
{
    lclass[i].exprs = (char **)
        realloc(lclass[i].exprs, tsize*sizeof(char *));
    require(lclass[i].exprs != NULL, "Ttrack: can't extend ExprStr");
    for (p= &lclass[i].exprs[tsize-more],i=1; i<=more; i++) *p++ = NULL;
}
Here is the fix (thanks to Mark Scheevel):
for (i=0; i<NumLexClasses; i++)
{
    int j;
    lclass[i].exprs = (char **)
        realloc(lclass[i].exprs, tsize*sizeof(char *));
    require(lclass[i].exprs != NULL, "Ttrack: can't extend ExprStr");
    for (p= &lclass[i].exprs[tsize-more],j=1; j<=more; j++) *p++ =NULL;
}
Thanks to Mark Scheevel and Tom Reyes (unisql!treyes@cs.utexas.edu)
for this bug report and fix.
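The pattern behind this fix can be shown in isolation. What follows is a
minimal sketch, not the misc.c code: extend_tables, tables, nclasses,
old_size and new_size are hypothetical names chosen for illustration. It
grows every per-class pointer array and clears the new slots with a
separate inner index, so the outer class index is never disturbed.

```c
#include <stdlib.h>

/* Grow each of nclasses pointer arrays from old_size to new_size entries
 * and set the new entries to NULL.  Returns 0 on success, -1 if realloc
 * fails.  All names are illustrative; this is not the PCCTS source. */
int extend_tables(char ***tables, int nclasses, int old_size, int new_size)
{
    int i, j;

    for (i = 0; i < nclasses; i++) {
        char **grown = (char **) realloc(tables[i], new_size * sizeof(char *));
        if (grown == NULL) return -1;
        tables[i] = grown;
        /* j, not i, walks the new slots: reusing i here would corrupt
         * the outer loop, which is the mistake described above. */
        for (j = old_size; j < new_size; j++) grown[j] = NULL;
    }
    return 0;
}
```

With a shared index, the outer loop would resume from wherever the inner
loop stopped, so some classes would be skipped or indexed out of range.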
(4) When one.c encountered an error, it printed random garbage.
one.c: The line in the 'extract()' routine which reads:
fprintf(stderr,"unbag: line %d: bad file format: %s\n", stop, line);
should read:
fprintf(stderr,"unbag: line %d: bad file format: %s\n", line, text);
Thanks to Fred Scholldorf (scholldorf@nuclear.physics.sunysb.edu).
Page 1
PCCTS
(5) When compiling antlr on an MC 680x0 system with GNU gcc (2.1),
you have to add -fwritable-strings to CFLAGS. Otherwise, antlr
will dump core.
From: pia@hotmama.toppoint.de
(6) Regarding the C front-end example provided with ANTLR, line 299
must be removed from main.c:
fprintf(" %s\n", a->data.s.name);
which is obviously wrong without a stream parameter and is a duplicate
of the line after it anyway. Another cut-n-paste error, I guess.
(7) A bug in set.c allowed malloc() and friends to be called with a
size argument of 0, which caused problems on some machines. The
fix causes malloc() to be called less often, resulting in slight
speed improvements to DLG and ANTLR.
file set.c INSERT
if ( n == 0 ) return t; /* TJP 4-27-92 fixed for empty set */
in func set_and() in context:
n = (b.n > c.n) ? c.n : b.n;
--->[INSERT]
set_ext(&t, n);
and in func set_dif() in context:
n = (b.n <= c.n) ? b.n : c.n ;
--->[INSERT]
set_ext(&t, b.n);
in func set_ex() INSERT
if ( n == 0 ) return;
in context:
if ( a->n == 0 )
{
------->[INSERT]
a->setword = (unsigned *) calloc(n, BytesPerWord);
Also note that an extra, slightly different copy of set.c was included
in the DLG bag. This version is not used by the makefile in the DLG
directory and can be ignored.
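The zero-size guard can be sketched on its own. The code below is an
illustration under assumptions, not the set.c source: word_set and
word_set_extend are hypothetical names, and the layout of the real set
type may differ. It shows an extend routine that returns early for a
request of 0 words, so the allocator is never asked for 0 bytes.

```c
#include <stdlib.h>

/* A hypothetical growable word array, loosely modeled on the idea of a
 * bit set; it is not the set type defined in set.c. */
typedef struct {
    unsigned *words;   /* NULL while the set is empty */
    unsigned  n;       /* number of words allocated   */
} word_set;

/* Ensure the set holds at least n words.  Returns 0 on success and -1 on
 * allocation failure.  A request for 0 words returns immediately, so
 * calloc()/realloc() are never called with a size of 0. */
int word_set_extend(word_set *s, unsigned n)
{
    unsigned *p;
    unsigned i;

    if (n == 0 || n <= s->n) return 0;     /* nothing to allocate */

    if (s->n == 0) {
        p = (unsigned *) calloc(n, sizeof(unsigned));
        if (p == NULL) return -1;
    } else {
        p = (unsigned *) realloc(s->words, n * sizeof(unsigned));
        if (p == NULL) return -1;
        for (i = s->n; i < n; i++) p[i] = 0;   /* clear the new words */
    }
    s->words = p;
    s->n = n;
    return 0;
}
```

The early return also skips the allocator call entirely for empty sets,
which is where the small speed gain mentioned above would come from.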
(8) ANTLR generates unknown escape sequences in scan.c and err.c
because it dumps your token definitions verbatim to the output
files. You get messages like:
err.c:113: warning: unknown escape sequence `\+'
One user, ggf@saifr00.cfsat.honeywell.com (Gary Frederick), overcame
this by:
> I 'escaped' the above lines by doing this to err.c
> /* 04 */ "\\*/",
> /* 05 */ "\\>\\>",
These are only warnings, but the compiler drops the unknown backslash,
so you get "+" instead of "\+" when you print a token in your program.
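One way a generator could avoid the warning is to escape each token
definition before writing it into a C string literal. The function below
is a sketch of that idea only; escape_for_c_literal is a hypothetical
name and is not part of ANTLR or DLG.

```c
#include <stddef.h>

/* Copy src into dst as the body of a C string literal, doubling each
 * backslash and escaping each double quote.  dstsize is the capacity of
 * dst including the terminating NUL.  Returns 0 on success, -1 if dst is
 * too small.  The name and interface are illustrative only. */
int escape_for_c_literal(const char *src, char *dst, size_t dstsize)
{
    size_t used = 0;

    for (; *src != '\0'; src++) {
        size_t need = (*src == '\\' || *src == '"') ? 2 : 1;
        if (used + need + 1 > dstsize) return -1;
        if (need == 2) dst[used++] = '\\';
        dst[used++] = *src;
    }
    if (used + 1 > dstsize) return -1;
    dst[used] = '\0';
    return 0;
}
```

Given the two-character token text \+, the function produces the three
characters \\+, which a C compiler reads back as \+ without a warning.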
(9) In order to compile with gcc version 1.37.1 you will need a new
version of proto.h called proto.h.new which can be obtained via
this mail server. In addition, I had to do the following things:
o Use the -fwritable-strings gcc option.
o In file fset2.c at line 25, add
#ifdef __STDC__
Tree *tmake(Tree *root, ...);
#else
Tree *tmake();
#endif
o Remove the definition of scarfPAction():
char *scarfPAction();
from the top of files: antlr.c and scan.c (right before #include of
attrib.h).
o In file trax.h, change
#ifdef ANSI
to
#ifdef __STDC__
Also, change the definition of mFree to:
void mFree(void *, char *, int);
o In file lexhelp.c, change line 41 to read
int begin, end;
which changes them from chars to ints.
o In file bits.c, line 295 change eMsg1 to eMsgd:
require(j<NumLexClasses, eMsgd("No label or expr for token %d",i));
o In file gen.c, change line 56 to
static void dumpRetValAssign(char *, char *);
And, add
static dumpAfterActions(FILE *output);
right after it, on line 57. Add the non-ANSI version at line 60
static dumpAfterActions();
o In file misc.c, change line 313 to read:
void *e;
o In file trax.c in the support/trax directory, change line 401 to
read:
pChk( (char *)p+sizeof(Trax) );
Also, change line 232 to read:
void *p;
I think that that includes everything. Good luck.
(10) The manual references a set of memory allocation debugging
routines called "trax.h". It indicates that they are in the support
directory. This is not the case. If you are interested, mail to
'parrt@ecn.purdue.edu' (a human) for a copy.
(11) As per the manual, PCCTS had a problem with multiple ANTLR macro
invocations. This has been fixed (we think). In file dlgauto.h
in the pccts/h directory make the following additions:
zzrdstream( f )
FILE *f;
{
    zzline = 1;
    zzstream_in = f;
    zzfunc_in = NULL;
    zzcharfull = 0; /* TJP added May 1992 */ <-------- ADD
}
zzrdfunc( f )
int (*f)();
{
    zzline = 1;
    zzstream_in = NULL;
    zzfunc_in = f;
    zzcharfull = 0; /* TJP added May 1992 */ <-------- ADD
}
You should be able to do this:
parse(f1,f2)
FILE *f1, *f2;
{
    ANTLR(grammar(), f1);
    ANTLR(grammar(), f2);
}
Before, the second invocation of the ANTLR macro did not know that it
needed to get another character before beginning. Let us know if this
screws up anything else. Be aware that switching between input
streams will not work because characters are lost when switching to a
new stream.
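The role of the added reset can be seen in a stripped-down reader with
one character of lookahead. This is a sketch under assumptions, not the
dlgauto.h code: the reader struct and the reader_* functions are
hypothetical, and only the charfull field echoes a name from the fix.

```c
#include <stdio.h>

/* A hypothetical one-character-lookahead reader.  The field names echo
 * the variables mentioned above, but this is not the dlgauto.h code. */
typedef struct {
    FILE *in;        /* current input stream                    */
    int   ch;        /* buffered lookahead character            */
    int   charfull;  /* nonzero when ch holds a valid character */
    int   line;      /* current line number                     */
} reader;

/* Attach a new stream.  Clearing charfull forces the next peek to read
 * from the new stream instead of reusing a stale character. */
void reader_attach(reader *r, FILE *f)
{
    r->in = f;
    r->line = 1;
    r->charfull = 0;
}

/* Return the next character without consuming it. */
int reader_peek(reader *r)
{
    if (!r->charfull) {
        r->ch = getc(r->in);
        r->charfull = 1;
    }
    return r->ch;
}

/* Consume and return the next character. */
int reader_next(reader *r)
{
    int c = reader_peek(r);
    r->charfull = 0;
    if (c == '\n') r->line++;
    return c;
}
```

Note that a character already peeked from the old stream is discarded by
reader_attach, which matches the caveat above about characters being
lost when switching streams.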
(12) Character delimiters within ANTLR actions cause problems: e.g.
<< '"' >>
(13) Marlin Prowell found a bug in the parameter passing code
generation. He reports:
I traced the error to the strmember() routine (in lex.c) which
uses too simple an algorithm to test if one string is contained
within another. The routine is only used to determine if the
var-part of $var is contained in the parameter or result strings,
as in:
strmember ("int x, int y", "i");
The strmember function will return true if "i" matches any letter, not
just when "i" matches an entire word. It should return true when the
second parameter is a *word* in the first parameter. I include below
a revised (and overly cautious) version of strmember. This fixes the
problem I described above, and PCCTS now produces correct code for the
example.
/* check to see if string e is a word in string s */
int
strmember(s, e)
char *s, *e;
{
    register char *p;
    require(s!=NULL&&e!=NULL, "strmember: NULL string");
    if ( *e=='\0' ) return 1; /* empty string is always member */
    do {
        /* skip to the start of the next word in s */
        while ( *s!='\0' && !isalnum(*s) && *s!='_' )
            ++s;
        p = e;
        while ( *p!='\0' && *p==*s ) {p++; s++;}
        if ( *p=='\0' ) {
            /* all of e matched; accept only at a word boundary */
            if ( *s=='\0' ) return 1;
            if ( !isalnum (*s) && *s != '_' ) return 1;
        }
        /* skip the rest of the current word */
        while ( isalnum(*s) || *s == '_' )
            ++s;
    } while ( *s!='\0' );
    return 0;
}
Marlin Prowell
mbp@nyssa.wa7ipx.ampr.org
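The whole-word test can also be written by splitting the string into
identifiers and comparing each one. The version below is an alternative
sketch in ANSI C, not the routine from lex.c: word_member and
is_word_char are hypothetical names, and it assumes the word being
searched for is a single identifier.

```c
#include <ctype.h>
#include <string.h>

/* Identifier characters: letters, digits and underscore. */
static int is_word_char(int c) { return isalnum(c) || c == '_'; }

/* Return 1 if word occurs in s as a whole identifier, 0 otherwise.
 * An empty word is treated as a member.  Illustrative only. */
int word_member(const char *s, const char *word)
{
    size_t n = strlen(word);

    if (n == 0) return 1;
    while (*s != '\0') {
        const char *start;

        /* skip separators, then measure the next identifier */
        while (*s != '\0' && !is_word_char((unsigned char) *s)) s++;
        start = s;
        while (is_word_char((unsigned char) *s)) s++;
        if ((size_t) (s - start) == n && strncmp(start, word, n) == 0)
            return 1;
    }
    return 0;
}
```

For the example above, word_member("int x, int y", "i") returns 0 and
word_member("int x, int y", "x") returns 1.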
(14) A bug in ANTLR code generation for k>=2 was found. The bug was
due to the grammar analysis phase's inability to detect invalid
LL(k) FIRST trees. For example,
a : A A
| b B
;
b : {C} A
;
generated invalid code:
...
if ( (LA(1)==A) && (LA(2)==A) && !(LA(1)==A) ) {
    zzmatch(A); zzCONSUME;
    zzmatch(A); zzCONSUME;
}
else if ( (LA(1)==A || LA(1)==C) && (LA(2)==A || LA(2)==B) ) {
    b();
    zzmatch(B); zzCONSUME;
}
...
This has been corrected to generate (in my latest version):
...
if ( (LA(1)==A) && (LA(2)==A) ) {
    zzmatch(A); zzCONSUME;
    zzmatch(A); zzCONSUME;
}
else if ( (LA(1)==A || LA(1)==C) && (LA(2)==A || LA(2)==B) ) {
    b();
    zzmatch(B); zzCONSUME;
}
...
Terence Parr
(15) ANTLR attempts to read from a closed file sometimes. This was
caused by ANTLR's insistence upon always getting the next k
tokens of lookahead instead of upon demand. Upon EOF, the
current input file was closed during the EOF lexical action and
then ANTLR tried to fill its lookahead queue. This has been
fixed in our version, but appears not to be much trouble for anyone
since we haven't heard anything. Next release will have many
fixes.
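The difference between eager and on-demand lookahead can be sketched as
follows. This is an illustration under assumptions, not ANTLR's
implementation: K, TOK_EOF, lookahead, la_init, la_peek, la_consume,
array_source and array_next are all hypothetical names. Tokens are
fetched only when a caller asks for them, and the source is never called
again once it has reported end of file.

```c
#define K 2                /* lookahead depth; illustrative value */
#define TOK_EOF (-1)       /* illustrative end-of-file token      */

typedef int (*token_source)(void *state);

typedef struct {
    token_source next;     /* fetches one token from the input    */
    void        *state;    /* opaque state passed to next         */
    int          buf[K];   /* buffered tokens, oldest first       */
    int          count;    /* number of valid entries in buf      */
    int          at_eof;   /* set once the source returned EOF    */
} lookahead;

void la_init(lookahead *la, token_source next, void *state)
{
    la->next = next;
    la->state = state;
    la->count = 0;
    la->at_eof = 0;
}

/* Return the i-th token of lookahead, 1 <= i <= K, fetching only as
 * many tokens as needed.  After EOF the source is not called again. */
int la_peek(lookahead *la, int i)
{
    while (la->count < i) {
        int t = la->at_eof ? TOK_EOF : la->next(la->state);
        if (t == TOK_EOF) la->at_eof = 1;
        la->buf[la->count++] = t;
    }
    return la->buf[i - 1];
}

/* Discard the current token. */
void la_consume(lookahead *la)
{
    int j;

    if (la->count == 0) (void) la_peek(la, 1);
    for (j = 1; j < la->count; j++) la->buf[j - 1] = la->buf[j];
    la->count--;
}

/* Example source: hands out ints from an array, then TOK_EOF, and
 * counts how many times it was called. */
typedef struct { const int *toks; int n; int pos; int calls; } array_source;

int array_next(void *state)
{
    array_source *s = (array_source *) state;

    s->calls++;
    return s->pos < s->n ? s->toks[s->pos++] : TOK_EOF;
}
```

Because the source is consulted only inside la_peek and never after EOF,
an end-of-file action that closes the input cannot be followed by
another read.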
(16) In charptr.h, the macro zzdef0 was incorrect. The correct macro
is:
#define zzdef0(a) {*(a)=NULL;}
Also, the zzcr_attr() function didn't check for the out of memory
condition. Add
if ( *a == NULL ) {fprintf(stderr, "zzcr_attr: out of memory!\n"); exit(-1);}
right after the call to malloc().
(17) The PASCAL example expression grammar was incorrect. Rules expr,
simpleExpr, and term use {...} optional braces when (...)* closure
should be used; e.g.
term : factor {("\*" | "/" | Div | Mod | And ) factor} ;
should be
term : factor (("\*" | "/" | Div | Mod | And ) factor)* ;
(18) Chris Song, dsong@ncsa.uiuc.edu, National Center for
Supercomputing Applications; found this bug:
line 797 of antlr/misc.c:
p->ntype<=NumJuncTypes ...
should be NumNodeTypes
(19) Chris Song, dsong@ncsa.uiuc.edu, National Center for
Supercomputing Applications; found this bug:
antlr/lex.c: line 52-57 of genLexDescr
The first node in the linked list LexActions is a sentinel (see
list_add()). So it should be skipped. The correction is as follows:
if (LexActions != NULL)
{
    for (p = LexActions->next; p!=NULL; p=p->next)
    {
        ...
    }
}
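The sentinel convention is easy to get wrong, so here is a small
self-contained sketch of it. The list_node type and the list_append and
list_length functions are hypothetical and are not the PCCTS list
routines; they only assume, as the report states, that the first node
carries no element.

```c
#include <stdlib.h>

/* A hypothetical singly linked list whose first node is a sentinel. */
typedef struct list_node {
    struct list_node *next;
    void             *elem;
} list_node;

/* Append elem to the list, creating the sentinel on first use.
 * Returns 0 on success, -1 on allocation failure. */
int list_append(list_node **list, void *elem)
{
    list_node *n, *p;

    if (*list == NULL) {
        *list = (list_node *) calloc(1, sizeof(list_node));  /* sentinel */
        if (*list == NULL) return -1;
    }
    n = (list_node *) calloc(1, sizeof(list_node));
    if (n == NULL) return -1;
    n->elem = elem;
    for (p = *list; p->next != NULL; p = p->next)
        ;                                   /* find the last node */
    p->next = n;
    return 0;
}

/* Count the real elements.  The walk starts at list->next so that the
 * sentinel, which holds no element, is skipped. */
int list_length(const list_node *list)
{
    int count = 0;
    const list_node *p;

    if (list != NULL)
        for (p = list->next; p != NULL; p = p->next) count++;
    return count;
}
```

Starting the walk at the list head instead of list->next would visit the
sentinel and treat its empty elem field as a real entry.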
(20) Chris Song, dsong@ncsa.uiuc.edu, National Center for
Supercomputing Applications; found this bug:
dlg/automata.c: line 124
last_done = NFA_NO(d_state);
The NFA_NO should be DFA_NO?
(21) LL(k) grammar analysis didn't make mucho copies of EOF when it
needed to:
type : type_name
| {class_name} type_name
;
type_name: ID ;
class_name: ID ;
First alternative should see ( ID ( @ @ ) ), not ( ID @ ).
Also, bug in left_factor: ID ( ID ( @ @ ) became ID not ( ID ( @ @ )
).
Terence
(22) DLG generated spurious crap upon bad argument lists such as "DLG
-j".
(23) ANTLR's -p option to print out a grammar without actions had a
bug that did not print alternative '|' operators for {...}
subrules. In addition, this option has been improved to not
print an extra level of (...) for rules.
(24) A bug in set_dif() of set.c caused DLG to generate incorrect
scanners. When set_dif() had the empty set subtracted from a
non-empty set, it incorrectly returned an empty set. This
prevented DLG from generating correct code for [] and ~[].
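The expected behavior of set difference with an empty operand can be
sketched on plain word arrays. This is an illustration only, not the
set.c code: bits_difference and its parameters are hypothetical names.
Words of b that c does not cover are copied through unchanged, so
subtracting an empty set returns b itself.

```c
#include <stddef.h>

/* result = b - c for bit sets stored as word arrays.  b has bn words and
 * c has cn words; result must have room for bn words.  Illustrative
 * only; the set type in set.c may be laid out differently. */
void bits_difference(unsigned *result,
                     const unsigned *b, size_t bn,
                     const unsigned *c, size_t cn)
{
    size_t i;
    size_t common = (bn < cn) ? bn : cn;

    for (i = 0; i < common; i++) result[i] = b[i] & ~c[i];
    for (; i < bn; i++) result[i] = b[i];   /* words c does not cover */
}
```

With cn equal to 0 the first loop does nothing and the second copies all
of b, which is the case the original set_dif() got wrong.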