- Newsgroups: comp.sources.misc
- From: jjsc@inf.rl.ac.uk (John Cullen)
- Subject: v11i063: Program solution to the "missing comment terminator" problem!
- Message-Id: <8826@nfs4.rl.ac.uk>
- Date: 15 Mar 90 10:05:38 GMT
- Reply-To: jjsc@inf.rl.ac.uk ()
- Distribution: world
- Organization: Rutherford Appleton Laboratory, Informatics Department, U.K.
- Lines: 113
- Sender: allbery@uunet.UU.NET (Brandon S. Allbery - comp.sources.misc)
-
- Posting-number: Volume 11, Issue 63
- Submitted-by: jjsc@inf.rl.ac.uk (John Cullen)
- Archive-name: dangle/part01
-
-
- I am posting this program on behalf of a friend without access to Usenet news.
- I wasn't sure where I should post; however, since there has been a lot of
- discussion of late about the problems of inadvertently commenting out code
- (due to nested comments, loss of the terminating */, etc.), I thought that at the
- least it should go to comp.lang.c. Also, since it's of a general nature (and
- not all C programmers read comp.lang.c :-) I'm cross-posting to comp.sources.misc.
- [Cross-posting between moderated and unmoderated groups is not a good idea.
- I have not cross-posted to comp.lang.c. ++bsa]
-
- Any comments, suggestions, complaints (heaven forbid :-) to the author please,
- although if people have difficulty reaching him, I will forward messages for
- a while.
-
- Enjoy,
- John.
-
- PS. Please note that Barry cannot receive mail sent via the UUCP link at
- uk.ac.ukc - anyone sending mail via uucp, please send here and I'll forward.
-
- -------------------------------- dangle.c ---------------------------------
- I have often encountered problems with mailing programs due to some mailers
- truncating long lines - in fact you may find this program suffers as it is my
- style to use the full screen width. The commonest problem is due to a close-
- comment symbol being corrupted, with the result that the compiled program is
- somewhat shortened and less than functional!
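
A minimal, hypothetical illustration of the failure described above (it is not part of the posting, and the function name is invented for the example). The first comment has lost its terminator, so it runs on until the next close-comment symbol and silently swallows the assignment on the following line:

```c
/* Hypothetical example: the comment on the first statement is never closed,
   so it extends to the end of the comment on the second statement. */
int effective_limit(void)
{
    int limit = 10;        /* default limit
    limit = 20;            /* intended override */
    return limit;          /* the override never ran: still the default */
}
```

The function compiles cleanly as standard C and returns the default value, because C comments do not nest. Some compilers can warn about a `/*` inside a comment when warnings are enabled, but the code is still accepted.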
-
- I wrote this little program to scan suspect files, and to report on possible
- problems. It's up to the user to check the output, and decide if there is a
- true problem. It checks for comments and strings which bridge lines etc.
- It is overzealous in reporting, but it misses nothing in valid C, and it works for
- some other languages too.
-
- The technique used is that of a finite state machine, one that I have found to
- be very robust in the presence of invalid inputs; and one that recovers quickly.
- The original program was written for a Prime 9955, for which the CI compiler
- produces amazingly compact code at optimize 1 level. I must admit that I have
- fiddled with the source a bit, simply out of interest to see the effects.
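
The finite state machine idea can be sketched in a more conventional layout. This is a simplified sketch, not the author's code: the state names are borrowed from dangle.c, but `step` and `count_suspect_lines` are hypothetical helpers, and the sketch only counts lines that end inside a comment or quote instead of printing them.

```c
/* States mirror the names used in dangle.c; the layout here is illustrative. */
enum state { NORMAL, HYPER, QUOTED, SLASH, COMMENT, STAR };

/* Hypothetical helper: advance the machine by one character. */
static enum state step(enum state s, int c, int *quote)
{
    switch (s) {
    case HYPER:                     /* character after a backslash: skip it */
        return QUOTED;
    case QUOTED:                    /* close on the matching quote, or give up at end of line */
        if (c == *quote || c == '\n') return NORMAL;
        return c == '\\' ? HYPER : QUOTED;
    case SLASH:                     /* seen '/', waiting to see if '*' follows */
        if (c == '/') return SLASH;
        return c == '*' ? COMMENT : NORMAL;
    case COMMENT:                   /* inside a comment, looking for '*' */
        return c == '*' ? STAR : COMMENT;
    case STAR:                      /* seen '*' in a comment, '/' closes it */
        if (c == '*') return STAR;
        return c == '/' ? NORMAL : COMMENT;
    case NORMAL:
    default:
        if (c == '"' || c == '\'') { *quote = c; return QUOTED; }
        return c == '/' ? SLASH : NORMAL;
    }
}

/* Hypothetical helper: count newlines reached while a comment or quote is open. */
int count_suspect_lines(const char *src)
{
    enum state s = NORMAL;
    int quote = 0, suspect = 0;

    for (; *src != '\0'; src++) {
        /* Test the state before the newline itself is fed to the machine. */
        if (*src == '\n' && (s == QUOTED || s == COMMENT || s == STAR))
            suspect++;
        s = step(s, (unsigned char)*src, &quote);
    }
    return suspect;
}
```

For instance, `count_suspect_lines("int a; /* lost\nint b; /* found */\n")` counts one suspect line, since the first line ends with its comment still open. Resetting to NORMAL at a newline inside a quote is what lets the machine recover quickly after a missing quote, at the cost of misreading the rest of that one line.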
-
- Please feel free to use it as it stands, modify it, and use the technique for
- other purposes. If expanded, it could become a fast full C lexer. Have a go with
- it, and let me know how you got on.
-
- Barry
-
- ----------------------------------- cut here -----------------------------------
- #! /bin/sh
- # This file was wrapped with "dummyshar". "sh" this file to extract.
- # Contents: dangle.c
- echo extracting 'dangle.c'
- if test -f 'dangle.c' -a -z "$1"; then echo Not overwriting 'dangle.c'; else
- sed 's/^X//' << \EOF > 'dangle.c'
- X/* PROGRAM DANGLE: usage is: DANGLE <file_name> - no options . This program will
- X/* inspect C programs (and others) and report on comments which bridge several
- X/* lines, in case a closing quote has been omitted. It also checks strings and
- X/* quoted characters, and reports if they bridge lines, or a quote is missing.
- X/* The program operates as a finite state machine, using an enum state variable.
- X/* Author: GORMAN_B@UK.AC.LANCSP.P1 Date: February 15th 1990
- X
- X/******************************************************************************/
- X#include <stdio.h> /* standard i/o routines, EOF etc */
- X#ifndef __CI /* test for Prime CI compiler */
- X#define short int /* the use of long and short in this program is */
- X#define long int /* optimum for CI, what about other compilers? */
- X#endif
- X#if EOF<0
- X#define NOT_EOF(c) (c)>=0 /* slightly better code generated */
- X#define IS_EOF(c) (c)<0 /* by avoiding compare instruction */
- X#else
- X#define NOT_EOF(c) (c)!=EOF /* EOF should be -1, but may not be; */
- X#define IS_EOF(c) (c)==EOF /* so define something that will work */
- X#endif
- X#define case_NORMAL default /* save the compiler generating two instructions! */
- X/******************************************************************************/
- Xmain(argc,argv) int argc; char *argv[]; /* expects name of file to be given */
- X/******************************************************************************/
- X {short copy=0; long chr, line=0, quote; char buffer[162], *p=buffer;
- X FILE *input; enum {NORMAL,HYPER,QUOTED,SLASH,COMMENT,STAR} state=NORMAL;
- X
- X if(argc!=2||!(input=fopen(argv[1], "r"))) exit(1);
- X
- X printf("Checking file %s\n\n", argv[1]);
- X
- X do {*p++=chr=getc(input);
- X
- X if(chr=='\n'||IS_EOF(chr))
- X
- X {short query=(state>=QUOTED); *p=0; p=buffer; line++;
- X
- X if(query|copy) {copy=query; printf("%d: %s", line, p);}}
- X
- X switch(state) {
- X/*----------------------------finite state machine----------------------------*/
- Xcase_NORMAL: if(chr=='"'||chr=='\'') {state=QUOTED; quote=chr; break;} /* "' */
- X if(chr=='/') state=SLASH; break; /* check if start of a comment */
- X
- Xcase HYPER: state=QUOTED; break; /* allow to skip next char, including \n */
- X
- Xcase QUOTED: if(chr==quote||chr=='\n') {state=NORMAL; break;} /* note \n trap */
- X if(chr=='\\') state=HYPER; break; /* deal with \ in quotes */
- X
- Xcase SLASH: if(chr!='/') state=(chr=='*'?COMMENT:NORMAL); break; /* //* etc */
- X
- Xcase COMMENT:if(chr=='*') state=STAR; break; /* skip to *, \n dealt with */
- X
- Xcase STAR: if(chr!='*') state=(chr=='/'?NORMAL:COMMENT);}} /* deal with -> **/
- X/*----------------------------------------------------------------------------*/
- X while(NOT_EOF(chr));
- X
- X printf("\n");}
- X
- X/************************************THE END***********************************/
- EOF
- chars=`wc -c < 'dangle.c'`
- if test $chars != 2954; then echo 'dangle.c' is $chars characters, should be 2954 characters!; fi
- fi
- exit 0
-
-
- ---
- ===============================================================================
- John Cullen || JANET : jjsc@uk.ac.rl.inf
- System Support Group || ARPA : jjsc%inf.rl.ac.uk@nsfnet-relay.ac.uk
- Informatics Department || BITNET: jjsc%uk.ac.rl.inf@ukacrl
- Rutherford Appleton Laboratory || UUCP : {...!mcvax}!ukc!rlinf!jjsc
- Chilton, Didcot, Oxon. OX11 0QX || VOICE : +44 (0)235 821900 ext 6555
- ===============================================================================
- Your fortune cookie says:
- I'd give my right arm to be ambidextrous.
-
-