- Newsgroups: comp.lang.perl
- Path: sparky!uunet!ftpbox!mothost!merlin.dev.cdx.mot.com!fendahl.dev.cdx.mot.com!mcook
- From: mcook@fendahl.dev.cdx.mot.com (Michael Cook)
- Subject: Re: C comment parser?
- Message-ID: <mcook.724208094@fendahl.dev.cdx.mot.com>
- Sender: news@merlin.dev.cdx.mot.com (USENET News System)
- Nntp-Posting-Host: fendahl.dev.cdx.mot.com
- Organization: Motorola Codex, Canton, Massachusetts
- References: <1268@galileo.rtn.ca.boeing.com>
- Date: Sun, 13 Dec 1992 00:54:54 GMT
- Lines: 37
-
- meb4593@galileo.rtn.ca.boeing.com (Michael Bain) writes:
-
- >I'm having difficulty developing the regular expression to remove C
- >comments from a variable (say $_) that contains a C function. (I'm
- >hacking a McCabe metric and am reading in a function at a time)
-
- >What I have so far seems to work *most* of the time, but oftentimes it
- >misses an ending C comment delimiter (*/) and ends up removing a lot of
- >code until it reaches the last delimiter.
-
- >Here's the RE:
- > s?/\*(.*\n)*.*\*/??; # remove comments
-
- >Does somebody have a better one that works?
-
- Remember that RE's are maximally matched (in Perl, and most Unix tools). If
- $_ has two or more comments in it, that s/// is going to delete them all in
- one swoop, and it'll also delete everything between the comments.
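-
- To make the greedy behaviour concrete, here is a small sketch that runs the
- pattern from the question over a made-up two-comment input. The sample text
- and the variable names are assumptions for illustration only, and the
- delimiters are written as braces rather than "?"; the pattern is unchanged.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample: two comments with real code between them.
my $src = "int a; /* first */\nint b;\n/* second */ int c;\n";

# The pattern from the question. (.*\n)* and .* are greedy, so the match
# starts at the first "/*" and runs on to the LAST "*/" in the string.
(my $greedy = $src) =~ s{/\*(.*\n)*.*\*/}{};

# Everything from "/* first" through "second */" is gone, "int b;" included.
print $greedy;
```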
-
- I can give you an RE that'll work. Here:
-
- s%/\*([^*]|\*+[^/*])*\*+/%%g
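-
- A quick way to check that pattern is to run it over input with more than one
- comment, including one with runs of stars inside it. The sample text below is
- made up. The second substitution uses a non-greedy quantifier, which assumes
- Perl 5 or later and is shown only for comparison. Neither pattern knows about
- C string literals, so a "/*" inside a string would still be treated as a
- comment.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample: two comments, the second containing runs of stars.
my $src = "int a; /* first */\nint b;\n/* second ** stars **/ int c;\n";

# The pattern from the post: inside a comment, accept any non-star, or a run
# of stars followed by something that is neither "/" nor "*"; then close on
# a run of stars followed by "/".
(my $clean = $src) =~ s%/\*([^*]|\*+[^/*])*\*+/%%g;

# Assumption: Perl 5 or later. ".*?" stops at the first "*/", and the /s
# modifier lets "." match newlines so multi-line comments are covered.
(my $lazy = $src) =~ s{/\*.*?\*/}{}gs;

print $clean;
print "same result\n" if $lazy eq $clean;
```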
-
- But RE's aren't really the way to go. If the comment is very big, Perl will dump
- core. (I tried it.)
-
- You can do it with index():
-
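- $pos = 0;
- # Find each "/*", then the first "*/" after it, and cut out that span.
- while (($pos = index($_, "/*", $pos)) >= 0)
- {
-     $end = index($_, "*/", $pos + 2);
-     $end >= 0 || die "unterminated comment";
-     # $pos is left alone, so the next search resumes where the comment was.
-     substr($_, $pos, $end - $pos + 2) = '';
- }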
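-
- For reuse, the same loop can be wrapped in a subroutine that works on a copy
- of its argument instead of $_. This is only a sketch: the name strip_comments
- and the sample input are hypothetical, and like the loop above it does not
- recognise "/*" inside C string literals.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# strip_comments is a hypothetical name; the body is the index() loop from
# the post, applied to a private copy of the text.
sub strip_comments {
    my ($text) = @_;
    my $pos = 0;
    while (($pos = index($text, "/*", $pos)) >= 0) {
        # Look for the closing delimiter just past the opening one.
        my $end = index($text, "*/", $pos + 2);
        die "unterminated comment" if $end < 0;
        # Delete the comment, delimiters included; $pos now points at the
        # text that followed it, so the search continues from there.
        substr($text, $pos, $end - $pos + 2) = '';
    }
    return $text;
}

# Made-up sample input.
my $src = "int a; /* first */\nint b;\n/* second */ int c;\n";
print strip_comments($src);
```

- Because index() only ever scans forward, this avoids the backtracking that
- the regular expression needs on a long comment.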
-
- Michael.
-