- Newsgroups: comp.lang.lisp
- Path: sparky!uunet!zaphod.mps.ohio-state.edu!cis.ohio-state.edu!news.sei.cmu.edu!fs7.ece.cmu.edu!crabapple.srv.cs.cmu.edu!ram
- From: ram+@cs.cmu.edu (Rob MacLachlan)
- Subject: Re: Text Processing
- Message-ID: <BzBK91.FLG.1@cs.cmu.edu>
- Sender: news@cs.cmu.edu (Usenet News System)
- Nntp-Posting-Host: lisp-pmax1.slisp.cs.cmu.edu
- Organization: School of Computer Science, Carnegie Mellon
- References: <723262882.AA00000@blkcat.UUCP>
- Date: Tue, 15 Dec 1992 21:12:30 GMT
- Lines: 22
-
- In article <723262882.AA00000@blkcat.UUCP> Bruce.Feist@f615.n109.z1.fidonet.org (Bruce Feist) writes:
- >I'm open to suggestions on how to process text files in Common LISP for
- >lexical analysis. Right now, I'm doing it character-by-character with
- >read-char and then using concat to combine the characters I read into tokens;
- >is there a better way?
-
- If you are doing:
-     (setq buffer (concatenate 'string buffer (string char)))
- you would do better to preallocate a string buffer and grow it as
- needed, then take SUBSEQ at the end of accumulation.
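- A minimal sketch of that buffer-growing approach, using an adjustable
- string with a fill pointer (READ-TOKEN and the whitespace test are
- illustrative names, not from the original post):

```lisp
;; Accumulate characters into one adjustable string instead of
;; allocating a fresh string per character.  VECTOR-PUSH-EXTEND grows
;; the buffer as needed; SUBSEQ copies out the finished token.
(defun read-token (stream)
  (let ((buffer (make-array 64 :element-type 'character
                               :adjustable t :fill-pointer 0)))
    (loop for char = (read-char stream nil nil)
          while (and char (not (member char '(#\Space #\Tab #\Newline))))
          do (vector-push-extend char buffer))
    ;; Return NIL at end of input or on an empty token.
    (when (plusp (fill-pointer buffer))
      (subseq buffer 0))))
```

- For example, (with-input-from-string (s "foo bar") (read-token s))
- returns "foo" after a single final copy, however long the token is.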
-
- >How much more efficient would it be to use read-line and separate tokens from
- >each other?
-
- It depends on the implementation's relative costs of stream operation overhead
- and string allocation. I would guess that in implementations with
- generational GC, the READ-LINE strategy could well be faster. Most
- implementations have internal operations to read a large number of characters
- into a preallocated buffer, which could get the best of both worlds at a
- portability cost.
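- For reference, the READ-LINE strategy might look like this sketch,
- which allocates one string per line and then carves tokens out of it
- with POSITION and SUBSEQ rather than per-character concatenation
- (TOKENIZE-LINE is an illustrative name, not from the original post):

```lisp
;; Split one line into space-separated tokens.  Each token is a single
;; SUBSEQ copy out of the line string; no character-at-a-time
;; concatenation is done.
(defun tokenize-line (line)
  (loop with len = (length line)
        ;; Find the start of the next token: first non-space character.
        for start = (position-if-not (lambda (c) (char= c #\Space)) line)
          then (position-if-not (lambda (c) (char= c #\Space))
                                line :start end)
        while start
        ;; The token ends at the next space, or at end of line.
        for end = (or (position #\Space line :start start) len)
        collect (subseq line start end)))
```

- So (tokenize-line "foo  bar baz") yields ("foo" "bar" "baz").  The
- allocation cost is one string per line plus one per token, which is
- the pattern that a generational GC tends to handle cheaply.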
-
- Rob
-