- Newsgroups: comp.lang.perl
- Path: sparky!uunet!ferkel.ucsb.edu!taco!gatech!europa.asd.contel.com!howland.reston.ans.net!wupost!cs.utexas.edu!qt.cs.utexas.edu!yale.edu!ira.uka.de!math.fu-berlin.de!news.netmbx.de!Germany.EU.net!mcsun!sunic!seunet!enea!sommar
- From: sommar@enea.se (Erland Sommarskog)
- Subject: Another RE query
- Message-ID: <1992Dec13.150325.9754@enea.se>
- Organization: Enea Data AB
- Date: Sun, 13 Dec 1992 15:03:25 GMT
- Lines: 30
-
- According to the FAQ you cannot handle balanced text with Perl's
- regular expressions, and it seems that what I'm trying to handle
- is just that. But I'd like to check with the experts, just
- in case.
-
- Basically what I'm writing is a simple text formatter in which
- user-defined strings are converted to escape-sequences for enscript.
- Those strings occur in pairs to mark the start and end of the text
- to be set in a different font. Anyway, innocently I tried something
- like:
-
- s!${start}((.|\n)*)${stop}!${esc}F${ch1}$1${esc}F${ch2}!g
-
- But of course I got bitten when the pair occurs twice in the same
- paragraph, since the match then runs on to the last $stop. (I read
- text paragraph by paragraph, a deliberate restriction.) If $stop
- were guaranteed to be one character long, it would be trivial:
-
- s!${start}([^${stop}]*)${stop}!${esc}F${ch1}$1${esc}F${ch2}!g
-
- But $start and $stop can be any length, although I would only expect
- one or two characters in practice. But two characters is enough to
- kill the idea. (Another problem is that the pairs may be nested,
- but I ignore that case.)
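
For readers on a later Perl: Perl 5 and up provide a non-greedy quantifier, `*?`, which covers the multi-character case without a negated character class. The following is only a minimal sketch under that assumption. The marker strings, the escape character, the font codes and the name `mark_fonts` are all made-up placeholders, not values from this post, and nested pairs are still not handled.

```perl
use strict;
use warnings;

# Placeholder values; the post does not give the real markers or codes.
my ($start, $stop) = ('<<', '>>');
my ($esc, $ch1, $ch2) = ("\e", 'B', 'R');

# Hypothetical helper: wrap each $start...$stop pair in font escapes.
sub mark_fonts {
    my ($para) = @_;
    # \Q...\E quotes any regex metacharacters in the markers.
    # .*? matches as little as possible, and /s lets . match newlines,
    # so each start marker pairs with the nearest following stop marker.
    $para =~ s/\Q$start\E(.*?)\Q$stop\E/${esc}F${ch1}$1${esc}F${ch2}/gs;
    return $para;
}

print mark_fonts("plain <<bold>> plain <<more\nbold>> end\n");
```

Because the match stops at the first `$stop` it meets, two pairs in the same paragraph are converted separately.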
-
- Since it seems this can't be done with regular expressions, I've
- redone the whole thing with index and substr like I would have
- done in a more traditional language. Still, I would like to check
- that I didn't miss anything...
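
The index/substr approach described above could look roughly like the sketch below. It is an illustration, not the code actually written for the formatter: the sub name `mark_fonts_index`, its parameters and the sample markers are assumptions, and nested pairs are ignored, as in the post.

```perl
use strict;
use warnings;

# Hypothetical helper: replace each $start...$stop pair in $para with
# $on . text . $off, scanning left to right with index and substr.
sub mark_fonts_index {
    my ($para, $start, $stop, $on, $off) = @_;
    my $out = '';
    my $pos = 0;    # first character not yet copied to $out
    while ((my $s = index($para, $start, $pos)) >= 0) {
        my $body = $s + length($start);          # where the marked text begins
        my $e = index($para, $stop, $body);      # nearest stop marker after it
        last if $e < 0;                          # unpaired start: leave the rest as is
        $out .= substr($para, $pos, $s - $pos);  # text before the pair
        $out .= $on . substr($para, $body, $e - $body) . $off;
        $pos = $e + length($stop);
    }
    return $out . substr($para, $pos);           # whatever follows the last pair
}

print mark_fonts_index("a <<b>> c <<d>> e\n", '<<', '>>', '[', ']');
```

Since the search for `$stop` starts just past the `$start` marker, this also works when the two markers are the same string.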
- --
- Erland Sommarskog - ENEA Data, Stockholm - sommar@enea.se
-