NetNews Usenet Archive 1992 #16

Internet Message Format | 1992-07-22 | 1.2 KB

Path: sparky!uunet!usc!cs.utexas.edu!qt.cs.utexas.edu!yale.edu!yale!mintaka.lcs.mit.edu!ai-lab!life!jba
From: jba@ai.mit.edu (Jonathan Amsterdam)
Newsgroups: comp.lang.perl
Subject: back-references in regexps
Message-ID: <JBA.92Jul22141349@kix.ai.mit.edu>
Date: 22 Jul 92 18:13:49 GMT
Sender: news@ai.mit.edu
Distribution: comp
Organization: MIT Artificial Intelligence Laboratory
Lines: 22

I've been looking closely at Perl regular expressions and have noticed
some interesting things about back references:

1. It is possible to have a valid back-reference (i.e. one that doesn't
   always fail) inside its own paren. E.g.
        /(a(b|\1))+/

2. It is possible to have a valid back-reference before its paren. E.g.
        /(\2|(\w))+/

3. A back-reference can succeed even if its paren was part of a failure
   earlier in the scan. E.g.
        /((ab)d|abc)\2/
   will match "abcab".

I'm particularly interested in 3. Is it a commonly known fact? Is it
considered a feature or a sort of fall-through-the-cracks don't-care kind
of thing? (One might at first believe, or even wish, that a back-reference
to something that failed would also fail.)

Does anybody have any scripts that use any of the above three properties?

Thanks,
Jonathan Amsterdam
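
The three observations above can be tried out with a short script. This is only a sketch under stated assumptions: the test strings "abab" and "xx" and the helper name try_case are made up for illustration, and only "abcab" comes from the post itself. The script reports what the running Perl does rather than asserting a result, since the outcome for the third pattern may depend on the Perl version.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Each entry: [description, pattern, test string].
# "abab" and "xx" are assumed test inputs; "abcab" is from the post.
my @cases = (
    [ 'back-reference inside its own paren', qr/(a(b|\1))+/,    'abab'  ],
    [ 'back-reference before its paren',     qr/(\2|(\w))+/,    'xx'    ],
    [ 'back-reference to a failed branch',   qr/((ab)d|abc)\2/, 'abcab' ],
);

# Hypothetical helper: run one pattern against one string and
# describe the result as text.
sub try_case {
    my ($re, $str) = @_;
    return $str =~ $re ? "matched '$&'" : 'no match';
}

for my $case (@cases) {
    my ($what, $re, $str) = @$case;
    printf "%-38s %-8s => %s\n", $what, $str, try_case($re, $str);
}
```

Running this under different Perl releases is one way to see whether property 3 is stable behavior or an accident of a particular implementation.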