- Path: sparky!uunet!mcsun!uknet!comlab.ox.ac.uk!imc
- From: imc@comlab.ox.ac.uk (Ian Collier)
- Newsgroups: comp.lang.rexx
- Subject: Re: Regular expression syntax
- Message-ID: <2414.imc@uk.ac.ox.prg>
- Date: 8 Sep 92 16:12:49 GMT
- References: <19920908084344SEB1525@MVS.draper.com>
- Organization: Oxford University Computing Laboratory, UK
- Lines: 55
- X-Local-Date: Tuesday, 8th September 1992 at 5:12pm BST
- Originator: imc@msc2.comlab
-
- In article <19920908084344SEB1525@MVS.draper.com>, SEB1525@MVS.draper.com (Steve Bacher) wrote:
- >Now that the idea has been seriously suggested, why don't we try to
- >come up with a nice REXXy syntax for regular expressions?
-
- >How about this to kick off a discussion...
-
- >  Unix-style regexp               Proposed REXX-style expression
- >
- >  abc                             "abc"
- >  a.c                             "a" ANY "c"
- >  a*                              ZERO_OR_MORE("a")
- >  a.*                             "a" ZERO_OR_MORE(ANY)
- >  a[bc]d (them's sq. brackets)    "a" ("b" OR "c") "d"
- >  ^abc                            BEGIN "abc"
- >  abc$                            "abc" END
-
- Sorry, but I don't think this is "REXXy". Apart from anything else,
- ZERO_OR_MORE("a"), as well as being rather too verbose IMHO, looks like a
- function call but isn't. What kind of animal is it anyway? (note, I'm not
- asking what it does, but what "part of speech" it is in Rexx).
-
- Remember, we do already have some symbols, notably & (and) and | (or).
- That's not to say that I would prefer ".+" over any other method of notating
- "one arbitrary character or more", however, just that I don't think you are
- justified in using "OR" as a reserved word in patterns.
-
- >Then we could have a function MATCH(string,pattern) where pattern
- >is a regexp as above. And we could have PARSE PATTERN pattern WITH...
-
- I'd prefer to limit regular expressions to builtin functions, if we are to
- have them at all, really. Anyway, we might have trouble with the PARSE
- syntax if we wanted to know which characters matched the "." and the
- ".*" in "foo. .*blah".
-
- I think we are (mainly) wanting to provide regular expressions for unix
- users who are familiar with them. I've programmed in Rexx for years and
- never wanted one (though I have used them in other contexts, e.g. in awk).
- Therefore I suggest that a syntax familiar to those users (e.g. in the
- style of "egrep" et al, but certainly not emacs!) be used, rather than an
- artificially Rexx-ized version. It may not seem particularly obvious to you
- to type something like
-
- call match input,"foo(.) (.*)blah$","x","y" [*]
-
- but I'm sure the unix-types will love it (no disrespect, of course). ;-)
-
- Ian Collier
- Ian.Collier@prg.ox.ac.uk | imc@ecs.ox.ac.uk
-
- [*] In this example, the first parameter is the string to be matched, the
- second parameter is the regular expression, and the subsequent parameters
- are the names of variables to be assigned the substrings that matched the
- parenthesised expressions in the pattern, namely "." and ".*" respectively.
- A value indicating how successful the match was would probably be
- returned from the function.
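
The behaviour described in the footnote can be sketched roughly as follows. This is only an illustration in Python, since the proposed Rexx match() does not exist. The Python function name, the sample input string, and the choice to return the captured substrings as a tuple (rather than assigning them to named variables "x" and "y") are all assumptions. Python's re module stands in for an egrep-style matcher, which behaves the same way for the constructs used in this pattern.

```python
import re

def match(text, pattern):
    """Hypothetical stand-in for the proposed match() builtin.

    Returns (matched, groups). matched is True when the pattern is found
    somewhere in text. groups holds the substrings captured by each
    parenthesised sub-expression, in order, and is empty on failure.
    """
    m = re.search(pattern, text)
    if m is None:
        return False, ()
    return True, m.groups()

# Same pattern as in the post; the input string is made up for illustration.
ok, groups = match("foo1 some text blah", r"foo(.) (.*)blah$")
if ok:
    x, y = groups   # x is what "(.)" matched, y is what "(.*)" matched
    print(repr(x), repr(y))
```

Here the success indicator is a plain true/false value; the post leaves open what the real function would return.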
-