Syntax for Regular Expressions: Current Syntax


The new regular expression language is used with JavaStar API methods whose names end with RE. For example, JSComponent.hasMemberRE() and JS.dialogRE() both use this syntax.

This language is a subset of Perl's regular expression syntax, similar to that of UNIX egrep.

Alternatives

A regular expression can consist of one or more alternatives. An alternative is a sequence of items. Alternatives are separated by | (the pipe symbol). An alternative matches if all the items match in the order they occur. The regular expression matches if any of the alternatives match.
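The rule above can be tried out with a minimal sketch. It uses the JDK's own String.matches() as a stand-in, because that method also requires the whole string to match; it is not the JavaStar matcher, and the class name, helper name, and button labels are hypothetical.

```java
// Illustration only: String.matches() is the JDK's whole-string matcher,
// used here as a stand-in. It is NOT the JavaStar RE implementation.
final class AlternativesSketch {

    // Hypothetical helper: true when the whole of text matches regex.
    static boolean matchesWhole(String regex, String text) {
        return text.matches(regex);
    }

    public static void main(String[] args) {
        // Three alternatives separated by the pipe symbol.
        String regex = "OK|Cancel|Apply";

        System.out.println(matchesWhole(regex, "Cancel"));    // true: second alternative matches
        System.out.println(matchesWhole(regex, "Can"));       // false: no alternative matches the whole string
        System.out.println(matchesWhole(regex, "OK Cancel")); // false: alternatives are not concatenated
    }
}
```

Each alternative is tried against the entire string, so a string that merely contains or begins one of the alternatives does not match.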

Items

An item is either an assertion or a quantified atom. Assertions are:

Assertions in regular expressions

Expression Use
\b Matches at a word boundary, that is, between \w and \W or between \W and \w.
\B Matches at a non-word boundary.
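As a sketch of how the two assertions behave, the following example again uses the JDK's String.matches() as a stand-in for the JavaStar matcher, on the assumption that \b and \B carry the same Perl meaning in both. The class name, helper name, and sample strings are hypothetical.

```java
// Illustration only: uses the JDK regex engine, not the JavaStar RE implementation.
final class AssertionSketch {

    // Hypothetical helper: true when the whole of text matches regex.
    static boolean matchesWhole(String regex, String text) {
        return text.matches(regex);
    }

    public static void main(String[] args) {
        // In Java source, each regex backslash is written as two backslashes.
        // Regex: Save\b \bAs  (a word boundary on each side of the space)
        System.out.println(matchesWhole("Save\\b \\bAs", "Save As")); // true

        // Regex: Save\B \BAs  (requires a non-boundary next to the space)
        System.out.println(matchesWhole("Save\\B \\BAs", "Save As")); // false

        // Regex: c\Bat  (between two word characters there is no boundary)
        System.out.println(matchesWhole("c\\Bat", "cat"));            // true

        // Regex: c\bat  (demands a boundary inside the word)
        System.out.println(matchesWhole("c\\bat", "cat"));            // false
    }
}
```

An assertion consumes no characters; it only constrains the position at which the neighboring atoms may match.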

A quantified atom is an atom optionally followed by one of the quantifiers below, which indicates how many times the atom must or may occur. If no quantifier is given, the atom must occur exactly once.

Quantifiers are:

Quantifiers in regular expressions

Expression Use
* 0 or more times
+ 1 or more times
? 0 or 1 time
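The three quantifiers can be compared side by side in a short sketch. As before, the JDK's String.matches() stands in for the JavaStar matcher, and the class name, helper name, and sample strings are hypothetical.

```java
// Illustration only: uses the JDK regex engine, not the JavaStar RE implementation.
final class QuantifierSketch {

    // Hypothetical helper: true when the whole of text matches regex.
    static boolean matchesWhole(String regex, String text) {
        return text.matches(regex);
    }

    public static void main(String[] args) {
        // b* : the atom b may occur 0 or more times.
        System.out.println(matchesWhole("ab*c", "ac"));    // true
        System.out.println(matchesWhole("ab*c", "abbbc")); // true

        // b+ : the atom b must occur 1 or more times.
        System.out.println(matchesWhole("ab+c", "ac"));    // false
        System.out.println(matchesWhole("ab+c", "abc"));   // true

        // b? : the atom b may occur 0 or 1 time.
        System.out.println(matchesWhole("ab?c", "abc"));   // true
        System.out.println(matchesWhole("ab?c", "abbc"));  // false

        // No quantifier: the atom b must occur exactly once.
        System.out.println(matchesWhole("abc", "ac"));     // false
    }
}
```

A quantifier applies only to the atom immediately before it, so in ab*c only b is repeated.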

An atom is:

Excluded Syntax

The following Perl regular expression syntax is not included:

Excluded Perl regular expressions

Character Use
^ and $ These are not useful, since the library always matches whole strings.
{n,m} {n,} {n} These are not currently implemented.
\1, etc. (backreferences to substrings) These are not implemented.
\0 \033 \x7f \cD These are not necessary. JavaStar strings can contain any legal Java character: a character that has no special meaning in the regular expression language can be written as is, and a character that does can be included by prefixing it with a backslash. You can also use the Java Unicode escapes \u0000 through \uFFFF, which are interpreted by Java rather than by the regular expression language. That is, write "\u0000", not "\\u0000".
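Because the counted forms are not implemented, one possible workaround is to spell out the repetition with the quantifiers that are available. The sketch below does this for a single atom and also shows how the Java compiler treats a Unicode escape in a string literal. The class and method names are hypothetical and not part of the JavaStar API; the sketch assumes the atom passed in is a single character or other self-contained atom.

```java
// Illustration only: hypothetical helpers, not part of the JavaStar API.
final class ExcludedSyntaxSketch {

    // Equivalent of atom{n}: n copies of the atom.
    static String repeatExactly(String atom, int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must not be negative");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(atom);
        }
        return sb.toString();
    }

    // Equivalent of atom{n,}: n copies followed by atom*.
    static String repeatAtLeast(String atom, int n) {
        return repeatExactly(atom, n) + atom + "*";
    }

    // Equivalent of atom{n,m}: n copies followed by (m - n) optional copies.
    static String repeatBetween(String atom, int n, int m) {
        if (m < n) {
            throw new IllegalArgumentException("m must not be less than n");
        }
        return repeatExactly(atom, n) + repeatExactly(atom + "?", m - n);
    }

    public static void main(String[] args) {
        System.out.println(repeatExactly("a", 3));    // aaa
        System.out.println(repeatAtLeast("a", 2));    // aaa*
        System.out.println(repeatBetween("a", 1, 3)); // aa?a?

        // The compiler replaces the escape with one character before any
        // regular expression code sees the string.
        System.out.println("\u0041".length());  // 1
        // With the backslash doubled, the literal holds six characters instead.
        System.out.println("\\u0041".length()); // 6
    }
}
```

The expanded patterns use only literal atoms and the *, + and ? quantifiers described earlier, so they stay within the supported subset.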




Send feedback to JavaStar-feedback@suntest.com
Copyright © 1998 Sun Microsystems, Inc. 901 San Antonio Road, Palo Alto, CA 94303. All rights reserved.