Searching with Regular Expressions

Studio supports searching with regular expressions (or regexes) to match patterns in character strings in the Extended Find and Replace commands. Regular expressions allow you to specify all the possible variants in a search and to precisely control replacements. Ordinary characters are combined with special characters to define the pattern for the search. The regex parser evaluates the selected files and returns each matching pattern.

In the Find command, each match is added to the find list. In the Replace operation, each match triggers insertion of the replacement string. When replacing a string, it is just as important to control what is not matched as what is. Simple regular expressions can be concatenated into complex search criteria.

Note

The rules listed in this section are for creating regular expressions in Studio. The rules used by other regex parsers may differ.

Special characters

Because special characters are the operators in regular expressions, you must precede a special character with a backslash (\) to represent it as an ordinary character. For example, \. matches a literal period and \\ matches a literal backslash.
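
As a rough illustration of escaping, the following sketch uses Python's re module as a stand-in for Studio's parser. That substitution is an assumption made only so the example can be run; Studio's own rules may differ. It shows an unescaped + acting as an operator, while \+ matches a literal plus sign.

```python
import re

text = "11 and 1+1"

# Unescaped, + is an operator: "1+1" means one or more 1s followed by a 1.
unescaped = re.search(r"1+1", text)

# Escaped, \+ is an ordinary plus sign.
escaped = re.search(r"1\+1", text)

print(unescaped.group(0))  # 11
print(escaped.group(0))    # 1+1
```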

Single-character regular expressions

This section describes the rules for creating regular expressions. You can use regular expressions in the Search > Extended Find and Replace commands to match complex string patterns.

The following rules govern one-character regexes that match a single character:

Character classes

You can specify a character by using one of the POSIX character classes. You enclose the character class name inside two square brackets, as in this example:

REReplace("Allaire's Web Site","[[:space:]]","*","ALL")

This code replaces all the spaces with *, producing this string:

Allaire's*Web*Site

The following table shows the POSIX character classes that Studio supports.

Supported Character Classes

Character Class   Matches
alpha             Matches any letter. Same as [A-Za-z].
upper             Matches any upper-case letter. Same as [A-Z].
lower             Matches any lower-case letter. Same as [a-z].
digit             Matches any digit. Same as [0-9].
alnum             Matches any alphanumeric character. Same as [A-Za-z0-9].
xdigit            Matches any hexadecimal digit. Same as [0-9A-Fa-f].
space             Matches a tab, new line, vertical tab, form feed, carriage return, or space.
print             Matches any printable character.
punct             Matches any punctuation character, that is, one of ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ { | } ~
graph             Matches any of the characters defined as a printable character except those defined to be part of the space character class.
cntrl             Matches any character not part of the character classes [:upper:], [:lower:], [:alpha:], [:digit:], [:punct:], [:graph:], [:print:], or [:xdigit:].
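
To make the "Same as" equivalences concrete, here is a minimal sketch in Python. Python's re module does not accept the [[:name:]] syntax, so the hypothetical helper expand_posix_classes (not part of Studio or Python) rewrites class names into the explicit ranges listed in the table before matching. Only the classes with a simple range equivalent are covered.

```python
import re

# Expansions taken from the table of supported character classes.
# print, punct, graph, and cntrl are left out of this sketch.
POSIX_CLASSES = {
    "alpha": "A-Za-z",
    "upper": "A-Z",
    "lower": "a-z",
    "digit": "0-9",
    "alnum": "A-Za-z0-9",
    "xdigit": "0-9A-Fa-f",
    "space": r" \t\n\v\f\r",
}

def expand_posix_classes(pattern: str) -> str:
    """Replace each [:name:] inside a bracket expression with its ranges."""
    return re.sub(
        r"\[:([a-z]+):\]",
        lambda m: POSIX_CLASSES[m.group(1)],
        pattern,
    )

pattern = expand_posix_classes("[[:space:]]")
result = re.sub(pattern, "*", "Allaire's Web Site")
print(result)  # Allaire's*Web*Site
```

The helper raises KeyError for a class name it does not know, which keeps unsupported names from being silently ignored.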

Multi-character regular expressions

You can use the following rules to build multi-character regular expressions:

Backreferences

Studio supports backreferencing, which allows you to match text in previously matched sets of parentheses. A backslash followed by a digit n (\n) refers to the nth parenthesized sub-expression.

One example of how backreferencing can be used is searching for doubled words -- for example, to find instances of `the the' or `is is' in text. The following example shows the syntax you use for backreferencing in regular expressions:

REReplace("There is is coffee in the the kitchen",
"([A-Za-z]+)[ ]+\1","*","ALL")

This code searches for a word made up entirely of letters ([A-Za-z]+), followed by one or more spaces ([ ]+), followed by the text matched by the first parenthesized sub-expression (\1). The parser detects the two occurrences of is as well as the two occurrences of the and replaces each pair with an asterisk, resulting in the following text:

There * coffee in * kitchen
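
The same pattern can be tried outside Studio. The sketch below uses Python's re module as a stand-in (an assumption; Studio's parser may behave differently). It also shows why it matters what a pattern does not match: without word boundaries, the pattern fires inside words. The \b boundary used in the last step is a Python feature and is not confirmed for Studio.

```python
import re

pattern = r"([A-Za-z]+)[ ]+\1"

fixed = re.sub(pattern, "*", "There is is coffee in the the kitchen")
print(fixed)  # There * coffee in * kitchen

# The "is" ending "This" plus the following "is" counts as a doubled word.
surprise = re.sub(pattern, "*", "This is fine")
print(surprise)  # Th* fine

# \b restricts the match to whole words, so nothing is replaced here.
bounded = re.sub(r"\b([A-Za-z]+)[ ]+\1\b", "*", "This is fine")
print(bounded)  # This is fine
```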

Anchoring a regular expression to a string

All or part of a regular expression can be anchored to either the beginning or end of the string being searched:
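
In most regular expression dialects, a caret (^) anchors a match to the beginning of the string and a dollar sign ($) anchors it to the end. Assuming Studio follows that convention, the idea can be sketched in Python as follows; the sample strings are made up for illustration.

```python
import re

lines = ["cfset x = 1", "  cfset y = 2", "z = cfset"]

# ^ anchors the match to the start of the string, $ to the end.
starts = [s for s in lines if re.search(r"^cfset", s)]
ends = [s for s in lines if re.search(r"cfset$", s)]

print(starts)  # ['cfset x = 1']
print(ends)    # ['z = cfset']
```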

Expression examples

The following examples show some regular expressions and describe what they match.

Regular Expression Examples

Expression                                           Description
[\?&]value=                                          A URL parameter named value, preceded by ? or &.
[A-Z]:(\\[A-Z0-9_]+)+                                An uppercase DOS/Windows full path that (a) is not the root of a drive, and (b) has only letters, numbers, and underscores in its text.
[A-Za-z][A-Za-z0-9_]*                                A ColdFusion variable with no qualifier.
([A-Za-z][A-Za-z0-9_]*)(\.[A-Za-z][A-Za-z0-9_]*)?    A ColdFusion variable with no more than one qualifier, for example, Form.VarName, but not Form.Image.VarName.
(\+|-)?[1-9][0-9]*                                   An integer that does not begin with a zero and has an optional sign.
(\+|-)?[1-9][0-9]*(\.[0-9]*)?                        A real number that does not begin with a zero and has an optional sign.
(\+|-)?[1-9]\.[0-9]*E(\+|-)?[0-9]+                   A real number in scientific (E) notation.
a{2,4}                                               Two to four occurrences of 'a': aa, aaa, aaaa.
(ba){3,}                                             At least three 'ba' pairs: bababa, babababa, ...
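
A quick way to check expressions like these is to run them against strings that should and should not match. The sketch below does so with Python's re module, used here as a stand-in for Studio's parser (an assumption). It uses re.fullmatch, which requires the whole string to match; a search, as in Studio's Find command, would also report matches inside longer strings. The dictionary keys and the accepts helper are hypothetical names chosen for this example.

```python
import re

# Expressions copied from the examples table.
examples = {
    "integer": r"(\+|-)?[1-9][0-9]*",
    "real": r"(\+|-)?[1-9][0-9]*(\.[0-9]*)?",
    "two_to_four_a": r"a{2,4}",
    "three_or_more_ba": r"(ba){3,}",
    "one_qualifier": r"([A-Za-z][A-Za-z0-9_]*)(\.[A-Za-z][A-Za-z0-9_]*)?",
}

def accepts(name: str, text: str) -> bool:
    """Return True if the whole of text matches the named expression."""
    return re.fullmatch(examples[name], text) is not None

print(accepts("integer", "-42"))                       # True
print(accepts("integer", "007"))                       # False
print(accepts("real", "3.14"))                         # True
print(accepts("real", "0.5"))                          # False: first digit must be 1-9
print(accepts("two_to_four_a", "aaaaa"))               # False
print(accepts("three_or_more_ba", "bababa"))           # True
print(accepts("one_qualifier", "Form.VarName"))        # True
print(accepts("one_qualifier", "Form.Image.VarName"))  # False
```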

Resources

An excellent reference on regular expressions is Mastering Regular Expressions by Jeffrey E.F. Friedl, published by O'Reilly & Associates, Inc.


AllaireDoc@allaire.com
Copyright © 1998, Allaire Corporation. All rights reserved.