The module defines the following functions and constants, and an exception:
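Compile a regular expression pattern into a regular expression object, which can be used for matching using its match() and search() methods, described below.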
The expression's behaviour can be modified by specifying a flags value. Values can be any of the following variables, combined using bitwise OR (the | operator).
(?i)
Perform case-insensitive matching; expressions like [A-Z] will match lowercase letters, too. This is not affected by the current locale.
(?L)
Make \w, \W, \b, and \B dependent on the current locale.
(?m)
When specified, the pattern character ^ matches at the beginning of the string and at the beginning of each line (immediately following each newline), and the pattern character $ matches at the end of the string and at the end of each line (immediately preceding each newline). By default, ^ matches only at the beginning of the string, and $ only at the end of the string and immediately before the newline (if any) at the end of the string.
(?s)
Make the . special character match any character at all, including a newline; without this flag, . will match anything except a newline.
(?x)
Ignore whitespace within the pattern except when it is in a character class or preceded by an unescaped backslash. In addition, when a line contains a # that is neither in a character class nor preceded by an unescaped backslash, all characters from the leftmost such # through the end of the line are ignored.
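As a minimal sketch of the flags described above: the named constants used here (re.IGNORECASE, re.MULTILINE, re.VERBOSE) are the standard module-level flag names, which this text does not list explicitly, and the sample strings are invented for illustration.

```python
import re

text = "first line\nSecond LINE\nthird line"  # made-up sample data

# Flags combined with bitwise OR: case-insensitive and multi-line.
combined = re.compile(r"^\w+ line$", re.IGNORECASE | re.MULTILINE)
all_lines = combined.findall(text)

# Without the multi-line flag, ^ and $ anchor to the whole string only.
single = re.compile(r"^\w+ line$", re.IGNORECASE).findall(text)

# The same two flags written as embedded modifiers in the pattern.
inline = re.findall(r"(?im)^\w+ line$", text)

# (?s) lets . match a newline; by default it does not.
dot_default = re.match(r"first.*LINE", text)
dot_all = re.match(r"(?s)first.*LINE", text)

# Verbose mode: whitespace and # comments in the pattern are ignored.
verbose = re.compile(r"""
    (\d+)   # integer part
    \.      # literal dot
    (\d+)   # fractional part
""", re.VERBOSE)
parts = verbose.match("3.14").groups()
```

Here all_lines and inline hold the same three lines, single is empty, and dot_default is None while dot_all spans the first two lines.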
The sequence

    prog = re.compile(pat)
    result = prog.match(str)

is equivalent to

    result = re.match(pat, str)

but the version using compile() is more efficient when the expression will be used several times in a single program.
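The equivalence can be checked directly. The following is a small sketch; the date pattern and the candidate strings are hypothetical and only serve to reuse one expression several times.

```python
import re

pat = r"\d{4}-\d{2}-\d{2}"  # illustrative pattern, not from the text
candidates = ["2024-01-31 backup", "no date here", "1999-12-31"]

# Compile once, then reuse the regex object for every string.
prog = re.compile(pat)
with_compile = [bool(prog.match(s)) for s in candidates]

# Same results through the module-level function.
without_compile = [bool(re.match(pat, s)) for s in candidates]
```

Both lists come out identical; the compiled form simply avoids handing the pattern string to the module on every call.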
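If zero or more characters at the beginning of the string match the pattern, return a corresponding Match object. Return None if the string does not match the pattern; note that this is different from a zero-length match.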
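To make the distinction between None and a zero-length match concrete, here is a short sketch, assuming this entry describes re.match; the patterns are chosen purely for illustration.

```python
import re

# a+ needs at least one "a" at the start, so there is no match at all.
no_match = re.match(r"a+", "bbb")

# a* may match zero characters, so this succeeds with an empty match.
empty_match = re.match(r"a*", "bbb")
empty_text = empty_match.group(0)
empty_span = empty_match.span()
```

Because a match object is always true in a boolean context, testing the result with `if result:` distinguishes the two cases even when the matched text is empty.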
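Scan through the string looking for a location where the pattern produces a match, and return a corresponding Match object. Return None if no position in the string matches the pattern; note that this is different from finding a zero-length match at some point in the string.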
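A brief sketch of the scanning behaviour, assuming this entry describes re.search and contrasting it with re.match, which only tries the beginning of the string; the inputs are invented.

```python
import re

# Anchored at position 0: the string starts with letters, so no match.
anchored = re.match(r"\d+", "abc 123")

# Scanning: the first run of digits is found further along.
scanned = re.search(r"\d+", "abc 123")
scanned_span = scanned.span()

# No digits anywhere, so the result is None.
missing = re.search(r"\d+", "abc")

# \d* can match the empty string, so a zero-length match is found at 0.
empty_found = re.search(r"\d*", "abc")
empty_found_span = empty_found.span()
```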
    >>> re.split('[\W]+', 'Words, words, words.')
    ['Words', 'words', 'words', '']
    >>> re.split('([\W]+)', 'Words, words, words.')
    ['Words', ', ', 'words', ', ', 'words', '.', '']

This function combines and extends the functionality of the old regex.split() and regex.splitx().
    >>> def dashrepl(matchobj):
    ...     if matchobj.group(0) == '-': return ' '
    ...     else: return '-'
    ...
    >>> re.sub('-{1,2}', dashrepl, 'pro----gram-files')
    'pro--gram files'

The pattern may be a string or a regex object; if you need to specify regular expression flags, you must use a regex object or use embedded modifiers in the pattern. For example, sub("(?i)b+", "x", "bbbb BBBB") returns 'x x'.

The optional argument count is the maximum number of pattern occurrences to be replaced; count must be a non-negative integer, and the default value of 0 means to replace all occurrences.

Empty matches for the pattern are replaced only when not adjacent to a previous match, so sub('x*', '-', 'abc') returns '-a-b-c-'.
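The points above about flags, count, and empty matches can be sketched as follows. The hyphenated sample string is made up, and re.IGNORECASE is the standard named equivalent of the (?i) modifier.

```python
import re

# Case-insensitive substitution via an embedded modifier...
ci_inline = re.sub("(?i)b+", "x", "bbbb BBBB")

# ...or via a regex object compiled with the flag.
ci_object = re.sub(re.compile("b+", re.IGNORECASE), "x", "bbbb BBBB")

# count limits the number of replacements; 0 means replace everything.
first_only = re.sub("-", " ", "a-b-c", count=1)
all_of_them = re.sub("-", " ", "a-b-c", count=0)

# Empty matches between characters are replaced as well.
empties = re.sub("x*", "-", "abc")
```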
Perform the same operation as sub(), but return a tuple (new_string, number_of_subs_made).
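A minimal sketch of the tuple return value, assuming this entry describes re.subn; the inputs are illustrative only.

```python
import re

# Three runs of digits are replaced, so the count is 3.
new_string, number_of_subs_made = re.subn(r"\d+", "#", "a1b22c333")

# With nothing to replace, the string comes back unchanged with a count of 0.
unchanged, zero_subs = re.subn(r"\d+", "#", "abc")
```

The count is convenient when the caller needs to know whether anything was replaced without comparing the old and new strings.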