Once you have an object representing a compiled regular expression, what do you do with it? RegexObject instances have several methods and attributes. Only the most significant ones will be covered here; consult the Library Reference for a complete listing.
Method/Attribute | Purpose |
---|---|
match | Determine if the RE matches at the beginning of the string. |
search | Scan through a string, looking for any location where this RE matches. |
split | Split the string into a list, splitting it wherever the RE matches. |
sub | Find all substrings where the RE matches, and replace them with a different string. |
subn | Does the same thing as sub(), but returns the new string and the number of replacements made. |
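The match() and search() methods return None if no match can be found. If they're successful, a MatchObject instance is returned, containing information about the match: where it starts and ends, the substring it matched, and more. (split(), sub(), and subn() return lists, strings, and tuples rather than MatchObject instances.)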
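As a rough illustration of the methods in the table, the following sketch runs each of them once. It assumes a modern Python 3 interpreter (so print is a function), and the sample text and the digits pattern are made up for this example rather than taken from the original.

```python
import re

# Hypothetical sample data, chosen only for illustration.
p = re.compile(r'[a-z]+')
digits = re.compile(r'[0-9]+')
text = 'one 22 three 4 five'

# match() only succeeds if the RE matches at the very start of the string.
at_start = p.match(text)            # matches 'one'
not_at_start = p.match('22 three')  # None, the string starts with a digit

# search() scans forward and reports the first location that matches.
first_word = p.search('22 three')   # matches 'three'

# split() breaks the string apart wherever the RE matches.
parts = digits.split(text)

# sub() returns the new string; subn() also returns the replacement count.
replaced = digits.sub('#', text)
replaced_with_count = digits.subn('#', text)

print(at_start.group(), not_at_start, first_word.group())
print(parts)
print(replaced)
print(replaced_with_count)
```

Only match() and search() produce a match object here; the other three calls return ordinary lists, strings, and tuples.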
You can learn about this by interactively experimenting with the re module. (If you have Tkinter available, you may also want to look at redemo.py, a demonstration program included with the Python distribution. It allows you to enter REs and strings, and displays whether the RE matches or fails. redemo.py can be quite useful when trying to debug a complicated RE.)
First, run the Python interpreter, import the re module, and compile a RE:

    Python 1.5.1 (#6, Jul 17 1998, 20:38:08)  [GCC 2.7.2.3] on linux2
    Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
    >>> import re
    >>> p = re.compile('[a-z]+')
    >>> p
    <re.RegexObject instance at 80c3c28>

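Now you can try matching various strings against the RE [a-z]+. An empty string shouldn't match at all, since + means "one or more repetitions". match() should return None in this case, which will cause the interpreter to print no output. You can explicitly print the result of match() to make this clear.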
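
    >>> p.match( "" )
    >>> print p.match( "" )
    None

Now try it on a string that should match, such as 'tempo'. In this case match() returns a MatchObject, so store the result in a variable for later use.

    >>> m = p.match( 'tempo')
    >>> print m
    <re.MatchObject instance at 80c4f68>

You can now query the MatchObject for information about the matching string. Its most important methods are: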
Method/Attribute | Purpose |
---|---|
group() | Return the string matched by the RE |
start() | Return the starting position of the match |
end() | Return the ending position of the match |
span() | Return a tuple containing the (start, end) of the match |
Trying these methods will soon clarify their meaning:
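
    >>> m.group()
    'tempo'
    >>> m.start(), m.end()
    (0, 5)
    >>> m.span()
    (0, 5)

Since match() only checks whether the RE matches at the start of a string, start() will always be zero. search() scans through the string, so the match may not start at zero in that case.

    >>> print p.match('::: message')
    None
    >>> m = p.search('::: message') ; print m
    <re.MatchObject instance at 80c9650>
    >>> m.group()
    'message'
    >>> m.span()
    (4, 11)
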
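The relationship between span() and group() can be checked directly: slicing the original string with the reported positions should give back the matched text. The sketch below assumes Python 3 syntax and reuses the '::: message' string from the session above; the variable names are chosen for this example only.

```python
import re

p = re.compile(r'[a-z]+')
text = '::: message'

# match() is anchored at position 0, where ':' is not in [a-z].
anchored = p.match(text)

# search() skips ahead to the first position where the RE matches.
found = p.search(text)

# span() gives (start, end), with end pointing one past the last character.
start, end = found.span()
matched_slice = text[start:end]

print(anchored)
print(found.group(), found.span(), matched_slice)
```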
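In actual programs, the most common style is to store the MatchObject in a variable, and then check whether it was None. This usually looks like: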

    p = re.compile( ... )
    m = p.match( 'string goes here' )
    if m:
        print 'Match found: ', m.group()
    else:
        print 'No match'

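For readers on a current interpreter, the same check can be sketched in Python 3 syntax. The helper name describe_match and its arguments are hypothetical, introduced only for this example; it is one possible way to wrap the idiom, not part of the re module.

```python
import re

def describe_match(pattern, text):
    """Hypothetical helper: report whether pattern matches at the start of text."""
    m = re.compile(pattern).match(text)
    # match() returns None on failure, so test for that explicitly.
    if m is None:
        return 'No match'
    return 'Match found: ' + m.group()

print(describe_match(r'[a-z]+', 'tempo'))
print(describe_match(r'[a-z]+', '::: message'))
```

Testing with "is None" and testing with a plain "if m:" behave the same here, because a match object is always true, even when it matches an empty string.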