By now you've probably noticed that regular expressions are a very compact notation, but they're not terribly readable. REs of moderate complexity can become lengthy collections of backslashes, parentheses, and metacharacters, making them difficult to read and understand.
For such REs, specifying the re.VERBOSE flag when compiling the regular expression can be helpful, because it allows you to format the regular expression more clearly.
The re.VERBOSE flag has several effects. Whitespace in the regular expression is ignored unless it is inside a character class or preceded by an unescaped backslash. This means that an expression such as dog | cat is equivalent to the less readable dog|cat, but [a b] will still match the characters "a", "b", or a space. You can also put comments inside a RE; comments extend from a "#" character to the next newline. When used with triple-quoted strings, this enables REs to be formatted more neatly:
import re

pat = re.compile(r"""
 \s*                 # Skip leading whitespace
 (?P<header>[^:]+)   # Header name
 \s* :               # Whitespace, and a colon
 (?P<value>.*?)      # The header's value -- *? used to
                     # lose the following trailing whitespace
 \s*$                # Trailing whitespace to end-of-line
""", re.VERBOSE)
Written without the verbose flag, the same RE is far harder to read:

pat = re.compile(r"\s*(?P<header>[^:]+)\s*:(?P<value>.*?)\s*$")
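The behaviour described above can be checked with a short script. This is only a sketch: the sample header line and the variable names (spaced, literal, char_class, verbose_pat, compact_pat) are made up for illustration and are not part of the original text.

```python
import re

# Under re.VERBOSE, whitespace outside a character class is ignored,
# so "dog | cat" behaves like "dog|cat".
spaced = re.compile(r"dog | cat", re.VERBOSE)
print(bool(spaced.fullmatch("dog")), bool(spaced.fullmatch("cat")))

# Without the flag the spaces are literal: the alternatives are
# "dog " and " cat", so plain "dog" no longer matches.
literal = re.compile(r"dog | cat")
print(literal.fullmatch("dog"))

# Inside a character class the space is kept, even with re.VERBOSE.
char_class = re.compile(r"[a b]", re.VERBOSE)
print(bool(char_class.fullmatch(" ")))

# The verbose and compact forms of the header pattern should agree.
verbose_pat = re.compile(r"""
 \s*                 # Skip leading whitespace
 (?P<header>[^:]+)   # Header name
 \s* :               # Whitespace, and a colon
 (?P<value>.*?)      # The header's value (non-greedy)
 \s*$                # Trailing whitespace to end-of-line
""", re.VERBOSE)
compact_pat = re.compile(r"\s*(?P<header>[^:]+)\s*:(?P<value>.*?)\s*$")

line = "  Subject:Regex HOWTO   "  # hypothetical sample input
print(verbose_pat.match(line).groupdict())
print(compact_pat.match(line).groupdict() == verbose_pat.match(line).groupdict())
```

Both compiled patterns capture the same groups for the sample line; only the readability of the source differs.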