String literals are described by the following lexical definitions:
stringliteral:   shortstring | longstring
shortstring:     "'" shortstringitem* "'" | '"' shortstringitem* '"'
longstring:      "'''" longstringitem* "'''" | '"""' longstringitem* '"""'
shortstringitem: shortstringchar | escapeseq
longstringitem:  longstringchar | escapeseq
shortstringchar: <any ASCII character except "\" or newline or the quote>
longstringchar:  <any ASCII character except "\">
escapeseq:       "\" <any ASCII character>
In "long strings" (strings surrounded by sets of three quotes), unescaped newlines and quotes are allowed (and are retained), except that three unescaped quotes in a row terminate the string. (A "quote" is the character used to open the string, i.e. either ' or ".)
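As an illustration of the long-string rule, the following minimal sketch builds a triple-quoted string containing an unescaped newline and both kinds of quote. The variable names are chosen only for this example.

```python
# A long string opened with ''' keeps unescaped newlines and quotes as written.
# A lone ' inside is fine; only three in a row would end the string.
text = '''He said "hi"
and it's fine'''

# The newline typed in the source is part of the string value.
lines = text.split("\n")
print(lines)       # two pieces, split at the retained newline
print(len(lines))  # 2
```

The same value could be written as a short string, but then the newline and the matching quote would each need an escape sequence.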
Escape sequences in strings are interpreted according to rules similar to those used by Standard C. The recognized escape sequences are:
Escape Sequence | Meaning
\newline | Ignored
\\ | Backslash (\)
\' | Single quote (')
\" | Double quote (")
\a | ASCII Bell (BEL)
\b | ASCII Backspace (BS)
\f | ASCII Formfeed (FF)
\n | ASCII Linefeed (LF)
\r | ASCII Carriage Return (CR)
\t | ASCII Horizontal Tab (TAB)
\v | ASCII Vertical Tab (VT)
\ooo | ASCII character with octal value ooo
\xxx... | ASCII character with hex value xx...
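The table can be checked with a short sketch that pairs each escape sequence with the ASCII code it is expected to produce. The list name `samples` and the choice of the letter "A" for the numeric escapes are assumptions made for this example only.

```python
# Each recognized escape sequence denotes exactly one character.
samples = [
    ("\\", 92),    # backslash
    ("\'", 39),    # single quote
    ("\"", 34),    # double quote
    ("\a", 7),     # BEL
    ("\b", 8),     # BS
    ("\f", 12),    # FF
    ("\n", 10),    # LF
    ("\r", 13),    # CR
    ("\t", 9),     # TAB
    ("\v", 11),    # VT
    ("\101", 65),  # octal 101 is "A"
    ("\x41", 65),  # hex 41 is "A"
]

for char, code in samples:
    assert len(char) == 1 and ord(char) == code

# A backslash immediately followed by a newline is ignored,
# so the two source lines below form one uninterrupted string.
joined = "abc\
def"
print(joined)  # abcdef
```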
In strict compatibility with Standard C, up to three octal digits are accepted, but an unlimited number of hex digits is taken to be part of the hex escape (and then the lower 8 bits of the resulting hex number are used in all current implementations...).
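The digit-counting rule can be sketched as a small function. This models the rule as stated in the paragraph above rather than relying on the running interpreter, whose own handling of hex escapes may differ. The helper `decode_numeric_escape` is hypothetical, not part of any library, and it assumes well-formed input (at least one digit).

```python
import string


def decode_numeric_escape(body):
    """Decode a numeric escape from the text that follows the backslash.

    Hypothetical helper: returns (character, characters_consumed).
    Assumes `body` starts with an octal digit, or with "x" plus a hex digit.
    """
    if body[0] == "x":
        # Hex: take every following hex digit, then keep the low 8 bits.
        end = 1
        while end < len(body) and body[end] in string.hexdigits:
            end += 1
        value = int(body[1:end], 16) & 0xFF
    else:
        # Octal: take at most three octal digits.
        end = 0
        while end < 3 and end < len(body) and body[end] in string.octdigits:
            end += 1
        value = int(body[:end], 8)
    return chr(value), end


print(decode_numeric_escape("1013"))  # ('A', 3): the fourth digit is not consumed
print(decode_numeric_escape("x141"))  # ('A', 4): 0x141 & 0xFF == 0x41
print(decode_numeric_escape("x41g"))  # ('A', 3): "g" is not a hex digit
```

Under this rule, a hex escape followed directly by text that happens to start with a hex digit absorbs that digit, which is one reason to prefer octal escapes or string concatenation in such cases.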
All unrecognized escape sequences are left in the string unchanged, i.e., the backslash is left in the string. (This behavior is useful when debugging: if an escape sequence is mistyped, the resulting output is more easily recognized as broken. It also helps a great deal for string literals used as regular expressions or otherwise passed to other modules that do their own escape handling.)
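A minimal sketch of this behavior, using a regular-expression-like literal: neither \d nor \. is a recognized escape sequence, so the backslashes should survive. Because some interpreters may warn about unrecognized escapes, the literal is evaluated from its source text with warnings silenced. The variable names are assumptions for this example.

```python
import warnings

# Source text of a string literal containing \d and \. (both unrecognized).
literal_source = r'"\d+\.\d*"'

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # some interpreters warn about such escapes
    value = eval(literal_source)

# All three backslashes are still present in the resulting string.
print(value)       # \d+\.\d*
print(len(value))  # 8
```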