Raw Strings in Python
In Python, raw strings are denoted by a prefix of r or R, as in r'...' or r"...". These strings treat backslashes \ as literal characters, which is particularly useful for strings containing many backslashes, such as Windows paths or regular expression patterns.
Escape sequences
Python uses escape sequences, introduced by a backslash \ (like \t or \n), to represent characters that cannot be typed directly into a string literal, such as tabs or line feeds.
s = 'a\tb\nA\tB'
print(s)
# a b
# A B
Raw strings interpret backslashes as literal characters
Strings prefixed with r or R, such as r'...' and r"...", are called raw strings and treat backslashes \ as literal characters. Unlike in regular strings, escape sequences are not given special treatment in raw strings.
rs = r'a\tb\nA\tB'
print(rs)
# a\tb\nA\tB
Despite their unique behavior, raw strings are not a distinct type. The r prefix only changes how the literal is parsed; the result is an ordinary str, identical to the one you get by writing each backslash as \\ in a regular string.
print(type(rs))
# <class 'str'>
print(rs == 'a\\tb\\nA\\tB')
# True
In a regular string, an escape sequence is interpreted as a single character. However, in a raw string, each backslash is considered a separate character.
s = 'a\tb\nA\tB'
print(len(s))
# 7
print(list(s))
# ['a', '\t', 'b', '\n', 'A', '\t', 'B']
rs = r'a\tb\nA\tB'
print(len(rs))
# 10
print(list(rs))
# ['a', '\\', 't', 'b', '\\', 'n', 'A', '\\', 't', 'B']
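Regular expression patterns are the other common use mentioned at the start. The following is a small sketch, using the standard re module and a made-up sample text, of how the same pattern behaves with and without the r prefix. In a regular string '\b' is a backspace character, whereas in a raw string it reaches re as the two characters \ and b, which re reads as a word boundary.

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = 'cat concat cats'

plain = '\bcat\b'   # '\b' is turned into a backspace character (0x08)
raw = r'\bcat\b'    # backslashes are kept, so re sees the word boundary \b

print(len(plain), len(raw))
# 5 7
print(re.findall(raw, text))
# ['cat']
print(re.findall(plain, text))
# []
```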
Windows paths
Raw strings are particularly useful for representing Windows paths, which are delimited by backslashes \. Instead of having to escape each backslash as \\ in a regular string, you can use raw strings to write them as is.
path = 'C:\\Windows\\system32\\cmd.exe'
rpath = r'C:\Windows\system32\cmd.exe'
print(path == rpath)
# True
However, be aware that a raw string literal ending with an odd number of backslashes raises a SyntaxError. To avoid this, you can either write the entire path as a regular string (replacing each backslash with \\), or write the path as a raw string up to the last backslash, write the last backslash as a regular string, and concatenate the two.
path2 = 'C:\\Windows\\system32\\'
# rpath2 = r'C:\Windows\system32\'
# SyntaxError: EOL while scanning string literal
rpath2 = r'C:\Windows\system32' + '\\'
print(path2 == rpath2)
# True
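Two other ways to get the trailing backslash are sketched below, as possible alternatives rather than a recommendation. The first relies on adjacent string literals being joined automatically, which also works when one is raw and the other is not. The second uses ntpath, the Windows flavor of os.path in the standard library (importable on any platform); joining with an empty last component appends a separator.

```python
import ntpath  # Windows implementation of os.path, usable on any OS

# Adjacent literals are concatenated, so no + operator is needed.
rpath3 = r'C:\Windows\system32' '\\'

# An empty final component makes the result end with a separator.
rpath4 = ntpath.join(r'C:\Windows\system32', '')

print(rpath3 == 'C:\\Windows\\system32\\')
# True
print(rpath4 == rpath3)
# True
```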
Convert regular strings to raw strings with repr()
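You can use the built-in function repr() to get a printable representation of a string in which special characters such as tabs and line feeds are written out as escape sequences. For simple strings like the one here, this matches the raw string equivalent.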
s = 'a\tb\nA\tB'
s_r = repr(s)
print(s_r)
# 'a\tb\nA\tB'
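The string returned by repr() includes the enclosing quotation marks, which are single quotes (') unless the string itself contains a single quote and no double quotes.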
print(list(s_r))
# ["'", 'a', '\\', 't', 'b', '\\', 'n', 'A', '\\', 't', 'B', "'"]
By using slicing, you can extract the raw string equivalent from the result of repr().
rs = r'a\tb\nA\tB'
s_r2 = repr(s)[1:-1]
print(s_r2)
# a\tb\nA\tB
print(s_r2 == rs)
# True
print(r'\t' == repr('\t')[1:-1])
# True
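The slicing approach has limits worth knowing. The sketch below, with strings chosen only for illustration, shows two of them: repr() works from the characters in the string, not from the literal that produced them, so an escape such as \x41 is not recovered, and a string containing both kinds of quotes gains a backslash that was not in the original.

```python
# '\x41' is just the character 'A'; the original spelling is lost.
print(repr('\x41')[1:-1])
# A

# '\x09' and '\t' are the same character, so both come back as \t.
print(repr('\x09')[1:-1] == r'\t')
# True

# With both quote types present, repr() escapes the single quote.
q = 'a\'b"c'
print(len(q), len(repr(q)[1:-1]))
# 5 6
```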
Raw strings cannot end with an odd number of backslashes
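A raw string literal ending with an odd number of backslashes raises a SyntaxError, because the final backslash escapes the closing quotation mark (' or "), leaving the literal unterminated. The message shown below is the one from Python 3.9 and earlier; Python 3.10 and later report "unterminated string literal" instead.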
# print(r'\')
# SyntaxError: EOL while scanning string literal
print(r'\\')
# \\
# print(r'\\\')
# SyntaxError: EOL while scanning string literal
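The same rule explains a related detail, shown in the short sketch below: inside a raw string, a backslash in front of a quote stops the quote from ending the literal, but the backslash itself stays in the result.

```python
rs_quote = r'\''   # the literal ends at the second quote, not the first

print(rs_quote)
# \'
print(len(rs_quote))
# 2
print(rs_quote == "\\'")
# True
```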