The 'r' means that the following is a "raw string",
i.e. backslash characters are treated literally instead of signifying special treatment of the following character.
So
'\n'
is a single newline, and
r'\n'
is two characters: a backslash and the letter 'n'. Another way to write it would be
'\\n'
because the first backslash escapes the second.
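A quick way to convince yourself of this is to check the lengths of the three literals. This is only a small sketch; the variable names are mine, chosen for illustration.

```python
# Three ways of writing "something with a backslash and an n".
newline = '\n'    # escape sequence: one newline character
raw = r'\n'       # raw string: backslash followed by 'n'
escaped = '\\n'   # escaped backslash followed by 'n'

print(len(newline))    # 1
print(len(raw))        # 2
print(len(escaped))    # 2
print(raw == escaped)  # True
print(raw == newline)  # False
```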
An equivalent way of writing this
print (re.sub(r'(\b\w+)(\s+\1\b)+', r'\1', 'hello there there'))
is
print (re.sub('(\\b\\w+)(\\s+\\1\\b)+', '\\1', 'hello there there'))
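To check that the two spellings really are the same, you can compare the pattern strings directly before handing them to re.sub. A minimal sketch, with variable names that are my own:

```python
import re

text = 'hello there there'

# The same pattern written as a raw string and with doubled backslashes.
raw_pattern = r'(\b\w+)(\s+\1\b)+'
escaped_pattern = '(\\b\\w+)(\\s+\\1\\b)+'

# Both literals produce exactly the same string object contents.
print(raw_pattern == escaped_pattern)  # True

# So both calls collapse the repeated word in the same way.
result_raw = re.sub(raw_pattern, r'\1', text)
result_escaped = re.sub(escaped_pattern, '\\1', text)
print(result_raw)      # hello there
print(result_escaped)  # hello there
```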
Because Python leaves the backslash in place when the character after it does not form a valid escape sequence, not all of those double backslashes are necessary - eg
'\s'=='\\s'
however the same is not true for '\b'
and '\\b'
, because '\b' is a valid escape sequence (the backspace character). My preference is to be explicit and double all the backslashes.
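The difference between '\s' and '\b' can be sketched as follows. One caveat: recent Python versions warn about unrecognised escapes such as '\s' in a normal string (a DeprecationWarning since 3.6, a SyntaxWarning since 3.12), so here I build that literal with eval and silence the warning. That is just a trick for the demonstration, not something to do in real code, and it assumes the interpreter still accepts such escapes at all.

```python
import re
import warnings

# Evaluate the literal '\s' without triggering the invalid-escape warning.
with warnings.catch_warnings():
    warnings.simplefilter('ignore')
    s_plain = eval(r"'\s'")

# '\s' is not a recognised escape, so the backslash is kept.
print(s_plain == '\\s')  # True

# '\b' IS a recognised escape: it is the single backspace character.
print('\b' == '\x08')          # True
print(len('\b'), len('\\b'))   # 1 2

# In a regex, '\\b' is a word boundary, but '\b' is a literal backspace.
print(re.search('\\bcat', 'a cat') is not None)  # True
print(re.search('\bcat', 'a cat') is not None)   # False
```

This is why doubling every backslash (or using a raw string) is the safer habit: you do not have to remember which letters happen to be escapes.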