Escape sequences
Escape sequences are used to represent certain special characters within string literals and character constants.
The following escape sequences are available. ISO C requires a diagnostic if the backslash is followed by any character not listed here:
Escape sequence | Description | Representation |
---|---|---|
\' | single quote | byte 0x27 (in ASCII encoding) |
\" | double quote | byte 0x22 (in ASCII encoding) |
\? | question mark | byte 0x3f (in ASCII encoding) |
\\ | backslash | byte 0x5c (in ASCII encoding) |
\a | audible bell | byte 0x07 (in ASCII encoding) |
\b | backspace | byte 0x08 (in ASCII encoding) |
\f | form feed - new page | byte 0x0c (in ASCII encoding) |
\n | line feed - new line | byte 0x0a (in ASCII encoding) |
\r | carriage return | byte 0x0d (in ASCII encoding) |
\t | horizontal tab | byte 0x09 (in ASCII encoding) |
\v | vertical tab | byte 0x0b (in ASCII encoding) |
\nnn | arbitrary octal value | byte nnn |
\xnn | arbitrary hexadecimal value | byte nn |
\unnnn | Unicode character that is not in the basic character set. May result in several characters. | code point U+nnnn |
\Unnnnnnnn | Unicode character that is not in the basic character set. May result in several characters. | code point U+nnnnnnnn |
Notes
Of the octal escape sequences, \0 is the most useful because it represents the terminating null character in null-terminated strings.
The new-line character \n has special meaning when used in text mode I/O: it is converted to the OS-specific newline byte or byte sequence.
Octal escape sequences have a length limit of three octal digits but terminate at the first character that is not a valid octal digit if encountered sooner.
Hexadecimal escape sequences have no length limit and terminate at the first character that is not a valid hexadecimal digit. If the value represented by a single hexadecimal escape sequence does not fit the range of values represented by the character type used in this string literal or character constant (char, char16_t, char32_t, or wchar_t), the result is unspecified.
A universal character name in a narrow string literal or a 16-bit string literal may map to more than one character, e.g. \U0001f34c is 4 char code units in UTF-8 (\xF0\x9F\x8D\x8C) and 2 char16_t code units in UTF-16 (\uD83C\uDF4C).
The question mark escape sequence \? is used to prevent trigraphs from being interpreted inside string literals: where trigraphs are enabled, a string such as "??/" is compiled as "\", but if the second question mark is escaped, as in "?\?/", it becomes "??/".
Example
    #include <stdio.h>

    int main(void)
    {
        printf("This\nis\na\ntest\n\nShe said, \"How are you?\"\n");
    }
Output:
    This
    is
    a
    test

    She said, "How are you?"
References
- C11 standard (ISO/IEC 9899:2011):
- 5.2.2 Character display semantics (p: 24-25)
- 6.4.4.4 Character constants (p: 67-70)
- C99 standard (ISO/IEC 9899:1999):
- 5.2.2 Character display semantics (p: 19-20)
- 6.4.4.4 Character constants (p: 59-61)
- C89/C90 standard (ISO/IEC 9899:1990):
- 2.2.2 Character display semantics
- 3.1.3.4 Character constants