Escape sequences

From cppreference.com

Escape sequences are used to represent certain special characters within string literals and character constants.

The following escape sequences are available. ISO C requires a diagnostic if the backslash is followed by any character not listed here:

Escape sequence   Description                                       Representation
\'                single quote                                      byte 0x27 (in ASCII encoding)
\"                double quote                                      byte 0x22 (in ASCII encoding)
\?                question mark                                     byte 0x3f (in ASCII encoding)
\\                backslash                                         byte 0x5c (in ASCII encoding)
\a                audible bell                                      byte 0x07 (in ASCII encoding)
\b                backspace                                         byte 0x08 (in ASCII encoding)
\f                form feed - new page                              byte 0x0c (in ASCII encoding)
\n                line feed - new line                              byte 0x0a (in ASCII encoding)
\r                carriage return                                   byte 0x0d (in ASCII encoding)
\t                horizontal tab                                    byte 0x09 (in ASCII encoding)
\v                vertical tab                                      byte 0x0b (in ASCII encoding)
\nnn              arbitrary octal value                             byte nnn
\xnn              arbitrary hexadecimal value                       byte nn
\unnnn            Unicode character that is not in the basic        code point U+nnnn
                  character set. May result in several characters.
\Unnnnnnnn        Unicode character that is not in the basic        code point U+nnnnnnnn
                  character set. May result in several characters.

Notes

Of the octal escape sequences, \0 is the most useful because it represents the terminating null character in null-terminated strings.

The new-line character \n has special meaning when used in text mode I/O: it is converted to the OS-specific newline byte or byte sequence.

Octal escape sequences have a length limit of three octal digits but terminate at the first character that is not a valid octal digit if encountered sooner.

Hexadecimal escape sequences have no length limit and terminate at the first character that is not a valid hexadecimal digit. If the value represented by a single hexadecimal escape sequence does not fit the range of values represented by the character type used in this string literal or character constant (char, char16_t, char32_t, or wchar_t), the result is unspecified.

A universal character name in a narrow string literal or a 16-bit string literal may map to more than one character, e.g. \U0001f34c is 4 char code units in UTF-8 (\xF0\x9F\x8D\x8C) and 2 char16_t code units in UTF-16 (\uD83C\uDF4C).

The question mark escape sequence \? is used to prevent trigraphs from being interpreted inside string literals: a string such as "??/" is compiled as "\", but if the second question mark is escaped, as in "?\?/", it becomes "??/".

Example

#include <stdio.h>

int main(void)
{
    printf("This\nis\na\ntest\n\nShe said, \"How are you?\"\n");
}

Output:

This
is
a
test

She said, "How are you?"

References

  • C11 standard (ISO/IEC 9899:2011):
      • 5.2.2 Character display semantics (p: 24-25)
      • 6.4.4.4 Character constants (p: 67-70)
  • C99 standard (ISO/IEC 9899:1999):
      • 5.2.2 Character display semantics (p: 19-20)
      • 6.4.4.4 Character constants (p: 59-61)
  • C89/C90 standard (ISO/IEC 9899:1990):
      • 2.2.2 Character display semantics
      • 3.1.3.4 Character constants

See also