character literal

From cppreference.com

Syntax

' c-char ' (1)
u8 ' c-char ' (2) (since C++17)
u ' c-char ' (3) (since C++11)
U ' c-char ' (4) (since C++11)
L ' c-char ' (5)
' c-char-sequence ' (6)

where

  • c-char is either
      • a character from the basic source character set, except single-quote ('), backslash (\), and the newline character,
      • an escape sequence, as defined in escape sequences, or
      • a universal character name, as defined in escape sequences;
  • c-char-sequence is a sequence of two or more c-chars.
1) Narrow character literal or ordinary character literal, e.g. 'a' or '\n' or '\13'. Such a literal has type char and a value equal to the representation of c-char in the execution character set. If c-char is not representable as a single byte in the execution character set, the literal has type int and an implementation-defined value.
2) UTF-8 character literal, e.g. u8'a'. Such a literal has type char and a value equal to the ISO 10646 code point value of c-char, provided that the code point value is representable with a single UTF-8 code unit. If c-char is not in the Basic Latin or C0 Controls Unicode block, the program is ill-formed.
3) UCS-2 character literal, e.g. u'貓', but not u'🍌' (u'\U0001f34c'). Such a literal has type char16_t and a value equal to the value of c-char in Unicode, if it is part of the Basic Multilingual Plane (BMP). If c-char is not part of the BMP, the program is ill-formed.
4) UCS-4 character literal, e.g. U'貓' or U'🍌'. Such a literal has type char32_t and a value equal to the value of c-char in Unicode.
5) Wide character literal, e.g. L'β' or L'貓'. Such a literal has type wchar_t and a value equal to the value of c-char in the execution wide character set. If c-char is not representable in the execution wide character set (e.g. a non-BMP value on Windows, where wchar_t is 16-bit), the value of the literal is implementation-defined.
6) Multicharacter literal, e.g. 'AB'. Such a literal has type int and an implementation-defined value.
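The types listed above can be checked at compile time. The following is a minimal sketch rather than an exhaustive test: it leaves out the u8 prefix (which needs C++17), and the helper names octal_escape_value and banana_code_point are hypothetical, introduced only for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Type of each prefixed form, as described in items (1), (3), (4) and (5).
static_assert(std::is_same<decltype('a'),  char>::value,     "ordinary literal has type char");
static_assert(std::is_same<decltype(u'a'), char16_t>::value, "u prefix gives char16_t");
static_assert(std::is_same<decltype(U'a'), char32_t>::value, "U prefix gives char32_t");
static_assert(std::is_same<decltype(L'a'), wchar_t>::value,  "L prefix gives wchar_t");

// Universal character names yield the Unicode code point value.
static_assert(u'\u00df' == 0x00DF, "BMP code point fits in char16_t");
static_assert(U'\U0001f34c' == 0x1F34C, "non-BMP code point needs char32_t");

// Hypothetical helper: value of the octal escape '\13' (octal 13 is decimal 11).
inline int octal_escape_value() { return '\13'; }

// Hypothetical helper: code point of U'\U0001f34c' widened to unsigned long.
inline unsigned long banana_code_point() {
    return static_cast<unsigned long>(U'\U0001f34c');
}
```

Writing u'\U0001f34c' instead of U'\U0001f34c' would be rejected under the rule in item (3), since that code point lies outside the BMP.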

Notes

Many implementations of multicharacter literals use the values of each char in the literal to initialize successive bytes of the resulting integer, in big-endian order, e.g. the value of '\1\2\3\4' is 0x01020304.
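The packing scheme described above can be sketched as follows. The function pack_big_endian is a hypothetical helper that reproduces the common scheme by hand, assuming 8-bit bytes and an int of at least 32 bits. Whether a given compiler actually uses this scheme is implementation-defined, so the comparison is exposed as a run-time query instead of being asserted. Some compilers emit a warning for multicharacter literals.

```cpp
#include <cassert>
#include <type_traits>

// A multicharacter literal always has type int, whatever its value.
static_assert(std::is_same<decltype('AB'), int>::value,
              "multicharacter literal has type int");

// Hypothetical helper: place four byte values into an int, first byte in the
// most significant position (assumes int is at least 32 bits wide).
inline int pack_big_endian(unsigned char b0, unsigned char b1,
                           unsigned char b2, unsigned char b3) {
    return static_cast<int>((static_cast<unsigned>(b0) << 24) |
                            (static_cast<unsigned>(b1) << 16) |
                            (static_cast<unsigned>(b2) << 8)  |
                             static_cast<unsigned>(b3));
}

// Reports whether this implementation follows the common scheme.
// The result is implementation-defined, so it is not asserted here.
inline bool uses_common_packing() {
    return '\1\2\3\4' == pack_big_endian(1, 2, 3, 4);
}
```

Because the value is implementation-defined, portable code should not depend on the result of uses_common_packing().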

In C, character constants such as 'a' or '\n' have type int, rather than char.
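On the C++ side, this difference is visible in sizeof and in overload resolution. The sketch below uses a hypothetical overload set named which to illustrate it. In C, sizeof('a') would equal sizeof(int) instead.

```cpp
#include <cassert>

// In C++, 'a' has type char, and sizeof(char) is 1 by definition.
static_assert(sizeof('a') == 1, "character literal has type char in C++");

// Hypothetical overload set used to observe the type of the argument.
inline const char* which(char) { return "char"; }
inline const char* which(int)  { return "int"; }

// A character literal selects the char overload. Applying unary plus
// promotes it to int and selects the int overload instead.
inline bool literal_selects_char() { return which('a')[0] == 'c'; }
inline bool promoted_selects_int() { return which(+'a')[0] == 'i'; }
```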

See also