std::filesystem::u8path

From cppreference.com
Defined in header <filesystem>
template< class Source >
path u8path( const Source& source );
(1) (since C++17)
template< class InputIt >
path u8path( InputIt first, InputIt last );
(2) (since C++17)

Constructs a path p from a UTF-8 encoded sequence of chars, supplied as a std::string, a std::string_view, a null-terminated multibyte string, or a [first, last) iterator pair.

  • If path::value_type is char and the native encoding is UTF-8, constructs a path directly, as if by path(source) or path(first, last). Note: this is the typical situation on a POSIX system that uses Unicode, such as Linux.
  • Otherwise, if path::value_type is wchar_t and the native encoding is UTF-16 (this is the situation on Windows), or if path::value_type is char16_t (native encoding guaranteed to be UTF-16) or char32_t (native encoding guaranteed to be UTF-32), first converts the UTF-8 character sequence to a temporary string tmp of type path::string_type, and then constructs the new path as if by path(tmp).
  • Otherwise (for non-UTF-8 narrow character encodings and for non-UTF-16 wchar_t), first converts the UTF-8 character sequence to a temporary UTF-32-encoded string tmp of type std::u32string, and then constructs the new path as if by path(tmp). This branch is taken on a POSIX system with a non-Unicode multibyte or single-byte encoded filesystem.
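
Whichever of the three branches above applies, the caller passes UTF-8 bytes and can read UTF-8 back with path::u8string(). The following is a minimal sketch of that round trip, not a definitive implementation. The helper names path_from_utf8 and utf8_bytes_of and the constant native_is_narrow are hypothetical, introduced only for illustration. The non-ASCII character is written as explicit byte escapes so the example does not depend on the source file encoding.

```cpp
#include <filesystem>
#include <string>
#include <type_traits>

namespace fs = std::filesystem;

// Hypothetical constant: true when the native character type is char
// (the usual POSIX case), false when it is a wide type (e.g. on Windows).
constexpr bool native_is_narrow = std::is_same_v<fs::path::value_type, char>;

// Hypothetical helper: build a path from bytes the caller knows to be UTF-8.
fs::path path_from_utf8(const std::string& utf8_bytes)
{
    return fs::u8path(utf8_bytes);
}

// Hypothetical helper: read the path back as UTF-8 bytes.
// u8string() returns std::string in C++17 and std::u8string in C++20,
// so the result is copied element by element into a std::string.
std::string utf8_bytes_of(const fs::path& p)
{
    const auto u8 = p.u8string();
    return std::string(u8.begin(), u8.end());
}
```

For example, path_from_utf8("caf" "\xC3\xA9" ".txt") names a file whose stem ends in U+00E9, and utf8_bytes_of returns the same bytes for it, assuming the native encoding can represent that character.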

Parameters

source - a UTF-8 encoded std::string, std::string_view, a pointer to a null-terminated multibyte string, or an input iterator with char value type that points to a null-terminated multibyte string
first, last - pair of InputIterators that specify a UTF-8 encoded character sequence
Type requirements
- InputIt must meet the requirements of InputIterator.
- The value type of InputIt must be char.
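
The accepted argument forms can be sketched as follows. This is an illustrative example only: the wrapper names from_string, from_view, from_c_string, and from_range are hypothetical and exist just to show one call per form. The last wrapper uses overload (2), which suits a buffer that is not null-terminated.

```cpp
#include <filesystem>
#include <string>
#include <string_view>
#include <vector>

namespace fs = std::filesystem;

// Overload (1): each supported Source form, all assumed to hold UTF-8 text.
fs::path from_string(const std::string& s) { return fs::u8path(s); }
fs::path from_view(std::string_view sv)    { return fs::u8path(sv); }
fs::path from_c_string(const char* ntmbs)  { return fs::u8path(ntmbs); }

// Overload (2): a [first, last) range whose value type is char.
// The bytes in the vector do not need a terminating null character.
fs::path from_range(const std::vector<char>& bytes)
{
    return fs::u8path(bytes.begin(), bytes.end());
}
```

All four wrappers are expected to produce equal paths when given the same UTF-8 text.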

Return value

The path constructed from the input string after conversion from UTF-8 to the filesystem's native character encoding.

Exceptions

May throw std::bad_alloc if memory allocation fails.


Notes

On systems where the native path format differs from the generic path format (neither Windows nor POSIX systems are examples of such OSes), an argument to this function that uses the generic format is converted to the native format.
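
As a small illustration of passing generic-format input, the sketch below splits a UTF-8 string that uses '/' separators into its components. The helper name components_of is hypothetical, and the example assumes a relative path with plain ASCII component names.

```cpp
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: construct a path from generic-format UTF-8 text and
// return each component of that path as generic-format text.
std::vector<std::string> components_of(const std::string& generic_utf8)
{
    std::vector<std::string> parts;
    for (const fs::path& part : fs::u8path(generic_utf8))
        parts.push_back(part.generic_string());
    return parts;
}
```

Under those assumptions, components_of("dir/sub/file.txt") yields the three components dir, sub, and file.txt.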

Example

#include <cstdio>
#ifdef _MSC_VER
#include <io.h>
#include <fcntl.h>
#else
#include <locale>
#include <clocale>
#endif
#include <fstream>
#include <iostream>
#include <filesystem>
namespace fs = std::filesystem;

int main()
{
#ifdef _MSC_VER
    _setmode(_fileno(stderr), _O_WTEXT);
#else
    std::setlocale(LC_ALL, "");
    std::locale::global(std::locale(""));
    std::cout.imbue(std::locale());
    std::wcerr.imbue(std::locale());
#endif

    fs::path p = fs::u8path(u8".txt");
    std::ofstream(p) << "File contents"; // Prior to LWG2676, this uses operator string_type();
                                         // on MSVC, where string_type is wstring, that only
                                         // works due to a non-standard extension.
                                         // Post-LWG2676, this uses the new fstream constructors.

    // native string representation can be used with OS APIs
    if (std::FILE* f =
#ifdef _MSC_VER
                _wfopen(p.c_str(), L"r")
#else
                std::fopen(p.c_str(), "r")
#endif
        )
    {
        int ch;
        while ((ch = std::fgetc(f)) != EOF) std::putchar(ch);
        std::fclose(f);
    }

    // multibyte and wide representation can be used for output
    std::cout << "\nFile name in narrow multibyte encoding: " << p.string() << '\n';
    std::wcerr << "File name in wide encoding: " << p.wstring() << '\n';

    fs::remove(p);
}

Output:

File contents
File name in narrow multibyte encoding: .txt
File name in wide encoding: .txt

See also

path (C++17) - represents a path (class)