
[0/3] C++20 P0482R6 and C2X N2653: support for char8_t, mbrtoc8(), and c8rtomb().

Message ID 8c4dcb0d-c2d9-0182-034d-b217e7881d77@honermann.net

Tom Honermann June 7, 2021, 2:07 a.m. UTC
This series of patches provides the following:
- A fix for bug 25744 [1].
- Implementations of the mbrtoc8 and c8rtomb functions adopted for
   C++20 via WG21 P0482R6 [2] and proposed for C2X in WG14 N2653 [3].
- A char8_t typedef as proposed for C2X in WG14 N2653 [3].
- A _CHAR8_T_SOURCE feature test macro to be used to opt in to support
   for the char8_t typedef and mbrtoc8/c8rtomb functions in pre-C++20
   or C source code.
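
For illustration only, here is a rough sketch of how a caller might drive
mbrtoc8 to turn a locale-encoded string into UTF-8 code units. It assumes
the N2653 signature and return-value protocol (the same shape as mbrtoc16,
with (size_t)-3 meaning "another code unit from the previous character").
The helper names next_c8 and locale_to_utf8 are hypothetical, as is the
ASCII-only stand-in used when mbrtoc8 is not available. The _GNU_SOURCE and
glibc 2.36 checks are assumptions about released glibc and are not part of
this series.

```c
#define _GNU_SOURCE       /* assumption: exposes mbrtoc8 in released glibc */
#define _CHAR8_T_SOURCE   /* opt-in macro proposed by this series */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

/* Assumption: glibc 2.36 or later declares mbrtoc8 when _GNU_SOURCE is
   in effect (__USE_GNU is how glibc's features.h records that). */
#if defined(__GLIBC__) && defined(__USE_GNU)
# if __GLIBC_PREREQ(2, 36)
#  define HAVE_MBRTOC8 1
# endif
#endif

/* Hypothetical wrapper: forwards to mbrtoc8 where available, otherwise
   falls back to an ASCII-only stand-in so the sketch still builds. */
static size_t next_c8(unsigned char *out, const char *s, size_t n,
                      mbstate_t *st)
{
#ifdef HAVE_MBRTOC8
    return mbrtoc8(out, s, n, st);
#else
    (void)st;
    if (n == 0)
        return (size_t)-2;              /* no input available */
    if ((unsigned char)*s > 0x7f)
        return (size_t)-1;              /* stand-in handles ASCII only */
    *out = (unsigned char)*s;
    return *s ? 1 : 0;                  /* 0 signals the null character */
#endif
}

/* Hypothetical helper: converts the multibyte string src into at most cap
   UTF-8 code units in dst. Returns the number of code units written, or
   (size_t)-1 on an invalid or incomplete sequence. */
static size_t locale_to_utf8(const char *src, unsigned char *dst, size_t cap)
{
    assert(src != NULL && dst != NULL);

    mbstate_t st;
    memset(&st, 0, sizeof st);          /* initial conversion state */
    size_t left = strlen(src) + 1;      /* include the terminator */
    size_t len = 0;

    while (len < cap) {
        unsigned char c8;
        size_t rc = next_c8(&c8, src, left, &st);
        if (rc == (size_t)-1 || rc == (size_t)-2)
            return (size_t)-1;          /* invalid or incomplete input */
        if (rc == 0)
            break;                      /* terminator converted: done */
        dst[len++] = c8;
        if (rc != (size_t)-3) {         /* -3 consumes no input */
            src += rc;
            left -= rc;
        }
    }
    return len;
}
```

In the default "C" locale an ASCII string passes through unchanged, one
code unit per call. After setlocale() selects a multibyte locale, the same
loop would receive the remaining code units of each character through
(size_t)-3 returns.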

Patch 1: A fix and test for bug 25744 [1].
Patch 2: Definitions of the mbrtoc8 and c8rtomb functions, the char8_t
          typedef, and the _CHAR8_T_SOURCE feature test macro.
Patch 3: Tests for the mbrtoc8 and c8rtomb functions and the char8_t
          typedef.

The fix for bug 25744 [1] is included in this patch series because the
tests for mbrtoc8 and c8rtomb depend on it to exercise the special
case where a pair of Unicode code points is converted to or from a single
double-byte sequence.  Such conversion cases exist for Big5-HKSCS.

Tom.

[1]: Bug 25744
      "mbrtowc with Big5-HKSCS returns 2 instead of 1 when consuming the
      second byte of certain double byte characters"
      https://sourceware.org/bugzilla/show_bug.cgi?id=25744

[2]: WG21 P0482R6
      "char8_t: A type for UTF-8 characters and strings (Revision 6)"
      https://wg21.link/p0482r6

[3]: WG14 N2653
      "char8_t: A type for UTF-8 characters and strings (Revision 1)"
      http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2653.htm