[3/3] C++20 P0482R6 and C2X N2653: Tests for mbrtoc8, c8rtomb, char8_t

Message ID 2c218996-a320-9520-2ed0-8797c109ec19@honermann.net
State Superseded
Series: C++20 P0482R6 and C2X N2653: support for char8_t, mbrtoc8(), and c8rtomb().

Checks

Context                  Check    Description
dj/TryBot-apply_patch    success  Patch applied to master at the time it was sent
dj/TryBot-32bit          success  Build for i686

Commit Message

Tom Honermann Feb. 27, 2022, 4:53 p.m. UTC
This patch adds tests for the mbrtoc8 and c8rtomb functions adopted for 
C++20 via WG21 P0482R6 [1] and for C2X via WG14 N2653 [2], and for the 
char8_t typedef adopted for C2X via WG14 N2653 [2].

The tests for mbrtoc8 and c8rtomb specifically exercise conversion
from/to Big5-HKSCS because of special cases that arise with that
encoding. Big5-HKSCS defines some double-byte sequences that convert to
more than one Unicode code point. In order to test this, the locale
dependencies for running tests under wcsmbs are expanded to include
zh_HK.BIG5-HKSCS.

Tested on Linux x86_64.

Tom.

[1]: WG21 P0482R6
      "char8_t: A type for UTF-8 characters and strings (Revision 6)"
      https://wg21.link/p0482r6

[2]: WG14 N2653
      "char8_t: A type for UTF-8 characters and strings (Revision 1)"
      http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2653.htm

Comments

Adhemerval Zanella Netto May 17, 2022, 6:34 p.m. UTC | #1
On 27/02/2022 13:53, Tom Honermann via Libc-alpha wrote:
> This patch adds tests for the mbrtoc8 and c8rtomb functions adopted for C++20 via WG21 P0482R6 [1] and for C2X via WG14 N2653 [2], and for the char8_t typedef adopted for C2X via WG14 N2653 [2].
> 
> The tests for mbrtoc8 and c8rtomb specifically exercise conversion from/to Big5-HKSCS because of special cases that arise with that encoding. Big5-HKSCS defines some double byte sequences that convert to more than one Unicode code point. In order to test this, the locale dependencies for running tests under wcsmbs is expanded to include zh_HK.BIG5-HKSCS.
> 
> Tested on Linux x86_64.
> 
> Tom.
> 
> [1]: WG21 P0482R6
>      "char8_t: A type for UTF-8 characters and strings (Revision 6)"
>      https://wg21.link/p0482r6
> 
> [2]: WG14 N2653
>      "char8_t: A type for UTF-8 characters and strings (Revision 1)"
>      http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2653.htm
> commit 1fb259ae6b5da6865140c203a484b0735fc152d0
> Author: Tom Honermann <tom@honermann.net>
> Date:   Sun Feb 27 10:39:00 2022 -0500
> 
>     Tests for mbrtoc8(), c8rtomb(), and the char8_t typedef.
>     
>     This change adds tests for the mbrtoc8 and c8rtomb functions adopted for
>     C++20 via WG21 P0482R6 and for C2X via WG14 N2653, and for the char8_t
>     typedef adopted for C2X from WG14 N2653.
>     
>     The tests for mbrtoc8 and c8rtomb specifically exercise conversion to
>     and from Big5-HKSCS because of special cases that arise with that encoding.
>     Big5-HKSCS defines some double byte sequences that convert to more than
>     one Unicode code point.  In order to test this, the locale dependencies
>     for running tests under wcsmbs is expanded to include zh_HK.BIG5-HKSCS.

The patch looks OK in general; some comments below.
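
For context on the Big5-HKSCS case described in the commit message, here is a minimal standalone sketch of why the c8rtomb test feeds four code units before a double-byte character comes out: the pair U+00CA U+0304 occupies two UTF-8 code units per code point. The names utf8_encode and utf8_encode_self_check are illustrative only and are not part of the patch or of glibc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode scalar value as UTF-8.  Returns the number of code
   units written to OUT (1 to 4), or 0 if CP is a surrogate or lies
   above U+10FFFF.  Illustrative helper, not a glibc function.  */
size_t
utf8_encode (uint32_t cp, unsigned char out[4])
{
  if (cp < 0x80)
    {
      out[0] = (unsigned char) cp;
      return 1;
    }
  if (cp < 0x800)
    {
      out[0] = (unsigned char) (0xC0 | (cp >> 6));
      out[1] = (unsigned char) (0x80 | (cp & 0x3F));
      return 2;
    }
  if (cp >= 0xD800 && cp <= 0xDFFF)
    return 0;
  if (cp < 0x10000)
    {
      out[0] = (unsigned char) (0xE0 | (cp >> 12));
      out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
      out[2] = (unsigned char) (0x80 | (cp & 0x3F));
      return 3;
    }
  if (cp <= 0x10FFFF)
    {
      out[0] = (unsigned char) (0xF0 | (cp >> 18));
      out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
      out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
      out[3] = (unsigned char) (0x80 | (cp & 0x3F));
      return 4;
    }
  return 0;
}

/* Check the code point pair used by the Big5-HKSCS test: each code
   point takes two code units, four in total.  */
void
utf8_encode_self_check (void)
{
  unsigned char buf[4];
  assert (utf8_encode (0x00CA, buf) == 2 && buf[0] == 0xC3 && buf[1] == 0x8A);
  assert (utf8_encode (0x0304, buf) == 2 && buf[0] == 0xCC && buf[1] == 0x84);
}
```

So a converter to Big5-HKSCS has to hold on to U+00CA until it has seen whether a combining character follows, which is what makes this locale a useful test target.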

> 
> diff --git a/wcsmbs/Makefile b/wcsmbs/Makefile
> index bda281ad70..9b8445fa48 100644
> --- a/wcsmbs/Makefile
> +++ b/wcsmbs/Makefile
> @@ -52,6 +52,7 @@ tests := tst-wcstof wcsmbs-tst1 tst-wcsnlen tst-btowc tst-mbrtowc \
>  	 tst-c16c32-1 wcsatcliff tst-wcstol-locale tst-wcstod-nan-locale \
>  	 tst-wcstod-round test-char-types tst-fgetwc-after-eof \
>  	 tst-wcstod-nan-sign tst-c16-surrogate tst-c32-state \
> +	 test-char8-type test-mbrtoc8 test-c8rtomb \
>  	 $(addprefix test-,$(strop-tests)) tst-mbstowcs \
>  	 tst-wprintf-binary
>  
> @@ -59,7 +60,7 @@ include ../Rules
>  
>  ifeq ($(run-built-tests),yes)
>  LOCALES := de_DE.ISO-8859-1 de_DE.UTF-8 en_US.ANSI_X3.4-1968 hr_HR.ISO-8859-2 \
> -	   ja_JP.EUC-JP zh_TW.EUC-TW tr_TR.UTF-8 tr_TR.ISO-8859-9
> +	   ja_JP.EUC-JP zh_TW.EUC-TW tr_TR.UTF-8 tr_TR.ISO-8859-9 zh_HK.BIG5-HKSCS
>  include ../gen-locales.mk
>  
>  $(objpfx)tst-btowc.out: $(gen-locales)

Ok.

> diff --git a/wcsmbs/test-c8rtomb.c b/wcsmbs/test-c8rtomb.c
> new file mode 100644
> index 0000000000..14564fa00a
> --- /dev/null
> +++ b/wcsmbs/test-c8rtomb.c
> @@ -0,0 +1,519 @@
> +/* Test c8rtomb.
> +   Copyright (C) 2022 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#include <errno.h>
> +#include <limits.h>
> +#include <locale.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +#include <uchar.h>
> +#include <wchar.h>
> +#include <support/check.h>
> +
> +static int
> +test_truncated_code_unit_sequence (void)
> +{
> +  const char8_t *u8s;
> +  char buf[MB_LEN_MAX];
> +  mbstate_t s;
> +
> +  /* Missing trailing code unit for a two code byte unit sequence.  */
> +  u8s = (const char8_t*) u8"\xC2";
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t)  0); /* 1st byte processed */

This line is too long, and there are multiple occurrences below as well.
I think the extra comments seem redundant.

> +  errno = 0;
> +  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) -1); /* No trailing code unit */
> +  TEST_COMPARE (errno, EILSEQ);
> +
> +  /* Missing first trailing code unit for a three byte code unit sequence.  */
> +  u8s = (const char8_t*) u8"\xE0";
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t)  0); /* 1st byte processed */
> +  errno = 0;
> +  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) -1); /* No trailing code unit */
> +  TEST_COMPARE (errno, EILSEQ);

Tests with stack-allocated variables usually use a new scope so that the
variables do not escape it, which also simplifies their initialization a bit:

  {
    mbstate_t s = { 0 };
    const char8_t *u8s = (const char8_t*) u8"\xE0";
    char buf[MB_LEN_MAX] = { 0 };

    TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) 0);
    errno = 0;
    TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) -1);
    TEST_COMPARE (errno, EILSEQ);
  }

> +
> +  /* Missing second trailing code unit for a three byte code unit sequence.  */
> +  u8s = (const char8_t*) u8"\xE0\xA0";
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t)  0); /* 1st byte processed */
> +  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t)  0); /* 2nd byte processed */
> +  errno = 0;
> +  TEST_COMPARE (c8rtomb (buf, u8s[2], &s), (size_t) -1); /* No trailing code unit */
> +  TEST_COMPARE (errno, EILSEQ);
> +
> +  /* Missing first trailing code unit for a four byte code unit sequence.  */
> +  u8s = (const char8_t*) u8"\xF0";
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t)  0); /* 1st byte processed */
> +  errno = 0;
> +  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) -1); /* No trailing code unit */
> +  TEST_COMPARE (errno, EILSEQ);
> +
> +  /* Missing second trailing code unit for a four byte code unit sequence.  */
> +  u8s = (const char8_t*) u8"\xF0\x90";
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t)  0); /* 1st byte processed */
> +  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t)  0); /* 2nd byte processed */
> +  errno = 0;
> +  TEST_COMPARE (c8rtomb (buf, u8s[2], &s), (size_t) -1); /* No trailing code unit */
> +  TEST_COMPARE (errno, EILSEQ);
> +
> +  /* Missing third trailing code unit for a four byte code unit sequence.  */
> +  u8s = (const char8_t*) u8"\xF0\x90\x80";
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t)  0); /* 1st byte processed */
> +  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t)  0); /* 2nd byte processed */
> +  TEST_COMPARE (c8rtomb (buf, u8s[2], &s), (size_t)  0); /* 3rd byte processed */
> +  errno = 0;
> +  TEST_COMPARE (c8rtomb (buf, u8s[3], &s), (size_t) -1); /* No trailing code unit */
> +  TEST_COMPARE (errno, EILSEQ);
> +
> +  return 0;
> +}
> +
> +static int
> +test_invalid_trailing_code_unit_sequence (void)
> +{
> +  const char8_t *u8s;
> +  char buf[MB_LEN_MAX];
> +  mbstate_t s;
> +
> +  /* Invalid trailing code unit for a two code byte unit sequence.  */
> +  u8s = (const char8_t*) u8"\xC2\xC0";
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t)  0); /* 1st byte processed */
> +  errno = 0;
> +  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) -1); /* Invalid trailing code unit */
> +  TEST_COMPARE (errno, EILSEQ);
> +
> +  /* Invalid first trailing code unit for a three byte code unit sequence.  */
> +  u8s = (const char8_t*) u8"\xE0\xC0";
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t)  0); /* 1st byte processed */
> +  errno = 0;
> +  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) -1); /* Invalid trailing code unit */
> +  TEST_COMPARE (errno, EILSEQ);
> +
> +  /* Invalid second trailing code unit for a three byte code unit sequence.  */
> +  u8s = (const char8_t*) u8"\xE0\xA0\xC0";
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t)  0); /* 1st byte processed */
> +  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t)  0); /* 2nd byte processed */
> +  errno = 0;
> +  TEST_COMPARE (c8rtomb (buf, u8s[2], &s), (size_t) -1); /* Invalid trailing code unit */
> +  TEST_COMPARE (errno, EILSEQ);
> +
> +  /* Invalid first trailing code unit for a four byte code unit sequence.  */
> +  u8s = (const char8_t*) u8"\xF0\xC0";
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t)  0); /* 1st byte processed */
> +  errno = 0;
> +  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) -1); /* Invalid trailing code unit */
> +  TEST_COMPARE (errno, EILSEQ);
> +
> +  /* Invalid second trailing code unit for a four byte code unit sequence.  */
> +  u8s = (const char8_t*) u8"\xF0\x90\xC0";
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t)  0); /* 1st byte processed */
> +  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t)  0); /* 2nd byte processed */
> +  errno = 0;
> +  TEST_COMPARE (c8rtomb (buf, u8s[2], &s), (size_t) -1); /* Invalid trailing code unit */
> +  TEST_COMPARE (errno, EILSEQ);
> +
> +  /* Invalid third trailing code unit for a four byte code unit sequence.  */
> +  u8s = (const char8_t*) u8"\xF0\x90\x80\xC0";
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t)  0); /* 1st byte processed */
> +  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t)  0); /* 2nd byte processed */
> +  TEST_COMPARE (c8rtomb (buf, u8s[2], &s), (size_t)  0); /* 3rd byte processed */
> +  errno = 0;
> +  TEST_COMPARE (c8rtomb (buf, u8s[3], &s), (size_t) -1); /* Invalid trailing code unit */
> +  TEST_COMPARE (errno, EILSEQ);
> +
> +  return 0;
> +}
> +
> +static int
> +test_lone_trailing_code_units (void)
> +{
> +  const char8_t *u8s;
> +  char buf[MB_LEN_MAX];
> +  mbstate_t s;
> +
> +  /* Lone trailing code unit.  */
> +  u8s = (const char8_t*) u8"\x80";
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  errno = 0;
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) -1); /* Lone trailing code unit */
> +  TEST_COMPARE (errno, EILSEQ);
> +
> +  return 0;
> +}
> +
> +static int
> +test_overlong_encoding (void)
> +{
> +  const char8_t *u8s;
> +  char buf[MB_LEN_MAX];
> +  mbstate_t s;
> +
> +  /* Two byte overlong encoding.  */
> +  u8s = (const char8_t*) u8"\xC0\x80";
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  errno = 0;
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) -1); /* Invalid lead code unit */
> +  TEST_COMPARE (errno, EILSEQ);
> +
> +  /* Two byte overlong encoding.  */
> +  u8s = (const char8_t*) u8"\xC1\x80";
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  errno = 0;
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) -1); /* Invalid lead code unit */
> +  TEST_COMPARE (errno, EILSEQ);
> +
> +  /* Three byte overlong encoding.  */
> +  u8s = (const char8_t*) u8"\xE0\x9F\xBF";
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t)  0); /* First byte processed */
> +  errno = 0;
> +  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) -1); /* Invalid trailing code unit */
> +  TEST_COMPARE (errno, EILSEQ);
> +
> +  /* Four byte overlong encoding.  */
> +  u8s = (const char8_t*) u8"\xF0\x8F\xBF\xBF";
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t)  0); /* First byte processed */
> +  errno = 0;
> +  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) -1); /* Invalid trailing code unit */
> +  TEST_COMPARE (errno, EILSEQ);
> +
> +  return 0;
> +}
> +
> +static int
> +test_surrogate_range (void)
> +{
> +  const char8_t *u8s;
> +  char buf[MB_LEN_MAX];
> +  mbstate_t s;
> +
> +  /* Would encode U+D800.  */
> +  u8s = (const char8_t*) u8"\xED\xA0\x80";
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t)  0); /* First byte processed */
> +  errno = 0;
> +  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) -1); /* Invalid trailing code unit */
> +  TEST_COMPARE (errno, EILSEQ);
> +
> +  /* Would encode U+DFFF.  */
> +  u8s = (const char8_t*) u8"\xED\xBF\xBF";
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t)  0); /* First byte processed */
> +  errno = 0;
> +  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) -1); /* Invalid trailing code unit */
> +  TEST_COMPARE (errno, EILSEQ);
> +
> +  return 0;
> +}
> +
> +static int
> +test_out_of_range_encoding (void)
> +{
> +  const char8_t *u8s;
> +  char buf[MB_LEN_MAX];
> +  mbstate_t s;
> +
> +  /* Would encode U+00110000.  */
> +  u8s = (const char8_t*) u8"\xF4\x90\x80\x80";
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t)  0); /* First byte processed */
> +  errno = 0;
> +  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) -1); /* Invalid trailing code unit */
> +  TEST_COMPARE (errno, EILSEQ);
> +
> +  /* Would encode U+00140000.  */
> +  u8s = (const char8_t*) u8"\xF5\x90\x80\x80";
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  errno = 0;
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) -1); /* Invalid lead code unit */
> +  TEST_COMPARE (errno, EILSEQ);
> +
> +  return 0;
> +}
> +
> +static int
> +test_null_output_buffer (void)
> +{
> +  char buf[MB_LEN_MAX];
> +  mbstate_t s;
> +
> +  /* Null character with an initial state.  */
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (c8rtomb (NULL, u8"X"[0], &s), (size_t) 1); /* null byte processed */
> +  TEST_VERIFY (mbsinit (&s));    /* Assert the state is now an initial state.  */
> +
> +  /* Null buffer with a state corresponding to an incompletely read code
> +     unit sequence.  In this case, an error occurs since insufficient
> +     information is available to complete the already started code unit
> +     sequence and return to the initial state.  */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (c8rtomb (buf, u8"\xC2"[0], &s), (size_t)  0);  /* 1st byte processed */
> +  errno = 0;
> +  TEST_COMPARE (c8rtomb (NULL, u8"\x80"[0], &s), (size_t) -1); /* No trailing code unit */
> +  TEST_COMPARE (errno, EILSEQ);
> +
> +  return 0;
> +}
> +
> +static int
> +test_utf8 (void)
> +{
> +  const char8_t *u8s;
> +  char buf[MB_LEN_MAX];
> +  mbstate_t s;
> +
> +  TEST_VERIFY_EXIT (setlocale (LC_ALL, "de_DE.UTF-8") != NULL);

Use xsetlocale.
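
To illustrate the idea, here is a rough standalone sketch of what such a wrapper does: it calls setlocale and terminates the test if the locale cannot be selected, so the caller needs no explicit NULL check. The name sketch_xsetlocale is a stand-in for illustration; the real helper lives in the test support library and reports failures through the support framework rather than with a plain fprintf.

```c
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the support library's xsetlocale: select
   LOCALE for CATEGORY and exit with a failure status if that does not
   work.  On success, return whatever setlocale returned.  */
char *
sketch_xsetlocale (int category, const char *locale)
{
  char *result = setlocale (category, locale);
  if (result == NULL)
    {
      fprintf (stderr, "error: setlocale (%d, \"%s\") failed\n",
               category, locale);
      exit (EXIT_FAILURE);
    }
  return result;
}
```

With a helper of this shape, the TEST_VERIFY_EXIT line in the test reduces to a single call.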

> +
> +  /* Null character.  */
> +  u8s = (const char8_t*) u8"\x00"; /* U+0000 => 0x00 */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) 1); /* 1st byte processed */
> +  TEST_COMPARE (buf[0], (char) 0x00);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* First non-null character in the code point range that maps to a single
> +     code unit.  */
> +  u8s = (const char8_t*) u8"\x01"; /* U+0001 => 0x01 */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) 1); /* 1st byte processed */
> +  TEST_COMPARE (buf[0], (char) 0x01);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* Last character in the code point range that maps to a single code unit.  */
> +  u8s = (const char8_t*) u8"\x7F"; /* U+007F => 0x7F */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) 1); /* 1st byte processed */
> +  TEST_COMPARE (buf[0], (char) 0x7F);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* First character in the code point range that maps to two code units.  */
> +  u8s = (const char8_t*) u8"\xC2\x80"; /* U+0080 => 0xC2 0x80 */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) 0); /* 1st byte processed */
> +  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) 2); /* 2nd byte processed */
> +  TEST_COMPARE (buf[0], (char) 0xC2);
> +  TEST_COMPARE (buf[1], (char) 0x80);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* Last character in the code point range that maps to two code units.  */
> +  u8s = (const char8_t*) u8"\u07FF"; /* U+07FF => 0xDF 0xBF */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) 0); /* 1st byte processed */
> +  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) 2); /* 2nd byte processed */
> +  TEST_COMPARE (buf[0], (char) 0xDF);
> +  TEST_COMPARE (buf[1], (char) 0xBF);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* First character in the code point range that maps to three code units.  */
> +  u8s = (const char8_t*) u8"\u0800"; /* U+0800 => 0xE0 0xA0 0x80 */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) 0); /* 1st byte processed */
> +  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) 0); /* 2nd byte processed */
> +  TEST_COMPARE (c8rtomb (buf, u8s[2], &s), (size_t) 3); /* 3rd byte processed */
> +  TEST_COMPARE (buf[0], (char) 0xE0);
> +  TEST_COMPARE (buf[1], (char) 0xA0);
> +  TEST_COMPARE (buf[2], (char) 0x80);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* Last character in the code point range that maps to three code units
> +     before the surrogate code point range.  */
> +  u8s = (const char8_t*) u8"\uD7FF"; /* U+D7FF => 0xED 0x9F 0xBF */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) 0); /* 1st byte processed */
> +  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) 0); /* 2nd byte processed */
> +  TEST_COMPARE (c8rtomb (buf, u8s[2], &s), (size_t) 3); /* 3rd byte processed */
> +  TEST_COMPARE (buf[0], (char) 0xED);
> +  TEST_COMPARE (buf[1], (char) 0x9F);
> +  TEST_COMPARE (buf[2], (char) 0xBF);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* First character in the code point range that maps to three code units
> +     after the surrogate code point range.  */
> +  u8s = (const char8_t*) u8"\uE000"; /* U+E000 => 0xEE 0x80 0x80 */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) 0); /* 1st byte processed */
> +  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) 0); /* 2nd byte processed */
> +  TEST_COMPARE (c8rtomb (buf, u8s[2], &s), (size_t) 3); /* 3rd byte processed */
> +  TEST_COMPARE (buf[0], (char) 0xEE);
> +  TEST_COMPARE (buf[1], (char) 0x80);
> +  TEST_COMPARE (buf[2], (char) 0x80);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* Not a BOM.  */
> +  u8s = (const char8_t*) u8"\uFEFF"; /* U+FEFF => 0xEF 0xBB 0xBF */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) 0); /* 1st byte processed */
> +  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) 0); /* 2nd byte processed */
> +  TEST_COMPARE (c8rtomb (buf, u8s[2], &s), (size_t) 3); /* 3rd byte processed */
> +  TEST_COMPARE (buf[0], (char) 0xEF);
> +  TEST_COMPARE (buf[1], (char) 0xBB);
> +  TEST_COMPARE (buf[2], (char) 0xBF);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* Replacement character.  */
> +  u8s = (const char8_t*) u8"\uFFFD"; /* U+FFFD => 0xEF 0xBF 0xBD */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) 0); /* 1st byte processed */
> +  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) 0); /* 2nd byte processed */
> +  TEST_COMPARE (c8rtomb (buf, u8s[2], &s), (size_t) 3); /* 3rd byte processed */
> +  TEST_COMPARE (buf[0], (char) 0xEF);
> +  TEST_COMPARE (buf[1], (char) 0xBF);
> +  TEST_COMPARE (buf[2], (char) 0xBD);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* Last character in the code point range that maps to three code units.  */
> +  u8s = (const char8_t*) u8"\uFFFF"; /* U+FFFF => 0xEF 0xBF 0xBF */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) 0); /* 1st byte processed */
> +  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) 0); /* 2nd byte processed */
> +  TEST_COMPARE (c8rtomb (buf, u8s[2], &s), (size_t) 3); /* 3rd byte processed */
> +  TEST_COMPARE (buf[0], (char) 0xEF);
> +  TEST_COMPARE (buf[1], (char) 0xBF);
> +  TEST_COMPARE (buf[2], (char) 0xBF);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* First character in the code point range that maps to four code units.  */
> +  u8s = (const char8_t*) u8"\U00010000"; /* U+10000 => 0xF0 0x90 0x80 0x80 */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) 0); /* 1st byte processed */
> +  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) 0); /* 2nd byte processed */
> +  TEST_COMPARE (c8rtomb (buf, u8s[2], &s), (size_t) 0); /* 3rd byte processed */
> +  TEST_COMPARE (c8rtomb (buf, u8s[3], &s), (size_t) 4); /* 4th byte processed */
> +  TEST_COMPARE (buf[0], (char) 0xF0);
> +  TEST_COMPARE (buf[1], (char) 0x90);
> +  TEST_COMPARE (buf[2], (char) 0x80);
> +  TEST_COMPARE (buf[3], (char) 0x80);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* Last character in the code point range that maps to four code units.  */
> +  u8s = (const char8_t*) u8"\U0010FFFF"; /* U+10FFFF => 0xF4 0x8F 0xBF 0xBF */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) 0); /* 1st byte processed */
> +  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) 0); /* 2nd byte processed */
> +  TEST_COMPARE (c8rtomb (buf, u8s[2], &s), (size_t) 0); /* 3rd byte processed */
> +  TEST_COMPARE (c8rtomb (buf, u8s[3], &s), (size_t) 4); /* 4th byte processed */
> +  TEST_COMPARE (buf[0], (char) 0xF4);
> +  TEST_COMPARE (buf[1], (char) 0x8F);
> +  TEST_COMPARE (buf[2], (char) 0xBF);
> +  TEST_COMPARE (buf[3], (char) 0xBF);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  return 0;
> +}
> +
> +static int
> +test_big5_hkscs (void)
> +{
> +  const char8_t *u8s;
> +  char buf[MB_LEN_MAX];
> +  mbstate_t s;
> +
> +  TEST_VERIFY_EXIT (setlocale (LC_ALL, "zh_HK.BIG5-HKSCS") != NULL);

Use xsetlocale.

> +
> +  /* A pair of two byte UTF-8 code unit sequences that map a Unicode code
> +     point and combining character to a single double byte character.  */
> +  u8s = (const char8_t*) u8"\u00CA\u0304"; /* U+00CA U+0304 => 0x88 0x62 */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) 0); /* 1st byte processed */
> +  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) 0); /* 2nd byte processed */
> +  TEST_COMPARE (c8rtomb (buf, u8s[2], &s), (size_t) 0); /* 3rd byte processed */
> +  TEST_COMPARE (c8rtomb (buf, u8s[3], &s), (size_t) 2); /* 4th byte processed */
> +  TEST_COMPARE (buf[0], (char) 0x88);
> +  TEST_COMPARE (buf[1], (char) 0x62);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* Another pair of two byte UTF-8 code unit sequences that map a Unicode code
> +     point and combining character to a single double byte character.  */
> +  u8s = (const char8_t*) u8"\u00EA\u030C"; /* U+00EA U+030C => 0x88 0xA5 */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) 0); /* 1st byte processed */
> +  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) 0); /* 2nd byte processed */
> +  TEST_COMPARE (c8rtomb (buf, u8s[2], &s), (size_t) 0); /* 3rd byte processed */
> +  TEST_COMPARE (c8rtomb (buf, u8s[3], &s), (size_t) 2); /* 4th byte processed */
> +  TEST_COMPARE (buf[0], (char) 0x88);
> +  TEST_COMPARE (buf[1], (char) 0xA5);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  return 0;
> +}
> +
> +static int
> +do_test (void)
> +{
> +  test_truncated_code_unit_sequence ();
> +  test_invalid_trailing_code_unit_sequence ();
> +  test_lone_trailing_code_units ();
> +  test_overlong_encoding ();
> +  test_surrogate_range ();
> +  test_out_of_range_encoding ();
> +  test_null_output_buffer ();
> +  test_utf8 ();
> +  test_big5_hkscs ();
> +  return 0;
> +}
> +
> +#include <support/test-driver.c>

Ok.
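
As a reading aid for the error cases above, the following sketch summarizes which lead code units and which second code units the tests expect to be accepted. It follows the well-formed UTF-8 byte ranges from the Unicode standard and is not taken from the c8rtomb implementation; the function names are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>

/* Return true if LEAD can start a well-formed UTF-8 sequence.  0x80 to
   0xBF are trailing code units, 0xC0 and 0xC1 can only start overlong
   forms, and 0xF5 and above would encode values beyond U+10FFFF.  */
bool
utf8_lead_ok (unsigned char lead)
{
  return lead < 0x80 || (lead >= 0xC2 && lead <= 0xF4);
}

/* Return true if SECOND may follow the multi-byte lead code unit LEAD.
   Most leads accept 0x80 to 0xBF; four leads narrow that range.  */
bool
utf8_second_ok (unsigned char lead, unsigned char second)
{
  unsigned char lo = 0x80;
  unsigned char hi = 0xBF;

  if (lead < 0xC2 || lead > 0xF4)
    return false;
  switch (lead)
    {
    case 0xE0: lo = 0xA0; break;  /* Excludes three byte overlong forms.  */
    case 0xED: hi = 0x9F; break;  /* Excludes surrogates U+D800..U+DFFF.  */
    case 0xF0: lo = 0x90; break;  /* Excludes four byte overlong forms.  */
    case 0xF4: hi = 0x8F; break;  /* Excludes values above U+10FFFF.  */
    default: break;
    }
  return second >= lo && second <= hi;
}

/* Mirror a few of the expectations encoded in the tests above.  */
void
utf8_ranges_self_check (void)
{
  assert (!utf8_lead_ok (0xC0) && !utf8_lead_ok (0xF5));
  assert (!utf8_second_ok (0xE0, 0x9F) && utf8_second_ok (0xE0, 0xA0));
  assert (!utf8_second_ok (0xED, 0xA0) && utf8_second_ok (0xED, 0x9F));
}
```

This matches where the tests expect EILSEQ: on the first code unit for 0xC0, 0xC1, 0x80, and 0xF5, and on the second code unit for the E0 9F, ED A0, F0 8F, and F4 90 prefixes.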

> diff --git a/wcsmbs/test-char8-type.c b/wcsmbs/test-char8-type.c
> new file mode 100644
> index 0000000000..642c7044ed
> --- /dev/null
> +++ b/wcsmbs/test-char8-type.c
> @@ -0,0 +1,31 @@
> +/* Test the char8_t typedef.
> +   Copyright (C) 2022 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#include <uchar.h>
> +
> +/* Verify that the char8_t type is recognized.  */
> +char8_t c8;
> +
> +static int
> +do_test (void)
> +{
> +  /* This is a compilation test.  */
> +  return 0;
> +}
> +
> +#include <support/test-driver.c>

This test seems redundant.  Maybe it would be better to verify that
char8_t is always an unsigned char by updating c++-types.data.  That
would require adding all the other types pulled in by uchar.h.
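
Independently of c++-types.data, which is a different mechanism, a C-only compile-time check along these lines would also catch a wrong underlying type. This is only a sketch: sketch_char8_t is a stand-in typedef so that the fragment builds with toolchains that do not yet provide char8_t; in the real test it would be replaced by char8_t from uchar.h.

```c
#include <assert.h>

/* Stand-in for char8_t (assumption: replace with the typedef from
   <uchar.h> once the headers from this series are in use).  */
typedef unsigned char sketch_char8_t;

/* _Generic selects on the exact type, so this fails to compile if the
   typedef is char or signed char instead of unsigned char.  */
static_assert (_Generic ((sketch_char8_t) 0,
                         unsigned char: 1,
                         default: 0),
               "char8_t must be a typedef of unsigned char");

/* Run-time mirror of the compile-time check above.  */
int
sketch_char8_is_unsigned_char (void)
{
  return _Generic ((sketch_char8_t) 0, unsigned char: 1, default: 0);
}
```

A check like this makes the test do more than confirm that the name is declared.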

> diff --git a/wcsmbs/test-mbrtoc8.c b/wcsmbs/test-mbrtoc8.c
> new file mode 100644
> index 0000000000..c5061635ac
> --- /dev/null
> +++ b/wcsmbs/test-mbrtoc8.c
> @@ -0,0 +1,462 @@
> +/* Test mbrtoc8.
> +   Copyright (C) 2022 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#include <locale.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +#include <uchar.h>
> +#include <wchar.h>
> +#include <support/check.h>
> +
> +static int
> +test_utf8 (void)
> +{
> +  const char *mbs;
> +  char8_t buf[1];
> +  mbstate_t s;
> +
> +  TEST_VERIFY_EXIT (setlocale (LC_ALL, "de_DE.UTF-8") != NULL);

Use xsetlocale.

> +
> +  /* No inputs.  */
> +  mbs = "";
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 0, &s), (size_t) -2); /* no input */
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* Null character.  */
> +  mbs = "\x00"; /* 0x00 => U+0000 */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) 0); /* null byte written */

This line is too long, and there are multiple occurrences below.

> +  mbs += 1;
> +  TEST_COMPARE (buf[0], 0x00);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* First non-null character in the code point range that maps to a single
> +     code unit.  */
> +  mbs = "\x01"; /* 0x01 => U+0001 */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) 1); /* 1st byte processed */
> +  mbs += 1;
> +  TEST_COMPARE (buf[0], 0x01);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* Last character in the code point range that maps to a single code unit.  */
> +  mbs = "\x7F"; /* 0x7F => U+007F */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) 1);  /* 1st byte processed */
> +  mbs += 1;
> +  TEST_COMPARE (buf[0], 0x7F);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* First character in the code point range that maps to two code units.  */
> +  mbs = "\xC2\x80"; /* 0xC2 0x80 => U+0080 */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) 2);  /* 1st byte written */
> +  mbs += 2;
> +  TEST_COMPARE (buf[0], 0xC2);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 2nd byte written */
> +  TEST_COMPARE (buf[0], 0x80);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* Same as last test, but one code unit at a time.  */
> +  mbs = "\xC2\x80"; /* 0xC2 0x80 => U+0080 */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
> +  mbs += 1;
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) 1);              /* 1st byte written */
> +  mbs += 1;
> +  TEST_COMPARE (buf[0], 0xC2);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 2nd byte written */
> +  TEST_COMPARE (buf[0], 0x80);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* Last character in the code point range that maps to two code units.  */
> +  mbs = "\xDF\xBF"; /* 0xDF 0xBF => U+07FF */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) 2);  /* 1st byte written */
> +  mbs += 2;
> +  TEST_COMPARE (buf[0], 0xDF);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 2nd byte written */
> +  TEST_COMPARE (buf[0], 0xBF);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* Same as last test, but one code unit at a time.  */
> +  mbs = "\xDF\xBF"; /* 0xDF 0xBF => U+07FF */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
> +  mbs += 1;
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) 1);              /* 1st byte written */
> +  mbs += 1;
> +  TEST_COMPARE (buf[0], 0xDF);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 2nd byte written */
> +  TEST_COMPARE (buf[0], 0xBF);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* First character in the code point range that maps to three code units.  */
> +  mbs = "\xE0\xA0\x80"; /* 0xE0 0xA0 0x80 => U+0800 */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) 3);  /* 1st byte written */
> +  mbs += 3;
> +  TEST_COMPARE (buf[0], 0xE0);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 2nd byte written */
> +  TEST_COMPARE (buf[0], 0xA0);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 3rd byte written */
> +  TEST_COMPARE (buf[0], 0x80);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* Same as last test, but one code unit at a time.  */
> +  mbs = "\xE0\xA0\x80"; /* 0xE0 0xA0 0x80 => U+0800 */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
> +  mbs += 1;
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
> +  mbs += 1;
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) 1);              /* 1st byte written */
> +  mbs += 1;
> +  TEST_COMPARE (buf[0], 0xE0);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 2nd byte written */
> +  TEST_COMPARE (buf[0], 0xA0);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 3rd byte written */
> +  TEST_COMPARE (buf[0], 0x80);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* Last character in the code point range that maps to three code units
> +     before the surrogate code point range.  */
> +  mbs = "\xED\x9F\xBF"; /* 0xED 0x9F 0xBF => U+D7FF */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) 3);  /* 1st byte written */
> +  mbs += 3;
> +  TEST_COMPARE (buf[0], 0xED);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 2nd byte written */
> +  TEST_COMPARE (buf[0], 0x9F);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 3rd byte written */
> +  TEST_COMPARE (buf[0], 0xBF);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* Same as last test, but one code unit at a time.  */
> +  mbs = "\xED\x9F\xBF"; /* 0xED 0x9F 0xBF => U+D7FF */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
> +  mbs += 1;
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
> +  mbs += 1;
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) 1);              /* 1st byte written */
> +  mbs += 1;
> +  TEST_COMPARE (buf[0], 0xED);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 2nd byte written */
> +  TEST_COMPARE (buf[0], 0x9F);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 3rd byte written */
> +  TEST_COMPARE (buf[0], 0xBF);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* First character in the code point range that maps to three code units
> +     after the surrogate code point range.  */
> +  mbs = "\xEE\x80\x80"; /* 0xEE 0x80 0x80 => U+E000 */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) 3);  /* 1st byte written */
> +  mbs += 3;
> +  TEST_COMPARE (buf[0], 0xEE);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 2nd byte written */
> +  TEST_COMPARE (buf[0], 0x80);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 3rd byte written */
> +  TEST_COMPARE (buf[0], 0x80);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* Same as last test, but one code unit at a time.  */
> +  mbs = "\xEE\x80\x80"; /* 0xEE 0x80 0x80 => U+E000 */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
> +  mbs += 1;
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
> +  mbs += 1;
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) 1);              /* 1st byte written */
> +  mbs += 1;
> +  TEST_COMPARE (buf[0], 0xEE);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 2nd byte written */
> +  TEST_COMPARE (buf[0], 0x80);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 3rd byte written */
> +  TEST_COMPARE (buf[0], 0x80);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* Not a BOM.  */
> +  mbs = "\xEF\xBB\xBF"; /* 0xEF 0xBB 0xBF => U+FEFF */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) 3);  /* 1st byte written */
> +  mbs += 3;
> +  TEST_COMPARE (buf[0], 0xEF);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 2nd byte written */
> +  TEST_COMPARE (buf[0], 0xBB);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 3rd byte written */
> +  TEST_COMPARE (buf[0], 0xBF);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* Same as last test, but one code unit at a time.  */
> +  mbs = "\xEF\xBB\xBF"; /* 0xEF 0xBB 0xBF => U+FEFF */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
> +  mbs += 1;
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
> +  mbs += 1;
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) 1);              /* 1st byte written */
> +  mbs += 1;
> +  TEST_COMPARE (buf[0], 0xEF);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 2nd byte written */
> +  TEST_COMPARE (buf[0], 0xBB);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 3rd byte written */
> +  TEST_COMPARE (buf[0], 0xBF);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* Replacement character.  */
> +  mbs = "\xEF\xBF\xBD"; /* 0xEF 0xBF 0xBD => U+FFFD */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) 3);  /* 1st byte written */
> +  mbs += 3;
> +  TEST_COMPARE (buf[0], 0xEF);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 2nd byte written */
> +  TEST_COMPARE (buf[0], 0xBF);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 3rd byte written */
> +  TEST_COMPARE (buf[0], 0xBD);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* Same as last test, but one code unit at a time.  */
> +  mbs = "\xEF\xBF\xBD"; /* 0xEF 0xBF 0xBD => U+FFFD */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
> +  mbs += 1;
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
> +  mbs += 1;
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) 1);              /* 1st byte written */
> +  mbs += 1;
> +  TEST_COMPARE (buf[0], 0xEF);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 2nd byte written */
> +  TEST_COMPARE (buf[0], 0xBF);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 3rd byte written */
> +  TEST_COMPARE (buf[0], 0xBD);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* Last character in the code point range that maps to three code units.  */
> +  mbs = "\xEF\xBF\xBF"; /* 0xEF 0xBF 0xBF => U+FFFF */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) 3);  /* 1st byte processed */
> +  mbs += 3;
> +  TEST_COMPARE (buf[0], 0xEF);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 2nd byte processed */
> +  TEST_COMPARE (buf[0], 0xBF);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 3rd byte processed */
> +  TEST_COMPARE (buf[0], 0xBF);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* Same as last test, but one code unit at a time.  */
> +  mbs = "\xEF\xBF\xBF"; /* 0xEF 0xBF 0xBF => U+FFFF */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
> +  mbs += 1;
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
> +  mbs += 1;
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) 1);              /* 1st byte written */
> +  mbs += 1;
> +  TEST_COMPARE (buf[0], 0xEF);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 2nd byte written */
> +  TEST_COMPARE (buf[0], 0xBF);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 3rd byte written */
> +  TEST_COMPARE (buf[0], 0xBF);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* First character in the code point range that maps to four code units.  */
> +  mbs = "\xF0\x90\x80\x80"; /* 0xF0 0x90 0x80 0x80 => U+10000 */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) 4);  /* 1st byte written */
> +  mbs += 4;
> +  TEST_COMPARE (buf[0], 0xF0);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 2nd byte written */
> +  TEST_COMPARE (buf[0], 0x90);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 3rd byte written */
> +  TEST_COMPARE (buf[0], 0x80);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 4th byte written */
> +  TEST_COMPARE (buf[0], 0x80);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* Same as last test, but one code unit at a time.  */
> +  mbs = "\xF0\x90\x80\x80"; /* 0xF0 0x90 0x80 0x80 => U+10000 */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
> +  mbs += 1;
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
> +  mbs += 1;
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
> +  mbs += 1;
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) 1);              /* 1st byte written */
> +  mbs += 1;
> +  TEST_COMPARE (buf[0], 0xF0);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 2nd byte written */
> +  TEST_COMPARE (buf[0], 0x90);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 3rd byte written */
> +  TEST_COMPARE (buf[0], 0x80);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 4th byte written */
> +  TEST_COMPARE (buf[0], 0x80);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* Last character in the code point range that maps to four code units.  */
> +  mbs = "\xF4\x8F\xBF\xBF"; /* 0xF4 0x8F 0xBF 0xBF => U+10FFFF */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) 4);  /* 1st byte written */
> +  mbs += 4;
> +  TEST_COMPARE (buf[0], 0xF4);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 2nd byte written */
> +  TEST_COMPARE (buf[0], 0x8F);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 3rd byte written */
> +  TEST_COMPARE (buf[0], 0xBF);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 4th byte written */
> +  TEST_COMPARE (buf[0], 0xBF);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* Same as last test, but one code unit at a time.  */
> +  mbs = "\xF4\x8F\xBF\xBF"; /* 0xF4 0x8F 0xBF 0xBF => U+10FFFF */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
> +  mbs += 1;
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
> +  mbs += 1;
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
> +  mbs += 1;
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) 1);              /* 1st byte written */
> +  mbs += 1;
> +  TEST_COMPARE (buf[0], 0xF4);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 2nd byte written */
> +  TEST_COMPARE (buf[0], 0x8F);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 3rd byte written */
> +  TEST_COMPARE (buf[0], 0xBF);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 4th byte written */
> +  TEST_COMPARE (buf[0], 0xBF);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  return 0;
> +}
> +
> +static int
> +test_big5_hkscs (void)
> +{
> +  const char *mbs;
> +  char8_t buf[1];
> +  mbstate_t s;
> +
> +  TEST_VERIFY_EXIT (setlocale (LC_ALL, "zh_HK.BIG5-HKSCS") != NULL);
> +
> +  /* A double byte character that maps to a pair of two byte UTF-8 code unit
> +     sequences.  */
> +  mbs = "\x88\x62"; /* 0x88 0x62 => U+00CA U+0304 */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) 2);  /* 1st byte written */
> +  mbs += 2;
> +  TEST_COMPARE (buf[0], 0xC3);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 2nd byte written */
> +  TEST_COMPARE (buf[0], 0x8A);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 3rd byte written */
> +  TEST_COMPARE (buf[0], 0xCC);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 4th byte written */
> +  TEST_COMPARE (buf[0], 0x84);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* Same as last test, but one code unit at a time.  */
> +  mbs = "\x88\x62"; /* 0x88 0x62 => U+00CA U+0304 */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
> +  mbs += 1;
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) 1);              /* 1st byte written */
> +  mbs += 1;
> +  TEST_COMPARE (buf[0], 0xC3);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 2nd byte written */
> +  TEST_COMPARE (buf[0], 0x8A);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 3rd byte written */
> +  TEST_COMPARE (buf[0], 0xCC);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 4th byte written */
> +  TEST_COMPARE (buf[0], 0x84);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* Another double byte character that maps to a pair of two byte UTF-8 code
> +     unit sequences.  */
> +  mbs = "\x88\xA5"; /* 0x88 0xA5 => U+00EA U+030C */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) 2);  /* 1st byte written */
> +  mbs += 2;
> +  TEST_COMPARE (buf[0], 0xC3);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 2nd byte written */
> +  TEST_COMPARE (buf[0], 0xAA);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 3rd byte written */
> +  TEST_COMPARE (buf[0], 0xCC);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 4th byte written */
> +  TEST_COMPARE (buf[0], 0x8C);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  /* Same as last test, but one code unit at a time.  */
> +  mbs = "\x88\xA5"; /* 0x88 0xA5 => U+00EA U+030C */
> +  memset (buf, 0, sizeof (buf));
> +  memset (&s, 0, sizeof (s));
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
> +  mbs += 1;
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) 1);              /* 1st byte written */
> +  mbs += 1;
> +  TEST_COMPARE (buf[0], 0xC3);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 2nd byte written */
> +  TEST_COMPARE (buf[0], 0xAA);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 3rd byte written */
> +  TEST_COMPARE (buf[0], 0xCC);
> +  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 4th byte written */
> +  TEST_COMPARE (buf[0], 0x8C);
> +  TEST_VERIFY (mbsinit (&s));
> +
> +  return 0;
> +}
> +
> +static int
> +do_test (void)
> +{
> +  test_utf8 ();
> +  test_big5_hkscs ();
> +  return 0;
> +}
> +
> +#include <support/test-driver.c>

Patch

commit 1fb259ae6b5da6865140c203a484b0735fc152d0
Author: Tom Honermann <tom@honermann.net>
Date:   Sun Feb 27 10:39:00 2022 -0500

    Tests for mbrtoc8(), c8rtomb(), and the char8_t typedef.
    
    This change adds tests for the mbrtoc8 and c8rtomb functions adopted for
    C++20 via WG21 P0482R6 and for C2X via WG14 N2653, and for the char8_t
    typedef adopted for C2X from WG14 N2653.
    
    The tests for mbrtoc8 and c8rtomb specifically exercise conversion to
    and from Big5-HKSCS because of special cases that arise with that encoding.
    Big5-HKSCS defines some double byte sequences that convert to more than
    one Unicode code point.  In order to test this, the locale dependencies
    for running tests under wcsmbs are expanded to include zh_HK.BIG5-HKSCS.

diff --git a/wcsmbs/Makefile b/wcsmbs/Makefile
index bda281ad70..9b8445fa48 100644
--- a/wcsmbs/Makefile
+++ b/wcsmbs/Makefile
@@ -52,6 +52,7 @@  tests := tst-wcstof wcsmbs-tst1 tst-wcsnlen tst-btowc tst-mbrtowc \
 	 tst-c16c32-1 wcsatcliff tst-wcstol-locale tst-wcstod-nan-locale \
 	 tst-wcstod-round test-char-types tst-fgetwc-after-eof \
 	 tst-wcstod-nan-sign tst-c16-surrogate tst-c32-state \
+	 test-char8-type test-mbrtoc8 test-c8rtomb \
 	 $(addprefix test-,$(strop-tests)) tst-mbstowcs \
 	 tst-wprintf-binary
 
@@ -59,7 +60,7 @@  include ../Rules
 
 ifeq ($(run-built-tests),yes)
 LOCALES := de_DE.ISO-8859-1 de_DE.UTF-8 en_US.ANSI_X3.4-1968 hr_HR.ISO-8859-2 \
-	   ja_JP.EUC-JP zh_TW.EUC-TW tr_TR.UTF-8 tr_TR.ISO-8859-9
+	   ja_JP.EUC-JP zh_TW.EUC-TW tr_TR.UTF-8 tr_TR.ISO-8859-9 zh_HK.BIG5-HKSCS
 include ../gen-locales.mk
 
 $(objpfx)tst-btowc.out: $(gen-locales)
diff --git a/wcsmbs/test-c8rtomb.c b/wcsmbs/test-c8rtomb.c
new file mode 100644
index 0000000000..14564fa00a
--- /dev/null
+++ b/wcsmbs/test-c8rtomb.c
@@ -0,0 +1,519 @@ 
+/* Test c8rtomb.
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <errno.h>
+#include <limits.h>
+#include <locale.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <uchar.h>
+#include <wchar.h>
+#include <support/check.h>
+
+static int
+test_truncated_code_unit_sequence (void)
+{
+  const char8_t *u8s;
+  char buf[MB_LEN_MAX];
+  mbstate_t s;
+
+  /* Missing trailing code unit for a two byte code unit sequence.  */
+  u8s = (const char8_t*) u8"\xC2";
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t)  0); /* 1st byte processed */
+  errno = 0;
+  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) -1); /* No trailing code unit */
+  TEST_COMPARE (errno, EILSEQ);
+
+  /* Missing first trailing code unit for a three byte code unit sequence.  */
+  u8s = (const char8_t*) u8"\xE0";
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t)  0); /* 1st byte processed */
+  errno = 0;
+  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) -1); /* No trailing code unit */
+  TEST_COMPARE (errno, EILSEQ);
+
+  /* Missing second trailing code unit for a three byte code unit sequence.  */
+  u8s = (const char8_t*) u8"\xE0\xA0";
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t)  0); /* 1st byte processed */
+  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t)  0); /* 2nd byte processed */
+  errno = 0;
+  TEST_COMPARE (c8rtomb (buf, u8s[2], &s), (size_t) -1); /* No trailing code unit */
+  TEST_COMPARE (errno, EILSEQ);
+
+  /* Missing first trailing code unit for a four byte code unit sequence.  */
+  u8s = (const char8_t*) u8"\xF0";
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t)  0); /* 1st byte processed */
+  errno = 0;
+  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) -1); /* No trailing code unit */
+  TEST_COMPARE (errno, EILSEQ);
+
+  /* Missing second trailing code unit for a four byte code unit sequence.  */
+  u8s = (const char8_t*) u8"\xF0\x90";
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t)  0); /* 1st byte processed */
+  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t)  0); /* 2nd byte processed */
+  errno = 0;
+  TEST_COMPARE (c8rtomb (buf, u8s[2], &s), (size_t) -1); /* No trailing code unit */
+  TEST_COMPARE (errno, EILSEQ);
+
+  /* Missing third trailing code unit for a four byte code unit sequence.  */
+  u8s = (const char8_t*) u8"\xF0\x90\x80";
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t)  0); /* 1st byte processed */
+  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t)  0); /* 2nd byte processed */
+  TEST_COMPARE (c8rtomb (buf, u8s[2], &s), (size_t)  0); /* 3rd byte processed */
+  errno = 0;
+  TEST_COMPARE (c8rtomb (buf, u8s[3], &s), (size_t) -1); /* No trailing code unit */
+  TEST_COMPARE (errno, EILSEQ);
+
+  return 0;
+}
+
+static int
+test_invalid_trailing_code_unit_sequence (void)
+{
+  const char8_t *u8s;
+  char buf[MB_LEN_MAX];
+  mbstate_t s;
+
+  /* Invalid trailing code unit for a two byte code unit sequence.  */
+  u8s = (const char8_t*) u8"\xC2\xC0";
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t)  0); /* 1st byte processed */
+  errno = 0;
+  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) -1); /* Invalid trailing code unit */
+  TEST_COMPARE (errno, EILSEQ);
+
+  /* Invalid first trailing code unit for a three byte code unit sequence.  */
+  u8s = (const char8_t*) u8"\xE0\xC0";
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t)  0); /* 1st byte processed */
+  errno = 0;
+  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) -1); /* Invalid trailing code unit */
+  TEST_COMPARE (errno, EILSEQ);
+
+  /* Invalid second trailing code unit for a three byte code unit sequence.  */
+  u8s = (const char8_t*) u8"\xE0\xA0\xC0";
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t)  0); /* 1st byte processed */
+  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t)  0); /* 2nd byte processed */
+  errno = 0;
+  TEST_COMPARE (c8rtomb (buf, u8s[2], &s), (size_t) -1); /* Invalid trailing code unit */
+  TEST_COMPARE (errno, EILSEQ);
+
+  /* Invalid first trailing code unit for a four byte code unit sequence.  */
+  u8s = (const char8_t*) u8"\xF0\xC0";
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t)  0); /* 1st byte processed */
+  errno = 0;
+  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) -1); /* Invalid trailing code unit */
+  TEST_COMPARE (errno, EILSEQ);
+
+  /* Invalid second trailing code unit for a four byte code unit sequence.  */
+  u8s = (const char8_t*) u8"\xF0\x90\xC0";
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t)  0); /* 1st byte processed */
+  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t)  0); /* 2nd byte processed */
+  errno = 0;
+  TEST_COMPARE (c8rtomb (buf, u8s[2], &s), (size_t) -1); /* Invalid trailing code unit */
+  TEST_COMPARE (errno, EILSEQ);
+
+  /* Invalid third trailing code unit for a four byte code unit sequence.  */
+  u8s = (const char8_t*) u8"\xF0\x90\x80\xC0";
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t)  0); /* 1st byte processed */
+  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t)  0); /* 2nd byte processed */
+  TEST_COMPARE (c8rtomb (buf, u8s[2], &s), (size_t)  0); /* 3rd byte processed */
+  errno = 0;
+  TEST_COMPARE (c8rtomb (buf, u8s[3], &s), (size_t) -1); /* Invalid trailing code unit */
+  TEST_COMPARE (errno, EILSEQ);
+
+  return 0;
+}
+
+static int
+test_lone_trailing_code_units (void)
+{
+  const char8_t *u8s;
+  char buf[MB_LEN_MAX];
+  mbstate_t s;
+
+  /* Lone trailing code unit.  */
+  u8s = (const char8_t*) u8"\x80";
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  errno = 0;
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) -1); /* Lone trailing code unit */
+  TEST_COMPARE (errno, EILSEQ);
+
+  return 0;
+}
+
+static int
+test_overlong_encoding (void)
+{
+  const char8_t *u8s;
+  char buf[MB_LEN_MAX];
+  mbstate_t s;
+
+  /* Two byte overlong encoding.  */
+  u8s = (const char8_t*) u8"\xC0\x80";
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  errno = 0;
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) -1); /* Invalid lead code unit */
+  TEST_COMPARE (errno, EILSEQ);
+
+  /* Two byte overlong encoding.  */
+  u8s = (const char8_t*) u8"\xC1\x80";
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  errno = 0;
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) -1); /* Invalid lead code unit */
+  TEST_COMPARE (errno, EILSEQ);
+
+  /* Three byte overlong encoding.  */
+  u8s = (const char8_t*) u8"\xE0\x9F\xBF";
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t)  0); /* First byte processed */
+  errno = 0;
+  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) -1); /* Invalid trailing code unit */
+  TEST_COMPARE (errno, EILSEQ);
+
+  /* Four byte overlong encoding.  */
+  u8s = (const char8_t*) u8"\xF0\x8F\xBF\xBF";
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t)  0); /* First byte processed */
+  errno = 0;
+  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) -1); /* Invalid trailing code unit */
+  TEST_COMPARE (errno, EILSEQ);
+
+  return 0;
+}
+
+static int
+test_surrogate_range (void)
+{
+  const char8_t *u8s;
+  char buf[MB_LEN_MAX];
+  mbstate_t s;
+
+  /* Would encode U+D800.  */
+  u8s = (const char8_t*) u8"\xED\xA0\x80";
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t)  0); /* First byte processed */
+  errno = 0;
+  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) -1); /* Invalid trailing code unit */
+  TEST_COMPARE (errno, EILSEQ);
+
+  /* Would encode U+DFFF.  */
+  u8s = (const char8_t*) u8"\xED\xBF\xBF";
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t)  0); /* First byte processed */
+  errno = 0;
+  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) -1); /* Invalid trailing code unit */
+  TEST_COMPARE (errno, EILSEQ);
+
+  return 0;
+}
+
+static int
+test_out_of_range_encoding (void)
+{
+  const char8_t *u8s;
+  char buf[MB_LEN_MAX];
+  mbstate_t s;
+
+  /* Would encode U+00110000.  */
+  u8s = (const char8_t*) u8"\xF4\x90\x80\x80";
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t)  0); /* First byte processed */
+  errno = 0;
+  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) -1); /* Invalid trailing code unit */
+  TEST_COMPARE (errno, EILSEQ);
+
+  /* Would encode U+00150000.  */
+  u8s = (const char8_t*) u8"\xF5\x90\x80\x80";
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  errno = 0;
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) -1); /* Invalid lead code unit */
+  TEST_COMPARE (errno, EILSEQ);
+
+  return 0;
+}
+
+static int
+test_null_output_buffer (void)
+{
+  char buf[MB_LEN_MAX];
+  mbstate_t s;
+
+  /* Null buffer with an initial state.  */
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (c8rtomb (NULL, u8"X"[0], &s), (size_t) 1); /* null byte processed */
+  TEST_VERIFY (mbsinit (&s));    /* Assert the state is now an initial state.  */
+
+  /* Null buffer with a state corresponding to an incompletely read code
+     unit sequence.  In this case, an error occurs since insufficient
+     information is available to complete the already started code unit
+     sequence and return to the initial state.  */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (c8rtomb (buf, u8"\xC2"[0], &s), (size_t)  0);  /* 1st byte processed */
+  errno = 0;
+  TEST_COMPARE (c8rtomb (NULL, u8"\x80"[0], &s), (size_t) -1); /* No trailing code unit */
+  TEST_COMPARE (errno, EILSEQ);
+
+  return 0;
+}
+
+static int
+test_utf8 (void)
+{
+  const char8_t *u8s;
+  char buf[MB_LEN_MAX];
+  mbstate_t s;
+
+  TEST_VERIFY_EXIT (setlocale (LC_ALL, "de_DE.UTF-8") != NULL);
+
+  /* Null character.  */
+  u8s = (const char8_t*) u8"\x00"; /* U+0000 => 0x00 */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) 1); /* 1st byte processed */
+  TEST_COMPARE (buf[0], (char) 0x00);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* First non-null character in the code point range that maps to a single
+     code unit.  */
+  u8s = (const char8_t*) u8"\x01"; /* U+0001 => 0x01 */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) 1); /* 1st byte processed */
+  TEST_COMPARE (buf[0], (char) 0x01);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* Last character in the code point range that maps to a single code unit.  */
+  u8s = (const char8_t*) u8"\x7F"; /* U+007F => 0x7F */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) 1); /* 1st byte processed */
+  TEST_COMPARE (buf[0], (char) 0x7F);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* First character in the code point range that maps to two code units.  */
+  u8s = (const char8_t*) u8"\xC2\x80"; /* U+0080 => 0xC2 0x80 */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) 0); /* 1st byte processed */
+  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) 2); /* 2nd byte processed */
+  TEST_COMPARE (buf[0], (char) 0xC2);
+  TEST_COMPARE (buf[1], (char) 0x80);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* Last character in the code point range that maps to two code units.  */
+  u8s = (const char8_t*) u8"\u07FF"; /* U+07FF => 0xDF 0xBF */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) 0); /* 1st byte processed */
+  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) 2); /* 2nd byte processed */
+  TEST_COMPARE (buf[0], (char) 0xDF);
+  TEST_COMPARE (buf[1], (char) 0xBF);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* First character in the code point range that maps to three code units.  */
+  u8s = (const char8_t*) u8"\u0800"; /* U+0800 => 0xE0 0xA0 0x80 */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) 0); /* 1st byte processed */
+  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) 0); /* 2nd byte processed */
+  TEST_COMPARE (c8rtomb (buf, u8s[2], &s), (size_t) 3); /* 3rd byte processed */
+  TEST_COMPARE (buf[0], (char) 0xE0);
+  TEST_COMPARE (buf[1], (char) 0xA0);
+  TEST_COMPARE (buf[2], (char) 0x80);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* Last character in the code point range that maps to three code units
+     before the surrogate code point range.  */
+  u8s = (const char8_t*) u8"\uD7FF"; /* U+D7FF => 0xED 0x9F 0xBF */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) 0); /* 1st byte processed */
+  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) 0); /* 2nd byte processed */
+  TEST_COMPARE (c8rtomb (buf, u8s[2], &s), (size_t) 3); /* 3rd byte processed */
+  TEST_COMPARE (buf[0], (char) 0xED);
+  TEST_COMPARE (buf[1], (char) 0x9F);
+  TEST_COMPARE (buf[2], (char) 0xBF);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* First character in the code point range that maps to three code units
+     after the surrogate code point range.  */
+  u8s = (const char8_t*) u8"\uE000"; /* U+E000 => 0xEE 0x80 0x80 */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) 0); /* 1st byte processed */
+  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) 0); /* 2nd byte processed */
+  TEST_COMPARE (c8rtomb (buf, u8s[2], &s), (size_t) 3); /* 3rd byte processed */
+  TEST_COMPARE (buf[0], (char) 0xEE);
+  TEST_COMPARE (buf[1], (char) 0x80);
+  TEST_COMPARE (buf[2], (char) 0x80);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* Not a BOM.  */
+  u8s = (const char8_t*) u8"\uFEFF"; /* U+FEFF => 0xEF 0xBB 0xBF */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) 0); /* 1st byte processed */
+  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) 0); /* 2nd byte processed */
+  TEST_COMPARE (c8rtomb (buf, u8s[2], &s), (size_t) 3); /* 3rd byte processed */
+  TEST_COMPARE (buf[0], (char) 0xEF);
+  TEST_COMPARE (buf[1], (char) 0xBB);
+  TEST_COMPARE (buf[2], (char) 0xBF);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* Replacement character.  */
+  u8s = (const char8_t*) u8"\uFFFD"; /* U+FFFD => 0xEF 0xBF 0xBD */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) 0); /* 1st byte processed */
+  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) 0); /* 2nd byte processed */
+  TEST_COMPARE (c8rtomb (buf, u8s[2], &s), (size_t) 3); /* 3rd byte processed */
+  TEST_COMPARE (buf[0], (char) 0xEF);
+  TEST_COMPARE (buf[1], (char) 0xBF);
+  TEST_COMPARE (buf[2], (char) 0xBD);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* Last character in the code point range that maps to three code units.  */
+  u8s = (const char8_t*) u8"\uFFFF"; /* U+FFFF => 0xEF 0xBF 0xBF */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) 0); /* 1st byte processed */
+  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) 0); /* 2nd byte processed */
+  TEST_COMPARE (c8rtomb (buf, u8s[2], &s), (size_t) 3); /* 3rd byte processed */
+  TEST_COMPARE (buf[0], (char) 0xEF);
+  TEST_COMPARE (buf[1], (char) 0xBF);
+  TEST_COMPARE (buf[2], (char) 0xBF);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* First character in the code point range that maps to four code units.  */
+  u8s = (const char8_t*) u8"\U00010000"; /* U+10000 => 0xF0 0x90 0x80 0x80 */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) 0); /* 1st byte processed */
+  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) 0); /* 2nd byte processed */
+  TEST_COMPARE (c8rtomb (buf, u8s[2], &s), (size_t) 0); /* 3rd byte processed */
+  TEST_COMPARE (c8rtomb (buf, u8s[3], &s), (size_t) 4); /* 4th byte processed */
+  TEST_COMPARE (buf[0], (char) 0xF0);
+  TEST_COMPARE (buf[1], (char) 0x90);
+  TEST_COMPARE (buf[2], (char) 0x80);
+  TEST_COMPARE (buf[3], (char) 0x80);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* Last character in the code point range that maps to four code units.  */
+  u8s = (const char8_t*) u8"\U0010FFFF"; /* U+10FFFF => 0xF4 0x8F 0xBF 0xBF */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) 0); /* 1st byte processed */
+  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) 0); /* 2nd byte processed */
+  TEST_COMPARE (c8rtomb (buf, u8s[2], &s), (size_t) 0); /* 3rd byte processed */
+  TEST_COMPARE (c8rtomb (buf, u8s[3], &s), (size_t) 4); /* 4th byte processed */
+  TEST_COMPARE (buf[0], (char) 0xF4);
+  TEST_COMPARE (buf[1], (char) 0x8F);
+  TEST_COMPARE (buf[2], (char) 0xBF);
+  TEST_COMPARE (buf[3], (char) 0xBF);
+  TEST_VERIFY (mbsinit (&s));
+
+  return 0;
+}
+
+static int
+test_big5_hkscs (void)
+{
+  const char8_t *u8s;
+  char buf[MB_LEN_MAX];
+  mbstate_t s;
+
+  TEST_VERIFY_EXIT (setlocale (LC_ALL, "zh_HK.BIG5-HKSCS") != NULL);
+
+  /* A pair of two byte UTF-8 code unit sequences that map a Unicode code
+     point and combining character to a single double byte character.  */
+  u8s = (const char8_t*) u8"\u00CA\u0304"; /* U+00CA U+0304 => 0x88 0x62 */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) 0); /* 1st byte processed */
+  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) 0); /* 2nd byte processed */
+  TEST_COMPARE (c8rtomb (buf, u8s[2], &s), (size_t) 0); /* 3rd byte processed */
+  TEST_COMPARE (c8rtomb (buf, u8s[3], &s), (size_t) 2); /* 4th byte processed */
+  TEST_COMPARE (buf[0], (char) 0x88);
+  TEST_COMPARE (buf[1], (char) 0x62);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* Another pair of two byte UTF-8 code unit sequences that map a Unicode code
+     point and combining character to a single double byte character.  */
+  u8s = (const char8_t*) u8"\u00EA\u030C"; /* U+00EA U+030C => 0x88 0xA5 */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (c8rtomb (buf, u8s[0], &s), (size_t) 0); /* 1st byte processed */
+  TEST_COMPARE (c8rtomb (buf, u8s[1], &s), (size_t) 0); /* 2nd byte processed */
+  TEST_COMPARE (c8rtomb (buf, u8s[2], &s), (size_t) 0); /* 3rd byte processed */
+  TEST_COMPARE (c8rtomb (buf, u8s[3], &s), (size_t) 2); /* 4th byte processed */
+  TEST_COMPARE (buf[0], (char) 0x88);
+  TEST_COMPARE (buf[1], (char) 0xA5);
+  TEST_VERIFY (mbsinit (&s));
+
+  return 0;
+}
+
+static int
+do_test (void)
+{
+  test_truncated_code_unit_sequence ();
+  test_invalid_trailing_code_unit_sequence ();
+  test_lone_trailing_code_units ();
+  test_overlong_encoding ();
+  test_surrogate_range ();
+  test_out_of_range_encoding ();
+  test_null_output_buffer ();
+  test_utf8 ();
+  test_big5_hkscs ();
+  return 0;
+}
+
+#include <support/test-driver.c>
diff --git a/wcsmbs/test-char8-type.c b/wcsmbs/test-char8-type.c
new file mode 100644
index 0000000000..642c7044ed
--- /dev/null
+++ b/wcsmbs/test-char8-type.c
@@ -0,0 +1,31 @@ 
+/* Test the char8_t typedef.
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <uchar.h>
+
+/* Verify that the char8_t type is recognized.  */
+char8_t c8;
+
+static int
+do_test (void)
+{
+  /* This is a compilation test.  */
+  return 0;
+}
+
+#include <support/test-driver.c>
diff --git a/wcsmbs/test-mbrtoc8.c b/wcsmbs/test-mbrtoc8.c
new file mode 100644
index 0000000000..c5061635ac
--- /dev/null
+++ b/wcsmbs/test-mbrtoc8.c
@@ -0,0 +1,462 @@ 
+/* Test mbrtoc8.
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <locale.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <uchar.h>
+#include <wchar.h>
+#include <support/check.h>
+
+static int
+test_utf8 (void)
+{
+  const char *mbs;
+  char8_t buf[1];
+  mbstate_t s;
+
+  TEST_VERIFY_EXIT (setlocale (LC_ALL, "de_DE.UTF-8") != NULL);
+
+  /* No inputs.  */
+  mbs = "";
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 0, &s), (size_t) -2); /* no input */
+  TEST_VERIFY (mbsinit (&s));
+
+  /* Null character.  */
+  mbs = "\x00"; /* 0x00 => U+0000 */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) 0); /* null byte written */
+  mbs += 1;
+  TEST_COMPARE (buf[0], 0x00);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* First non-null character in the code point range that maps to a single
+     code unit.  */
+  mbs = "\x01"; /* 0x01 => U+0001 */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) 1); /* 1st byte processed */
+  mbs += 1;
+  TEST_COMPARE (buf[0], 0x01);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* Last character in the code point range that maps to a single code unit.  */
+  mbs = "\x7F"; /* 0x7F => U+007F */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) 1);  /* 1st byte processed */
+  mbs += 1;
+  TEST_COMPARE (buf[0], 0x7F);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* First character in the code point range that maps to two code units.  */
+  mbs = "\xC2\x80"; /* 0xC2 0x80 => U+0080 */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) 2);  /* 1st byte written */
+  mbs += 2;
+  TEST_COMPARE (buf[0], 0xC2);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 2nd byte written */
+  TEST_COMPARE (buf[0], 0x80);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* Same as last test, but one code unit at a time.  */
+  mbs = "\xC2\x80"; /* 0xC2 0x80 => U+0080 */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
+  mbs += 1;
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) 1);              /* 1st byte written */
+  mbs += 1;
+  TEST_COMPARE (buf[0], 0xC2);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 2nd byte written */
+  TEST_COMPARE (buf[0], 0x80);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* Last character in the code point range that maps to two code units.  */
+  mbs = "\xDF\xBF"; /* 0xDF 0xBF => U+07FF */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) 2);  /* 1st byte written */
+  mbs += 2;
+  TEST_COMPARE (buf[0], 0xDF);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 2nd byte written */
+  TEST_COMPARE (buf[0], 0xBF);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* Same as last test, but one code unit at a time.  */
+  mbs = "\xDF\xBF"; /* 0xDF 0xBF => U+07FF */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
+  mbs += 1;
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) 1);              /* 1st byte written */
+  mbs += 1;
+  TEST_COMPARE (buf[0], 0xDF);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 2nd byte written */
+  TEST_COMPARE (buf[0], 0xBF);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* First character in the code point range that maps to three code units.  */
+  mbs = "\xE0\xA0\x80"; /* 0xE0 0xA0 0x80 => U+0800 */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) 3);  /* 1st byte written */
+  mbs += 3;
+  TEST_COMPARE (buf[0], 0xE0);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 2nd byte written */
+  TEST_COMPARE (buf[0], 0xA0);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 3rd byte written */
+  TEST_COMPARE (buf[0], 0x80);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* Same as last test, but one code unit at a time.  */
+  mbs = "\xE0\xA0\x80"; /* 0xE0 0xA0 0x80 => U+0800 */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
+  mbs += 1;
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
+  mbs += 1;
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) 1);              /* 1st byte written */
+  mbs += 1;
+  TEST_COMPARE (buf[0], 0xE0);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 2nd byte written */
+  TEST_COMPARE (buf[0], 0xA0);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 3rd byte written */
+  TEST_COMPARE (buf[0], 0x80);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* Last character in the code point range that maps to three code units
+     before the surrogate code point range.  */
+  mbs = "\xED\x9F\xBF"; /* 0xED 0x9F 0xBF => U+D7FF */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) 3);  /* 1st byte written */
+  mbs += 3;
+  TEST_COMPARE (buf[0], 0xED);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 2nd byte written */
+  TEST_COMPARE (buf[0], 0x9F);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 3rd byte written */
+  TEST_COMPARE (buf[0], 0xBF);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* Same as last test, but one code unit at a time.  */
+  mbs = "\xED\x9F\xBF"; /* 0xED 0x9F 0xBF => U+D7FF */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
+  mbs += 1;
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
+  mbs += 1;
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) 1);              /* 1st byte written */
+  mbs += 1;
+  TEST_COMPARE (buf[0], 0xED);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 2nd byte written */
+  TEST_COMPARE (buf[0], 0x9F);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 3rd byte written */
+  TEST_COMPARE (buf[0], 0xBF);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* First character in the code point range that maps to three code units
+     after the surrogate code point range.  */
+  mbs = "\xEE\x80\x80"; /* 0xEE 0x80 0x80 => U+E000 */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) 3);  /* 1st byte written */
+  mbs += 3;
+  TEST_COMPARE (buf[0], 0xEE);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 2nd byte written */
+  TEST_COMPARE (buf[0], 0x80);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 3rd byte written */
+  TEST_COMPARE (buf[0], 0x80);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* Same as last test, but one code unit at a time.  */
+  mbs = "\xEE\x80\x80"; /* 0xEE 0x80 0x80 => U+E000 */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
+  mbs += 1;
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
+  mbs += 1;
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) 1);              /* 1st byte written */
+  mbs += 1;
+  TEST_COMPARE (buf[0], 0xEE);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 2nd byte written */
+  TEST_COMPARE (buf[0], 0x80);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 3rd byte written */
+  TEST_COMPARE (buf[0], 0x80);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* Not a BOM.  */
+  mbs = "\xEF\xBB\xBF"; /* 0xEF 0xBB 0xBF => U+FEFF */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) 3);  /* 1st byte written */
+  mbs += 3;
+  TEST_COMPARE (buf[0], 0xEF);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 2nd byte written */
+  TEST_COMPARE (buf[0], 0xBB);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 3rd byte written */
+  TEST_COMPARE (buf[0], 0xBF);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* Same as last test, but one code unit at a time.  */
+  mbs = "\xEF\xBB\xBF"; /* 0xEF 0xBB 0xBF => U+FEFF */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
+  mbs += 1;
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
+  mbs += 1;
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) 1);              /* 1st byte written */
+  mbs += 1;
+  TEST_COMPARE (buf[0], 0xEF);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 2nd byte written */
+  TEST_COMPARE (buf[0], 0xBB);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 3rd byte written */
+  TEST_COMPARE (buf[0], 0xBF);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* Replacement character.  */
+  mbs = "\xEF\xBF\xBD"; /* 0xEF 0xBF 0xBD => U+FFFD */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) 3);  /* 1st byte written */
+  mbs += 3;
+  TEST_COMPARE (buf[0], 0xEF);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 2nd byte written */
+  TEST_COMPARE (buf[0], 0xBF);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 3rd byte written */
+  TEST_COMPARE (buf[0], 0xBD);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* Same as last test, but one code unit at a time.  */
+  mbs = "\xEF\xBF\xBD"; /* 0xEF 0xBF 0xBD => U+FFFD */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
+  mbs += 1;
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
+  mbs += 1;
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) 1);              /* 1st byte written */
+  mbs += 1;
+  TEST_COMPARE (buf[0], 0xEF);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 2nd byte written */
+  TEST_COMPARE (buf[0], 0xBF);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 3rd byte written */
+  TEST_COMPARE (buf[0], 0xBD);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* Last character in the code point range that maps to three code units.  */
+  mbs = "\xEF\xBF\xBF"; /* 0xEF 0xBF 0xBF => U+FFFF */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) 3);  /* 1st byte written */
+  mbs += 3;
+  TEST_COMPARE (buf[0], 0xEF);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 2nd byte written */
+  TEST_COMPARE (buf[0], 0xBF);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 3rd byte written */
+  TEST_COMPARE (buf[0], 0xBF);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* Same as last test, but one code unit at a time.  */
+  mbs = "\xEF\xBF\xBF"; /* 0xEF 0xBF 0xBF => U+FFFF */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
+  mbs += 1;
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
+  mbs += 1;
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) 1);              /* 1st byte written */
+  mbs += 1;
+  TEST_COMPARE (buf[0], 0xEF);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 2nd byte written */
+  TEST_COMPARE (buf[0], 0xBF);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 3rd byte written */
+  TEST_COMPARE (buf[0], 0xBF);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* First character in the code point range that maps to four code units.  */
+  mbs = "\xF0\x90\x80\x80"; /* 0xF0 0x90 0x80 0x80 => U+10000 */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) 4);  /* 1st byte written */
+  mbs += 4;
+  TEST_COMPARE (buf[0], 0xF0);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 2nd byte written */
+  TEST_COMPARE (buf[0], 0x90);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 3rd byte written */
+  TEST_COMPARE (buf[0], 0x80);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 4th byte written */
+  TEST_COMPARE (buf[0], 0x80);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* Same as last test, but one code unit at a time.  */
+  mbs = "\xF0\x90\x80\x80"; /* 0xF0 0x90 0x80 0x80 => U+10000 */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
+  mbs += 1;
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
+  mbs += 1;
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
+  mbs += 1;
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) 1);              /* 1st byte written */
+  mbs += 1;
+  TEST_COMPARE (buf[0], 0xF0);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 2nd byte written */
+  TEST_COMPARE (buf[0], 0x90);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 3rd byte written */
+  TEST_COMPARE (buf[0], 0x80);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 4th byte written */
+  TEST_COMPARE (buf[0], 0x80);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* Last character in the code point range that maps to four code units.  */
+  mbs = "\xF4\x8F\xBF\xBF"; /* 0xF4 0x8F 0xBF 0xBF => U+10FFFF */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) 4);  /* 1st byte written */
+  mbs += 4;
+  TEST_COMPARE (buf[0], 0xF4);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 2nd byte written */
+  TEST_COMPARE (buf[0], 0x8F);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 3rd byte written */
+  TEST_COMPARE (buf[0], 0xBF);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 4th byte written */
+  TEST_COMPARE (buf[0], 0xBF);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* Same as last test, but one code unit at a time.  */
+  mbs = "\xF4\x8F\xBF\xBF"; /* 0xF4 0x8F 0xBF 0xBF => U+10FFFF */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
+  mbs += 1;
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
+  mbs += 1;
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
+  mbs += 1;
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) 1);              /* 1st byte written */
+  mbs += 1;
+  TEST_COMPARE (buf[0], 0xF4);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 2nd byte written */
+  TEST_COMPARE (buf[0], 0x8F);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 3rd byte written */
+  TEST_COMPARE (buf[0], 0xBF);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 4th byte written */
+  TEST_COMPARE (buf[0], 0xBF);
+  TEST_VERIFY (mbsinit (&s));
+
+  return 0;
+}
+
+static int
+test_big5_hkscs (void)
+{
+  const char *mbs;
+  char8_t buf[1];
+  mbstate_t s;
+
+  TEST_VERIFY_EXIT (setlocale (LC_ALL, "zh_HK.BIG5-HKSCS") != NULL);
+
+  /* A double byte character that maps to a pair of two byte UTF-8 code unit
+     sequences.  */
+  mbs = "\x88\x62"; /* 0x88 0x62 => U+00CA U+0304 */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) 2);  /* 1st byte written */
+  mbs += 2;
+  TEST_COMPARE (buf[0], 0xC3);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 2nd byte written */
+  TEST_COMPARE (buf[0], 0x8A);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 3rd byte written */
+  TEST_COMPARE (buf[0], 0xCC);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 4th byte written */
+  TEST_COMPARE (buf[0], 0x84);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* Same as last test, but one code unit at a time.  */
+  mbs = "\x88\x62"; /* 0x88 0x62 => U+00CA U+0304 */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
+  mbs += 1;
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) 1);              /* 1st byte written */
+  mbs += 1;
+  TEST_COMPARE (buf[0], 0xC3);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 2nd byte written */
+  TEST_COMPARE (buf[0], 0x8A);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 3rd byte written */
+  TEST_COMPARE (buf[0], 0xCC);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 4th byte written */
+  TEST_COMPARE (buf[0], 0x84);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* Another double byte character that maps to a pair of two byte UTF-8 code
+     unit sequences.  */
+  mbs = "\x88\xA5"; /* 0x88 0xA5 => U+00EA U+030C */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) 2);  /* 1st byte written */
+  mbs += 2;
+  TEST_COMPARE (buf[0], 0xC3);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 2nd byte written */
+  TEST_COMPARE (buf[0], 0xAA);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 3rd byte written */
+  TEST_COMPARE (buf[0], 0xCC);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, strlen(mbs)+1, &s), (size_t) -3); /* 4th byte written */
+  TEST_COMPARE (buf[0], 0x8C);
+  TEST_VERIFY (mbsinit (&s));
+
+  /* Same as last test, but one code unit at a time.  */
+  mbs = "\x88\xA5"; /* 0x88 0xA5 => U+00EA U+030C */
+  memset (buf, 0, sizeof (buf));
+  memset (&s, 0, sizeof (s));
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -2);             /* incomplete */
+  mbs += 1;
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) 1);              /* 1st byte written */
+  mbs += 1;
+  TEST_COMPARE (buf[0], 0xC3);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 2nd byte written */
+  TEST_COMPARE (buf[0], 0xAA);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 3rd byte written */
+  TEST_COMPARE (buf[0], 0xCC);
+  TEST_COMPARE (mbrtoc8 (buf, mbs, 1, &s), (size_t) -3);             /* 4th byte written */
+  TEST_COMPARE (buf[0], 0x8C);
+  TEST_VERIFY (mbsinit (&s));
+
+  return 0;
+}
+
+static int
+do_test (void)
+{
+  test_utf8 ();
+  test_big5_hkscs ();
+  return 0;
+}
+
+#include <support/test-driver.c>
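
Note on the boundary values: the UTF-8 cases in test-c8rtomb.c and
test-mbrtoc8.c sit at the edges of the 1, 2, 3 and 4 code unit ranges
(U+007F/U+0080, U+07FF/U+0800, U+D7FF/U+E000, U+FFFF/U+10000, U+10FFFF).
The sketch below shows where those byte sequences come from.  It is only an
illustration: utf8_encode and check_boundaries are hypothetical helpers, not
part of this patch or of glibc, and they do not go through mbrtoc8 or
c8rtomb at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of glibc): encode one Unicode scalar value
   as UTF-8.  Returns the number of code units stored in OUT (1 to 4), or 0
   if CP is a surrogate code point or lies above U+10FFFF.  */
static size_t
utf8_encode (uint32_t cp, unsigned char out[4])
{
  if (cp <= 0x7F)
    {
      /* U+0000..U+007F: one code unit.  */
      out[0] = (unsigned char) cp;
      return 1;
    }
  if (cp <= 0x7FF)
    {
      /* U+0080..U+07FF: two code units.  */
      out[0] = (unsigned char) (0xC0 | (cp >> 6));
      out[1] = (unsigned char) (0x80 | (cp & 0x3F));
      return 2;
    }
  if (cp >= 0xD800 && cp <= 0xDFFF)
    return 0;  /* Surrogates have no UTF-8 encoding.  */
  if (cp <= 0xFFFF)
    {
      /* U+0800..U+FFFF (minus surrogates): three code units.  */
      out[0] = (unsigned char) (0xE0 | (cp >> 12));
      out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
      out[2] = (unsigned char) (0x80 | (cp & 0x3F));
      return 3;
    }
  if (cp <= 0x10FFFF)
    {
      /* U+10000..U+10FFFF: four code units.  */
      out[0] = (unsigned char) (0xF0 | (cp >> 18));
      out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
      out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
      out[3] = (unsigned char) (0x80 | (cp & 0x3F));
      return 4;
    }
  return 0;
}

/* Spot-check a few of the boundary values exercised by the tests.  */
static void
check_boundaries (void)
{
  unsigned char b[4];

  assert (utf8_encode (0x7F, b) == 1 && b[0] == 0x7F);
  assert (utf8_encode (0x80, b) == 2 && b[0] == 0xC2 && b[1] == 0x80);
  assert (utf8_encode (0x7FF, b) == 2 && b[0] == 0xDF && b[1] == 0xBF);
  assert (utf8_encode (0x800, b) == 3
          && b[0] == 0xE0 && b[1] == 0xA0 && b[2] == 0x80);
  assert (utf8_encode (0xD7FF, b) == 3
          && b[0] == 0xED && b[1] == 0x9F && b[2] == 0xBF);
  assert (utf8_encode (0x10FFFF, b) == 4
          && b[0] == 0xF4 && b[1] == 0x8F && b[2] == 0xBF && b[3] == 0xBF);
  assert (utf8_encode (0xD800, b) == 0);
}
```

The expected bytes in the TEST_COMPARE lines for the UTF-8 locale match what
this encoder produces for the same code points.  The Big5-HKSCS cases are
different in kind: there a single double byte input yields two code points,
so the four code units come from two separate encodings (for example U+00CA
then U+0304).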