POSIX locale covers every byte [BZ# 29511]

Message ID	20220830181932.oggrz6f6itrpyi6g@tarta.nabijaczleweli.xyz
State	Superseded
Headers	DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 03551385C41B
	Date: Tue, 30 Aug 2022 20:19:32 +0200
	To: libc-alpha@sourceware.org
	Subject: [PATCH] POSIX locale covers every byte [BZ# 29511]
	Message-ID: <20220830181932.oggrz6f6itrpyi6g@tarta.nabijaczleweli.xyz>
	MIME-Version: 1.0
	Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="6hbz4qgtipbx6qrx"
	Content-Disposition: inline
	User-Agent: NeoMutt/20220429
	Precedence: list
	From: =?utf-8?b?0L3QsNCxIHZpYSBMaWJjLWFscGhh?= <libc-alpha@sourceware.org>
	Reply-To: =?utf-8?b?0L3QsNCx?= <nabijaczleweli@nabijaczleweli.xyz>
	Cc: Florian Weimer <fweimer@redhat.com>
	Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org
	Sender: "Libc-alpha" <libc-alpha-bounces+patchwork=sourceware.org@sourceware.org>
Series	POSIX locale covers every byte [BZ# 29511]

Commit Message

Ahelenia Ziemiańska Aug. 30, 2022, 6:19 p.m. UTC

  This is a trivial patch, largely duplicating the extant ASCII code

There are two user-facing changes:
  * nl_langinfo(CODESET) is "POSIX" instead of "ANSI_X3.4-1968"
  * mbrtowc() and friends return b if b <= 0x7F else <UDF00>+b
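
Both user-facing changes can be observed from a small program. The
following is only a sketch: decode_one_byte and report_c_locale are
hypothetical helper names, and the values printed depend on the libc in
use. Per the list above, a patched glibc would report "POSIX" as the
codeset and decode 0x80 to U+DF80, where an unpatched one reports
"ANSI_X3.4-1968".

```c
#include <langinfo.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: decode a single byte with mbrtowc() in the
   current locale, starting from the initial conversion state.  */
static size_t
decode_one_byte (unsigned char b, wchar_t *out)
{
  mbstate_t state;
  char in = (char) b;

  memset (&state, 0, sizeof state);
  return mbrtowc (out, &in, 1, &state);
}

/* Hypothetical helper: switch to the C locale, print its codeset name
   and how three probe bytes decode.  Returns the codeset name.  */
static const char *
report_c_locale (void)
{
  static const unsigned char probes[] = { 0x41, 0x80, 0xff };
  const char *codeset;

  setlocale (LC_ALL, "C");
  codeset = nl_langinfo (CODESET);
  printf ("CODESET: %s\n", codeset);

  for (size_t i = 0; i < sizeof probes; i++)
    {
      wchar_t wc = 0;
      size_t n = decode_one_byte (probes[i], &wc);

      if (n == (size_t) -1 || n == (size_t) -2)
        printf ("0x%02x: mbrtowc failed\n", (unsigned int) probes[i]);
      else
        printf ("0x%02x: U+%04lX\n", (unsigned int) probes[i],
                (unsigned long) wc);
    }
  return codeset;
}
```

Calling report_c_locale () from main before and after applying the patch
shows the difference directly.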

Since Issue 7 TC 2/Issue 8, the C/POSIX locale, effectively:
  (a) is 1-byte, stateless, and contains 256 characters
  (b) which collate in byte order
  (c) the first 128 characters are equivalent to ASCII (like previous)
cf. https://www.austingroupbugs.net/view.php?id=663 for a summary of
changes to the standard;
in short, this means that mbrtowc() must never fail and must return
  b if b <= 0x7F else ab+c for all bytes b
  where c is some constant >=0x80
    and a is a positive integer constant

By strategically picking c=<UDF00> we land at the tail-end of the
Unicode Low Surrogate Area at DC00-DFFF, described as
  > Isolated surrogate code points have no interpretation;
  > consequently, no character code charts or names lists
  > are provided for this range.
and match musl
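
The mapping described above, with a=1 and c=0xDF00, can be sketched in
plain C. This is an illustration rather than the code in the patch:
posix_byte_to_wc, posix_wc_to_byte and posix_round_trip_ok are
hypothetical names, and the reverse direction simply reports -1 for
anything outside [0, 0x7F] and [U+DF80, U+DFFF].

```c
#include <stdint.h>

/* The constant c from the commit message (with a = 1).  */
#define POSIX_HIGH_BASE 0xdf00u

/* Hypothetical helper: map one byte to its wide-character value.
   Bytes up to 0x7F map to themselves, the rest to U+DF80..U+DFFF.  */
static uint32_t
posix_byte_to_wc (unsigned char b)
{
  return b < 0x80 ? b : POSIX_HIGH_BASE + b;
}

/* Hypothetical helper: inverse mapping.  Returns the byte value, or -1
   if the wide character has no single-byte representation.  */
static int
posix_wc_to_byte (uint32_t wc)
{
  if (wc < 0x80)
    return (int) wc;
  if (wc >= 0xdf80 && wc <= 0xdfff)
    return (int) (wc - POSIX_HIGH_BASE);
  return -1;
}

/* Returns 1 if all 256 byte values survive a round trip.  */
static int
posix_round_trip_ok (void)
{
  for (int b = 0; b <= 0xff; b++)
    if (posix_wc_to_byte (posix_byte_to_wc ((unsigned char) b)) != b)
      return 0;
  return 1;
}
```

Because the forward mapping is total and injective, every byte string
decodes without error and encodes back to the same bytes.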

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
Please observe that this patch is NOT correct for s390:
the s390 assembly implementations, marked // TODO,
are copied verbatim from the ASCII ones

I lack the hardware and expertise to write them,
but all others in that file are in assembly, too;
should I just copy the bare implementations there?
Please advise.

CCing Florian since he commented on my bugzilla bug :)

 iconv/gconv_builtin.h                 |   8 +
 iconv/gconv_int.h                     |   7 +
 iconv/gconv_simple.c                  |  75 +++++++
 iconv/tst-iconv_prog.sh               |  43 ++++
 inet/tst-idna_name_classify.c         |   6 +-
 locale/tst-C-locale.c                 |  69 ++++++
 localedata/locales/POSIX              | 143 +++++++++++-
 stdio-common/tst-printf-bz25691.c     |   2 +
 sysdeps/s390/multiarch/gconv_simple.c | 298 ++++++++++++++++++++++++++
 wcsmbs/wcsmbsload.c                   |  10 +-
 10 files changed, 652 insertions(+), 9 deletions(-)

Comments

Florian Weimer Sept. 6, 2022, 2:19 p.m. UTC | #1

* наб via Libc-alpha:

> This is a trivial patch, largely duplicating the extant ASCII code
>
> There are two user-facing changes:
>   * nl_langinfo(CODESET) is "POSIX" instead of "ANSI_X3.4-1968"
>   * mbrtowc() and friends return b if b <= 0x7F else <UDF00>+b
>
> Since Issue 7 TC 2/Issue 8, the C/POSIX locale, effectively:
>   (a) is 1-byte, stateless, and contains 256 characters
>   (b) which collate in byte order
>   (c) the first 128 characters are equivalent to ASCII (like previous)
> cf. https://www.austingroupbugs.net/view.php?id=663 for a summary of
> changes to the standard;
> in short, this means that mbrtowc() must never fail and must return
>   b if b <= 0x7F else ab+c for all bytes b
>   where c is some constant >=0x80
>     and a is a positive integer constant
>
> By strategically picking c=<UDF00> we land at the tail-end of the
> Unicode Low Surrogate Area at DC00-DFFF, described as
>   > Isolated surrogate code points have no interpretation;
>   > consequently, no character code charts or names lists
>   > are provided for this range.
> and match musl

We don't match Python and its surrogateescape encoding (PEP 838).  It
maps invalid bytes in the 0x80…0xff range to U+DC80…U+DCFF.  It may make
more sense to align with that.

What worries me is that this effectively closes the door for using UTF-8
(or some variant, such as Python's) with the C locale.  I used to assume
that POSIX allows that, but they now say this was just a mistake.

Anyway, regarding mechanics, we'll need a new localedata/charmaps/POSIX
charmap, I think.  This charmap then can be tested against the gconv
converter.

You should put the new converters into a separate file (not
iconv/gconv_simple.c), then the s390x version will use that
automatically.

> diff --git a/localedata/locales/POSIX b/localedata/locales/POSIX
> index 7ec7f1c577..fc34a6abc1 100644
> --- a/localedata/locales/POSIX
> +++ b/localedata/locales/POSIX
> @@ -97,6 +97,20 @@ END LC_CTYPE
>  LC_COLLATE
>  % This is the POSIX Locale definition for the LC_COLLATE category.

Isn't this just the C locale?  We don't have a separate file for that.

> diff --git a/wcsmbs/wcsmbsload.c b/wcsmbs/wcsmbsload.c
> index 0f0f55f9ed..f87099bcf5 100644
> --- a/wcsmbs/wcsmbsload.c
> +++ b/wcsmbs/wcsmbsload.c
> @@ -33,10 +33,10 @@ static const struct __gconv_step to_wc =
>    .__shlib_handle = NULL,
>    .__modname = NULL,
>    .__counter = INT_MAX,
> -  .__from_name = (char *) "ANSI_X3.4-1968//TRANSLIT",
> +  .__from_name = (char *) "POSIX",
>    .__to_name = (char *) "INTERNAL",
> -  .__fct = __gconv_transform_ascii_internal,
> -  .__btowc_fct = __gconv_btwoc_ascii,
> +  .__fct = __gconv_transform_posix_internal,
> +  .__btowc_fct = __gconv_btwoc_posix,
>    .__init_fct = NULL,
>    .__end_fct = NULL,
>    .__min_needed_from = 1,
> @@ -53,8 +53,8 @@ static const struct __gconv_step to_mb =
>    .__modname = NULL,
>    .__counter = INT_MAX,
>    .__from_name = (char *) "INTERNAL",
> -  .__to_name = (char *) "ANSI_X3.4-1968//TRANSLIT",
> -  .__fct = __gconv_transform_internal_ascii,
> +  .__to_name = (char *) "POSIX",
> +  .__fct = __gconv_transform_internal_posix,
>    .__btowc_fct = NULL,
>    .__init_fct = NULL,
>    .__end_fct = NULL,

This makes the comment on __wcsmbs_gconv_fcts_c in the same file
obsolete.

Thanks,
Florian

Ahelenia Ziemiańska Sept. 6, 2022, 6:06 p.m. UTC | #2

Hi!

On Tue, Sep 06, 2022 at 04:19:01PM +0200, Florian Weimer wrote:
> * наб via Libc-alpha:
> 
> > This is a trivial patch, largely duplicating the extant ASCII code
> >
> > There are two user-facing changes:
> >   * nl_langinfo(CODESET) is "POSIX" instead of "ANSI_X3.4-1968"
> >   * mbrtowc() and friends return b if b <= 0x7F else <UDF00>+b
> >
> > Since Issue 7 TC 2/Issue 8, the C/POSIX locale, effectively:
> >   (a) is 1-byte, stateless, and contains 256 characters
> >   (b) which collate in byte order
> >   (c) the first 128 characters are equivalent to ASCII (like previous)
> > cf. https://www.austingroupbugs.net/view.php?id=663 for a summary of
> > changes to the standard;
> > in short, this means that mbrtowc() must never fail and must return
> >   b if b <= 0x7F else ab+c for all bytes b
> >   where c is some constant >=0x80
> >     and a is a positive integer constant
> >
> > By strategically picking c=<UDF00> we land at the tail-end of the
> > Unicode Low Surrogate Area at DC00-DFFF, described as
> >   > Isolated surrogate code points have no interpretation;
> >   > consequently, no character code charts or names lists
> >   > are provided for this range.
> > and match musl
> 
> We don't match Python and its surrogateescape encoding (PEP 838).
404?

> It
> maps invalid bytes in the 0x80…0xff range to U+DC80…U+DCFF.
(The same as musl.)

> It may make
> more sense to align with that.
With a=1 and c=<UDF00>, assuming it's as you say, we very much do?

$ printf '\x80\xff' | output/elf/ld.so --library-path output/ output/iconv/iconv_prog -fPOSIX -tUCS4 | hd
00000000  00 00 df 80 00 00 df ff                           |........|
00000008
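
For comparison, the two offsets under discussion can be worked through
in a few lines. This is only a sketch: patch_map follows the 0xdf00 + b
rule from the commit message, dc00_map assumes the U+DC80…U+DCFF range
that the quoted text attributes to Python's surrogateescape, and both
function names are made up for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Rule from the commit message: high bytes go to U+DF80..U+DFFF.  */
static uint32_t
patch_map (unsigned char b)
{
  return b < 0x80 ? b : 0xdf00u + b;
}

/* Assumed rule for the range quoted above: high bytes go to
   U+DC80..U+DCFF, i.e. 0xDC00 + b.  */
static uint32_t
dc00_map (unsigned char b)
{
  return b < 0x80 ? b : 0xdc00u + b;
}

/* Print both mappings for the two bytes used in the iconv_prog run.  */
static void
print_comparison (void)
{
  static const unsigned char probes[] = { 0x80, 0xff };

  for (size_t i = 0; i < sizeof probes; i++)
    printf ("0x%02x: patch U+%04X, 0xDC00 rule U+%04X\n",
            (unsigned int) probes[i],
            (unsigned int) patch_map (probes[i]),
            (unsigned int) dc00_map (probes[i]));
}
```

Under these definitions the two ranges sit a constant 0x300 apart:
0x80 lands at U+DF80 with the patch and at U+DC80 with the 0xDC00 rule.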

> Anyway, regarding mechanics, we'll need a new localedata/charmaps/POSIX
> charmap, I think.  This charmap then can be tested against the gconv
> converter.
Hm, the problem with that is that tst-tables -> tst-table -> tst-table-from
(and -to) convert by constructing a UTF-8 sequence, and glibc rejects
unpaired surrogates.

The output for tst-table-from UTF-8 is:
  ...
  0xED9FBE        0xD7FE
  0xED9FBF        0xD7FF
  0xEE8080        0xE000
  0xEE8081        0xE001
  ...
i.e. there's a gap for the surrogates; and, indeed, the charmap reads
  <UD7FB>     /xed/x9f/xbb HANGUL JONGSEONG PHIEUPH-THIEUTH
  %<UD800>     /xed/xa0/x80 <Non Private Use High Surrogate, First>
  %<UDB7F>     /xed/xad/xbf <Non Private Use High Surrogate, Last>
  %<UDB80>     /xed/xae/x80 <Private Use High Surrogate, First>
  %<UDBFF>     /xed/xaf/xbf <Private Use High Surrogate, Last>
  %<UDC00>     /xed/xb0/x80 <Low Surrogate, First>
  %<UDFFF>     /xed/xbf/xbf <Low Surrogate, Last>
  <UE000>..<UE03F> /xee/x80/x80 <Private Use>
with the surrogate range commented out.
This dates back to the inclusion of the UTF-8 generator scripts in 2015
(4a4839c94a4c93ffc0d5b95c69a08b02a57007f2), and the exclusions are
deliberate (grep for surrog in localedata/unicode-gen/utf8_gen.py).
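
The gap can be reproduced by hand. The sketch below encodes a code point
as a three-byte UTF-8 sequence and then decodes it with a strict check
that refuses U+D800..U+DFFF. It is a stand-alone illustration with
hypothetical helper names, not glibc's converter.

```c
#include <stdint.h>

/* Encode a code point in [U+0800, U+FFFF] as three UTF-8 bytes,
   without checking whether it is a surrogate.  */
static void
encode_3byte (uint32_t cp, unsigned char out[3])
{
  out[0] = (unsigned char) (0xe0 | (cp >> 12));
  out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3f));
  out[2] = (unsigned char) (0x80 | (cp & 0x3f));
}

static int
is_surrogate (uint32_t cp)
{
  return cp >= 0xd800 && cp <= 0xdfff;
}

/* Strict decode of a three-byte sequence: returns the code point, or
   -1 for malformed, overlong, or surrogate input.  */
static long
decode_3byte_strict (const unsigned char in[3])
{
  uint32_t cp;

  if ((in[0] & 0xf0) != 0xe0
      || (in[1] & 0xc0) != 0x80
      || (in[2] & 0xc0) != 0x80)
    return -1;

  cp = ((uint32_t) (in[0] & 0x0f) << 12)
       | ((uint32_t) (in[1] & 0x3f) << 6)
       | (uint32_t) (in[2] & 0x3f);

  if (cp < 0x800 || is_surrogate (cp))
    return -1;
  return (long) cp;
}
```

U+DF80 encodes to /xed/xbe/x80, which falls inside the commented-out
/xed/xa0/x80../xed/xbf/xbf block, so a strict decoder rejects it, while
U+D7FF (/xed/x9f/xbf) and U+E000 (/xee/x80/x80) on either side pass.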

Given this limitation, expanding the charmap to
ANSI_X3.4-1968 + <UDF80>..<UDFFF> doesn't actually test much:
having them as separate codepoints will always fail tests,
and dot-notation lines are ignored when generating the comparison
tables, so this particular type of test just proves that POSIX is the
same as ANSI_X3.4-1968 for the first 128 characters.

There's already an exhaustive iconv_prog-based testsuite
(cf. additions to iconv/tst-iconv_prog.sh), though.

> You should put the new converters into a separate file (not
> iconv/gconv_simple.c), then the s390x version will use that
> automatically.
Oh, of course! Moved to iconv/gconv_posix.c.

> > diff --git a/localedata/locales/POSIX b/localedata/locales/POSIX
> > index 7ec7f1c577..fc34a6abc1 100644
> > --- a/localedata/locales/POSIX
> > +++ b/localedata/locales/POSIX
> > @@ -97,6 +97,20 @@ END LC_CTYPE
> >  LC_COLLATE
> >  % This is the POSIX Locale definition for the LC_COLLATE category.
> 
> Isn't this just the C locale?
Yes, C is defined to be POSIX.

> We don't have a separate file for that.
Yes, we very obviously do, seeing as this patch edits it?
Nothing consumes it AFAICT, but.

> > diff --git a/wcsmbs/wcsmbsload.c b/wcsmbs/wcsmbsload.c
> > index 0f0f55f9ed..f87099bcf5 100644
> > --- a/wcsmbs/wcsmbsload.c
> > +++ b/wcsmbs/wcsmbsload.c
> > @@ -33,10 +33,10 @@ static const struct __gconv_step to_wc =
> >    .__shlib_handle = NULL,
> >    .__modname = NULL,
> >    .__counter = INT_MAX,
> > -  .__from_name = (char *) "ANSI_X3.4-1968//TRANSLIT",
> > +  .__from_name = (char *) "POSIX",
> >    .__to_name = (char *) "INTERNAL",
> > -  .__fct = __gconv_transform_ascii_internal,
> > -  .__btowc_fct = __gconv_btwoc_ascii,
> > +  .__fct = __gconv_transform_posix_internal,
> > +  .__btowc_fct = __gconv_btwoc_posix,
> >    .__init_fct = NULL,
> >    .__end_fct = NULL,
> >    .__min_needed_from = 1,
> > @@ -53,8 +53,8 @@ static const struct __gconv_step to_mb =
> >    .__modname = NULL,
> >    .__counter = INT_MAX,
> >    .__from_name = (char *) "INTERNAL",
> > -  .__to_name = (char *) "ANSI_X3.4-1968//TRANSLIT",
> > -  .__fct = __gconv_transform_internal_ascii,
> > +  .__to_name = (char *) "POSIX",
> > +  .__fct = __gconv_transform_internal_posix,
> >    .__btowc_fct = NULL,
> >    .__init_fct = NULL,
> >    .__end_fct = NULL,
> 
> This makes the comment on __wcsmbs_gconv_fcts_c in the same file
> obsolete.

Comment fixed.

> Thanks,
> Florian

New patchset in followup.

Best,
наб

Patch

diff --git a/iconv/gconv_builtin.h b/iconv/gconv_builtin.h
index 68c2369b1f..cd1805b3ce 100644
--- a/iconv/gconv_builtin.h
+++ b/iconv/gconv_builtin.h
@@ -89,6 +89,14 @@  BUILTIN_TRANSFORMATION ("INTERNAL", "ANSI_X3.4-1968//", 1, "=INTERNAL->ascii",
 			__gconv_transform_internal_ascii, NULL, 4, 4, 1, 1)
 
 
+BUILTIN_TRANSFORMATION ("POSIX//", "INTERNAL", 1, "=posix->INTERNAL",
+			__gconv_transform_posix_internal, __gconv_btwoc_posix,
+			1, 1, 4, 4)
+
+BUILTIN_TRANSFORMATION ("INTERNAL", "POSIX//", 1, "=INTERNAL->posix",
+			__gconv_transform_internal_posix, NULL, 4, 4, 1, 1)
+
+
 #if BYTE_ORDER == BIG_ENDIAN
 BUILTIN_ALIAS ("UNICODEBIG//", "ISO-10646/UCS2/")
 BUILTIN_ALIAS ("UCS-2BE//", "ISO-10646/UCS2/")
diff --git a/iconv/gconv_int.h b/iconv/gconv_int.h
index 1c6745043e..45ab1edfad 100644
--- a/iconv/gconv_int.h
+++ b/iconv/gconv_int.h
@@ -281,6 +281,8 @@  extern int __gconv_compare_alias (const char *name1, const char *name2)
 
 __BUILTIN_TRANSFORM (__gconv_transform_ascii_internal);
 __BUILTIN_TRANSFORM (__gconv_transform_internal_ascii);
+__BUILTIN_TRANSFORM (__gconv_transform_posix_internal);
+__BUILTIN_TRANSFORM (__gconv_transform_internal_posix);
 __BUILTIN_TRANSFORM (__gconv_transform_utf8_internal);
 __BUILTIN_TRANSFORM (__gconv_transform_internal_utf8);
 __BUILTIN_TRANSFORM (__gconv_transform_ucs2_internal);
@@ -299,6 +301,11 @@  __BUILTIN_TRANSFORM (__gconv_transform_utf16_internal);
    only ASCII characters.  */
 extern wint_t __gconv_btwoc_ascii (struct __gconv_step *step, unsigned char c);
 
+/* Specialized conversion function for a single byte to INTERNAL,
+   identity-mapping bytes [0, 0x7F], and moving [0x80, 0xFF] into the end
+   of the Low Surrogate Area at [U+DF80, U+DFFF].  */
+extern wint_t __gconv_btwoc_posix (struct __gconv_step *step, unsigned char c);
+
 #endif
 
 __END_DECLS
diff --git a/iconv/gconv_simple.c b/iconv/gconv_simple.c
index 640068d9ba..4cd01854cd 100644
--- a/iconv/gconv_simple.c
+++ b/iconv/gconv_simple.c
@@ -53,6 +53,18 @@  __gconv_btwoc_ascii (struct __gconv_step *step, unsigned char c)
     return WEOF;
 }
 
+/* Specialized conversion function for a single byte to INTERNAL,
+   identity-mapping bytes [0, 0x7F], and moving [0x80, 0xFF] into the end
+   of the Low Surrogate Area at [U+DF80, U+DFFF].  */
+wint_t
+__gconv_btwoc_posix (struct __gconv_step *step, unsigned char c)
+{
+  if (c < 0x80)
+    return c;
+  else
+    return 0xdf00 + c;
+}
+
 
 /* Transform from the internal, UCS4-like format, to UCS4.  The
    difference between the internal ucs4 format and the real UCS4
@@ -868,6 +880,69 @@  ucs4le_internal_loop_single (struct __gconv_step *step,
 #include <iconv/skeleton.c>
 
 
+/* Convert from {[0, 0x7F] => ISO 646-IRV; [0x80, 0xFF] => [U+DF80, U+DFFF]}
+   to the internal (UCS4-like) format.  */
+#define DEFINE_INIT		0
+#define DEFINE_FINI		0
+#define MIN_NEEDED_FROM		1
+#define MIN_NEEDED_TO		4
+#define FROM_DIRECTION		1
+#define FROM_LOOP		posix_internal_loop
+#define TO_LOOP			posix_internal_loop /* This is not used.  */
+#define FUNCTION_NAME		__gconv_transform_posix_internal
+#define ONE_DIRECTION		1
+
+#define MIN_NEEDED_INPUT	MIN_NEEDED_FROM
+#define MIN_NEEDED_OUTPUT	MIN_NEEDED_TO
+#define LOOPFCT			FROM_LOOP
+#define BODY \
+  {									      \
+    if (__glibc_unlikely (*inptr > '\x7f'))				      \
+      *((uint32_t *) outptr) = 0xdf00 + *inptr++;			      \
+    else								      \
+      *((uint32_t *) outptr) = *inptr++;				      \
+    outptr += sizeof (uint32_t);					      \
+  }
+#include <iconv/loop.c>
+#include <iconv/skeleton.c>
+
+
+/* Convert from the internal (UCS4-like) format to
+   {ISO 646-IRV => [0, 0x7F]; [U+DF80, U+DFFF] => [0x80, 0xFF]}.  */
+#define DEFINE_INIT		0
+#define DEFINE_FINI		0
+#define MIN_NEEDED_FROM		4
+#define MIN_NEEDED_TO		1
+#define FROM_DIRECTION		1
+#define FROM_LOOP		internal_posix_loop
+#define TO_LOOP			internal_posix_loop /* This is not used.  */
+#define FUNCTION_NAME		__gconv_transform_internal_posix
+#define ONE_DIRECTION		1
+
+#define MIN_NEEDED_INPUT	MIN_NEEDED_FROM
+#define MIN_NEEDED_OUTPUT	MIN_NEEDED_TO
+#define LOOPFCT			FROM_LOOP
+#define BODY \
+  {									      \
+    uint32_t val = *((const uint32_t *) inptr);				      \
+    if (__glibc_unlikely ((val > 0x7f && val < 0xdf80) || val > 0xdfff))      \
+      {									      \
+	UNICODE_TAG_HANDLER (val, 4);					      \
+	STANDARD_TO_LOOP_ERR_HANDLER (4);				      \
+      }									      \
+    else								      \
+      {									      \
+	if (__glibc_unlikely (val > 0x7f))				      \
+	  val -= 0xdf00;						      \
+	*outptr++ = val;						      \
+	inptr += sizeof (uint32_t);					      \
+      }									      \
+  }
+#define LOOP_NEED_FLAGS
+#include <iconv/loop.c>
+#include <iconv/skeleton.c>
+
+
 /* Convert from the internal (UCS4-like) format to UTF-8.  */
 #define DEFINE_INIT		0
 #define DEFINE_FINI		0
diff --git a/iconv/tst-iconv_prog.sh b/iconv/tst-iconv_prog.sh
index b3d8bf5110..a24d8d2207 100644
--- a/iconv/tst-iconv_prog.sh
+++ b/iconv/tst-iconv_prog.sh
@@ -285,3 +285,46 @@  for errorcommand in "${errorarray[@]}"; do
   execute_test
   check_errtest_result
 done
+
+allbytes ()
+{
+  for (( i = 0; i <= 255; i++ )); do
+    printf '\'"$(printf "%o" "$i")"
+  done
+}
+
+allucs4be ()
+{
+  for (( i = 0; i <= 127; i++ )); do
+    printf '\0\0\0\'"$(printf "%o" "$i")"
+  done
+  for (( i = 128; i <= 255; i++ )); do
+    printf '\0\0\xdf\'"$(printf "%o" "$i")"
+  done
+}
+
+check_posix_result ()
+{
+  if [ $? -eq 0 ]; then
+    result=PASS
+  else
+    result=FAIL
+  fi
+
+  echo "$result: from \"$1\", to: \"$2\""
+
+  if [ "$result" != "PASS" ]; then
+    exit 1
+  fi
+}
+
+check_posix_encoding ()
+{
+  eval PROG=\"$ICONV\"
+  allbytes  | $PROG -f POSIX -t UCS-4BE | cmp -s - <(allucs4be)
+  check_posix_result POSIX UCS-4BE
+  allucs4be | $PROG -f UCS-4BE -t POSIX | cmp -s - <(allbytes)
+  check_posix_result UCS-4BE POSIX
+}
+
+check_posix_encoding
diff --git a/inet/tst-idna_name_classify.c b/inet/tst-idna_name_classify.c
index bfd34eee31..b379481844 100644
--- a/inet/tst-idna_name_classify.c
+++ b/inet/tst-idna_name_classify.c
@@ -37,11 +37,11 @@  do_test (void)
   puts ("info: C locale tests");
   locale_insensitive_tests ();
   TEST_COMPARE (__idna_name_classify ("abc\200def"),
-                idna_name_encoding_error);
+                idna_name_nonascii);
   TEST_COMPARE (__idna_name_classify ("abc\200\\def"),
-                idna_name_encoding_error);
+                idna_name_nonascii_backslash);
   TEST_COMPARE (__idna_name_classify ("abc\377def"),
-                idna_name_encoding_error);
+                idna_name_nonascii);
 
   puts ("info: en_US.ISO-8859-1 locale tests");
   if (setlocale (LC_CTYPE, "en_US.ISO-8859-1") == 0)
diff --git a/locale/tst-C-locale.c b/locale/tst-C-locale.c
index 6bd0367069..f30396ae12 100644
--- a/locale/tst-C-locale.c
+++ b/locale/tst-C-locale.c
@@ -229,6 +229,75 @@  run_test (const char *locname)
   STRTEST (YESSTR, "");
   STRTEST (NOSTR, "");
 
+#define CONVTEST(b, v) \
+  {									      \
+    unsigned char bs[] = {b, 0};					      \
+    mbstate_t ctx = {};							      \
+    wchar_t wc = -1;							      \
+    size_t sz = mbrtowc(&wc, (char *) bs, 1, &ctx);			      \
+    if (sz != !!b)							      \
+      {									      \
+	printf ("mbrtowc(%02hhx) width in locale %s wrong "		      \
+		"(is %zd, should be %d)\n", *bs, locname, sz, !!b);	      \
+	result = 1;							      \
+      }									      \
+    if (wc != v)							      \
+      {									      \
+	printf ("mbrtowc(%02hhx) value in locale %s wrong "		      \
+		"(is %x, should be %x)\n", *bs, locname, wc, v);	      \
+	result = 1;							      \
+      }									      \
+  }
+  for(int i = 0; i <= 0x7f; ++i)
+    CONVTEST(i, i);
+  for(int i = 0x80; i <= 0xff; ++i)
+    CONVTEST(i, 0xdf00 + i);
+
+#define DECONVTEST(v, b) \
+  {									      \
+    unsigned char ob = -1;						      \
+    mbstate_t ctx = {};							      \
+    size_t sz = wcrtomb((char *) &ob, v, &ctx);				      \
+    if (sz != 1)							      \
+      {									      \
+	printf ("wcrtomb(%x) width in locale %s wrong "			      \
+		"(is %zd, should be 1)\n", v, locname, sz);		      \
+	result = 1;							      \
+      }									      \
+    if (ob != b)							      \
+      {									      \
+	printf ("wcrtomb(%x) value in locale %s wrong "			      \
+		"(is %hhx, should be %hhx)\n", v, locname, ob, b);	      \
+	result = 1;							      \
+      }									      \
+  }
+#define DECONVERR(v) \
+  {									      \
+    unsigned char ob = -1;						      \
+    mbstate_t ctx = {};							      \
+    size_t sz = wcrtomb((char *) &ob, v, &ctx);				      \
+    if (sz != (size_t) -1)						      \
+      {									      \
+	printf ("wcrtomb(%x) width in locale %s wrong "			      \
+		"(is %zd, should be (size_t )-1)\n", v, locname, sz);	      \
+	result = 1;							      \
+      }									      \
+    if (ob != (unsigned char) -1)					      \
+      {									      \
+	printf ("wcrtomb(%x) value in locale %s wrong "			      \
+		"(is %hhx, should be unchanged)\n", v, locname, ob);	      \
+	result = 1;							      \
+      }									      \
+  }
+  for(int i = 0; i <= 0x7f; ++i)
+    DECONVTEST(i, i);
+  for(int i = 0x80; i < 0xdf00; ++i)
+    DECONVERR(i);
+  for(int i = 0x80; i <= 0xff; ++i)
+    DECONVTEST(0xdf00 + i, i);
+  for(int i = 0xe000; i <= 0xffff; ++i)
+    DECONVERR(i);
+
   /* Test the new locale mechanisms.  */
   loc = newlocale (LC_ALL_MASK, locname, NULL);
   if (loc == NULL)
diff --git a/localedata/locales/POSIX b/localedata/locales/POSIX
index 7ec7f1c577..fc34a6abc1 100644
--- a/localedata/locales/POSIX
+++ b/localedata/locales/POSIX
@@ -97,6 +97,20 @@  END LC_CTYPE
 LC_COLLATE
 % This is the POSIX Locale definition for the LC_COLLATE category.
 % The order is the same as in the ASCII code set.
+% Values above <DEL> (<U007F>) inserted in order, per Issue 7 TC2,
+% XBD, 7.3.2, LC_COLLATE Category in the POSIX Locale:
+% > All characters not explicitly listed here shall be inserted
+% > in the character collation order after the listed characters
+% > and shall be assigned unique primary weights. If the listed
+% > characters have ASCII encoding, the other characters shall
+% > be in ascending order according to their coded character set values
+% Since Issue 7 TC2 (XBD, 6.2 Character Encoding):
+% > The POSIX locale shall contain 256 single-byte characters [...]
+% (cf. bug 663, 674).
+% this is in contrast to previous issues, which limited the POSIX
+% locale to the Portable Character Set (7-bit ASCII).
+% We use the end of the Low Surrogate Area to contain these,
+% yielding [<UDF80>, <UDFFF>]
 order_start forward
 <U0000>
 <U0001>
@@ -226,7 +240,134 @@  order_start forward
 <U007D>
 <U007E>
 <U007F>
-UNDEFINED
+<UDF80>
+<UDF81>
+<UDF82>
+<UDF83>
+<UDF84>
+<UDF85>
+<UDF86>
+<UDF87>
+<UDF88>
+<UDF89>
+<UDF8A>
+<UDF8B>
+<UDF8C>
+<UDF8D>
+<UDF8E>
+<UDF8F>
+<UDF90>
+<UDF91>
+<UDF92>
+<UDF93>
+<UDF94>
+<UDF95>
+<UDF96>
+<UDF97>
+<UDF98>
+<UDF99>
+<UDF9A>
+<UDF9B>
+<UDF9C>
+<UDF9D>
+<UDF9E>
+<UDF9F>
+<UDFA0>
+<UDFA1>
+<UDFA2>
+<UDFA3>
+<UDFA4>
+<UDFA5>
+<UDFA6>
+<UDFA7>
+<UDFA8>
+<UDFA9>
+<UDFAA>
+<UDFAB>
+<UDFAC>
+<UDFAD>
+<UDFAE>
+<UDFAF>
+<UDFB0>
+<UDFB1>
+<UDFB2>
+<UDFB3>
+<UDFB4>
+<UDFB5>
+<UDFB6>
+<UDFB7>
+<UDFB8>
+<UDFB9>
+<UDFBA>
+<UDFBB>
+<UDFBC>
+<UDFBD>
+<UDFBE>
+<UDFBF>
+<UDFC0>
+<UDFC1>
+<UDFC2>
+<UDFC3>
+<UDFC4>
+<UDFC5>
+<UDFC6>
+<UDFC7>
+<UDFC8>
+<UDFC9>
+<UDFCA>
+<UDFCB>
+<UDFCC>
+<UDFCD>
+<UDFCE>
+<UDFCF>
+<UDFD0>
+<UDFD1>
+<UDFD2>
+<UDFD3>
+<UDFD4>
+<UDFD5>
+<UDFD6>
+<UDFD7>
+<UDFD8>
+<UDFD9>
+<UDFDA>
+<UDFDB>
+<UDFDC>
+<UDFDD>
+<UDFDE>
+<UDFDF>
+<UDFE0>
+<UDFE1>
+<UDFE2>
+<UDFE3>
+<UDFE4>
+<UDFE5>
+<UDFE6>
+<UDFE7>
+<UDFE8>
+<UDFE9>
+<UDFEA>
+<UDFEB>
+<UDFEC>
+<UDFED>
+<UDFEE>
+<UDFEF>
+<UDFF0>
+<UDFF1>
+<UDFF2>
+<UDFF3>
+<UDFF4>
+<UDFF5>
+<UDFF6>
+<UDFF7>
+<UDFF8>
+<UDFF9>
+<UDFFA>
+<UDFFB>
+<UDFFC>
+<UDFFD>
+<UDFFE>
+<UDFFF>
 order_end
 %
 END LC_COLLATE
diff --git a/stdio-common/tst-printf-bz25691.c b/stdio-common/tst-printf-bz25691.c
index 44844e71c3..e66242b58f 100644
--- a/stdio-common/tst-printf-bz25691.c
+++ b/stdio-common/tst-printf-bz25691.c
@@ -30,6 +30,8 @@ 
 static int
 do_test (void)
 {
+  setlocale(LC_CTYPE, "C.UTF-8");
+
   mtrace ();
 
   /* For 's' conversion specifier with 'l' modifier the array must be
diff --git a/sysdeps/s390/multiarch/gconv_simple.c b/sysdeps/s390/multiarch/gconv_simple.c
index 41132f620a..3896bdd96a 100644
--- a/sysdeps/s390/multiarch/gconv_simple.c
+++ b/sysdeps/s390/multiarch/gconv_simple.c
@@ -68,6 +68,8 @@ 
 
 # undef __gconv_transform_ascii_internal
 # undef __gconv_transform_internal_ascii
+# undef __gconv_transform_posix_internal
+# undef __gconv_transform_internal_posix
 # undef __gconv_transform_internal_ucs4le
 # undef __gconv_transform_ucs4_internal
 # undef __gconv_transform_ucs4le_internal
@@ -385,6 +387,302 @@  ICONV_VX_IFUNC (__gconv_transform_ascii_internal)
 # undef BODY_ORIG_ERROR
 ICONV_VX_IFUNC (__gconv_transform_internal_ascii)
 
+/* Convert from {[0, 0x7F] => ISO 646-IRV; [0x80, 0xFF] => [U+DF80, U+DFFF]}
+   to the internal (UCS4-like) format.  */
+# define DEFINE_INIT		0
+# define DEFINE_FINI		0
+# define MIN_NEEDED_FROM	1
+# define MIN_NEEDED_TO		4
+# define FROM_DIRECTION		1
+# define FROM_LOOP		ICONV_VX_NAME (posix_internal_loop)
+# define TO_LOOP		ICONV_VX_NAME (posix_internal_loop) /* This is not used.  */
+# define FUNCTION_NAME		ICONV_VX_NAME (__gconv_transform_posix_internal)
+# define ONE_DIRECTION		1
+
+# define MIN_NEEDED_INPUT	MIN_NEEDED_FROM
+# define MIN_NEEDED_OUTPUT	MIN_NEEDED_TO
+# define LOOPFCT		FROM_LOOP
+
+# define BODY_ORIG \
+  {									      \
+    if (__glibc_unlikely (*inptr > '\x7f'))				      \
+      *((uint32_t *) outptr) = 0xdf00 + *inptr++;			      \
+    else								      \
+      *((uint32_t *) outptr) = *inptr++;				      \
+    outptr += sizeof (uint32_t);					      \
+  }
+# define BODY								\
+  {									\
+    size_t len = inend - inptr; /* TODO: entirely ascii_internal_loop, above.  */ \
+    if (len > (outend - outptr) / 4)					\
+      len = (outend - outptr) / 4;					\
+    size_t loop_count, tmp;						\
+    __asm__ volatile (".machine push\n\t"				\
+		      ".machine \"z13\"\n\t"				\
+		      ".machinemode \"zarch_nohighgprs\"\n\t"		\
+		      CONVERT_32BIT_SIZE_T ([R_LEN])			\
+		      "    vrepib %%v30,0x7f\n\t" /* For compare > 0x7f.  */ \
+		      "    srlg %[R_LI],%[R_LEN],4\n\t"			\
+		      "    vrepib %%v31,0x20\n\t"			\
+		      "    clgije %[R_LI],0,1f\n\t"			\
+		      "0:  \n\t" /* Handle 16-byte blocks.  */		\
+		      "    vl %%v16,0(%[R_IN])\n\t"			\
+		      /* Checking for values > 0x7f.  */		\
+		      "    vstrcbs %%v17,%%v16,%%v30,%%v31\n\t"		\
+		      "    jno 10f\n\t"					\
+		      /* Enlarge to UCS4.  */				\
+		      "    vuplhb %%v17,%%v16\n\t"			\
+		      "    vupllb %%v18,%%v16\n\t"			\
+		      "    vuplhh %%v19,%%v17\n\t"			\
+		      "    vupllh %%v20,%%v17\n\t"			\
+		      "    vuplhh %%v21,%%v18\n\t"			\
+		      "    vupllh %%v22,%%v18\n\t"			\
+		      /* Store 64bytes to buf_out.  */			\
+		      "    vstm %%v19,%%v22,0(%[R_OUT])\n\t"		\
+		      "    la %[R_IN],16(%[R_IN])\n\t"			\
+		      "    la %[R_OUT],64(%[R_OUT])\n\t"		\
+		      "    brctg %[R_LI],0b\n\t"			\
+		      "    lghi %[R_LI],15\n\t"				\
+		      "    ngr %[R_LEN],%[R_LI]\n\t"			\
+		      "    je 20f\n\t" /* Jump away if no remaining bytes.  */ \
+		      /* Handle remaining bytes.  */			\
+		      "1: aghik %[R_LI],%[R_LEN],-1\n\t"		\
+		      "    jl 20f\n\t" /* Jump away if no remaining bytes.  */ \
+		      "    vll %%v16,%[R_LI],0(%[R_IN])\n\t"		\
+		      /* Checking for values > 0x7f.  */		\
+		      "    vstrcbs %%v17,%%v16,%%v30,%%v31\n\t"		\
+		      "    vlgvb %[R_TMP],%%v17,7\n\t"			\
+		      "    clr %[R_TMP],%[R_LI]\n\t"			\
+		      "    locrh %[R_TMP],%[R_LEN]\n\t"			\
+		      "    locghih %[R_LEN],0\n\t"			\
+		      "    j 12f\n\t"					\
+		      "10:\n\t"						\
+		      /* Found a value > 0x7f.				\
+			 Store the preceding chars.  */			\
+		      "    vlgvb %[R_TMP],%%v17,7\n\t"			\
+		      "12: la %[R_IN],0(%[R_TMP],%[R_IN])\n\t"		\
+		      "    sllk %[R_TMP],%[R_TMP],2\n\t"		\
+		      "    ahi %[R_TMP],-1\n\t"				\
+		      "    jl 20f\n\t"					\
+		      "    lgr %[R_LI],%[R_TMP]\n\t"			\
+		      "    vuplhb %%v17,%%v16\n\t"			\
+		      "    vuplhh %%v19,%%v17\n\t"			\
+		      "    vstl %%v19,%[R_LI],0(%[R_OUT])\n\t"		\
+		      "    ahi %[R_LI],-16\n\t"				\
+		      "    jl 11f\n\t"					\
+		      "    vupllh %%v20,%%v17\n\t"			\
+		      "    vstl %%v20,%[R_LI],16(%[R_OUT])\n\t"		\
+		      "    ahi %[R_LI],-16\n\t"				\
+		      "    jl 11f\n\t"					\
+		      "    vupllb %%v18,%%v16\n\t"			\
+		      "    vuplhh %%v21,%%v18\n\t"			\
+		      "    vstl %%v21,%[R_LI],32(%[R_OUT])\n\t"		\
+		      "    ahi %[R_LI],-16\n\t"				\
+		      "    jl 11f\n\t"					\
+		      "    vupllh %%v22,%%v18\n\t"			\
+		      "    vstl %%v22,%[R_LI],48(%[R_OUT])\n\t"		\
+		      "11:\n\t"						\
+		      "    la %[R_OUT],1(%[R_TMP],%[R_OUT])\n\t"	\
+		      "20:\n\t"						\
+		      ".machine pop"					\
+		      : /* outputs */ [R_OUT] "+a" (outptr)		\
+			, [R_IN] "+a" (inptr)				\
+			, [R_LEN] "+d" (len)				\
+			, [R_LI] "=d" (loop_count)			\
+			, [R_TMP] "=a" (tmp)				\
+		      : /* inputs */					\
+		      : /* clobber list*/ "memory", "cc"		\
+			ASM_CLOBBER_VR ("v16") ASM_CLOBBER_VR ("v17")	\
+			ASM_CLOBBER_VR ("v18") ASM_CLOBBER_VR ("v19")	\
+			ASM_CLOBBER_VR ("v20") ASM_CLOBBER_VR ("v21")	\
+			ASM_CLOBBER_VR ("v22") ASM_CLOBBER_VR ("v30")	\
+			ASM_CLOBBER_VR ("v31")				\
+		      );						\
+    if (len > 0)							\
+      {									\
+	/* Found an invalid character at the next input byte.  */	\
+	BODY_ORIG_ERROR							\
+      }									\
+  }
+
+# include <iconv/loop.c>
+# include <iconv/skeleton.c>
+# undef BODY_ORIG
+# undef BODY_ORIG_ERROR
+ICONV_VX_IFUNC (__gconv_transform_posix_internal)
+
+/* Convert from the internal (UCS4-like) format to
+   {ISO 646-IRV => [0, 0x7F]; [U+DF80, U+DFFF] => [0x80, 0xFF]}.  */
+# define DEFINE_INIT		0
+# define DEFINE_FINI		0
+# define MIN_NEEDED_FROM	4
+# define MIN_NEEDED_TO		1
+# define FROM_DIRECTION		1
+# define FROM_LOOP		ICONV_VX_NAME (internal_posix_loop)
+# define TO_LOOP		ICONV_VX_NAME (internal_posix_loop) /* This is not used.  */
+# define FUNCTION_NAME		ICONV_VX_NAME (__gconv_transform_internal_posix)
+# define ONE_DIRECTION		1
+
+# define MIN_NEEDED_INPUT	MIN_NEEDED_FROM
+# define MIN_NEEDED_OUTPUT	MIN_NEEDED_TO
+# define LOOPFCT		FROM_LOOP
+# define BODY_ORIG_ERROR						\
+  UNICODE_TAG_HANDLER (*((const uint32_t *) inptr), 4);			\
+  STANDARD_TO_LOOP_ERR_HANDLER (4);
+
+# define BODY_ORIG \
+  {									\
+    uint32_t val = *((const uint32_t *) inptr);				\
+    if (__glibc_unlikely ((val > 0x7f && val < 0xdf80) || val > 0xdfff))\
+      {									\
+	UNICODE_TAG_HANDLER (val, 4);					\
+	STANDARD_TO_LOOP_ERR_HANDLER (4);				\
+      }									\
+    else								\
+      {									\
+	if (__glibc_unlikely (val > 0x7f))				\
+	  val -= 0xdf00;						\
+	*outptr++ = val;						\
+	inptr += sizeof (uint32_t);					\
+      }									\
+  }
+
+# define BODY								\
+  {									\
+    size_t len = (inend - inptr) / 4; /* TODO: entirely internal_ascii_loop, above.  */ \
+    if (len > outend - outptr)						\
+      len = outend - outptr;						\
+    size_t loop_count, tmp, tmp2;					\
+    __asm__ volatile (".machine push\n\t"				\
+		      ".machine \"z13\"\n\t"				\
+		      ".machinemode \"zarch_nohighgprs\"\n\t"		\
+		      CONVERT_32BIT_SIZE_T ([R_LEN])			\
+		      /* Setup to check for ch > 0x7f.  */		\
+		      "    vzero %%v21\n\t"				\
+		      "    srlg %[R_LI],%[R_LEN],4\n\t"			\
+		      "    vleih %%v21,8192,0\n\t"  /* element 0:   >  */ \
+		      "    vleih %%v21,-8192,2\n\t" /* element 1: =<>  */ \
+		      "    vleif %%v20,127,0\n\t"   /* element 0: 127  */ \
+		      "    lghi %[R_TMP],0\n\t"				\
+		      "    clgije %[R_LI],0,1f\n\t"			\
+		      "0:\n\t"						\
+		      "    vlm %%v16,%%v19,0(%[R_IN])\n\t"		\
+		      /* Shorten to byte values.  */			\
+		      "    vpkf %%v23,%%v16,%%v17\n\t"			\
+		      "    vpkf %%v24,%%v18,%%v19\n\t"			\
+		      "    vpkh %%v23,%%v23,%%v24\n\t"			\
+		      /* Checking for values > 0x7f.  */		\
+		      "    vstrcfs %%v22,%%v16,%%v20,%%v21\n\t"		\
+		      "    jno 10f\n\t"					\
+		      "    vstrcfs %%v22,%%v17,%%v20,%%v21\n\t"		\
+		      "    jno 11f\n\t"					\
+		      "    vstrcfs %%v22,%%v18,%%v20,%%v21\n\t"		\
+		      "    jno 12f\n\t"					\
+		      "    vstrcfs %%v22,%%v19,%%v20,%%v21\n\t"		\
+		      "    jno 13f\n\t"					\
+		      /* Store 16 bytes to outptr.  */			\
+		      "    vst %%v23,0(%[R_OUT])\n\t"			\
+		      "    la %[R_IN],64(%[R_IN])\n\t"			\
+		      "    la %[R_OUT],16(%[R_OUT])\n\t"		\
+		      "    brctg %[R_LI],0b\n\t"			\
+		      "    lghi %[R_LI],15\n\t"				\
+		      "    ngr %[R_LEN],%[R_LI]\n\t"			\
+		      "    je 20f\n\t" /* Jump away if no remaining bytes.  */ \
+		      /* Handle remaining bytes.  */			\
+		      "1: sllg %[R_LI],%[R_LEN],2\n\t"			\
+		      "    aghi %[R_LI],-1\n\t"				\
+		      "    jl 20f\n\t" /* Jump away if no remaining bytes.  */ \
+		      /* Load remaining 1...63 bytes.  */		\
+		      "    vll %%v16,%[R_LI],0(%[R_IN])\n\t"		\
+		      "    ahi %[R_LI],-16\n\t"				\
+		      "    jl 2f\n\t"					\
+		      "    vll %%v17,%[R_LI],16(%[R_IN])\n\t"		\
+		      "    ahi %[R_LI],-16\n\t"				\
+		      "    jl 2f\n\t"					\
+		      "    vll %%v18,%[R_LI],32(%[R_IN])\n\t"		\
+		      "    ahi %[R_LI],-16\n\t"				\
+		      "    jl 2f\n\t"					\
+		      "    vll %%v19,%[R_LI],48(%[R_IN])\n\t"		\
+		      "2:\n\t"						\
+		      /* Shorten to byte values.  */			\
+		      "    vpkf %%v23,%%v16,%%v17\n\t"			\
+		      "    vpkf %%v24,%%v18,%%v19\n\t"			\
+		      "    vpkh %%v23,%%v23,%%v24\n\t"			\
+		      "    sllg %[R_LI],%[R_LEN],2\n\t"			\
+		      "    aghi %[R_LI],-16\n\t"			\
+		      "    jl 3f\n\t" /* v16 is not fully loaded.  */	\
+		      "    vstrcfs %%v22,%%v16,%%v20,%%v21\n\t"		\
+		      "    jno 10f\n\t"					\
+		      "    aghi %[R_LI],-16\n\t"			\
+		      "    jl 4f\n\t" /* v17 is not fully loaded.  */	\
+		      "    vstrcfs %%v22,%%v17,%%v20,%%v21\n\t"		\
+		      "    jno 11f\n\t"					\
+		      "    aghi %[R_LI],-16\n\t"			\
+		      "    jl 5f\n\t" /* v18 is not fully loaded.  */	\
+		      "    vstrcfs %%v22,%%v18,%%v20,%%v21\n\t"		\
+		      "    jno 12f\n\t"					\
+		      "    aghi %[R_LI],-16\n\t"			\
+		      /* v19 is not fully loaded. */			\
+		      "    lghi %[R_TMP],12\n\t"			\
+		      "    vstrcfs %%v22,%%v19,%%v20,%%v21\n\t"		\
+		      "6: vlgvb %[R_I],%%v22,7\n\t"			\
+		      "    aghi %[R_LI],16\n\t"				\
+		      "    clrjl %[R_I],%[R_LI],14f\n\t"		\
+		      "    lgr %[R_I],%[R_LEN]\n\t"			\
+		      "    lghi %[R_LEN],0\n\t"				\
+		      "    j 15f\n\t"					\
+		      "3: vstrcfs %%v22,%%v16,%%v20,%%v21\n\t"		\
+		      "    j 6b\n\t"					\
+		      "4: vstrcfs %%v22,%%v17,%%v20,%%v21\n\t"		\
+		      "    lghi %[R_TMP],4\n\t"				\
+		      "    j 6b\n\t"					\
+		      "5: vstrcfs %%v22,%%v18,%%v20,%%v21\n\t"		\
+		      "    lghi %[R_TMP],8\n\t"				\
+		      "    j 6b\n\t"					\
+		      /* Found a value > 0x7f.  */			\
+		      "13: ahi %[R_TMP],4\n\t"				\
+		      "12: ahi %[R_TMP],4\n\t"				\
+		      "11: ahi %[R_TMP],4\n\t"				\
+		      "10: vlgvb %[R_I],%%v22,7\n\t"			\
+		      "14: srlg %[R_I],%[R_I],2\n\t"			\
+		      "    agr %[R_I],%[R_TMP]\n\t"			\
+		      "    je 20f\n\t"					\
+		      /* Store characters before invalid one...  */	\
+		      "15: aghi %[R_I],-1\n\t"				\
+		      "    vstl %%v23,%[R_I],0(%[R_OUT])\n\t"		\
+		      /* ... and update pointers.  */			\
+		      "    la %[R_OUT],1(%[R_I],%[R_OUT])\n\t"		\
+		      "    sllg %[R_I],%[R_I],2\n\t"			\
+		      "    la %[R_IN],4(%[R_I],%[R_IN])\n\t"		\
+		      "20:\n\t"						\
+		      ".machine pop"					\
+		      : /* outputs */ [R_OUT] "+a" (outptr)		\
+			, [R_IN] "+a" (inptr)				\
+			, [R_LEN] "+d" (len)				\
+			, [R_LI] "=d" (loop_count)			\
+			, [R_I] "=a" (tmp2)				\
+			, [R_TMP] "=d" (tmp)				\
+		      : /* inputs */					\
+		      : /* clobber list*/ "memory", "cc"		\
+			ASM_CLOBBER_VR ("v16") ASM_CLOBBER_VR ("v17")	\
+			ASM_CLOBBER_VR ("v18") ASM_CLOBBER_VR ("v19")	\
+			ASM_CLOBBER_VR ("v20") ASM_CLOBBER_VR ("v21")	\
+			ASM_CLOBBER_VR ("v22") ASM_CLOBBER_VR ("v23")	\
+			ASM_CLOBBER_VR ("v24")				\
+		      );						\
+    if (len > 0)							\
+      {									\
+	/* Found an invalid character > 0x7f at next character.  */	\
+	BODY_ORIG_ERROR							\
+      }									\
+  }
+# define LOOP_NEED_FLAGS
+# include <iconv/loop.c>
+# include <iconv/skeleton.c>
+# undef BODY_ORIG
+# undef BODY_ORIG_ERROR
+ICONV_VX_IFUNC (__gconv_transform_internal_posix)
+
 
 /* Convert from internal UCS4 to UCS4 little endian form.  */
 # define DEFINE_INIT		0
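
Reviewer's sketch, not part of the patch: the scalar BODY_ORIG path for internal_posix_loop above accepts [0, 0x7F] and [U+DF80, U+DFFF] and rejects everything else. The standalone C below restates that mapping together with its inverse (b if b <= 0x7F else <UDF00>+b, as given in the commit message). The names internal_to_posix, posix_to_internal and posix_self_check are hypothetical and exist only for illustration; the real code reports errors through STANDARD_TO_LOOP_ERR_HANDLER rather than a return value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: mirrors the range check in BODY_ORIG.
   Returns 0 and stores one byte on success, -1 if WC has no
   representation in the POSIX charset.  */
static int
internal_to_posix (uint32_t wc, unsigned char *out)
{
  if ((wc > 0x7f && wc < 0xdf80) || wc > 0xdfff)
    return -1;
  if (wc > 0x7f)
    wc -= 0xdf00;		/* U+DF80..U+DFFF -> 0x80..0xFF.  */
  *out = (unsigned char) wc;
  return 0;
}

/* Hypothetical helper: the opposite direction, which never fails.  */
static uint32_t
posix_to_internal (unsigned char b)
{
  return b <= 0x7f ? b : (uint32_t) 0xdf00 + b;
}

/* Round-trip all 256 bytes; aborts through assert on any mismatch.  */
static void
posix_self_check (void)
{
  for (unsigned int b = 0; b <= 0xff; b++)
    {
      unsigned char out = 0;
      int rc = internal_to_posix (posix_to_internal ((unsigned char) b),
				  &out);
      assert (rc == 0);
      assert (out == b);
    }
}
```

Under these assumptions, a vectorised BODY would have to treat a hit on "value > 0x7f" as "needs the slow path" rather than "invalid", since U+DF80..U+DFFF are valid inputs here.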
diff --git a/wcsmbs/wcsmbsload.c b/wcsmbs/wcsmbsload.c
index 0f0f55f9ed..f87099bcf5 100644
--- a/wcsmbs/wcsmbsload.c
+++ b/wcsmbs/wcsmbsload.c
@@ -33,10 +33,10 @@  static const struct __gconv_step to_wc =
   .__shlib_handle = NULL,
   .__modname = NULL,
   .__counter = INT_MAX,
-  .__from_name = (char *) "ANSI_X3.4-1968//TRANSLIT",
+  .__from_name = (char *) "POSIX",
   .__to_name = (char *) "INTERNAL",
-  .__fct = __gconv_transform_ascii_internal,
-  .__btowc_fct = __gconv_btwoc_ascii,
+  .__fct = __gconv_transform_posix_internal,
+  .__btowc_fct = __gconv_btwoc_posix,
   .__init_fct = NULL,
   .__end_fct = NULL,
   .__min_needed_from = 1,
@@ -53,8 +53,8 @@  static const struct __gconv_step to_mb =
   .__modname = NULL,
   .__counter = INT_MAX,
   .__from_name = (char *) "INTERNAL",
-  .__to_name = (char *) "ANSI_X3.4-1968//TRANSLIT",
-  .__fct = __gconv_transform_internal_ascii,
+  .__to_name = (char *) "POSIX",
+  .__fct = __gconv_transform_internal_posix,
   .__btowc_fct = NULL,
   .__init_fct = NULL,
   .__end_fct = NULL,