fix wcwidth to work with gcc 16 sign extension (Re: wchar.h: tweak wcwidth prototype parameter wchar_t -> wint_t)

Message ID 49097df7-bf9a-4710-8e5a-74765d839791@towo.net
State New
Series fix wcwidth to work with gcc 16 sign extension (Re: wchar.h: tweak wcwidth prototype parameter wchar_t -> wint_t)

Commit Message

Thomas Wolff June 3, 2026, 12:43 a.m. UTC
  As discussed in the cygwin thread, I withdraw my previous patch and 
provide the attached one to fix wcwidth for gcc 16.
Thomas

On 01.06.2026 at 18:02, Thomas Wolff wrote:
>
> On 31.05.2026 at 13:57, Takashi Yano wrote:
>> Hi Thomas,
>>
>> On Sun, 31 May 2026 10:06:12 +0200
>> Thomas Wolff wrote:
>>> Hi Brian,
>>>
>>> On 31.05.2026 at 05:50, Brian Inglis via Cygwin wrote:
>>>> On 2026-05-28 22:58, Thomas Wolff wrote:
>>>>> to make it compliant with newlib and the manual page;
>>>>> fixes cases of wrong width calculation:
>>>>> https://cygwin.com/pipermail/cygwin/2026-April/259597.html
>>>>> as mentioned in
>>>>> https://cygwin.com/pipermail/cygwin/2026-May/259734.html
>>>>> as described in
>>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125451#c14
>>>>> attachment:
>>>> 0001-wchar.h-tweak-wcwidth-prototype-parameter-wchar_t-wi.patch
>>>>
>>>> The existing wcwidth declaration in newlib/libc/include/wchar.h agrees
>>>> with POSIX 8 SUS V5.
>>>>
>>>> It is the man doc, definition, and implementation in
>>>> newlib/libc/string/wcwidth.c which need to be changed to match the
>>>> specification and return codes in:
>>>>
>>>>      https://pubs.opengroup.org/onlinepubs/9799919799/functions/wcwidth.html
>>>>
>>> Your argument overlooks one significant deviation: in POSIX, wchar_t
>>> has 32 bits, in cygwin only 16.
>>> So to make wcwidth work for *all* Unicode character code points, the
>>> 32-bit version must be used.
>>> I have confirmed by testing that this fixes the broken test case with
>>> gcc 16 that I had reported to the cygwin list.
>> However, newlib is not used only by Cygwin, so I think newlib itself
>> should follow POSIX. Shouldn't we have our own wcwidth() implementation
>> for Cygwin?
>>
>> On second thought, since a 16-bit wchar_t needs to be converted to a
>> 32-bit Unicode code point, especially for surrogate pairs, we cannot use
>> wcwidth in the same way as Linux does. I wonder what the correct
>> approach would be.
> I don't think there is a "correct" approach, as POSIX probably did not
> consider this problem.
> But I just responded to a cute idea on the cygwin mailing list, which
> was unfeasible, but I modified it with a proposal to return width 1 for
> a high surrogate, remember it, and then return 1 or 0 for the low
> surrogate, respectively.
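
The surrogate proposal quoted above can be sketched roughly as follows. This is only an illustration of the idea as I read it, not the newlib or Cygwin implementation: `toy_width`, `combine_surrogates`, `width16` and the `pending_high` state are hypothetical names, the width table is a tiny stand-in for the real one, and returning -1 for a low surrogate without a preceding high surrogate is my own assumption.

```c
#include <stdint.h>

/* Hypothetical stand-in for a real width table: a few wide ranges only. */
static int toy_width(uint32_t cp)
{
    if ((cp >= 0x4E00 && cp <= 0x9FFF) ||      /* CJK Unified Ideographs */
        (cp >= 0x1F300 && cp <= 0x1F64F) ||    /* some emoji blocks */
        (cp >= 0x20000 && cp <= 0x2FFFD))      /* supplementary ideographs */
        return 2;
    return 1;
}

/* Combine a UTF-16 surrogate pair into a 32-bit code point. */
static uint32_t combine_surrogates(uint16_t hi, uint16_t lo)
{
    return 0x10000u + (((uint32_t)hi - 0xD800u) << 10)
                    + ((uint32_t)lo - 0xDC00u);
}

/* Remembered high surrogate; 0 means "none pending". */
static uint16_t pending_high = 0;

/* Width of one 16-bit unit, following the quoted proposal:
   a high surrogate reports 1 and is remembered, the following low
   surrogate reports the remainder (1 for a wide character, 0 otherwise). */
static int width16(uint16_t u)
{
    if (u >= 0xD800 && u <= 0xDBFF) {          /* high surrogate */
        pending_high = u;
        return 1;
    }
    if (u >= 0xDC00 && u <= 0xDFFF) {          /* low surrogate */
        if (pending_high == 0)
            return -1;                         /* assumption: unpaired is an error */
        uint32_t cp = combine_surrogates(pending_high, u);
        pending_high = 0;
        return toy_width(cp) - 1;
    }
    pending_high = 0;                          /* BMP character, drop stale state */
    return toy_width(u);
}
```

With this sketch, the two units of U+1F600 (0xD83D, 0xDE00) report 1 and 1, while the two units of U+10400 (0xD801, 0xDC00) report 1 and 0. The remembered state is a plain static variable, so the sketch is not thread-safe and gives wrong answers if callers query units out of order.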
From 08f00a6599810e8aa5fc4b456a5b183635ce91fc Mon Sep 17 00:00:00 2001
From: Thomas Wolff <towo@towo.net>
Date: Wed, 3 Jun 2026 00:00:00 +0000
Subject: [PATCH] sync wcwidth parameter width with prototype in wchar.h

in order to exclude undefined behaviour on parameter width extension
as arose with gcc 16, see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125451,
fixes https://cygwin.com/pipermail/cygwin/2026-April/259597.html
---
 newlib/libc/string/wcwidth.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
  

Comments

Jeff Johnston June 4, 2026, 3:33 p.m. UTC | #1
Patch applied.  Thanks.

-- Jeff J.

  

Patch

diff --git a/newlib/libc/string/wcwidth.c b/newlib/libc/string/wcwidth.c
index 8348eefe8..dfcaa6c21 100644
--- a/newlib/libc/string/wcwidth.c
+++ b/newlib/libc/string/wcwidth.c
@@ -230,7 +230,9 @@  __wcwidth (const wint_t ucs)
 }
 
 int
-wcwidth (const wint_t wc)
+wcwidth (const wchar_t wc)
+// parameter width must be in sync with prototype in wchar.h
+// to exclude undefined behaviour on parameter width extension (e.g. gcc 16)
 {
   wint_t wi = wc;
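
For readers wondering why the parameter type of the definition has to match the prototype, here is a rough model in plain C. It does not reproduce the actual gcc 16 behaviour from the linked bug report. It assumes that a caller compiled against a 16-bit `wchar_t` prototype sign-extends the argument into a 32-bit slot, and it simulates that slot explicitly. All names (`wchar16`, `wint32`, `slot_sign_extended`, `read_as_wint`, `read_as_wchar`) are hypothetical.

```c
#include <stdint.h>

typedef uint16_t wchar16;   /* stands in for a 16-bit wchar_t */
typedef uint32_t wint32;    /* stands in for a 32-bit wint_t */

/* Assumed caller behaviour: the 16-bit value is sign-extended into
   a 32-bit argument slot, so values >= 0x8000 get their upper half set. */
static uint32_t slot_sign_extended(wchar16 wc)
{
    return (wc & 0x8000u) ? (0xFFFF0000u | wc) : (uint32_t)wc;
}

/* Callee defined with the wider type: it trusts all 32 bits of the slot. */
static wint32 read_as_wint(uint32_t slot)
{
    return slot;
}

/* Callee defined with the 16-bit type: it uses only the low 16 bits,
   then widens, as in `wint_t wi = wc;` in the patch. */
static wint32 read_as_wchar(uint32_t slot)
{
    wchar16 wc = (wchar16)slot;
    wint32 wi = wc;
    return wi;
}
```

Under these assumptions, U+FF21 arrives as 0xFFFFFF21 in the slot: `read_as_wint` sees that out-of-range value, while `read_as_wchar` recovers 0xFF21. Characters below 0x8000, such as U+4E2D, are unaffected either way, which would be consistent with only some widths being miscalculated.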