gnulib's errno module was imported

Message ID 5465EBAD.3070108@redhat.com
State New, archived

Commit Message

Pedro Alves Nov. 14, 2014, 11:46 a.m. UTC
  On 11/14/2014 05:44 AM, Yao Qi wrote:

> However, we've already had a conclusion that we don't import gnulib's
> errno module because it has some compatibility issues with libiconv
> (discussed in https://sourceware.org/ml/gdb-patches/2012-12/msg00554.html).
> AFAICS, the argument for not having the errno module at that time is
> still valid today.

I think this will keep haunting and blocking us until we fix it.

Can we reevaluate this?

To recap, the issue is that GNU iconv does this:

/* Get errno declaration and values. */
#include <errno.h>
/* Some systems, like SunOS 4, don't have EILSEQ. Some systems, like BSD/OS,
   have EILSEQ in a different header.  On these systems, define EILSEQ
   ourselves. */
#ifndef EILSEQ
#define EILSEQ @EILSEQ@
#endif

That's in:

 http://git.savannah.gnu.org/cgit/libiconv.git/tree/include/iconv.h.in

The "different header" mentioned is wchar.h.  This is handled in:

  http://git.savannah.gnu.org/cgit/libiconv.git/tree/m4/eilseq.m4

which defines @EILSEQ@ to ENOENT if EILSEQ isn't found in
either errno.h or wchar.h.

As we dropped support for both SunOS 4 and old BSD/OS, maybe we
don't need to care about the wchar.h issue anymore.
Still, AFAICS, gnulib's m4/errno_h.m4 doesn't know that EILSEQ may be
defined in wchar.h, so on such systems, ISTM gnulib ends up defining
an incompatible EILSEQ itself.  I think that should be fixed on
the gnulib side, by making it extract the EILSEQ value out of the
system's wchar.h, like GNU iconv does.

So that leaves handling the case of gnulib making up an EILSEQ value,
which we take as meaning the system really doesn't define it.
Those are the systems where GNU iconv returns ENOENT instead.
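
To make the mismatch concrete, here is a minimal sketch of what goes wrong
on such a system. Everything prefixed with sketch_ or SKETCH_ is
hypothetical, and the value 9999 is invented for illustration; it is not
the number gnulib actually picks. The stand-in "library" reports an invalid
sequence the way GNU iconv's fallback would (ENOENT), while the caller
compares against a different, made-up EILSEQ.

```c
#include <errno.h>

/* Placeholder for the value gnulib makes up when the system has no
   EILSEQ.  The number is invented for this sketch.  */
#define SKETCH_GNULIB_EILSEQ 9999

/* GNU iconv's fallback on the same system, per the recap above.  */
#define SKETCH_LIBICONV_EILSEQ ENOENT

/* Stand-in for an iconv call that hits an invalid input sequence.  */
int
sketch_iconv_fail (void)
{
  errno = SKETCH_LIBICONV_EILSEQ;
  return -1;
}

/* Caller built against the made-up EILSEQ.  Returns 1 if it recognizes
   the failure as an invalid sequence, 0 otherwise.  */
int
sketch_caller_recognizes (void)
{
  if (sketch_iconv_fail () == -1)
    return errno == SKETCH_GNULIB_EILSEQ;
  return 0;
}

/* Same caller, but mapping ENOENT to its own EILSEQ before checking.  */
int
sketch_caller_recognizes_mapped (void)
{
  if (sketch_iconv_fail () == -1)
    {
      if (errno == ENOENT)
        errno = SKETCH_GNULIB_EILSEQ;
      return errno == SKETCH_GNULIB_EILSEQ;
    }
  return 0;
}
```

Without the mapping the caller never sees "its" EILSEQ, so the invalid
sequence falls through to whatever the default error path is; with the
mapping it is recognized.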

With that rationale, how about we try something like this?

The current EILSEQ definition under PHONY_ICONV is obviously stale
as gnulib guarantees there's always an EILSEQ defined.

From ccf843befc9750bb0b8fb18296c1352b9ddef855 Mon Sep 17 00:00:00 2001
From: Pedro Alves <palves@redhat.com>
Date: Fri, 14 Nov 2014 10:29:03 +0000
Subject: [PATCH] handle iconv defining EILSEQ to ENOENT

---
 gdb/charset.c | 35 +++++++++++++++++++++++++----------
 1 file changed, 25 insertions(+), 10 deletions(-)
  

Comments

Yao Qi Nov. 14, 2014, 1:01 p.m. UTC | #1
Pedro Alves <palves@redhat.com> writes:

> I think this will keep haunting and blocking us until we fix it.
>

Let us try again to fix it.

> Can we reevaluate this?

Sure.

> So that leaves handling the case of gnulib making up an EILSEQ value,
> which we take as meaning the system really doesn't define it.
> Those are the systems where GNU iconv returns ENOENT instead.
>
> With that rationale, how about we try something like this?

I am fine with your approach, but I wonder why we don't simply
check for ENOENT in the places where we check EILSEQ?

@@ -513,6 +513,7 @@ convert_between_encodings (const char *from, const char *to,
 	  switch (errno)
 	    {
 	    case EILSEQ:
+	    case ENOENT:
 	      {
 		int i;
 
@@ -651,6 +652,7 @@ wchar_iterate (struct wchar_iterator *iter,
 	  switch (errno)
 	    {
 	    case EILSEQ:
+	    case ENOENT:
 	      /* Invalid input sequence.  We still might have
 		 converted a character; if so, return it.  */
 	      if (out_avail < out_request * sizeof (gdb_wchar_t))

This looks cleaner to me (some comments should be added, of course).
  
Pedro Alves Nov. 14, 2014, 1:21 p.m. UTC | #2
On 11/14/2014 01:01 PM, Yao Qi wrote:
> Pedro Alves <palves@redhat.com> writes:
> 
>> I think this will keep haunting and blocking us until we fix it.
>>
> 
> Let us try again to fix it.
> 
>> Can we reevaluate this?
> 
> Sure.
> 
>> So that leaves handling the case of gnulib making up an EILSEQ value,
>> which we take as meaning the system really doesn't define it.
>> Those are the systems where GNU iconv returns ENOENT instead.
>>
>> With that rationale, how about we try something like this?
> 
> I am fine with your approach, but I wonder why we don't simply
> check for ENOENT in the places where we check EILSEQ?
> 
> @@ -513,6 +513,7 @@ convert_between_encodings (const char *from, const char *to,
>  	  switch (errno)
>  	    {
>  	    case EILSEQ:
> +	    case ENOENT:
>  	      {
>  		int i;
>  
> @@ -651,6 +652,7 @@ wchar_iterate (struct wchar_iterator *iter,
>  	  switch (errno)
>  	    {
>  	    case EILSEQ:
> +	    case ENOENT:
>  	      /* Invalid input sequence.  We still might have
>  		 converted a character; if so, return it.  */
>  	      if (out_avail < out_request * sizeof (gdb_wchar_t))
> 
> This looks cleaner to me (some comments should be added, of course).

That was actually my first approach, but then:

 - I thought that having a central place to handle this
   and to put the comment was cleaner than repeating the fix
   in multiple places.
 - That won't build on systems where EILSEQ and ENOENT are
   defined to the same value (two switch cases with the same value).
   Not sure there are any such systems, but given iconv.h's practice...

I guess I could also simplify and remove the GNULIB_defined_EILSEQ
guard, mapping ENOENT to EILSEQ everywhere?

+/* On systems that don't have EILSEQ, GNU iconv's iconv.h defines it
+   to ENOENT.  gnulib instead defines it to a different value.  On
+   such systems, map ENOENT to gnulib's EILSEQ, leaving callers
+   agnostic.  */
+#ifdef GNULIB_defined_EILSEQ

I looked at glibc's iconv and it seems that ENOENT is never used
there, so should be safe.

Thanks,
Pedro Alves
  
Yao Qi Nov. 14, 2014, 1:53 p.m. UTC | #3
Pedro Alves <palves@redhat.com> writes:

> That was actually my first approach, but then:
>
>  - I thought that having a central place to handle this
>    and to put the comment was cleaner than repeating the fix
>    in multiple places.
>  - That won't build on systems where EILSEQ and ENOENT are
>    defined to the same value (two switch cases with the same value).
>    Not sure there are any such systems, but given iconv.h's practice...
>
> I guess I could also simplify and remove the GNULIB_defined_EILSEQ
> guard, mapping ENOENT to EILSEQ everywhere?

I don't have a strong feeling on this, so either is OK with me.

>
> +/* On systems that don't have EILSEQ, GNU iconv's iconv.h defines it
> +   to ENOENT.  gnulib instead defines it to a different value.  On
> +   such systems, map ENOENT to gnulib's EILSEQ, leaving callers
> +   agnostic.  */
> +#ifdef GNULIB_defined_EILSEQ
>
> I looked at glibc's iconv and it seems that ENOENT is never used
> there, so should be safe.

Good, could you please commit your patch?
  
Eli Zaretskii Nov. 14, 2014, 2:31 p.m. UTC | #4
> Date: Fri, 14 Nov 2014 13:21:01 +0000
> From: Pedro Alves <palves@redhat.com>
> CC: gdb-patches@sourceware.org, gregory.0xf0@gmail.com, Joel Brobecker <brobecker@adacore.com>
> 
> > @@ -651,6 +652,7 @@ wchar_iterate (struct wchar_iterator *iter,
> >  	  switch (errno)
> >  	    {
> >  	    case EILSEQ:
> > +	    case ENOENT:
> >  	      /* Invalid input sequence.  We still might have
> >  		 converted a character; if so, return it.  */
> >  	      if (out_avail < out_request * sizeof (gdb_wchar_t))
> > 
> > This looks cleaner to me (some comments should be added, of course).
> 
> That was actually my first approach, but then:
> 
>  - I thought that having a central place to handle this
>    and to put the comment was cleaner than repeating the fix
>    in multiple places.
>  - That won't build on systems where EILSEQ and ENOENT are
>    defined to the same value (two switch cases with the same value).
>    Not sure there are any such systems, but given iconv.h's practice...

The last one is easy:

  	    case EILSEQ:
 +#if EILSEQ != ENOENT
 +	    case ENOENT:
 +#endif
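
For reference, the guard above can be tried in isolation with a small
sketch like the following. The function name sketch_is_invalid_sequence is
made up for this example and is not part of gdb; the #ifndef fallback
mirrors what iconv.h does so that the sketch also compiles on a system
without EILSEQ.

```c
#include <errno.h>

/* Fall back the way GNU iconv's iconv.h does if the system has no
   EILSEQ, so both names are always usable below.  */
#ifndef EILSEQ
#define EILSEQ ENOENT
#endif

/* Return 1 if ERR should be treated as "invalid input sequence",
   0 otherwise.  The preprocessor guard keeps the two case labels from
   colliding when EILSEQ and ENOENT expand to the same value.  */
int
sketch_is_invalid_sequence (int err)
{
  switch (err)
    {
    case EILSEQ:
#if EILSEQ != ENOENT
    case ENOENT:
#endif
      return 1;
    default:
      return 0;
    }
}
```

Because both macros expand to integer constants, the comparison is
resolved by the preprocessor and the switch builds either way.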
  
Pedro Alves Nov. 14, 2014, 2:42 p.m. UTC | #5
On 11/14/2014 02:31 PM, Eli Zaretskii wrote:
>> From: Pedro Alves <palves@redhat.com>

>> That was actually my first approach, but then:
>>
>>  - I thought that having a central place to handle this
>>    and to put the comment was cleaner than repeating the fix
>>    in multiple places.
>>  - That won't build on systems where EILSEQ and ENOENT are
>>    defined to the same value (two switch cases with the same value).
>>    Not sure there are any such systems, but given iconv.h's practice...
> 
> The last one is easy:
> 
>   	    case EILSEQ:
>  +#if EILSEQ != ENOENT
>  +	    case ENOENT:
>  +#endif

Agreed, but then having to do that in multiple places
seems even uglier.  :-)

Thanks,
Pedro Alves
  

Patch

diff --git a/gdb/charset.c b/gdb/charset.c
index 94ad020..d71321e 100644
--- a/gdb/charset.c
+++ b/gdb/charset.c
@@ -95,15 +95,6 @@ 
 #undef ICONV_CONST
 #define ICONV_CONST const
 
-/* Some systems don't have EILSEQ, so we define it here, but not as
-   EINVAL, because callers of `iconv' want to distinguish EINVAL and
-   EILSEQ.  This is what iconv.h from libiconv does as well.  Note
-   that wchar.h may also define EILSEQ, so this needs to be after we
-   include wchar.h, which happens in defs.h through gdb_wchar.h.  */
-#ifndef EILSEQ
-#define EILSEQ ENOENT
-#endif
-
 static iconv_t
 phony_iconv_open (const char *to, const char *from)
 {
@@ -187,8 +178,32 @@  phony_iconv (iconv_t utf_flag, const char **inbuf, size_t *inbytesleft,
   return 0;
 }
 
-#endif
+#else /* PHONY_ICONV */
+
+/* On systems that don't have EILSEQ, GNU iconv's iconv.h defines it
+   to ENOENT.  gnulib instead defines it to a different value.  On
+   such systems, map ENOENT to gnulib's EILSEQ, leaving callers
+   agnostic.  */
+#ifdef GNULIB_defined_EILSEQ
+
+static size_t
+gdb_iconv (iconv_t utf_flag, ICONV_CONST char **inbuf, size_t *inbytesleft,
+	   char **outbuf, size_t *outbytesleft)
+{
+  size_t ret;
+
+  ret = iconv (utf_flag, inbuf, inbytesleft, outbuf, outbytesleft);
+  if (errno == ENOENT)
+    errno = EILSEQ;
+  return ret;
+}
 
+#undef iconv
+#define iconv gdb_iconv
+
+#endif /* GNULIB_defined_EILSEQ */
+
+#endif /* PHONY_ICONV */
 
 
 /* The global lists of character sets and translations.  */
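
One possible variation on the wrapper in the patch above, offered only as
a sketch: gdb_iconv as posted rewrites errno whenever it equals ENOENT,
whether or not iconv reported a failure. The version below maps the value
only when iconv returns (size_t) -1. The names sketch_map_iconv_errno and
sketch_iconv are hypothetical, the plain char ** input type is an
assumption (the patch uses ICONV_CONST to cope with both prototypes), and
nothing in the thread says this stricter check is needed in practice.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/* Keep the sketch buildable on a system without EILSEQ.  */
#ifndef EILSEQ
#define EILSEQ ENOENT
#endif

/* Given iconv's return value RET and the errno value ERR it left
   behind, return the errno value callers should see.  ENOENT is
   mapped to EILSEQ only when the call actually failed.  */
int
sketch_map_iconv_errno (size_t ret, int err)
{
  if (ret == (size_t) -1 && err == ENOENT)
    return EILSEQ;
  return err;
}

/* Thin wrapper around iconv that applies the mapping above.  */
size_t
sketch_iconv (iconv_t cd, char **inbuf, size_t *inbytesleft,
	      char **outbuf, size_t *outbytesleft)
{
  size_t ret = iconv (cd, inbuf, inbytesleft, outbuf, outbytesleft);

  errno = sketch_map_iconv_errno (ret, errno);
  return ret;
}
```

Splitting the mapping into its own function keeps it testable without a
conversion descriptor; callers would still go through the wrapper, as
with the #define iconv gdb_iconv approach in the patch.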