diff mbox

[testsuite] check for UTF-32 target wide charset support in gdb.base/wchar.exp

Message ID 55FA325D.1020207@codesourcery.com
State New
Headers show

Commit Message

Sandra Loosemore Sept. 17, 2015, 3:24 a.m. UTC
This patch is related to the one I posted yesterday to make 
with_target_charset do something reasonable in the absence of ICONV support:

https://sourceware.org/ml/gdb-patches/2015-09/msg00357.html

If GDB is configured without ICONV support, the target wide charset 
defaults to "ISO-8859-1" (which isn't even a wide charset), and all the 
wide strings in this test print as gibberish.  Otherwise, GDB seems to 
think the default is "auto; currently UTF-32", so let's make the 
dependency on UTF-32 explicit here and bail out if it's not available.

OK to commit?

-Sandra

Comments

Doug Evans Sept. 17, 2015, 4:30 a.m. UTC | #1
Sandra Loosemore <sandra@codesourcery.com> writes:
> This patch is related to the one I posted yesterday to make
> with_target_charset do something reasonable in the absence of ICONV
> support:
>
> https://sourceware.org/ml/gdb-patches/2015-09/msg00357.html
>
> If GDB is configured without ICONV support, the target wide charset
> defaults to "ISO-8859-1" (which isn't even a wide charset), and all
> the wide strings in this test print as gibberish.  Otherwise, GDB
> seems to think the default is "auto; currently UTF-32", so let's make
> the dependency on UTF-32 explicit here and bail out if it's not
> available.
>
> OK to commit?
>
> -Sandra
>
>
> 2015-09-16  Sandra Loosemore  <sandra@codesourcery.com>
>
> 	gdb/testsuite/
> 	* gdb.base/wchar.exp: Require UTF-32 target wide charset support,
> 	otherwise skip this test.
>
> diff --git a/gdb/testsuite/gdb.base/wchar.exp b/gdb/testsuite/gdb.base/wchar.exp
> index 1a5a2d4..171385b 100644
> --- a/gdb/testsuite/gdb.base/wchar.exp
> +++ b/gdb/testsuite/gdb.base/wchar.exp
> @@ -24,6 +24,19 @@ if ![runto "wchar.c:$bp_location" ] then {
>    return -1
>  }
>  
> +# This test requires wide character support in GDB.
> +# Setting the charset may fail if GDB was configured without
> +# ICONV support.
> +gdb_test_multiple "set target-wide-charset UTF-32" "" {
> +    -re "Undefined item.*$gdb_prompt " {
> +	unsupported "Unknown charset UTF-32"
> +	return -1
> +    }
> +    -re ".*$gdb_prompt " {
> +	pass "set target-wide-charset UTF-32"
> +    }
> +}
> +
>  gdb_test "print narrow" "= 97 L'a'"
>  
>  gdb_test "print single" "= 48879 L'\\\\xbeef'"

Yeah, the default for the !iconv case doesn't seem right:

#ifdef PHONY_ICONV

/* Provide a phony iconv that does as little as possible.  Also,
   arrange for there to be a single available character set.  */

#undef GDB_DEFAULT_HOST_CHARSET
#define GDB_DEFAULT_HOST_CHARSET "ISO-8859-1"
#define GDB_DEFAULT_TARGET_CHARSET "ISO-8859-1"
#define GDB_DEFAULT_TARGET_WIDE_CHARSET "ISO-8859-1" <<<<

How reasonable is it to enhance the PHONY_ICONV support so that
it handles this better?
I see it already tries to provide some minimal functionality:

static iconv_t
phony_iconv_open (const char *to, const char *from)
{
  /* We allow conversions from UTF-32BE, wchar_t, and the host charset.
     We allow conversions to wchar_t and the host charset.  */
  if (strcmp (from, "UTF-32BE") && strcmp (from, "wchar_t")
      && strcmp (from, GDB_DEFAULT_HOST_CHARSET))
    return -1;
  if (strcmp (to, "wchar_t") && strcmp (to, GDB_DEFAULT_HOST_CHARSET))
    return -1;

  /* Return 1 if we are converting from UTF-32BE, 0 otherwise.  This is
     used as a flag in calls to iconv.  */
  return !strcmp (from, "UTF-32BE");
}

I don't know, off hand, why big endian is supported and not little endian.
Eli Zaretskii Sept. 17, 2015, 5:07 a.m. UTC | #2
> Date: Wed, 16 Sep 2015 21:24:13 -0600
> From: Sandra Loosemore <sandra@codesourcery.com>
> 
> If GDB is configured without ICONV support, the target wide charset 
> defaults to "ISO-8859-1" (which isn't even a wide charset), and all the 
> wide strings in this test print as gibberish.  Otherwise, GDB seems to 
> think the default is "auto; currently UTF-32", so let's make the 
> dependency on UTF-32 explicit here and bail out if it's not available.

Why UTF-32, hard-coded?  Why not allow also UTF-16, for example?
Eli Zaretskii Sept. 17, 2015, 5:10 a.m. UTC | #3
> From: Doug Evans <xdje42@gmail.com>
> Cc: gdb-patches <gdb-patches@sourceware.org>
> Date: Wed, 16 Sep 2015 21:30:46 -0700
> 
> #undef GDB_DEFAULT_HOST_CHARSET
> #define GDB_DEFAULT_HOST_CHARSET "ISO-8859-1"
> #define GDB_DEFAULT_TARGET_CHARSET "ISO-8859-1"
> #define GDB_DEFAULT_TARGET_WIDE_CHARSET "ISO-8859-1" <<<<
> 
> How reasonable is it to enhance the PHONY_ICONV support so that
> it handles this better?

You mean, have functions like strlen and strcat handle 32-bit wchar_t
strings?  Not reasonable.

> I see it already tries to provide some minimal functionality:
> 
> static iconv_t
> phony_iconv_open (const char *to, const char *from)
> {
>   /* We allow conversions from UTF-32BE, wchar_t, and the host charset.
>      We allow conversions to wchar_t and the host charset.  */
>   if (strcmp (from, "UTF-32BE") && strcmp (from, "wchar_t")
>       && strcmp (from, GDB_DEFAULT_HOST_CHARSET))
>     return -1;
>   if (strcmp (to, "wchar_t") && strcmp (to, GDB_DEFAULT_HOST_CHARSET))
>     return -1;
> 
>   /* Return 1 if we are converting from UTF-32BE, 0 otherwise.  This is
>      used as a flag in calls to iconv.  */
>   return !strcmp (from, "UTF-32BE");
> }
> 
> I don't know, off hand, why big endian is supported and not little endian.

Supported by whom?
Sandra Loosemore Sept. 17, 2015, 5:16 a.m. UTC | #4
On 09/16/2015 11:07 PM, Eli Zaretskii wrote:
>> Date: Wed, 16 Sep 2015 21:24:13 -0600
>> From: Sandra Loosemore <sandra@codesourcery.com>
>>
>> If GDB is configured without ICONV support, the target wide charset
>> defaults to "ISO-8859-1" (which isn't even a wide charset), and all the
>> wide strings in this test print as gibberish.  Otherwise, GDB seems to
>> think the default is "auto; currently UTF-32", so let's make the
>> dependency on UTF-32 explicit here and bail out if it's not available.
>
> Why UTF-32, hard-coded?  Why not allow also UTF-16, for example?

It looked to me like the default target wide charset is UTF-32 if you 
don't pick one explicitly.  Since the test as currently written doesn't, 
the patterns the .exp file is trying to match must assume the default 
target wide charset, and not some other wide charset that might happen 
to be supported.

If I'm confused and the default charset might not always be UTF-32 if 
ICONV is present, how about changing the testcase to bail if it sees the 
default wide charset is ISO-8859-1?  That means either ICONV is not 
present or GDB's default is otherwise wrongly configured.

-Sandra
Eli Zaretskii Sept. 17, 2015, 5:59 a.m. UTC | #5
> Date: Wed, 16 Sep 2015 23:16:37 -0600
> From: Sandra Loosemore <sandra@codesourcery.com>
> CC: <gdb-patches@sourceware.org>
> 
> If I'm confused and the default charset might not always be UTF-32 if 
> ICONV is present, how about changing the testcase to bail if it sees the 
> default wide charset is ISO-8859-1?  That means either ICONV is not 
> present or GDB's default is otherwise wrongly configured.

If the test makes no sense without ICONV, I think it indeed will be
better to skip it.
Doug Evans Sept. 18, 2015, 4:36 a.m. UTC | #6
Eli Zaretskii <eliz@gnu.org> writes:
>> From: Doug Evans <xdje42@gmail.com>
>> Cc: gdb-patches <gdb-patches@sourceware.org>
>> Date: Wed, 16 Sep 2015 21:30:46 -0700
>> 
>> #undef GDB_DEFAULT_HOST_CHARSET
>> #define GDB_DEFAULT_HOST_CHARSET "ISO-8859-1"
>> #define GDB_DEFAULT_TARGET_CHARSET "ISO-8859-1"
>> #define GDB_DEFAULT_TARGET_WIDE_CHARSET "ISO-8859-1" <<<<
>> 
>> How reasonable is it to enhance the PHONY_ICONV support so that
>> it handles this better?
>
> You mean, have functions like strlen and strcat handle 32-bit wchar_t
> strings?  Not reasonable.

Yikes. No, I didn't mean that. :-)
I meant have the PHONY_ICONV case use a better value for
GDB_DEFAULT_TARGET_WIDE_CHARSET.

>> I see it already tries to provide some minimal functionality:
>> 
>> static iconv_t
>> phony_iconv_open (const char *to, const char *from)
>> {
>>   /* We allow conversions from UTF-32BE, wchar_t, and the host charset.
>>      We allow conversions to wchar_t and the host charset.  */
>>   if (strcmp (from, "UTF-32BE") && strcmp (from, "wchar_t")
>>       && strcmp (from, GDB_DEFAULT_HOST_CHARSET))
>>     return -1;
>>   if (strcmp (to, "wchar_t") && strcmp (to, GDB_DEFAULT_HOST_CHARSET))
>>     return -1;
>> 
>>   /* Return 1 if we are converting from UTF-32BE, 0 otherwise.  This is
>>      used as a flag in calls to iconv.  */
>>   return !strcmp (from, "UTF-32BE");
>> }
>> 
>> I don't know, off hand, why big endian is supported and not little endian.
>
> Supported by whom?

By PHONY_ICONV.
[which is more of a minimal iconv than a phony iconv, but whatever]
Doug Evans Sept. 18, 2015, 4:40 a.m. UTC | #7
Eli Zaretskii <eliz@gnu.org> writes:
>> Date: Wed, 16 Sep 2015 21:24:13 -0600
>> From: Sandra Loosemore <sandra@codesourcery.com>
>> 
>> If GDB is configured without ICONV support, the target wide charset 
>> defaults to "ISO-8859-1" (which isn't even a wide charset), and all the 
>> wide strings in this test print as gibberish.  Otherwise, GDB seems to 
>> think the default is "auto; currently UTF-32", so let's make the 
>> dependency on UTF-32 explicit here and bail out if it's not available.
>
> Why UTF-32, hard-coded?  Why not allow also UTF-16, for example?

Because this is the case of no iconv support and thus gdb is supplying its own
minimal version. No point in supporting anything unwarranted.
If a compelling case arises for UTF-16 we can cross that bridge
when we get to it.
Doug Evans Sept. 18, 2015, 4:47 a.m. UTC | #8
Sandra Loosemore <sandra@codesourcery.com> writes:
> On 09/16/2015 11:07 PM, Eli Zaretskii wrote:
>>> Date: Wed, 16 Sep 2015 21:24:13 -0600
>>> From: Sandra Loosemore <sandra@codesourcery.com>
>>>
>>> If GDB is configured without ICONV support, the target wide charset
>>> defaults to "ISO-8859-1" (which isn't even a wide charset), and all the
>>> wide strings in this test print as gibberish.  Otherwise, GDB seems to
>>> think the default is "auto; currently UTF-32", so let's make the
>>> dependency on UTF-32 explicit here and bail out if it's not available.
>>
>> Why UTF-32, hard-coded?  Why not allow also UTF-16, for example?
>
> It looked to me like the default target wide charset is UTF-32 if you
> don't pick one explicitly.  Since the test as currently written
> doesn't, the patterns the .exp file is trying to match must assume the
> default target wide charset, and not some other wide charset that
> might happen to be supported.
>
> If I'm confused and the default charset might not always be UTF-32 if
> ICONV is present, how about changing the testcase to bail if it sees
> the default wide charset is ISO-8859-1?  That means either ICONV is
> not present or GDB's default is otherwise wrongly configured.

The question I have is what's the intent behind PHONY_ICONV?
[The fact that it apparently tries to provide some minimal wide char
support but defines GDB_DEFAULT_TARGET_WIDE_CHARSET as "ISO-8859-1"
seems like a bug to me.]

Is it intended that it provide some minimal wide char support?
And if so, my preference would be to keep the test if it's easy.
Pedro Alves Sept. 29, 2015, 1:50 p.m. UTC | #9
On 09/18/2015 05:47 AM, Doug Evans wrote:

> The question I have is what's the intent behind PHONY_ICONV?
> [The fact that it apparently tries to provide some minimal wide char
> support but defines GDB_DEFAULT_TARGET_WIDE_CHARSET as "ISO-8859-1"
> seems like a bug to me.]
> 
> Is it intended that it provide some minimal wide char support?

I recall the feature being submitted and the follow up Solaris fixes,
but I wasn't the one doing the work, so I'm not 100% sure of anything,
but I believe the intention is to punt on wide char support with the least
extra code possible.  That seems to be what the original submission
suggests as well:

 https://sourceware.org/ml/gdb-patches/2009-01/msg00533.html
 https://sourceware.org/ml/gdb-patches/2009-03/msg00420.html

along with the comments in mails around the follow up Solaris fixes:

 https://www.sourceware.org/ml/gdb-patches/2010-08/msg00339.html

Thanks,
Pedro Alves
diff mbox

Patch

diff --git a/gdb/testsuite/gdb.base/wchar.exp b/gdb/testsuite/gdb.base/wchar.exp
index 1a5a2d4..171385b 100644
--- a/gdb/testsuite/gdb.base/wchar.exp
+++ b/gdb/testsuite/gdb.base/wchar.exp
@@ -24,6 +24,19 @@  if ![runto "wchar.c:$bp_location" ] then {
   return -1
 }
 
+# This test requires wide character support in GDB.
+# Setting the charset may fail if GDB was configured without
+# ICONV support.
+gdb_test_multiple "set target-wide-charset UTF-32" "" {
+    -re "Undefined item.*$gdb_prompt " {
+	unsupported "Unknown charset UTF-32"
+	return -1
+    }
+    -re ".*$gdb_prompt " {
+	pass "set target-wide-charset UTF-32"
+    }
+}
+
 gdb_test "print narrow" "= 97 L'a'"
 
 gdb_test "print single" "= 48879 L'\\\\xbeef'"