Message ID | 55FA325D.1020207@codesourcery.com |
---|---|
State | New, archived |
Headers |
Received: (qmail 26323 invoked by alias); 17 Sep 2015 03:25:11 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <gdb-patches.sourceware.org> List-Unsubscribe: <mailto:gdb-patches-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/gdb-patches/> List-Post: <mailto:gdb-patches@sourceware.org> List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: gdb-patches-owner@sourceware.org Delivered-To: mailing list gdb-patches@sourceware.org Received: (qmail 26298 invoked by uid 89); 17 Sep 2015 03:25:08 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.1 required=5.0 tests=AWL, BAYES_00, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=ham version=3.3.2 X-HELO: relay1.mentorg.com Received: from relay1.mentorg.com (HELO relay1.mentorg.com) (192.94.38.131) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 17 Sep 2015 03:25:04 +0000 Received: from svr-orw-fem-05.mgc.mentorg.com ([147.34.97.43]) by relay1.mentorg.com with esmtp id 1ZcPoj-0001zS-M0 from Sandra_Loosemore@mentor.com for gdb-patches@sourceware.org; Wed, 16 Sep 2015 20:25:01 -0700 Received: from [IPv6:::1] (147.34.91.1) by svr-orw-fem-05.mgc.mentorg.com (147.34.97.43) with Microsoft SMTP Server id 14.3.224.2; Wed, 16 Sep 2015 20:25:00 -0700 Message-ID: <55FA325D.1020207@codesourcery.com> Date: Wed, 16 Sep 2015 21:24:13 -0600 From: Sandra Loosemore <sandra@codesourcery.com> User-Agent: Mozilla/5.0 (X11; Linux i686; rv:17.0) Gecko/20130329 Thunderbird/17.0.5 MIME-Version: 1.0 To: gdb-patches <gdb-patches@sourceware.org> Subject: [patch, testsuite] check for UTF-32 target wide charset support in gdb.base/wchar.exp Content-Type: multipart/mixed; boundary="------------000508090209040500030806" |
Commit Message
Sandra Loosemore
Sept. 17, 2015, 3:24 a.m. UTC
This patch is related to the one I posted yesterday to make with_target_charset do something reasonable in the absence of ICONV support: https://sourceware.org/ml/gdb-patches/2015-09/msg00357.html If GDB is configured without ICONV support, the target wide charset defaults to "ISO-8859-1" (which isn't even a wide charset), and all the wide strings in this test print as gibberish. Otherwise, GDB seems to think the default is "auto; currently UTF-32", so let's make the dependency on UTF-32 explicit here and bail out if it's not available. OK to commit? -Sandra
Comments
Sandra Loosemore <sandra@codesourcery.com> writes: > This patch is related to the one I posted yesterday to make > with_target_charset do something reasonable in the absence of ICONV > support: > > https://sourceware.org/ml/gdb-patches/2015-09/msg00357.html > > If GDB is configured without ICONV support, the target wide charset > defaults to "ISO-8859-1" (which isn't even a wide charset), and all > the wide strings in this test print as gibberish. Otherwise, GDB > seems to think the default is "auto; currently UTF-32", so let's make > the dependency on UTF-32 explicit here and bail out if it's not > available. > > OK to commit? > > -Sandra > > > 2015-09-16 Sandra Loosemore <sandra@codesourcery.com> > > gdb/testsuite/ > * gdb.base/wchar.exp: Require UTF-32 target wide charset support, > otherwise skip this test. > > diff --git a/gdb/testsuite/gdb.base/wchar.exp b/gdb/testsuite/gdb.base/wchar.exp > index 1a5a2d4..171385b 100644 > --- a/gdb/testsuite/gdb.base/wchar.exp > +++ b/gdb/testsuite/gdb.base/wchar.exp > @@ -24,6 +24,19 @@ if ![runto "wchar.c:$bp_location" ] then { > return -1 > } > > +# This test requires wide character support in GDB. > +# Setting the charset may fail if GDB was configured without > +# ICONV support. > +gdb_test_multiple "set target-wide-charset UTF-32" "" { > + -re "Undefined item.*$gdb_prompt " { > + unsupported "Unknown charset UTF-32" > + return -1 > + } > + -re ".*$gdb_prompt " { > + pass "set target-wide-charset UTF-32" > + } > +} > + > gdb_test "print narrow" "= 97 L'a'" > > gdb_test "print single" "= 48879 L'\\\\xbeef'" Yeah, the default for the !iconv case doesn't seem right: #ifdef PHONY_ICONV /* Provide a phony iconv that does as little as possible. Also, arrange for there to be a single available character set. */ #undef GDB_DEFAULT_HOST_CHARSET #define GDB_DEFAULT_HOST_CHARSET "ISO-8859-1" #define GDB_DEFAULT_TARGET_CHARSET "ISO-8859-1" #define GDB_DEFAULT_TARGET_WIDE_CHARSET "ISO-8859-1" <<<< How reasonable is it to enhance the PHONY_ICONV support so that it handles this better? I see it already tries to provide some minimal functionality: static iconv_t phony_iconv_open (const char *to, const char *from) { /* We allow conversions from UTF-32BE, wchar_t, and the host charset. We allow conversions to wchar_t and the host charset. */ if (strcmp (from, "UTF-32BE") && strcmp (from, "wchar_t") && strcmp (from, GDB_DEFAULT_HOST_CHARSET)) return -1; if (strcmp (to, "wchar_t") && strcmp (to, GDB_DEFAULT_HOST_CHARSET)) return -1; /* Return 1 if we are converting from UTF-32BE, 0 otherwise. This is used as a flag in calls to iconv. */ return !strcmp (from, "UTF-32BE"); } I don't know, off hand, why big endian is supported and not little endian.
> Date: Wed, 16 Sep 2015 21:24:13 -0600 > From: Sandra Loosemore <sandra@codesourcery.com> > > If GDB is configured without ICONV support, the target wide charset > defaults to "ISO-8859-1" (which isn't even a wide charset), and all the > wide strings in this test print as gibberish. Otherwise, GDB seems to > think the default is "auto; currently UTF-32", so let's make the > dependency on UTF-32 explicit here and bail out if it's not available. Why UTF-32, hard-coded? Why not allow also UTF-16, for example?
> From: Doug Evans <xdje42@gmail.com> > Cc: gdb-patches <gdb-patches@sourceware.org> > Date: Wed, 16 Sep 2015 21:30:46 -0700 > > #undef GDB_DEFAULT_HOST_CHARSET > #define GDB_DEFAULT_HOST_CHARSET "ISO-8859-1" > #define GDB_DEFAULT_TARGET_CHARSET "ISO-8859-1" > #define GDB_DEFAULT_TARGET_WIDE_CHARSET "ISO-8859-1" <<<< > > How reasonable is it to enhance the PHONY_ICONV support so that > it handles this better? You mean, have functions like strlen and strcat handle 32-bit wchar_t strings? Not reasonable. > I see it already tries to provide some minimal functionality: > > static iconv_t > phony_iconv_open (const char *to, const char *from) > { > /* We allow conversions from UTF-32BE, wchar_t, and the host charset. > We allow conversions to wchar_t and the host charset. */ > if (strcmp (from, "UTF-32BE") && strcmp (from, "wchar_t") > && strcmp (from, GDB_DEFAULT_HOST_CHARSET)) > return -1; > if (strcmp (to, "wchar_t") && strcmp (to, GDB_DEFAULT_HOST_CHARSET)) > return -1; > > /* Return 1 if we are converting from UTF-32BE, 0 otherwise. This is > used as a flag in calls to iconv. */ > return !strcmp (from, "UTF-32BE"); > } > > I don't know, off hand, why big endian is supported and not little endian. Supported by whom?
On 09/16/2015 11:07 PM, Eli Zaretskii wrote: >> Date: Wed, 16 Sep 2015 21:24:13 -0600 >> From: Sandra Loosemore <sandra@codesourcery.com> >> >> If GDB is configured without ICONV support, the target wide charset >> defaults to "ISO-8859-1" (which isn't even a wide charset), and all the >> wide strings in this test print as gibberish. Otherwise, GDB seems to >> think the default is "auto; currently UTF-32", so let's make the >> dependency on UTF-32 explicit here and bail out if it's not available. > > Why UTF-32, hard-coded? Why not allow also UTF-16, for example? It looked to me like the default target wide charset is UTF-32 if you don't pick one explicitly. Since the test as currently written doesn't, the patterns the .exp file is trying to match must assume the default target wide charset, and not some other wide charset that might happen to be supported. If I'm confused and the default charset might not always be UTF-32 if ICONV is present, how about changing the testcase to bail if it sees the default wide charset is ISO-8859-1? That means either ICONV is not present or GDB's default is otherwise wrongly configured. -Sandra
> Date: Wed, 16 Sep 2015 23:16:37 -0600 > From: Sandra Loosemore <sandra@codesourcery.com> > CC: <gdb-patches@sourceware.org> > > If I'm confused and the default charset might not always be UTF-32 if > ICONV is present, how about changing the testcase to bail if it sees the > default wide charset is ISO-8859-1? That means either ICONV is not > present or GDB's default is otherwise wrongly configured. If the test makes no sense without ICONV, I think it indeed will be better to skip it.
Eli Zaretskii <eliz@gnu.org> writes: >> From: Doug Evans <xdje42@gmail.com> >> Cc: gdb-patches <gdb-patches@sourceware.org> >> Date: Wed, 16 Sep 2015 21:30:46 -0700 >> >> #undef GDB_DEFAULT_HOST_CHARSET >> #define GDB_DEFAULT_HOST_CHARSET "ISO-8859-1" >> #define GDB_DEFAULT_TARGET_CHARSET "ISO-8859-1" >> #define GDB_DEFAULT_TARGET_WIDE_CHARSET "ISO-8859-1" <<<< >> >> How reasonable is it to enhance the PHONY_ICONV support so that >> it handles this better? > > You mean, have functions like strlen and strcat handle 32-bit wchar_t > strings? Not reasonable. Yikes. No, I didn't mean that. :-) I meant have the PHONY_ICONV case use a better value for GDB_DEFAULT_TARGET_WIDE_CHARSET. >> I see it already tries to provide some minimal functionality: >> >> static iconv_t >> phony_iconv_open (const char *to, const char *from) >> { >> /* We allow conversions from UTF-32BE, wchar_t, and the host charset. >> We allow conversions to wchar_t and the host charset. */ >> if (strcmp (from, "UTF-32BE") && strcmp (from, "wchar_t") >> && strcmp (from, GDB_DEFAULT_HOST_CHARSET)) >> return -1; >> if (strcmp (to, "wchar_t") && strcmp (to, GDB_DEFAULT_HOST_CHARSET)) >> return -1; >> >> /* Return 1 if we are converting from UTF-32BE, 0 otherwise. This is >> used as a flag in calls to iconv. */ >> return !strcmp (from, "UTF-32BE"); >> } >> >> I don't know, off hand, why big endian is supported and not little endian. > > Supported by whom? By PHONY_ICONV. [which is more of a minimal iconv than a phony iconv, but whatever]
Eli Zaretskii <eliz@gnu.org> writes: >> Date: Wed, 16 Sep 2015 21:24:13 -0600 >> From: Sandra Loosemore <sandra@codesourcery.com> >> >> If GDB is configured without ICONV support, the target wide charset >> defaults to "ISO-8859-1" (which isn't even a wide charset), and all the >> wide strings in this test print as gibberish. Otherwise, GDB seems to >> think the default is "auto; currently UTF-32", so let's make the >> dependency on UTF-32 explicit here and bail out if it's not available. > > Why UTF-32, hard-coded? Why not allow also UTF-16, for example? Because this is the case of no iconv support and thus gdb is supplying its own minimal version. No point in supporting anything unwarranted. If a compelling case arises for UTF-16 we can cross that bridge when we get to it.
Sandra Loosemore <sandra@codesourcery.com> writes: > On 09/16/2015 11:07 PM, Eli Zaretskii wrote: >>> Date: Wed, 16 Sep 2015 21:24:13 -0600 >>> From: Sandra Loosemore <sandra@codesourcery.com> >>> >>> If GDB is configured without ICONV support, the target wide charset >>> defaults to "ISO-8859-1" (which isn't even a wide charset), and all the >>> wide strings in this test print as gibberish. Otherwise, GDB seems to >>> think the default is "auto; currently UTF-32", so let's make the >>> dependency on UTF-32 explicit here and bail out if it's not available. >> >> Why UTF-32, hard-coded? Why not allow also UTF-16, for example? > > It looked to me like the default target wide charset is UTF-32 if you > don't pick one explicitly. Since the test as currently written > doesn't, the patterns the .exp file is trying to match must assume the > default target wide charset, and not some other wide charset that > might happen to be supported. > > If I'm confused and the default charset might not always be UTF-32 if > ICONV is present, how about changing the testcase to bail if it sees > the default wide charset is ISO-8859-1? That means either ICONV is > not present or GDB's default is otherwise wrongly configured. The question I have is what's the intent behind PHONY_ICONV? [The fact that it apparently tries to provide some minimal wide char support but defines GDB_DEFAULT_TARGET_WIDE_CHARSET as "ISO-8859-1" seems like a bug to me.] Is it intended that it provide some minimal wide char support? And if so, my preference would be to keep the test if it's easy.
On 09/18/2015 05:47 AM, Doug Evans wrote: > The question I have is what's the intent behind PHONY_ICONV? > [The fact that it apparently tries to provide some minimal wide char > support but defines GDB_DEFAULT_TARGET_WIDE_CHARSET as "ISO-8859-1" > seems like a bug to me.] > > Is it intended that it provide some minimal wide char support? I recall the feature being submitted and the follow up Solaris fixes, but I wasn't the one doing the work, so I'm not 100% sure of anything, but I believe the intention is to punt on wide char support with the least extra code possible. That seems to be what the original submission suggests as well: https://sourceware.org/ml/gdb-patches/2009-01/msg00533.html https://sourceware.org/ml/gdb-patches/2009-03/msg00420.html along with the comments in mails around the follow up Solaris fixes: https://www.sourceware.org/ml/gdb-patches/2010-08/msg00339.html Thanks, Pedro Alves
diff --git a/gdb/testsuite/gdb.base/wchar.exp b/gdb/testsuite/gdb.base/wchar.exp index 1a5a2d4..171385b 100644 --- a/gdb/testsuite/gdb.base/wchar.exp +++ b/gdb/testsuite/gdb.base/wchar.exp @@ -24,6 +24,19 @@ if ![runto "wchar.c:$bp_location" ] then { return -1 } +# This test requires wide character support in GDB. +# Setting the charset may fail if GDB was configured without +# ICONV support. +gdb_test_multiple "set target-wide-charset UTF-32" "" { + -re "Undefined item.*$gdb_prompt " { + unsupported "Unknown charset UTF-32" + return -1 + } + -re ".*$gdb_prompt " { + pass "set target-wide-charset UTF-32" + } +} + gdb_test "print narrow" "= 97 L'a'" gdb_test "print single" "= 48879 L'\\\\xbeef'"