Message ID | 1401192650-29688-1-git-send-email-yao@codesourcery.com |
---|---|
State | Superseded |
Headers |
From: Yao Qi <yao@codesourcery.com> To: <gdb-patches@sourceware.org> Subject: [PATCH] Different outputs affected by locale Date: Tue, 27 May 2014 20:10:50 +0800 Message-ID: <1401192650-29688-1-git-send-email-yao@codesourcery.com> MIME-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 8bit |
Commit Message
Yao Qi
May 27, 2014, 12:10 p.m. UTC
We see the following failures in the GDB testsuite on a mingw host.

FAIL: gdb.base/wchar.exp: print repeat
FAIL: gdb.base/wchar.exp: print repeat_p
FAIL: gdb.base/wchar.exp: print repeat (print null on)
FAIL: gdb.base/wchar.exp: print repeat (print elements 3)
FAIL: gdb.base/wchar.exp: print repeat_p (print elements 3)

print repeat^M
$7 = L"A", '¢' <repeats 21 times>, "B", '\000' <repeats 104 times>^M
(gdb) FAIL: gdb.base/wchar.exp: print repeat

The test expects \242, but a cent sign is displayed.

In valprint.c:print_wchar, wchar_printable is called to determine
whether a wchar is printable. wchar_printable calls iswprint, but
iswprint's return value depends on the LC_CTYPE setting of the locale [1, 2].
The output may vary with different locale settings. I noticed that
gdb.exp:gdb_init sets LC_CTYPE to C. If I remove that line, the tests
fail on native testing too.

IMO, either \242 or '¢' (cent sign) is a correct output; which one is
shown is affected by the locale, and it is not related to gdb at all.

[1] http://pubs.opengroup.org/onlinepubs/009604499/functions/iswprint.html
[2] msdn.microsoft.com/en-us/library/ewx8s4kw.aspx

This patch adds code that runs 'p repeat[1]' to extract the cent first,
and then uses it to match in the following tests.

gdb/testsuite:

2014-05-27 Yao Qi <yao@codesourcery.com>

* gdb.base/wchar.exp: Execute command 'p repeat[1]' and extract
cent from the output.
---
gdb/testsuite/gdb.base/wchar.exp | 18 +++++++++++++++++-
1 file changed, 17 insertions(+), 1 deletion(-)
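The mechanism described in the commit message can be modelled in a few lines. This is a Python sketch, not GDB code: `format_wchar` and the two predicates are hypothetical names, and the predicates only stand in for what iswprint might report under two different LC_CTYPE settings (an ASCII-only locale and one that also treats the Latin-1 range as printable, both assumptions made for illustration).

```python
def format_wchar(code_point, is_printable):
    """Render one wide character the way the thread describes:
    printable characters are shown literally, others as an octal escape."""
    if is_printable(code_point):
        return chr(code_point)
    # Three-digit octal escape, matching the '\000' seen in the log.
    return "\\{:03o}".format(code_point)


def printable_in_ascii_only_locale(code_point):
    # Assumption: stand-in for iswprint under a locale that only knows ASCII.
    return 0x20 <= code_point <= 0x7E


def printable_in_latin1_locale(code_point):
    # Assumption: stand-in for iswprint under a locale that also treats
    # the Latin-1 range 0xA0-0xFF as printable.
    return 0x20 <= code_point <= 0x7E or 0xA0 <= code_point <= 0xFF


CENT = 162  # value of repeat[1] in the test, i.e. octal 242

escaped = format_wchar(CENT, printable_in_ascii_only_locale)
literal = format_wchar(CENT, printable_in_latin1_locale)
```

With the ASCII-only predicate, `escaped` holds the four characters `\242`; with the Latin-1 predicate, `literal` holds `¢`. The same value is rendered in two ways depending only on the answer to "is this printable?", which is the difference the failing tests expose.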
Comments
On 05/27/2014 08:10 PM, Yao Qi wrote:
> We see the following failures in the GDB testsuite on a mingw host.
>
> FAIL: gdb.base/wchar.exp: print repeat
> FAIL: gdb.base/wchar.exp: print repeat_p
> FAIL: gdb.base/wchar.exp: print repeat (print null on)
> FAIL: gdb.base/wchar.exp: print repeat (print elements 3)
> FAIL: gdb.base/wchar.exp: print repeat_p (print elements 3)
>
> print repeat^M
> $7 = L"A", '¢' <repeats 21 times>, "B", '\000' <repeats 104 times>^M
> (gdb) FAIL: gdb.base/wchar.exp: print repeat
>
> The test expects \242, but a cent sign is displayed.
>
> In valprint.c:print_wchar, wchar_printable is called to determine
> whether a wchar is printable. wchar_printable calls iswprint, but
> iswprint's return value depends on the LC_CTYPE setting of the locale [1, 2].
> The output may vary with different locale settings. I noticed that
> gdb.exp:gdb_init sets LC_CTYPE to C. If I remove that line, the tests
> fail on native testing too.
>
> IMO, either \242 or '¢' (cent sign) is a correct output; which one is
> shown is affected by the locale, and it is not related to gdb at all.
>
> [1] http://pubs.opengroup.org/onlinepubs/009604499/functions/iswprint.html
> [2] msdn.microsoft.com/en-us/library/ewx8s4kw.aspx
>
> This patch adds code that runs 'p repeat[1]' to extract the cent first,
> and then uses it to match in the following tests.
>
> gdb/testsuite:
>
> 2014-05-27 Yao Qi <yao@codesourcery.com>
>
> * gdb.base/wchar.exp: Execute command 'p repeat[1]' and extract
> cent from the output.

Ping.
> > 2014-05-27 Yao Qi <yao@codesourcery.com>
> >
> > * gdb.base/wchar.exp: Execute command 'p repeat[1]' and extract
> > cent from the output.

This is a patch that I felt would be better reviewed by Tom, but
we'd have to wait for him to be back. When I read your patch,
I thought that the approach you took was weakening the test a little,
because if GDB started printing the character incorrectly, you would
not notice it anymore.
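The weakening can be illustrated with a small sketch. It is Python rather than the testsuite's Tcl, and the helper names `derive_expectation` and `repeat_test_passes` as well as the sample GDB outputs are made up for illustration. It mimics the idea of the patch: take whatever GDB prints for 'p repeat[1]', escape it, and use it as the expected text for 'print repeat'.

```python
import re


def derive_expectation(element_output):
    """Mimic the patch: pull whatever GDB printed between the quotes
    in the 'p repeat[1]' output and turn it into a literal regexp."""
    match = re.search(r" = 162 L'(.*)'", element_output)
    if match is None:
        return None
    return re.escape(match.group(1))


def repeat_test_passes(element_output, repeat_output):
    """Check the 'print repeat' output against the derived expectation."""
    cent = derive_expectation(element_output)
    if cent is None:
        return False
    pattern = "= L\"A\", '" + cent + "' <repeats 21 times>, \"B"
    return re.search(pattern, repeat_output) is not None


# Hypothetical outputs from a GDB that renders the character wrongly
# but consistently (here as '?').
broken_element = "$1 = 162 L'?'"
broken_repeat = "$2 = L\"A\", '?' <repeats 21 times>, \"B\""

print(repeat_test_passes(broken_element, broken_repeat))
```

The check passes even though `?` is not a plausible rendering of the character, because the expected value was taken from GDB's own output rather than fixed in the test.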
On 06/04/2014 08:47 PM, Joel Brobecker wrote:
>>> 2014-05-27 Yao Qi <yao@codesourcery.com>
>>>
>>> * gdb.base/wchar.exp: Execute command 'p repeat[1]' and extract
>>> cent from the output.
>
> This is a patch that I felt would be better reviewed by Tom, but
> we'd have to wait for him to be back. When I read your patch,
> I thought that the approach you took was weakening the test a little,
> because if GDB started printing the character incorrectly, you would
> not notice it anymore.
>

The character printed by GDB in this case is out of the control of GDB,
IMO. IOW, we can't tell what character printed is correct and what is
incorrect. Or we can relax the pattern to match either \242 or '¢'
(cent sign) in the test. WDYT?
> The character printed by GDB in this case is out of the control of GDB,
> IMO.

IIRC, it is under GDB's control a little bit, by way of how it decodes
multibyte characters?

> IOW, we can't tell what character printed is correct and what is
> incorrect. Or we can relax the pattern to match either \242 or '¢'
> (cent sign) in the test. WDYT?

That would have been my first approach, but I would prefer it if
someone who knows better about encodings commented on that.
I could be wrong!
>>>>> "Yao" == Yao Qi <yao@codesourcery.com> writes:
Yao> The character printed by GDB in this case is out of the control of GDB,
Yao> IMO. IOW, we can't tell what character printed is correct and what is
Yao> incorrect. Or we can relax the pattern to match either \242 or '¢'
Yao> (cent sign) in the test. WDYT?
I think that would be preferable. It is more conservative for the
reason Joel pointed out; and should we encounter a system that emits
something else, it is easy to update the test at that time.
I am not really a great standards lawyer but my first reaction is that
mingw's C locale is not conforming. At least from:
http://pubs.opengroup.org/onlinepubs/009604499/basedefs/xbd_chap07.html
.. it seems to me that \242 is not defined as a 'print' character in the
LC_CTYPE section. Though I'd like to reiterate that I don't actually
trust my own reading of that text.
Tom
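The relaxed pattern that Yao proposed and Tom prefers could look roughly like this. The sketch is Python, not the Tcl used by wchar.exp; `CENT_PATTERN`, `REPEAT_PATTERN`, `matches_repeat` and the sample strings are illustrative names and data modelled on the log earlier in the thread, not part of any posted patch.

```python
import re

# Accept either the octal escape \242 or a literal cent sign (U+00A2).
CENT_PATTERN = r"(?:\\242|\u00a2)"
REPEAT_PATTERN = "= L\"A\", '" + CENT_PATTERN + "' <repeats 21 times>, \"B"

# Template for the 'print repeat' output; {} is the rendered character.
SAMPLE = "$7 = L\"A\", '{}' <repeats 21 times>, \"B\""


def matches_repeat(output):
    """Return True if the output shows one of the two accepted renderings."""
    return re.search(REPEAT_PATTERN, output) is not None


for rendering in ("\\242", "\u00a2", "?"):
    print(matches_repeat(SAMPLE.format(rendering)))
```

Unlike an expectation derived from GDB's own output, this pattern still rejects any rendering other than the two known ones, and a third alternative can be added if another system turns out to print something else.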
On 06/04/2014 09:15 PM, Tom Tromey wrote:
> I am not really a great standards lawyer but my first reaction is that
> mingw's C locale is not conforming. At least from:
>
> http://pubs.opengroup.org/onlinepubs/009604499/basedefs/xbd_chap07.html
>
> .. it seems to me that \242 is not defined as a 'print' character in the
> LC_CTYPE section. Though I'd like to reiterate that I don't actually
> trust my own reading of that text.

I wonder whether this is really a mingw issue, or whether this is
a remote host testing issue. That is, aren't we setting LC_CTYPE
on the _build_ (where expect runs), not on the host (mingw, through ssh)?
Is LC_CTYPE really being propagated to the host? Does testing GDB
manually directly on a Windows console show the same issue?
diff --git a/gdb/testsuite/gdb.base/wchar.exp b/gdb/testsuite/gdb.base/wchar.exp
index 4290478..215d2f4 100644
--- a/gdb/testsuite/gdb.base/wchar.exp
+++ b/gdb/testsuite/gdb.base/wchar.exp
@@ -36,7 +36,23 @@ gdb_test "print simple\[2\]" "= 99 L'c'"
 
 gdb_test "print difficile\[2\]" "= 65261 L'\\\\xfeed'"
 
-set cent "\\\\242"
+# The contents of 'repeat' are shown differently under different
+# locales.  Instead of hard-coding the cent sign in variable 'cent',
+# extract it from the output of 'print repeat[1]', and use it to
+# match the output in the following tests.
+set cent ""
+set test "get cent"
+gdb_test_multiple "p repeat\[1\]" $test {
+    -re " = 162 L'(.*)'.*\r\n$gdb_prompt $" {
+        set cent [string_to_regexp $expect_out(1,string)]
+        pass $test
+    }
+    -re ".*$gdb_prompt $" {
+        fail $test
+        return
+    }
+}
+
 gdb_test "print repeat" "= L\"A\", '$cent' <repeats 21 times>, \"B.*"
 
 global hex