Message ID | 94eb2c1121f421b06405406923b6@google.com |
---|---|
State | New, archived |
Headers |
Received: (qmail 91800 invoked by alias); 3 Nov 2016 17:46:32 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <gdb-patches.sourceware.org> List-Unsubscribe: <mailto:gdb-patches-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/gdb-patches/> List-Post: <mailto:gdb-patches@sourceware.org> List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: gdb-patches-owner@sourceware.org Delivered-To: mailing list gdb-patches@sourceware.org Received: (qmail 91787 invoked by uid 89); 3 Nov 2016 17:46:31 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-4.1 required=5.0 tests=AWL, BAYES_00, RCVD_IN_DNSWL_NONE, RP_MATCHES_RCVD, SPF_PASS autolearn=ham version=3.3.2 spammy=H*M:sk:94eb2c1, H*MI:sk:94eb2c1, 3336, 48838 X-HELO: mail-pa0-f74.google.com Received: from mail-pa0-f74.google.com (HELO mail-pa0-f74.google.com) (209.85.220.74) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 03 Nov 2016 17:46:30 +0000 Received: by mail-pa0-f74.google.com with SMTP id fi11so163253pac.2 for <gdb-patches@sourceware.org>; Thu, 03 Nov 2016 10:46:30 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:message-id:date:subject:from:to; bh=AU9xlNOdqkkXc3kD5TkPL85Cu7lNwBMNcO8eF7jr4qk=; b=Cd/+26HO/fu0JXVRvTqgryEmB/fQOR4hZzoLw72DwVZR01tefh6ekDCsg2CEpizS6V 39Wc0wps0hn08BOdK6nJyqNl1KEbJSE2gEKXMLIrv3mucflz1oAgTdDeRxI92Cc++Eva ai46KoQ0eLGPdtKihrzWsH5XzwiQyvb3jmwDlAx7a/GW2iN5s/VPjLvbwdFHbmVyp6ZW Qenad/K36qVIiqTuODyF5BypH7ifURkgdTJ0ZpcAweDkXDwVQ8lsqU81KE0UxdFadDtN RqLg18kfpOLrfSSOeUFFh7MLNihKwJU9FOb2+wWODJbMUUCc0Dvn00OsApmees37m3aM R5NA== X-Gm-Message-State: ABUngvdwKv4HrZkLM02ryKkXwpbm3wJrJAqDkKGBhLl4VFmf7vaehi0sMnuCExA9fBJ3PPfmeWa0JVj422y9CooBTmv4bYBBcc5MOZi0tXoOKFObnnIAThKmUoNShPVxZUi3LvcKQfKiRpkcxqJFHMo8UeTQ1x6B4b+Kl5LF+2VBSU//53Pn3Q== MIME-Version: 1.0 X-Received: by 10.98.22.196 with SMTP id 187mr2857723pfw.41.1478195188682; Thu, 03 Nov 2016 10:46:28 -0700 (PDT) Message-ID: <94eb2c1121f421b06405406923b6@google.com> Date: Thu, 03 Nov 2016 17:46:28 +0000 Subject: [PATCH, doc RFA] Fix lazy string type docs From: Doug Evans <dje@google.com> To: gdb-patches@sourceware.org, eliz@gnu.org Content-Type: text/plain; charset=UTF-8; format=flowed; delsp=yes X-IsSubscribed: yes |
Commit Message
Doug Evans
Nov. 3, 2016, 5:46 p.m. UTC
Hi. I was trying to understand a problem I was having with python lazy strings. It turns out the docs are wrong, and the "type" attribute of a lazy string is the character type, not a pointer to the character's type. Tested on amd64-linux. 2016-11-03 Doug Evans <dje@google.com> * python/py-lazy-string.c (lazy_string_object): Add comment. doc/ * guile.texi (Lazy Strings In Guile): Fix docs for lazy-string-type. * python.texi (Lazy Strings In Python): Fix docs for LazyString.type. testsuite/ * gdb.guile/scm-value.exp (test_lazy_strings): Add test for lazy-string-type. * gdb.python/py-value.exp (test_lazy_strings): Add test for LazyString.type.
Comments
> Date: Thu, 03 Nov 2016 17:46:28 +0000 > From: Doug Evans <dje@google.com> > > +Return the type of a character in @var{lazy-string}. Hmm... what is "type of a character" in this context? I have no comments to the markup or the text. Thanks.
Apologies for the resend. text/html strikes again. On Thu, Nov 3, 2016 at 11:21 AM, Eli Zaretskii <eliz@gnu.org> wrote: >> Date: Thu, 03 Nov 2016 17:46:28 +0000 >> From: Doug Evans <dje@google.com> >> >> +Return the type of a character in @var{lazy-string}. > > Hmm... what is "type of a character" in this context? > > I have no comments to the markup or the text. > > Thanks. Consider char vs wchar_t. e.g., #include <stddef.h> char foo[] = "bar"; wchar_t wide_foo[] = L"bar";
On Thu, Nov 3, 2016 at 10:46 AM, Doug Evans <dje@google.com> wrote: > Hi. > > I was trying to understand a problem I was having with python lazy strings. > It turns out the docs are wrong, and the "type" attribute of a lazy > string is the character type, not a pointer to the character's type. ... Unless the thing we're making a lazy string out of is an array instead of a pointer. const char* foo = "Dave"; const char bar[] = "No, man, I'm Dave."; (gdb) py print gdb.parse_and_eval("foo").lazy_string().type const char (gdb) py print gdb.parse_and_eval("bar").lazy_string().type const char [19] I don't have a strong opinion on what the correct answer is, but there is certainly a bug here. Phil, do you remember why this code exists in valpy_lazy_string(): if (TYPE_CODE (value_type (value)) == TYPE_CODE_PTR) value = value_ind (value); [lazy string support went in in commit be759fcf] I kinda like the type of the lazy string version of a char array still being a char array. OTOH, when I look at the definition of a lazy string, recording the character type makes more sense (to me anyway: e.g., length would be redundant for the case where type is a char array) typedef struct { PyObject_HEAD /* Holds the address of the lazy string. */ CORE_ADDR address; /* Holds the encoding that will be applied to the string when the string is printed by GDB. If the encoding is set to None then GDB will select the most appropriate encoding when the sting is printed. */ char *encoding; /* Holds the length of the string in characters. If the length is -1, then the string will be fetched and encoded up to the first null of appropriate width. */ long length; /* This attribute holds the type that is represented by the lazy string's type. */ struct type *type; } lazy_string_object; To fix this bug we're going to have to break one or the other, or add a knob (bleah) to control the old, broken, behaviour. Or mark the "type" LazyString attribute as broken/deprecated and provide a new attribute with correct behaviour. Opinions?
> From: Doug Evans <dje@google.com> > Date: Thu, 3 Nov 2016 11:30:39 -0700 > Cc: gdb-patches <gdb-patches@sourceware.org> > > On Thu, Nov 3, 2016 at 11:21 AM, Eli Zaretskii <eliz@gnu.org> wrote: > > > Date: Thu, 03 Nov 2016 17:46:28 +0000 > > From: Doug Evans <dje@google.com> > > > > +Return the type of a character in @var{lazy-string}. > > Hmm... what is "type of a character" in this context? > > I have no comments to the markup or the text. > > Thanks. > > Consider char vs wchar_t. > e.g., > #include <stddef.h> > char foo[] = "bar"; > wchar_t wide_foo[] = L"bar"; So maybe it would be more clear to say "the type of characters", plural? And perhaps give an example or two of what types these could be? Thanks.
On Thu, Nov 03, 2016 at 12:03:46PM -0700, Doug Evans wrote: > To fix this bug we're going to have to break one or the other, or add > a knob (bleah) to control the old, broken, behaviour. Adding a knob to enable the old behaviour does not really help frontends, as with the same effort a frontend adds a "set knob enabled" command it could just adapt to the new behaviour. > Or mark the "type" LazyString attribute as broken/deprecated and > provide a new attribute with correct behaviour. [...] > Opinions? To be honest, I don't really have an opinion on which is better or even correct, as it is still rather unclear to me in which situation a frontend would benefit from using gdb.LazyString at all. To me, the need to specify an encoding or rely on GDB to select something appropriate seems to be a disadvantage over keeping a (address, length) pair and using gdb.Inferior.read_memory() when needed, especially when the contents of that memory can be corrupted and not necessarily contain data of an intended encoding anymore. André PS: > [...] will select the most appropriate > encoding when the sting is printed [...] s/sting/string/ ? (also copied to guile/scm-lazy-string.c)
On Thu, Nov 3, 2016 at 12:07 PM, Eli Zaretskii <eliz@gnu.org> wrote: >> From: Doug Evans <dje@google.com> >> Date: Thu, 3 Nov 2016 11:30:39 -0700 >> Cc: gdb-patches <gdb-patches@sourceware.org> >> >> On Thu, Nov 3, 2016 at 11:21 AM, Eli Zaretskii <eliz@gnu.org> wrote: >> >> > Date: Thu, 03 Nov 2016 17:46:28 +0000 >> > From: Doug Evans <dje@google.com> >> > >> > +Return the type of a character in @var{lazy-string}. >> >> Hmm... what is "type of a character" in this context? >> >> I have no comments to the markup or the text. >> >> Thanks. >> >> Consider char vs wchar_t. >> e.g., >> #include <stddef.h> >> char foo[] = "bar"; >> wchar_t wide_foo[] = L"bar"; > > So maybe it would be more clear to say "the type of characters", > plural? And perhaps give an example or two of what types these could > be? Or how about "the type of an element in the string" ?
On Thu, Nov 3, 2016 at 12:03 PM, Doug Evans <dje@google.com> wrote: > On Thu, Nov 3, 2016 at 10:46 AM, Doug Evans <dje@google.com> wrote: >> Hi. >> >> I was trying to understand a problem I was having with python lazy strings. >> It turns out the docs are wrong, and the "type" attribute of a lazy >> string is the character type, not a pointer to the character's type. > > ... Unless the thing we're making a lazy string out of is an array > instead of a pointer. > > const char* foo = "Dave"; > const char bar[] = "No, man, I'm Dave."; > > (gdb) py print gdb.parse_and_eval("foo").lazy_string().type > const char > (gdb) py print gdb.parse_and_eval("bar").lazy_string().type > const char [19] > > I don't have a strong opinion on what the correct answer is, but there > is certainly a bug here. > > Phil, do you remember why this code exists in valpy_lazy_string(): > > if (TYPE_CODE (value_type (value)) == TYPE_CODE_PTR) > value = value_ind (value); > > [lazy string support went in in commit be759fcf] > > I kinda like the type of the lazy string version of a char array still > being a char array. > > OTOH, when I look at the definition of a lazy string, recording the > character type makes more sense > (to me anyway: e.g., length would be redundant for the case where type > is a char array) > > typedef struct { > PyObject_HEAD > /* Holds the address of the lazy string. */ > CORE_ADDR address; > > /* Holds the encoding that will be applied to the string > when the string is printed by GDB. If the encoding is set > to None then GDB will select the most appropriate > encoding when the sting is printed. */ > char *encoding; > > /* Holds the length of the string in characters. If the > length is -1, then the string will be fetched and encoded up to > the first null of appropriate width. */ > long length; > > /* This attribute holds the type that is represented by the lazy > string's type. */ > struct type *type; > } lazy_string_object; > > To fix this bug we're going to have to break one or the other, or add > a knob (bleah) to control the old, broken, behaviour. > Or mark the "type" LazyString attribute as broken/deprecated and > provide a new attribute with correct behaviour. Ok, an audit of all internal uses that I can find of lazy_string->type use the type as an element of the string. E.g.: py-pretty-print.c:print_children has: gdbpy_extract_lazy_string (py_v, &addr, &type, &length, &encoding); local_opts.addressprint = 0; val_print_string (type, encoding, addr, (int) length, stream, &local_opts); val_print_string takes a string element type as the first arg. I think the right thing to do here is go with the posted patch (after doc is ok). And I'll post another patch to fix char[] handling (I'll file a pr for that too, assuming my search fu can't find an existing one). Plus I have another patch after that to fix another aspect of lazy string handling.
> From: Doug Evans <dje@google.com> > Date: Thu, 3 Nov 2016 16:54:13 -0700 > Cc: gdb-patches <gdb-patches@sourceware.org> > > >> Consider char vs wchar_t. > >> e.g., > >> #include <stddef.h> > >> char foo[] = "bar"; > >> wchar_t wide_foo[] = L"bar"; > > > > So maybe it would be more clear to say "the type of characters", > > plural? And perhaps give an example or two of what types these could > > be? > > Or how about "the type of an element in the string" ? Yes, that'd be fine, too. Thanks.
On Thu, Nov 3, 2016 at 5:00 PM, Doug Evans <dje@google.com> wrote: > On Thu, Nov 3, 2016 at 12:03 PM, Doug Evans <dje@google.com> wrote: >> On Thu, Nov 3, 2016 at 10:46 AM, Doug Evans <dje@google.com> wrote: >>> Hi. >>> >>> I was trying to understand a problem I was having with python lazy strings. >>> It turns out the docs are wrong, and the "type" attribute of a lazy >>> string is the character type, not a pointer to the character's type. >> >> ... Unless the thing we're making a lazy string out of is an array >> instead of a pointer. >> >> const char* foo = "Dave"; >> const char bar[] = "No, man, I'm Dave."; >> >> (gdb) py print gdb.parse_and_eval("foo").lazy_string().type >> const char >> (gdb) py print gdb.parse_and_eval("bar").lazy_string().type >> const char [19] >> >> I don't have a strong opinion on what the correct answer is, but there >> is certainly a bug here. >> >> Phil, do you remember why this code exists in valpy_lazy_string(): >> >> if (TYPE_CODE (value_type (value)) == TYPE_CODE_PTR) >> value = value_ind (value); >> >> [lazy string support went in in commit be759fcf] >> >> I kinda like the type of the lazy string version of a char array still >> being a char array. >> >> OTOH, when I look at the definition of a lazy string, recording the >> character type makes more sense >> (to me anyway: e.g., length would be redundant for the case where type >> is a char array) >> >> typedef struct { >> PyObject_HEAD >> /* Holds the address of the lazy string. */ >> CORE_ADDR address; >> >> /* Holds the encoding that will be applied to the string >> when the string is printed by GDB. If the encoding is set >> to None then GDB will select the most appropriate >> encoding when the sting is printed. */ >> char *encoding; >> >> /* Holds the length of the string in characters. If the >> length is -1, then the string will be fetched and encoded up to >> the first null of appropriate width. */ >> long length; >> >> /* This attribute holds the type that is represented by the lazy >> string's type. */ >> struct type *type; >> } lazy_string_object; >> >> To fix this bug we're going to have to break one or the other, or add >> a knob (bleah) to control the old, broken, behaviour. >> Or mark the "type" LazyString attribute as broken/deprecated and >> provide a new attribute with correct behaviour. > > Ok, an audit of all internal uses that I can find of lazy_string->type > use the type as an element of the string. > > E.g.: py-pretty-print.c:print_children has: > > gdbpy_extract_lazy_string (py_v, &addr, &type, &length, &encoding); > > local_opts.addressprint = 0; > val_print_string (type, encoding, addr, (int) length, stream, > &local_opts); > > val_print_string takes a string element type as the first arg. > > I think the right thing to do here is go with the posted patch (after > doc is ok). > And I'll post another patch to fix char[] handling (I'll file a pr for > that too, assuming my search fu can't find an existing one). > Plus I have another patch after that to fix another aspect of lazy > string handling. As more data, I found the following test in py-value.exp: gdb_test "python print (lstr.type)" "const char \*." "Test lazy-string type name equality" gdb_test "python print (sptr.type)" "const char \*." "Test string type name equality" The test is, seemingly, trying to ensure the type of the lazy string is the same as the type of the string. Not unreasonable to be sure. Alas that's not what's going on: python print (lstr.type)^M const char^M (gdb) PASS: gdb.python/py-value.exp: Test lazy-string type name equality python print (sptr.type)^M const char *^M (gdb) PASS: gdb.python/py-value.exp: Test string type name equality I'm now thinking of fixing LazyString.type for char* strings so that the above "python print (lstr.type)" prints "const char *" instead of "const char". Yeah, I think it's going to break someone, but I don't want to leave things as they are and it does feel better from a u/i perspective (even with its warts). If people want me taking a different route, please speak up. btw, I found the following four lazy string bugs. I'll try to address them all as well, if time allows. https://sourceware.org/bugzilla/show_bug.cgi?id=18779 https://sourceware.org/bugzilla/show_bug.cgi?id=18439 https://sourceware.org/bugzilla/show_bug.cgi?id=17728 https://sourceware.org/bugzilla/show_bug.cgi?id=16020
On 04/11/16 17:52, Doug Evans wrote: > On Thu, Nov 3, 2016 at 5:00 PM, Doug Evans <dje@google.com> wrote: >> On Thu, Nov 3, 2016 at 12:03 PM, Doug Evans <dje@google.com> wrote: >>> On Thu, Nov 3, 2016 at 10:46 AM, Doug Evans <dje@google.com> wrote: >>>> Hi. >>>> >>> >>> (gdb) py print gdb.parse_and_eval("foo").lazy_string().type >>> const char >>> (gdb) py print gdb.parse_and_eval("bar").lazy_string().type >>> const char [19] >>> >>> I don't have a strong opinion on what the correct answer is, but there >>> is certainly a bug here. >>> >>> Phil, do you remember why this code exists in valpy_lazy_string(): >>> >>> if (TYPE_CODE (value_type (value)) == TYPE_CODE_PTR) >>> value = value_ind (value); >>> >>> [lazy string support went in in commit be759fcf] I can't remember. Too many sleeps ;) It's possible it is just a mistake and we're worrying about a bug that should just be fixed. I think lazy strings were implemented to carry around a string that remains unfetched (in cases of massive strings) until actually needed, or a pointer to that string. I also thought there was some work carried out on them by Tom (?) after the initial commit. I keep notes and I'll look on the weekend to see of I can dig out specific implementation details. Cheers Phil
On Fri, Nov 4, 2016 at 12:20 PM, Phil Muldoon <pmuldoon@redhat.com> wrote: > On 04/11/16 17:52, Doug Evans wrote: >> On Thu, Nov 3, 2016 at 5:00 PM, Doug Evans <dje@google.com> wrote: >>> On Thu, Nov 3, 2016 at 12:03 PM, Doug Evans <dje@google.com> wrote: >>>> On Thu, Nov 3, 2016 at 10:46 AM, Doug Evans <dje@google.com> wrote: >>>>> Hi. >>>>> > >>>> >>>> (gdb) py print gdb.parse_and_eval("foo").lazy_string().type >>>> const char >>>> (gdb) py print gdb.parse_and_eval("bar").lazy_string().type >>>> const char [19] >>>> >>>> I don't have a strong opinion on what the correct answer is, but there >>>> is certainly a bug here. >>>> >>>> Phil, do you remember why this code exists in valpy_lazy_string(): >>>> >>>> if (TYPE_CODE (value_type (value)) == TYPE_CODE_PTR) >>>> value = value_ind (value); >>>> >>>> [lazy string support went in in commit be759fcf] > > I can't remember. Too many sleeps ;) It's possible it is just a > mistake and we're worrying about a bug that should just be fixed. > > I think lazy strings were implemented to carry around a string that > remains unfetched (in cases of massive strings) until actually needed, > or a pointer to that string. I also thought there was some work > carried out on them by Tom (?) after the initial commit. > > I keep notes and I'll look on the weekend to see of I can dig out > specific implementation details. Hi Phil. Don't spend too much time on this. I think I'm at a point where I'm comfortable with sending a patch I'd like committed.
diff --git a/gdb/python/py-lazy-string.c b/gdb/python/py-lazy-string.c index d4b40df..49b3744 100644 --- a/gdb/python/py-lazy-string.c +++ b/gdb/python/py-lazy-string.c @@ -40,8 +40,7 @@ typedef struct { the first null of appropriate width. */ long length; - /* This attribute holds the type that is represented by the lazy - string's type. */ + /* This attribute holds the type of a character in the string. */ struct type *type; } lazy_string_object; diff --git a/gdb/doc/guile.texi b/gdb/doc/guile.texi index 0030f3f..a413802 100644 --- a/gdb/doc/guile.texi +++ b/gdb/doc/guile.texi @@ -3247,9 +3247,7 @@ most appropriate encoding when the string is printed. @end deffn @deffn {Scheme Procedure} lazy-string-type lazy-string -Return the type that is represented by @var{lazy-string}'s type. -For a lazy string this will always be a pointer type. To -resolve this to the lazy string's character type, use @code{type-target-type}. +Return the type of a character in @var{lazy-string}. @xref{Types In Guile}. @end deffn diff --git a/gdb/doc/python.texi b/gdb/doc/python.texi index d6507e5..63fca68 100644 --- a/gdb/doc/python.texi +++ b/gdb/doc/python.texi @@ -4883,11 +4883,8 @@ is not writable. @end defvar @defvar LazyString.type -This attribute holds the type that is represented by the lazy string's -type. For a lazy string this will always be a pointer type. To -resolve this to the lazy string's character type, use the type's -@code{target} method. @xref{Types In Python}. This attribute is not -writable. +This attribute holds the type of a character in the string. +@xref{Types In Python}. This attribute is not writable. @end defvar @node Architectures In Python diff --git a/gdb/testsuite/gdb.guile/scm-value.exp b/gdb/testsuite/gdb.guile/scm-value.exp index 1d07c9f..02730ae 100644 --- a/gdb/testsuite/gdb.guile/scm-value.exp +++ b/gdb/testsuite/gdb.guile/scm-value.exp @@ -241,6 +241,8 @@ proc test_lazy_strings {} { "= 0" "Test lazy string length" gdb_test "gu (print (lazy-string-address snstr))" \ "= 0" "Test lazy string address" + gdb_test "gu (print (lazy-string-type snstr))" \ + "= const char" "Test lazy string type" } proc test_inferior_function_call {} { diff --git a/gdb/testsuite/gdb.python/py-value.exp b/gdb/testsuite/gdb.python/py-value.exp index 89be659..3ed1801 100644 --- a/gdb/testsuite/gdb.python/py-value.exp +++ b/gdb/testsuite/gdb.python/py-value.exp @@ -333,6 +333,7 @@ proc test_lazy_strings {} { gdb_py_test_silent_cmd "python snstr = snptr.lazy_string(length=0)" "Succesfully create a lazy string" 1 gdb_test "python print (snstr.length)" "0" "Test lazy string length" gdb_test "python print (snstr.address)" "0" "Test lazy string address" + gdb_test "python print (snstr.type)" "const char" "Test lazy string type" }