diff mbox

gdb: Allow struct fields named double

Message ID 20181215232414.10883-1-andrew.burgess@embecosm.com
State New
Headers show

Commit Message

Andrew Burgess Dec. 15, 2018, 11:24 p.m. UTC
The patch below fixes an issue with the current was floating point is
handled on RISC-V.  We currently model 64-bit floating point as a
union with fields 'double' and 'float'.  The problem is that having a
field named 'double' isn't currently valid in GDB (though float is fine).

The easiest solution might be to change the field names... but the
patch below instead extends the C parser to allow double as a field
name.

I'd be interested in hearing feedback on the below,

Thanks,
Andrew

---

The 64-bit RISC-V target currently models the floating point registers
as having the following type:

    union riscv_double
    {
        builtin_type_ieee_single float;
        builtin_type_ieee_double double;
    }

Notice the choice of names for the fields of this struct, possibly not
ideal choices, as these are not valid field names in C.  However, this
type is only ever defined within GDB (or in the target description),
and no restriction seems to exist on the field names in that case.

The problem though is that currently:

    (gdb) info registers $ft0
    ft0            {float = 0, double = 0}	(raw 0x0000000000000000)
    (gdb) p $ft0.float
    $1 = 0
    (gdb) p $ft0.double
    A syntax error in expression, near `double'.

We can access the 'float' field, but not the 'double' field.  This is
because the string 'double' is handled differently to the string
'float' in c-exp.y.

In both cases the string '$ft0' is parsed as a VARIABLE expression.

In the 'float' case, the string 'float' becomes a generic NAME token
in 'lex_one_token', which then allows the rule "exp '.' name" to match
and the field name lookup to occur.

The 'double' case is different.  In order to allow parsing of the type
string 'long double', the 'double' string becomes the token
DOUBLE_KEYWORD.  At this point there's no rule to match "exp '.'
DOUBLE_KEYWORD", so we can never lookup the field named 'double'.

We could rename the fields for RISC-V, and maybe that would be the
best solution.  However, its not hard to allow for fields named
'double', which is what this patch does.

A new case is added to the 'name' rule to match the DOUBLE_KEYWORD,
and create a suitable 'struct stoken'.  With this done the "exp '.'
name" pattern can now match, and we can lookup the double field.

With this patch in place I now see this behaviour:

    (gdb) info registers $ft0
    ft0            {float = 0, double = 0}	(raw 0x0000000000000000)
    (gdb) p $ft0.float
    $1 = 0
    (gdb) p $ft0.double
    $2 = 0

This change was tested on x86-64 GNU/Linux with no regressions.

gdb/ChangeLog:

	* c-exp.y (name): Allow DOUBLE_KEYWORD to act as a name.
	(typename_stoken): New function.
---
 gdb/ChangeLog |  5 +++++
 gdb/c-exp.y   | 17 +++++++++++++++++
 2 files changed, 22 insertions(+)

Comments

Simon Marchi Dec. 16, 2018, 4:23 p.m. UTC | #1
On 2018-12-15 6:24 p.m., Andrew Burgess wrote:
> The patch below fixes an issue with the current was floating point is
> handled on RISC-V.  We currently model 64-bit floating point as a
> union with fields 'double' and 'float'.  The problem is that having a
> field named 'double' isn't currently valid in GDB (though float is fine).
> 
> The easiest solution might be to change the field names... but the
> patch below instead extends the C parser to allow double as a field
> name.
> 
> I'd be interested in hearing feedback on the below,
> 
> Thanks,
> Andrew
> 
> ---
> 
> The 64-bit RISC-V target currently models the floating point registers
> as having the following type:
> 
>     union riscv_double
>     {
>         builtin_type_ieee_single float;
>         builtin_type_ieee_double double;
>     }
> 
> Notice the choice of names for the fields of this struct, possibly not
> ideal choices, as these are not valid field names in C.  However, this
> type is only ever defined within GDB (or in the target description),
> and no restriction seems to exist on the field names in that case.
> 
> The problem though is that currently:
> 
>     (gdb) info registers $ft0
>     ft0            {float = 0, double = 0}	(raw 0x0000000000000000)
>     (gdb) p $ft0.float
>     $1 = 0
>     (gdb) p $ft0.double
>     A syntax error in expression, near `double'.
> 
> We can access the 'float' field, but not the 'double' field.  This is
> because the string 'double' is handled differently to the string
> 'float' in c-exp.y.
> 
> In both cases the string '$ft0' is parsed as a VARIABLE expression.
> 
> In the 'float' case, the string 'float' becomes a generic NAME token
> in 'lex_one_token', which then allows the rule "exp '.' name" to match
> and the field name lookup to occur.
> 
> The 'double' case is different.  In order to allow parsing of the type
> string 'long double', the 'double' string becomes the token
> DOUBLE_KEYWORD.  At this point there's no rule to match "exp '.'
> DOUBLE_KEYWORD", so we can never lookup the field named 'double'.
> 
> We could rename the fields for RISC-V, and maybe that would be the
> best solution.  However, its not hard to allow for fields named
> 'double', which is what this patch does.
> 
> A new case is added to the 'name' rule to match the DOUBLE_KEYWORD,
> and create a suitable 'struct stoken'.  With this done the "exp '.'
> name" pattern can now match, and we can lookup the double field.
> 
> With this patch in place I now see this behaviour:
> 
>     (gdb) info registers $ft0
>     ft0            {float = 0, double = 0}	(raw 0x0000000000000000)
>     (gdb) p $ft0.float
>     $1 = 0
>     (gdb) p $ft0.double
>     $2 = 0
> 
> This change was tested on x86-64 GNU/Linux with no regressions.
> 
> gdb/ChangeLog:
> 
> 	* c-exp.y (name): Allow DOUBLE_KEYWORD to act as a name.
> 	(typename_stoken): New function.

This looks reasonable to me, from the user point of view there is not reason
why x.double would be different from x.float.  With the same logic, we should
also allow x.int, x.short, etc.  But I'm fine with doing it on an as-needed basis.

This LGTM, with one thing below you might want to change.  I'm far from an expert
in parsers though, so please give others ~1 week to comment, and push then if you
don't receive additional feedback.

> ---
>  gdb/ChangeLog |  5 +++++
>  gdb/c-exp.y   | 17 +++++++++++++++++
>  2 files changed, 22 insertions(+)
> 
> diff --git a/gdb/c-exp.y b/gdb/c-exp.y
> index bfc78415b22..8d805cbde3d 100644
> --- a/gdb/c-exp.y
> +++ b/gdb/c-exp.y
> @@ -113,6 +113,7 @@ static int type_aggregate_p (struct type *);
>  static int parse_number (struct parser_state *par_state,
>  			 const char *, int, int, YYSTYPE *);
>  static struct stoken operator_stoken (const char *);
> +static struct stoken typename_stoken (const char *);
>  static void check_parameter_typelist (VEC (type_ptr) *);
>  static void write_destructor_name (struct parser_state *par_state,
>  				   struct stoken);
> @@ -1642,6 +1643,7 @@ name	:	NAME { $$ = $1.stoken; }
>  	|	TYPENAME { $$ = $1.stoken; }
>  	|	NAME_OR_INT  { $$ = $1.stoken; }
>  	|	UNKNOWN_CPP_NAME  { $$ = $1.stoken; }
> +	|	DOUBLE_KEYWORD { $$ = typename_stoken ("double"); }
>  	|	oper { $$ = $1; }
>  	;
>  
> @@ -1707,6 +1709,21 @@ operator_stoken (const char *op)
>    return st;
>  };
>  
> +/* Returns a stoken of the type named TYPE.  */
> +
> +static struct stoken
> +typename_stoken (const char *type)
> +{
> +  struct stoken st = { NULL, 0 };
> +  char *buf = xstrdup (type);
> +  st.ptr = buf;
> +  st.length = strlen (type);
> +
> +  /* The toplevel (c_parse) will free the memory allocated here.  */
> +  make_cleanup (free, buf);
> +  return st;
> +};

Unlike operator_stoken, I don't think we need to malloc anything here, we
could return { "double", strlen ("double") }.  At this point, we might not
need a dedicated function.

Simon
Tom Tromey Dec. 16, 2018, 4:42 p.m. UTC | #2
>>>>> "Simon" == Simon Marchi <simark@simark.ca> writes:

Simon> This looks reasonable to me, from the user point of view there is not reason
Simon> why x.double would be different from x.float.  With the same logic, we should
Simon> also allow x.int, x.short, etc.  But I'm fine with doing it on an as-needed basis.

See also https://sourceware.org/bugzilla/show_bug.cgi?id=13368

Simon> This LGTM, with one thing below you might want to change.  I'm far from an expert
Simon> in parsers though, so please give others ~1 week to comment, and push then if you
Simon> don't receive additional feedback.

I think the main possible issue is if this introduces a new parser conflict.
Taking a glance at the uses of "name" in the grammar, though, it seems
safe enough.

Tom
Andrew Burgess Dec. 18, 2018, 10:40 p.m. UTC | #3
Simon, Tom,

Thanks for the feedback.

* Tom Tromey <tom@tromey.com> [2018-12-16 09:42:15 -0700]:

> >>>>> "Simon" == Simon Marchi <simark@simark.ca> writes:
> 
> Simon> This looks reasonable to me, from the user point of view there is not reason
> Simon> why x.double would be different from x.float.  With the same logic, we should
> Simon> also allow x.int, x.short, etc.  But I'm fine with doing it on an as-needed basis.
> 
> See also https://sourceware.org/bugzilla/show_bug.cgi?id=13368

The new patch series includes a patch that extends some of the
comments in order to hopefully address this bug.

> 
> Simon> This LGTM, with one thing below you might want to change.  I'm far from an expert
> Simon> in parsers though, so please give others ~1 week to comment, and push then if you
> Simon> don't receive additional feedback.
> 
> I think the main possible issue is if this introduces a new parser conflict.
> Taking a glance at the uses of "name" in the grammar, though, it seems
> safe enough.

I took a better look at how the YACC step of the compile works, and it
turns out that parser conflicts are not fatal to the build, and my
change had introduced a new conflict.

The new patch series addresses this by focusing my change specifically
on structure field names, so I want to be able to parse:
'object.double', but I don't expect to be able to parse an object
named 'double'.  I also extended my patch to cover other types like
'int', 'short', etc.

As penance I've included a patch that tweaks how pointers are parsed
that resolves 49 reduce/reduce conflicts.

After this series there's still 42 shift/reduce conflicts, and 4
reduce/reduce conflicts in the C parser, but patch #4, the one I
really care about, doesn't introduce any new conflicts.

Tested on X86-64 GNU/Linux.

Thanks,
Andrew

---

Andrew Burgess (4):
  gdb: Extend the comments in c-exp.y
  gdb: Resolve 49 reduce/reduce conflicts in c-exp.y
  gdb: Add new parser rule for structure field names
  gdb: Allow struct fields named double

 gdb/ChangeLog                                      |  24 ++++
 gdb/c-exp.y                                        |  73 +++++++++---
 gdb/testsuite/ChangeLog                            |   5 +
 gdb/testsuite/gdb.dwarf2/dw2-unusual-field-names.c |  32 +++++
 .../gdb.dwarf2/dw2-unusual-field-names.exp         | 132 +++++++++++++++++++++
 5 files changed, 248 insertions(+), 18 deletions(-)
 create mode 100644 gdb/testsuite/gdb.dwarf2/dw2-unusual-field-names.c
 create mode 100644 gdb/testsuite/gdb.dwarf2/dw2-unusual-field-names.exp
Andrew Burgess Dec. 24, 2018, 5:30 p.m. UTC | #4
* Andrew Burgess <andrew.burgess@embecosm.com> [2018-12-18 22:40:09 +0000]:

> Simon, Tom,
> 
> Thanks for the feedback.
> 
> * Tom Tromey <tom@tromey.com> [2018-12-16 09:42:15 -0700]:
> 
> > >>>>> "Simon" == Simon Marchi <simark@simark.ca> writes:
> > 
> > Simon> This looks reasonable to me, from the user point of view there is not reason
> > Simon> why x.double would be different from x.float.  With the same logic, we should
> > Simon> also allow x.int, x.short, etc.  But I'm fine with doing it on an as-needed basis.
> > 
> > See also https://sourceware.org/bugzilla/show_bug.cgi?id=13368
> 
> The new patch series includes a patch that extends some of the
> comments in order to hopefully address this bug.
> 
> > 
> > Simon> This LGTM, with one thing below you might want to change.  I'm far from an expert
> > Simon> in parsers though, so please give others ~1 week to comment, and push then if you
> > Simon> don't receive additional feedback.
> > 
> > I think the main possible issue is if this introduces a new parser conflict.
> > Taking a glance at the uses of "name" in the grammar, though, it seems
> > safe enough.
> 
> I took a better look at how the YACC step of the compile works, and it
> turns out that parser conflicts are not fatal to the build, and my
> change had introduced a new conflict.
> 
> The new patch series addresses this by focusing my change specifically
> on structure field names, so I want to be able to parse:
> 'object.double', but I don't expect to be able to parse an object
> named 'double'.  I also extended my patch to cover other types like
> 'int', 'short', etc.
> 
> As penance I've included a patch that tweaks how pointers are parsed
> that resolves 49 reduce/reduce conflicts.
> 
> After this series there's still 42 shift/reduce conflicts, and 4
> reduce/reduce conflicts in the C parser, but patch #4, the one I
> really care about, doesn't introduce any new conflicts.
> 
> Tested on X86-64 GNU/Linux.
> 

Tom,

Thanks for the feedback on the revised patch series.  I have gone
ahead and pushed 1, 3, and 4, dropping patch 2.

For patch #4 I confirmed there are no duplicate test names.

In patch #2, you are correct and that the type stack is parsed in a
slightly different order with the patch than before.  I don't know if
this is significant, but I certainly don't know that it's not.

I'm convinced that patch #2 could be made to work, but I don't have
time right now, so I'll add it to my ever-growing todo list.

Thanks,
Andrew
diff mbox

Patch

diff --git a/gdb/c-exp.y b/gdb/c-exp.y
index bfc78415b22..8d805cbde3d 100644
--- a/gdb/c-exp.y
+++ b/gdb/c-exp.y
@@ -113,6 +113,7 @@  static int type_aggregate_p (struct type *);
 static int parse_number (struct parser_state *par_state,
 			 const char *, int, int, YYSTYPE *);
 static struct stoken operator_stoken (const char *);
+static struct stoken typename_stoken (const char *);
 static void check_parameter_typelist (VEC (type_ptr) *);
 static void write_destructor_name (struct parser_state *par_state,
 				   struct stoken);
@@ -1642,6 +1643,7 @@  name	:	NAME { $$ = $1.stoken; }
 	|	TYPENAME { $$ = $1.stoken; }
 	|	NAME_OR_INT  { $$ = $1.stoken; }
 	|	UNKNOWN_CPP_NAME  { $$ = $1.stoken; }
+	|	DOUBLE_KEYWORD { $$ = typename_stoken ("double"); }
 	|	oper { $$ = $1; }
 	;
 
@@ -1707,6 +1709,21 @@  operator_stoken (const char *op)
   return st;
 };
 
+/* Returns a stoken of the type named TYPE.  */
+
+static struct stoken
+typename_stoken (const char *type)
+{
+  struct stoken st = { NULL, 0 };
+  char *buf = xstrdup (type);
+  st.ptr = buf;
+  st.length = strlen (type);
+
+  /* The toplevel (c_parse) will free the memory allocated here.  */
+  make_cleanup (free, buf);
+  return st;
+};
+
 /* Return true if the type is aggregate-like.  */
 
 static int