Patchwork review request: implementing DW_AT_endianity

login
register
mail settings
Submitter Peeter Joot
Date Oct. 6, 2017, 3:05 p.m.
Message ID <VI1PR0501MB28618180080787DD22117A459C710@VI1PR0501MB2861.eurprd05.prod.outlook.com>
Download mbox | patch
Permalink /patch/23381/
State New
Headers show

Comments

Peeter Joot - Oct. 6, 2017, 3:05 p.m.
I've implemented support in gdb for DW_AT_endianity, DW_END_bigendian, and DW_END_littleendian, and would like to ask for gdb community review of this change.


Purpose: gcc6+ supports mixed endian attributes on structures and unions, and flags these dwarf instrumentation of these structure members with DW_END_... attributes.  As intel appears to have introduced these dwarf flags into the standard, I expect their compiler and debugger also supports them.  However, gdb does not.  The following example, compiled on a little endian system, is an example of this attribute use:


#include <stdio.h>

#include <string.h>


struct big {

    int v;

    short a[4];

} __attribute__( ( scalar_storage_order( "big-endian" ) ) );


struct little {

    int v;

    short a[4];

} __attribute__( ( scalar_storage_order( "little-endian" ) ) );


struct native {

    int v;

    short a[4];

};


int main() {

    struct big b = {3, {1, 2, 3, 4}};

    struct native n = {3, {1, 2, 3, 4}};

    struct little l = {3, {1, 2, 3, 4}};

    int cb = memcmp( &b, &n, sizeof( b ) );

    int cl = memcmp( &l, &n, sizeof( l ) );


    printf( "%d %d %d: big %s native.  little %s native\n", b.v, n.v, l.v,

            cb == 0 ? "==" : "!=", cl == 0 ? "==" : "!=" );

    return 0;

}



Running this produces the expected result, and the endianness of the underlying stores can be seen in a debugger session:


Breakpoint 1, main () at test.c:20

20          struct big b = {3, {1, 2, 3, 4}};

(gdb) n

21          struct native n = {3, {1, 2, 3, 4}};

(gdb) n

22          struct little l = {3, {1, 2, 3, 4}};

(gdb) n

23          int cb = memcmp( &b, &n, sizeof( b ) );

(gdb) p b

$1 = {v = 50331648, a = {256, 512, 768, 1024}}

(gdb) p n

$2 = {v = 3, a = {1, 2, 3, 4}}

(gdb) p l

$3 = {v = 3, a = {1, 2, 3, 4}}

(gdb) x/4x &b

0x7fffffffd96c: 0x03000000      0x02000100      0x04000300      0x00000000

(gdb) x/4x &l

0x7fffffffd954: 0x00000003      0x00020001      0x00040003      0x00000003

(gdb) x/4x &n

0x7fffffffd960: 0x00000003      0x00020001      0x00040003      0x03000000

(gdb) n

24          int cl = memcmp( &l, &n, sizeof( l ) );

(gdb)

26          printf( "%d %d %d: big %s native.  little %s native\n", b.v, n.v, l.v,

(gdb) n

3 3 3: big != native.  little == native

28          return 0;



This debugger session also shows that gdb currently ignores the DW_END_bigendian (and littleendian) attributes, showing the internal representation of the data instead of what the values that memory represents.  It turns out that propagating the DW_AT_endianity dwarf attributes to the gdb print_scalar_formatted function is fairly straightforward, which allows gdb to display the results in a user-friendly way regardless of the endian attributes:




On behalf of my employer (LzLabs), I would like to contribute this change to to the gdb project.  For a small change like this, it is not clear FSF copyright assignment is required, as I see the following in your CONTRIBUTIONS document: "Small changes can be accepted without a copyright assignment form on file.".  What counts as small?


If assignment is required, I have approval from company management and legal to formally go through that process.  Before figuring out how that assignment process works, I wanted to first ensure the changes I've made would be acceptable for contribution, and make any review driven updates required.


--

Peeter
Peeter Joot - Oct. 6, 2017, 9:18 p.m.
It looks like any talk of contribution (and perhaps review) is premature.  My initial test only had a big-endian structure, and the one I wrote describing what I wanted to address doesn't work (which I noticed only after sending my review request email).  The big-endian tagged structure does show up properly, but what I have done messes up the display of any little endian integers.  It appears that I end up setting my new big-endian flag inappropriately:


Breakpoint 1, print_scalar_formatted (valaddr=0x1e25260 "\003", type=0x1ce2670, options=0x7ffe20ac7490, size=0, stream=0x1c95a40)

    at ../../binutils-gdb/gdb/printcmd.c:355

355       struct gdbarch *gdbarch = get_type_arch (type);

(gdb) p *type->main_type

$1 = {code = TYPE_CODE_INT, flag_unsigned = 0, flag_nosign = 0, flag_stub = 0, flag_target_stub = 0, flag_static = 0,

  flag_prototyped = 0, flag_incomplete = 0, flag_varargs = 0, flag_vector = 0, flag_stub_supported = 0, flag_gnu_ifunc = 0,

  flag_fixed_instance = 0, flag_objfile_owned = 1, flag_endianity_big = 1, flag_endianity_little = 0, flag_declared_class = 0,

  flag_flag_enum = 0, type_specific_field = TYPE_SPECIFIC_NONE, nfields = 0, name = 0x1caacbc "int", tag_name = 0x0, owner = {

    objfile = 0x1c98620, gdbarch = 0x1c98620}, target_type = 0x0, flds_bnds = {fields = 0x0, bounds = 0x0}, type_specific = {

    cplus_stuff = 0x0, gnat_stuff = 0x0, floatformat = 0x0, func_stuff = 0x0, self_type = 0x0}, dyn_prop_list = 0x0}


This is when I'm printing a native endian integer, so do not want to have flag_endianity_big set.


I'm guessing that main_type is not the place for this flag, but it has to be in struct type instead.  Is that guess on the right track?


Peeter
Simon Marchi - Oct. 8, 2017, 6:41 p.m.
On 2017-10-06 11:05 AM, Peeter Joot wrote:
> I've implemented support in gdb for DW_AT_endianity, DW_END_bigendian, and DW_END_littleendian, and would like to ask for gdb community review of this change.

Hi Peeter,

Thanks for the patch.  Since the compiler supports it, it makes sense for the debugger
to support it correctly.

Please read the Contribution Checklist:

  https://sourceware.org/gdb/wiki/ContributionChecklist

Most importantly, make sure your code uses the GNU style, and use "git send-email" to
send your patch.  It will make it easier for reviewers to apply and look at your patch.

Changes to binutils should be sent the binutils mailing list (binutils@sourceware.org), so
make a separate patch for binutils/dwarf.c and send it there.

The best way to show that your contribution works is to add a test for it.  You can add it
to testsuite/gdb.base.  Copy an existing one (e.g. wchar.c/wchar.exp) and modify as needed.

You should also run the testsuite to see if your patch causes any regression:

  https://sourceware.org/gdb/wiki/TestingGDB

Note that there are many existing failures in the testsuite, so what you should do is run the
testsuite without and with your patch, and diff the before/after testsuite/gdb.sum file.

I just noted a few formatting comments below, I'll look at the code more in depth once you
send an updated version that's easier to apply.

> Purpose: gcc6+ supports mixed endian attributes on structures and unions, and flags these dwarf instrumentation of these structure members with DW_END_... attributes.  As intel appears to have introduced these dwarf flags into the standard, I expect their compiler and debugger also supports them.  However, gdb does not.  The following example, compiled on a little endian system, is an example of this attribute use:
> 
> 
> #include <stdio.h>
> 
> #include <string.h>
> 
> 
> struct big {
> 
>     int v;
> 
>     short a[4];
> 
> } __attribute__( ( scalar_storage_order( "big-endian" ) ) );
> 
> 
> struct little {
> 
>     int v;
> 
>     short a[4];
> 
> } __attribute__( ( scalar_storage_order( "little-endian" ) ) );
> 
> 
> struct native {
> 
>     int v;
> 
>     short a[4];
> 
> };
> 
> 
> int main() {
> 
>     struct big b = {3, {1, 2, 3, 4}};
> 
>     struct native n = {3, {1, 2, 3, 4}};
> 
>     struct little l = {3, {1, 2, 3, 4}};
> 
>     int cb = memcmp( &b, &n, sizeof( b ) );
> 
>     int cl = memcmp( &l, &n, sizeof( l ) );
> 
> 
>     printf( "%d %d %d: big %s native.  little %s native\n", b.v, n.v, l.v,
> 
>             cb == 0 ? "==" : "!=", cl == 0 ? "==" : "!=" );
> 
>     return 0;
> 
> }
> 
> 
> 
> Running this produces the expected result, and the endianness of the underlying stores can be seen in a debugger session:
> 
> 
> Breakpoint 1, main () at test.c:20
> 
> 20          struct big b = {3, {1, 2, 3, 4}};
> 
> (gdb) n
> 
> 21          struct native n = {3, {1, 2, 3, 4}};
> 
> (gdb) n
> 
> 22          struct little l = {3, {1, 2, 3, 4}};
> 
> (gdb) n
> 
> 23          int cb = memcmp( &b, &n, sizeof( b ) );
> 
> (gdb) p b
> 
> $1 = {v = 50331648, a = {256, 512, 768, 1024}}
> 
> (gdb) p n
> 
> $2 = {v = 3, a = {1, 2, 3, 4}}
> 
> (gdb) p l
> 
> $3 = {v = 3, a = {1, 2, 3, 4}}
> 
> (gdb) x/4x &b
> 
> 0x7fffffffd96c: 0x03000000      0x02000100      0x04000300      0x00000000
> 
> (gdb) x/4x &l
> 
> 0x7fffffffd954: 0x00000003      0x00020001      0x00040003      0x00000003
> 
> (gdb) x/4x &n
> 
> 0x7fffffffd960: 0x00000003      0x00020001      0x00040003      0x03000000
> 
> (gdb) n
> 
> 24          int cl = memcmp( &l, &n, sizeof( l ) );
> 
> (gdb)
> 
> 26          printf( "%d %d %d: big %s native.  little %s native\n", b.v, n.v, l.v,
> 
> (gdb) n
> 
> 3 3 3: big != native.  little == native
> 
> 28          return 0;
> 
> 
> 
> This debugger session also shows that gdb currently ignores the DW_END_bigendian (and littleendian) attributes, showing the internal representation of the data instead of what the values that memory represents.  It turns out that propagating the DW_AT_endianity dwarf attributes to the gdb print_scalar_formatted function is fairly straightforward, which allows gdb to display the results in a user-friendly way regardless of the endian attributes:
> 
> 
> diff --git a/binutils/dwarf.c b/binutils/dwarf.c
> index 91f95ff..e79fd5e 100644
> --- a/binutils/dwarf.c
> +++ b/binutils/dwarf.c
> @@ -2283,6 +2283,17 @@ read_and_display_attr_value (unsigned long attribute,
>   }
>        break;
> 
> +    case DW_AT_endianity:
> +      printf ("\t");
> +      switch (uvalue)
> + {
> + case DW_END_default: printf ("(default)"); break;
> + case DW_END_big: printf ("(big)"); break;
> + case DW_END_little: printf ("(little)"); break;
> + default: printf (_("(unknown endianity)")); break;

Put each statement on its own line:

case DW_END_default:
  printf ("(default)");
  break;

> + }
> +      break;
> +
>      case DW_AT_virtuality:
>        printf ("\t");
>        switch (uvalue)
> diff --git a/gdb/ChangeLog b/gdb/ChangeLog
> index 18224e0..66a8eaf 100644
> --- a/gdb/ChangeLog
> +++ b/gdb/ChangeLog
> @@ -1,3 +1,19 @@
> +2017-10-06  Peeter Joot  <peeter.joot@lzlabs.com>
> +
> + * gdb/gdbtypes.h (global scope): define TYPE_ENDIANITY_BIG,
> + TYPE_ENDIANITY_LITTLE.
> + * binutils/dwarf.c (read_and_display_attr_value): Handle
> + DW_AT_endianity, DW_END_default, DW_END_big, DW_END_little
> + * gdb/dwarf2read.c (read_base_type): Handle DW_END_big, DW_END_little
> + * gdb/gdbtypes.c (check_types_equal): Require matching
> + TYPE_ENDIANITY_BIG, and TYPE_ENDIANITY_LITTLE if set.
> + (recursive_dump_type): Print TYPE_ENDIANITY_BIG, and
> + TYPE_ENDIANITY_LITTLE if set.
> + (struct main_type): Add flag_endianity_big, flag_endianity_little
> + * gdb/printcmd.c (print_scalar_formatted): Use compiler supplied
> + endianness instead of arch endianness if TYPE_ENDIANITY_BIG or
> + TYPE_ENDIANITY_LITTLE is set.
> +
>  2017-10-06  Yao Qi  <yao.qi@linaro.org>
> 
>   * Makefile.in (ALL_64_TARGET_OBS): Replace aarch64-insn.o with
> diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c
> index 1b15adc..fa2889b 100644
> --- a/gdb/dwarf2read.c
> +++ b/gdb/dwarf2read.c
> @@ -15234,6 +15234,7 @@ read_base_type (struct die_info *die, struct dwarf2_cu *cu)
>    struct type *type;
>    struct attribute *attr;
>    int encoding = 0, bits = 0;
> +  int endianity = 0;
>    const char *name;
> 
>    attr = dwarf2_attr (die, DW_AT_encoding, cu);
> @@ -15330,6 +15331,21 @@ read_base_type (struct die_info *die, struct dwarf2_cu *cu)
>    if (name && strcmp (name, "char") == 0)
>      TYPE_NOSIGN (type) = 1;
> 
> +  attr = dwarf2_attr (die, DW_AT_endianity, cu);
> +  if ( attr )
> +    {
> +      endianity = DW_UNSND (attr);
> +      switch (endianity)
> +        {
> +           case DW_END_big:
> +              TYPE_ENDIANITY_BIG (type) = 1;
> +              break;
> +           case DW_END_little:
> +              TYPE_ENDIANITY_LITTLE (type) = 1;
> +              break;
> +        }
> +    }
> +
>    return set_die_type (die, type, cu);
>  }
> 
> diff --git a/gdb/gdbtypes.c b/gdb/gdbtypes.c
> index 73d4453..43f553b 100644
> --- a/gdb/gdbtypes.c
> +++ b/gdb/gdbtypes.c
> @@ -3423,6 +3423,8 @@ check_types_equal (struct type *type1, struct type *type2,
>        || TYPE_LENGTH (type1) != TYPE_LENGTH (type2)
>        || TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2)
>        || TYPE_NOSIGN (type1) != TYPE_NOSIGN (type2)
> +      || TYPE_ENDIANITY_BIG (type1) != TYPE_ENDIANITY_BIG (type2)
> +      || TYPE_ENDIANITY_LITTLE (type1) != TYPE_ENDIANITY_LITTLE (type2)
>        || TYPE_VARARGS (type1) != TYPE_VARARGS (type2)
>        || TYPE_VECTOR (type1) != TYPE_VECTOR (type2)
>        || TYPE_NOTTEXT (type1) != TYPE_NOTTEXT (type2)
> @@ -4460,6 +4462,14 @@ recursive_dump_type (struct type *type, int spaces)
>      {
>        puts_filtered (" TYPE_NOSIGN");
>      }
> +  if (TYPE_ENDIANITY_BIG (type))
> +    {
> +      puts_filtered (" TYPE_ENDIANITY_BIG");
> +    }
> +  if (TYPE_ENDIANITY_LITTLE (type))
> +    {
> +      puts_filtered (" TYPE_ENDIANITY_LITTLE");
> +    }
>    if (TYPE_STUB (type))
>      {
>        puts_filtered (" TYPE_STUB");
> diff --git a/gdb/gdbtypes.h b/gdb/gdbtypes.h
> index 009cea9..074aa2c 100644
> --- a/gdb/gdbtypes.h
> +++ b/gdb/gdbtypes.h
> @@ -210,6 +210,16 @@ DEF_ENUM_FLAGS_TYPE (enum type_instance_flag_value, type_instance_flags);
> 
>  #define TYPE_NOSIGN(t) (TYPE_MAIN_TYPE (t)->flag_nosign)
> 
> +/* * Mixed endian archetectures can supply dwarf instrumentation
> + * that indicates the desired endian interpretation of the variable.
> + * This indicates that the interpretation should be big-endian
> + * even if the cpu is running in little endian mode. */
> +#define TYPE_ENDIANITY_BIG(t) (TYPE_MAIN_TYPE (t)->flag_endianity_big)
> +
> +/* * The type has a little endian interpretation even if the cpu
> + * is running in big endian mode. */
> +#define TYPE_ENDIANITY_LITTLE(t) (TYPE_MAIN_TYPE (t)->flag_endianity_little)
> +
>  /* * This appears in a type's flags word if it is a stub type (e.g.,
>     if someone referenced a type that wasn't defined in a source file
>     via (struct sir_not_appearing_in_this_film *)).  */
> @@ -616,6 +626,8 @@ struct main_type
>    unsigned int flag_gnu_ifunc : 1;
>    unsigned int flag_fixed_instance : 1;
>    unsigned int flag_objfile_owned : 1;
> +  unsigned int flag_endianity_big : 1;
> +  unsigned int flag_endianity_little : 1;
> 
>    /* * True if this type was declared with "class" rather than
>       "struct".  */
> diff --git a/gdb/printcmd.c b/gdb/printcmd.c
> index a8743f1..e714283 100644
> --- a/gdb/printcmd.c
> +++ b/gdb/printcmd.c
> @@ -356,6 +356,15 @@ print_scalar_formatted (const gdb_byte *valaddr, struct type *type,
>    unsigned int len = TYPE_LENGTH (type);
>    enum bfd_endian byte_order = gdbarch_byte_order (gdbarch);
> 
> +  if ( TYPE_ENDIANITY_BIG (type) )
> +  {
> +    byte_order = BFD_ENDIAN_BIG;
> +  }
> +  if ( TYPE_ENDIANITY_LITTLE (type) )

Don't put spaces inside parentheses.

> +  {
> +    byte_order = BFD_ENDIAN_LITTLE;
> +  }

Don't use curly braces for a single statement.

> +
>    /* String printing should go through val_print_scalar_formatted.  */
>    gdb_assert (options->format != 's');
> 
> 
> 
> On behalf of my employer (LzLabs), I would like to contribute this change to to the gdb project.  For a small change like this, it is not clear FSF copyright assignment is required, as I see the following in your CONTRIBUTIONS document: "Small changes can be accepted without a copyright assignment form on file.".  What counts as small?
> 
> 
> If assignment is required, I have approval from company management and legal to formally go through that process.  Before figuring out how that assignment process works, I wanted to first ensure the changes I've made would be acceptable for contribution, and make any review driven updates required.

I would consider a change like that to be significant enough to require an assignment.

Thanks!

Simon
Peeter Joot - Oct. 9, 2017, 9:11 a.m.
Hi Simon,


> Please read the Contribution Checklist:

>

>   https://sourceware.org/gdb/wiki/ContributionChecklist

>

> Most importantly, make sure your code uses the GNU style, and use "git send-email" to

> send your patch.  It will make it easier for reviewers to apply and look at your patch.

>

> Changes to binutils should be sent the binutils mailing list (binutils@sourceware.org), so

> make a separate patch for binutils/dwarf.c and send it there.

>

> The best way to show that your contribution works is to add a test for it.  You can add it

> to testsuite/gdb.base.  Copy an existing one (e.g. wchar.c/wchar.exp) and modify as needed.

>

> You should also run the testsuite to see if your patch causes any regression:

>

>   https://sourceware.org/gdb/wiki/TestingGDB

>

> Note that there are many existing failures in the testsuite, so what you should do is run the

> testsuite without and with your patch, and diff the before/after testsuite/gdb.sum file.

Thanks for the new-developer tips.


> > --- a/binutils/dwarf.c

> > +++ b/binutils/dwarf.c

...

> Put each statement on its own line:

>

> case DW_END_default:

>   printf ("(default)");

>   break;



Note that if I did this, the addition wouldn't match any of the case statements in the surrounding code (read_and_display_attr_value).  There are case statements matching your suggested style earlier in this function, but others that do not (like all the ones immediately surrounding my addition).  I've temporarily adjusted my addition to match the immediately surrounding code (i.e. using tabs instead of spaces, which I hadn't initially noticed).  Shall I adjust all of that surrounding "non-standard formatting" so that each statement is on its own line, or be consistent locally with the conventions used in this file?


I've fixed up all the other formatting issues that you've pointed out, and will submit a new patch with git send-email after running the test suite, and adding my own new test.  The next patch I send will have an additional mechanical change, altering blocks of code from:


gdbarch_byte_order (get_type_arch (type1))


to:


gdbarch_byte_order_for_type (NULL, type1)


where the following helper function has been added to gdbtypes.c:


enum bfd_endian

gdbarch_byte_order_for_type (struct gdbarch * gdbarch, struct type *type)

{

  if (TYPE_ENDIANITY_BIG (type))

    return BFD_ENDIAN_BIG;

  else if (TYPE_ENDIANITY_LITTLE (type))

    return BFD_ENDIAN_LITTLE;

  if (!gdbarch)

    gdbarch = get_type_arch (type);

  return gdbarch_byte_order (gdbarch);

}


When the caller of this already had a gdbarch variable handy, I've used that, instead of passing NULL.  This transformation is enough to make gdb assignment to a mixed endian field, as in:


(gdb) p b.v = 4


work properly, storing the value in big-endian format.  This is the using the same example from my original review request email, assuming a little endian target and a structure with a big-endian attribute.


In the process of doing this debug testing, I've noticed that gcc appears to have some bugs with respect to its dwarf endianity attribute tagging.  For example:


#include <stdio.h>

#include <string.h>


typedef int int32be_t;

typedef short int16be_t;


struct big {

    int32be_t v;

    int16be_t a[4];

} __attribute__( ( scalar_storage_order( "big-endian" ) ) );


struct native {

    int v;

    short a[4];

};


int main() {

    struct big b = {3, {1, 2, 3, 4}};

    struct native n = {3, {1, 2, 3, 4}};


    printf( "%d\n", b.v );

    printf( "%d\n", n.v );


    return 0;

}


A version of this code that does not use typedefs for the field types does get tagged with DW_AT_endianity/DW_END_big, but when typedefs are used as above, the object code ends up with no endianity attributes at all (although the compiler does insert the desired bswap operations).  It appears that it may take a few iterations of compiler/debugger enhancements to really get this working nicely together.


There is a bigger issue of the DW_AT_name that gets emitted for a field with non-native endianity.  I think it would be best for the compiler to choose a different DW_AT_name than the default (int.be for example).  If that is not done it looks like it breaks gdb's assumption that there is one set of attributes for each type in question.  This could be considered a gdb bug, but I think it would be better for the compiler to cater to gdb in this case.  I'm curious what other developers (especially anybody that works on both gcc and gdb) think about this.


> I would consider a change like that to be significant enough to require an assignment.


Okay.  I've contacted the FSF assignment representative to start this process.


Peeter
Simon Marchi - Oct. 9, 2017, 12:12 p.m.
On 2017-10-09 05:11 AM, Peeter Joot wrote:
> Note that if I did this, the addition wouldn't match any of the case statements in the surrounding code (read_and_display_attr_value).  There are case statements matching your suggested style earlier in this function, but others that do not (like all the ones immediately surrounding my addition).  I've temporarily adjusted my addition to match the immediately surrounding code (i.e. using tabs instead of spaces, which I hadn't initially noticed).  Shall I adjust all of that surrounding "non-standard formatting" so that each statement is on its own line, or be consistent locally with the conventions used in this file?

Ok, I see this is binutils code, and I hadn't see that the surrounding code did like that.  You
can leave it as is, and the binutils people will tell you if that's not ok.

> I've fixed up all the other formatting issues that you've pointed out, and will submit a new patch with git send-email after running the test suite, and adding my own new test.  The next patch I send will have an additional mechanical change, altering blocks of code from:
> 
> 
> gdbarch_byte_order (get_type_arch (type1))
> 
> 
> to:
> 
> 
> gdbarch_byte_order_for_type (NULL, type1)
> 
> 
> where the following helper function has been added to gdbtypes.c:
> 
> 
> enum bfd_endian
> 
> gdbarch_byte_order_for_type (struct gdbarch * gdbarch, struct type *type)
> 
> {
> 
>   if (TYPE_ENDIANITY_BIG (type))
> 
>     return BFD_ENDIAN_BIG;
> 
>   else if (TYPE_ENDIANITY_LITTLE (type))
> 
>     return BFD_ENDIAN_LITTLE;
> 
>   if (!gdbarch)
> 
>     gdbarch = get_type_arch (type);
> 
>   return gdbarch_byte_order (gdbarch);
> 
> }

I suggest naming this function type_byte_order.  Functions named "gdbarch_*" are usually
those part of the gdbarch interface (defined in gdbarch.sh/.h/.c).

> 
> 
> When the caller of this already had a gdbarch variable handy, I've used that, instead of passing NULL.  This transformation is enough to make gdb assignment to a mixed endian field, as in:
> 
> 
> (gdb) p b.v = 4

Nice.  Assginment of fields by GDB would be a good thing to check in the test.

> work properly, storing the value in big-endian format.  This is the using the same example from my original review request email, assuming a little endian target and a structure with a big-endian attribute.
> 
> 
> In the process of doing this debug testing, I've noticed that gcc appears to have some bugs with respect to its dwarf endianity attribute tagging.  For example:
> 
> 
> #include <stdio.h>
> 
> #include <string.h>
> 
> 
> typedef int int32be_t;
> 
> typedef short int16be_t;
> 
> 
> struct big {
> 
>     int32be_t v;
> 
>     int16be_t a[4];
> 
> } __attribute__( ( scalar_storage_order( "big-endian" ) ) );
> 
> 
> struct native {
> 
>     int v;
> 
>     short a[4];
> 
> };
> 
> 
> int main() {
> 
>     struct big b = {3, {1, 2, 3, 4}};
> 
>     struct native n = {3, {1, 2, 3, 4}};
> 
> 
>     printf( "%d\n", b.v );
> 
>     printf( "%d\n", n.v );
> 
> 
>     return 0;
> 
> }
> 
> 
> A version of this code that does not use typedefs for the field types does get tagged with DW_AT_endianity/DW_END_big, but when typedefs are used as above, the object code ends up with no endianity attributes at all (although the compiler does insert the desired bswap operations).  It appears that it may take a few iterations of compiler/debugger enhancements to really get this working nicely together.

Ah indeed.  Do you report the gcc bugs you find to them?

> There is a bigger issue of the DW_AT_name that gets emitted for a field with non-native endianity.  I think it would be best for the compiler to choose a different DW_AT_name than the default (int.be for example).  If that is not done it looks like it breaks gdb's assumption that there is one set of attributes for each type in question.  This could be considered a gdb bug, but I think it would be better for the compiler to cater to gdb in this case.  I'm curious what other developers (especially anybody that works on both gcc and gdb) think about this.
> 
> 
>> I would consider a change like that to be significant enough to require an assignment.
> 
> 
> Okay.  I've contacted the FSF assignment representative to start this process.
Great, thanks,

Simon
Peeter Joot - Oct. 10, 2017, 6:16 p.m.
> I suggest naming this function type_byte_order.  Functions named "gdbarch_*" are usually

> those part of the gdbarch interface (defined in gdbarch.sh/.h/.c).


done.

> Nice.  Assginment of fields by GDB would be a good thing to check in the test.

done.

> Ah indeed.  Do you report the gcc bugs you find to them?

I will verify first on the dev version of gcc8 that this is still an issue before submitting a report.

> testsuite.

It is normal to see the number of tests vary when running the test suite (make check -j8)?  My before and after runs had an unexpected difference in the numbers of tests:

                === gdb Summary ===



-# of expected passes           40087

-# of unexpected failures       96

+# of expected passes           40082

+# of unexpected failures       98

 # of unexpected successes      1

 # of expected failures         67

 # of unknown successes         3


My test added 4 additional expected passes (and I verified that my new tests ran in gdb/testsuite/gdb.log), so the number of expected successes should have grown by 4, not decreased by 5?  Some of the failures differences look like buggy tests (outputting pids and so forth).

I clearly didn't regress anything significant, but didn't expect the baseline to vary run to run.

Peeter
Simon Marchi - Oct. 10, 2017, 6:32 p.m.
On 2017-10-10 14:16, Peeter Joot wrote:
>> I suggest naming this function type_byte_order.  Functions named 
>> "gdbarch_*" are usually
> 
>> those part of the gdbarch interface (defined in gdbarch.sh/.h/.c).
> 
> 
> done.
> 
>> Nice.  Assginment of fields by GDB would be a good thing to check in 
>> the test.
> 
> done.
> 
>> Ah indeed.  Do you report the gcc bugs you find to them?
> 
> I will verify first on the dev version of gcc8 that this is still an
> issue before submitting a report.
> 
>> testsuite.
> 
> It is normal to see the number of tests vary when running the test
> suite (make check -j8)?  My before and after runs had an unexpected
> difference in the numbers of tests:
> 
>                 === gdb Summary ===
> 
> 
> 
> -# of expected passes           40087
> 
> -# of unexpected failures       96
> 
> +# of expected passes           40082
> 
> +# of unexpected failures       98
> 
>  # of unexpected successes      1
> 
>  # of expected failures         67
> 
>  # of unknown successes         3
> 
> 
> My test added 4 additional expected passes (and I verified that my new
> tests ran in gdb/testsuite/gdb.log), so the number of expected
> successes should have grown by 4, not decreased by 5?  Some of the
> failures differences look like buggy tests (outputting pids and so
> forth).
> 
> I clearly didn't regress anything significant, but didn't expect the
> baseline to vary run to run.
> 
> Peeter

Yeah, it's possible for the total number of test to vary when tests 
start failing.  For example, you can have:

if [something] {
   fail "couldn't fetch variable"
   return
}

gdb_test "..."
gdb_test "..."

If it succeeds, you'll have 2 pass, 0 fail.  If [something] fails, 
you'll have 0 pass, 1 fail.  If your case you have 2 more unexpected 
failures, I would look into these, see if they are related.  Which tests 
are they from?

Simon
Peeter Joot - Oct. 10, 2017, 6:38 p.m.
These were the differences in the gdb.sum output for the baseline run vs. mine:


-PASS: gdb.base/step-over-syscall.exp: clone: displaced=off: single step over clone

-PASS: gdb.base/step-over-syscall.exp: clone: displaced=off: get hexadecimal valueof "$pc"

-PASS: gdb.base/step-over-syscall.exp: clone: displaced=off: single step over clone final pc

-PASS: gdb.base/step-over-syscall.exp: clone: displaced=off: break marker

-PASS: gdb.base/step-over-syscall.exp: clone: displaced=off: continue to marker (clone)

-KFAIL: gdb.base/step-over-syscall.exp: clone: displaced=on: single step over clone (PRMS: gdb/19675)

+FAIL: gdb.base/step-over-syscall.exp: clone: displaced=off: single step over clone

-PASS: gdb.mi/list-thread-groups-available.exp: list available thread groups

+FAIL: gdb.mi/list-thread-groups-available.exp: list available thread groups (unexpected output)

-PASS: gdb.threads/attach-into-signal.exp: nonthreaded: attempt 1: attach (pass 1), pending signal catch

-PASS: gdb.threads/attach-into-signal.exp: nonthreaded: attempt 3: attach (pass 2), pending signal catch

+PASS: gdb.threads/attach-into-signal.exp: nonthreaded: attempt 4: attach (pass 1), pending signal catch

+PASS: gdb.threads/attach-into-signal.exp: nonthreaded: attempt 4: attach (pass 2), pending signal catch

-PASS: gdb.threads/attach-into-signal.exp: threaded: attempt 4: attach (pass 1), pending signal catch

-PASS: gdb.threads/attach-into-signal.exp: threaded: attempt 7: attach (pass 2), pending signal catch

+PASS: gdb.threads/attach-into-signal.exp: threaded: attempt 3: attach (pass 1), pending signal catch

+PASS: gdb.threads/attach-into-signal.exp: threaded: attempt 3: attach (pass 2), pending signal catch

-PASS: gdb.threads/process-dies-while-detaching.exp: single-process: detach: killed outside: get integer valueof "mypid" (9535)

+PASS: gdb.threads/process-dies-while-detaching.exp: single-process: detach: killed outside: get integer valueof "mypid" (6194)

-PASS: gdb.threads/process-dies-while-detaching.exp: single-process: continue: killed outside: get integer valueof "mypid" (8865)

+PASS: gdb.threads/process-dies-while-detaching.exp: single-process: continue: killed outside: get integer valueof "mypid" (19681)

-PASS: gdb.threads/process-dies-while-detaching.exp: multi-process: detach: killed outside: get integer valueof "mypid" (443)

+PASS: gdb.threads/process-dies-while-detaching.exp: multi-process: detach: killed outside: get integer valueof "mypid" (11368)

-PASS: gdb.threads/process-dies-while-detaching.exp: multi-process: continue: killed outside: get integer valueof "mypid" (24003)

+PASS: gdb.threads/process-dies-while-detaching.exp: multi-process: continue: killed outside: get integer valueof "mypid" (2640)

-PASS: gdb.threads/process-dies-while-handling-bp.exp: non_stop=on: cond_bp_target=0: inferior 1 exited

-PASS: gdb.threads/process-dies-while-handling-bp.exp: non_stop=on: cond_bp_target=0: no threads left

+KFAIL: gdb.threads/process-dies-while-handling-bp.exp: non_stop=on: cond_bp_target=0: inferior 1 exited (prompt) (PRMS: gdb/18749)

-PASS: gdb.threads/process-dies-while-handling-bp.exp: non_stop=off: cond_bp_target=0: inferior 1 exited

-PASS: gdb.threads/process-dies-while-handling-bp.exp: non_stop=off: cond_bp_target=0: no threads left

+KFAIL: gdb.threads/process-dies-while-handling-bp.exp: non_stop=off: cond_bp_target=0: inferior 1 exited (prompt) (PRMS: gdb/18749)

-PASS: gdb.threads/signal-while-stepping-over-bp-other-thread.exp: step

+FAIL: gdb.threads/signal-while-stepping-over-bp-other-thread.exp: step (pattern 3)

-FAIL: gdb.threads/watchpoint-fork.exp: child: multithreaded: breakpoint (A) after the second fork (timeout)

+PASS: gdb.threads/watchpoint-fork.exp: child: multithreaded: breakpoint (A) after the second fork

-KFAIL: gdb.threads/watchthreads2.exp: gdb can drop watchpoints in multithreaded app (PRMS: gdb/10116)

+PASS: gdb.threads/watchthreads2.exp: all threads incremented x



(there were also the 4 additional PASS: records from my new test, and no failures from that test).


Peeter
Simon Marchi - Oct. 10, 2017, 6:48 p.m.
On 2017-10-10 14:38, Peeter Joot wrote:

About these:

> -PASS: gdb.threads/process-dies-while-detaching.exp: single-process:
> detach: killed outside: get integer valueof "mypid" (9535)
> 
> +PASS: gdb.threads/process-dies-while-detaching.exp: single-process:
> detach: killed outside: get integer valueof "mypid" (6194)

It's weird that get_integer_valueof outputs the value in the test name.  
I can't remember see this variation, but it's bad anyhow.  I'll post a 
patch for that.

The rest seems to be known racy test cases (not saying it's good, just 
saying it's known).

> (there were also the 4 additional PASS: records from my new test, and
> no failures from that test).

Looks good then!

Simon
Peeter Joot - Oct. 10, 2017, 7:38 p.m.
re: the gcc bug report.  Have now opened:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82509


Peeter

Patch

diff --git a/binutils/dwarf.c b/binutils/dwarf.c
index 91f95ff..e79fd5e 100644
--- a/binutils/dwarf.c
+++ b/binutils/dwarf.c
@@ -2283,6 +2283,17 @@  read_and_display_attr_value (unsigned long attribute,
  }
       break;

+    case DW_AT_endianity:
+      printf ("\t");
+      switch (uvalue)
+ {
+ case DW_END_default: printf ("(default)"); break;
+ case DW_END_big: printf ("(big)"); break;
+ case DW_END_little: printf ("(little)"); break;
+ default: printf (_("(unknown endianity)")); break;
+ }
+      break;
+
     case DW_AT_virtuality:
       printf ("\t");
       switch (uvalue)
diff --git a/gdb/ChangeLog b/gdb/ChangeLog
index 18224e0..66a8eaf 100644
--- a/gdb/ChangeLog
+++ b/gdb/ChangeLog
@@ -1,3 +1,19 @@ 
+2017-10-06  Peeter Joot  <peeter.joot@lzlabs.com>
+
+ * gdb/gdbtypes.h (global scope): define TYPE_ENDIANITY_BIG,
+ TYPE_ENDIANITY_LITTLE.
+ * binutils/dwarf.c (read_and_display_attr_value): Handle
+ DW_AT_endianity, DW_END_default, DW_END_big, DW_END_little
+ * gdb/dwarf2read.c (read_base_type): Handle DW_END_big, DW_END_little
+ * gdb/gdbtypes.c (check_types_equal): Require matching
+ TYPE_ENDIANITY_BIG, and TYPE_ENDIANITY_LITTLE if set.
+ (recursive_dump_type): Print TYPE_ENDIANITY_BIG, and
+ TYPE_ENDIANITY_LITTLE if set.
+ (struct main_type): Add flag_endianity_big, flag_endianity_little
+ * gdb/printcmd.c (print_scalar_formatted): Use compiler supplied
+ endianness instead of arch endianness if TYPE_ENDIANITY_BIG or
+ TYPE_ENDIANITY_LITTLE is set.
+
 2017-10-06  Yao Qi  <yao.qi@linaro.org>

  * Makefile.in (ALL_64_TARGET_OBS): Replace aarch64-insn.o with
diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c
index 1b15adc..fa2889b 100644
--- a/gdb/dwarf2read.c
+++ b/gdb/dwarf2read.c
@@ -15234,6 +15234,7 @@  read_base_type (struct die_info *die, struct dwarf2_cu *cu)
   struct type *type;
   struct attribute *attr;
   int encoding = 0, bits = 0;
+  int endianity = 0;
   const char *name;

   attr = dwarf2_attr (die, DW_AT_encoding, cu);
@@ -15330,6 +15331,21 @@  read_base_type (struct die_info *die, struct dwarf2_cu *cu)
   if (name && strcmp (name, "char") == 0)
     TYPE_NOSIGN (type) = 1;

+  attr = dwarf2_attr (die, DW_AT_endianity, cu);
+  if ( attr )
+    {
+      endianity = DW_UNSND (attr);
+      switch (endianity)
+        {
+           case DW_END_big:
+              TYPE_ENDIANITY_BIG (type) = 1;
+              break;
+           case DW_END_little:
+              TYPE_ENDIANITY_LITTLE (type) = 1;
+              break;
+        }
+    }
+
   return set_die_type (die, type, cu);
 }

diff --git a/gdb/gdbtypes.c b/gdb/gdbtypes.c
index 73d4453..43f553b 100644
--- a/gdb/gdbtypes.c
+++ b/gdb/gdbtypes.c
@@ -3423,6 +3423,8 @@  check_types_equal (struct type *type1, struct type *type2,
       || TYPE_LENGTH (type1) != TYPE_LENGTH (type2)
       || TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2)
       || TYPE_NOSIGN (type1) != TYPE_NOSIGN (type2)
+      || TYPE_ENDIANITY_BIG (type1) != TYPE_ENDIANITY_BIG (type2)
+      || TYPE_ENDIANITY_LITTLE (type1) != TYPE_ENDIANITY_LITTLE (type2)
       || TYPE_VARARGS (type1) != TYPE_VARARGS (type2)
       || TYPE_VECTOR (type1) != TYPE_VECTOR (type2)
       || TYPE_NOTTEXT (type1) != TYPE_NOTTEXT (type2)
@@ -4460,6 +4462,14 @@  recursive_dump_type (struct type *type, int spaces)
     {
       puts_filtered (" TYPE_NOSIGN");
     }
+  if (TYPE_ENDIANITY_BIG (type))
+    {
+      puts_filtered (" TYPE_ENDIANITY_BIG");
+    }
+  if (TYPE_ENDIANITY_LITTLE (type))
+    {
+      puts_filtered (" TYPE_ENDIANITY_LITTLE");
+    }
   if (TYPE_STUB (type))
     {
       puts_filtered (" TYPE_STUB");
diff --git a/gdb/gdbtypes.h b/gdb/gdbtypes.h
index 009cea9..074aa2c 100644
--- a/gdb/gdbtypes.h
+++ b/gdb/gdbtypes.h
@@ -210,6 +210,16 @@  DEF_ENUM_FLAGS_TYPE (enum type_instance_flag_value, type_instance_flags);

 #define TYPE_NOSIGN(t) (TYPE_MAIN_TYPE (t)->flag_nosign)

+/* * Mixed endian archetectures can supply dwarf instrumentation
+ * that indicates the desired endian interpretation of the variable.
+ * This indicates that the interpretation should be big-endian
+ * even if the cpu is running in little endian mode. */
+#define TYPE_ENDIANITY_BIG(t) (TYPE_MAIN_TYPE (t)->flag_endianity_big)
+
+/* * The type has a little endian interpretation even if the cpu
+ * is running in big endian mode. */
+#define TYPE_ENDIANITY_LITTLE(t) (TYPE_MAIN_TYPE (t)->flag_endianity_little)
+
 /* * This appears in a type's flags word if it is a stub type (e.g.,
    if someone referenced a type that wasn't defined in a source file
    via (struct sir_not_appearing_in_this_film *)).  */
@@ -616,6 +626,8 @@  struct main_type
   unsigned int flag_gnu_ifunc : 1;
   unsigned int flag_fixed_instance : 1;
   unsigned int flag_objfile_owned : 1;
+  unsigned int flag_endianity_big : 1;
+  unsigned int flag_endianity_little : 1;

   /* * True if this type was declared with "class" rather than
      "struct".  */
diff --git a/gdb/printcmd.c b/gdb/printcmd.c
index a8743f1..e714283 100644
--- a/gdb/printcmd.c
+++ b/gdb/printcmd.c
@@ -356,6 +356,15 @@  print_scalar_formatted (const gdb_byte *valaddr, struct type *type,
   unsigned int len = TYPE_LENGTH (type);
   enum bfd_endian byte_order = gdbarch_byte_order (gdbarch);

+  if ( TYPE_ENDIANITY_BIG (type) )
+  {
+    byte_order = BFD_ENDIAN_BIG;
+  }
+  if ( TYPE_ENDIANITY_LITTLE (type) )
+  {
+    byte_order = BFD_ENDIAN_LITTLE;
+  }
+
   /* String printing should go through val_print_scalar_formatted.  */
   gdb_assert (options->format != 's');