arc: Select CPU model properly before disassembling

Message ID 20170601160213.16092-1-Anton.Kolesov@synopsys.com
State New, archived
Headers

Commit Message

Anton Kolesov June 1, 2017, 4:02 p.m. UTC
  Enforce CPU model for disassembler via its options, if it was specified in XML
target description, otherwise use default method of determining CPU implemented
in disassembler - scanning ELF private header.  The latter requires
disassemble_info->section to be properly initialized.  Unfortunately the latter
trick will be used only when doing semantic disassembling to analyze
instructions, because arc_delayed_print_insn is not called to print
instructions starting with [1].  Maybe it would be possible to fix that by
altering opcodes/disassemble.c:disassemble_init_for_target somehow.

Support for CPU in disassembler options for ARC has been added in [2].

[1] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=39503f82427e22ed8e04d986ccdc8562091ec62e
[2] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=10045478d984f9924cb945423388ba25b7dd3ffe

gdb/ChangeLog:

yyyy-mm-dd  Anton Kolesov  <anton.kolesov@synopsys.com>

	* arc-tdep.c (arc_disassembler_options): New variable.
	(arc_gdbarch_init): Set it.
	(arc_delayed_print_insn): Set info->section when needed.
---
 gdb/arc-tdep.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)
  

Comments

Simon Marchi June 5, 2017, 8:07 p.m. UTC | #1
On 2017-06-01 18:02, Anton Kolesov wrote:
> Enforce CPU model for disassembler via its options, if it was specified 
> in XML
> target description, otherwise use default method of determining CPU 
> implemented
> in disassembler - scanning ELF private header.  The latter requires
> disassemble_info->section to be properly initialized.  Unfortunately 
> the latter
> trick will be used only when doing semantic disassembling to analyze
> instructions, because arc_delayed_print_insn is not called to print
> instructions starting with [1].  Maybe it would be possible to fix that 
> by
> altering opcodes/disassemble.c:disassemble_init_for_target somehow.
> 
> Support for CPU in disassembler options for ARC has been added in [2].
> 
> [1]
> https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=39503f82427e22ed8e04d986ccdc8562091ec62e
> [2]
> https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=10045478d984f9924cb945423388ba25b7dd3ffe

Hi Anton,

This is not very clear to me (I haven't really touched disassembly yet), 
so let me train to summarize the situation.  Let's say you feed the 
right executable to GDB, but connect to a gdbserver that doesn't report 
the cpu arch in the target description.  arc_disassembler_options won't 
be set, so we won't pass an explicit disassembler option.  opcodes 
internals could try to fall back on data from the BFD if 
disassemble_info::section was set.  However, this is only done in 
arc_delayed_print_insn, which is not used in that case (it was removed 
as a gdbarch callback in commit [2]).  Does that sound right?

Could gdb_disassembler set the section field of disassemble_info itself? 
  What you are doing to set the section field here is not ARC-specific, 
it looks like it could potentially help other architectures to have that 
field set.

Other places that use the disassembler would have to set it too, for 
example in the object prepared by the arc_disassemble_info function.

> gdb/ChangeLog:
> 
> yyyy-mm-dd  Anton Kolesov  <anton.kolesov@synopsys.com>
> 
> 	* arc-tdep.c (arc_disassembler_options): New variable.
> 	(arc_gdbarch_init): Set it.
> 	(arc_delayed_print_insn): Set info->section when needed.
> ---
>  gdb/arc-tdep.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 44 insertions(+)
> 
> diff --git a/gdb/arc-tdep.c b/gdb/arc-tdep.c
> index d9ee5c6..c0b8a14 100644
> --- a/gdb/arc-tdep.c
> +++ b/gdb/arc-tdep.c
> @@ -145,6 +145,8 @@ static const char *const 
> core_arcompact_register_names[] = {
>    "lp_count", "reserved", "limm", "pcl",
>  };
> 
> +static char *arc_disassembler_options = NULL;
> +
>  /* Functions are sorted in the order as they are used in the
>     _initialize_arc_tdep (), which uses the same order as gdbarch.h.  
> Static
>     functions are defined before the first invocation.  */
> @@ -1410,6 +1412,31 @@ arc_delayed_print_insn (bfd_vma addr, struct
> disassemble_info *info)
>       will handle NULL value gracefully.  */
>    print_insn = arc_get_disassembler (exec_bfd);
>    gdb_assert (print_insn != NULL);
> +
> +  /* Standard BFD "machine number" field allows libocodes disassembler 
> to
> +     distinguish ARC 600, 700 and v2 cores, however v2 encompasses 
> both ARC EM
> +     and HS, which have some difference between.  There are two ways 
> to specify
> +     what is the target core:
> +     1) via the disassemble_info->disassembler_options;
> +     2) otherwise libopcodes will use private (architecture-specific) 
> ELF
> +     header.
> +
> +     Using disassembler_options is preferable, because it comes 
> directly from
> +     GDBserver which scanned an actual ARC core identification info.  
> However,
> +     not all GDBservers report core architecture, so as a fallback GDB 
> still
> +     should support analysis of ELF header.  The libopcodes 
> disassembly code
> +     uses the section to find the BFD and the BFD to find the ELF 
> header,
> +     therefore this function should set disassemble_info->section 
> properly.
> +
> +     disassembler_options was already set by non-target specific code 
> with
> +     proper options obtained via gdbarch_disassembler_options ().  */
> +  if (info->disassembler_options == NULL)
> +    {
> +      struct obj_section * s = find_pc_section (addr);
> +      if (s != NULL)
> +	info->section = s->the_bfd_section;
> +    }
> +
>    return print_insn (addr, info);
>  }
> 
> @@ -2041,6 +2068,23 @@ arc_gdbarch_init (struct gdbarch_info info,
> struct gdbarch_list *arches)
>    if (tdep->jb_pc >= 0)
>      set_gdbarch_get_longjmp_target (gdbarch, arc_get_longjmp_target);
> 
> +  /* Disassembler options.  Enforce CPU if it was specified in XML 
> target
> +     description, otherwise use default method of determining CPU (ELF 
> private
> +     header).  */
> +  if (info.target_desc != NULL)
> +    {
> +      const struct bfd_arch_info *tdesc_arch
> +	= tdesc_architecture (info.target_desc);
> +      if (tdesc_arch != NULL)
> +	{
> +	  std::string disasm_options
> +	    = string_printf ("cpu=%s", tdesc_arch->printable_name);
> +	  arc_disassembler_options = xstrdup (disasm_options.c_str ());

You could shorten this by using xstrprintf directly.

> +	  set_gdbarch_disassembler_options (gdbarch,
> +					    &arc_disassembler_options);
> +	}
> +    }
> +
>    tdesc_use_registers (gdbarch, tdesc, tdesc_data);
> 
>    return gdbarch;

This part, which sets the disassembler options based on the target 
description, looks good to me.

Simon
  
Anton Kolesov June 6, 2017, 1:51 p.m. UTC | #2
Hi Simon

> >
> > Support for CPU in disassembler options for ARC has been added in [2].
> >
> > [1]
> > https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=39503f82427e22ed8e04d986ccdc8562091ec62e
> > [2]
> > https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=10045478d984f9924cb945423388ba25b7dd3ffe
 
> Hi Anton,
> 
> This is not very clear to me (I haven't really touched disassembly yet),
> so let me train to summarize the situation.  Let's say you feed the
> right executable to GDB, but connect to a gdbserver that doesn't report
> the cpu arch in the target description.  arc_disassembler_options won't
> be set, so we won't pass an explicit disassembler option.  opcodes
> internals could try to fall back on data from the BFD if
> disassemble_info::section was set.  However, this is only done in
> arc_delayed_print_insn, which is not used in that case (it was removed
> as a gdbarch callback in commit [2]).  Does that sound right?

Yes, your understanding is correct. For ARC this change will not be a
regression, because today disassembler never selects properly between ARC EM
and HS at all cases - this patch will fix some cases, but not the others, that
why I feel this makes sense.

> 
> Could gdb_disassembler set the section field of disassemble_info itself?
>   What you are doing to set the section field here is not ARC-specific,
> it looks like it could potentially help other architectures to have that
> field set.
> 
> Other places that use the disassembler would have to set it too, for
> example in the object prepared by the arc_disassemble_info function.

I think this info->section initialization code can be moved to
arch-utils.c:default_print_insn, however I'm not sure if that wouldn't cause
any troubles with other architectures. Doing initialization in gdb_disassembler
constructor is complicated, because we don't really know what would be the
disassembled address at this stage, hence what would be the section.
Gdb_dissassembler construction can use .text section as a default, but then
might now work with some multi-arch ELFs, I presume (unlikely to be a
problem for ARC, though).

Another option is, of course, to partially revert [2] for ARC - make
arc_delayed_print_insn a printer for ARC, but change it, so it will only
set info->section and then call default_print_insn. I believe that same change
is needed in mep-tdep.c. At least for ARC we would need to use print_insn
anyway to disassemble, because opcodes/arc-dis.c:arc_insn_decode relies on it.

> > +	  std::string disasm_options
> > +	    = string_printf ("cpu=%s", tdesc_arch->printable_name);
> > +	  arc_disassembler_options = xstrdup (disasm_options.c_str ());
> 
> You could shorten this by using xstrprintf directly.

Will do in v2.

Anton

> 
> > +	  set_gdbarch_disassembler_options (gdbarch,
> > +					    &arc_disassembler_options);
> > +	}
> > +    }
> > +
> >    tdesc_use_registers (gdbarch, tdesc, tdesc_data);
> >
> >    return gdbarch;
> 
> This part, which sets the disassembler options based on the target
> description, looks good to me.
> 
> Simon
  
Simon Marchi June 13, 2017, 9:31 p.m. UTC | #3
On 2017-06-06 15:51, Anton Kolesov wrote:
>> Could gdb_disassembler set the section field of disassemble_info 
>> itself?
>>   What you are doing to set the section field here is not 
>> ARC-specific,
>> it looks like it could potentially help other architectures to have 
>> that
>> field set.
>> 
>> Other places that use the disassembler would have to set it too, for
>> example in the object prepared by the arc_disassemble_info function.
> 
> I think this info->section initialization code can be moved to
> arch-utils.c:default_print_insn, however I'm not sure if that wouldn't 
> cause
> any troubles with other architectures.

I don't see why setting the section would cause a problem.  If it does, 
I'd say it's more likely a bug in these other architectures code.  You 
could argue that for architectures that don't have disassembler options, 
it would be some wasted cycles though...

> Doing initialization in gdb_disassembler
> constructor is complicated, because we don't really know what would be 
> the
> disassembled address at this stage, hence what would be the section.
> Gdb_dissassembler construction can use .text section as a default, but 
> then
> might now work with some multi-arch ELFs, I presume (unlikely to be a
> problem for ARC, though).

That might be a good default behavior, to be overridden by the few 
architectures that have some more corner cases.

> Another option is, of course, to partially revert [2] for ARC - make
> arc_delayed_print_insn a printer for ARC, but change it, so it will 
> only
> set info->section and then call default_print_insn. I believe that same 
> change
> is needed in mep-tdep.c. At least for ARC we would need to use 
> print_insn
> anyway to disassemble, because opcodes/arc-dis.c:arc_insn_decode relies 
> on it.

That's fine with me.

Simon
  

Patch

diff --git a/gdb/arc-tdep.c b/gdb/arc-tdep.c
index d9ee5c6..c0b8a14 100644
--- a/gdb/arc-tdep.c
+++ b/gdb/arc-tdep.c
@@ -145,6 +145,8 @@  static const char *const core_arcompact_register_names[] = {
   "lp_count", "reserved", "limm", "pcl",
 };
 
+static char *arc_disassembler_options = NULL;
+
 /* Functions are sorted in the order as they are used in the
    _initialize_arc_tdep (), which uses the same order as gdbarch.h.  Static
    functions are defined before the first invocation.  */
@@ -1410,6 +1412,31 @@  arc_delayed_print_insn (bfd_vma addr, struct disassemble_info *info)
      will handle NULL value gracefully.  */
   print_insn = arc_get_disassembler (exec_bfd);
   gdb_assert (print_insn != NULL);
+
+  /* Standard BFD "machine number" field allows libocodes disassembler to
+     distinguish ARC 600, 700 and v2 cores, however v2 encompasses both ARC EM
+     and HS, which have some difference between.  There are two ways to specify
+     what is the target core:
+     1) via the disassemble_info->disassembler_options;
+     2) otherwise libopcodes will use private (architecture-specific) ELF
+     header.
+
+     Using disassembler_options is preferable, because it comes directly from
+     GDBserver which scanned an actual ARC core identification info.  However,
+     not all GDBservers report core architecture, so as a fallback GDB still
+     should support analysis of ELF header.  The libopcodes disassembly code
+     uses the section to find the BFD and the BFD to find the ELF header,
+     therefore this function should set disassemble_info->section properly.
+
+     disassembler_options was already set by non-target specific code with
+     proper options obtained via gdbarch_disassembler_options ().  */
+  if (info->disassembler_options == NULL)
+    {
+      struct obj_section * s = find_pc_section (addr);
+      if (s != NULL)
+	info->section = s->the_bfd_section;
+    }
+
   return print_insn (addr, info);
 }
 
@@ -2041,6 +2068,23 @@  arc_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   if (tdep->jb_pc >= 0)
     set_gdbarch_get_longjmp_target (gdbarch, arc_get_longjmp_target);
 
+  /* Disassembler options.  Enforce CPU if it was specified in XML target
+     description, otherwise use default method of determining CPU (ELF private
+     header).  */
+  if (info.target_desc != NULL)
+    {
+      const struct bfd_arch_info *tdesc_arch
+	= tdesc_architecture (info.target_desc);
+      if (tdesc_arch != NULL)
+	{
+	  std::string disasm_options
+	    = string_printf ("cpu=%s", tdesc_arch->printable_name);
+	  arc_disassembler_options = xstrdup (disasm_options.c_str ());
+	  set_gdbarch_disassembler_options (gdbarch,
+					    &arc_disassembler_options);
+	}
+    }
+
   tdesc_use_registers (gdbarch, tdesc, tdesc_data);
 
   return gdbarch;