Fix alignment of disassemble /r

Message ID CAF5HaEUBbK4d2reSxUpUQwp7Zt_KjcpNC_nbUrMHRb0yNo9wtw@mail.gmail.com
State Changes Requested, archived
Headers

Commit Message

Daniel Gutson April 11, 2014, 10:24 p.m. UTC
  Hi,

   when disassembling in raw mode (/r) in a variable-length insn
architecture (i.e. x86),
the output can be completely messed since no alignment takes place.

I am aware of the uiout->table stuff, but it seems an overkill since I
should change
the current_uiout when disassembling in this mode (and I didn't find
any actual use of this
machinery at least for x86).
Therefore, I added a hack in the dump_insns when the /r flag is specified.
This clearly isn't the cutiest thing in the world, and I specified a
hardcoded maximum number
of opcode bytes to align (currently 8) though it is easily changeable.

Please let me know if this approach is OK or I should do something
else. Maybe consider whether
current arch is insn-len variable?

If this happens to be OK, please commit it for me since I don't have
write access.

Thanks,

   Daniel.


2014-04-11  Daniel Gutson  <daniel.gutson@tallertechnologies.com>

        * disasm.c (dump_insns): Added right alignment when showing opcodes.
  

Comments

Daniel Gutson April 16, 2014, 12:20 p.m. UTC | #1
Ping for maintainer feedback.

Thanks,

    Daniel.

On Fri, Apr 11, 2014 at 7:24 PM, Daniel Gutson
<daniel.gutson@tallertechnologies.com> wrote:
> Hi,
>
>    when disassembling in raw mode (/r) in a variable-length insn
> architecture (i.e. x86),
> the output can be completely messed since no alignment takes place.
>
> I am aware of the uiout->table stuff, but it seems an overkill since I
> should change
> the current_uiout when disassembling in this mode (and I didn't find
> any actual use of this
> machinery at least for x86).
> Therefore, I added a hack in the dump_insns when the /r flag is specified.
> This clearly isn't the cutiest thing in the world, and I specified a
> hardcoded maximum number
> of opcode bytes to align (currently 8) though it is easily changeable.
>
> Please let me know if this approach is OK or I should do something
> else. Maybe consider whether
> current arch is insn-len variable?
>
> If this happens to be OK, please commit it for me since I don't have
> write access.
>
> Thanks,
>
>    Daniel.
>
>
> 2014-04-11  Daniel Gutson  <daniel.gutson@tallertechnologies.com>
>
>         * disasm.c (dump_insns): Added right alignment when showing opcodes.
  
Yao Qi April 17, 2014, 1:07 a.m. UTC | #2
On 04/12/2014 06:24 AM, Daniel Gutson wrote:
>    when disassembling in raw mode (/r) in a variable-length insn
> architecture (i.e. x86),
> the output can be completely messed since no alignment takes place.

The /r output is messed, but not completely :).

> 
> I am aware of the uiout->table stuff, but it seems an overkill since I
> should change
> the current_uiout when disassembling in this mode (and I didn't find
> any actual use of this
> machinery at least for x86).
> Therefore, I added a hack in the dump_insns when the /r flag is specified.
> This clearly isn't the cutiest thing in the world, and I specified a
> hardcoded maximum number
> of opcode bytes to align (currently 8) though it is easily changeable.

Hard-coded opcode length will affect other targets.  For example, without
your patch, the disassembly for c6x is like,
(gdb) disassemble /r main
Dump of assembler code for function main:                                                                                                                     
   0x00000860 <+0>:     c2 1b be 07     subah .D2 b15,16,b15
   0x00000864 <+4>:     f6 02 3d 07     stw .D2T2 b14,*+b15(32)

with your patch applied, it becomes

(gdb) disassemble /r main                                                                                                 
Dump of assembler code for function main:                                                                                                                     
   0x00000860 <+0>:     c2 1b be 07             subah .D2 b15,16,b15
   0x00000864 <+4>:     f6 02 3d 07             stw .D2T2 b14,*+b15(32)

gdbarch_max_insn_length can tell us the max instruction length, but if we
align the instruction to the max length (it is 16 on x86), the text
instruction will be far behind the hex code, which is a little ugly.
  
Daniel Gutson April 17, 2014, 2:37 p.m. UTC | #3
On Wed, Apr 16, 2014 at 10:07 PM, Yao Qi <yao@codesourcery.com> wrote:
> On 04/12/2014 06:24 AM, Daniel Gutson wrote:
>>    when disassembling in raw mode (/r) in a variable-length insn
>> architecture (i.e. x86),
>> the output can be completely messed since no alignment takes place.
>
> The /r output is messed, but not completely :).
>
>>
>> I am aware of the uiout->table stuff, but it seems an overkill since I
>> should change
>> the current_uiout when disassembling in this mode (and I didn't find
>> any actual use of this
>> machinery at least for x86).
>> Therefore, I added a hack in the dump_insns when the /r flag is specified.
>> This clearly isn't the cutiest thing in the world, and I specified a
>> hardcoded maximum number
>> of opcode bytes to align (currently 8) though it is easily changeable.
>
> Hard-coded opcode length will affect other targets.  For example, without
> your patch, the disassembly for c6x is like,
> (gdb) disassemble /r main
> Dump of assembler code for function main:
>    0x00000860 <+0>:     c2 1b be 07     subah .D2 b15,16,b15
>    0x00000864 <+4>:     f6 02 3d 07     stw .D2T2 b14,*+b15(32)
>
> with your patch applied, it becomes
>
> (gdb) disassemble /r main
> Dump of assembler code for function main:
>    0x00000860 <+0>:     c2 1b be 07             subah .D2 b15,16,b15
>    0x00000864 <+4>:     f6 02 3d 07             stw .D2T2 b14,*+b15(32)
>
> gdbarch_max_insn_length can tell us the max instruction length, but if we
> align the instruction to the max length (it is 16 on x86), the text
> instruction will be far behind the hex code, which is a little ugly.

Hi Yao, thanks for answering.

So what if I add a new configuration variable, such as
     set disassemble-raw-alignment
with "off" as default, and if set to on, pad to gdbarch_max_insn_length ?

>
> --
> Yao (齐尧)
  
Doug Evans April 17, 2014, 4:27 p.m. UTC | #4
On Thu, Apr 17, 2014 at 7:37 AM, Daniel Gutson
<daniel.gutson@tallertechnologies.com> wrote:
> So what if I add a new configuration variable, such as
>      set disassemble-raw-alignment
> with "off" as default, and if set to on, pad to gdbarch_max_insn_length ?

Presumably some frontends will do their own alignment.

If we went with disassemble-raw-alignment, a boolean value won't help
x86 much, it's either no alignment or (in general) too much
whitespace.
An improvement would be a value from min-insn-length to
max-insn-length, but that would be problematic in a multi-arch
debugging scenario.

If we could agree on some minimum alignment for each variable-length
ISA (5 would be fine for me for x86) then maybe a boolean value could
be useful ("off" = no alignment, "on" = employ arch-specific minimum).

OTOH, what if we made two passes over the instructions, with the first
pass computing the maximum instruction length that is present?
[And maybe only doing this for CLI.]
  
Daniel Gutson April 17, 2014, 5:27 p.m. UTC | #5
On Thu, Apr 17, 2014 at 1:27 PM, Doug Evans <dje@google.com> wrote:
> On Thu, Apr 17, 2014 at 7:37 AM, Daniel Gutson
> <daniel.gutson@tallertechnologies.com> wrote:
>> So what if I add a new configuration variable, such as
>>      set disassemble-raw-alignment
>> with "off" as default, and if set to on, pad to gdbarch_max_insn_length ?
>
> Presumably some frontends will do their own alignment.
>
> If we went with disassemble-raw-alignment, a boolean value won't help
> x86 much, it's either no alignment or (in general) too much
> whitespace.
> An improvement would be a value from min-insn-length to
> max-insn-length, but that would be problematic in a multi-arch
> debugging scenario.
>
> If we could agree on some minimum alignment for each variable-length
> ISA (5 would be fine for me for x86) then maybe a boolean value could
> be useful ("off" = no alignment, "on" = employ arch-specific minimum).
>
> OTOH, what if we made two passes over the instructions, with the first
> pass computing the maximum instruction length that is present?

I already considered this, but thought that it would be going to be
rejected due to be too much non-performant. Wouldn't each pass
translate in a lot of MI messaging in a case of a remote server? And,
what about screen paginig? I shouldn't iterate over all the range, but
the screen height range only.

I can go for any of the proposed solutions. 5 insn-length was fine for
me. On a side note, I did this since I was debugging some
self-modifying-code where the mis-alignment was driving me crazy.
The two-passes solution is the one I dislike more IMVHO.

Yet another option is to set the disassembly raw alignment width, something like
    set disassemble-raw-width 5
(5 is the amount of insn; 0 means no padding equal to the current
behavior and that would be the default value).

  Daniel.

> [And maybe only doing this for CLI.]
  
Doug Evans April 17, 2014, 6:19 p.m. UTC | #6
On Thu, Apr 17, 2014 at 10:27 AM, Daniel Gutson
<daniel.gutson@tallertechnologies.com> wrote:
> On Thu, Apr 17, 2014 at 1:27 PM, Doug Evans <dje@google.com> wrote:
>> On Thu, Apr 17, 2014 at 7:37 AM, Daniel Gutson
>> <daniel.gutson@tallertechnologies.com> wrote:
>>> So what if I add a new configuration variable, such as
>>>      set disassemble-raw-alignment
>>> with "off" as default, and if set to on, pad to gdbarch_max_insn_length ?
>>
>> Presumably some frontends will do their own alignment.
>>
>> If we went with disassemble-raw-alignment, a boolean value won't help
>> x86 much, it's either no alignment or (in general) too much
>> whitespace.
>> An improvement would be a value from min-insn-length to
>> max-insn-length, but that would be problematic in a multi-arch
>> debugging scenario.
>>
>> If we could agree on some minimum alignment for each variable-length
>> ISA (5 would be fine for me for x86) then maybe a boolean value could
>> be useful ("off" = no alignment, "on" = employ arch-specific minimum).
>>
>> OTOH, what if we made two passes over the instructions, with the first
>> pass computing the maximum instruction length that is present?
>
> I already considered this, but thought that it would be going to be
> rejected due to be too much non-performant. Wouldn't each pass
> translate in a lot of MI messaging in a case of a remote server?

[nit: If by "remote server" you mean things like gdbserver, then MI
isn't involved.]
Yes, it *could* involve extra traffic between gdb and gdbserver (or
whatever stub is being used) but we have caching now that should
mitigate that.
Still, it's a valid concern.

> And,
> what about screen paginig? I shouldn't iterate over all the range, but
> the screen height range only.

I guess one could go that route but IMO it's not necessary.
OTOH, One *could* also do the max-length computation in bunches that
took advantage of the cache size thus guaranteeing no additional
gdb/gdbserver traffic.
Not my first choice, just a possibility.
[gdb has a lot of knobs, just want to make sure no stone is left
unturned before we add this one :-)]

> I can go for any of the proposed solutions. 5 insn-length was fine for
> me. On a side note, I did this since I was debugging some
> self-modifying-code where the mis-alignment was driving me crazy.
> The two-passes solution is the one I dislike more IMVHO.
>
> Yet another option is to set the disassembly raw alignment width, something like
>     set disassemble-raw-width 5
> (5 is the amount of insn; 0 means no padding equal to the current
> behavior and that would be the default value).

The problem with this is that the setting is arch-specific.
Not that one can do this today (or maybe one can in certain
situations, e.g. Cell), but if I'm debugging two different programs
from two different architectures it's possible any one value of
disassemble-raw-width is going to be painful.
One *could* have prefix commands for each architecture, and then one could have

set arch-$arch disassemble-raw-width N
or
set arch $arch disassemble-raw-width N

[again, just a possibility, not sure I prefer it]

Sorry for not having a definitive suggestion.
Maybe others will express an opinion and we can go from there.
  
Yao Qi April 21, 2014, 12:54 a.m. UTC | #7
On 04/18/2014 01:27 AM, Daniel Gutson wrote:
> I already considered this, but thought that it would be going to be
> rejected due to be too much non-performant. Wouldn't each pass
> translate in a lot of MI messaging in a case of a remote server? And,

If you meant "rsp packets" rather than "MI messaging", we don't worry
about the performance much here.  disassemble uses code cache
(target_read_code) to read instructions from remote server and the
following read to the same area will hit the cache.

> what about screen paginig? I shouldn't iterate over all the range, but
> the screen height range only.

What is the reason do you think we shouldn't iterator over all the
range?  IMO, screen height and alignment are orthogonal.
  
Daniel Gutson April 21, 2014, 4 p.m. UTC | #8
On Sun, Apr 20, 2014 at 9:54 PM, Yao Qi <yao@codesourcery.com> wrote:
> On 04/18/2014 01:27 AM, Daniel Gutson wrote:
>> I already considered this, but thought that it would be going to be
>> rejected due to be too much non-performant. Wouldn't each pass
>> translate in a lot of MI messaging in a case of a remote server? And,
>
> If you meant "rsp packets" rather than "MI messaging",

Yes, my bad, sorry.

we don't worry
> about the performance much here.  disassemble uses code cache
> (target_read_code) to read instructions from remote server and the
> following read to the same area will hit the cache.
>
>> what about screen paginig? I shouldn't iterate over all the range, but
>> the screen height range only.
>
> What is the reason do you think we shouldn't iterator over all the
> range?  IMO, screen height and alignment are orthogonal.

Suppose your function takes two screens of disassembly.
Then  suppose that the instructions that fit in the first screen take
N bytes at most,
and those of the second takes N+k. If I ignore screen paging, I would
be aligning
the first screen to N+k too, whereas those k are just useless space.
It's a minor and cosmetic concern, but I'd like to get it right and
pretty in one shot :)

   Daniel.

>
> --
> Yao (齐尧)
  
Daniel Gutson April 22, 2014, 12:13 p.m. UTC | #9
OK timeout, I wil just KISS and implement the 2-passes solution
without considering paging.

I'll be back with the new version of the patch.

Thanks for your comments!

  Daniel.

On Mon, Apr 21, 2014 at 1:00 PM, Daniel Gutson
<daniel.gutson@tallertechnologies.com> wrote:
> On Sun, Apr 20, 2014 at 9:54 PM, Yao Qi <yao@codesourcery.com> wrote:
>> On 04/18/2014 01:27 AM, Daniel Gutson wrote:
>>> I already considered this, but thought that it would be going to be
>>> rejected due to be too much non-performant. Wouldn't each pass
>>> translate in a lot of MI messaging in a case of a remote server? And,
>>
>> If you meant "rsp packets" rather than "MI messaging",
>
> Yes, my bad, sorry.
>
> we don't worry
>> about the performance much here.  disassemble uses code cache
>> (target_read_code) to read instructions from remote server and the
>> following read to the same area will hit the cache.
>>
>>> what about screen paginig? I shouldn't iterate over all the range, but
>>> the screen height range only.
>>
>> What is the reason do you think we shouldn't iterator over all the
>> range?  IMO, screen height and alignment are orthogonal.
>
> Suppose your function takes two screens of disassembly.
> Then  suppose that the instructions that fit in the first screen take
> N bytes at most,
> and those of the second takes N+k. If I ignore screen paging, I would
> be aligning
> the first screen to N+k too, whereas those k are just useless space.
> It's a minor and cosmetic concern, but I'd like to get it right and
> pretty in one shot :)
>
>    Daniel.
>
>>
>> --
>> Yao (齐尧)
>
>
>
> --
>
> Daniel F. Gutson
> Chief Engineering Officer, SPD
>
>
> San Lorenzo 47, 3rd Floor, Office 5
>
> Córdoba, Argentina
>
>
> Phone: +54 351 4217888 / +54 351 4218211
>
> Skype: dgutson
  
Daniel Gutson March 4, 2016, 8:41 p.m. UTC | #10
FWIW: https://sourceware.org/bugzilla/show_bug.cgi?id=19768

On Tue, Apr 22, 2014 at 9:13 AM, Daniel Gutson
<daniel.gutson@tallertechnologies.com> wrote:
> OK timeout, I wil just KISS and implement the 2-passes solution
> without considering paging.
>
> I'll be back with the new version of the patch.
>
> Thanks for your comments!
>
>   Daniel.
>
> On Mon, Apr 21, 2014 at 1:00 PM, Daniel Gutson
> <daniel.gutson@tallertechnologies.com> wrote:
>> On Sun, Apr 20, 2014 at 9:54 PM, Yao Qi <yao@codesourcery.com> wrote:
>>> On 04/18/2014 01:27 AM, Daniel Gutson wrote:
>>>> I already considered this, but thought that it would be going to be
>>>> rejected due to be too much non-performant. Wouldn't each pass
>>>> translate in a lot of MI messaging in a case of a remote server? And,
>>>
>>> If you meant "rsp packets" rather than "MI messaging",
>>
>> Yes, my bad, sorry.
>>
>> we don't worry
>>> about the performance much here.  disassemble uses code cache
>>> (target_read_code) to read instructions from remote server and the
>>> following read to the same area will hit the cache.
>>>
>>>> what about screen paginig? I shouldn't iterate over all the range, but
>>>> the screen height range only.
>>>
>>> What is the reason do you think we shouldn't iterator over all the
>>> range?  IMO, screen height and alignment are orthogonal.
>>
>> Suppose your function takes two screens of disassembly.
>> Then  suppose that the instructions that fit in the first screen take
>> N bytes at most,
>> and those of the second takes N+k. If I ignore screen paging, I would
>> be aligning
>> the first screen to N+k too, whereas those k are just useless space.
>> It's a minor and cosmetic concern, but I'd like to get it right and
>> pretty in one shot :)
>>
>>    Daniel.
>>
>>>
>>> --
>>> Yao (齐尧)
>>
>>
>>
>> --
>>
>> Daniel F. Gutson
>> Chief Engineering Officer, SPD
>>
>>
>> San Lorenzo 47, 3rd Floor, Office 5
>>
>> Córdoba, Argentina
>>
>>
>> Phone: +54 351 4217888 / +54 351 4218211
>>
>> Skype: dgutson
>
>
>
> --
>
> Daniel F. Gutson
> Chief Engineering Officer, SPD
>
>
> San Lorenzo 47, 3rd Floor, Office 5
>
> Córdoba, Argentina
>
>
> Phone: +54 351 4217888 / +54 351 4218211
>
> Skype: dgutson
  

Patch

diff --git a/gdb/disasm.c b/gdb/disasm.c
index d94225b..0fd5aa1 100644
--- a/gdb/disasm.c
+++ b/gdb/disasm.c
@@ -107,6 +107,13 @@  dump_insns (struct gdbarch *gdbarch, struct ui_out *uiout,
   int offset;
   int line;
   struct cleanup *ui_out_chain;
+  /* This array holds enough space for 8 bytes of opcodes;
+     since the format is 02x plus the space, the length
+     the array is (2chars + 1space) * 8 bytes + 1zero = 25 chars.  */
+  char right_align[8 * 3 + 1];
+
+  memset(right_align, ' ', sizeof(right_align) - 2);
+  right_align[sizeof(right_align) - 1] = 0;
 
   for (pc = low; pc < high;)
     {
@@ -154,6 +161,7 @@  dump_insns (struct gdbarch *gdbarch, struct ui_out *uiout,
           bfd_byte data;
           int status;
           const char *spacer = "";
+          unsigned int alignment_pos = 0;
 
           /* Build the opcodes using a temporary stream so we can
              write them out in a single go for the MI.  */
@@ -170,8 +178,11 @@  dump_insns (struct gdbarch *gdbarch, struct ui_out *uiout,
               fprintf_filtered (opcode_stream, "%s%02x",
                                 spacer, (unsigned) data);
               spacer = " ";
+              alignment_pos += 3;
             }
           ui_out_field_stream (uiout, "opcodes", opcode_stream);
+          if (alignment_pos < sizeof(right_align))
+            ui_out_text (uiout, &right_align[alignment_pos]);
           ui_out_text (uiout, "\t");
 
           do_cleanups (cleanups);