Make arm_record_test work on big-endian machines

Message ID fb6d077c-d731-03b9-c9cd-b03a8f1ab161@ericsson.com
State New, archived
Headers

Commit Message

Simon Marchi March 12, 2018, 3:41 p.m. UTC
  On 2018-03-12 10:15, Yao Qi wrote:
> Simon Marchi <simon.marchi@polymtl.ca> writes:
>
>> While running selftests on a big-endian PPC, I noticed arm_record_test
>> failed a test.  The reason is that the gdbarch_find_by_info call is done
>> with BFD_ENDIAN_UNKNOWN byte order in the info structure.  In that case,
>> it uses the byte order of the current default BFD, which happened to be
>> big-endian on that GDB build, and the gdbarch returned is a big-endian
>> one.  The instruction used for the 32-bits part of the test is written
>> in little-endian form, so GDB fails to decode the instruction
>> properly.
>
> The test instructions are endianess-neutral, because they are written as
> uint16_t, and that is why you didn't have to adjust 16-bit thumb
> instructions in the tests.

Ok, and I think there was something confusing me with the order of the Thumb2 instructions
(maybe it's the order the bytes are written in the comment).  From what I understand now
after playing with objdump, the two 16-bit parts of a Thumb2 instructions are not swapped
on little endian, they are considered as two separate instructions (I guess it makes sense,
otherwise it would be impossible to determine whether the next instruction is 16-bit or
32-bit long).  So only the two bytes of each 16-bit chunk are swapped when switching
between big and little endian.  In memory, the bytes for that instruction should look like
this:

  little endian: 0x1d, 0xee, 0x70, 0x7f
  big endian:    0xee, 0x1d, 0x7f, 0x70

And in both cases, we want the value in arm_record->arm_insn to end up containing 0xee1d7f70.  Is that right?

>>
>> Since ARM supports both endiannesses, and it should be possible to debug
>> using an host of both endiannesses, I changed the test to check with
>> gdbarches of both endiannesses.
>
> Although ARM supports both endianesses, the big endian has two variants,
> BE-32 and BE-8.  Before ARMv6, it was BE-32.  In ARMv6, BE-8 must be
> supported, but support of BE-32 is implementation defined.  ARMv7 drops
> the BE-32 support.  With BE-8, data is big endian and code is little
> endian.  Thumb-2 instruction was introduced in armv6t2, so it is still
> likely to have both big endian and little endian thumb-2 instructions
> code.

Ok, thanks for the info.

> As I said, the test case is written in an endianess-neutral way,
>
>     static const uint16_t insns[] = {
>       /* 1d ee 70 7f	 mrc	15, 0, r7, cr13, cr0, {3} */
>       0xee1d, 0x7f70,
>     };

If my assumption above is right, everything looks right until extract_arm_insn.
The 4-byte instruction is extracted with

  insn_record->arm_insn = (uint32_t) extract_unsigned_integer (&buf[0],
                           insn_size,
			   gdbarch_byte_order_for_code (insn_record->gdbarch));

Since the instruction is extracted as a single 4-byte value, on little endian,
extract_arm_insn leaves insn_record->arm_insn equal to 0x7f70ee1d.  On big
endian, it leaves it equal to 0xee1d7f70.  Hence the need for the word swapping
you show below.

In my opinion, it should be extract_arm_insn's responsibility to leave
insn_record->arm_insn with a consistent value regardless of the target/host
endiannesses.  The easiest way would be to read the instruction as two 16-bit
integers instead of one 32-bit one.  Or we can still do the word swap shown below
(only if the code is little endian), but at least I think it would make more sense
to do it in extract_arm_insn.

>
> but the arm process recording processes thumb-2 instruction as one
> 32-bit instruction, so that it has this,
>
> 	  /* Swap first half of 32bit thumb instruction with second half.  */
> 	  arm_record->arm_insn
> 	    = (arm_record->arm_insn >> 16) | (arm_record->arm_insn << 16);

> I wonder this is the cause of the test fail, that is, we don't need to
> swap them in big endian.  Since I've never run GDB tests on arm big
> endian target, I am not very confident on my fix below.  It fixes the
> test fail on ppc64, and also the fail with your patch to change the test
> with two different endianess.

What do you think about the patch below?  Functionally, it should be identical
to what you suggested, but I think it's clear to fix extract_arm_insn instead.


From 925b2f54e53f39b8713dccc7788b3f10fccc374c Mon Sep 17 00:00:00 2001
From: Simon Marchi <simon.marchi@ericsson.com>
Date: Mon, 12 Mar 2018 11:28:29 -0400
Subject: [PATCH] extract_arm_insn: Read 32-bit instruction as two 16-bit words

---
 gdb/arm-tdep.c | 23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)
  

Comments

Yao Qi March 12, 2018, 5:09 p.m. UTC | #1
Simon Marchi <simon.marchi@ericsson.com> writes:

> What do you think about the patch below?  Functionally, it should be identical
> to what you suggested, but I think it's clear to fix extract_arm_insn instead.
>

Patch looks good to me.  Thanks.

> +
> +  insn_record->arm_insn
> +    = (uint32_t) extract_unsigned_integer (&buf[0], 2, endian);
> +
> +  if (insn_size == 4)
> +    {
> +      insn_record->arm_insn <<= 16;
> +      insn_record->arm_insn
> +	|= (uint32_t) extract_unsigned_integer (&buf[2], 2, endian);

Better to add some comments here.
  
Simon Marchi March 12, 2018, 6:53 p.m. UTC | #2
On 2018-03-12 13:09, Yao Qi wrote:
> Simon Marchi <simon.marchi@ericsson.com> writes:
> 
>> What do you think about the patch below?  Functionally, it should be 
>> identical
>> to what you suggested, but I think it's clear to fix extract_arm_insn 
>> instead.
>> 
> 
> Patch looks good to me.  Thanks.
> 
>> +
>> +  insn_record->arm_insn
>> +    = (uint32_t) extract_unsigned_integer (&buf[0], 2, endian);
>> +
>> +  if (insn_size == 4)
>> +    {
>> +      insn_record->arm_insn <<= 16;
>> +      insn_record->arm_insn
>> +	|= (uint32_t) extract_unsigned_integer (&buf[2], 2, endian);
> 
> Better to add some comments here.

Oh, actually, does my patch get it wrong for ARM instructions?  For 
them, it seems like in little endian, the four bytes of an instruction 
are reversed (as if it was a 4-byte value).  So that would explain why 
the word swap would only occur under a if (THUMB2_RECORD == 
record_type).  There is no test with ARM instructions in 
arm_record_test... should there be one?

Simon
  

Patch

diff --git a/gdb/arm-tdep.c b/gdb/arm-tdep.c
index ef7e66b..e92ac13 100644
--- a/gdb/arm-tdep.c
+++ b/gdb/arm-tdep.c
@@ -13089,14 +13089,25 @@  extract_arm_insn (abstract_memory_reader& reader,
 		  insn_decode_record *insn_record, uint32_t insn_size)
 {
   gdb_byte buf[insn_size];
-
   memset (&buf[0], 0, insn_size);

+  gdb_assert (insn_size == 2 || insn_size == 4);
+
   if (!reader.read (insn_record->this_addr, buf, insn_size))
     return 1;
-  insn_record->arm_insn = (uint32_t) extract_unsigned_integer (&buf[0],
-                           insn_size,
-			   gdbarch_byte_order_for_code (insn_record->gdbarch));
+
+  bfd_endian endian = gdbarch_byte_order_for_code (insn_record->gdbarch);
+
+  insn_record->arm_insn
+    = (uint32_t) extract_unsigned_integer (&buf[0], 2, endian);
+
+  if (insn_size == 4)
+    {
+      insn_record->arm_insn <<= 16;
+      insn_record->arm_insn
+	|= (uint32_t) extract_unsigned_integer (&buf[2], 2, endian);
+    }
+
   return 0;
 }

@@ -13188,10 +13199,6 @@  decode_insn (abstract_memory_reader &reader, insn_decode_record *arm_record,
       /* As thumb does not have condition codes, we set negative.  */
       arm_record->cond = -1;

-      /* Swap first half of 32bit thumb instruction with second half.  */
-      arm_record->arm_insn
-	= (arm_record->arm_insn >> 16) | (arm_record->arm_insn << 16);
-
       ret = thumb2_record_decode_insn_handler (arm_record);

       if (ret != ARM_RECORD_SUCCESS)