Clear upper bits during sign extension

Message ID 54A29886.8030603@redhat.com
State New, archived
Headers

Commit Message

Pedro Alves Dec. 30, 2014, 12:20 p.m. UTC
  On 12/30/2014 09:19 AM, Yao Qi wrote:
> Pedro Alves <palves@redhat.com> writes:
> 
>> This seems to me to paper over an issue elsewhere, and is likely
>> to paper over issues as gdb_sign_extend is used more throughout.
>>
>> I'm not immediately familiar with all the conditions indirect_pieced_value
>> is called, but going by the comment quoted, I think the root issue
>> might be that we shouldn't use value_as_address in the first place,
>> but something like unpack_long directly.
> 
> indirect_pieced_value is called by value_ind, in which its argument ARG1
> should be regarded as an address, IMO.
> 
> #0  indirect_pieced_value (value=0x8af0fd8) at ../../../git/gdb/dwarf2loc.c:2006
> #1  0x081d99fa in value_ind (arg1=0x8af0fd8) at ../../../git/gdb/valops.c:1548
> #2  0x081de7f2 in value_subscript (array=0x8b41678, index=-2) at ../../../git/gdb/valarith.c:181
> 
> See value_ind's comment:
> 
> /* Given a value of a pointer type, apply the C unary * operator to
>    it.  */
> 
> struct value *
> value_ind (struct value *arg1)
> 
>>
>> E.g., I don't see how it makes sense to interpret -2 as an address
>> on spu, which ends up calling:
> 
> -2 is *not* the address in this case.  The address is 0xfffffffe, and
>  sign extended to 64-bit (0xfffffffffffffffe) on MIPS target.

Well, that -2 is being interpreted as an address, given value_as_address is
called on it.   From your original post:

>> in the first test, 'd[-2]' is processed by GDB as '* (&d[-2])'.  'd'
>> is a synthetic pointer, so its value is zero, the address of 'd[-2]'
>> is -2.  In dwarf2loc.c:indirect_pieced_value,
>>
>>   /* This is an offset requested by GDB, such as value subscripts.
>>      However, due to how synthetic pointers are implemented, this is
>>      always presented to us as a pointer type.  This means we have to
>>      sign-extend it manually as appropriate.  */
>>   byte_offset = value_as_address (value);                  <---- [1]

...

>> on MIPS target, after [1], byte_offset is -2 (0xfffffffffffffffe),
>> because 32-bit -2 (as an address) is sign extended to 64-bit.  After

> Sorry, I don't understand how is gdbarch_integer_to_address hook related
> to this problem.  The address (0xfffffffe) is the address of synthetic
> pointer, instead of the actual address.

I thought value_as_address was reaching the call to gdbarch_integer_to_address.
But given indirect_pieced_value has this at the top:

  if (TYPE_CODE (type) != TYPE_CODE_PTR)
    return NULL;

we know we're handling a TYPE_CODE_PTR.

That means that instead, value_as_address is calling unpack_long
at the bottom, which then calls extract_typed_address, which calls
gdbarch_pointer_to_address.  The same point applies.  The default of
that hook is unsigned_pointer_to_address.  But on MIPS, the hook
calls signed_pointer_to_address, which does the sign extension.

That would suggest the fix to be to do something like:

  /* This is an offset requested by GDB, such as value subscripts.
     However, due to how synthetic pointers are implemented, this is
     always presented to us as a pointer type.  This means we have to
-     sign-extend it manually as appropriate.  */
+     sign-extend it if needed (on some architectures, like MIPS,
+     addresses are signed).

   byte_offset = value_as_address (value);
-  if (TYPE_LENGTH (value_type (value)) < sizeof (LONGEST))
+  if (TYPE_UNSIGNED (value_type (value)
+      && TYPE_LENGTH (value_type (value)) < sizeof (LONGEST))
    byte_offset = gdb_sign_extend (byte_offset,
				   8 * TYPE_LENGTH (value_type (value)));

however, that would not look correct to me on AVR, SPU, or other ports
that install a custom gdbarch_pointer_to_address hook, where value_as_address
ends up returning a CORE_ADDR that had some magic bit manipulations
thrown in.  Your change to gdb_sign_extend would wipe those (high) bits
out, for sure, but that clearly is not the intended role of gdb_sign_extend,
so looks brittle and not as direct to rely on that.

So what we need here is to get back the raw value of the pointer
as a signed integer, without any GDB magic address bits.
That is, we don't want the manipulations from gdbarch_pointer_to_address.

So I think we should either explicitly always clear bits above TYPE_LENGTH
after value_as_address, with a comment mentioning that we don't want
any magic bits that gdbarch_pointer_to_address would give us,
or, given we know the value is really an offset, simply extract the value
that way.  Like in the patch below:

From 57e268c3f0da5eb90f6c39c307b60c321c76faa2 Mon Sep 17 00:00:00 2001
From: Pedro Alves <palves@redhat.com>
Date: Tue, 30 Dec 2014 11:07:35 +0000
Subject: [PATCH] always read synthetic pointers as signed integers

---
 gdb/dwarf2loc.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)
  

Comments

Yao Qi Dec. 30, 2014, 1:46 p.m. UTC | #1
Pedro Alves <palves@redhat.com> writes:

> So I think we should either explicitly always clear bits above TYPE_LENGTH
> after value_as_address, with a comment mentioning that we don't want
> any magic bits that gdbarch_pointer_to_address would give us,
> or, given we know the value is really an offset, simply extract the value
> that way.  Like in the patch below:

The latter is fine to me.

> +  byte_order = gdbarch_byte_order (get_type_arch (type));

How about getting gdbarch via get_frame_arch (frame)?  How about
removing gdb_sign_extend as it is no longer used?

I'll post a full version on top of yours.
  

Patch

diff --git a/gdb/dwarf2loc.c b/gdb/dwarf2loc.c
index fd5856c..5dd0867 100644
--- a/gdb/dwarf2loc.c
+++ b/gdb/dwarf2loc.c
@@ -2012,6 +2012,7 @@  indirect_pieced_value (struct value *value)
   int i, bit_offset, bit_length;
   struct dwarf_expr_piece *piece = NULL;
   LONGEST byte_offset;
+  enum bfd_endian byte_order;

   type = check_typedef (value_type (value));
   if (TYPE_CODE (type) != TYPE_CODE_PTR)
@@ -2056,11 +2057,16 @@  indirect_pieced_value (struct value *value)
   /* This is an offset requested by GDB, such as value subscripts.
      However, due to how synthetic pointers are implemented, this is
      always presented to us as a pointer type.  This means we have to
-     sign-extend it manually as appropriate.  */
-  byte_offset = value_as_address (value);
-  if (TYPE_LENGTH (value_type (value)) < sizeof (LONGEST))
-    byte_offset = gdb_sign_extend (byte_offset,
-				   8 * TYPE_LENGTH (value_type (value)));
+     sign-extend it manually as appropriate.  Use raw
+     extract_signed_integer directly rather than value_as_address and
+     sign extend afterwards on architectures that would need it
+     (mostly everywhere except MIPS, which has signed addresses) as
+     the later would go through gdbarch_pointer_to_address and thus
+     return a CORE_ADDR with high bits set on architectures that
+     encode address spaces and other things in CORE_ADDR.  */
+  byte_order = gdbarch_byte_order (get_type_arch (type));
+  byte_offset = extract_signed_integer (value_contents (value),
+					TYPE_LENGTH (type), byte_order);
   byte_offset += piece->v.ptr.offset;

   gdb_assert (piece);