[PR,gdb/22736,aarch64] gdb crashes on a conditional breakpoint with cast return type

Message ID DDA6BBB3-C085-4818-8B87-20CFAE19E6FE@arm.com
State New, archived
Headers

Commit Message

Alan Hayward March 7, 2018, 11:10 a.m. UTC
  > On 5 Mar 2018, at 16:45, Pedro Alves <palves@redhat.com> wrote:
> 
> On 03/05/2018 03:57 PM, Alan Hayward wrote:
>> 
>> 
>>> On 2 Mar 2018, at 15:18, Joel Brobecker <brobecker@adacore.com> wrote:
>>> 
>>>>>> The cast to (int) is causing this - remove the cast and it finds the type.
>>>>>> I’m assuming that’s causing it to drop the debug info.
>>>>> 
>>>>> It sounds like a bug to me.  If this bug "TYPE_TARGET_TYPE
>>>>> (func_type) becomes NULL caused by cast" is fixed, the GDB segfault
>>>>> will go away accordingly.
>>>>> 
>>>>> To be clear, your patch here is fine to me.  My suggestion is that
>>>>> we'd better dig it deeper.
>>>> 
>>>> Agreed. I’ll raise a new bug, and have a look into it too whilst I’m
>>>> in the area.
>>> 
>>> Having read what Yao said, and looking at the example you gave,
>>> I'm now thinking that one could very well be the cause of the other;
>>> it seems like the cast might indeed be returning a value with
>>> a struct type that's missing the return type. Even a subprogram
>>> which returns nothing has a TYPE_TARGET_TYPE set to a TYPE_CODE_VOID,
>>> right?
>> 
>> Been debugging this a little more (and learnt quite a few things about gdb along the way!)
>> Not sure if this reply is the best place to discuss, or if it should go onto the bug.
>> But….
>> 
>> On x86, for strcmp the type and it's target types, as received in amd64_push_dummy_call are:
>> TYPE_CODE_FUNC -> TYPE_CODE_INT -> 0x0
>> Whereas on aarch64 in aarch64_push_dummy_call we get:
>> TYPE_CODE_FUNC -> 0x0
>> 
>> It turns out this occurs because strcmp has IFUNC vs FUNC differences:
>> 
>> X86:
>> $ readelf -s /lib/x86_64-linux-gnu/libc-2.19.so | grep strcmp
>>  2085: 0000000000087380    60 IFUNC   GLOBAL DEFAULT   12 strcmp@@GLIBC_2.2.5
>> 
>> Aarch64:
>> $ readelf -s /lib/aarch64-linux-gnu/libc-2.23.so | grep strcmp
>>  2043: 0000000000076380   164 FUNC    GLOBAL DEFAULT   12 strcmp@@GLIBC_2.17
>> 
>> 
>> I decided to test this on X86 using a FUNC….
>> 
>> $ readelf -s /lib/x86_64-linux-gnu/libc-2.19.so | grep strlen
>>   769: 0000000000088dd0   436 FUNC    GLOBAL DEFAULT   12 strlen@@GLIBC_2.2.5
>> 
>> Changed my test to do:
>> (gdb) b foo if (int)strlen(name) == 3
>> 
>> ...Now the type received by the target code is TYPE_CODE_FUNC -> 0, the same as aarch64.
>> 
>> However, x86 does not segfault. This is because amd64_push_dummy_call doesn’t bother to
>> check the function parameter.
>> 
>> 
>> Looking into the code, when reading a FUNC, readlf.c : elf_symtab_read() hits:
>> 
>> 		  if (sym->flags & BSF_GNU_INDIRECT_FUNCTION)
>> 		    ms_type = mst_text_gnu_ifunc;
>> 		  else
>> 		    ms_type = mst_text;
>> 
>> Setting the type to mst_text, which eventually causes find_minsym_type_and_address() to
>> set the type to objfile_type (objfile)->nodebug_text_symbol
>> Whereas an IFUNC gets set to objfile_type (objfile)->nodebug_text_gnu_ifunc_symbol.
>> 
>> So, I think probably either:
>> * Everything is correct
>> * mst_text needs handling differently
>> * FUNCs need a new minimal symbol type (mst_text_gnu_func?)
>> (I’m not familiar enough with this part of the code to give a definitive answer).
> 
> Funny enough, I started working on fixing ifunc-related problems early
> last week, but then got sidetracked into other high priority things. I've not
> run into a GDB crash, but just in case, maybe this branch helps:
> 
> https://github.com/palves/gdb/commits/palves/ifunc
> 
> The gnu-ifunc.exp test doesn't pass cleanly on that branch
> because I'm midway through augmenting that testcase to cover
> different scenarios and evaluating whether it's the test that
> needs fixing, or gdb.
> 

Thanks for the branch, gave that it a try, but it has the same error.

Digging a little more into this….

The target type for the *function is set to the type of the function pointer, not the return type.
So, for IFUNC, TYPE_CODE_FUNC target type is TYPE_CODE_INT which gives the type
of the function pointer. For the FUNC, there is no pointer, so the target type of TYPE_CODE_FUNC
is set to null. That makes sense to me, but isn’t immediately obvious.

In call_function_by_hand_dummy() there already exists code to extract the type of the function
return, which is set as target_values_type. Patch below simply passes this through to the
gdbarch_push_dummy_call  method. Updated aarch64 and arm targets to use this.

If this patch is ok for amd64/aarch64/arm, then I will post again with _push_dummy_call()
fixed on every target. (Not done that yet because it’s a lot of tedious copy pasting). Most targets
don’t even look at the return types, so I’m not expecting many issues when I look at them.

Tested on aarch64 only with make check unitest and native_gdbserver.


2018-03-06  Alan Hayward  <alan.hayward@arm.com>

	PR gdb/22736
	* aarch64-tdep.c (aarch64_push_dummy_call): Use passed in return type.
	* amd64-tdep.c (amd64_push_dummy_call): Add new arg.
	* arm-tdep.c (arm_push_dummy_call): Use passed in return type.
	* gdbarch.sh (gdbarch_push_dummy_call): Add return type.
	* gdbarch.h: Regenerate.
	* gdbarch.sh: Likewise.
	* infcall.c (call_function_by_hand_dummy): Pass down return type.
  

Comments

Joel Brobecker March 9, 2018, 8:51 a.m. UTC | #1
Hello Alan,

> The target type for the *function is set to the type of the function
> pointer, not the return type.  So, for IFUNC, TYPE_CODE_FUNC target type
> is TYPE_CODE_INT which gives the type of the function pointer. For the
> FUNC, there is no pointer, so the target type of TYPE_CODE_FUNC is set
> to null. That makes sense to me, but isn’t immediately obvious.

I don't really understand what you mean by that -- maybe it is related
to the '*' before 'function' above? If we're talking about the
target_type for the function, it should be the type of the return value.
At least according to the documentation for that field in gdb_types.h.
But maybe you're talking about something else?

> In call_function_by_hand_dummy() there already exists code to extract
> the type of the function return, which is set as target_values_type.
> Patch below simply passes this through to the gdbarch_push_dummy_call
> method. Updated aarch64 and arm targets to use this.
>
> If this patch is ok for amd64/aarch64/arm, then I will post again with
> _push_dummy_call() fixed on every target. (Not done that yet because
> it’s a lot of tedious copy pasting). Most targets don’t even look at
> the return types, so I’m not expecting many issues when I look at
> them.

Sounds like a good change even if we didn't have this issue, as
it avoids a gdbarch method having to re-compute something that was
already previously computed.

> 2018-03-06  Alan Hayward  <alan.hayward@arm.com>
> 
> 	PR gdb/22736
> 	* aarch64-tdep.c (aarch64_push_dummy_call): Use passed in return type.
> 	* amd64-tdep.c (amd64_push_dummy_call): Add new arg.
> 	* arm-tdep.c (arm_push_dummy_call): Use passed in return type.
> 	* gdbarch.sh (gdbarch_push_dummy_call): Add return type.
> 	* gdbarch.h: Regenerate.
> 	* gdbarch.sh: Likewise.
> 	* infcall.c (call_function_by_hand_dummy): Pass down return type.

Just a minor comment below (besides what you already remarked on about
updating the remaining targets). Please give Pedro some time to comment;
my reviews have not be exactly top notch, lately, and I don't want you
to do useless work because I missed something.

> @@ -1411,20 +1410,11 @@ aarch64_push_dummy_call (struct gdbarch *gdbarch, struct value *function,
>       Rather that change the target interface we call the language code
>       directly ourselves.  */
> 
> -  func_type = check_typedef (value_type (function));
> -
> -  /* Dereference function pointer types.  */
> -  if (TYPE_CODE (func_type) == TYPE_CODE_PTR)
> -    func_type = TYPE_TARGET_TYPE (func_type);
> -
> -  gdb_assert (TYPE_CODE (func_type) == TYPE_CODE_FUNC
> -	      || TYPE_CODE (func_type) == TYPE_CODE_METHOD);
> -
>    /* If language_pass_by_reference () returned true we will have been
>       given an additional initial argument, a hidden pointer to the
>       return slot in memory.  */
> -  return_type = TYPE_TARGET_TYPE (func_type);
> -  lang_struct_return = language_pass_by_reference (return_type);
> +  if (return_type != nullptr)

I think return_type can never be nullptr (from what I understand of
the code in call_function_by_hand_dummy).
  
Pedro Alves March 9, 2018, 4:04 p.m. UTC | #2
On 03/09/2018 08:51 AM, Joel Brobecker wrote:

> Just a minor comment below (besides what you already remarked on about
> updating the remaining targets). Please give Pedro some time to comment;
> my reviews have not be exactly top notch, lately, and I don't want you
> to do useless work because I missed something.

Sorry for the constant delays the past couple weeks.  I've been getting
distracted by other things more than usual.  Today I'm trying to finish/post
the ifunc-fixing series I pointed at earlier, and hopefully that will give
me enough background to understand/review this patch (I'm afraid I haven't
really looked at it in any detail).

Thanks,
Pedro Alves
  
Yao Qi March 9, 2018, 4:44 p.m. UTC | #3
Pedro Alves <palves@redhat.com> writes:

> Today I'm trying to finish/post
> the ifunc-fixing series I pointed at earlier, and hopefully that will give
> me enough background to understand/review this patch (I'm afraid I haven't
> really looked at it in any detail).

FWIW, this issue is *not* related to ifunc.  As Alan described in
previous email, ifunc symbol is OK, but normal function symbol's target
type is NULL, because without debug information, GDB doesn't know the
symbol is a function or not.  I thought about it, but I am not confident
that we can set symbol's target type (for example, set it void or int)
in absent of debug information.

Anyway, it is great if you can take a look.
  
Alan Hayward March 9, 2018, 7:11 p.m. UTC | #4
> On 9 Mar 2018, at 08:51, Joel Brobecker <brobecker@adacore.com> wrote:

> 

> Hello Alan,

> 

>> The target type for the *function is set to the type of the function

>> pointer, not the return type.  So, for IFUNC, TYPE_CODE_FUNC target type

>> is TYPE_CODE_INT which gives the type of the function pointer. For the

>> FUNC, there is no pointer, so the target type of TYPE_CODE_FUNC is set

>> to null. That makes sense to me, but isn’t immediately obvious.

> 

> I don't really understand what you mean by that -- maybe it is related

> to the '*' before 'function' above? If we're talking about the

> target_type for the function, it should be the type of the return value.

> At least according to the documentation for that field in gdb_types.h.

> But maybe you're talking about something else?


Agreed that the documentation says “type” should contain the type of the
function return.

I’m not quite sure on the terminology here, but, what the gdb code is picking
up and placing into type is the type of the function entry in the elf file.
The code in find_minsym_type_and_address() ensures that a FUNC will always
have the type nodebug_text_gnu_ifunc_symbol and an IFUNC will always have
the type nodebug_text_symbol.  This is completely unrelated to the return type
of the function, which is only resolved much later in call_function_by_hand_dummy().

In an ideal world, maybe the common code needs a bit of a rewrite:
The function structure needs to contain both the type of the elf entry and the
type of the return value. But, the existing value* and function* structures don’t
really support adding extra fields. I wasn’t really comfortable enough will this part
of the code to meddle with that.

Happy with the other comments too. Thanks!


> On 9 Mar 2018, at 16:04, Pedro Alves <palves@redhat.com> wrote:

> 

> Sorry for the constant delays the past couple weeks.  I've been getting

> distracted by other things more than usual.  Today I'm trying to finish/post

> the ifunc-fixing series I pointed at earlier, and hopefully that will give

> me enough background to understand/review this patch (I'm afraid I haven't

> really looked at it in any detail).

> 


Ok, happy to wait until you have some time. I’m looking at pr/22943 which also
might help with my understanding of this one a little more.

> 

> On 9 Mar 2018, at 16:44, Yao Qi <qiyaoltc@gmail.com> wrote:

> 

> FWIW, this issue is *not* related to ifunc.  As Alan described in

> previous email, ifunc symbol is OK, but normal function symbol's target

> type is NULL, because without debug information, GDB doesn't know the

> symbol is a function or not.  I thought about it, but I am not confident

> that we can set symbol's target type (for example, set it void or int)

> in absent of debug information.

> 


I’d just add that even with IFUNC the type will always be set to a pointer to an
Int, regardless of the return type of the function.




Alan.
  
Pedro Alves March 9, 2018, 8:05 p.m. UTC | #5
On 03/09/2018 07:11 PM, Alan Hayward wrote:

>> On 9 Mar 2018, at 16:44, Yao Qi <qiyaoltc@gmail.com> wrote:
>>
>> FWIW, this issue is *not* related to ifunc.  As Alan described in
>> previous email, ifunc symbol is OK, but normal function symbol's target
>> type is NULL, because without debug information, GDB doesn't know the
>> symbol is a function or not.  I thought about it, but I am not confident
>> that we can set symbol's target type (for example, set it void or int)
>> in absent of debug information.
>>

And how are we calling a function if we don't know its return type?
GDB won't let you, unless you add the cast.  And then the question is,
why are we losing the cast-to type then?  _That_ should the be return
type used, it sounds to me (but I still haven't reviewed in detail :-/ ).

> 
> I’d just add that even with IFUNC the type will always be set to a pointer to an
> Int, regardless of the return type of the function.

Actually, I've changed that today in my series.  /me almost done...

Thanks,
Pedro Alves
  

Patch

diff --git a/gdb/aarch64-tdep.c b/gdb/aarch64-tdep.c
index f08945ea07101e1cd7906ca640c023ac7d189dd9..bf9ae4475f80a8200eeb56a8edf4a9f7ee3daa37 100644
--- a/gdb/aarch64-tdep.c
+++ b/gdb/aarch64-tdep.c
@@ -1376,13 +1376,12 @@  aarch64_push_dummy_call (struct gdbarch *gdbarch, struct value *function,
 			 struct regcache *regcache, CORE_ADDR bp_addr,
 			 int nargs,
 			 struct value **args, CORE_ADDR sp, int struct_return,
-			 CORE_ADDR struct_addr)
+			 CORE_ADDR struct_addr, struct type *return_type)
 {
   int argnum;
   struct aarch64_call_info info;
   struct type *func_type;
-  struct type *return_type;
-  int lang_struct_return;
+  int lang_struct_return = 0;

   memset (&info, 0, sizeof (info));

@@ -1411,20 +1410,11 @@  aarch64_push_dummy_call (struct gdbarch *gdbarch, struct value *function,
      Rather that change the target interface we call the language code
      directly ourselves.  */

-  func_type = check_typedef (value_type (function));
-
-  /* Dereference function pointer types.  */
-  if (TYPE_CODE (func_type) == TYPE_CODE_PTR)
-    func_type = TYPE_TARGET_TYPE (func_type);
-
-  gdb_assert (TYPE_CODE (func_type) == TYPE_CODE_FUNC
-	      || TYPE_CODE (func_type) == TYPE_CODE_METHOD);
-
   /* If language_pass_by_reference () returned true we will have been
      given an additional initial argument, a hidden pointer to the
      return slot in memory.  */
-  return_type = TYPE_TARGET_TYPE (func_type);
-  lang_struct_return = language_pass_by_reference (return_type);
+  if (return_type != nullptr)
+    lang_struct_return = language_pass_by_reference (return_type);

   /* Set the return address.  For the AArch64, the return breakpoint
      is always at BP_ADDR.  */
diff --git a/gdb/amd64-tdep.c b/gdb/amd64-tdep.c
index 6b92c9244c627af5fea78fdfd97b41a887fb679a..c50f611846c32364f31c33e0b4092af881fa6248 100644
--- a/gdb/amd64-tdep.c
+++ b/gdb/amd64-tdep.c
@@ -990,7 +990,8 @@  static CORE_ADDR
 amd64_push_dummy_call (struct gdbarch *gdbarch, struct value *function,
 		       struct regcache *regcache, CORE_ADDR bp_addr,
 		       int nargs, struct value **args,	CORE_ADDR sp,
-		       int struct_return, CORE_ADDR struct_addr)
+		       int struct_return, CORE_ADDR struct_addr,
+		       struct type *return_type)
 {
   enum bfd_endian byte_order = gdbarch_byte_order (gdbarch);
   gdb_byte buf[8];
diff --git a/gdb/arm-tdep.c b/gdb/arm-tdep.c
index ef7e66b36afe34aed3880b86d16b466984481131..78240d21ef1082bfdaaafa6be6c2f5a73ac819a6 100644
--- a/gdb/arm-tdep.c
+++ b/gdb/arm-tdep.c
@@ -3689,7 +3689,7 @@  static CORE_ADDR
 arm_push_dummy_call (struct gdbarch *gdbarch, struct value *function,
 		     struct regcache *regcache, CORE_ADDR bp_addr, int nargs,
 		     struct value **args, CORE_ADDR sp, int struct_return,
-		     CORE_ADDR struct_addr)
+		     CORE_ADDR struct_addr, struct type *return_type)
 {
   enum bfd_endian byte_order = gdbarch_byte_order (gdbarch);
   int argnum;
@@ -3700,12 +3700,8 @@  arm_push_dummy_call (struct gdbarch *gdbarch, struct value *function,
   struct type *ftype;
   unsigned vfp_regs_free = (1 << 16) - 1;

-  /* Determine the type of this function and whether the VFP ABI
-     applies.  */
-  ftype = check_typedef (value_type (function));
-  if (TYPE_CODE (ftype) == TYPE_CODE_PTR)
-    ftype = check_typedef (TYPE_TARGET_TYPE (ftype));
-  use_vfp_abi = arm_vfp_abi_for_function (gdbarch, ftype);
+  /* Determine whether the VFP ABI applies.  */
+  use_vfp_abi = arm_vfp_abi_for_function (gdbarch, return_type);

   /* Set the return address.  For the ARM, the return breakpoint is
      always at BP_ADDR.  */
diff --git a/gdb/gdbarch.h b/gdb/gdbarch.h
index 5cb131de1d27209107b5b83773ea7560ef0da6ac..086a230ffe6d295e1d1d1c3b5e3391ec979444eb 100644
--- a/gdb/gdbarch.h
+++ b/gdb/gdbarch.h
@@ -397,8 +397,8 @@  extern void set_gdbarch_deprecated_fp_regnum (struct gdbarch *gdbarch, int depre

 extern int gdbarch_push_dummy_call_p (struct gdbarch *gdbarch);

-typedef CORE_ADDR (gdbarch_push_dummy_call_ftype) (struct gdbarch *gdbarch, struct value *function, struct regcache *regcache, CORE_ADDR bp_addr, int nargs, struct value **args, CORE_ADDR sp, int struct_return, CORE_ADDR struct_addr);
-extern CORE_ADDR gdbarch_push_dummy_call (struct gdbarch *gdbarch, struct value *function, struct regcache *regcache, CORE_ADDR bp_addr, int nargs, struct value **args, CORE_ADDR sp, int struct_return, CORE_ADDR struct_addr);
+typedef CORE_ADDR (gdbarch_push_dummy_call_ftype) (struct gdbarch *gdbarch, struct value *function, struct regcache *regcache, CORE_ADDR bp_addr, int nargs, struct value **args, CORE_ADDR sp, int struct_return, CORE_ADDR struct_addr, struct type *return_type);
+extern CORE_ADDR gdbarch_push_dummy_call (struct gdbarch *gdbarch, struct value *function, struct regcache *regcache, CORE_ADDR bp_addr, int nargs, struct value **args, CORE_ADDR sp, int struct_return, CORE_ADDR struct_addr, struct type *return_type);
 extern void set_gdbarch_push_dummy_call (struct gdbarch *gdbarch, gdbarch_push_dummy_call_ftype *push_dummy_call);

 extern int gdbarch_call_dummy_location (struct gdbarch *gdbarch);
diff --git a/gdb/gdbarch.c b/gdb/gdbarch.c
index b8703e5a556e310e023cadf57037918d1bdd2850..50dd72db5d5a06a841391290c1afac345af49eb9 100644
--- a/gdb/gdbarch.c
+++ b/gdb/gdbarch.c
@@ -2368,13 +2368,13 @@  gdbarch_push_dummy_call_p (struct gdbarch *gdbarch)
 }

 CORE_ADDR
-gdbarch_push_dummy_call (struct gdbarch *gdbarch, struct value *function, struct regcache *regcache, CORE_ADDR bp_addr, int nargs, struct value **args, CORE_ADDR sp, int struct_return, CORE_ADDR struct_addr)
+gdbarch_push_dummy_call (struct gdbarch *gdbarch, struct value *function, struct regcache *regcache, CORE_ADDR bp_addr, int nargs, struct value **args, CORE_ADDR sp, int struct_return, CORE_ADDR struct_addr, struct type *return_type)
 {
   gdb_assert (gdbarch != NULL);
   gdb_assert (gdbarch->push_dummy_call != NULL);
   if (gdbarch_debug >= 2)
     fprintf_unfiltered (gdb_stdlog, "gdbarch_push_dummy_call called\n");
-  return gdbarch->push_dummy_call (gdbarch, function, regcache, bp_addr, nargs, args, sp, struct_return, struct_addr);
+  return gdbarch->push_dummy_call (gdbarch, function, regcache, bp_addr, nargs, args, sp, struct_return, struct_addr, return_type);
 }

 void
diff --git a/gdb/gdbarch.sh b/gdb/gdbarch.sh
index 33dfa6b349dee778f1a82a511a6cf57960d48f89..a1472069aa3d3a3bd355a772a4f9527f413330ca 100755
--- a/gdb/gdbarch.sh
+++ b/gdb/gdbarch.sh
@@ -488,7 +488,7 @@  M;struct frame_id;dummy_id;struct frame_info *this_frame;this_frame
 # deprecated_fp_regnum.
 v;int;deprecated_fp_regnum;;;-1;-1;;0

-M;CORE_ADDR;push_dummy_call;struct value *function, struct regcache *regcache, CORE_ADDR bp_addr, int nargs, struct value **args, CORE_ADDR sp, int struct_return, CORE_ADDR struct_addr;function, regcache, bp_addr, nargs, args, sp, struct_return, struct_addr
+M;CORE_ADDR;push_dummy_call;struct value *function, struct regcache *regcache, CORE_ADDR bp_addr, int nargs, struct value **args, CORE_ADDR sp, int struct_return, CORE_ADDR struct_addr, struct type *return_type;function, regcache, bp_addr, nargs, args, sp, struct_return, struct_addr, return_type
 v;int;call_dummy_location;;;;AT_ENTRY_POINT;;0
 M;CORE_ADDR;push_dummy_code;CORE_ADDR sp, CORE_ADDR funaddr, struct value **args, int nargs, struct type *value_type, CORE_ADDR *real_pc, CORE_ADDR *bp_addr, struct regcache *regcache;sp, funaddr, args, nargs, value_type, real_pc, bp_addr, regcache

diff --git a/gdb/infcall.c b/gdb/infcall.c
index b7f4a176db581c15c4fdd8c5299aab35d0fe4a68..773f43e884b3d831e0bd8254998cbb7ecb8daf33 100644
--- a/gdb/infcall.c
+++ b/gdb/infcall.c
@@ -1070,7 +1070,8 @@  call_function_by_hand_dummy (struct value *function,
      return address should be pointed.  */
   sp = gdbarch_push_dummy_call (gdbarch, function, get_current_regcache (),
 				bp_addr, nargs, args,
-				sp, struct_return, struct_addr);
+				sp, struct_return, struct_addr,
+				target_values_type);

   /* Set up a frame ID for the dummy frame so we can pass it to
      set_momentary_breakpoint.  We need to give the breakpoint a frame