gdb/amdgpu: provide dummy implementation of gdbarch_return_value_as_value

Message ID 20230306214650.1744872-1-simon.marchi@polymtl.ca
State New
Headers
Series gdb/amdgpu: provide dummy implementation of gdbarch_return_value_as_value |

Commit Message

Simon Marchi March 6, 2023, 9:46 p.m. UTC
  The AMD GPU support has been merged shortly after commit 4e1d2f5814b2
("Add new overload of gdbarch_return_value"), which made it mandatory
for architectures to provide either a return_value or
return_value_as_value implementation.  Because of my failure to test
properly after rebasing and before pushing, we get this with the current
master:

    $ gdb ./gdb -nx --data-directory=data-directory -q -ex "set arch amdgcn:gfx1010" -batch
    /home/simark/src/binutils-gdb/gdb/gdbarch.c:517: internal-error: verify_gdbarch: the following are invalid ...
            return_value_as_value

I started trying to change GDB to not force architectures to provide a
return_value or return_value_as_value implementation, but Andrew pointed
out that any serious port will have an implementation one day or
another, and it's easy to add a dummy implementation in the mean time.
So it's better to not complicate the core of GDB to know how to deal
with this.

There is an implementation of return_value in the downstream ROCgdb port
(which we'll need to convert to the new return_value_as_value), which
we'll contribute soon-ish.  In the mean time, add a dummy implementation
of return_value_as_value to avoid the failed assertion.

Change-Id: I26edf441b511170aa64068fd248ab6201158bb63
---
 gdb/amdgpu-tdep.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)


base-commit: 1d6653fd3f4e0d7140e705733412fd75c40177b2
  

Comments

Lancelot SIX March 7, 2023, 10:45 a.m. UTC | #1
Hi Simon,

> +/* Dummy implementation of gdbarch_return_value_as_value.  */
> +
> +static return_value_convention
> +amdgpu_return_value_as_value (gdbarch *arch, value *function, type *valtype,
> +			      regcache *regcache, value **read_value,
> +			      const gdb_byte *writebuf)
> +{
> +  gdb_assert_not_reached ("not implemented");

Isn't "error" more appropriate here?  We just need to indicate that the
current hook failed.  GDB is not in an inconsistent state.

Maybe another approach could be to add an elem to the
return_value_convention like RETURN_VALUE_UNKNOWN which could be a
reasonable default if the gdbarch does not know how to handle a given
type.

Anyway, I do not think that you can easily reach this point with the
current port of amdgcn.  The `return`, `finish` and `call` commands will
produce errors before reaching this point.

Best,
Lancelot.

> +}
> +
>  /* Return the name of register REGNUM.  */
>  
>  static const char *
> @@ -1195,6 +1205,8 @@ amdgpu_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
>  
>    set_gdbarch_dwarf2_reg_to_regnum (gdbarch, amdgpu_dwarf_reg_to_regnum);
>  
> +  set_gdbarch_return_value_as_value (gdbarch, amdgpu_return_value_as_value);
> +
>    /* Register representation.  */
>    set_gdbarch_register_name (gdbarch, amdgpu_register_name);
>    set_gdbarch_register_type (gdbarch, amdgpu_register_type);
> 
> base-commit: 1d6653fd3f4e0d7140e705733412fd75c40177b2
> -- 
> 2.39.2
>
  
Simon Marchi March 7, 2023, 2:47 p.m. UTC | #2
On 3/7/23 05:45, Lancelot SIX wrote:
> Hi Simon,
> 
>> +/* Dummy implementation of gdbarch_return_value_as_value.  */
>> +
>> +static return_value_convention
>> +amdgpu_return_value_as_value (gdbarch *arch, value *function, type *valtype,
>> +			      regcache *regcache, value **read_value,
>> +			      const gdb_byte *writebuf)
>> +{
>> +  gdb_assert_not_reached ("not implemented");
> 
> Isn't "error" more appropriate here?  We just need to indicate that the
> current hook failed.  GDB is not in an inconsistent state.

In my original patch, I made these hooks optional, and added some
predicate checks:

  if (!gdbarch_return_value_as_value_p (gdbarch))
    error_arch_no_return_value (gdbarch);

The feedback was that throwing errors at some of these places (like
during event handling) would probably put GDB in a bad state.  Erroring
out of amdgpu_return_value_as_value would be the same.

> 
> Maybe another approach could be to add an elem to the
> return_value_convention like RETURN_VALUE_UNKNOWN which could be a
> reasonable default if the gdbarch does not know how to handle a given
> type.

I think that Pedro hinted that we would need this anyway at some point,
for functions that don't follow a defined ABI.  So, I think it would
make sense, but we need to update the core of GDB to handle that
response.  And I'm not too familiar with this area, so I don't know how
much work this represents.  But if we know we're going to need this
anyway, I might as well give it a shot.

> Anyway, I do not think that you can easily reach this point with the
> current port of amdgcn.  The `return`, `finish` and `call` commands will
> produce errors before reaching this point.

Yes, that's my experience.  The AMD GPU port upstream is too barebones
to use these commands.  And it's just temporary.

Simon
  
Lancelot SIX March 7, 2023, 7:12 p.m. UTC | #3
On Tue, Mar 07, 2023 at 09:47:15AM -0500, Simon Marchi via Gdb-patches wrote:
> 
> 
> On 3/7/23 05:45, Lancelot SIX wrote:
> > Hi Simon,
> > 
> >> +/* Dummy implementation of gdbarch_return_value_as_value.  */
> >> +
> >> +static return_value_convention
> >> +amdgpu_return_value_as_value (gdbarch *arch, value *function, type *valtype,
> >> +			      regcache *regcache, value **read_value,
> >> +			      const gdb_byte *writebuf)
> >> +{
> >> +  gdb_assert_not_reached ("not implemented");
> > 
> > Isn't "error" more appropriate here?  We just need to indicate that the
> > current hook failed.  GDB is not in an inconsistent state.
> 
> In my original patch, I made these hooks optional, and added some
> predicate checks:
> 
>   if (!gdbarch_return_value_as_value_p (gdbarch))
>     error_arch_no_return_value (gdbarch);
> 
> The feedback was that throwing errors at some of these places (like
> during event handling) would probably put GDB in a bad state.  Erroring
> out of amdgpu_return_value_as_value would be the same.
>

Hi,

Sorry I missed that, I am a bit behind on reading the ML.

> 
> Yes, that's my experience.  The AMD GPU port upstream is too barebones
> to use these commands.  And it's just temporary.

FWIW, this LGTM.

I have tested this and it solves the original issue you have seen.

Reviewed-By: Lancelot SIX <lancelot.six@amd.com>

Best,
Lancelot.

> 
> Simon
  
Tom Tromey March 7, 2023, 7:20 p.m. UTC | #4
>>>>> "Simon" == Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> writes:

Simon> I think that Pedro hinted that we would need this anyway at some point,
Simon> for functions that don't follow a defined ABI.  So, I think it would
Simon> make sense, but we need to update the core of GDB to handle that
Simon> response.

Can we even detect this situation?

E.g., PR 30090 turned out to have a function with a nonstandard ABI, and
in the end I just xfail'd the test.

Simon> And I'm not too familiar with this area, so I don't know how
Simon> much work this represents.  But if we know we're going to need this
Simon> anyway, I might as well give it a shot.

There aren't many callers of the gdbarch hooks so I guess you could just
track them all down and see what needs to be done at each one.  There's
definitely already code to handle the lack of a return value, so it
seems like it may not be too hard.

Tom
  
Simon Marchi March 7, 2023, 8:32 p.m. UTC | #5
On 3/7/23 14:20, Tom Tromey wrote:
>>>>>> "Simon" == Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> writes:
> 
> Simon> I think that Pedro hinted that we would need this anyway at some point,
> Simon> for functions that don't follow a defined ABI.  So, I think it would
> Simon> make sense, but we need to update the core of GDB to handle that
> Simon> response.
> 
> Can we even detect this situation?
> 
> E.g., PR 30090 turned out to have a function with a nonstandard ABI, and
> in the end I just xfail'd the test.

I chatted about this with Pedro offline, and that was my question.
Apart from the "I'm half way through implementing a port" situation, is
there really a case where a function won't have a standard ABI, it won't
be marked as such in the DWARF (with is already handled with the
is_nocall_function check) and the arch code will be able to know?  Our
intuition was, probably not.  So we concluded that it didn't make sense
to add a RETURN_VALUE_UNKNOWN enumerator for the amdgpu case, since it
would just be temporary until we submit the real code.

In the mean time, it's not really possible to reach that place anyway.
And even if we could, it's expected that the AMD GPU port is not really
usable in master anyway as of today.

> Simon> And I'm not too familiar with this area, so I don't know how
> Simon> much work this represents.  But if we know we're going to need this
> Simon> anyway, I might as well give it a shot.
> 
> There aren't many callers of the gdbarch hooks so I guess you could just
> track them all down and see what needs to be done at each one.  There's
> definitely already code to handle the lack of a return value, so it
> seems like it may not be too hard.

Yes, that's what we would have done, had we decided to go forward with
that solution.

I will go aheady and push this patch, to unbreak the port.

Simon
  
Lancelot SIX March 7, 2023, 8:33 p.m. UTC | #6
On Tue, Mar 07, 2023 at 12:20:18PM -0700, Tom Tromey wrote:
> >>>>> "Simon" == Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> writes:
> 
> Simon> I think that Pedro hinted that we would need this anyway at some point,
> Simon> for functions that don't follow a defined ABI.  So, I think it would
> Simon> make sense, but we need to update the core of GDB to handle that
> Simon> response.
> 
> Can we even detect this situation?
> 
> E.g., PR 30090 turned out to have a function with a nonstandard ABI, and
> in the end I just xfail'd the test.
> 
> Simon> And I'm not too familiar with this area, so I don't know how
> Simon> much work this represents.  But if we know we're going to need this
> Simon> anyway, I might as well give it a shot.
> 
> There aren't many callers of the gdbarch hooks so I guess you could just
> track them all down and see what needs to be done at each one.  There's
> definitely already code to handle the lack of a return value, so it
> seems like it may not be too hard.

We already have some things in place to support cases when DWARF
indicates that a given function does not follow the standard calling
convention for the target (DW_AT_calling_convention set to
DW_CC_nocall).

We have discussed this a bit off-list, and our understanding is that the
gdbarch hook has to implement the standard ABI.  In the end, returning a
RETURN_VALUE_UNKNOWN value would imply that a gdbarch hook does not
implement the ABI for a given type.  The better approach would be to
finish the implementation to add support for such type, in which case
RETURN_VALUE_UNKNOWN is not needed.

I am not sure how we would model the ticket you linked above.  Could the
arch implement a "rust on $ARCH"  ABI in the gdbarch hook by inspecting
the language of the CU the function belongs to?  This would need the
custum ABI to be stable, and I have no idea if this is the case for
rust.

Lancelot.

> 
> Tom
  
Tom Tromey March 7, 2023, 8:44 p.m. UTC | #7
Lancelot> We have discussed this a bit off-list, and our understanding is that the
Lancelot> gdbarch hook has to implement the standard ABI.

Yeah, agreed.

Lancelot> I am not sure how we would model the ticket you linked above.  Could the
Lancelot> arch implement a "rust on $ARCH"  ABI in the gdbarch hook by inspecting
Lancelot> the language of the CU the function belongs to?  This would need the
Lancelot> custum ABI to be stable, and I have no idea if this is the case for
Lancelot> rust.

It's not, so it would be premature for gdb to do anything like that.
The Rust proposal is to extend DWARF (in some unspecified way) so that
the compiler can describe the ABI in use.  If that ever happens I guess
we can implement it.

Tom
  
Tom Tromey March 7, 2023, 8:44 p.m. UTC | #8
Simon> I will go aheady and push this patch, to unbreak the port.

Thank you, this makes sense to me.

Tom
  

Patch

diff --git a/gdb/amdgpu-tdep.c b/gdb/amdgpu-tdep.c
index 7f0e9dffab37..d681d9d6a504 100644
--- a/gdb/amdgpu-tdep.c
+++ b/gdb/amdgpu-tdep.c
@@ -51,6 +51,16 @@  get_amdgpu_gdbarch_tdep (gdbarch *arch)
   return gdbarch_tdep<amdgpu_gdbarch_tdep> (arch);
 }
 
+/* Dummy implementation of gdbarch_return_value_as_value.  */
+
+static return_value_convention
+amdgpu_return_value_as_value (gdbarch *arch, value *function, type *valtype,
+			      regcache *regcache, value **read_value,
+			      const gdb_byte *writebuf)
+{
+  gdb_assert_not_reached ("not implemented");
+}
+
 /* Return the name of register REGNUM.  */
 
 static const char *
@@ -1195,6 +1205,8 @@  amdgpu_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
 
   set_gdbarch_dwarf2_reg_to_regnum (gdbarch, amdgpu_dwarf_reg_to_regnum);
 
+  set_gdbarch_return_value_as_value (gdbarch, amdgpu_return_value_as_value);
+
   /* Register representation.  */
   set_gdbarch_register_name (gdbarch, amdgpu_register_name);
   set_gdbarch_register_type (gdbarch, amdgpu_register_type);