[v2,0/8] Initial support for ROCm platform (AMD GPU) debugging

Message ID 20230105200237.987771-1-simon.marchi@polymtl.ca

Message

Simon Marchi Jan. 5, 2023, 8:02 p.m. UTC
  This is the v2 of:

  https://inbox.sourceware.org/gdb-patches/20221206135729.3937767-1-simon.marchi@efficios.com/

I pushed patches 5, 7, 8 and 9, since I considered them good changes even
outside of this series.

Most other preparatory patches have been reviewed and LGTM'ed by Andrew
Burgess, but they are not really useful on their own, so they are still
in this series.  There are no changes in them, except a typo fix in
patch 3.

The only preparatory patch which was not reviewed is "gdb/solib-svr4:
don't disable probes interface if probe not found".

The only changes in the last patch are in the doc.  I reduced the size
and scope of the documentation drastically.  There was a lot of
information that was only tangentially relevant to GDB, about the
behavior of the rest of the ROCm platform.  We agreed that this kind of
information is not relevant for the upstream GDB project (it can still
continue to exist in the downstream ROCm GDB documentation).  So I
removed anything that was not documenting concrete use cases of GDB
itself.

Other than that, the last patch didn't change since v1.

Here is the original cover letter (with patch numbers updated):

This patch series adds initial support for debugging programs offloaded
to AMD GPUs using the ROCm platform.

Patches 1 to 7 are preparatory patches, and patch 8 is the big one.
We included in that patch only what we consider to be the bare minimum
for a cohesive first step, something that can be run and tested.  See
the commit message of patch 8 for more details.

At the end of this series, it is possible to hit breakpoints in GPU code
and resume execution until the end of the program.  Notably, GDB is not
able to compute the backtrace, which is of course an important piece to
do any real debugging.  Supporting this will require GDB to support some
extensions to DWARF, which is a big piece in itself.  This part, as well
as other features, will come as future patches.

Patch 8 also contains documentation changes.  This series was tested on
Ubuntu 22.04, and no regressions appear for x86-64 / Linux debugging
when the AMDGPU / ROCm platform support is compiled in.
Build-tested with --enable-targets=all.

Porting GDB to the ROCm platform brought up many challenges and has a
few interesting differences, compared to a "standard" CPU port.  We did
a presentation at the 2022 Cauldron, in Prague, that explains some of
them:

  https://www.youtube.com/watch?v=X1iZ_Ta7jOo

Lancelot SIX (1):
  gdb: add supports_arch_info callback to gdbarch_register

Pedro Alves (1):
  gdb: make install_breakpoint return a non-owning reference

Simon Marchi (6):
  gdbsupport: add type definitions for pid, lwp and tid
  gdb: add inferior_pre_detach observable
  gdb: add gdbarch_up
  gdb/solib-svr4: don't disable probes interface if probe not found
  gdb: make gdb_printing_disassembler::stream public
  gdb: initial support for ROCm platform (AMDGPU) debugging

 gdb/Makefile.in                   |   17 +-
 gdb/NEWS                          |    7 +
 gdb/README                        |   15 +
 gdb/amd-dbgapi-target.c           | 1966 +++++++++++++++++++++++++++++
 gdb/amd-dbgapi-target.h           |  116 ++
 gdb/amdgpu-tdep.c                 | 1367 ++++++++++++++++++++
 gdb/amdgpu-tdep.h                 |   93 ++
 gdb/arch-utils.c                  |    9 +-
 gdb/breakpoint.c                  |    4 +-
 gdb/breakpoint.h                  |    8 +-
 gdb/configure                     |  425 +++++--
 gdb/configure.ac                  |   52 +
 gdb/configure.tgt                 |   23 +-
 gdb/disasm.h                      |    4 +-
 gdb/doc/gdb.texinfo               |  291 +++++
 gdb/gdbarch.h                     |   12 +-
 gdb/observable.c                  |    1 +
 gdb/observable.h                  |    3 +
 gdb/regcache.c                    |    3 +-
 gdb/solib-rocm.c                  |  679 ++++++++++
 gdb/solib-svr4.c                  |   15 +-
 gdb/target.c                      |    2 +
 gdb/testsuite/gdb.rocm/simple.cpp |   48 +
 gdb/testsuite/gdb.rocm/simple.exp |   52 +
 gdb/testsuite/lib/future.exp      |   38 +
 gdb/testsuite/lib/gdb.exp         |    7 +
 gdb/testsuite/lib/rocm.exp        |   94 ++
 gdbsupport/ptid.h                 |   18 +-
 28 files changed, 5210 insertions(+), 159 deletions(-)
 create mode 100644 gdb/amd-dbgapi-target.c
 create mode 100644 gdb/amd-dbgapi-target.h
 create mode 100644 gdb/amdgpu-tdep.c
 create mode 100644 gdb/amdgpu-tdep.h
 create mode 100644 gdb/solib-rocm.c
 create mode 100644 gdb/testsuite/gdb.rocm/simple.cpp
 create mode 100644 gdb/testsuite/gdb.rocm/simple.exp
 create mode 100644 gdb/testsuite/lib/rocm.exp


base-commit: 1a8605a8c79b0c4ebc71f5691e36a1338d407837
  

Comments

Simon Marchi Jan. 25, 2023, 5:03 p.m. UTC | #1
On 1/5/23 15:02, Simon Marchi wrote:
> This is the v2 of:
> 
>   https://inbox.sourceware.org/gdb-patches/20221206135729.3937767-1-simon.marchi@efficios.com/
> 
> I pushed patches 5, 7, 8 and 9, since I considered them good changes even
> outside of this series.
> 
> Most other preparatory patches have been reviewed and LGTM'ed by Andrew
> Burgess, but they are not really useful on their own, so they are still
> in this series.  There are no changes in them, except a typo fix in
> patch 3.
> 
> The only preparatory patch which was not reviewed is "gdb/solib-svr4:
> don't disable probes interface if probe not found".

Ping.

I plan to push the series in a week from now, if there are no more
comments on this.  Note that this code has been heavily scrutinized by
Pedro and Lancelot, so I am confident that it is in good shape already.

Simon
  
Simon Marchi Feb. 2, 2023, 3:08 p.m. UTC | #2
On 1/25/23 12:03, Simon Marchi via Gdb-patches wrote:
> 
> 
> On 1/5/23 15:02, Simon Marchi wrote:
>> This is the v2 of:
>>
>>   https://inbox.sourceware.org/gdb-patches/20221206135729.3937767-1-simon.marchi@efficios.com/
>>
>> I pushed patches 5, 7, 8 and 9, since I considered them good changes even
>> outside of this series.
>>
>> Most other preparatory patches have been reviewed and LGTM'ed by Andrew
>> Burgess, but they are not really useful on their own, so they are still
>> in this series.  There are no changes in them, except a typo fix in
>> patch 3.
>>
>> The only preparatory patch which was not reviewed is "gdb/solib-svr4:
>> don't disable probes interface if probe not found".
> 
> Ping.
> 
> I plan to push the series in a week from now, if there are no more
> comments on this.  Note that this code has been heavily scrutinized by
> Pedro and Lancelot, so I am confident that it is in good shape already.
> 
> Simon

I pushed the series.

Simon
  
Tom de Vries Feb. 6, 2023, 11:47 a.m. UTC | #3
On 2/2/23 16:08, Simon Marchi via Gdb-patches wrote:

>
> On 1/25/23 12:03, Simon Marchi via Gdb-patches wrote:
>>
>> On 1/5/23 15:02, Simon Marchi wrote:
>>> This is the v2 of:
>>>
>>>    https://inbox.sourceware.org/gdb-patches/20221206135729.3937767-1-simon.marchi@efficios.com/
>>>
>>> I pushed patches 5, 7, 8 and 9, since I considered them good changes even
>>> outside of this series.
>>>
>>> Most other preparatory patches have been reviewed and LGTM'ed by Andrew
>>> Burgess, but they are not really useful on their own, so they are still
>>> in this series.  There are no changes in them, except a typo fix in
>>> patch 3.
>>>
>>> The only preparatory patch which was not reviewed is "gdb/solib-svr4:
>>> don't disable probes interface if probe not found".
>> Ping.
>>
>> I plan to push the series in a week from now, if there are no more
>> comments on this.  Note that this code has been heavily scrutinized by
>> Pedro and Lancelot, so I am confident that it is in good shape already.
>>
>> Simon
> I pushed the series.
>
> Simon


Hi,

as I mentioned earlier to Lancelot off-list, I'm now seeing:

...

Running /data/vries/gdb/src/gdb/testsuite/gdb.rocm/simple.exp ...
gdb compile failed, 
/data/vries/gdb/src/gdb/testsuite/gdb.rocm/simple.cpp:18:10: fatal 
error: hip/hip_runtime.h: No such file or directory
#include "hip/hip_runtime.h"
          ^~~~~~~~~~~~~~~~~~~
compilation terminated.
...
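
For illustration only: the failure above comes from an unconditional include of a header that is absent on this machine. A minimal sketch of detecting that situation at compile time with the standard C++17 `__has_include` operator follows. The macro `HAVE_HIP_RUNTIME_H` and the function `hip_runtime_header_available` are hypothetical names, not part of GDB or of the test.

```cpp
// Sketch only: detect at compile time whether the HIP runtime header
// can be found by the compiler.  HAVE_HIP_RUNTIME_H and
// hip_runtime_header_available() are hypothetical names.
#if defined(__has_include)
#  if __has_include("hip/hip_runtime.h")
#    define HAVE_HIP_RUNTIME_H 1
#  endif
#endif

// Fall back to "not available" when the check is unsupported or fails.
#ifndef HAVE_HIP_RUNTIME_H
#  define HAVE_HIP_RUNTIME_H 0
#endif

// True only if the header was visible when this file was compiled.
constexpr bool hip_runtime_header_available()
{
  return HAVE_HIP_RUNTIME_H != 0;
}
```

A check like this only tells the source file that the header is missing; the test harness still has to decide not to build or run the test in that case.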

Thanks,

- Tom
  
Lancelot SIX Feb. 6, 2023, 2:01 p.m. UTC | #4
Hi,

Thanks for letting us know. I am preparing a patch to fix this (skip the tests if the amd-dbgapi support is not built in or if the hipcc compiler for amdgpu is not available).
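
As a rough sketch of the "compiler is not available" half of that check (this is not the actual patch, which belongs in the testsuite's own helper library), a lookup of a tool name in a PATH-style directory list could look like the following. The helper name `find_tool` is hypothetical, and the sketch assumes a colon-separated list and only tests that a regular file exists, not that it is executable.

```cpp
#include <filesystem>
#include <optional>
#include <sstream>
#include <string>
#include <system_error>

// Hypothetical helper: search a colon-separated, PATH-style list of
// directories for a regular file named TOOL.  Returns the first match,
// or std::nullopt if no directory contains it.
std::optional<std::filesystem::path>
find_tool(const std::string &tool, const std::string &path_list)
{
  std::istringstream dirs(path_list);
  std::string dir;

  while (std::getline(dirs, dir, ':'))
    {
      if (dir.empty())
        continue;

      std::filesystem::path candidate = std::filesystem::path(dir) / tool;

      // The error_code overload does not throw on missing directories.
      std::error_code ec;
      if (std::filesystem::is_regular_file(candidate, ec))
        return candidate;
    }

  return std::nullopt;
}
```

A caller would pass the tool name (for example "hipcc") together with the value of the PATH environment variable, and skip the ROCm tests when the result is empty.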

I hope to send this tomorrow.

Best
Lancelot

  
Tom de Vries Feb. 6, 2023, 4:53 p.m. UTC | #5
On 2/6/23 15:01, Lancelot Six wrote:

> Hi,
>
> Thanks for letting us know. I am preparing a patch to fix this (skip 
> the tests if the amd-dbgapi support is not built in or if the hipcc 
> compiler for amdgpu is not available).
>
> I hope to send this tomorrow.
>
Thanks for working on this.


FWIW, I just tried the same test-case on openSUSE Tumbleweed, and ran 
into some kind of "no usable compiler found" message.

After adding c++ to the compile flags, I ran into the same error as 
described below.

Thanks,

- Tom



> Best
> Lancelot
>
> Le 6 février 2023 11:47:55 GMT+00:00, Tom de Vries <tdevries@suse.de> 
> a écrit :
>
>     On 2/2/23 16:08, Simon Marchi via Gdb-patches wrote:
>
>>     On 1/25/23 12:03, Simon Marchi via Gdb-patches wrote:
>>>     On 1/5/23 15:02, Simon Marchi wrote:
>>>>     This is the v2 of:
>>>>
>>>>        https://inbox.sourceware.org/gdb-patches/20221206135729.3937767-1-simon.marchi@efficios.com/
>>>>
>>>>     I pushed patches 5, 7, 8 and 9, since I considered them good changes even
>>>>     outside of this series.
>>>>
>>>>     Most other preparatory patches have been reviewed and LGTM'ed by Andrew
>>>>     Burgess, but they are not really useful on their own, so they are still
>>>>     in this series.  There are no changes in them, except a typo fix in
>>>>     patch 3.
>>>>
>>>>     The only preparatory patch which was not reviewed is "gdb/solib-svr4:
>>>>     don't disable probes interface if probe not found".
>>>     Ping.
>>>
>>>     I plan to push the series in a week from now, if there are no more
>>>     comments on this.  Note that this code has been heavily scrutinized by
>>>     Pedro and Lancelot, so I am confident that it is in good shape already.
>>>
>>>     Simon
>>     I pushed the series.
>>
>>     Simon
>
>
>     Hi,
>
>     as I mentioned earlier to Lancelot off-list, I'm now seeing:
>
>     ...
>
>     Running /data/vries/gdb/src/gdb/testsuite/gdb.rocm/simple.exp ...
>     gdb compile failed,
>     /data/vries/gdb/src/gdb/testsuite/gdb.rocm/simple.cpp:18:10: fatal
>     error: hip/hip_runtime.h: No such file or directory
>     #include "hip/hip_runtime.h"
>              ^~~~~~~~~~~~~~~~~~~
>     compilation terminated.
>     ...
>
>     Thanks,
>
>     - Tom
>