[4/4] gdb/testsuite: allow_hipcc_tests tests the hipcc compiler

Message ID 20230207132802.223510-5-lancelot.six@amd.com
State Committed
Commit 39f6d7c6b06b06a0372284f30c86417a8c0d6ba5
Headers
Series Fix gdb.rocm/simple.exp on hosts without ROCm |

Commit Message

Lancelot SIX Feb. 7, 2023, 1:28 p.m. UTC
  Update allow_hipcc_tests so all gdb.rocm tests are skipped if we do not
have a working hipcc compiler available.

To achieve this, adjust gdb_simple_compile to ensure that the hip
program is saved in a ".cpp" file before calling hipcc otherwise
compilation will fail.

One thing to note is that it is possible to have a hipcc installed with
a CUDA backend.  Compiling with this back-end will successfully result
in an application, but GDB cannot debug it (at least for the offload
part). In the context of the gdb.rocm tests, we want to detect such
situation where gdb_simple_compile would give a false positive.

To achieve this, this patch checks that there is at least one AMDGPU
device available and that hipcc can compile for this or those targets.
Detecting the device is done using the rocm_agent_enumerator tool which
is installed with the all ROCm installations (it is used by hipcc to
detect identify targets if this is not specified on the comand line).

This patch also makes the allow_hipcc_tests proc a cached proc.
---
 gdb/testsuite/lib/gdb.exp  |  4 +++
 gdb/testsuite/lib/rocm.exp | 69 +++++++++++++++++++++++++++++++++++++-
 2 files changed, 72 insertions(+), 1 deletion(-)
  

Comments

Lancelot SIX Feb. 7, 2023, 1:42 p.m. UTC | #1
Hi,

Some of this patch is based on original work from Pedro in 
https://github.com/ROCm-Developer-Tools/ROCgdb/commit/031a44b9c6ec2c381030a919b837e5dfc255e688. 
  I just realized that I forgot to give credit where due.

I'll add the following tag:
Co-Authored-By: Pedro Alves <pedro@palves.net>

Best,
Laneclot.

On 07/02/2023 13:28, Lancelot SIX wrote:
> Update allow_hipcc_tests so all gdb.rocm tests are skipped if we do not
> have a working hipcc compiler available.
> 
> To achieve this, adjust gdb_simple_compile to ensure that the hip
> program is saved in a ".cpp" file before calling hipcc otherwise
> compilation will fail.
> 
> One thing to note is that it is possible to have a hipcc installed with
> a CUDA backend.  Compiling with this back-end will successfully result
> in an application, but GDB cannot debug it (at least for the offload
> part). In the context of the gdb.rocm tests, we want to detect such
> situation where gdb_simple_compile would give a false positive.
> 
> To achieve this, this patch checks that there is at least one AMDGPU
> device available and that hipcc can compile for this or those targets.
> Detecting the device is done using the rocm_agent_enumerator tool which
> is installed with the all ROCm installations (it is used by hipcc to
> detect identify targets if this is not specified on the comand line).
> 
> This patch also makes the allow_hipcc_tests proc a cached proc.
> ---
>   gdb/testsuite/lib/gdb.exp  |  4 +++
>   gdb/testsuite/lib/rocm.exp | 69 +++++++++++++++++++++++++++++++++++++-
>   2 files changed, 72 insertions(+), 1 deletion(-)
> 
> diff --git a/gdb/testsuite/lib/gdb.exp b/gdb/testsuite/lib/gdb.exp
> index faa0ac05a9a..6333728f71e 100644
> --- a/gdb/testsuite/lib/gdb.exp
> +++ b/gdb/testsuite/lib/gdb.exp
> @@ -4581,6 +4581,10 @@ proc gdb_simple_compile {name code {type object} {compile_flags {}} {object obj}
>   	    set ext "go"
>   	    break
>   	}
> +	if { "$flag" eq "hip" } {
> +	    set ext "cpp"
> +	    break
> +	}
>       }
>       set src [standard_temp_file $name-[pid].$ext]
>       set obj [standard_temp_file $name-[pid].$postfix]
> diff --git a/gdb/testsuite/lib/rocm.exp b/gdb/testsuite/lib/rocm.exp
> index b5b59748c27..06d5b3988b8 100644
> --- a/gdb/testsuite/lib/rocm.exp
> +++ b/gdb/testsuite/lib/rocm.exp
> @@ -15,7 +15,51 @@
>   #
>   # Support library for testing ROCm (AMD GPU) GDB features.
>   
> -proc allow_hipcc_tests { } {
> +# Get the list of gpu targets to compile for.
> +#
> +# If HCC_AMDGPU_TARGET is set in the environment, use it.  Otherwise,
> +# try reading it from the system using the rocm_agent_enumerator
> +# utility.
> +
> +proc hcc_amdgpu_targets {} {
> +    # Look for HCC_AMDGPU_TARGET (same env var hipcc uses).  If
> +    # that fails, try using rocm_agent_enumerator (again, same as
> +    # hipcc does).
> +    if {[info exists ::env(HCC_AMDGPU_TARGET)]} {
> +	return [split $::env(HCC_AMDGPU_TARGET) ","]
> +    }
> +
> +    set rocm_agent_enumerator "rocm_agent_enumerator"
> +
> +    # If available, use ROCM_PATH to locate rocm_agent_enumerator.
> +    if { [info exists ::env(ROCM_PATH)] } {
> +	set rocm_agent_enumerator \
> +	    "$::env(ROCM_PATH)/bin/rocm_agent_enumerator"
> +    }
> +
> +    # If we fail to locate the rocm_agent_enumerator, just return an empty
> +    # list of targets and let the caller decide if this should be an error.
> +    if { [which $rocm_agent_enumerator] == 0 } {
> +	return [list]
> +    }
> +
> +    set result [remote_exec host $rocm_agent_enumerator]
> +    if { [lindex $result 0] != 0 } {
> +	error "rocm_agent_enumerator failed"
> +    }
> +
> +    set targets [list]
> +    foreach target [lindex $result 1] {
> +	# Ignore gfx000 which is the host CPU.
> +	if { $target ne "gfx000" } {
> +	    lappend targets $target
> +	}
> +    }
> +
> +    return $targets
> +}
> +
> +gdb_caching_proc allow_hipcc_tests {
>       # Only the native target supports ROCm debugging.  E.g., when
>       # testing against GDBserver, there's no point in running the ROCm
>       # tests.
> @@ -29,6 +73,29 @@ proc allow_hipcc_tests { } {
>   	return 0
>       }
>   
> +    # Check we have a working hipcc compiler available.
> +    set targets [hcc_amdgpu_targets]
> +    if { [llength $targets] == 0} {
> +	return 0
> +    }
> +
> +    set flags [list hip additional_flags=--offload-arch=[join $targets ","]]
> +    if {![gdb_simple_compile hipprobe {
> +	    #include <hip/hip_runtime.h>
> +	    __global__ void
> +	    kern () {}
> +
> +	    int
> +	    main ()
> +	    {
> +		kern<<<1, 1>>> ();
> +		hipDeviceSynchronize ();
> +		return 0;
> +	    }
> +	} executable $flags]} {
> +	return 0
> +    }
> +
>       return 1
>   }
>
  
Simon Marchi Feb. 7, 2023, 2:12 p.m. UTC | #2
On 2/7/23 08:28, Lancelot SIX via Gdb-patches wrote:
> Update allow_hipcc_tests so all gdb.rocm tests are skipped if we do not
> have a working hipcc compiler available.
> 
> To achieve this, adjust gdb_simple_compile to ensure that the hip
> program is saved in a ".cpp" file before calling hipcc otherwise
> compilation will fail.
> 
> One thing to note is that it is possible to have a hipcc installed with
> a CUDA backend.  Compiling with this back-end will successfully result
> in an application, but GDB cannot debug it (at least for the offload
> part). In the context of the gdb.rocm tests, we want to detect such
> situation where gdb_simple_compile would give a false positive.
> 
> To achieve this, this patch checks that there is at least one AMDGPU
> device available and that hipcc can compile for this or those targets.
> Detecting the device is done using the rocm_agent_enumerator tool which
> is installed with the all ROCm installations (it is used by hipcc to
> detect identify targets if this is not specified on the comand line).
> 
> This patch also makes the allow_hipcc_tests proc a cached proc.
> ---
>  gdb/testsuite/lib/gdb.exp  |  4 +++
>  gdb/testsuite/lib/rocm.exp | 69 +++++++++++++++++++++++++++++++++++++-
>  2 files changed, 72 insertions(+), 1 deletion(-)
> 
> diff --git a/gdb/testsuite/lib/gdb.exp b/gdb/testsuite/lib/gdb.exp
> index faa0ac05a9a..6333728f71e 100644
> --- a/gdb/testsuite/lib/gdb.exp
> +++ b/gdb/testsuite/lib/gdb.exp
> @@ -4581,6 +4581,10 @@ proc gdb_simple_compile {name code {type object} {compile_flags {}} {object obj}
>  	    set ext "go"
>  	    break
>  	}
> +	if { "$flag" eq "hip" } {
> +	    set ext "cpp"
> +	    break
> +	}
>      }
>      set src [standard_temp_file $name-[pid].$ext]
>      set obj [standard_temp_file $name-[pid].$postfix]
> diff --git a/gdb/testsuite/lib/rocm.exp b/gdb/testsuite/lib/rocm.exp
> index b5b59748c27..06d5b3988b8 100644
> --- a/gdb/testsuite/lib/rocm.exp
> +++ b/gdb/testsuite/lib/rocm.exp
> @@ -15,7 +15,51 @@
>  #
>  # Support library for testing ROCm (AMD GPU) GDB features.
>  
> -proc allow_hipcc_tests { } {
> +# Get the list of gpu targets to compile for.
> +#
> +# If HCC_AMDGPU_TARGET is set in the environment, use it.  Otherwise,
> +# try reading it from the system using the rocm_agent_enumerator
> +# utility.
> +
> +proc hcc_amdgpu_targets {} {
> +    # Look for HCC_AMDGPU_TARGET (same env var hipcc uses).  If
> +    # that fails, try using rocm_agent_enumerator (again, same as
> +    # hipcc does).
> +    if {[info exists ::env(HCC_AMDGPU_TARGET)]} {
> +	return [split $::env(HCC_AMDGPU_TARGET) ","]
> +    }
> +
> +    set rocm_agent_enumerator "rocm_agent_enumerator"
> +
> +    # If available, use ROCM_PATH to locate rocm_agent_enumerator.
> +    if { [info exists ::env(ROCM_PATH)] } {
> +	set rocm_agent_enumerator \
> +	    "$::env(ROCM_PATH)/bin/rocm_agent_enumerator"
> +    }
> +
> +    # If we fail to locate the rocm_agent_enumerator, just return an empty
> +    # list of targets and let the caller decide if this should be an error.
> +    if { [which $rocm_agent_enumerator] == 0 } {
> +	return [list]
> +    }
> +
> +    set result [remote_exec host $rocm_agent_enumerator]
> +    if { [lindex $result 0] != 0 } {
> +	error "rocm_agent_enumerator failed"
> +    }
> +
> +    set targets [list]
> +    foreach target [lindex $result 1] {
> +	# Ignore gfx000 which is the host CPU.
> +	if { $target ne "gfx000" } {
> +	    lappend targets $target
> +	}
> +    }
> +
> +    return $targets

I typically don't have ROCM_PATH set, and I don't add /opt/rocm/bin to
my PATH, so rocm_agent_enumerator will not be found by default here.
However, gdb_find_hipcc does fall back on /opt/rocm/bin, so it does find
my hipcc.  I think we should use similar strategies for both cases.  I
don't mind having to set ROCM_PATH or adding /opt/rocm/bin to my PATH if
needed, as long as we're consistent about it.

> +}
> +
> +gdb_caching_proc allow_hipcc_tests {
>      # Only the native target supports ROCm debugging.  E.g., when
>      # testing against GDBserver, there's no point in running the ROCm
>      # tests.
> @@ -29,6 +73,29 @@ proc allow_hipcc_tests { } {
>  	return 0
>      }
>  
> +    # Check we have a working hipcc compiler available.
> +    set targets [hcc_amdgpu_targets]
> +    if { [llength $targets] == 0} {
> +	return 0
> +    }
> +
> +    set flags [list hip additional_flags=--offload-arch=[join $targets ","]]
> +    if {![gdb_simple_compile hipprobe {
> +	    #include <hip/hip_runtime.h>
> +	    __global__ void
> +	    kern () {}
> +
> +	    int
> +	    main ()
> +	    {
> +		kern<<<1, 1>>> ();
> +		hipDeviceSynchronize ();
> +		return 0;
> +	    }
> +	} executable $flags]} {
> +	return 0
> +    }

So, this last part ensures we don't have a "CUDA hipcc false
positive"?  It would be good to put a comment above it to indicate the
precise intent.

Simon
  
Lancelot SIX Feb. 7, 2023, 3:31 p.m. UTC | #3
On 07/02/2023 14:12, Simon Marchi wrote:
> Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding.
> 
> 
> On 2/7/23 08:28, Lancelot SIX via Gdb-patches wrote:
>> Update allow_hipcc_tests so all gdb.rocm tests are skipped if we do not
>> have a working hipcc compiler available.
>>
>> To achieve this, adjust gdb_simple_compile to ensure that the hip
>> program is saved in a ".cpp" file before calling hipcc otherwise
>> compilation will fail.
>>
>> One thing to note is that it is possible to have a hipcc installed with
>> a CUDA backend.  Compiling with this back-end will successfully result
>> in an application, but GDB cannot debug it (at least for the offload
>> part). In the context of the gdb.rocm tests, we want to detect such
>> situation where gdb_simple_compile would give a false positive.
>>
>> To achieve this, this patch checks that there is at least one AMDGPU
>> device available and that hipcc can compile for this or those targets.
>> Detecting the device is done using the rocm_agent_enumerator tool which
>> is installed with the all ROCm installations (it is used by hipcc to
>> detect identify targets if this is not specified on the comand line).
>>
>> This patch also makes the allow_hipcc_tests proc a cached proc.
>> ---
>>   gdb/testsuite/lib/gdb.exp  |  4 +++
>>   gdb/testsuite/lib/rocm.exp | 69 +++++++++++++++++++++++++++++++++++++-
>>   2 files changed, 72 insertions(+), 1 deletion(-)
>>
>> diff --git a/gdb/testsuite/lib/gdb.exp b/gdb/testsuite/lib/gdb.exp
>> index faa0ac05a9a..6333728f71e 100644
>> --- a/gdb/testsuite/lib/gdb.exp
>> +++ b/gdb/testsuite/lib/gdb.exp
>> @@ -4581,6 +4581,10 @@ proc gdb_simple_compile {name code {type object} {compile_flags {}} {object obj}
>>            set ext "go"
>>            break
>>        }
>> +     if { "$flag" eq "hip" } {
>> +         set ext "cpp"
>> +         break
>> +     }
>>       }
>>       set src [standard_temp_file $name-[pid].$ext]
>>       set obj [standard_temp_file $name-[pid].$postfix]
>> diff --git a/gdb/testsuite/lib/rocm.exp b/gdb/testsuite/lib/rocm.exp
>> index b5b59748c27..06d5b3988b8 100644
>> --- a/gdb/testsuite/lib/rocm.exp
>> +++ b/gdb/testsuite/lib/rocm.exp
>> @@ -15,7 +15,51 @@
>>   #
>>   # Support library for testing ROCm (AMD GPU) GDB features.
>>
>> -proc allow_hipcc_tests { } {
>> +# Get the list of gpu targets to compile for.
>> +#
>> +# If HCC_AMDGPU_TARGET is set in the environment, use it.  Otherwise,
>> +# try reading it from the system using the rocm_agent_enumerator
>> +# utility.
>> +
>> +proc hcc_amdgpu_targets {} {
>> +    # Look for HCC_AMDGPU_TARGET (same env var hipcc uses).  If
>> +    # that fails, try using rocm_agent_enumerator (again, same as
>> +    # hipcc does).
>> +    if {[info exists ::env(HCC_AMDGPU_TARGET)]} {
>> +     return [split $::env(HCC_AMDGPU_TARGET) ","]
>> +    }
>> +
>> +    set rocm_agent_enumerator "rocm_agent_enumerator"
>> +
>> +    # If available, use ROCM_PATH to locate rocm_agent_enumerator.
>> +    if { [info exists ::env(ROCM_PATH)] } {
>> +     set rocm_agent_enumerator \
>> +         "$::env(ROCM_PATH)/bin/rocm_agent_enumerator"
>> +    }
>> +
>> +    # If we fail to locate the rocm_agent_enumerator, just return an empty
>> +    # list of targets and let the caller decide if this should be an error.
>> +    if { [which $rocm_agent_enumerator] == 0 } {
>> +     return [list]
>> +    }
>> +
>> +    set result [remote_exec host $rocm_agent_enumerator]
>> +    if { [lindex $result 0] != 0 } {
>> +     error "rocm_agent_enumerator failed"
>> +    }
>> +
>> +    set targets [list]
>> +    foreach target [lindex $result 1] {
>> +     # Ignore gfx000 which is the host CPU.
>> +     if { $target ne "gfx000" } {
>> +         lappend targets $target
>> +     }
>> +    }
>> +
>> +    return $targets
> 
> I typically don't have ROCM_PATH set, and I don't add /opt/rocm/bin to
> my PATH, so rocm_agent_enumerator will not be found by default here.
> However, gdb_find_hipcc does fall back on /opt/rocm/bin, so it does find
> my hipcc.  I think we should use similar strategies for both cases.  I
> don't mind having to set ROCM_PATH or adding /opt/rocm/bin to my PATH if
> needed, as long as we're consistent about it.
> 

When installing the pre-built packages, at least on Ubuntu, there should 
be symlinks from /usr/bin to /etc/alternavites to 
/opt/rocm-$VERSION/bin.   I expect that when those tools start to be 
packaged by the various distributions, they will be available in PATH.

As I am not a big fan of the hard-coded /opt/rocm prefix, especially in 
the context of upstream, I would prefer to update gdb_find_hipcc.

Would something like this be OK?  It search for hipcc in dejagnu's 
tool_root_dir, $::env(ROCM_PATH) (if set) and fallbacks to "hipcc" if 
none of the above worked to search in PATH.

 From b8638cafabd36bc9316591da3c8326e10277372a Mon Sep 17 00:00:00 2001
From: Lancelot SIX <lancelot.six@amd.com>
Date: Tue, 7 Feb 2023 15:13:47 +0000
Subject: [PATCH] gdb/testsuite: look for hipcc in env(ROCM_PATH)

If the hipcc compiler cannot be found in dejagnu's tool_root_dir, look
for it in $::env(ROCM_PATH) (if set).  If hipcc is still not found,
fallback to "hipcc" so the compiler will be searched in the PATH.  This
removes the fallback to the hard-coded "/opt/rocm/bin" prefix.

This change is done so ROCM tools are searched in a uniform manner.
---
  gdb/testsuite/lib/future.exp | 5 ++++-
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/gdb/testsuite/lib/future.exp b/gdb/testsuite/lib/future.exp
index 5720d3837d5..2e8315bbfe1 100644
--- a/gdb/testsuite/lib/future.exp
+++ b/gdb/testsuite/lib/future.exp
@@ -125,8 +125,11 @@ proc gdb_find_hipcc {} {
      global tool_root_dir
      if {![is_remote host]} {
  	set hipcc [lookfor_file $tool_root_dir hipcc]
+	if {$hipcc == "" && [info exists ::env(ROCM_PATH)]} {
+	    set hipcc [lookfor_file $::env(ROCM_PATH)/bin hipcc]
+	}
  	if {$hipcc == ""} {
-	    set hipcc [lookfor_file /opt/rocm/bin hipcc]
+	    set hipcc hipcc
  	}
      } else {
  	set hipcc ""
  

Patch

diff --git a/gdb/testsuite/lib/gdb.exp b/gdb/testsuite/lib/gdb.exp
index faa0ac05a9a..6333728f71e 100644
--- a/gdb/testsuite/lib/gdb.exp
+++ b/gdb/testsuite/lib/gdb.exp
@@ -4581,6 +4581,10 @@  proc gdb_simple_compile {name code {type object} {compile_flags {}} {object obj}
 	    set ext "go"
 	    break
 	}
+	if { "$flag" eq "hip" } {
+	    set ext "cpp"
+	    break
+	}
     }
     set src [standard_temp_file $name-[pid].$ext]
     set obj [standard_temp_file $name-[pid].$postfix]
diff --git a/gdb/testsuite/lib/rocm.exp b/gdb/testsuite/lib/rocm.exp
index b5b59748c27..06d5b3988b8 100644
--- a/gdb/testsuite/lib/rocm.exp
+++ b/gdb/testsuite/lib/rocm.exp
@@ -15,7 +15,51 @@ 
 #
 # Support library for testing ROCm (AMD GPU) GDB features.
 
-proc allow_hipcc_tests { } {
+# Get the list of gpu targets to compile for.
+#
+# If HCC_AMDGPU_TARGET is set in the environment, use it.  Otherwise,
+# try reading it from the system using the rocm_agent_enumerator
+# utility.
+
+proc hcc_amdgpu_targets {} {
+    # Look for HCC_AMDGPU_TARGET (same env var hipcc uses).  If
+    # that fails, try using rocm_agent_enumerator (again, same as
+    # hipcc does).
+    if {[info exists ::env(HCC_AMDGPU_TARGET)]} {
+	return [split $::env(HCC_AMDGPU_TARGET) ","]
+    }
+
+    set rocm_agent_enumerator "rocm_agent_enumerator"
+
+    # If available, use ROCM_PATH to locate rocm_agent_enumerator.
+    if { [info exists ::env(ROCM_PATH)] } {
+	set rocm_agent_enumerator \
+	    "$::env(ROCM_PATH)/bin/rocm_agent_enumerator"
+    }
+
+    # If we fail to locate the rocm_agent_enumerator, just return an empty
+    # list of targets and let the caller decide if this should be an error.
+    if { [which $rocm_agent_enumerator] == 0 } {
+	return [list]
+    }
+
+    set result [remote_exec host $rocm_agent_enumerator]
+    if { [lindex $result 0] != 0 } {
+	error "rocm_agent_enumerator failed"
+    }
+
+    set targets [list]
+    foreach target [lindex $result 1] {
+	# Ignore gfx000 which is the host CPU.
+	if { $target ne "gfx000" } {
+	    lappend targets $target
+	}
+    }
+
+    return $targets
+}
+
+gdb_caching_proc allow_hipcc_tests {
     # Only the native target supports ROCm debugging.  E.g., when
     # testing against GDBserver, there's no point in running the ROCm
     # tests.
@@ -29,6 +73,29 @@  proc allow_hipcc_tests { } {
 	return 0
     }
 
+    # Check we have a working hipcc compiler available.
+    set targets [hcc_amdgpu_targets]
+    if { [llength $targets] == 0} {
+	return 0
+    }
+
+    set flags [list hip additional_flags=--offload-arch=[join $targets ","]]
+    if {![gdb_simple_compile hipprobe {
+	    #include <hip/hip_runtime.h>
+	    __global__ void
+	    kern () {}
+
+	    int
+	    main ()
+	    {
+		kern<<<1, 1>>> ();
+		hipDeviceSynchronize ();
+		return 0;
+	    }
+	} executable $flags]} {
+	return 0
+    }
+
     return 1
 }