OpenMP: Fix reverse offload GOMP_TARGET_REV IFN corner cases [PR107236]

Message ID 9e2d0331-92c4-c8a6-a662-61f298fb3976@codesourcery.com
State New

Commit Message

Tobias Burnus Oct. 18, 2022, 7:27 p.m. UTC
  Found when playing around with reverse offload once I used 'omp target parallel'.
The other issue showed up when running the testsuite (which is done with -O2).

In all cases, the ICE is in expand_GOMP_TARGET_REV of this IFN, which should
be unreachable.

Note: ENABLE_OFFLOADING inside the compiler must evaluate to true for this to
show up as an ICE; otherwise, the IFN is not even generated.

I did not see a good reason for DECL_CONTEXT = NULL; thus, I now set it to
the same value as was set for child_fn, admittedly without a strong reason
either.

Tested on x86-64 with ENABLE_OFFLOADING albeit without true offloading.
OK for mainline?

Tobias
-----------------
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 München; limited liability company; Managing Directors: Thomas Heurung, Frank Thürauf; Registered office: München; Commercial register court: München, HRB 106955
  

Comments

Tobias Burnus Oct. 24, 2022, 7:47 a.m. UTC | #1
Ping this patch – and also "Re: [Patch][v5] libgomp/nvptx: Prepare for
reverse-offload callback handling".

For the latter cf. Alexander's code approval
https://gcc.gnu.org/pipermail/gcc-patches/2022-October/603908.html – and
his concerns regarding the generic feature in
https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601959.html (I
think 'target nowait' permits what he thinks is the better way for GPUs.)

Tobias

  
Jakub Jelinek Oct. 24, 2022, 1:14 p.m. UTC | #2
On Tue, Oct 18, 2022 at 09:27:04PM +0200, Tobias Burnus wrote:
> The cgraph_node::create_clone issue is exposed with -O2 for the existing
> libgomp.fortran/reverse-offload-1.f90.
> 
> omp-offload.cc
> 
> 	PR middle-end/107236
> 
> gcc/ChangeLog:
> 	* omp-expand.cc (expand_omp_target): Set calls_declare_variant_alt
> 	in DECL_CONTEXT and not to cfun->decl.
> 	* cgraphclones.cc (cgraph_node::create_clone): Copy also the
> 	node's calls_declare_variant_alt value.
> 
> gcc/testsuite/ChangeLog:
> 	* gfortran.dg/gomp/target-device-ancestor-6.f90: New test.

LGTM, thanks.

	Jakub
  

Patch

OpenMP: Fix reverse offload GOMP_TARGET_REV IFN corner cases [PR107236]

For 'target parallel' and similarly nested directives, cgraph_node's
calls_declare_variant_alt was not set in the parent region's node but in
the node of cfun->decl. Hence, pass_omp_device_lower did not handle the
internal function GOMP_TARGET_REV. The solution is to set it in the node of
child_fn's DECL_CONTEXT, which is set in adjust_context_and_scope.
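
To illustrate the paragraph above, here is a minimal toy model in C++. It is
only a sketch and not GCC code: ToyFn, lowering_pass_runs, ifn_left_unlowered,
flag_old and flag_new are hypothetical names, and the gate is assumed to depend
solely on the flag. It shows why flagging the function currently being compiled
misses the outlined parent region that actually holds the IFN call, while
flagging the child function's context reaches it.

```cpp
#include <cassert>

// Toy stand-in for a cgraph_node plus its FUNCTION_DECL (hypothetical type,
// not GCC's real data structures).
struct ToyFn {
  ToyFn *decl_context = nullptr;            // models DECL_CONTEXT of the decl
  bool calls_declare_variant_alt = false;   // flag that enables the lowering pass
  bool contains_target_rev_ifn = false;     // body holds the GOMP_TARGET_REV call
};

// Assumption: the lowering pass only processes functions with the flag set.
inline bool lowering_pass_runs(const ToyFn &fn) {
  return fn.calls_declare_variant_alt;
}

// The IFN reaches expansion (the ICE case) if its function was never lowered.
inline bool ifn_left_unlowered(const ToyFn &fn) {
  return fn.contains_target_rev_ifn && !lowering_pass_runs(fn);
}

// Old behaviour as described: flag the function being compiled (cfun->decl).
inline void flag_old(ToyFn &current_fn) {
  current_fn.calls_declare_variant_alt = true;
}

// New behaviour as described: flag the context of the outlined child function.
inline void flag_new(ToyFn &child_fn) {
  if (child_fn.decl_context)
    child_fn.decl_context->calls_declare_variant_alt = true;
}
```

In this model, a host function, an outlined 'parallel' function that contains
the IFN, and a reverse-offload child whose decl_context points at the
'parallel' function reproduce the situation: flag_old on the host leaves the
IFN unlowered, whereas flag_new on the child does not.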

The cgraph_node::create_clone issue is exposed with -O2 for the existing
libgomp.fortran/reverse-offload-1.f90.
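
The clone issue can be sketched the same way. The following is a hedged toy
model, not the real cgraph_node::create_clone: ToyCgraphNode,
clone_without_flag and clone_with_flag are hypothetical names. It only shows
that a clone which copies other fields but not calls_declare_variant_alt
silently loses the flag.

```cpp
#include <cassert>

// Toy stand-in for the few cgraph_node fields relevant here (hypothetical).
struct ToyCgraphNode {
  long count = 0;                          // models the profile count
  bool calls_declare_variant_alt = false;  // flag that enables the lowering pass
};

// Models a clone routine that copies the count but forgets the flag.
inline ToyCgraphNode clone_without_flag(const ToyCgraphNode &orig) {
  ToyCgraphNode new_node;
  new_node.count = orig.count;
  return new_node;
}

// Models the fixed clone routine: the flag is copied as well.
inline ToyCgraphNode clone_with_flag(const ToyCgraphNode &orig) {
  ToyCgraphNode new_node = clone_without_flag(orig);
  new_node.calls_declare_variant_alt = orig.calls_declare_variant_alt;
  return new_node;
}
```

With a flagged original, clone_without_flag yields a node whose flag is false,
so a pass gated on the flag would skip the clone; clone_with_flag preserves it.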

omp-offload.cc

	PR middle-end/107236

gcc/ChangeLog:
	* omp-expand.cc (expand_omp_target): Set calls_declare_variant_alt
	in DECL_CONTEXT and not to cfun->decl.
	* cgraphclones.cc (cgraph_node::create_clone): Copy also the
	node's calls_declare_variant_alt value.

gcc/testsuite/ChangeLog:
	* gfortran.dg/gomp/target-device-ancestor-6.f90: New test.

 gcc/cgraphclones.cc                                     |  1 +
 gcc/omp-expand.cc                                       | 13 ++++++-------
 .../gfortran.dg/gomp/target-device-ancestor-6.f90       | 17 +++++++++++++++++
 3 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/gcc/cgraphclones.cc b/gcc/cgraphclones.cc
index eb0fa87b554..bb4b3c5407d 100644
--- a/gcc/cgraphclones.cc
+++ b/gcc/cgraphclones.cc
@@ -375,6 +375,7 @@  cgraph_node::create_clone (tree new_decl, profile_count prof_count,
   if (!new_inlined_to)
     prof_count = count.combine_with_ipa_count (prof_count);
   new_node->count = prof_count;
+  new_node->calls_declare_variant_alt = this->calls_declare_variant_alt;
 
   /* Update IPA profile.  Local profiles need no updating in original.  */
   if (update_original)
diff --git a/gcc/omp-expand.cc b/gcc/omp-expand.cc
index 5dc0bf16e17..c636a174e36 100644
--- a/gcc/omp-expand.cc
+++ b/gcc/omp-expand.cc
@@ -10054,13 +10054,8 @@  expand_omp_target (struct omp_region *region)
 
       /* Handle the case that an inner ancestor:1 target is called by an outer
 	 target region. */
-      if (!is_ancestor)
-	cgraph_node::get (child_fn)->calls_declare_variant_alt
-	  |= cgraph_node::get (cfun->decl)->calls_declare_variant_alt;
-      else  /* Duplicate function to create empty nonhost variant. */
+      if (is_ancestor)
 	{
-	  /* Enable pass_omp_device_lower pass.  */
-	  cgraph_node::get (cfun->decl)->calls_declare_variant_alt = 1;
 	  cgraph_node *fn2_node;
 	  child_fn2 = build_decl (DECL_SOURCE_LOCATION (child_fn),
 				  FUNCTION_DECL,
@@ -10074,7 +10069,7 @@  expand_omp_target (struct omp_region *region)
 	  TREE_PUBLIC (child_fn2) = 0;
 	  DECL_UNINLINABLE (child_fn2) = 1;
 	  DECL_EXTERNAL (child_fn2) = 0;
-	  DECL_CONTEXT (child_fn2) = NULL_TREE;
+	  DECL_CONTEXT (child_fn2) = DECL_CONTEXT (child_fn);
 	  DECL_INITIAL (child_fn2) = make_node (BLOCK);
 	  BLOCK_SUPERCONTEXT (DECL_INITIAL (child_fn2)) = child_fn2;
 	  DECL_ATTRIBUTES (child_fn)
@@ -10098,6 +10093,10 @@  expand_omp_target (struct omp_region *region)
 	  fn2_node->force_output = 1;
 	  node->offloadable = 0;
 
+	  /* Enable pass_omp_device_lower pass.  */
+	  fn2_node = cgraph_node::get (DECL_CONTEXT (child_fn));
+	  fn2_node->calls_declare_variant_alt = 1;
+
 	  t = build_decl (DECL_SOURCE_LOCATION (child_fn),
 			  RESULT_DECL, NULL_TREE, void_type_node);
 	  DECL_ARTIFICIAL (t) = 1;
diff --git a/gcc/testsuite/gfortran.dg/gomp/target-device-ancestor-6.f90 b/gcc/testsuite/gfortran.dg/gomp/target-device-ancestor-6.f90
new file mode 100644
index 00000000000..821e7852e85
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/gomp/target-device-ancestor-6.f90
@@ -0,0 +1,17 @@ 
+! PR middle-end/107236
+
+! Did ICE before because IFN .GOMP_TARGET_REV was not
+! processed in omp-offload.cc.
+! Note: Test required ENABLE_OFFLOADING being true inside GCC.
+
+implicit none
+!$omp requires reverse_offload
+!$omp target parallel num_threads(4)
+  !$omp target device(ancestor:1)
+    call foo()
+  !$omp end target 
+!$omp end target parallel
+contains
+  subroutine foo
+  end
+end