OpenMP/Fortran: Use firstprivat not alloc for ptr attach for arrays

Message ID e2f799ad-338e-9c38-e1c9-2a4516b07586@mentor.com
State New
Headers
Series OpenMP/Fortran: Use firstprivat not alloc for ptr attach for arrays |

Commit Message

Tobias Burnus May 13, 2022, 5:21 p.m. UTC
  Based on sollve_vv's tests/4.5/target_teams_distribute/test_target_teams_distribute_nowait.F90

As discussed, for simple pointers – like here with nondescriptor array,
instead of alloc:a + pointer assign, a firstprivate + pointer assign makes
more sense.

It also avoids the race exposed by the sollve_vv testcase in some constellations.
(The testcase, both as attached and currently in sollve_vv [→ Issue #532], is
invalid if run with 'nowait' as there is a race related to the array - as it
only (un)mapped as (disjunct) array sections via interleaved, concurrently
running target constructs with nowait clause.)

There might be more places where this should/could be done – and in principle,
firstprivate could also be useful for an array descriptor (but not its data);
this could also be explored. (Including whether it should then not be privatized
with shared memory.)

OK?

Tobias

PS: OpenACC is excluded as it does its own firstprivate handling.
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
  

Comments

Jakub Jelinek May 13, 2022, 5:44 p.m. UTC | #1
On Fri, May 13, 2022 at 07:21:02PM +0200, Tobias Burnus wrote:
> gcc/fortran/ChangeLog:
> 
> 	* trans-openmp.cc (gfc_trans_omp_clauses): When mapping nondescriptor
> 	array sections, use GOMP_MAP_FIRSTPRIVATE_POINTER instead of
> 	GOMP_MAP_POINTER for the pointer attachment.
> 
> libgomp/ChangeLog:
> 
> 	* testsuite/libgomp.fortran/target-nowait-array-section.f90: New test.

Not 100% sure if we want to add such a testcase into the testsuite given
that it is not valid OpenMP, but perhaps it is ok as we are testing a QoI.

> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.fortran/target-nowait-array-section.f90
> @@ -0,0 +1,56 @@
> +! Runs the the target region asynchrolously and checks for it
> +!
> +! Note that  map(alloc: work(:, i)) + nowait  should be save

s/save/safe/

Otherwise LGTM.

	Jakub
  
Thomas Schwinge Nov. 12, 2022, 9:19 a.m. UTC | #2
Hi Tobias!

On 2022-05-13T19:44:51+0200, Jakub Jelinek via Fortran <fortran@gcc.gnu.org> wrote:
> On Fri, May 13, 2022 at 07:21:02PM +0200, Tobias Burnus wrote:
>> gcc/fortran/ChangeLog:
>>
>>      * trans-openmp.cc (gfc_trans_omp_clauses): When mapping nondescriptor
>>      array sections, use GOMP_MAP_FIRSTPRIVATE_POINTER instead of
>>      GOMP_MAP_POINTER for the pointer attachment.
>>
>> libgomp/ChangeLog:
>>
>>      * testsuite/libgomp.fortran/target-nowait-array-section.f90: New test.
>
> Not 100% sure if we want to add such a testcase into the testsuite given
> that it is not valid OpenMP, but perhaps it is ok as we are testing a QoI.

For non-offloading x86_64-pc-linux-gnu '-m32', I'm occasionally (but very
rarely!) seeing this test case FAIL its execution test.  Similar can also
be seen on occasional reports via <gcc-testresults@gcc.gnu.org>,
<gcc-regression@gcc.gnu.org>.


Grüße
 Thomas


'libgomp.fortran/target-nowait-array-section.f90':

| ! Runs the the target region asynchrolously and checks for it
| !
| ! Note that  map(alloc: work(:, i)) + nowait  should be safe
| ! given that a nondescriptor array is used. However, it still
| ! violates a map clause restriction, added in OpenMP 5.1 [354:10-13].
|
| PROGRAM test_target_teams_distribute_nowait
|   USE ISO_Fortran_env, only: INT64
|   implicit none
|     INTEGER, parameter :: N = 1024, N_TASKS = 16
|     INTEGER :: i, j, k, my_ticket
|     INTEGER :: order(n_tasks)
|     INTEGER(INT64) :: work(n, n_tasks)
|     INTEGER :: ticket
|     logical :: async
|
|     ticket = 0
|
|     !$omp target enter data map(to: ticket, order)
|
|     !$omp parallel do num_threads(n_tasks)
|     DO i = 1, n_tasks
|        !$omp target map(alloc: work(:, i), ticket) private(my_ticket) nowait
|        !!$omp target teams distribute map(alloc: work(:, i), ticket) private(my_ticket) nowait
|        DO j = 1, n
|           ! Waste cyles
| !          work(j, i) = 0
| !          DO k = 1, n*(n_tasks - i)
| !             work(j, i) = work(j, i) + i*j*k
| !          END DO
|           my_ticket = 0
|           !$omp atomic capture
|           ticket = ticket + 1
|           my_ticket = ticket
|           !$omp end atomic
|           !$omp atomic write
|           order(i) = my_ticket
|        END DO
|        !$omp end target !teams distribute
|     END DO
|     !$omp end parallel do
|
|     !$omp target exit data map(from:ticket, order)
|
|     IF (ticket .ne. n_tasks*n) stop 1
|     if (maxval(order) /= n_tasks*n) stop 2
|     ! order(i) == n*i if synchronous and between n and n*n_tasks if run concurrently
|     do i = 1, n_tasks
|       if (order(i) < n .or. order(i) > n*n_tasks) stop 3
|     end do
|     async = .false.
|     do i = 1, n_tasks
|       if (order(i) /= n*i) async = .true.
|     end do
|     if (.not. async) stop 4 ! Did not run asynchronously
| end
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
  

Patch

OpenMP/Fortran: Use firstprivat not alloc for ptr attach for arrays

For a non-descriptor array,  map(A(n:m)) was mapped as
  map(tofrom:A[n-1] [len: ...]) map(alloc:A [pointer assign, bias: ...])
with this patch, it is changed to
  map(tofrom:A[n-1] [len: ...]) map(firstprivate:A [pointer assign, bias: ...])

The latter avoids an alloc - and also avoids the race condition with
nowait in the enclosed testcase. (Note: predantically, the testcase is
invalid since OpenMP 5.1, violating the map clause restriction at [354:10-13].

gcc/fortran/ChangeLog:

	* trans-openmp.cc (gfc_trans_omp_clauses): When mapping nondescriptor
	array sections, use GOMP_MAP_FIRSTPRIVATE_POINTER instead of
	GOMP_MAP_POINTER for the pointer attachment.

libgomp/ChangeLog:

	* testsuite/libgomp.fortran/target-nowait-array-section.f90: New test.

 gcc/fortran/trans-openmp.cc                        | 12 +++--
 .../target-nowait-array-section.f90                | 56 ++++++++++++++++++++++
 2 files changed, 65 insertions(+), 3 deletions(-)

diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc
index baa45f78a0e..eb5870c3bc5 100644
--- a/gcc/fortran/trans-openmp.cc
+++ b/gcc/fortran/trans-openmp.cc
@@ -3312,9 +3312,15 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 		  /* An array element or array section which is not part of a
 		     derived type, etc.  */
 		  bool element = n->expr->ref->u.ar.type == AR_ELEMENT;
-		  gfc_trans_omp_array_section (block, n, decl, element,
-					       GOMP_MAP_POINTER, node, node2,
-					       node3, node4);
+		  tree type = TREE_TYPE (decl);
+		  gomp_map_kind k = GOMP_MAP_POINTER;
+		  if (!openacc
+		      && !GFC_DESCRIPTOR_TYPE_P (type)
+		      && !(POINTER_TYPE_P (type)
+			   && GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (type))))
+		    k = GOMP_MAP_FIRSTPRIVATE_POINTER;
+		  gfc_trans_omp_array_section (block, n, decl, element, k,
+					       node, node2, node3, node4);
 		}
 	      else if (n->expr
 		       && n->expr->expr_type == EXPR_VARIABLE
diff --git a/libgomp/testsuite/libgomp.fortran/target-nowait-array-section.f90 b/libgomp/testsuite/libgomp.fortran/target-nowait-array-section.f90
new file mode 100644
index 00000000000..7560cff746b
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/target-nowait-array-section.f90
@@ -0,0 +1,56 @@ 
+! Runs the the target region asynchrolously and checks for it
+!
+! Note that  map(alloc: work(:, i)) + nowait  should be save
+! given that a nondescriptor array is used. However, it still
+! violates a map clause restriction, added in OpenMP 5.1 [354:10-13].
+
+PROGRAM test_target_teams_distribute_nowait
+  USE ISO_Fortran_env, only: INT64
+  implicit none
+    INTEGER, parameter :: N = 1024, N_TASKS = 16
+    INTEGER :: i, j, k, my_ticket
+    INTEGER :: order(n_tasks)
+    INTEGER(INT64) :: work(n, n_tasks)
+    INTEGER :: ticket
+    logical :: async
+
+    ticket = 0
+
+    !$omp target enter data map(to: ticket, order)
+
+    !$omp parallel do num_threads(n_tasks)
+    DO i = 1, n_tasks
+       !$omp target map(alloc: work(:, i), ticket) private(my_ticket) nowait
+       !!$omp target teams distribute map(alloc: work(:, i), ticket) private(my_ticket) nowait
+       DO j = 1, n
+          ! Waste cyles
+!          work(j, i) = 0
+!          DO k = 1, n*(n_tasks - i)
+!             work(j, i) = work(j, i) + i*j*k
+!          END DO
+          my_ticket = 0
+          !$omp atomic capture
+          ticket = ticket + 1
+          my_ticket = ticket
+          !$omp end atomic
+          !$omp atomic write
+          order(i) = my_ticket
+       END DO
+       !$omp end target !teams distribute
+    END DO
+    !$omp end parallel do
+
+    !$omp target exit data map(from:ticket, order)
+
+    IF (ticket .ne. n_tasks*n) stop 1
+    if (maxval(order) /= n_tasks*n) stop 2
+    ! order(i) == n*i if synchronous and between n and n*n_tasks if run concurrently
+    do i = 1, n_tasks
+      if (order(i) < n .or. order(i) > n*n_tasks) stop 3
+    end do
+    async = .false.
+    do i = 1, n_tasks
+      if (order(i) /= n*i) async = .true.
+    end do
+    if (.not. async) stop 4 ! Did not run asynchronously
+end