OpenMP: Fix omp_get_device_from_uid, minor cleanup (was: Re: [Patch][v2] OpenMP: Add get_device_from_uid/omp_get_uid_from_device routines)

Message ID 97fb3c92-909a-4fe8-9d85-fd3c99c31dbe@baylibre.com
State New
Headers
Series OpenMP: Fix omp_get_device_from_uid, minor cleanup (was: Re: [Patch][v2] OpenMP: Add get_device_from_uid/omp_get_uid_from_device routines) |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gcc_build--master-arm success Build passed
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64 success Build passed
linaro-tcwg-bot/tcwg_gcc_check--master-aarch64 success Test passed
linaro-tcwg-bot/tcwg_gcc_check--master-arm success Test passed

Commit Message

Tobias Burnus Sept. 20, 2024, 11:33 p.m. UTC
  Hi Thomas, hello all,

the attached follow-up patch does:

* It fixes an issue (thinko) related to Fortran and \0 terminated,
   which fails for at least substring strings.

* Includes some minor fixes, e.g. ensuring the device is initialized
   in omp_get_uid_from_device, the superfluous 'omp_', or adding some
   inits to oacc-host.c.

* Now the plugins return NULL instead of failing when the UID cannot
   be obtained; in that case, the fallback UID "OMP_DEV_%d" is used.

Comments or remarks before I commit it?

* * *

Regarding the topic of caching in the plugin instead of in
libgomp: If we want to change it, we either to remove the fallback
and require the existence and success of GOMP_OFFLOAD_get_uid.
Otherwise, with host fallback support, we have to cache it at both
locations, which is somehow not really sensible, either.

Thoughts on this topic?

* * *

Longer reply to Thomas' comments:

Thomas Schwinge wrote:

>> +      "omp_get_uid_from_device",
> ..., but here without 'omp_' prefix: 'get_uid_from_device' (and properly
> sorted).

Ups! Should be of course without. (as 'omp_' prefix is checked before).

> Do we apparently not have test suite coverage for these things?

We do *not* test all API routines. The check is, e.g., used in

   gfc_error ("%s cannot contain OpenMP API call in intervening code "

or
   "OpenMP runtime API call %qD in a region with "
   "%<order(concurrent)%> clause", fndecl);

And we have a few tests for each of them, but not a full set of all API routines.

* * *
    

>> +  const char *uid;
> Caching this here, instead of acquiring via 'GOMP_OFFLOAD_get_uid' for
> each call, is a minor performance optimization?  (Similar to other items
> cached here, I guess.)

Yes, but it goes a bit beyond: As the pointer is returned to the user, it
has to be allocated at some point - and cached to avoid allocating more
memory when called repeatable called. As the fallback and host handling is
also done in target.c, the caching is done here.

(Besides the API routines, two env vars and one context selector for
'target_device' support the UID.)

* * *

> Please also update 'libgomp/oacc-host.c:host_dispatch'.
Done.
>> +          ! Note: In gfortran, strings are \0 termined
>> +          integer(c_int) function omp_get_device_from_uid(uid) bind(C)
> For my understanding: in general, Fortran strings are *not*
> NUL-terminated, right?  So this is a specific properly of 'gfortran'
> and/or this GCC/OpenMP interface,

The Fortran standard leaves this to implementation, but by construction,
there is a length (however it is made handled internally, e.g. via the
declaration) and the actual data. - To aid debugging, gfortran NUL terminates
them.

However, when thinking a bit more about it, taking a substring of a
null-terminated string will not magically be \0 at the boundary of the
substring. - Thus, the simplified approach failed + a Fortran specific
function had to be added (→ fortran.c).

* * *

>> +        interface omp_get_uid_from_device
>> +          ! Deviation from OpenMP 6.0: VALUE added.
> (..., which I suppose you've reported to OpenMP...)

No - it is not really a bug in the standard. The OpenMP
specification tries to provide a consistent API - but it
is difficult to create an API without touching the ABI.

For the caller side, the usage is the same independent
whether there is an 'intent(in)' or VALUE attribute,
a Bind(C) with or without binding name. Or also a generic
interface with multiple specific ones - which we do to handle
-fdefault-integer-8.

Obviously, the compiler needs to know those details, but
unless users codes the interface themselves instead of
using omp.h / omp_lib.h / the omp_lib module.

Thus, that's one of the few deviation from the OpenMP
specification which does affect the ABI but not the API.

* * *

>> +GOMP_OFFLOAD_get_uid (int ord)
>> +{
> I guess I'd have just put this code into 'init_hsa_context', filling a
> new statically-sized 'uuid' field in 'hsa_context_info' (like
> 'driver_version_s'; and assuming that 'hsa_context_info' is the right
> abstraction for this), and then just return that 'uuid' from
> 'GOMP_OFFLOAD_get_uid'.

That would be one option. Still, we have to decide whether we either
want to have strictly everything handled in the device code - including
fallback handling (which could be an UID replacement or a fatal error).

Of we do part of the handling elsewhere, e.g. by permitting that the
plugin can fail or does not provide the functions, we can handle it
in target.c (as currently done) - but then we need to cache it there
as well (or at least the fallbacks).

* * *

> That way, you'd avoid the unclear semantics of
> who gets to 'free' the buffer returned from 'GOMP_OFFLOAD_get_uid' upon
> 'GOMP_OFFLOAD_fini_device' -- currently the memory is lost?

Well, depends what you mean by lost. The 'devices' data structure in 
target.c is allocated early during device initialization and it is never 
deallocated. Hence, also the current "uint" member is never deallocated 
and remains until the end of the program accessible.

The variable it self has a rather high lifetime as the user might store 
it and only much later use it. As a pointer to the string is is returned 
to the user, the lifetime has to be rather long.

* * *

> But actually, should we just 'return NULL;' in that case, and let libgomp
> proper handle this gracefully, similar to as if 'GOMP_OFFLOAD_get_uid'
> isn't implemented at all by a plugin?  (What are the expected OpenMP
> semantics?)

The spec is silent on it. Expected is that repeatable runs repeatable 
such that (at least) for the non-host devices, the same always the same 
UID is used for a specific device.

Most sensible is really the UUID, if available, but we can use a 
fallback, if needed.

I now added a "return NULL" which clashes a bit with the topic of 
caching in the plugin as now the OMP_DEV_%d string also needs to be 
cached, but on the libgomp.so side.

* * *
>> +  sprintf (str, "GPU-%02x" "%02x" "%02x" "%02x"
>> […]
>> +	   (unsigned char) s.bytes[0], (unsigned char) s.bytes[1],

BTW: without the (unsigned char), instead of, e.g, "ef" one gets
"ffffffef" which is a bit surprising at a glance.

Tobias
  

Comments

Tobias Burnus Sept. 23, 2024, 2:01 p.m. UTC | #1
Now committed as r15-3799-gcdb9aa0f623ec7 / 
https://gcc.gnu.org/r15-3799-gcdb9aa0f623ec7

Tobias

Am 21.09.24 um 01:33 schrieb Tobias Burnus:
> Hi Thomas, hello all,
>
> the attached follow-up patch does:
>
> * It fixes an issue (thinko) related to Fortran and \0 terminated,
>   which fails for at least substring strings.
>
> * Includes some minor fixes, e.g. ensuring the device is initialized
>   in omp_get_uid_from_device, the superfluous 'omp_', or adding some
>   inits to oacc-host.c.
>
> * Now the plugins return NULL instead of failing when the UID cannot
>   be obtained; in that case, the fallback UID "OMP_DEV_%d" is used.
>
> Comments or remarks before I commit it?
>
> * * *
>
> Regarding the topic of caching in the plugin instead of in
> libgomp: If we want to change it, we either to remove the fallback
> and require the existence and success of GOMP_OFFLOAD_get_uid.
> Otherwise, with host fallback support, we have to cache it at both
> locations, which is somehow not really sensible, either.
>
> Thoughts on this topic?
>
> * * *
>
> Longer reply to Thomas' comments:
>
> Thomas Schwinge wrote:
>
>>> +      "omp_get_uid_from_device",
>> ..., but here without 'omp_' prefix: 'get_uid_from_device' (and properly
>> sorted).
>
> Ups! Should be of course without. (as 'omp_' prefix is checked before).
>
>> Do we apparently not have test suite coverage for these things?
>
> We do *not* test all API routines. The check is, e.g., used in
>
>   gfc_error ("%s cannot contain OpenMP API call in intervening code "
>
> or
>   "OpenMP runtime API call %qD in a region with "
>   "%<order(concurrent)%> clause", fndecl);
>
> And we have a few tests for each of them, but not a full set of all 
> API routines.
>
> * * *
>
>>> +  const char *uid;
>> Caching this here, instead of acquiring via 'GOMP_OFFLOAD_get_uid' for
>> each call, is a minor performance optimization?  (Similar to other items
>> cached here, I guess.)
>
> Yes, but it goes a bit beyond: As the pointer is returned to the user, it
> has to be allocated at some point - and cached to avoid allocating more
> memory when called repeatable called. As the fallback and host 
> handling is
> also done in target.c, the caching is done here.
>
> (Besides the API routines, two env vars and one context selector for
> 'target_device' support the UID.)
>
> * * *
>
>> Please also update 'libgomp/oacc-host.c:host_dispatch'.
> Done.
>>> +          ! Note: In gfortran, strings are \0 termined
>>> +          integer(c_int) function omp_get_device_from_uid(uid) bind(C)
>> For my understanding: in general, Fortran strings are *not*
>> NUL-terminated, right?  So this is a specific properly of 'gfortran'
>> and/or this GCC/OpenMP interface,
>
> The Fortran standard leaves this to implementation, but by construction,
> there is a length (however it is made handled internally, e.g. via the
> declaration) and the actual data. - To aid debugging, gfortran NUL 
> terminates
> them.
>
> However, when thinking a bit more about it, taking a substring of a
> null-terminated string will not magically be \0 at the boundary of the
> substring. - Thus, the simplified approach failed + a Fortran specific
> function had to be added (→ fortran.c).
>
> * * *
>
>>> +        interface omp_get_uid_from_device
>>> +          ! Deviation from OpenMP 6.0: VALUE added.
>> (..., which I suppose you've reported to OpenMP...)
>
> No - it is not really a bug in the standard. The OpenMP
> specification tries to provide a consistent API - but it
> is difficult to create an API without touching the ABI.
>
> For the caller side, the usage is the same independent
> whether there is an 'intent(in)' or VALUE attribute,
> a Bind(C) with or without binding name. Or also a generic
> interface with multiple specific ones - which we do to handle
> -fdefault-integer-8.
>
> Obviously, the compiler needs to know those details, but
> unless users codes the interface themselves instead of
> using omp.h / omp_lib.h / the omp_lib module.
>
> Thus, that's one of the few deviation from the OpenMP
> specification which does affect the ABI but not the API.
>
> * * *
>
>>> +GOMP_OFFLOAD_get_uid (int ord)
>>> +{
>> I guess I'd have just put this code into 'init_hsa_context', filling a
>> new statically-sized 'uuid' field in 'hsa_context_info' (like
>> 'driver_version_s'; and assuming that 'hsa_context_info' is the right
>> abstraction for this), and then just return that 'uuid' from
>> 'GOMP_OFFLOAD_get_uid'.
>
> That would be one option. Still, we have to decide whether we either
> want to have strictly everything handled in the device code - including
> fallback handling (which could be an UID replacement or a fatal error).
>
> Of we do part of the handling elsewhere, e.g. by permitting that the
> plugin can fail or does not provide the functions, we can handle it
> in target.c (as currently done) - but then we need to cache it there
> as well (or at least the fallbacks).
>
> * * *
>
>> That way, you'd avoid the unclear semantics of
>> who gets to 'free' the buffer returned from 'GOMP_OFFLOAD_get_uid' upon
>> 'GOMP_OFFLOAD_fini_device' -- currently the memory is lost?
>
> Well, depends what you mean by lost. The 'devices' data structure in 
> target.c is allocated early during device initialization and it is 
> never deallocated. Hence, also the current "uint" member is never 
> deallocated and remains until the end of the program accessible.
>
> The variable it self has a rather high lifetime as the user might 
> store it and only much later use it. As a pointer to the string is is 
> returned to the user, the lifetime has to be rather long.
>
> * * *
>
>> But actually, should we just 'return NULL;' in that case, and let 
>> libgomp
>> proper handle this gracefully, similar to as if 'GOMP_OFFLOAD_get_uid'
>> isn't implemented at all by a plugin?  (What are the expected OpenMP
>> semantics?)
>
> The spec is silent on it. Expected is that repeatable runs repeatable 
> such that (at least) for the non-host devices, the same always the 
> same UID is used for a specific device.
>
> Most sensible is really the UUID, if available, but we can use a 
> fallback, if needed.
>
> I now added a "return NULL" which clashes a bit with the topic of 
> caching in the plugin as now the OMP_DEV_%d string also needs to be 
> cached, but on the libgomp.so side.
>
> * * *
>>> +  sprintf (str, "GPU-%02x" "%02x" "%02x" "%02x"
>>> […]
>>> +       (unsigned char) s.bytes[0], (unsigned char) s.bytes[1],
>
> BTW: without the (unsigned char), instead of, e.g, "ef" one gets
> "ffffffef" which is a bit surprising at a glance.
>
> Tobias
  

Patch

OpenMP: Fix omp_get_device_from_uid, minor cleanup

In Fortran, omp_get_device_from_uid can also accept substrings, which are
then not NUL terminated.  Fixed by introducing a fortran.c wrapper function.
Additionally, in case of a fail the plugin functions now return NULL instead
of failing fatally such that a fall-back UID is generated.

gcc/ChangeLog:

	* omp-general.cc (omp_runtime_api_procname): Strip "omp_" from
	string; move get_device_from_uid as now a '_' suffix exists.

libgomp/ChangeLog:

	* fortran.c (omp_get_device_from_uid_): New function.
	* libgomp.map (GOMP_6.0): Add it.
	* oacc-host.c (host_dispatch): Init '.uid' and '.get_uid_func'.
	* omp_lib.f90.in: Make it used by removing bind(C).
	* omp_lib.h.in: Likewise.
	* target.c (omp_get_device_from_uid): Ensure the device is initialized.
	* plugin/plugin-gcn.c (GOMP_OFFLOAD_get_uid): Add function comment;
	return NULL in case of an error.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_uid): Likewise.
	* testsuite/libgomp.fortran/device_uid.f90: Update to test substrings.

  gcc/omp-general.cc                               |  4 ++--
 libgomp/fortran.c                                | 18 ++++++++++++++++++
 libgomp/libgomp.map                              |  1 +
 libgomp/oacc-host.c                              |  2 ++
 libgomp/omp_lib.f90.in                           |  5 ++---
 libgomp/omp_lib.h.in                             |  5 ++---
 libgomp/plugin/plugin-gcn.c                      |  8 +++++++-
 libgomp/plugin/plugin-nvptx.c                    |  7 +++++--
 libgomp/target.c                                 |  7 +++++--
 libgomp/testsuite/libgomp.fortran/device_uid.f90 | 18 ++++++++++++++++--
 10 files changed, 60 insertions(+), 15 deletions(-)

diff --git a/gcc/omp-general.cc b/gcc/omp-general.cc
index 12788ad0249..3dfbc315056 100644
--- a/gcc/omp-general.cc
+++ b/gcc/omp-general.cc
@@ -3260,7 +3260,6 @@  omp_runtime_api_procname (const char *name)
       "alloc",
       "calloc",
       "free",
-      "get_device_from_uid",
       "get_interop_int",
       "get_interop_ptr",
       "get_mapped_ptr",
@@ -3290,6 +3289,7 @@  omp_runtime_api_procname (const char *name)
       "get_cancellation",
       "get_default_allocator",
       "get_default_device",
+      "get_device_from_uid",
       "get_device_num",
       "get_dynamic",
       "get_initial_device",
@@ -3339,7 +3339,7 @@  omp_runtime_api_procname (const char *name)
 	 as DECL_NAME only omp_* and omp_*_8 appear.  */
       "display_env",
       "get_ancestor_thread_num",
-      "omp_get_uid_from_device",
+      "get_uid_from_device",
       "get_partition_place_nums",
       "get_place_num_procs",
       "get_place_proc_ids",
diff --git a/libgomp/fortran.c b/libgomp/fortran.c
index 9b7f093555b..7976e5b9638 100644
--- a/libgomp/fortran.c
+++ b/libgomp/fortran.c
@@ -834,6 +834,24 @@  omp_get_interop_rc_desc_ (const char **res, size_t *res_len,
   *res_len = *res ? strlen (*res) : 0;
 }
 
+int
+omp_get_device_from_uid_ (const char *uid, size_t uid_len)
+{
+#ifndef LIBGOMP_OFFLOADED_ONLY
+  char *str = __builtin_alloca ((uid_len + 1) * sizeof (char));
+  memcpy (str, uid, uid_len * sizeof (char));
+  str[uid_len] = '\0';
+  return omp_get_device_from_uid (str);
+#else
+  /* Inside the target region, invoking this routine is undefined
+     behavior; thus, resolve it already here - instead of inside
+     libgomp/config/.../target.c.
+     Note that on nvptx __builtin_alloca is defined, but fails with a sorry
+     during compilation, as it is unsupported until isa 7.3 / sm_52.  */
+  return omp_invalid_device;
+#endif
+}
+
 void
 omp_get_uid_from_device_ (const char **res, size_t *res_len,
 			  int32_t device_num)
diff --git a/libgomp/libgomp.map b/libgomp/libgomp.map
index 0023d3e1b6d..4530b3adc94 100644
--- a/libgomp/libgomp.map
+++ b/libgomp/libgomp.map
@@ -446,6 +446,7 @@  GOMP_5.1.3 {
 GOMP_6.0 {
   global:
 	omp_get_device_from_uid;
+	omp_get_device_from_uid_;
 	omp_get_uid_from_device;
 	omp_get_uid_from_device_;
 	omp_get_uid_from_device_8_;
diff --git a/libgomp/oacc-host.c b/libgomp/oacc-host.c
index 5efdf7fb796..7d4362ebd6c 100644
--- a/libgomp/oacc-host.c
+++ b/libgomp/oacc-host.c
@@ -263,6 +263,7 @@  host_openacc_destroy_thread_data (void *tls_data __attribute__ ((unused)))
 static struct gomp_device_descr host_dispatch =
   {
     .name = "host",
+    .uid = NULL,
     .capabilities = (GOMP_OFFLOAD_CAP_SHARED_MEM
 		     | GOMP_OFFLOAD_CAP_NATIVE_EXEC
 		     | GOMP_OFFLOAD_CAP_OPENACC_200),
@@ -270,6 +271,7 @@  static struct gomp_device_descr host_dispatch =
     .type = OFFLOAD_TARGET_TYPE_HOST,
 
     .get_name_func = host_get_name,
+    .get_uid_func = NULL,
     .get_caps_func = host_get_caps,
     .get_type_func = host_get_type,
     .get_num_devices_func = host_get_num_devices,
diff --git a/libgomp/omp_lib.f90.in b/libgomp/omp_lib.f90.in
index 360352c5a07..77f2bd4d38e 100644
--- a/libgomp/omp_lib.f90.in
+++ b/libgomp/omp_lib.f90.in
@@ -1004,10 +1004,9 @@ 
         end interface
 
         interface
-          ! Note: In gfortran, strings are \0 termined
-          integer(c_int) function omp_get_device_from_uid(uid) bind(C)
+          integer(c_int) function omp_get_device_from_uid (uid)
             use iso_c_binding
-            character(c_char), intent(in) :: uid(*)
+            character, intent(in) :: uid(*)
           end function omp_get_device_from_uid
         end interface
 
diff --git a/libgomp/omp_lib.h.in b/libgomp/omp_lib.h.in
index 10038611d80..fcfc737243c 100644
--- a/libgomp/omp_lib.h.in
+++ b/libgomp/omp_lib.h.in
@@ -612,10 +612,9 @@ 
       end interface
 
       interface
-!       Note: In gfortran, strings are \0 termined
-        integer(c_int) function omp_get_device_from_uid(uid) bind(C)
+        integer(c_int) function omp_get_device_from_uid (uid)
           use iso_c_binding
-          character(c_char), intent(in) :: uid(*)
+          character, intent(in) :: uid(*)
         end function omp_get_device_from_uid
       end interface
 
diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c
index bf6ad371ea2..f805206852d 100644
--- a/libgomp/plugin/plugin-gcn.c
+++ b/libgomp/plugin/plugin-gcn.c
@@ -3316,6 +3316,9 @@  GOMP_OFFLOAD_get_name (void)
   return "gcn";
 }
 
+/* Return the UID; if not available return NULL.
+   Returns freshly allocated memoy.  */
+
 const char *
 GOMP_OFFLOAD_get_uid (int ord)
 {
@@ -3328,7 +3331,10 @@  GOMP_OFFLOAD_get_uid (int ord)
   status = hsa_fns.hsa_agent_get_info_fn (agent->id, HSA_AMD_AGENT_INFO_UUID,
 					  str);
   if (status != HSA_STATUS_SUCCESS)
-    hsa_fatal ("Could not obtain device UUID", status);
+    {
+      free (str);
+      return NULL;
+    }
   return str;
 }
 
diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index a8b85bd9fd0..9310241d4fb 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -1242,6 +1242,9 @@  GOMP_OFFLOAD_get_name (void)
   return "nvptx";
 }
 
+/* Return the UID; if not available return NULL.
+   Returns freshly allocated memoy.  */
+
 const char *
 GOMP_OFFLOAD_get_uid (int ord)
 {
@@ -1254,9 +1257,9 @@  GOMP_OFFLOAD_get_uid (int ord)
   else if (CUDA_CALL_EXISTS (cuDeviceGetUuid))
     r = CUDA_CALL_NOCHECK (cuDeviceGetUuid, &s, dev->dev);
   else
-    r = CUDA_ERROR_NOT_FOUND;
+    return NULL;
   if (r != CUDA_SUCCESS)
-    GOMP_PLUGIN_fatal ("cuDeviceGetUuid error: %s", cuda_error (r));
+    NULL;
 
   size_t len = strlen ("GPU-12345678-9abc-defg-hijk-lmniopqrstuv");
   char *str = (char *) GOMP_PLUGIN_malloc (len + 1);
diff --git a/libgomp/target.c b/libgomp/target.c
index f9aa1789f0b..abd1ac8bcca 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -5268,8 +5268,11 @@  omp_get_device_from_uid (const char *uid)
   if (strcmp (uid, str_omp_initial_device) == 0)
     return omp_initial_device;
   for (int dev = 0; dev < gomp_get_num_devices (); dev++)
-    if (strcmp (uid, gomp_get_uid_for_device (&devices[dev], dev)) == 0)
-      return dev;
+    {
+      struct gomp_device_descr *devicep = resolve_device (dev, false);
+      if (strcmp (uid, gomp_get_uid_for_device (devicep, dev)) == 0)
+	return dev;
+    }
   return omp_invalid_device;
 }
 
diff --git a/libgomp/testsuite/libgomp.fortran/device_uid.f90 b/libgomp/testsuite/libgomp.fortran/device_uid.f90
index 4c2a28a104e..504f6caaf07 100644
--- a/libgomp/testsuite/libgomp.fortran/device_uid.f90
+++ b/libgomp/testsuite/libgomp.fortran/device_uid.f90
@@ -12,7 +12,7 @@  program main
 
   do i = omp_invalid_device - 1, omp_get_num_devices () + 1
     str => omp_get_uid_from_device (i)
-    dev = omp_get_device_from_uid (str);
+    dev = omp_get_device_from_uid (str)
 ! print *, i, str, dev
     if (i < omp_initial_device .or. i > omp_get_num_devices ()) then
       if (dev /= omp_invalid_device .or. associated(str)) &
@@ -30,12 +30,26 @@  program main
       stop 4
     end if
     strs(dev)%str => str
+
+    block
+      ! Check substring handling
+      character(len=100) :: long_str
+      integer :: dev2
+      long_str = str // "ABCDEF"
+      dev2 = omp_get_device_from_uid (long_str(1:len(str)))
+      if (i == omp_initial_device .or. i == omp_get_num_devices ()) then
+        if (dev2 /= omp_initial_device .and. dev2  /= omp_get_num_devices ()) &
+          stop 5
+      else if (dev /= dev2) then
+        stop 6
+      end if
+    end block
   end do
 
   do i = 0, omp_get_num_devices () - 1
     do j = i + 1, omp_get_num_devices ()
       if (strs(i)%str == strs(j)%str) &
-        stop 4
+        stop 7
     end do
   end do
   deallocate (strs)