OpenMP: Add strictly nested API call check [PR102972]
Commit Message
The original motivation was to fix the routine part
of the restriction quoted below. Namely that the only
routines calls to
omp_get_num_teams() and omp_get_team_num()
are permitted in teams when closely nested.
"Restrictions to the teams construct are as follows:
...
• distribute regions, including any distribute regions arising from composite constructs,
parallel regions, including any parallel regions arising from combined constructs, loop
regions, omp_get_num_teams() regions, and omp_get_team_num() regions are the
only OpenMP regions that may be strictly nested inside the teams region."
While being there, I found one issue related to the ancestor
check – which checked too strictly – and in the generic check
which assumed that the DECL_NAME in Fortran had the '_' suffix
while only the assembler name has.
That worked well with '_' as DECL_NAME then matched the C name
but for the integer(8) version, only ..._8_ was matched and
DECL_NAME only contained ..._8 without tailing '_'.
The assembler name is also needed because in Fortran,
module m
contains
subroutine omp_is_initial_device ()
has an OpenMP API name in DECL_NAME but internally, it is
something like m_MOD_omp_is_initial_device_ - which is an
odd user name but is not the API routine name.
I hope that no target starts mangling the C name such that
C's DECL_NAME() != the assembler name as then the patch
will break, but I think all targets do permit those simple
names and don't introduce further mangling.
While other testsuites had surprisingly little problems with
this change – most did use omp_get_num_teams() and
omp_get_team_num() but that's fine - the GCC testsuite did
have many violations. — I hoped I have fixed them in a
sensible way.
OK for mainline?
Tobias
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
Comments
On Fri, Oct 29, 2021 at 12:09:55PM +0200, Tobias Burnus wrote:
> The original motivation was to fix the routine part
> of the restriction quoted below. Namely that the only
> routines calls to
> omp_get_num_teams() and omp_get_team_num()
> are permitted in teams when closely nested.
>
>
> "Restrictions to the teams construct are as follows:
> ...
> • distribute regions, including any distribute regions arising from composite constructs,
> parallel regions, including any parallel regions arising from combined constructs, loop
> regions, omp_get_num_teams() regions, and omp_get_team_num() regions are the
> only OpenMP regions that may be strictly nested inside the teams region."
>
>
> While being there, I found one issue related to the ancestor
> check – which checked too strictly – and in the generic check
> which assumed that the DECL_NAME in Fortran had the '_' suffix
> while only the assembler name has.
>
> That worked well with '_' as DECL_NAME then matched the C name
> but for the integer(8) version, only ..._8_ was matched and
> DECL_NAME only contained ..._8 without tailing '_'.
>
> The assembler name is also needed because in Fortran,
> module m
> contains
> subroutine omp_is_initial_device ()
> has an OpenMP API name in DECL_NAME but internally, it is
> something like m_MOD_omp_is_initial_device_ - which is an
> odd user name but is not the API routine name.
I'm afraid using DECL_ASSEMBLER_NAME opens a new can of worms.
For one, it shouldn't be HAS_DECL_ASSEMBLER_NAME_P guarded, we either want
to use one or the other always, not randomly pick between them depending
on whether a function already got an assembler name or not.
But, for DECL_ASSEMBLER_NAME, I'm afraid one needs to
targetm.strip_name_encoding and also strip user_label_prefix if any.
At least for C++,
namespace A
{
int omp_is_initial_device () { return 0; }
};
is meant to be checked by
|| (DECL_CONTEXT (fndecl) != NULL_TREE
&& TREE_CODE (DECL_CONTEXT (fndecl)) != TRANSLATION_UNIT_DECL)
If that doesn't work for Fortran modules, we need to find out something
different, e.g. setjmp_or_longjmp_p also relies on that...
On the other side, when we use DECL_NAME we don't currently differentiate
between:
extern "C" int omp_is_initial_device ();
and say
extern int omp_is_initial_device (double, float);
where the latter is in C++ mangled differently. Sure, one can't use
the latter together with #include <omp.h>...
> --- a/gcc/omp-low.c
> +++ b/gcc/omp-low.c
> @@ -3911,7 +3911,7 @@ setjmp_or_longjmp_p (const_tree fndecl)
> /* Return true if FNDECL is an omp_* runtime API call. */
>
> static bool
> -omp_runtime_api_call (const_tree fndecl)
> +omp_runtime_api_call (tree fndecl, bool permit_num_teams)
> {
> tree declname = DECL_NAME (fndecl);
> if (!declname
> @@ -3920,6 +3920,8 @@ omp_runtime_api_call (const_tree fndecl)
> || !TREE_PUBLIC (fndecl))
> return false;
>
> + if (HAS_DECL_ASSEMBLER_NAME_P (fndecl))
> + declname = DECL_ASSEMBLER_NAME (fndecl);
> const char *name = IDENTIFIER_POINTER (declname);
> if (!startswith (name, "omp_"))
> return false;
> @@ -4029,7 +4031,17 @@ omp_runtime_api_call (const_tree fndecl)
> && (name[4 + len + 1] == '\0'
> || (mode > 1
> && strcmp (name + 4 + len + 1, "8_") == 0)))))
> - return true;
> + {
> + /* Only omp_get_num_teams + omp_get_team_num. */
> + if (permit_num_teams
> + && mode == 1
> + && (strncmp (name + 4, "get_num_teams",
> + strlen ("get_num_teams")) == 0
> + || strncmp (name + 4, "get_team_num",
> + strlen ("get_team_num")) == 0))
> + return false;
> + return true;
> + }
> }
> return false;
> }
As mentioned in the PR, I really don't like this permit_num_teams argument,
IMHO it is a caller that should check it, otherwise we end up in the
function with myriads of future exceptions etc.
But, if the stripping of the prefixes is non-trivial, perhaps
omp_runtime_api_call shouldn't return bool, but const char *, either NULL
for "this isn't an OpenMP API call", or pointer to the actual name starting
with "omp_", so that callers can check further.
As for tests where you are adding parallel to avoid the new diagnostics,
I'd suggest parallel if(0) instead, no need to create any extra threads...
Jakub
On 29.10.21 12:53, Jakub Jelinek wrote:
> On Fri, Oct 29, 2021 at 12:09:55PM +0200, Tobias Burnus wrote:
>> [...] only routines calls to
>> omp_get_num_teams() and omp_get_team_num()
>> are permitted in teams when closely nested.
> I'm afraid using DECL_ASSEMBLER_NAME opens a new can of worms. [...]
> At least for C++, [...]
> is meant to be checked by
> || (DECL_CONTEXT (fndecl) != NULL_TREE
> && TREE_CODE (DECL_CONTEXT (fndecl)) != TRANSLATION_UNIT_DECL)
> If that doesn't work for Fortran modules, we need to find out something
> different, e.g. setjmp_or_longjmp_p also relies on that...
It turned out that the current (pre-patch) code works correctly, except
that DECL_NAME for Fortran does not have the '_' suffix. I have now
updated the comments and just for omp_* and omp_*_8. That simplifies the
code and, fortunately, DECL_NAME does seem to work.
> On the other side, when we use DECL_NAME we don't currently differentiate
> between:
> extern "C" int omp_is_initial_device ();
> and say
> extern int omp_is_initial_device (double, float);
> where the latter is in C++ mangled differently. Sure, one can't use
> the latter together with #include <omp.h>...
The question is whether anyone cares that we reject the latter?
> As mentioned in the PR, I really don't like this permit_num_teams argument,
> IMHO it is a caller that should check it, otherwise we end up in the
> function with myriads of future exceptions etc.
I concur – given that DECL_NAME seems to work fine (ignoring C++ w/o
extern "C").
> As for tests where you are adding parallel to avoid the new diagnostics,
> I'd suggest parallel if(0) instead, no need to create any extra threads...
Done.
Thanks for the comments!
Tobias
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
On Fri, Oct 29, 2021 at 05:54:57PM +0200, Tobias Burnus wrote:
> --- a/gcc/omp-low.c
> +++ b/gcc/omp-low.c
> @@ -3926,8 +3926,9 @@ omp_runtime_api_call (const_tree fndecl)
>
> static const char *omp_runtime_apis[] =
> {
> - /* This array has 3 sections. First omp_* calls that don't
> - have any suffixes. */
> + /* This array has 2 sections. First omp_* calls that don't
> + have any suffixes in the DECL_NAME; this includes omp_*
> + but also the omp_*_ of libgomp/fortran.c. */
> "aligned_alloc",
> "aligned_calloc",
> "alloc",
> @@ -3941,8 +3942,6 @@ omp_runtime_api_call (const_tree fndecl)
> "target_is_present",
> "target_memcpy",
> "target_memcpy_rect",
> - NULL,
> - /* Now omp_* calls that are available as omp_* and omp_*_. */
> "capture_affinity",
> "destroy_allocator",
> "destroy_lock",
If we use just 2 sections, then the two sections should be merged (they were
in alphabetic order in each section).
Or we can keep 3 sections and say that the first one is for the
calls on the library side without suffixes and second is for those with
no and _ suffixes, but that in DECL_NAME those don't make a difference.
Or make it 3 sections but the first two not separated by NULL but just a
comment, i.e. what you have in the patch except that the comments would
be adjusted...
Either of those 3 section solutions would be more useful if we ever reconsider
this and go with DECL_ASSEMBLER_NAME.
> @@ -3994,7 +3993,8 @@ omp_runtime_api_call (const_tree fndecl)
> "unset_lock",
> "unset_nest_lock",
> NULL,
> - /* And finally calls available as omp_*, omp_*_ and omp_*_8_. */
> + /* Calls available with DECL_NAME omp_* and omp_*_8, the latter matches
> + omp_*_8_ in libgomp/fortran.c. */
> "display_env",
> "get_ancestor_thread_num",
> "init_allocator",
> @@ -4024,11 +4024,7 @@ omp_runtime_api_call (const_tree fndecl)
> size_t len = strlen (omp_runtime_apis[i]);
> if (strncmp (name + 4, omp_runtime_apis[i], len) == 0
> && (name[4 + len] == '\0'
> - || (mode > 0
> - && name[4 + len] == '_'
> - && (name[4 + len + 1] == '\0'
> - || (mode > 1
> - && strcmp (name + 4 + len + 1, "8_") == 0)))))
> + || (mode && strcmp (name + 4 + len, "_8") == 0)))
> return true;
> }
> return false;
> @@ -4095,9 +4091,24 @@ scan_omp_1_stmt (gimple_stmt_iterator *gsi, bool *handled_ops_p,
> "OpenMP runtime API call %qD in a region with "
> "%<order(concurrent)%> clause", fndecl);
> }
> + if (gimple_code (ctx->stmt) == GIMPLE_OMP_TEAMS
> + && omp_runtime_api_call (fndecl)
> + && strncmp (IDENTIFIER_POINTER (DECL_NAME (fndecl)),
> + "omp_get_num_teams",
> + strlen ("omp_get_num_teams")) != 0
> + && strncmp (IDENTIFIER_POINTER (DECL_NAME (fndecl)),
> + "omp_get_team_num",
> + strlen ("omp_get_team_num")) != 0)
If we wanted to optimize, we could decide based on IDENTIFIER_LENGTH whether
to use strncmp at all and which one. Your choice.
> + #pragma omp distribute
> + for (int i = 0; i < 1; ++i)
> + if (omp_in_parallel ()
> + || omp_get_level () != 0
> + || omp_get_ancestor_thread_num (0) != 0
> + || omp_get_ancestor_thread_num (1) != -1)
> + abort ();
One thing I've missed, with such omp distribute we unfortunately test
it only on one of the teams (probably the first one) rather than all of
them.
Can't we use instead
#pragma omp distribute dist_schedule(static,1)
for (int i = 0; i < omp_get_num_teams (); ++i)
which I believe should ensure that each team will execute exactly one
iteration (i.e. exactly what the code has been doing before).
Otherwise LGTM.
Jakub
On 29.10.21 18:47, Jakub Jelinek wrote:
> Or we can keep 3 sections and say that the first one is for the
> calls on the library side without suffixes and second is for those with
> no and _ suffixes, but that in DECL_NAME those don't make a difference.
That's what I have now done.
>> + && strncmp (IDENTIFIER_POINTER (DECL_NAME (fndecl)),
>> + "omp_get_num_teams",
>> + strlen ("omp_get_num_teams")) != 0
>> + && strncmp (IDENTIFIER_POINTER (DECL_NAME (fndecl)),
>> + "omp_get_team_num",
>> + strlen ("omp_get_team_num")) != 0)
> If we wanted to optimize, we could decide based on IDENTIFIER_LENGTH whether
> to use strncmp at all and which one.
I did this optimization and moved to strcmp has there is no _ suffix in
general and no _8 suffix in particular.
> Can't we use instead
> #pragma omp distribute dist_schedule(static,1)
> for (int i = 0; i < omp_get_num_teams (); ++i)
> which I believe should ensure that each team will execute exactly one
> iteration (i.e. exactly what the code has been doing before).
Did use this now.
> Otherwise LGTM.
Thanks for the review. Committed as Rev.
r12-4809-g948d461954f2642ca187f86c19d297ba7a86320f
Tobias
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
OpenMP: Add strictly nested API call check [PR102972]
The teams construct only permits omp_get_num_teams and omp_get_team_num
as API call in strictly nested regions - check for it.
Additionally, for Fortran, using DECL_NAME does not show the mangled
name, hence, DECL_ASSEMBLER_NAME had to be used to.
Finally, 'target device(ancestor:1)' wrongly rejected non-API calls
as well.
PR middle-end/102972
gcc/ChangeLog:
* omp-low.c (omp_runtime_api_call): Use DECL_ASSEMBLER_NAME to get
internal Fortran name; new permit_num_teams arg to permit
omp_get_num_teams and omp_get_team_num.
(scan_omp_1_stmt): Update call to it, add missing call for
reverse offload, and check for strictly nested API calls in teams.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/target-device-ancestor-3.c: Add non-API
routine test.
* gfortran.dg/gomp/order-6.f90: Add missing bind(C).
* c-c++-common/gomp/teams-3.c: New test.
* gfortran.dg/gomp/teams-3.f90: New test.
* gfortran.dg/gomp/teams-4.f90: New test.
libgomp/ChangeLog:
* testsuite/libgomp.c-c++-common/icv-3.c: Nest API calls inside
parallel construct.
* testsuite/libgomp.c-c++-common/icv-4.c: Likewise.
* testsuite/libgomp.c/target-3.c: Likewise.
* testsuite/libgomp.c/target-5.c: Likewise.
* testsuite/libgomp.c/target-6.c: Likewise.
* testsuite/libgomp.c/target-teams-1.c: Likewise.
* testsuite/libgomp.c/teams-1.c: Likewise.
* testsuite/libgomp.c/thread-limit-2.c: Likewise.
* testsuite/libgomp.c/thread-limit-3.c: Likewise.
* testsuite/libgomp.c/thread-limit-4.c: Likewise.
* testsuite/libgomp.c/thread-limit-5.c: Likewise.
* testsuite/libgomp.fortran/icv-3.f90: Likewise.
* testsuite/libgomp.fortran/icv-4.f90: Likewise.
* testsuite/libgomp.fortran/teams1.f90: Likewise.
gcc/omp-low.c | 30 +++++-
.../c-c++-common/gomp/target-device-ancestor-3.c | 2 +
gcc/testsuite/c-c++-common/gomp/teams-3.c | 64 ++++++++++++
gcc/testsuite/gfortran.dg/gomp/order-6.f90 | 2 +-
gcc/testsuite/gfortran.dg/gomp/teams-3.f90 | 65 ++++++++++++
gcc/testsuite/gfortran.dg/gomp/teams-4.f90 | 47 +++++++++
libgomp/testsuite/libgomp.c-c++-common/icv-3.c | 3 +
libgomp/testsuite/libgomp.c-c++-common/icv-4.c | 1 +
libgomp/testsuite/libgomp.c/target-3.c | 6 +-
libgomp/testsuite/libgomp.c/target-5.c | 1 +
libgomp/testsuite/libgomp.c/target-6.c | 12 ++-
libgomp/testsuite/libgomp.c/target-teams-1.c | 115 +++++++++++++++------
libgomp/testsuite/libgomp.c/teams-1.c | 6 +-
libgomp/testsuite/libgomp.c/thread-limit-2.c | 21 ++--
libgomp/testsuite/libgomp.c/thread-limit-3.c | 1 +
libgomp/testsuite/libgomp.c/thread-limit-4.c | 25 +++--
libgomp/testsuite/libgomp.c/thread-limit-5.c | 1 +
libgomp/testsuite/libgomp.fortran/icv-3.f90 | 6 ++
libgomp/testsuite/libgomp.fortran/icv-4.f90 | 2 +
libgomp/testsuite/libgomp.fortran/teams1.f90 | 16 +--
20 files changed, 351 insertions(+), 75 deletions(-)
@@ -3911,7 +3911,7 @@ setjmp_or_longjmp_p (const_tree fndecl)
/* Return true if FNDECL is an omp_* runtime API call. */
static bool
-omp_runtime_api_call (const_tree fndecl)
+omp_runtime_api_call (tree fndecl, bool permit_num_teams)
{
tree declname = DECL_NAME (fndecl);
if (!declname
@@ -3920,6 +3920,8 @@ omp_runtime_api_call (const_tree fndecl)
|| !TREE_PUBLIC (fndecl))
return false;
+ if (HAS_DECL_ASSEMBLER_NAME_P (fndecl))
+ declname = DECL_ASSEMBLER_NAME (fndecl);
const char *name = IDENTIFIER_POINTER (declname);
if (!startswith (name, "omp_"))
return false;
@@ -4029,7 +4031,17 @@ omp_runtime_api_call (const_tree fndecl)
&& (name[4 + len + 1] == '\0'
|| (mode > 1
&& strcmp (name + 4 + len + 1, "8_") == 0)))))
- return true;
+ {
+ /* Only omp_get_num_teams + omp_get_team_num. */
+ if (permit_num_teams
+ && mode == 1
+ && (strncmp (name + 4, "get_num_teams",
+ strlen ("get_num_teams")) == 0
+ || strncmp (name + 4, "get_team_num",
+ strlen ("get_team_num")) == 0))
+ return false;
+ return true;
+ }
}
return false;
}
@@ -4088,16 +4100,26 @@ scan_omp_1_stmt (gimple_stmt_iterator *gsi, bool *handled_ops_p,
omp_context *octx = ctx;
if (gimple_code (ctx->stmt) == GIMPLE_OMP_SCAN && ctx->outer)
octx = ctx->outer;
- if (octx->order_concurrent && omp_runtime_api_call (fndecl))
+ if (octx->order_concurrent
+ && omp_runtime_api_call (fndecl, false))
{
remove = true;
error_at (gimple_location (stmt),
"OpenMP runtime API call %qD in a region with "
"%<order(concurrent)%> clause", fndecl);
}
+ if (gimple_code (ctx->stmt) == GIMPLE_OMP_TEAMS
+ && omp_runtime_api_call (fndecl, true))
+ {
+ remove = true;
+ error_at (gimple_location (stmt),
+ "OpenMP runtime API call %qD strictly nested in a "
+ "%<teams%> region", fndecl);
+ }
if (gimple_code (ctx->stmt) == GIMPLE_OMP_TARGET
&& (gimple_omp_target_kind (ctx->stmt)
- == GF_OMP_TARGET_KIND_REGION))
+ == GF_OMP_TARGET_KIND_REGION)
+ && omp_runtime_api_call (fndecl, false))
{
tree tgt_clauses = gimple_omp_target_clauses (ctx->stmt);
tree c = omp_find_clause (tgt_clauses, OMP_CLAUSE_DEVICE);
@@ -3,6 +3,7 @@ extern "C" {
#endif
int omp_get_num_teams (void);
+int bar (void);
#ifdef __cplusplus
}
@@ -22,6 +23,7 @@ foo (void)
#pragma omp target device (ancestor: 1)
{
+ a = bar (); /* OK */
a = omp_get_num_teams (); /* { dg-error "OpenMP runtime API call '\[^\n\r]*omp_get_num_teams\[^\n\r]*' in a region with 'device\\(ancestor\\)' clause" } */
}
new file mode 100644
@@ -0,0 +1,64 @@
+/* PR middle-end/102972 */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* From omp.h */
+extern int omp_get_num_teams (void);
+extern void omp_set_num_teams (int);
+extern int omp_get_team_size (int);
+extern int omp_get_team_num (void);
+extern int omp_get_max_teams (void);
+extern void omp_set_teams_thread_limit (int);
+extern int omp_get_teams_thread_limit (void);
+extern int omp_is_initial_device (void);
+extern int omp_get_num_threads (void);
+
+
+#ifdef __cplusplus
+}
+#endif
+
+
+void valid ()
+{
+ #pragma omp teams
+ {
+ #pragma omp distribute
+ for (int i = 0; i < 64; i++)
+ ;
+
+ int n = omp_get_num_teams ();
+ if (n >= omp_get_team_num ())
+ __builtin_abort ();
+
+ #pragma omp parallel for
+ for (int i = 0; i < 64; i++)
+ if (!omp_is_initial_device () || omp_get_num_threads () < 0)
+ __builtin_abort ();
+
+ #pragma omp loop
+ for (int i = 0; i < 64; i++)
+ ;
+ }
+}
+
+void invalid_nest ()
+{
+ #pragma omp teams
+ {
+ #pragma distribute parallel for simd
+ for (int i = 0; i < 64; i++)
+ ;
+
+ int n = 0;
+ n += omp_get_team_size (0); /* { dg-error "OpenMP runtime API call '\[^\n\r]*omp_get_team_size\[^\n\r]*' strictly nested in a 'teams' region" } */
+ n += omp_get_num_teams ();
+ n += omp_get_team_num ();
+ omp_set_num_teams (n); /* { dg-error "OpenMP runtime API call '\[^\n\r]*omp_set_num_teams\[^\n\r]*' strictly nested in a 'teams' region" } */
+ n += omp_get_max_teams (); /* { dg-error "OpenMP runtime API call '\[^\n\r]*omp_get_max_teams\[^\n\r]*' strictly nested in a 'teams' region" } */
+ n += omp_get_teams_thread_limit (); /* { dg-error "OpenMP runtime API call '\[^\n\r]*omp_get_teams_thread_limit\[^\n\r]*' strictly nested in a 'teams' region" } */
+ omp_set_teams_thread_limit (n); /* { dg-error "OpenMP runtime API call '\[^\n\r]*omp_set_teams_thread_limit\[^\n\r]*' strictly nested in a 'teams' region" } */
+ }
+}
@@ -8,7 +8,7 @@ module m
end
integer function omp_get_num_threads ()
end
- integer function omp_target_is_present (x, i)
+ integer function omp_target_is_present (x, i) bind(c)
import :: c_ptr
type(c_ptr) :: x
integer, value :: i
new file mode 100644
@@ -0,0 +1,65 @@
+! PR middle-end/102972
+
+module m
+implicit none (type, external)
+interface
+subroutine omp_set_num_teams (num_teams); integer :: num_teams; end
+subroutine omp_set_teams_thread_limit (thread_limit); integer :: thread_limit; end
+subroutine omp_set_num_teams_8 (num_teams); integer(8) :: num_teams; end
+subroutine omp_set_num_teams_9 (num_teams); integer :: num_teams; end
+subroutine omp_set_teams_thread_limit_8 (thread_limit); integer(8) :: thread_limit; end
+integer function omp_get_num_teams (); end
+integer function omp_get_team_size (level); integer :: level; end
+integer function omp_get_team_num (); end
+integer function omp_get_max_teams (); end
+integer function omp_get_teams_thread_limit (); end
+logical function omp_is_initial_device (); end
+integer function omp_get_num_threads (); end
+end interface
+
+contains
+
+subroutine valid ()
+ integer :: i, n
+ !$omp teams
+ !$omp distribute
+ do i = 1, 64
+ end do
+
+ n = omp_get_num_teams ()
+ if (n >= omp_get_team_num ()) &
+ error stop
+
+ !$omp parallel do
+ do i = 1, 64
+ if (.not.omp_is_initial_device () .or. omp_get_num_threads () < 0) &
+ error stop
+ end do
+
+ !$omp loop
+ do i = 1, 64
+ end do
+ !$omp end teams
+end
+
+subroutine invalid_nest ()
+ integer :: i, n
+ !$omp teams
+ !$omp distribute parallel do simd
+ do i = 1, 64
+ end do
+
+ n = 0
+ n = n + omp_get_team_size (0) ! { dg-error "OpenMP runtime API call '\[^\n\r]*omp_get_team_size\[^\n\r]*' strictly nested in a 'teams' region" }
+ n = n + omp_get_num_teams ()
+ n = n + omp_get_team_num ()
+ call omp_set_num_teams (n) ! { dg-error "OpenMP runtime API call '\[^\n\r]*omp_set_num_teams\[^\n\r]*' strictly nested in a 'teams' region" }
+ call omp_set_num_teams_8 (4_8) ! { dg-error "OpenMP runtime API call '\[^\n\r]*omp_set_num_teams_8\[^\n\r]*' strictly nested in a 'teams' region" }
+ call omp_set_num_teams_9 (4) ! OK - but misnamed user function
+ n = n + omp_get_max_teams () ! { dg-error "OpenMP runtime API call '\[^\n\r]*omp_get_max_teams\[^\n\r]*' strictly nested in a 'teams' region" }
+ n = n + omp_get_teams_thread_limit () ! { dg-error "OpenMP runtime API call '\[^\n\r]*omp_get_teams_thread_limit\[^\n\r]*' strictly nested in a 'teams' region" }
+ call omp_set_teams_thread_limit (n) ! { dg-error "OpenMP runtime API call '\[^\n\r]*omp_set_teams_thread_limit'\[^\n\r]* strictly nested in a 'teams' region" }
+ call omp_set_teams_thread_limit_8 (3_8) ! { dg-error "OpenMP runtime API call '\[^\n\r]*omp_set_teams_thread_limit_8'\[^\n\r]* strictly nested in a 'teams' region" }
+ !$omp end teams
+end
+end module
new file mode 100644
@@ -0,0 +1,47 @@
+! PR middle-end/102972
+
+module m
+implicit none (type, external)
+
+! Note: Those are module functions - not an interface
+! Hence, they are internally manged to contain the module name!
+
+contains
+
+subroutine omp_set_num_teams (num_teams); integer :: num_teams; end
+subroutine omp_set_teams_thread_limit (thread_limit); integer :: thread_limit; end
+subroutine omp_set_num_teams_8 (num_teams); integer(8) :: num_teams; end
+subroutine omp_set_num_teams_9 (num_teams); integer :: num_teams; end
+subroutine omp_set_teams_thread_limit_8 (thread_limit); integer(8) :: thread_limit; end
+integer function omp_get_num_teams (); omp_get_num_teams = 0; end
+integer function omp_get_team_size (level); integer :: level; omp_get_team_size = 0; end
+integer function omp_get_team_num (); omp_get_team_num = 0; end
+integer function omp_get_max_teams (); omp_get_max_teams = 0; end
+integer function omp_get_teams_thread_limit (); omp_get_teams_thread_limit = 0; end
+logical function omp_is_initial_device (); omp_is_initial_device = .true.; end
+integer function omp_get_num_threads (); omp_get_num_threads = 0; end
+end module
+
+subroutine nest_test ()
+ use m
+ implicit none (type, external)
+
+ integer :: i, n
+ !$omp teams
+ !$omp distribute parallel do simd
+ do i = 1, 64
+ end do
+
+ n = 0
+ n = n + omp_get_team_size (0)
+ n = n + omp_get_num_teams ()
+ n = n + omp_get_team_num ()
+ call omp_set_num_teams (n)
+ call omp_set_num_teams_8 (4_8)
+ call omp_set_num_teams_9 (4)
+ n = n + omp_get_max_teams ()
+ n = n + omp_get_teams_thread_limit ()
+ call omp_set_teams_thread_limit (n)
+ call omp_set_teams_thread_limit_8 (3_8)
+ !$omp end teams
+end
@@ -18,6 +18,7 @@ main ()
abort ();
#pragma omp teams
{
+ #pragma omp parallel
if (omp_get_max_teams () != 7
|| omp_get_teams_thread_limit () != 15
|| omp_get_num_teams () < 1
@@ -30,6 +31,7 @@ main ()
}
#pragma omp teams num_teams(5) thread_limit (13)
{
+ #pragma omp parallel
if (omp_get_max_teams () != 7
|| omp_get_teams_thread_limit () != 15
|| omp_get_num_teams () != 5
@@ -41,6 +43,7 @@ main ()
}
#pragma omp teams num_teams(8) thread_limit (16)
{
+ #pragma omp parallel
if (omp_get_max_teams () != 7
|| omp_get_teams_thread_limit () != 15
|| omp_get_num_teams () != 8
@@ -26,6 +26,7 @@ main ()
omp_set_teams_thread_limit (12);
#pragma omp teams
{
+ #pragma omp parallel
if (omp_get_max_teams () != 6
|| omp_get_teams_thread_limit () != 12
|| omp_get_num_teams () < 1
@@ -11,7 +11,9 @@ main ()
abort ();
#pragma omp target if (0)
#pragma omp teams
- if (omp_get_level ())
- abort ();
+ #pragma omp distribute
+ for (int i = 0; i < 1; ++i)
+ if (omp_get_level ())
+ abort ();
return 0;
}
@@ -55,6 +55,7 @@ main ()
abort ();
#pragma omp target if (0)
#pragma omp teams
+ #pragma omp parallel
{
omp_sched_t s_c;
int c_c;
@@ -47,11 +47,13 @@ main ()
{
#pragma omp teams thread_limit (2)
{
- if (omp_in_parallel ()
- || omp_get_level () != 0
- || omp_get_ancestor_thread_num (0) != 0
- || omp_get_ancestor_thread_num (1) != -1)
- abort ();
+ #pragma omp distribute
+ for (int i = 0; i < 1; ++i)
+ if (omp_in_parallel ()
+ || omp_get_level () != 0
+ || omp_get_ancestor_thread_num (0) != 0
+ || omp_get_ancestor_thread_num (1) != -1)
+ abort ();
#pragma omp parallel num_threads (2)
{
if (!omp_in_parallel ()
@@ -35,76 +35,115 @@ foo (int a, int b, long c, long d)
abort ();
#pragma omp target map(from: err)
#pragma omp teams
- err = omp_get_num_teams () < 1 || omp_get_thread_limit () < 1;
+ {
+ err = omp_get_num_teams () < 1;
+ #pragma omp parallel masked
+ err |= omp_get_thread_limit () < 1;
+ }
if (err)
abort ();
#pragma omp target teams map(from: err)
- err = omp_get_num_teams () < 1 || omp_get_thread_limit () < 1;
+ {
+ err = omp_get_num_teams () < 1;
+ #pragma omp parallel masked
+ err |= omp_get_thread_limit () < 1;
+ }
if (err)
abort ();
#pragma omp target map(from: err)
#pragma omp teams num_teams (4)
- err = omp_get_num_teams () < 1 || omp_get_thread_limit () < 1
- || omp_get_num_teams () > 4;
+ {
+ err = omp_get_num_teams () < 1 || omp_get_num_teams () > 4;
+ #pragma omp parallel masked
+ err |= omp_get_thread_limit () < 1;
+ }
if (err)
abort ();
#pragma omp target teams num_teams (4) map(from: err)
- err = omp_get_num_teams () < 1 || omp_get_thread_limit () < 1
- || omp_get_num_teams () > 4;
+ {
+ err = omp_get_num_teams () < 1 || omp_get_num_teams () > 4;
+ #pragma omp parallel masked
+ err |= omp_get_thread_limit () < 1;
+ }
if (err)
abort ();
#pragma omp target map(from: err)
#pragma omp teams thread_limit (7)
- err = omp_get_num_teams () < 1 || omp_get_thread_limit () < 1
- || omp_get_thread_limit () > 7;
+ {
+ err = omp_get_num_teams () < 1;
+ #pragma omp parallel masked
+ err |= omp_get_thread_limit () < 1 || omp_get_thread_limit () > 7;
+ }
if (err)
abort ();
#pragma omp target teams thread_limit (7) map(from: err)
- err = omp_get_num_teams () < 1 || omp_get_thread_limit () < 1
- || omp_get_thread_limit () > 7;
+ {
+ err = omp_get_num_teams () < 1;
+ #pragma omp parallel masked
+ err |= omp_get_thread_limit () < 1 || omp_get_thread_limit () > 7;
+ }
if (err)
abort ();
#pragma omp target map(from: err)
#pragma omp teams num_teams (4) thread_limit (8)
{
{
- err = omp_get_num_teams () < 1 || omp_get_thread_limit () < 1
- || omp_get_num_teams () > 4 || omp_get_thread_limit () > 8;
+ err = omp_get_num_teams () < 1 || omp_get_num_teams () > 4;
}
+ #pragma omp parallel masked
+ err |= omp_get_thread_limit () < 1 || omp_get_thread_limit () > 8;
}
if (err)
abort ();
#pragma omp target teams num_teams (4) thread_limit (8) map(from: err)
- err = omp_get_num_teams () < 1 || omp_get_thread_limit () < 1
- || omp_get_num_teams () > 4 || omp_get_thread_limit () > 8;
+ {
+ err = omp_get_num_teams () < 1 || omp_get_num_teams () > 4;
+ #pragma omp parallel masked
+ err |= omp_get_thread_limit () < 1 || omp_get_thread_limit () > 8;
+ }
if (err)
abort ();
#pragma omp target map(from: err)
#pragma omp teams num_teams (a) thread_limit (b)
- err = omp_get_num_teams () < 1 || omp_get_thread_limit () < 1
- || omp_get_num_teams () > a || omp_get_thread_limit () > b;
+ {
+ err = omp_get_num_teams () < 1 || omp_get_num_teams () > a;
+ #pragma omp parallel masked
+ err |= omp_get_thread_limit () < 1 || omp_get_thread_limit () > b;
+ }
if (err)
abort ();
#pragma omp target teams num_teams (a) thread_limit (b) map(from: err)
- err = omp_get_num_teams () < 1 || omp_get_thread_limit () < 1
- || omp_get_num_teams () > a || omp_get_thread_limit () > b;
+ {
+ err = omp_get_num_teams () < 1 || omp_get_num_teams () > a;
+ #pragma omp parallel masked
+ err |= omp_get_thread_limit () < 1 || omp_get_thread_limit () > b;
+ }
if (err)
abort ();
#pragma omp target map(from: err)
#pragma omp teams num_teams (c + 1) thread_limit (d - 1)
- err = omp_get_num_teams () < 1 || omp_get_thread_limit () < 1
- || omp_get_num_teams () > c + 1 || omp_get_thread_limit () > d - 1;
+ {
+ err = omp_get_num_teams () < 1 || omp_get_num_teams () > c + 1;
+ #pragma omp parallel masked
+ err |= omp_get_thread_limit () < 1 || omp_get_thread_limit () > d - 1;
+ }
if (err)
abort ();
#pragma omp target teams num_teams (c + 1) thread_limit (d - 1) map(from: err)
- err = omp_get_num_teams () < 1 || omp_get_thread_limit () < 1
- || omp_get_num_teams () > c + 1 || omp_get_thread_limit () > d - 1;
+ {
+ err = omp_get_num_teams () < 1 || omp_get_num_teams () > c + 1;
+ #pragma omp parallel masked
+ err |= omp_get_thread_limit () < 1 || omp_get_thread_limit () > d - 1;
+ }
if (err)
abort ();
#pragma omp target map (always, to: c, d) map(from: err)
#pragma omp teams num_teams (c + 1) thread_limit (d - 1)
- err = omp_get_num_teams () < 1 || omp_get_thread_limit () < 1
- || omp_get_num_teams () > c + 1 || omp_get_thread_limit () > d - 1;
+ {
+ err = omp_get_num_teams () < 1 || omp_get_num_teams () > c + 1;
+ #pragma omp parallel masked
+ err |= omp_get_thread_limit () < 1 || omp_get_thread_limit () > d - 1;
+ }
if (err)
abort ();
#pragma omp target data map (to: c, d)
@@ -116,8 +155,11 @@ foo (int a, int b, long c, long d)
their device and original values match is unclear. */
#pragma omp target map (to: c, d) map(from: err)
#pragma omp teams num_teams (c + 1) thread_limit (d - 1)
- err = omp_get_num_teams () < 1 || omp_get_thread_limit () < 1
- || omp_get_num_teams () > c + 1 || omp_get_thread_limit () > d - 1;
+ {
+ err = omp_get_num_teams () < 1 || omp_get_num_teams () > c + 1;
+ #pragma omp parallel masked
+ err |= omp_get_thread_limit () < 1 || omp_get_thread_limit () > d - 1;
+ }
if (err)
abort ();
}
@@ -125,21 +167,30 @@ foo (int a, int b, long c, long d)
target involved. */
#pragma omp target map(from: err)
#pragma omp teams num_teams (baz () + 1) thread_limit (baz () - 1)
- err = omp_get_num_teams () < 1 || omp_get_thread_limit () < 1
- || omp_get_num_teams () > baz () + 1 || omp_get_thread_limit () > baz () - 1;
+ {
+ err = omp_get_num_teams () < 1 || omp_get_num_teams () > baz () + 1;
+ #pragma omp parallel masked
+ err |= omp_get_thread_limit () < 1 || omp_get_thread_limit () > baz () - 1;
+ }
if (err)
abort ();
#pragma omp target teams num_teams (baz () + 1) thread_limit (baz () - 1) map(from: err)
- err = omp_get_num_teams () < 1 || omp_get_thread_limit () < 1
- || omp_get_num_teams () > baz () + 1 || omp_get_thread_limit () > baz () - 1;
+ {
+ err = omp_get_num_teams () < 1 || omp_get_num_teams () > baz () + 1;
+ #pragma omp parallel masked
+ err |= omp_get_thread_limit () < 1 || omp_get_thread_limit () > baz () - 1;
+ }
if (err)
abort ();
/* This one can't be optimized, as v might have different value between
host and target. */
#pragma omp target map(from: err)
#pragma omp teams num_teams (v + 1) thread_limit (v - 1)
- err = omp_get_num_teams () < 1 || omp_get_thread_limit () < 1
- || omp_get_num_teams () > v + 1 || omp_get_thread_limit () > v - 1;
+ {
+ err = omp_get_num_teams () < 1 || omp_get_num_teams () > v + 1;
+ #pragma omp parallel masked
+ err |= omp_get_thread_limit () < 1 || omp_get_thread_limit () > v - 1;
+ }
if (err)
abort ();
}
@@ -6,15 +6,17 @@
int
main ()
{
+ omp_set_dynamic (0);
+ omp_set_nested (1);
#pragma omp teams thread_limit (2)
{
+ #pragma omp distribute
+ for (int i = 0; i < 1; ++i)
if (omp_in_parallel ()
|| omp_get_level () != 0
|| omp_get_ancestor_thread_num (0) != 0
|| omp_get_ancestor_thread_num (1) != -1)
abort ();
- omp_set_dynamic (0);
- omp_set_nested (1);
#pragma omp parallel num_threads (2)
{
if (!omp_in_parallel ()
@@ -20,25 +20,26 @@ main ()
if (omp_get_num_threads () > 9)
abort ();
#pragma omp target if (0)
- #pragma omp teams thread_limit (6)
{
- if (omp_get_thread_limit () > 6)
- abort ();
- if (omp_get_thread_limit () == 6)
+ omp_set_dynamic (0);
+ omp_set_nested (1);
+ #pragma omp teams thread_limit (6)
{
- omp_set_dynamic (0);
- omp_set_nested (1);
#pragma omp parallel num_threads (3)
- if (omp_get_num_threads () != 3)
+ if (omp_get_thread_limit () > 6
+ || (omp_get_thread_limit () == 6 && omp_get_num_threads () != 3))
abort ();
#pragma omp parallel num_threads (3)
- if (omp_get_num_threads () != 3)
+ if (omp_get_thread_limit () > 6
+ || (omp_get_thread_limit () == 6 && omp_get_num_threads () != 3))
abort ();
#pragma omp parallel num_threads (8)
- if (omp_get_num_threads () > 6)
+ if (omp_get_thread_limit () > 6
+ || (omp_get_thread_limit () == 6 && omp_get_num_threads () > 6))
abort ();
#pragma omp parallel num_threads (6)
- if (omp_get_num_threads () != 6)
+ if (omp_get_thread_limit () > 6
+ || (omp_get_thread_limit () == 6 && omp_get_num_threads () != 6))
abort ();
int cnt = 0;
#pragma omp parallel num_threads (5)
@@ -6,6 +6,7 @@ main ()
{
#pragma omp target if (0)
#pragma omp teams thread_limit (1)
+ #pragma omp parallel
if (omp_get_thread_limit () != 1)
abort ();
return 0;
@@ -18,25 +18,25 @@ main ()
#pragma omp parallel num_threads (16)
if (omp_get_num_threads () > 9)
abort ();
+ omp_set_dynamic (0);
+ omp_set_nested (1);
#pragma omp teams thread_limit (6)
- {
- if (omp_get_thread_limit () > 6)
- abort ();
- if (omp_get_thread_limit () == 6)
- {
- omp_set_dynamic (0);
- omp_set_nested (1);
+ {
#pragma omp parallel num_threads (3)
- if (omp_get_num_threads () != 3)
+ if (omp_get_thread_limit () > 6
+ || (omp_get_thread_limit () == 6 && omp_get_num_threads () != 3))
abort ();
#pragma omp parallel num_threads (3)
- if (omp_get_num_threads () != 3)
+ if (omp_get_thread_limit () > 6
+ || (omp_get_thread_limit () == 6 && omp_get_num_threads () != 3))
abort ();
#pragma omp parallel num_threads (8)
- if (omp_get_num_threads () > 6)
+ if (omp_get_thread_limit () > 6
+ || (omp_get_thread_limit () == 6 && omp_get_num_threads () > 6))
abort ();
#pragma omp parallel num_threads (6)
- if (omp_get_num_threads () != 6)
+ if (omp_get_thread_limit () > 6
+ || (omp_get_thread_limit () == 6 && omp_get_num_threads () != 6))
abort ();
int cnt = 0;
#pragma omp parallel num_threads (5)
@@ -52,7 +52,6 @@ main ()
#pragma omp atomic
--cnt;
}
- }
- }
+ }
return 0;
}
@@ -5,6 +5,7 @@ int
main ()
{
#pragma omp teams thread_limit (1)
+ #pragma omp parallel
if (omp_get_thread_limit () != 1)
abort ();
return 0;
@@ -13,6 +13,7 @@ implicit none (type, external)
if (omp_get_teams_thread_limit () /= 15) &
error stop 4
!$omp teams
+ !$omp parallel
if (omp_get_max_teams () /= 7 &
.or. omp_get_teams_thread_limit () /= 15 &
.or. omp_get_num_teams () < 1 &
@@ -22,8 +23,10 @@ implicit none (type, external)
.or. omp_get_thread_limit () < 1 &
.or. omp_get_thread_limit () > 15) &
error stop 5
+ !$omp end parallel
!$omp end teams
!$omp teams num_teams(5) thread_limit (13)
+ !$omp parallel
if (omp_get_max_teams () /= 7 &
.or. omp_get_teams_thread_limit () /= 15 &
.or. omp_get_num_teams () /= 5 &
@@ -32,8 +35,10 @@ implicit none (type, external)
.or. omp_get_thread_limit () < 1 &
.or. omp_get_thread_limit () > 13) &
error stop 6
+ !$omp end parallel
!$omp end teams
!$omp teams num_teams(8) thread_limit (16)
+ !$omp parallel
if (omp_get_max_teams () /= 7 &
.or. omp_get_teams_thread_limit () /= 15 &
.or. omp_get_num_teams () /= 8 &
@@ -42,6 +47,7 @@ implicit none (type, external)
.or. omp_get_thread_limit () < 1 &
.or. omp_get_thread_limit () > 16) &
error stop 7
+ !$omp end parallel
!$omp end teams
contains
logical function env_exists (name)
@@ -16,6 +16,7 @@ implicit none (type, external)
call omp_set_teams_thread_limit (12)
end if
!$omp teams
+ !$omp parallel
if (omp_get_max_teams () /= 6 &
.or. omp_get_teams_thread_limit () /= 12 &
.or. omp_get_num_teams () < 1 &
@@ -25,6 +26,7 @@ implicit none (type, external)
.or. omp_get_thread_limit () < 1 &
.or. omp_get_thread_limit () > 12) &
error stop 3
+ !$omp end parallel
!$omp end teams
contains
logical function env_is_set (name, val)
@@ -2,13 +2,17 @@
program teams1
use omp_lib
+ integer :: i
!$omp teams thread_limit (2)
- if (omp_in_parallel ()) stop 1
- if (omp_get_level () .ne. 0) stop 2
- if (omp_get_ancestor_thread_num (0) .ne. 0) stop 3
- if (omp_get_ancestor_thread_num (1) .ne. -1) stop 4
- call omp_set_dynamic (.false.)
- call omp_set_nested (.true.)
+ !$omp distribute
+ do i = 1, 1
+ if (omp_in_parallel ()) stop 1
+ if (omp_get_level () .ne. 0) stop 2
+ if (omp_get_ancestor_thread_num (0) .ne. 0) stop 3
+ if (omp_get_ancestor_thread_num (1) .ne. -1) stop 4
+ call omp_set_dynamic (.false.)
+ call omp_set_nested (.true.)
+ end do
!$omp parallel num_threads (2)
if (.not. omp_in_parallel ()) stop 5
if (omp_get_level () .ne. 1) stop 6