[3/7] OpenMP: Support functions for nested "begin declare variant"
Checks
Commit Message
This patch adds functions for variant name mangling and context selector
merging that are shared by the C and C++ front ends.
The OpenMP specification says that name mangling is supposed to encode
the context selector for the variant, but also provides for no way to
reference these functions directly by name or from a different
compilation unit. It also gives no guidance on how dynamic selectors
might be encoded across compilation units.
The GCC implementation of this feature instead treats variant
functions as if they have no linkage and uses a simple counter to
generate names.
gcc/ChangeLog
* omp-general.cc (omp_mangle_variant_name): New.
(omp_check_for_duplicate_variant): New.
(omp_copy_trait_set): New.
(omp_trait_selectors_equivalent): New.
(omp_combine_trait_sets): New.
(omp_merge_context_selectors): New.
* omp-general.h (omp_mangle_variant_name): Declare.
(omp_check_for_duplicate_variant): Declare.
(omp_merge_context_selectors): Declare.
---
gcc/omp-general.cc | 194 +++++++++++++++++++++++++++++++++++++++++++++
gcc/omp-general.h | 5 ++
2 files changed, 199 insertions(+)
Comments
Hi Sandra, hello world,
On Feburary 10, 2025, Sandra Loosemore wrote:
> This patch adds functions for variant name mangling and context selector
> merging that are shared by the C and C++ front ends.
>
> The OpenMP specification says that name mangling is supposed to encode
> the context selector for the variant, but also provides for no way to
> reference these functions directly by name or from a different
> compilation unit. It also gives no guidance on how dynamic selectors
> might be encoded across compilation units.
In particular, OpenMP states that only function definitions are mangled,
not function declarations.
However, when I try it with the attached program, it fails with
'Error: symbol `_ZGIW3one' is already defined', i.e. mangling does not work.
> The GCC implementation of this feature instead treats variant
> functions as if they have no linkage and uses a simple counter to
> generate names.
The assumption 'no linkage' breaks as soon as C++ modules are used
as the attached example program shows.
As the module name is encoded, I think using the simple counter per
TU should still work. Namely, with -fno-openmp, the one.cc part of
the example gives:
000000000000000b T initializer for module one0000000000000000 T
test@one() Thus, "test.ompvariant0@one()" should be fine as name. * * * TODO: * Handle mangling with C++
modules * Honor the visibility in the module (i.e. if 'export' it needs
to be callable from other TU) → just keep setting as is inside modules.
* * * Unrelated to mangling and visibility, but the following is wrongly
rejected: ----------------- #pragma omp begin declare variant
match(construct={parallel}) const char *test () { return "Parallel"; }
#pragma omp end declare variant -----------------
Namely, this fails with the bogus error:
error: no previous declaration of base function in this scope
However, OpenMP permits this: "For the purpose of call resolution, each
function definition that appears in the delimited code region is a
function variant for an assumed base function, with the same name and a
compatible prototype, that is declared elsewhere without an associated
declare variant directive." Obviously, if the base function is in no
(linked) TU defined, calling it will lead to a link time error. But
otherwise, having it only be defined in a different file / TU or later
in the same file or not at all is perfectly valid. Tobias PS: Note that
I tried the attached examples with OG14 with the assumption that it
doesn't differ from mainline + your patches in this regard. PPS: OpenMP
6.1 will add delimited declare variant also for Fortran.
Hi Sandra, hello world,
see other email from today in this thread regarding a C++
module test case + C++ module issues and a parser issues.
Regarding the patch itself:
On February 10, 2025, Sandra Loosemore wrote:
> This patch adds functions for variant name mangling and context selector
> merging that are shared by the C and C++ front ends.
>
> The OpenMP specification says that name mangling is supposed to encode
> the context selector for the variant, but also provides for no way to
> reference these functions directly by name or from a different
> compilation unit. It also gives no guidance on how dynamic selectors
> might be encoded across compilation units.
>
> The GCC implementation of this feature instead treats variant
> functions as if they have no linkage and uses a simple counter to
> generate names.
The wording needs to be updated for C++ modules. The counting can be
done identically - as the name gets the module name appended,
however, it can be externally accessed (if and only if exported).
> gcc/ChangeLog
> * omp-general.cc (omp_mangle_variant_name): New.
> (omp_check_for_duplicate_variant): New.
> (omp_copy_trait_set): New.
> (omp_trait_selectors_equivalent): New.
> (omp_combine_trait_sets): New.
> (omp_merge_context_selectors): New.
> * omp-general.h (omp_mangle_variant_name): Declare.
> (omp_check_for_duplicate_variant): Declare.
> (omp_merge_context_selectors): Declare.
> ---
> gcc/omp-general.cc | 194 +++++++++++++++++++++++++++++++++++++++++++++
> gcc/omp-general.h | 5 ++
> 2 files changed, 199 insertions(+)
>
> diff --git a/gcc/omp-general.cc b/gcc/omp-general.cc
> index 0a2dd6b5be7..c7b8e17d1f7 100644
> --- a/gcc/omp-general.cc
> +++ b/gcc/omp-general.cc
> @@ -1501,6 +1501,66 @@ omp_check_context_selector (location_t loc, tree ctx,
> return ctx;
> }
>
> +/* Produce a mangled version of BASE_ID for the name of the variant
> + function with context selector CTX. SEP is a separator string.
> + The return value is an IDENTIFIER_NODE.
> +
> + Per the OpenMP spec, "the symbol names of two definitions of a function are
> + considered to be equal if and only if their effective context selectors are
> + equivalent". However, if we did have two such definitions, we'd get an ODR
> + violation. We already take steps in the front ends to make variant
> + functions internal to the compilation unit, since there is no (portable) way
> + to reference them directly by name or declare them as extern in another
> + compilation unit. So, we can diagnose the would-be ODR violations by
> + checking that there is not already a variant for the same function with an
> + equivalent context selector, and otherwise just use a simple counter to name
> + the variant functions instead of any complicated scheme to encode the
> + context selector in the name. */
> +
Likewise.
(That 'begin/end declare variant' mangling only applies to function
definitions and not function declarations, which also prevents using
them outside either a local variant or a C++ module.)
> +tree
> +omp_mangle_variant_name (tree base_id, tree ctx ATTRIBUTE_UNUSED,
> + const char *sep)
> +{
> + const char *base_name = IDENTIFIER_POINTER (base_id);
> +
> + /* Now do the actual mangling. */
> + static int variant_counter;
> + /* The numeric suffix and terminating byte ought to need way less than
> + 32 bytes extra, that's just a magic number. */
> + size_t buflen = (strlen (base_name) + strlen (sep) + strlen ("ompvariant")
> + + 32);
> + char *buffer = (char *) alloca (buflen);
> + snprintf (buffer, buflen, "%s%sompvariant%d", base_name, sep,
> + ++variant_counter);
> + return get_identifier (buffer);
How about:
buflen = snprintf (...)
returnget_identifier_with_length (buffer, buflen); as tiny optimization?
> +/* Diagnose an error if there is already a variant with CTX registered
> + for BASE_DECL. Returns true if OK, false otherwise. */
> +bool
> +omp_check_for_duplicate_variant (location_t loc, tree base_decl, tree ctx)
> +{
> + for (tree attr = DECL_ATTRIBUTES (base_decl); attr; attr = TREE_CHAIN (attr))
> + {
> + attr = lookup_attribute ("omp declare variant base", attr);
> + if (attr == NULL_TREE)
> + break;
Wouldn't it be simpler and cleaner to just write:
for (tree attr = lookup_attribute ("...", DECL_ATTRIBUTES (base_decl);
attr; attr = TREE_CHAIN (attr))
{
> + tree selector = TREE_VALUE (TREE_VALUE (attr));
> + if (omp_context_selector_compare (ctx, selector) == 0)
> + {
> + error_at (loc,
> + "Multiple definitions of variants with the same "
> + "context selector violate the one-definition rule");
I think the following would be helpful (untested; needs nicher wording):
inform (DECL_SOURCE_LOCATION(TREE_PURPOSE (attr)), "... previous definition at ...."); Otherwise: I
think in C/C++ and, hence, in the middle end GCC uses lower case after
'error:' - contrary to Fortran that uses upper case after 'Error:'. * * *
Otherwise, LGTM.
Tobias
@@ -1501,6 +1501,66 @@ omp_check_context_selector (location_t loc, tree ctx,
return ctx;
}
+/* Produce a mangled version of BASE_ID for the name of the variant
+ function with context selector CTX. SEP is a separator string.
+ The return value is an IDENTIFIER_NODE.
+
+ Per the OpenMP spec, "the symbol names of two definitions of a function are
+ considered to be equal if and only if their effective context selectors are
+ equivalent". However, if we did have two such definitions, we'd get an ODR
+ violation. We already take steps in the front ends to make variant
+ functions internal to the compilation unit, since there is no (portable) way
+ to reference them directly by name or declare them as extern in another
+ compilation unit. So, we can diagnose the would-be ODR violations by
+ checking that there is not already a variant for the same function with an
+ equivalent context selector, and otherwise just use a simple counter to name
+ the variant functions instead of any complicated scheme to encode the
+ context selector in the name. */
+
+tree
+omp_mangle_variant_name (tree base_id, tree ctx ATTRIBUTE_UNUSED,
+ const char *sep)
+{
+ const char *base_name = IDENTIFIER_POINTER (base_id);
+
+ /* Now do the actual mangling. */
+ static int variant_counter;
+ /* The numeric suffix and terminating byte ought to need way less than
+ 32 bytes extra, that's just a magic number. */
+ size_t buflen = (strlen (base_name) + strlen (sep) + strlen ("ompvariant")
+ + 32);
+ char *buffer = (char *) alloca (buflen);
+ snprintf (buffer, buflen, "%s%sompvariant%d", base_name, sep,
+ ++variant_counter);
+ return get_identifier (buffer);
+}
+
+/* Forward declaration. */
+static int omp_context_selector_compare (tree ctx1, tree ctx2);
+
+/* Diagnose an error if there is already a variant with CTX registered
+ for BASE_DECL. Returns true if OK, false otherwise. */
+bool
+omp_check_for_duplicate_variant (location_t loc, tree base_decl, tree ctx)
+{
+ for (tree attr = DECL_ATTRIBUTES (base_decl); attr; attr = TREE_CHAIN (attr))
+ {
+ attr = lookup_attribute ("omp declare variant base", attr);
+ if (attr == NULL_TREE)
+ break;
+
+ tree selector = TREE_VALUE (TREE_VALUE (attr));
+ if (omp_context_selector_compare (ctx, selector) == 0)
+ {
+ error_at (loc,
+ "Multiple definitions of variants with the same "
+ "context selector violate the one-definition rule");
+ return false;
+ }
+ }
+ return true;
+}
+
/* Forward declarations. */
static int omp_context_selector_set_compare (enum omp_tss_code, tree, tree);
static int omp_construct_simd_compare (tree, tree, bool);
@@ -4925,3 +4985,137 @@ omp_maybe_apply_loop_xforms (tree *expr_p, tree for_clauses)
}
}
+/* The next group of functions support merging of context selectors for
+ nested "begin declare variant" directives. The spec says:
+
+ ...the effective context selectors of the outer directive are
+ appended to the context selector of the inner directive to form the
+ effective context selector of the inner directive. If a
+ trait-set-selector is present on both directives, the trait-selector
+ list of the outer directive is appended to the trait-selector list
+ of the inner directive after equivalent trait-selectors have been
+ removed from the outer list.
+
+ In other words, there is no requirement to combine non-equivalent
+ trait-selectors according to their peculiar semantics, such as allowing
+ "any" as a wildcard, ANDing trait-property-expressions of "condition" trait
+ property expressions together, etc. Also there is no special provision for
+ treating the "construct" selector as an ordered list.
+
+ Note that the spec does not explicitly say what "equivalent" means;
+ whether the properties and score of the trait-selectors must be identical,
+ or only the name of the trait-selector. This code assumes the former
+ except for the construct trait set where the order of selectors
+ is significant (so that it is *not* equivalent to have the same
+ trait-selector appearing in a different order in the list). */
+
+/* Copy traits from FROM_TS and push them onto TAIL. */
+
+static tree
+omp_copy_trait_set (tree from_ts, tree tail)
+{
+ for (tree ts = from_ts; ts; ts = TREE_CHAIN (ts))
+ tail = make_trait_selector (OMP_TS_CODE (ts), OMP_TS_SCORE (ts),
+ OMP_TS_PROPERTIES (ts), tail);
+ return nreverse (tail);
+}
+
+/* Return true if trait selectors TS1 and TS2 for set TSS are "equivalent". */
+
+static bool
+omp_trait_selectors_equivalent (tree ts1, tree ts2, enum omp_tss_code tss)
+{
+ if (OMP_TS_CODE (ts1) != OMP_TS_CODE (ts2))
+ return false;
+
+ tree score1 = OMP_TS_SCORE (ts1);
+ tree score2 = OMP_TS_SCORE (ts2);
+ if ((score1 && score2 && !simple_cst_equal (score1, score2))
+ || (score1 && !score2)
+ || (!score1 && score2))
+ return false;
+
+ return (omp_context_selector_props_compare (tss, OMP_TS_CODE (ts1),
+ OMP_TS_PROPERTIES (ts1),
+ OMP_TS_PROPERTIES (ts2))
+ == 0);
+}
+
+/* Merge lists of the trait selectors OUTER_TS and INNER_TS for selector set
+ TSS: "the trait-selector list of the outer directive is appended to the
+ trait-selector list of the inner directive after equivalent trait-selectors
+ have been removed from the outer list". */
+
+static tree
+omp_combine_trait_sets (tree outer_ts, tree inner_ts, enum omp_tss_code tss)
+{
+ unsigned HOST_WIDE_INT inner_traits = 0;
+ tree to_list = NULL_TREE;
+
+ for (tree inner = inner_ts; inner; inner = TREE_CHAIN (inner))
+ {
+ omp_ts_code ts_code = OMP_TS_CODE (inner);
+ inner_traits |= 1 << ts_code;
+ to_list
+ = make_trait_selector (ts_code, OMP_TS_SCORE (inner),
+ unshare_expr (OMP_TS_PROPERTIES (inner)),
+ to_list);
+ }
+
+ for (tree outer = outer_ts; outer; outer = TREE_CHAIN (outer))
+ {
+ omp_ts_code ts_code = OMP_TS_CODE (outer);
+ bool remove = false;
+ if (inner_traits & (1 << ts_code))
+ for (tree inner = inner_ts; inner; inner = TREE_CHAIN (inner))
+ if (OMP_TS_CODE (inner) == ts_code)
+ {
+ if (omp_trait_selectors_equivalent (inner, outer, tss))
+ remove = true;
+ break;
+ }
+ if (!remove)
+ to_list
+ = make_trait_selector (ts_code, OMP_TS_SCORE (outer),
+ unshare_expr (OMP_TS_PROPERTIES (outer)),
+ to_list);
+ }
+
+ return nreverse (to_list);
+}
+
+/* Merge context selector INNER_CTX with OUTER_CTX. LOC and DIRECTIVE are
+ used for error checking. Returns the merged context, or error_mark_node
+ if the contexts cannot be merged. */
+
+tree
+omp_merge_context_selectors (location_t loc, tree outer_ctx, tree inner_ctx,
+ enum omp_ctx_directive directive)
+{
+ tree merged_ctx = NULL_TREE;
+
+ if (inner_ctx == error_mark_node || outer_ctx == error_mark_node)
+ return error_mark_node;
+
+ for (unsigned i = OMP_TRAIT_SET_CONSTRUCT; i != OMP_TRAIT_SET_LAST; i++)
+ {
+ omp_tss_code tss_code = static_cast<omp_tss_code>(i);
+ tree outer_ts = omp_get_context_selector_list (outer_ctx, tss_code);
+ tree inner_ts = omp_get_context_selector_list (inner_ctx, tss_code);
+ tree merged_ts = NULL_TREE;
+
+ if (inner_ts && outer_ts)
+ merged_ts = omp_combine_trait_sets (outer_ts, inner_ts, tss_code);
+ else if (inner_ts)
+ merged_ts = omp_copy_trait_set (inner_ts, NULL_TREE);
+ else if (outer_ts)
+ merged_ts = omp_copy_trait_set (outer_ts, NULL_TREE);
+
+ if (merged_ts)
+ merged_ctx = make_trait_set_selector (tss_code, merged_ts,
+ merged_ctx);
+ }
+
+ merged_ctx = nreverse (merged_ctx);
+ return omp_check_context_selector (loc, merged_ctx, directive);
+}
@@ -200,9 +200,14 @@ enum omp_ctx_directive
OMP_CTX_METADIRECTIVE };
extern tree omp_check_context_selector (location_t loc, tree ctx,
enum omp_ctx_directive directive);
+extern tree omp_mangle_variant_name (tree base_id, tree ctx, const char *sep);
+extern bool omp_check_for_duplicate_variant (location_t loc,
+ tree base_decl, tree ctx);
extern void omp_mark_declare_variant (location_t loc, tree variant,
tree construct);
extern int omp_context_selector_matches (tree, tree, bool);
+extern tree omp_merge_context_selectors (location_t, tree, tree,
+ enum omp_ctx_directive);
extern tree resolve_omp_target_device_matches (tree node);
extern tree omp_get_context_selector (tree, enum omp_tss_code,
enum omp_ts_code);