[v3,03/12] libgomp: runtime support for target_device selector

Message ID 20240720204231.2229891-4-sloosemore@baylibre.com
State New
Headers
Series Metadirective support + "declare variant" improvements |

Commit Message

Sandra Loosemore July 20, 2024, 8:42 p.m. UTC
  This patch implements the libgomp runtime support for the dynamic
target_device selector via the GOMP_evaluate_target_device function.

include/ChangeLog
	* cuda/cuda.h (CUdevice_attribute): Add definitions for
	CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR and
	CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR.

libgomp/ChangeLog
	* Makefile.am (libgomp_la_SOURCES): Add selector.c.
	* Makefile.in: Regenerate.
	* config/gcn/selector.c: New.
	* config/linux/selector.c: New.
	* config/linux/x86/selector.c: New.
	* config/nvptx/selector.c: New.
	* libgomp-plugin.h (GOMP_OFFLOAD_evaluate_device): New.
	* libgomp.h (struct gomp_device_descr): Add evaluate_device_func field.
	* libgomp.map (GOMP_5.1.3): New, add GOMP_evaluate_target_device.
	* libgomp.texi (OpenMP Context Selectors): Document dynamic selector
	matching of kind/arch/isa.
	* libgomp_g.h (GOMP_evaluate_current_device): New.
	(GOMP_evaluate_target_device): New.
	* oacc-host.c (host_evaluate_device): New.
	(host_openacc_exec): Initialize evaluate_device_func field to
	host_evaluate_device.
	* plugin/plugin-gcn.c (gomp_match_selectors): New.
	(gomp_match_isa): New.
	(GOMP_OFFLOAD_evaluate_device): New.
	* plugin/plugin-nvptx.c (struct ptx_device): Add compute_major and
	compute_minor fields.
	(nvptx_open_device): Read compute capability information from device.
	(gomp_match_selectors): New.
	(gomp_match_selector): New.
	(CHECK_ISA): New macro.
	(GOMP_OFFLOAD_evaluate_device): New.
	* selector.c: New.
	* target.c (GOMP_evaluate_target_device): New.
	(gomp_load_plugin_for_device): Load evaluate_device plugin function.

Co-Authored-By: Kwok Cheung Yeung <kcy@codesourcery.com>
Co-Authored-By: Sandra Loosemore <sandra@codesourcery.com>
---
 include/cuda/cuda.h                 |   2 +
 libgomp/Makefile.am                 |   2 +-
 libgomp/Makefile.in                 |   5 +-
 libgomp/config/gcn/selector.c       | 102 +++++++
 libgomp/config/linux/selector.c     |  65 +++++
 libgomp/config/linux/x86/selector.c | 406 ++++++++++++++++++++++++++++
 libgomp/config/nvptx/selector.c     |  77 ++++++
 libgomp/libgomp-plugin.h            |   2 +
 libgomp/libgomp.h                   |   1 +
 libgomp/libgomp.map                 |   5 +
 libgomp/libgomp.texi                |  18 +-
 libgomp/libgomp_g.h                 |   8 +
 libgomp/oacc-host.c                 |  11 +
 libgomp/plugin/plugin-gcn.c         |  52 ++++
 libgomp/plugin/plugin-nvptx.c       |  82 ++++++
 libgomp/selector.c                  |  64 +++++
 libgomp/target.c                    |  22 ++
 17 files changed, 918 insertions(+), 6 deletions(-)
 create mode 100644 libgomp/config/gcn/selector.c
 create mode 100644 libgomp/config/linux/selector.c
 create mode 100644 libgomp/config/linux/x86/selector.c
 create mode 100644 libgomp/config/nvptx/selector.c
 create mode 100644 libgomp/selector.c
  

Comments

Jakub Jelinek Aug. 14, 2024, 10:25 a.m. UTC | #1
On Sat, Jul 20, 2024 at 02:42:22PM -0600, Sandra Loosemore wrote:
> This patch implements the libgomp runtime support for the dynamic
> target_device selector via the GOMP_evaluate_target_device function.

For kind, isa and arch traits in the device sets we decide based on
compiler flags and overrides through target attribute etc., not on actual
hw capabilities (and I think we have to, it shouldn't be a dynamic
selection).

Now for kind, isa and arch traits in the target_device set this patch
decides based on compiler flags used to compile some routine in libgomp.so
or libgomp.a.

While this can work in the (very unfortunate) GCN state of things where
only exact isa match is possible (I really hope we can one day generalize
it by being able to compile for a set of isas by supporting lowest
denominator and patching the EM_* in the ELF header or something similar,
perhaps with runtime decisions on what to do for different CPUs), deciding
what to do based on how libgomp.a or libgomp.so.1 has been compiled for the
rest is IMHO wrong.

Now, at least in 5.2 I don't see a restriction that target_device trait
can't be used inside of selectors in a target region.
IMHO that is a bug in the standard.  E.g. it says that
"The expression of a device_num trait must evaluate to a non-negative integer value that is
less than or equal to the value returned by omp_get_num_devices."
but it is unspecified what happens when omp_get_num_devices is
called in the target region.
Not really sure if in the patch you actually support say metadirective
with target_device from inside of a target region querying properties of
say the host device or something similar.

If (hopefully) one can only query target_device on the host, then I think
the best would be that at least for the initial device we actually use
the ISAs etc. of whatever function queries that trait, rather than what
compiler flags were used to compile libgomp.so.1.  That would mean
returning from the function something to the caller to say it is actually
a host device and in the emitted code do the matching based on that rather
than on what the function would otherwise match.
That would then mean we don't need to supply special x86 etc. versions (and
whatever other host, powerpc, ..., where we just didn't define enough
details).

For other devices, this is harder because there is no specific offload code
associated with the target_device trait use.  Guess it would be best if
it could be picked from the minimum ISA actually supported in the offloading
code or something similar, by the time this is invoked libgomp should have
the offloading code (if any) already registered (unless it is in some
library dlopened later, that is fuzzy thing), so best would be e.g. for PTX
to watch the minimum required SM level of the code that is being registered,
whether stored in the offload section separately or figured out from the PTX
code being loaded.  But perhaps initially what you do for offloading devices
might be still ok.

> include/ChangeLog
> 	* cuda/cuda.h (CUdevice_attribute): Add definitions for
> 	CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR and
> 	CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR.
> 
> libgomp/ChangeLog
> 	* Makefile.am (libgomp_la_SOURCES): Add selector.c.
> 	* Makefile.in: Regenerate.
> 	* config/gcn/selector.c: New.
> 	* config/linux/selector.c: New.
> 	* config/linux/x86/selector.c: New.
> 	* config/nvptx/selector.c: New.
> 	* libgomp-plugin.h (GOMP_OFFLOAD_evaluate_device): New.
> 	* libgomp.h (struct gomp_device_descr): Add evaluate_device_func field.
> 	* libgomp.map (GOMP_5.1.3): New, add GOMP_evaluate_target_device.
> 	* libgomp.texi (OpenMP Context Selectors): Document dynamic selector
> 	matching of kind/arch/isa.
> 	* libgomp_g.h (GOMP_evaluate_current_device): New.
> 	(GOMP_evaluate_target_device): New.
> 	* oacc-host.c (host_evaluate_device): New.
> 	(host_openacc_exec): Initialize evaluate_device_func field to
> 	host_evaluate_device.
> 	* plugin/plugin-gcn.c (gomp_match_selectors): New.
> 	(gomp_match_isa): New.
> 	(GOMP_OFFLOAD_evaluate_device): New.
> 	* plugin/plugin-nvptx.c (struct ptx_device): Add compute_major and
> 	compute_minor fields.
> 	(nvptx_open_device): Read compute capability information from device.
> 	(gomp_match_selectors): New.
> 	(gomp_match_selector): New.
> 	(CHECK_ISA): New macro.
> 	(GOMP_OFFLOAD_evaluate_device): New.
> 	* selector.c: New.
> 	* target.c (GOMP_evaluate_target_device): New.
> 	(gomp_load_plugin_for_device): Load evaluate_device plugin function.
> 
> Co-Authored-By: Kwok Cheung Yeung <kcy@codesourcery.com>
> Co-Authored-By: Sandra Loosemore <sandra@codesourcery.com>

> --- /dev/null
> +++ b/libgomp/config/gcn/selector.c
> @@ -0,0 +1,102 @@
> +/* Copyright (C) 2022 Free Software Foundation, Inc.

2022-2024

> +
> +/* The selectors are passed as strings, but are actually sets of multiple
> +   trait property names, separated by '\0' and with an extra '\0' at
> +   the end.  Match such a string SELECTORS against an array of strings
> +   CHOICES, that is terminated by a null pointer.
> +   matches.  */
> +static bool
> +gomp_match_selectors (const char *selectors, const char **choices)
> +{
> +  while (*selectors != '\0')
> +    {
> +      bool match = false;
> +      for (int i = 0; !match && choices[i]; i++)
> +	match = !strcmp (selectors, choices[i]);
> +      if (!match)
> +	return false;
> +      selectors += strlen (selectors) + 1;
> +    }
> +  return true;
> +}

Isn't this function the same on all arches?
If yes, shouldn't it be defined in one place (static inline function in
libgomp.h, or defined somewhere where it will be compiled for all libgomp
targets?

> +bool
> +GOMP_evaluate_current_device (const char *kind, const char *arch,
> +			      const char *isa)
> +{
> +  static const char *kind_choices[] = { "gpu", "nohost", NULL };

Is "any" handled on the compiler side and never makes it through here?

> --- /dev/null
> +++ b/libgomp/config/linux/selector.c

Why does this exist at all?  Isn't the libgomp/selector.c the same?

> --- /dev/null
> +++ b/libgomp/config/linux/x86/selector.c

As I wrote earlier, I'd strongly prefer if the host device part was
done on the compiler side based on actual compiler flags, not decide
based on libgomp.so compilation flags (which are mostly the same at least
in distro builds, lowest common denominator).
> +#ifdef __AVX2__
> +       "avx2",
> +#endif
> +#ifdef __AVX512F__
> +  "avx512f",

This one is misindented.

Also, any time one adds a new isa this would need to be updated.)

> --- a/libgomp/oacc-host.c
> +++ b/libgomp/oacc-host.c
> @@ -136,6 +136,16 @@ host_run (int n __attribute__ ((unused)), void *fn_ptr, void *vars,
>    fn (vars);
>  }
>  
> +static bool
> +host_evaluate_device (int device_num __attribute__ ((unused)),
> +		      const char *kind __attribute__ ((unused)),
> +		      const char *arch __attribute__ ((unused)),
> +		      const char *isa __attribute__ ((unused)))
> +{
> +  __builtin_unreachable ();

GOMP_fatal or at least abort (); please.

Plus, what Tobias said about the passed in device_num, decide whether it
is the actual device number, or one with backwards compatibility
transformations and depending on that adjust it on the runtime side or
on the library side.

	Jakub
  
Jakub Jelinek Aug. 14, 2024, 11:01 a.m. UTC | #2
On Wed, Aug 14, 2024 at 12:25:23PM +0200, Jakub Jelinek wrote:
> Now, at least in 5.2 I don't see a restriction that target_device trait
> can't be used inside of selectors in a target region.
> IMHO that is a bug in the standard.  E.g. it says that
> "The expression of a device_num trait must evaluate to a non-negative integer value that is
> less than or equal to the value returned by omp_get_num_devices."
> but it is unspecified what happens when omp_get_num_devices is
> called in the target region.
> Not really sure if in the patch you actually support say metadirective
> with target_device from inside of a target region querying properties of
> say the host device or something similar.

I've filed https://github.com/OpenMP/spec/issues/4133 for this (only
accessible to OpenMP language committee members unfortunately).

	Jakub
  
Tobias Burnus Sept. 9, 2024, 10:46 a.m. UTC | #3
Hi all,

Jakub Jelinek wrote:
> On Sat, Jul 20, 2024 at 02:42:22PM -0600, Sandra Loosemore wrote:
>> This patch implements the libgomp runtime support for the dynamic
>> target_device selector via the GOMP_evaluate_target_device function.
> […]
>
> Now for kind, isa and arch traits in the target_device set this patch
> decides based on compiler flags used to compile some routine in libgomp.so
> or libgomp.a.
>
> While this can work in the (very unfortunate) GCN state of things where
> only exact isa match is possible (I really hope we can one day generalize
> it by being able to compile for a set of isas by supporting lowest
> denominator and patching the EM_* in the ELF header or something similar,
> perhaps with runtime decisions on what to do for different CPUs),

I think that can only work to some extend. LLVM has "gfx11-generic" 
which is compatible with gfx110{0,1,2,3,} and gfx115{0,1,2}, which at 
least helps a bit. For gfx10, it has gfx10-1-generic for gfx101{0,1,2,3} 
and gfx10-3-generic for gfx103[0-6] and gfx9-generic for gfx90{0,2,4,6,9,c}.

Thus, we could have versions which support a common subset, but we still 
need multiple libraries. And it needs to be implemented …

This sounds like a task for the GCN maintainer …

* * *

> deciding what to do based on how libgomp.a or libgomp.so.1 has been compiled for the
> rest is IMHO wrong.

I wonder whether we should do something like the following.

[The following is a mix between compile code and generated code, for illustrative
purpose.]

Inside the compiler do:

#ifndef ACCEL_COMPILER
intr = 0; if (targetm.omp.device_kind_arch_isa != NULL) r = 
targetm.omp.device_kind_arch_isa (omp_device_{kind,arch,isa}, val);

    if (dev_num && TREE_CODE (dev_num) == INTEGER_CST)
      {
        if (dev_num < -1 /* INVALID_DEVICE or nonconforming */)
          → 0
        if (dev_num == initial_device)
          → r
      }
<code gen>
      /* The '? :' condition is a compile time condition. */
      d = <dev_num> ? <dev_num> : omp_get_default_device ();
      if (d < -1)
        → 0
      else if (d == -1 || d == omp_get_initial_device ())
        → r
      else
        → GOMP_get_device_kind_arch_isa  (d, kind, arch, isa)
</codegen>
#else
    /* VARIANT 1: Assume that neither reverse offload nor nested target occurs. */
    →targetm.omp.device_kind_arch_isa  (kind, arch, isa)
    /* VARIANT 2 -
    d = <dev_num> ? <dev_num> : omp_get_default_device ();
    if (d == omp_get_device_num ())
      →targetm.omp.device_kind_arch_isa  (kind, arch, isa)
    else
      /* Cannot really do anything here - and as no nested target is permitted,
         use 'false'.  */
      → 0
#endif


* * *

And on the libgomp side GOMP_get_device_kind_arch_isa → plugin code.

And there:

(A) GCN:

kind and arch are clear. For ISA:

agent->device_isa + use existing isa_hsa_name() function (or likewise).

(B) Nvptx:

cuDeviceGetAttribute + CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR = 75 
and CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR = 76.

Example: sm_89 = (major) 8 and (minor) 9.

* * *

Does this sound sensible?

Tobias

PS: For the current host-offload GSoC task, we might eventually think of 
using cpuid on x86-64, i.e. gcc/config/i386/cpuid.h.

PS: RFC remains: Should 'sm_80' be true if the hardware/compilation is 
'sm_89' or not? Namely: Does 'sm_80' denote the capability or the 
specific hardware?

Regarding this topic, see also 
https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662059.html
  
Sandra Loosemore Sept. 22, 2024, 1 a.m. UTC | #4
On 9/9/24 04:46, Tobias Burnus wrote:

> I wonder whether we should do something like the following.
> 
> [The following is a mix between compile code and generated code, for 
> illustrative
> purpose.]
> 
> Inside the compiler do:
> 
> #ifndef ACCEL_COMPILER
> intr = 0; if (targetm.omp.device_kind_arch_isa != NULL) r = 
> targetm.omp.device_kind_arch_isa (omp_device_{kind,arch,isa}, val);
> 
>     if (dev_num && TREE_CODE (dev_num) == INTEGER_CST)
>       {
>         if (dev_num < -1 /* INVALID_DEVICE or nonconforming */)
>           → 0
>         if (dev_num == initial_device)
>           → r
>       }
> <code gen>
>       /* The '? :' condition is a compile time condition. */
>       d = <dev_num> ? <dev_num> : omp_get_default_device ();
>       if (d < -1)
>         → 0
>       else if (d == -1 || d == omp_get_initial_device ())
>         → r
>       else
>         → GOMP_get_device_kind_arch_isa  (d, kind, arch, isa)
> </codegen>
> #else
>     /* VARIANT 1: Assume that neither reverse offload nor nested target 
> occurs. */
>     →targetm.omp.device_kind_arch_isa  (kind, arch, isa)
>     /* VARIANT 2 -
>     d = <dev_num> ? <dev_num> : omp_get_default_device ();
>     if (d == omp_get_device_num ())
>       →targetm.omp.device_kind_arch_isa  (kind, arch, isa)
>     else
>       /* Cannot really do anything here - and as no nested target is 
> permitted,
>          use 'false'.  */
>       → 0
> #endif
> 
> 
> * * *
> 
> And on the libgomp side GOMP_get_device_kind_arch_isa → plugin code.

Hmmmm.  I've fleshed out my own idea a bit, that would entirely get rid 
of the libgomp runtime support.

Leaving aside cases that can be trivially determined (offloading is 
disabled, the device number is an expression that can be statically 
determined to refer to the host or an invalid device, kind/arch/isa 
don't match any enabled offload target), I think the predicate of the 
more general case for

target_device={device_num (NUM), kind(KIND), arch(ARCH), isa(ISA)}

can be expressed (using GCC statement expression syntax) as

({
    int matches;
    #pragma omp target device (NUM)
      matches = magic_cookie (KIND, ARCH, ISA)
    matches;
})

where magic_cookie is either a built-in or new gimple code.  I think the 
gimplifier is probably the right place to do the above transformation, 
and the magic_cookie expansion would happen during (or at least at the 
same point in compilation as) late metadirective resolution; IOW, in the 
offload compiler).  That part can call targetm.omp.device_kind_arch_isa 
to resolve the whole works into a constant true/false, similar to how 
the "device" selector is handled in the offload compiler, rather than 
into any runtime routine.

The gimplifier can issue a "sorry" if the target_device selector appears 
in a target region, if this seems like something that should not/can not 
reasonably be supported.

Does this seem like a plausible way to continue?  Of course I might run 
into some unanticipated difficulty in implementation but it would be 
helpful to know if I'm totally barking up the wrong tree here before I 
waste too much time continuing down this path.

-Sandra
  
Tobias Burnus Sept. 22, 2024, 5:37 a.m. UTC | #5
On Sunday, September 22, 2024, Sandra Loosemore <sloosemore@baylibre.com>
wrote:
> […] I think the predicate of the more general case for
>
> target_device={device_num (NUM), kind(KIND), arch(ARCH), isa(ISA)}
>
> can be expressed (using GCC statement expression syntax) as
>
> ({
>    int matches;
>    #pragma omp target device (NUM)
>      matches = magic_cookie (KIND, ARCH, ISA)
>    matches;
> })
>
> where magic_cookie is either a built-in or new gimple code.  I think the
gimplifier is probably the right place to do the above transformation, and
the magic_cookie expansion would happen during (or at least at the same
point in compilation as) late metadirective resolution; IOW, in the offload
compiler).  That part can call targetm.omp.device_kind_arch_isa to resolve
the whole works into a constant true/false, similar to how the "device"
selector is handled in the offload compiler, rather than into any runtime
routine.

I think that can work. I was (and am to a much lesser extent) worrying a
bit about the overhead the target call, but as the spec only has one
(default or the one specified) that should be fine.
(One can think of merging multiple target regions for multiple candidates
or moving them out of a hot loop.)

And for uid(xxx) it still needs a runtime call, but then calling
__builtin_strcmp(xxx, omp_get_uid_from_device(...)) should be fine.

There is the larger question whether we should report the compile time
supported isa or the real one, but I think either works. And whether to
regard the isa as feature set, which newer systems also support (done for
x86(_64)) or as strictly that specific version (as done for nvptx), but
that's independent of the way we implement it.


> Does this seem like a plausible way to continue?

At a glace, yes.

Tobias
  

Patch

diff --git a/include/cuda/cuda.h b/include/cuda/cuda.h
index 804d08ca57e..81545c2ebef 100644
--- a/include/cuda/cuda.h
+++ b/include/cuda/cuda.h
@@ -83,6 +83,8 @@  typedef enum {
   CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_MULTIPROCESSOR = 39,
   CU_DEVICE_ATTRIBUTE_ASYNC_ENGINE_COUNT = 40,
   CU_DEVICE_ATTRIBUTE_UNIFIED_ADDRESSING = 41,
+  CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR = 75,
+  CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR = 76,
   CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR = 82,
   CU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS = 88
 } CUdevice_attribute;
diff --git a/libgomp/Makefile.am b/libgomp/Makefile.am
index 855f0affddf..ba2dd0bb3c2 100644
--- a/libgomp/Makefile.am
+++ b/libgomp/Makefile.am
@@ -70,7 +70,7 @@  libgomp_la_SOURCES = alloc.c atomic.c barrier.c critical.c env.c error.c \
 	target.c splay-tree.c libgomp-plugin.c oacc-parallel.c oacc-host.c \
 	oacc-init.c oacc-mem.c oacc-async.c oacc-plugin.c oacc-cuda.c \
 	priority_queue.c affinity-fmt.c teams.c allocator.c oacc-profiling.c \
-	oacc-target.c target-indirect.c
+	oacc-target.c target-indirect.c selector.c
 
 include $(top_srcdir)/plugin/Makefrag.am
 
diff --git a/libgomp/Makefile.in b/libgomp/Makefile.in
index da902f3daca..b5d704992fc 100644
--- a/libgomp/Makefile.in
+++ b/libgomp/Makefile.in
@@ -219,7 +219,7 @@  am_libgomp_la_OBJECTS = alloc.lo atomic.lo barrier.lo critical.lo \
 	oacc-parallel.lo oacc-host.lo oacc-init.lo oacc-mem.lo \
 	oacc-async.lo oacc-plugin.lo oacc-cuda.lo priority_queue.lo \
 	affinity-fmt.lo teams.lo allocator.lo oacc-profiling.lo \
-	oacc-target.lo target-indirect.lo $(am__objects_1)
+	oacc-target.lo target-indirect.lo selector.lo $(am__objects_1)
 libgomp_la_OBJECTS = $(am_libgomp_la_OBJECTS)
 AM_V_P = $(am__v_P_@AM_V@)
 am__v_P_ = $(am__v_P_@AM_DEFAULT_V@)
@@ -552,7 +552,7 @@  libgomp_la_SOURCES = alloc.c atomic.c barrier.c critical.c env.c \
 	oacc-parallel.c oacc-host.c oacc-init.c oacc-mem.c \
 	oacc-async.c oacc-plugin.c oacc-cuda.c priority_queue.c \
 	affinity-fmt.c teams.c allocator.c oacc-profiling.c \
-	oacc-target.c target-indirect.c $(am__append_3)
+	oacc-target.c target-indirect.c selector.c $(am__append_3)
 
 # Nvidia PTX OpenACC plugin.
 @PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_version_info = -version-info $(libtool_VERSION)
@@ -777,6 +777,7 @@  distclean-compile:
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ptrlock.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/scope.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/sections.Plo@am__quote@
+@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/selector.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/sem.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/single.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/splay-tree.Plo@am__quote@
diff --git a/libgomp/config/gcn/selector.c b/libgomp/config/gcn/selector.c
new file mode 100644
index 00000000000..7e099a00b97
--- /dev/null
+++ b/libgomp/config/gcn/selector.c
@@ -0,0 +1,102 @@ 
+/* Copyright (C) 2022 Free Software Foundation, Inc.
+   Contributed by Mentor, a Siemens Business.
+
+   This file is part of the GNU Offloading and Multi Processing Library
+   (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* This file contains an implementation of GOMP_evaluate_current_device for
+   an AMD GCN GPU.  */
+
+#include "libgomp.h"
+#include <string.h>
+
+/* The selectors are passed as strings, but are actually sets of multiple
+   trait property names, separated by '\0' and with an extra '\0' at
+   the end.  Match such a string SELECTORS against an array of strings
+   CHOICES, that is terminated by a null pointer.
+   matches.  */
+static bool
+gomp_match_selectors (const char *selectors, const char **choices)
+{
+  while (*selectors != '\0')
+    {
+      bool match = false;
+      for (int i = 0; !match && choices[i]; i++)
+	match = !strcmp (selectors, choices[i]);
+      if (!match)
+	return false;
+      selectors += strlen (selectors) + 1;
+    }
+  return true;
+}
+
+bool
+GOMP_evaluate_current_device (const char *kind, const char *arch,
+			      const char *isa)
+{
+  static const char *kind_choices[] = { "gpu", "nohost", NULL };
+  static const char *arch_choices[] = { "gcn", "amdgcn", NULL };
+  static const char *isa_choices[]
+    = {
+#ifdef __fiji__
+       "fiji", "gfx803",
+#endif
+#ifdef __gfx900__
+       "gfx900",
+#endif
+#ifdef __gfx906__
+       "gfx906",
+#endif
+#ifdef __gfx908__
+       "gfx908",
+#endif
+#ifdef __gfx90a__
+       "gfx90a",
+#endif
+#ifdef __gfx90c__
+       "gfx90c",
+#endif
+#ifdef __gfx1030__
+       "gfx1030",
+#endif
+#ifdef __gfx1036__
+       "gfx1036",
+#endif
+#ifdef __gfx1100__
+       "gfx1100",
+#endif
+#ifdef __gfx1103__
+       "gfx1103",
+#endif
+       NULL };
+
+  if (kind && !gomp_match_selectors (kind, kind_choices))
+    return false;
+
+  if (arch && !gomp_match_selectors (arch, arch_choices))
+    return false;
+
+  if (isa && !gomp_match_selectors (isa, isa_choices))
+    return false;
+
+  return true;
+}
diff --git a/libgomp/config/linux/selector.c b/libgomp/config/linux/selector.c
new file mode 100644
index 00000000000..064cb937ecc
--- /dev/null
+++ b/libgomp/config/linux/selector.c
@@ -0,0 +1,65 @@ 
+/* Copyright (C) 2022 Free Software Foundation, Inc.
+   Contributed by Mentor, a Siemens Business.
+
+   This file is part of the GNU Offloading and Multi Processing Library
+   (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* This file contains a generic implementation of
+   GOMP_evaluate_current_device when run on a Linux host.  */
+
+#include <string.h>
+#include "libgomp.h"
+
+/* The selectors are passed as strings, but are actually sets of multiple
+   trait property names, separated by '\0' and with an extra '\0' at
+   the end.  Match such a string SELECTORS against an array of strings
+   CHOICES, that is terminated by a null pointer.
+   matches.  */
+static bool
+gomp_match_selectors (const char *selectors, const char **choices)
+{
+  while (*selectors != '\0')
+    {
+      bool match = false;
+      for (int i = 0; !match && choices[i]; i++)
+	match = !strcmp (selectors, choices[i]);
+      if (!match)
+	return false;
+      selectors += strlen (selectors) + 1;
+    }
+  return true;
+}
+
+bool
+GOMP_evaluate_current_device (const char *kind, const char *arch,
+			      const char *isa)
+{
+  static const char *kind_choices[] = { "cpu", "host", NULL };
+
+  if (kind && !gomp_match_selectors (kind, kind_choices))
+    return false;
+
+  if (arch || isa)
+    return false;
+
+  return true;
+}
diff --git a/libgomp/config/linux/x86/selector.c b/libgomp/config/linux/x86/selector.c
new file mode 100644
index 00000000000..13cd2e14389
--- /dev/null
+++ b/libgomp/config/linux/x86/selector.c
@@ -0,0 +1,406 @@ 
+/* Copyright (C) 2022 Free Software Foundation, Inc.
+   Contributed by Mentor, a Siemens Business.
+
+   This file is part of the GNU Offloading and Multi Processing Library
+   (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* This file contains an implementation of GOMP_evaluate_current_device for
+   an x86/x64-based Linux host.  */
+
+#include <string.h>
+#include "libgomp.h"
+
+/* The selectors are passed as strings, but are actually sets of multiple
+   trait property names, separated by '\0' and with an extra '\0' at
+   the end.  Match such a string SELECTORS against an array of strings
+   CHOICES, that is terminated by a null pointer.
+   matches.  */
+static bool
+gomp_match_selectors (const char *selectors, const char **choices)
+{
+  while (*selectors != '\0')
+    {
+      bool match = false;
+      for (int i = 0; !match && choices[i]; i++)
+	match = !strcmp (selectors, choices[i]);
+      if (!match)
+	return false;
+      selectors += strlen (selectors) + 1;
+    }
+  return true;
+}
+
+bool
+GOMP_evaluate_current_device (const char *kind, const char *arch,
+			      const char *isa)
+{
+  static const char *kind_choices[] = { "cpu", "host", NULL };
+
+  static const char *arch_choices[]
+    = { "x86",
+	"ia32",
+#ifdef __x86_64__
+	"x86_64",
+#endif
+#ifdef __ILP32__
+	"x32",
+#endif
+	"i386",
+#ifdef __i486__
+	"i486",
+#endif
+#ifdef __i586__
+	"i586",
+#endif
+#ifdef __i686__
+	"i686",
+#endif
+	NULL };
+
+  static const char *isa_choices[]
+    = {
+#ifdef __WBNOINVD__
+       "wbnoinvd",
+#endif
+#ifdef __AVX512VP2INTERSECT__
+       "avx512vp2intersect",
+#endif
+#ifdef __MMX__
+       "mmx",
+#endif
+#ifdef __3dNOW__
+       "3dnow",
+#endif
+#ifdef __3dNOW_A__
+       "3dnowa",
+#endif
+#ifdef __SSE__
+       "sse",
+#endif
+#ifdef __SSE2__
+       "sse2",
+#endif
+#ifdef __SSE3__
+       "sse3",
+#endif
+#ifdef __SSSE3__
+       "ssse3",
+#endif
+#ifdef __SSE4_1__
+       "sse4.1",
+#endif
+#ifdef __SSE4_2__
+       "sse4",
+       "sse4.2",
+#endif
+#ifdef __AES__
+       "aes",
+#endif
+#ifdef __SHA__
+       "sha",
+#endif
+#ifdef __PCLMUL__
+       "pclmul",
+#endif
+#ifdef __AVX__
+       "avx",
+#endif
+#ifdef __AVX2__
+       "avx2",
+#endif
+#ifdef __AVX512F__
+  "avx512f",
+#endif
+#ifdef __AVX512ER__
+       "avx512er",
+#endif
+#ifdef __AVX512CD__
+       "avx512cd",
+#endif
+#ifdef __AVX512PF__
+       "avx512pf",
+#endif
+#ifdef __AVX512DQ__
+       "avx512dq",
+#endif
+#ifdef __AVX512BW__
+       "avx512bw",
+#endif
+#ifdef __AVX512VL__
+       "avx512vl",
+#endif
+#ifdef __AVX512VBMI__
+       "avx512vbmi",
+#endif
+#ifdef __AVX512IFMA__
+       "avx512ifma",
+#endif
+#ifdef __AVX5124VNNIW__
+       "avx5124vnniw",
+#endif
+#ifdef __AVX512VBMI2__
+       "avx512vbmi2",
+#endif
+#ifdef __AVX512VNNI__
+       "avx512vnni",
+#endif
+#ifdef __PCONFIG__
+       "pconfig",
+#endif
+#ifdef __SGX__
+       "sgx",
+#endif
+#ifdef __AVX5124FMAPS__
+       "avx5124fmaps",
+#endif
+#ifdef __AVX512BITALG__
+       "avx512bitalg",
+#endif
+#ifdef __AVX512VPOPCNTDQ__
+       "avx512vpopcntdq",
+#endif
+#ifdef __FMA__
+       "fma",
+#endif
+#ifdef __RTM__
+       "rtm",
+#endif
+#ifdef __SSE4A__
+       "sse4a",
+#endif
+#ifdef __FMA4__
+       "fma4",
+#endif
+#ifdef __XOP__
+       "xop",
+#endif
+#ifdef __LWP__
+       "lwp",
+#endif
+#ifdef __ABM__
+       "abm",
+#endif
+#ifdef __BMI__
+       "bmi",
+#endif
+#ifdef __BMI2__
+       "bmi2",
+#endif
+#ifdef __LZCNT__
+       "lzcnt",
+#endif
+#ifdef __TBM__
+       "tbm",
+#endif
+#ifdef __CRC32__
+       "crc32",
+#endif
+#ifdef __POPCNT__
+       "popcnt",
+#endif
+#ifdef __FSGSBASE__
+       "fsgsbase",
+#endif
+#ifdef __RDRND__
+       "rdrnd",
+#endif
+#ifdef __F16C__
+       "f16c",
+#endif
+#ifdef __RDSEED__
+       "rdseed",
+#endif
+#ifdef __PRFCHW__
+       "prfchw",
+#endif
+#ifdef __ADX__
+       "adx",
+#endif
+#ifdef __FXSR__
+       "fxsr",
+#endif
+#ifdef __XSAVE__
+       "xsave",
+#endif
+#ifdef __XSAVEOPT__
+       "xsaveopt",
+#endif
+#ifdef __PREFETCHWT1__
+       "prefetchwt1",
+#endif
+#ifdef __CLFLUSHOPT__
+       "clflushopt",
+#endif
+#ifdef __CLZERO__
+       "clzero",
+#endif
+#ifdef __XSAVEC__
+       "xsavec",
+#endif
+#ifdef __XSAVES__
+       "xsaves",
+#endif
+#ifdef __CLWB__
+       "clwb",
+#endif
+#ifdef __MWAITX__
+       "mwaitx",
+#endif
+#ifdef __PKU__
+       "pku",
+#endif
+#ifdef __RDPID__
+       "rdpid",
+#endif
+#ifdef __GFNI__
+       "gfni",
+#endif
+#ifdef __SHSTK__
+       "shstk",
+#endif
+#ifdef __VAES__
+       "vaes",
+#endif
+#ifdef __VPCLMULQDQ__
+       "vpclmulqdq",
+#endif
+#ifdef __MOVDIRI__
+       "movdiri",
+#endif
+#ifdef __MOVDIR64B__
+       "movdir64b",
+#endif
+#ifdef __WAITPKG__
+       "waitpkg",
+#endif
+#ifdef __CLDEMOTE__
+       "cldemote",
+#endif
+#ifdef __SERIALIZE__
+       "serialize",
+#endif
+#ifdef __PTWRITE__
+       "ptwrite",
+#endif
+#ifdef __AVX512BF16__
+       "avx512bf16",
+#endif
+#ifdef __AVX512FP16__
+       "avx512fp16",
+#endif
+#ifdef __ENQCMD__
+       "enqcmd",
+#endif
+#ifdef __TSXLDTRK__
+       "tsxldtrk",
+#endif
+#ifdef __AMX_TILE__
+       "amx-tile",
+#endif
+#ifdef __AMX_INT8__
+       "amx-int8",
+#endif
+#ifdef __AMX_BF16__
+       "amx-bf16",
+#endif
+#ifdef __LAHF_SAHF__
+       "sahf",
+#endif
+#ifdef __MOVBE__
+       "movbe",
+#endif
+#ifdef __UINTR__
+       "uintr",
+#endif
+#ifdef __HRESET__
+       "hreset",
+#endif
+#ifdef __KL__
+       "kl",
+#endif
+#ifdef __WIDEKL__
+       "widekl",
+#endif
+#ifdef __AVXVNNI__
+       "avxvnni",
+#endif
+#ifdef __AVXIFMA_
+       "avxifma",_
+#endif
+#ifdef __AVXVNNIINT8__
+       "avxvnniint8",
+#endif
+#ifdef __AVXNECONVERT__
+       "avxneconvert",
+#endif
+#ifdef __CMPCCXADD__
+       "cmpccxadd",
+#endif
+#ifdef __AMX_FP16__
+       "amx-fp16",
+#endif
+#ifdef __PREFETCHI__
+       "prefetchi",
+#endif
+#ifdef __RAOINT__
+       "raoint",
+#endif
+#ifdef __AMX_COMPLEX__
+       "amx-complex",
+#endif
+#ifdef __AVXVNNIINT16__
+       "amxvnniint16",
+#endif
+#ifdef __SM3__
+       "sm3",
+#endif
+#ifdef __SHA512__
+       "sha512",
+#endif
+#ifdef __SM4__
+       "sm4",
+#endif
+#ifdef __EVEX512__
+       "evex512",
+#endif
+#ifdef __USER_MSR__
+       "usermsr",
+#endif
+#ifdef __AVX10_1_256__
+       "avx10.1-256",
+#endif
+#ifdef __AVX10_1_512__
+       "avx10.1-512",
+#endif
+#ifdef __APX_F__
+       "apxf",
+#endif
+       NULL };
+
+  if (kind && !gomp_match_selectors (kind, kind_choices))
+    return false;
+  if (arch && !gomp_match_selectors (arch, arch_choices))
+    return false;
+  if (isa && !gomp_match_selectors (isa, isa_choices))
+    return false;
+  return true;
+}
diff --git a/libgomp/config/nvptx/selector.c b/libgomp/config/nvptx/selector.c
new file mode 100644
index 00000000000..c1e81efca28
--- /dev/null
+++ b/libgomp/config/nvptx/selector.c
@@ -0,0 +1,77 @@ 
+/* Copyright (C) 2022 Free Software Foundation, Inc.
+   Contributed by Mentor, a Siemens Business.
+
+   This file is part of the GNU Offloading and Multi Processing Library
+   (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* This file contains an implementation of GOMP_evaluate_current_device for
+   a Nvidia GPU.  */
+
+#include "libgomp.h"
+#include <string.h>
+
+static bool
+gomp_match_selectors (const char *selectors, const char **choices)
+{
+  while (*selectors != '\0')
+    {
+      bool match = false;
+      for (int i = 0; !match && choices[i]; i++)
+	match = !strcmp (selectors, choices[i]);
+      if (!match)
+	return false;
+      selectors += strlen (selectors) + 1;
+    }
+  return true;
+}
+
+bool
+GOMP_evaluate_current_device (const char *kind, const char *arch,
+			      const char *isa)
+{
+  static const char *kind_choices[] = { "gpu", "nohost", NULL };
+  static const char *arch_choices[] = { "nvptx", NULL };
+  static const char *isa_choices[]
+    = {
+       "sm_30",
+#if __PTX_SM__ >= 350
+       "sm_35",
+#endif
+#if __PTX_SM__ >= 530
+       "sm_53",
+#endif
+#if __PTX_SM__ >= 750
+       "sm_75",
+#endif
+#if __PTX_SM__ >= 800
+       "sm_80",
+#endif
+       NULL };
+
+  if (kind && !gomp_match_selectors (kind, kind_choices))
+    return false;
+  if (arch && !gomp_match_selectors (arch, arch_choices))
+    return false;
+  if (isa && !gomp_match_selectors (isa, isa_choices))
+    return false;
+  return true;
+}
diff --git a/libgomp/libgomp-plugin.h b/libgomp/libgomp-plugin.h
index 0c9c28c65cf..73f880ffa2f 100644
--- a/libgomp/libgomp-plugin.h
+++ b/libgomp/libgomp-plugin.h
@@ -152,6 +152,8 @@  extern int GOMP_OFFLOAD_memcpy3d (int, int, size_t, size_t, size_t, void *,
 extern bool GOMP_OFFLOAD_can_run (void *);
 extern void GOMP_OFFLOAD_run (int, void *, void *, void **);
 extern void GOMP_OFFLOAD_async_run (int, void *, void *, void **, void *);
+extern bool GOMP_OFFLOAD_evaluate_device (int, const char *, const char *,
+					  const char *);
 
 extern void GOMP_OFFLOAD_openacc_exec (void (*) (void *), size_t, void **,
 				       void **, unsigned *, void *);
diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index 089393846d1..4dad4bc321a 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -1417,6 +1417,7 @@  struct gomp_device_descr
   __typeof (GOMP_OFFLOAD_can_run) *can_run_func;
   __typeof (GOMP_OFFLOAD_run) *run_func;
   __typeof (GOMP_OFFLOAD_async_run) *async_run_func;
+  __typeof (GOMP_OFFLOAD_evaluate_device) *evaluate_device_func;
 
   /* Splay tree containing information about mapped memory regions.  */
   struct splay_tree_s mem_map;
diff --git a/libgomp/libgomp.map b/libgomp/libgomp.map
index 65901dff235..70a48874417 100644
--- a/libgomp/libgomp.map
+++ b/libgomp/libgomp.map
@@ -428,6 +428,11 @@  GOMP_5.1.2 {
 	GOMP_target_map_indirect_ptr;
 } GOMP_5.1.1;
 
+GOMP_5.1.3 {
+  global:
+	GOMP_evaluate_target_device;
+} GOMP_5.1.2;
+
 OACC_2.0 {
   global:
 	acc_get_num_devices;
diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 50da248b74d..4b9459048d1 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -6188,9 +6188,10 @@  smaller number.  On non-host devices, the value of the
 @c has to be implemented; cf. also PR target/105640.
 @c For offload devices, add *additionally* gcc/config/*/t-omp-device.
 
-For the host compiler, @code{kind} always matches @code{host}; for the
-offloading architectures AMD GCN and Nvidia PTX, @code{kind} always matches
-@code{gpu}.  For the x86 family of computers, AMD GCN and Nvidia PTX
+For the host compiler, @code{kind} always matches @code{host} and @code{cpu};
+for the offloading architectures AMD GCN and Nvidia PTX, @code{kind}
+always matches @code{gpu} and @code{nohost}.
+For the x86 family of computers, AMD GCN and Nvidia PTX
 the following traits are supported in addition; while OpenMP is supported
 on more architectures, GCC currently does not match any @code{arch} or
 @code{isa} traits for those.
@@ -6207,6 +6208,17 @@  on more architectures, GCC currently does not match any @code{arch} or
       @tab See @code{-march=} in ``Nvidia PTX Options''
 @end multitable
 
+For x86, note that the set of matching @code{arch} and @code{isa}
+selectors is determined by command-line options rather than the actual
+hardware.  This is particularly true of dynamic selectors, which match
+the options used to build libgomp rather than the options used to
+build user programs (which may also differ between compilation units).
+
+For the @code{target_device} selector on AMD GCN and Nvidia PTX,
+the actual hardware is checked at run time.  On AMD GCN, an exact match
+of the @code{isa} selector is required, while on Nvidia PTX lower-numbered
+revisions also match.
+
 @node Memory allocation
 @section Memory allocation
 
diff --git a/libgomp/libgomp_g.h b/libgomp/libgomp_g.h
index c0cc03ae61f..e9d60238e2b 100644
--- a/libgomp/libgomp_g.h
+++ b/libgomp/libgomp_g.h
@@ -337,6 +337,11 @@  extern void GOMP_single_copy_end (void *);
 
 extern void GOMP_scope_start (uintptr_t *);
 
+/* selector.c */
+
+extern bool GOMP_evaluate_current_device (const char *, const char *,
+					  const char *);
+
 /* target.c */
 
 extern void GOMP_target (int, void (*) (void *), const void *,
@@ -359,6 +364,9 @@  extern void GOMP_teams (unsigned int, unsigned int);
 extern bool GOMP_teams4 (unsigned int, unsigned int, unsigned int, bool);
 extern void *GOMP_target_map_indirect_ptr (void *);
 
+extern bool GOMP_evaluate_target_device (int, const char *, const char *,
+					 const char *);
+
 /* teams.c */
 
 extern void GOMP_teams_reg (void (*) (void *), void *, unsigned, unsigned,
diff --git a/libgomp/oacc-host.c b/libgomp/oacc-host.c
index 5efdf7fb796..b6883850250 100644
--- a/libgomp/oacc-host.c
+++ b/libgomp/oacc-host.c
@@ -136,6 +136,16 @@  host_run (int n __attribute__ ((unused)), void *fn_ptr, void *vars,
   fn (vars);
 }
 
+static bool
+host_evaluate_device (int device_num __attribute__ ((unused)),
+		      const char *kind __attribute__ ((unused)),
+		      const char *arch __attribute__ ((unused)),
+		      const char *isa __attribute__ ((unused)))
+{
+  __builtin_unreachable ();
+  return false;
+}
+
 static void
 host_openacc_exec (void (*fn) (void *),
 		   size_t mapnum __attribute__ ((unused)),
@@ -285,6 +295,7 @@  static struct gomp_device_descr host_dispatch =
     .memcpy2d_func = NULL,
     .memcpy3d_func = NULL,
     .run_func = host_run,
+    .evaluate_device_func = host_evaluate_device,
 
     .mem_map = { NULL },
     .mem_map_rev = { NULL },
diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c
index 3d882b5ab63..8b001edfccf 100644
--- a/libgomp/plugin/plugin-gcn.c
+++ b/libgomp/plugin/plugin-gcn.c
@@ -4412,6 +4412,58 @@  GOMP_OFFLOAD_async_run (int device, void *tgt_fn, void *tgt_vars,
 		       GOMP_PLUGIN_target_task_completion, async_data);
 }
 
+/* The selectors are passed as strings, but are actually sets of multiple
+   trait property names, separated by '\0' and with an extra '\0' at
+   the end.  Match such a string SELECTORS against an array of strings
+   CHOICES, that is terminated by a null pointer.
+   matches.  */
+static bool
+gomp_match_selectors (const char *selectors, const char **choices)
+{
+  while (*selectors != '\0')
+    {
+      bool match = false;
+      for (int i = 0; !match && choices[i]; i++)
+	match = !strcmp (selectors, choices[i]);
+      if (!match)
+	return false;
+      selectors += strlen (selectors) + 1;
+    }
+  return true;
+}
+
+/* Here we can only have one possible match and it must be
+   the only selector provided.  */
+static bool
+gomp_match_isa (const char *selectors, gcn_isa isa)
+{
+  if (isa_code (selectors) != isa)
+    return false;
+  if (*(selectors + strlen (selectors) + 1) != '\0')
+    return false;
+  return true;
+}
+
+bool
+GOMP_OFFLOAD_evaluate_device (int device_num, const char *kind,
+			      const char *arch, const char *isa)
+{
+  static const char *kind_choices[] = { "gpu", "nohost", NULL };
+  static const char *arch_choices[] = { "gcn", "amdgcn", NULL };
+  struct agent_info *agent = get_agent_info (device_num);
+
+  if (kind && !gomp_match_selectors (kind, kind_choices))
+    return false;
+
+  if (arch && !gomp_match_selectors (arch, arch_choices))
+    return false;
+
+  if (isa && !gomp_match_isa (isa, agent->device_isa))
+    return false;
+
+  return true;
+}
+
 /* }}} */
 /* {{{ OpenACC Plugin API  */
 
diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index 99cbcb699b3..a9c28fdb9ec 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -319,6 +319,7 @@  struct ptx_device
   int max_threads_per_block;
   int max_threads_per_multiprocessor;
   int default_dims[GOMP_DIM_MAX];
+  int compute_major, compute_minor;
 
   /* Length as used by the CUDA Runtime API ('struct cudaDeviceProp').  */
   char name[256];
@@ -551,6 +552,14 @@  nvptx_open_device (int n)
   for (int i = 0; i != GOMP_DIM_MAX; i++)
     ptx_dev->default_dims[i] = 0;
 
+  CUDA_CALL_ERET (NULL, cuDeviceGetAttribute, &pi,
+		  CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR, dev);
+  ptx_dev->compute_major = pi;
+
+  CUDA_CALL_ERET (NULL, cuDeviceGetAttribute, &pi,
+		  CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR, dev);
+  ptx_dev->compute_minor = pi;
+
   CUDA_CALL_ERET (NULL, cuDeviceGetName, ptx_dev->name, sizeof ptx_dev->name,
 		  dev);
 
@@ -2489,3 +2498,76 @@  GOMP_OFFLOAD_run (int ord, void *tgt_fn, void *tgt_vars, void **args)
 }
 
 /* TODO: Implement GOMP_OFFLOAD_async_run. */
+
+/* The selectors are passed as strings, but are actually sets of multiple
+   trait property names, separated by '\0' and with an extra '\0' at
+   the end.  Match such a string SELECTORS against an array of strings
+   CHOICES, that is terminated by a null pointer.
+   matches.  */
+static bool
+gomp_match_selectors (const char *selectors, const char **choices)
+{
+  while (*selectors != '\0')
+    {
+      bool match = false;
+      for (int i = 0; !match && choices[i]; i++)
+	match = !strcmp (selectors, choices[i]);
+      if (!match)
+	return false;
+      selectors += strlen (selectors) + 1;
+    }
+  return true;
+}
+
+/* Here we can only have one possible match and it must be
+   the only selector provided.  */
+static bool
+gomp_match_selector (const char *selectors, const char *choice)
+{
+  if (!strcmp (selectors, choice))
+    return false;
+  if (*(selectors + strlen (selectors) + 1) != '\0')
+    return false;
+  return true;
+}
+
+#define CHECK_ISA(major, minor)					\
+  if (device->compute_major >= major				\
+      && device->compute_minor >= minor				\
+      && gomp_match_selector (isa, "sm_"#major#minor))		\
+    return true
+
+bool
+GOMP_OFFLOAD_evaluate_device (int device_num, const char *kind,
+			      const char *arch, const char *isa)
+{
+  static const char *kind_choices[] = { "gpu", "nohost", NULL };
+  static const char *arch_choices[] = { "nvptx", NULL };
+  if (kind && !gomp_match_selectors (kind, kind_choices))
+    return false;
+
+  if (arch && !gomp_match_selectors (arch, arch_choices))
+    return false;
+
+  if (!isa)
+    return true;
+
+  struct ptx_device *device = ptx_devices[device_num];
+
+  CHECK_ISA (3, 0);
+  CHECK_ISA (3, 5);
+  CHECK_ISA (3, 7);
+  CHECK_ISA (5, 0);
+  CHECK_ISA (5, 2);
+  CHECK_ISA (5, 3);
+  CHECK_ISA (6, 0);
+  CHECK_ISA (6, 1);
+  CHECK_ISA (6, 2);
+  CHECK_ISA (7, 0);
+  CHECK_ISA (7, 2);
+  CHECK_ISA (7, 5);
+  CHECK_ISA (8, 0);
+  CHECK_ISA (8, 6);
+
+  return false;
+}
diff --git a/libgomp/selector.c b/libgomp/selector.c
new file mode 100644
index 00000000000..5b21e582844
--- /dev/null
+++ b/libgomp/selector.c
@@ -0,0 +1,64 @@ 
+/* Copyright (C) 2022 Free Software Foundation, Inc.
+   Contributed by Mentor, a Siemens Business.
+
+   This file is part of the GNU Offloading and Multi Processing Library
+   (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* This file contains a placeholder implementation of
+   GOMP_evaluate_current_device.  */
+
+#include "libgomp.h"
+
+/* The selectors are passed as strings, but are actually sets of multiple
+   trait property names, separated by '\0' and with an extra '\0' at
+   the end.  Match such a string SELECTORS against an array of strings
+   CHOICES, that is terminated by a null pointer.
+   matches.  */
+static bool
+gomp_match_selectors (const char *selectors, const char **choices)
+{
+  while (*selectors != '\0')
+    {
+      bool match = false;
+      for (int i = 0; !match && choices[i]; i++)
+	match = !strcmp (selectors, choices[i]);
+      if (!match)
+	return false;
+      selectors += strlen (selectors) + 1;
+    }
+  return true;
+}
+
+bool
+GOMP_evaluate_current_device (const char *kind, const char *arch,
+			      const char *isa)
+{
+  static const char *kind_choices[] = { "cpu", "host", NULL };
+
+  if (kind && !gomp_match_selectors (kind, kind_choices))
+    return false;
+
+  if (arch || isa)
+    return false;
+
+  return true;
+}
diff --git a/libgomp/target.c b/libgomp/target.c
index 48689920d4a..a3db167cd07 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -5109,6 +5109,27 @@  omp_pause_resource_all (omp_pause_resource_t kind)
 ialias (omp_pause_resource)
 ialias (omp_pause_resource_all)
 
+bool
+GOMP_evaluate_target_device (int device_num, const char *kind,
+			     const char *arch, const char *isa)
+{
+  bool result = true;
+  if (kind && strcmp (kind, "any") == 0)
+    kind = NULL;
+
+  gomp_debug (1, "%s: device_num = %u, kind=%s, arch=%s, isa=%s",
+	      __FUNCTION__, device_num, kind, arch, isa);
+
+  struct gomp_device_descr *devicep = resolve_device (device_num, true);
+  if (devicep == NULL)
+    result = GOMP_evaluate_current_device (kind, arch, isa);
+  else
+    result = devicep->evaluate_device_func (device_num, kind, arch, isa);
+
+  gomp_debug (1, " -> %s\n", result ? "true" : "false");
+  return result;
+}
+
 #ifdef PLUGIN_SUPPORT
 
 /* This function tries to load a plugin for DEVICE.  Name of plugin is passed
@@ -5161,6 +5182,7 @@  gomp_load_plugin_for_device (struct gomp_device_descr *device,
   DLSYM (free);
   DLSYM (dev2host);
   DLSYM (host2dev);
+  DLSYM (evaluate_device);
   DLSYM_OPT (memcpy2d, memcpy2d);
   DLSYM_OPT (memcpy3d, memcpy3d);
   device->capabilities = device->get_caps_func ();