From patchwork Mon Jan 24 21:35:46 2022
X-Patchwork-Submitter: Kwok Cheung Yeung
X-Patchwork-Id: 50418
Date: Mon, 24 Jan 2022 21:35:46 +0000
From: Kwok Cheung Yeung
Subject:
[PATCH] openmp: Add support for target_device selector set in metadirectives
To: Jakub Jelinek, gcc-patches

Hello

This patch builds on top of the previous patches for metadirective
support to add support for the target_device selector set introduced in
OpenMP 5.1. This selector set is similar to the existing device selector
set, but can take an additional device_num selector, specifying the
device number of the OpenMP device to be matched against. Since the
device at a particular number depends on the hardware configuration, the
check necessarily needs to be made at runtime.

This patch expands a target_device selector into a call to a new libgomp
function GOMP_evaluate_target_device, which returns true if there is a
match. If device_num is the same as the current device, then it returns
the result of calling GOMP_evaluate_current_device. This function is
currently implemented for nvptx, amdgcn and x86 Linux. These
implementations behave similarly to the various implementations of
TARGET_OMP_DEVICE_KIND_ARCH_ISA in gcc/config/*, but are part of the
libgomp runtime rather than the GCC internals. Stub implementations have
been added so that libgomp can still compile in other configurations.
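
As a rough illustration of the expansion described above, here is a small
self-contained sketch. It is not code from the patch: the names
sketch_evaluate_target_device and sketch_select_variant are hypothetical
stand-ins, and the stub assumes a machine where device 0 is an nvptx GPU
and no other device exists. It only shows the general shape, namely one
runtime call per 'when' clause, with NULL meaning "selector not given".

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the libgomp entry point; the real
   GOMP_evaluate_target_device is added to libgomp/target.c.
   Assumption for this sketch: only device 0 exists, and it is an
   nvptx GPU.  */
static bool
sketch_evaluate_target_device (int device_num, const char *kind,
                               const char *arch, const char *isa)
{
  if (device_num != 0)
    return false;
  /* A NULL selector was not specified and therefore matches.  */
  if (kind && strcmp (kind, "gpu") != 0)
    return false;
  if (arch && strcmp (arch, "nvptx") != 0)
    return false;
  /* This stub knows no ISA names, so only an unspecified isa matches.  */
  return isa == NULL;
}

/* Roughly the shape that
     when (target_device={device_num(num), kind("gpu"), arch("nvptx")}: A)
     when (target_device={device_num(num), kind("cpu"), arch("x86_64")}: B)
   takes after lowering: a chain of runtime checks.
   Returns the index of the chosen variant, or -1 for the default.  */
static int
sketch_select_variant (int num)
{
  if (sketch_evaluate_target_device (num, "gpu", "nvptx", NULL))
    return 0;
  if (sketch_evaluate_target_device (num, "cpu", "x86_64", NULL))
    return 1;
  return -1;
}
```

With the stub above, sketch_select_variant (0) picks the first variant,
while any other device number falls through to the default.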

If the current device is the host device (which should be the usual
scenario with target_device), and device_num is an accelerator, then
GOMP_evaluate_target_device returns the result of a new plugin function
GOMP_OFFLOAD_evaluate_device, which is implemented for nvptx and amdgcn.
I have added some extra code to the nvptx plugin to determine the
supported SM level at runtime (which is matched against the 'isa'
selector), while GCN uses the existing device_isa field in agent_info.

In the case where the current device is an accelerator and the
device_num is not the current device number, GOMP_evaluate_target_device
simply returns false, as multiple accelerators generally do not know
anything about each other.

Bootstrapped on x86_64 Linux and tested with no offloading, and with
offloading to nvptx and gcn. I have also checked that it builds when
targeting a powerpc64le Linux host.

Okay for inclusion in trunk (after the metadirective work is reviewed,
presumably after GCC 12 is released)?

Thanks

Kwok

From 2d2f00947783e1ecdf54943d9c499015ce61d267 Mon Sep 17 00:00:00 2001
From: Kwok Cheung Yeung
Date: Wed, 24 Nov 2021 02:51:28 -0800
Subject: [PATCH] openmp: Add support for 'target_device' context selector set

2022-01-18  Kwok Cheung Yeung

	gcc/
	* builtin-types.def (BT_FN_BOOL_INT_CONST_PTR_CONST_PTR_CONST_PTR): New
	type.
	* omp-builtins.def (BUILT_IN_GOMP_EVALUATE_TARGET_DEVICE): New builtin.
	* omp-general.cc (omp_context_selector_matches): Handle 'target_device'
	selector set.
	(omp_dynamic_cond): Generate expression tree for 'target_device'
	selector set.
	(omp_context_compute_score): Handle selectors in 'target_device' set.

	gcc/c/
	* c-parser.cc (omp_target_device_selectors): New.
	(c_parser_omp_context_selector): Accept 'target_device' selector set.
	Treat 'device_num' selector as expression.
	(c_parser_omp_context_selector_specification): Handle 'target_device'
	selector set.

	gcc/cp/
	* parser.cc (omp_target_device_selectors): New.
	(cp_parser_omp_context_selector): Accept 'target_device' selector set.
	Treat 'device_num' selector as expression.
	(cp_parser_omp_context_selector_specification): Handle 'target_device'
	selector set.

	gcc/fortran/
	* openmp.cc (omp_target_device_selectors): New.
	(gfc_match_omp_context_selector): Accept 'target_device' selector set.
	Treat 'device_num' selector as expression.
	(gfc_match_omp_context_selector_specification): Handle 'target_device'
	selector set.
	* types.def (BT_FN_BOOL_INT_CONST_PTR_CONST_PTR_CONST_PTR): New type.

	gcc/testsuite/
	* c-c++-common/gomp/metadirective-7.c: New.
	* gfortran.dg/gomp/metadirective-7.f90: New.

	libgomp/
	* Makefile.am (libgomp_la_SOURCES): Add selector.c.
	* Makefile.in: Regenerate.
	* config/gcn/selector.c: New.
	* config/linux/selector.c: New.
	* config/linux/x86/selector.c: New.
	* config/nvptx/selector.c: New.
	* libgomp-plugin.h (GOMP_OFFLOAD_evaluate_device): New.
	* libgomp.h (struct gomp_device_descr): Add evaluate_device_func field.
	* libgomp.map (GOMP_5.1): Add GOMP_evaluate_target_device.
	* libgomp_g.h (GOMP_evaluate_current_device): New.
	(GOMP_evaluate_target_device): New.
	* oacc-host.c (host_evaluate_device): New.
	(host_openacc_exec): Initialize evaluate_device_func field to
	host_evaluate_device.
	* plugin/plugin-gcn.c (GOMP_OFFLOAD_evaluate_device): New.
	* plugin/plugin-nvptx.c (struct ptx_device): Add compute_major and
	compute_minor fields.
	(nvptx_open_device): Read compute capability information from device.
	(CHECK_ISA): New macro.
	(GOMP_OFFLOAD_evaluate_device): New.
	* selector.c: New.
	* target.c (GOMP_evaluate_target_device): New.
	(gomp_load_plugin_for_device): Load evaluate_device plugin function.
	* testsuite/libgomp.c-c++-common/metadirective-5.c: New testcase.
	* testsuite/libgomp.fortran/metadirective-5.f90: New testcase.
---
 gcc/builtin-types.def                         |   2 +
 gcc/c/c-parser.cc                             |  28 +-
 gcc/cp/parser.cc                              |  28 +-
 gcc/fortran/openmp.cc                         |  34 +-
 gcc/fortran/types.def                         |   2 +
 gcc/omp-builtins.def                          |   3 +
 gcc/omp-general.cc                            |  71 +++-
 .../c-c++-common/gomp/metadirective-7.c       |  31 ++
 .../gfortran.dg/gomp/metadirective-7.f90      |  36 ++
 libgomp/Makefile.am                           |   8 +-
 libgomp/Makefile.in                           |  24 +-
 libgomp/config/gcn/selector.c                 |  57 +++
 libgomp/config/linux/selector.c               |  43 +++
 libgomp/config/linux/x86/selector.c           | 325 ++++++++++++++++++
 libgomp/config/nvptx/selector.c               |  65 ++++
 libgomp/libgomp-plugin.h                      |   2 +
 libgomp/libgomp.h                             |   1 +
 libgomp/libgomp.map                           |   1 +
 libgomp/libgomp_g.h                           |   8 +
 libgomp/oacc-host.c                           |  11 +
 libgomp/plugin/plugin-gcn.c                   |  14 +
 libgomp/plugin/plugin-nvptx.c                 |  45 +++
 libgomp/selector.c                            |  36 ++
 libgomp/target.c                              |  38 ++
 libgomp/testsuite/Makefile.in                 |   1 +
 .../libgomp.c-c++-common/metadirective-5.c    |  46 +++
 .../libgomp.fortran/metadirective-5.f90       |  44 +++
 27 files changed, 973 insertions(+), 31 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/gomp/metadirective-7.c
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/metadirective-7.f90
 create mode 100644 libgomp/config/gcn/selector.c
 create mode 100644 libgomp/config/linux/selector.c
 create mode 100644 libgomp/config/linux/x86/selector.c
 create mode 100644 libgomp/config/nvptx/selector.c
 create mode 100644 libgomp/selector.c
 create mode 100644 libgomp/testsuite/libgomp.c-c++-common/metadirective-5.c
 create mode 100644 libgomp/testsuite/libgomp.fortran/metadirective-5.f90

diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def
index 3a7cecdf087..263cc1d0536 100644
--- a/gcc/builtin-types.def
+++ b/gcc/builtin-types.def
@@ -681,6 +681,8 @@ DEF_FUNCTION_TYPE_4 (BT_FN_VOID_UINT_PTR_INT_PTR,
                      BT_VOID, BT_INT, BT_PTR, BT_INT, BT_PTR)
 DEF_FUNCTION_TYPE_4 (BT_FN_BOOL_UINT_UINT_UINT_BOOL,
                      BT_BOOL, BT_UINT, BT_UINT, BT_UINT, BT_BOOL)
+DEF_FUNCTION_TYPE_4 (BT_FN_BOOL_INT_CONST_PTR_CONST_PTR_CONST_PTR,
+                     BT_BOOL, BT_INT, BT_CONST_PTR, BT_CONST_PTR,
BT_CONST_PTR)

 DEF_FUNCTION_TYPE_5 (BT_FN_INT_STRING_INT_SIZE_CONST_STRING_VALIST_ARG,
                      BT_INT, BT_STRING, BT_INT, BT_SIZE, BT_CONST_STRING,
diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index f3afc38eb65..58fcbb398ee 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -21410,6 +21410,8 @@ static const char *const omp_device_selectors[] = {
 static const char *const omp_implementation_selectors[] = {
   "vendor", "extension", "atomic_default_mem_order", "unified_address",
   "unified_shared_memory", "dynamic_allocators", "reverse_offload", NULL };
+static const char *const omp_target_device_selectors[] = {
+  "device_num", "kind", "isa", "arch", NULL };
 static const char *const omp_user_selectors[] = {
   "condition", NULL };

@@ -21467,6 +21469,13 @@ c_parser_omp_context_selector (c_parser *parser, tree set, tree parms,
           property_limit = 3;
           property_kind = CTX_PROPERTY_NAME_LIST;
           break;
+        case 't': /* target_device */
+          selectors = omp_target_device_selectors;
+          allow_score = false;
+          allow_user = true;
+          property_limit = 4;
+          property_kind = CTX_PROPERTY_NAME_LIST;
+          break;
         case 'u': /* user */
           selectors = omp_user_selectors;
           property_limit = 1;
@@ -21505,6 +21514,12 @@ c_parser_omp_context_selector (c_parser *parser, tree set, tree parms,
                      "atomic_default_mem_order") == 0)
         property_kind = CTX_PROPERTY_ID;

+      if (property_kind == CTX_PROPERTY_NAME_LIST
+          && IDENTIFIER_POINTER (set)[0] == 't'
+          && strcmp (IDENTIFIER_POINTER (selector),
+                     "device_num") == 0)
+        property_kind = CTX_PROPERTY_EXPR;
+
       c_parser_consume_token (parser);

       if (c_parser_next_token_is (parser, CPP_OPEN_PAREN))
@@ -21729,6 +21744,10 @@ c_parser_omp_context_selector_specification (c_parser *parser, tree parms,
               if (strcmp (setp, "implementation") == 0)
                 setp = NULL;
               break;
+            case 't':
+              if (metadirective_p && strcmp (setp, "target_device") == 0)
+                setp = NULL;
+              break;
             case 'u':
               if (strcmp (setp, "user") == 0)
                 setp = NULL;
@@ -21738,8 +21757,13 @@ c_parser_omp_context_selector_specification (c_parser
*parser, tree parms,
         }
       if (setp)
         {
-          c_parser_error (parser, "expected %<construct%>, %<device%>, "
-                                  "%<implementation%> or %<user%>");
+          if (metadirective_p)
+            c_parser_error (parser, "expected %<construct%>, %<device%>, "
+                                    "%<implementation%>, %<target_device%> "
+                                    "or %<user%>");
+          else
+            c_parser_error (parser, "expected %<construct%>, %<device%>, "
+                                    "%<implementation%> or %<user%>");
           return error_mark_node;
         }

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 7cfaff9d65b..aa23688814a 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -45080,6 +45080,8 @@ static const char *const omp_device_selectors[] = {
 static const char *const omp_implementation_selectors[] = {
   "vendor", "extension", "atomic_default_mem_order", "unified_address",
   "unified_shared_memory", "dynamic_allocators", "reverse_offload", NULL };
+static const char *const omp_target_device_selectors[] = {
+  "device_num", "kind", "isa", "arch", NULL };
 static const char *const omp_user_selectors[] = {
   "condition", NULL };

@@ -45137,6 +45139,13 @@ cp_parser_omp_context_selector (cp_parser *parser, tree set, bool has_parms_p,
           property_limit = 3;
           property_kind = CTX_PROPERTY_NAME_LIST;
           break;
+        case 't': /* target_device */
+          selectors = omp_target_device_selectors;
+          allow_score = false;
+          allow_user = true;
+          property_limit = 4;
+          property_kind = CTX_PROPERTY_NAME_LIST;
+          break;
         case 'u': /* user */
           selectors = omp_user_selectors;
           property_limit = 1;
@@ -45174,6 +45183,12 @@ cp_parser_omp_context_selector (cp_parser *parser, tree set, bool has_parms_p,
                      "atomic_default_mem_order") == 0)
         property_kind = CTX_PROPERTY_ID;

+      if (property_kind == CTX_PROPERTY_NAME_LIST
+          && IDENTIFIER_POINTER (set)[0] == 't'
+          && strcmp (IDENTIFIER_POINTER (selector),
+                     "device_num") == 0)
+        property_kind = CTX_PROPERTY_EXPR;
+
       cp_lexer_consume_token (parser->lexer);

       if (cp_lexer_next_token_is (parser->lexer, CPP_OPEN_PAREN))
@@ -45411,6 +45426,10 @@ cp_parser_omp_context_selector_specification (cp_parser *parser,
               if (strcmp (setp, "implementation") == 0)
                 setp = NULL;
               break;
+            case 't':
+              if (metadirective_p && strcmp (setp, "target_device") == 0)
+                setp = NULL;
+              break;
             case 'u':
               if (strcmp (setp, "user") == 0)
                 setp = NULL;
@@ -45420,8 +45439,13 @@ cp_parser_omp_context_selector_specification (cp_parser *parser,
         }
       if (setp)
         {
-          cp_parser_error (parser, "expected %<construct%>, %<device%>, "
-                                   "%<implementation%> or %<user%>");
+          if (metadirective_p)
+            cp_parser_error (parser, "expected %<construct%>, %<device%>, "
+                                     "%<implementation%>, %<target_device%> "
+                                     "or %<user%>");
+          else
+            cp_parser_error (parser, "expected %<construct%>, %<device%>, "
+                                     "%<implementation%> or %<user%>");
           return error_mark_node;
         }

diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc
index 88000076761..1a97a62462f 100644
--- a/gcc/fortran/openmp.cc
+++ b/gcc/fortran/openmp.cc
@@ -4638,6 +4638,8 @@ static const char *const omp_device_selectors[] = {
 static const char *const omp_implementation_selectors[] = {
   "vendor", "extension", "atomic_default_mem_order", "unified_address",
   "unified_shared_memory", "dynamic_allocators", "reverse_offload", NULL };
+static const char *const omp_target_device_selectors[] = {
+  "device_num", "kind", "isa", "arch", NULL };
 static const char *const omp_user_selectors[] = {
   "condition", NULL };

@@ -4695,6 +4697,13 @@ gfc_match_omp_context_selector (gfc_omp_set_selector *oss)
           property_limit = 3;
           property_kind = CTX_PROPERTY_NAME_LIST;
           break;
+        case 't': /* target_device */
+          selectors = omp_target_device_selectors;
+          allow_score = false;
+          allow_user = true;
+          property_limit = 4;
+          property_kind = CTX_PROPERTY_NAME_LIST;
+          break;
         case 'u': /* user */
           selectors = omp_user_selectors;
           property_limit = 1;
@@ -4730,6 +4739,11 @@ gfc_match_omp_context_selector (gfc_omp_set_selector *oss)
           && strcmp (selector, "atomic_default_mem_order") == 0)
         property_kind = CTX_PROPERTY_ID;

+      if (property_kind == CTX_PROPERTY_NAME_LIST
+          && oss->trait_set_selector_name[0] == 't'
+          && strcmp (selector, "device_num") == 0)
+        property_kind = CTX_PROPERTY_EXPR;
+
       if (gfc_match (" (") == MATCH_YES)
         {
           if (property_kind == CTX_PROPERTY_NONE)
@@ -4918,13 +4932,14 @@ gfc_match_omp_context_selector (gfc_omp_set_selector *oss)
      user  */

 match
-gfc_match_omp_context_selector_specification
(gfc_omp_set_selector **oss_head)
+gfc_match_omp_context_selector_specification (gfc_omp_set_selector **oss_head,
+                                              bool metadirective_p = false)
 {
   do
     {
       match m;
-      const char *selector_sets[] = { "construct", "device",
-                                      "implementation", "user" };
+      const char *selector_sets[] = { "construct", "device", "implementation",
+                                      "target_device", "user" };
       const int selector_set_count
         = sizeof (selector_sets) / sizeof (*selector_sets);
       int i;
@@ -4936,10 +4951,15 @@ gfc_match_omp_context_selector_specification (gfc_omp_set_selector **oss_head)
         if (strcmp (buf, selector_sets[i]) == 0)
           break;

-      if (m != MATCH_YES || i == selector_set_count)
+      if (m != MATCH_YES || i == selector_set_count
+          || (!metadirective_p && strcmp (buf, "target_device") == 0))
         {
-          gfc_error ("expected 'construct', 'device', 'implementation' or "
-                     "'user' at %C");
+          if (metadirective_p)
+            gfc_error ("expected 'construct', 'device', 'implementation', "
+                       "'target_device' or 'user' at %C");
+          else
+            gfc_error ("expected 'construct', 'device', 'implementation' "
+                       "or 'user' at %C");
           return MATCH_ERROR;
         }

@@ -5113,7 +5133,7 @@ match_omp_metadirective (bool begin_p)

       if (!default_p)
         {
-          if (gfc_match_omp_context_selector_specification (&selectors)
+          if (gfc_match_omp_context_selector_specification (&selectors, true)
               != MATCH_YES)
             return MATCH_ERROR;

diff --git a/gcc/fortran/types.def b/gcc/fortran/types.def
index cd79ad45167..383bdc630f4 100644
--- a/gcc/fortran/types.def
+++ b/gcc/fortran/types.def
@@ -174,6 +174,8 @@ DEF_FUNCTION_TYPE_4 (BT_FN_VOID_UINT_PTR_INT_PTR,
                      BT_VOID, BT_INT, BT_PTR, BT_INT, BT_PTR)
 DEF_FUNCTION_TYPE_4 (BT_FN_BOOL_UINT_UINT_UINT_BOOL,
                      BT_BOOL, BT_UINT, BT_UINT, BT_UINT, BT_BOOL)
+DEF_FUNCTION_TYPE_4 (BT_FN_BOOL_INT_CONST_PTR_CONST_PTR_CONST_PTR,
+                     BT_BOOL, BT_INT, BT_CONST_PTR, BT_CONST_PTR, BT_CONST_PTR)

 DEF_FUNCTION_TYPE_5 (BT_FN_VOID_OMPFN_PTR_UINT_UINT_UINT,
                      BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR, BT_UINT, BT_UINT,
diff --git a/gcc/omp-builtins.def b/gcc/omp-builtins.def
index
cfa6483c7ae..620add2a67c 100644
--- a/gcc/omp-builtins.def
+++ b/gcc/omp-builtins.def
@@ -467,3 +467,6 @@ DEF_GOMP_BUILTIN (BUILT_IN_GOMP_WARNING, "GOMP_warning",
                   BT_FN_VOID_CONST_PTR_SIZE, ATTR_NOTHROW_LEAF_LIST)
 DEF_GOMP_BUILTIN (BUILT_IN_GOMP_ERROR, "GOMP_error",
                   BT_FN_VOID_CONST_PTR_SIZE, ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST)
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_EVALUATE_TARGET_DEVICE, "GOMP_evaluate_target_device",
+                  BT_FN_BOOL_INT_CONST_PTR_CONST_PTR_CONST_PTR,
+                  ATTR_NOTHROW_LEAF_LIST)
diff --git a/gcc/omp-general.cc b/gcc/omp-general.cc
index bab4a932f5d..4edeb58cc73 100644
--- a/gcc/omp-general.cc
+++ b/gcc/omp-general.cc
@@ -1322,6 +1322,12 @@ omp_context_selector_matches (tree ctx, bool metadirective_p)
           ret = -1;
           continue;
         }
+      else if (set == 't')
+        {
+          /* The target_device set is dynamic, so treat it as always
+             resolvable.  */
+          continue;
+        }
       for (tree t2 = TREE_VALUE (t1); t2; t2 = TREE_CHAIN (t2))
         {
           const char *sel = IDENTIFIER_POINTER (TREE_PURPOSE (t2));
@@ -1995,6 +2001,8 @@ omp_get_context_selector (tree ctx, const char *set, const char *sel)
 static tree
 omp_dynamic_cond (tree ctx)
 {
+  tree expr = NULL_TREE;
+
   tree user = omp_get_context_selector (ctx, "user", "condition");
   if (user)
     {
@@ -2004,10 +2012,60 @@ omp_dynamic_cond (tree ctx)

       /* The user condition is not dynamic if it is constant.
*/
       if (!tree_fits_shwi_p (TREE_VALUE (expr_list)))
-        return TREE_VALUE (expr_list);
+        expr = TREE_VALUE (expr_list);
     }

-  return NULL_TREE;
+  tree target_device = omp_get_context_selector (ctx, "target_device", NULL);
+  if (target_device)
+    {
+      tree device_num = null_pointer_node;
+      tree kind = null_pointer_node;
+      tree arch = null_pointer_node;
+      tree isa = null_pointer_node;
+
+      tree device_num_sel = omp_get_context_selector (ctx, "target_device",
+                                                      "device_num");
+      if (device_num_sel)
+        device_num = TREE_VALUE (TREE_VALUE (device_num_sel));
+      else
+        device_num = build_int_cst (integer_type_node, -1);
+
+      tree kind_sel = omp_get_context_selector (ctx, "target_device", "kind");
+      if (kind_sel)
+        {
+          const char *str
+            = TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (kind_sel)));
+          kind = build_string_literal (strlen (str) + 1, str);
+        }
+
+      tree arch_sel = omp_get_context_selector (ctx, "target_device", "arch");
+      if (arch_sel)
+        {
+          const char *str
+            = TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (arch_sel)));
+          arch = build_string_literal (strlen (str) + 1, str);
+        }
+
+      tree isa_sel = omp_get_context_selector (ctx, "target_device", "isa");
+      if (isa_sel)
+        {
+          const char *str
+            = TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (isa_sel)));
+          isa = build_string_literal (strlen (str) + 1, str);
+        }
+
+      /* Generate a call to GOMP_evaluate_target_device.
*/
+      tree builtin_fn
+        = builtin_decl_explicit (BUILT_IN_GOMP_EVALUATE_TARGET_DEVICE);
+      tree call = build_call_expr (builtin_fn, 4, device_num, kind, arch, isa);
+
+      if (expr == NULL_TREE)
+        expr = call;
+      else
+        expr = fold_build2 (TRUTH_ANDIF_EXPR, boolean_type_node, expr, call);
+    }
+
+  return expr;
 }

 /* Return true iff the context selector CTX contains a dynamic element
@@ -2028,9 +2086,12 @@ static bool
 omp_context_compute_score (tree ctx, widest_int *score, bool declare_simd)
 {
   tree construct = omp_get_context_selector (ctx, "construct", NULL);
-  bool has_kind = omp_get_context_selector (ctx, "device", "kind");
-  bool has_arch = omp_get_context_selector (ctx, "device", "arch");
-  bool has_isa = omp_get_context_selector (ctx, "device", "isa");
+  bool has_kind = omp_get_context_selector (ctx, "device", "kind")
+                  || omp_get_context_selector (ctx, "target_device", "kind");
+  bool has_arch = omp_get_context_selector (ctx, "device", "arch")
+                  || omp_get_context_selector (ctx, "target_device", "arch");
+  bool has_isa = omp_get_context_selector (ctx, "device", "isa")
+                 || omp_get_context_selector (ctx, "target_device", "isa");
   bool ret = false;
   *score = 1;
   for (tree t1 = ctx; t1; t1 = TREE_CHAIN (t1))
diff --git a/gcc/testsuite/c-c++-common/gomp/metadirective-7.c b/gcc/testsuite/c-c++-common/gomp/metadirective-7.c
new file mode 100644
index 00000000000..cf695aa24cb
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/gomp/metadirective-7.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fdump-tree-gimple" } */
+
+#define N 256
+
+void f (int a[], int num)
+{
+  int i;
+
+  #pragma omp metadirective \
+      when (target_device={device_num(num), kind("gpu"), arch("nvptx")}: \
+            target parallel for map(tofrom: a[0:N])) \
+      when (target_device={device_num(num), kind("gpu"), \
+                           arch("amdgcn"), isa("gfx906")}: \
+            target parallel for) \
+      when (target_device={device_num(num), kind("cpu"), arch("x86_64")}: \
+            parallel for)
+    for (i = 0; i < N; i++)
+      a[i] +=
i;
+
+  #pragma omp metadirective \
+      when (target_device={kind("gpu"), arch("nvptx")}: \
+            target parallel for map(tofrom: a[0:N]))
+    for (i = 0; i < N; i++)
+      a[i] += i;
+}
+
+/* { dg-final { scan-tree-dump "__builtin_GOMP_evaluate_target_device \\(num, &\"gpu\"\\\[0\\\], &\"amdgcn\"\\\[0\\\], &\"gfx906\"\\\[0\\\]\\)" "gimple" } } */
+/* { dg-final { scan-tree-dump "__builtin_GOMP_evaluate_target_device \\(num, &\"gpu\"\\\[0\\\], &\"nvptx\"\\\[0\\\], 0B\\)" "gimple" } } */
+/* { dg-final { scan-tree-dump "__builtin_GOMP_evaluate_target_device \\(num, &\"cpu\"\\\[0\\\], &\"x86_64\"\\\[0\\\], 0B\\)" "gimple" } } */
+/* { dg-final { scan-tree-dump "__builtin_GOMP_evaluate_target_device \\(-1, &\"gpu\"\\\[0\\\], &\"nvptx\"\\\[0\\\], 0B\\)" "gimple" } } */
diff --git a/gcc/testsuite/gfortran.dg/gomp/metadirective-7.f90 b/gcc/testsuite/gfortran.dg/gomp/metadirective-7.f90
new file mode 100644
index 00000000000..870ea192fbc
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/gomp/metadirective-7.f90
@@ -0,0 +1,36 @@
+! { dg-do compile }
+! { dg-additional-options "-fdump-tree-gimple" }
+
+program main
+  integer, parameter :: N = 256
+contains
+  subroutine f (a, num)
+    integer :: a(N)
+    integer :: num
+    integer :: i
+
+    !$omp metadirective &
+    !$omp& when (target_device={device_num(num), kind("gpu"), arch("nvptx")}: &
+    !$omp&       target parallel do map(tofrom: a(1:N))) &
+    !$omp& when (target_device={device_num(num), kind("gpu"), &
+    !$omp&                      arch("amdgcn"), isa("gfx906")}: &
+    !$omp&       target parallel do) &
+    !$omp& when (target_device={device_num(num), kind("cpu"), arch("x86_64")}: &
+    !$omp&       parallel do)
+      do i = 1, N
+        a(i) = a(i) + i
+      end do
+
+    !$omp metadirective &
+    !$omp& when (target_device={kind("gpu"), arch("nvptx")}: &
+    !$omp&       target parallel do map(tofrom: a(1:N)))
+      do i = 1, N
+        a(i) = a(i) + i
+      end do
+  end subroutine
+end program
+
+!
{ dg-final { scan-tree-dump "__builtin_GOMP_evaluate_target_device \\(.+, &\"gpu\"\\\[0\\\], &\"amdgcn\"\\\[0\\\], &\"gfx906\"\\\[0\\\]\\)" "gimple" } }
+! { dg-final { scan-tree-dump "__builtin_GOMP_evaluate_target_device \\(.+, &\"gpu\"\\\[0\\\], &\"nvptx\"\\\[0\\\], 0B\\)" "gimple" } }
+! { dg-final { scan-tree-dump "__builtin_GOMP_evaluate_target_device \\(.+, &\"cpu\"\\\[0\\\], &\"x86_64\"\\\[0\\\], 0B\\)" "gimple" } }
+! { dg-final { scan-tree-dump "__builtin_GOMP_evaluate_target_device \\(-1, &\"gpu\"\\\[0\\\], &\"nvptx\"\\\[0\\\], 0B\\)" "gimple" } }
diff --git a/libgomp/Makefile.am b/libgomp/Makefile.am
index f8b2a06d63e..071def92820 100644
--- a/libgomp/Makefile.am
+++ b/libgomp/Makefile.am
@@ -63,10 +63,10 @@ libgomp_la_SOURCES = alloc.c atomic.c barrier.c critical.c env.c error.c \
 	icv.c icv-device.c iter.c iter_ull.c loop.c loop_ull.c ordered.c \
 	parallel.c scope.c sections.c single.c task.c team.c work.c lock.c \
 	mutex.c proc.c sem.c bar.c ptrlock.c time.c fortran.c affinity.c \
-	target.c splay-tree.c libgomp-plugin.c oacc-parallel.c oacc-host.c \
-	oacc-init.c oacc-mem.c oacc-async.c oacc-plugin.c oacc-cuda.c \
-	priority_queue.c affinity-fmt.c teams.c allocator.c oacc-profiling.c \
-	oacc-target.c
+	selector.c target.c splay-tree.c libgomp-plugin.c oacc-parallel.c \
+	oacc-host.c oacc-init.c oacc-mem.c oacc-async.c oacc-plugin.c \
+	oacc-cuda.c priority_queue.c affinity-fmt.c teams.c allocator.c \
+	oacc-profiling.c oacc-target.c

 include $(top_srcdir)/plugin/Makefrag.am

diff --git a/libgomp/Makefile.in b/libgomp/Makefile.in
index 22cb2136a08..e438bf09899 100644
--- a/libgomp/Makefile.in
+++ b/libgomp/Makefile.in
@@ -16,7 +16,7 @@

 # Plugins for offload execution, Makefile.am fragment.
 #
-# Copyright (C) 2014-2021 Free Software Foundation, Inc.
+# Copyright (C) 2014-2022 Free Software Foundation, Inc.
 #
 # Contributed by Mentor Embedded.
 #
@@ -216,11 +216,11 @@ am_libgomp_la_OBJECTS = alloc.lo atomic.lo barrier.lo critical.lo \
 	loop.lo loop_ull.lo ordered.lo parallel.lo scope.lo \
 	sections.lo single.lo task.lo team.lo work.lo lock.lo mutex.lo \
 	proc.lo sem.lo bar.lo ptrlock.lo time.lo fortran.lo \
-	affinity.lo target.lo splay-tree.lo libgomp-plugin.lo \
-	oacc-parallel.lo oacc-host.lo oacc-init.lo oacc-mem.lo \
-	oacc-async.lo oacc-plugin.lo oacc-cuda.lo priority_queue.lo \
-	affinity-fmt.lo teams.lo allocator.lo oacc-profiling.lo \
-	oacc-target.lo $(am__objects_1)
+	affinity.lo selector.lo target.lo splay-tree.lo \
+	libgomp-plugin.lo oacc-parallel.lo oacc-host.lo oacc-init.lo \
+	oacc-mem.lo oacc-async.lo oacc-plugin.lo oacc-cuda.lo \
+	priority_queue.lo affinity-fmt.lo teams.lo allocator.lo \
+	oacc-profiling.lo oacc-target.lo $(am__objects_1)
 libgomp_la_OBJECTS = $(am_libgomp_la_OBJECTS)
 AM_V_P = $(am__v_P_@AM_V@)
 am__v_P_ = $(am__v_P_@AM_DEFAULT_V@)
@@ -506,6 +506,7 @@ pdfdir = @pdfdir@
 prefix = @prefix@
 program_transform_name = @program_transform_name@
 psdir = @psdir@
+runstatedir = @runstatedir@
 sbindir = @sbindir@
 sharedstatedir = @sharedstatedir@
 srcdir = @srcdir@
@@ -555,11 +556,11 @@ libgomp_la_SOURCES = alloc.c atomic.c barrier.c critical.c env.c \
 	error.c icv.c icv-device.c iter.c iter_ull.c loop.c loop_ull.c \
 	ordered.c parallel.c scope.c sections.c single.c task.c team.c \
 	work.c lock.c mutex.c proc.c sem.c bar.c ptrlock.c time.c \
-	fortran.c affinity.c target.c splay-tree.c libgomp-plugin.c \
-	oacc-parallel.c oacc-host.c oacc-init.c oacc-mem.c \
-	oacc-async.c oacc-plugin.c oacc-cuda.c priority_queue.c \
-	affinity-fmt.c teams.c allocator.c oacc-profiling.c \
-	oacc-target.c $(am__append_3)
+	fortran.c affinity.c selector.c target.c splay-tree.c \
+	libgomp-plugin.c oacc-parallel.c oacc-host.c oacc-init.c \
+	oacc-mem.c oacc-async.c oacc-plugin.c oacc-cuda.c \
+	priority_queue.c affinity-fmt.c teams.c allocator.c \
+	oacc-profiling.c oacc-target.c $(am__append_3)

 # Nvidia PTX
OpenACC plugin.
 @PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_version_info = -version-info $(libtool_VERSION)
@@ -771,6 +772,7 @@ distclean-compile:
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ptrlock.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/scope.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/sections.Plo@am__quote@
+@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/selector.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/sem.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/single.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/splay-tree.Plo@am__quote@
diff --git a/libgomp/config/gcn/selector.c b/libgomp/config/gcn/selector.c
new file mode 100644
index 00000000000..60793fc05d3
--- /dev/null
+++ b/libgomp/config/gcn/selector.c
@@ -0,0 +1,57 @@
+/* Copyright (C) 2022 Free Software Foundation, Inc.
+   Contributed by Mentor, a Siemens Business.
+
+   This file is part of the GNU Offloading and Multi Processing Library
+   (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.
*/
+
+/* This file contains an implementation of GOMP_evaluate_current_device for
+   an AMD GCN GPU.  */
+
+#include "libgomp.h"
+#include <string.h>
+
+bool
+GOMP_evaluate_current_device (const char *kind, const char *arch,
+                              const char *isa)
+{
+  if (kind && strcmp (kind, "gpu") != 0)
+    return false;
+
+  if (arch && strcmp (arch, "gcn") != 0)
+    return false;
+
+  if (!isa)
+    return true;
+
+#ifdef __GCN3__
+  if (strcmp (isa, "fiji") == 0 || strcmp (isa, "gfx803") == 0)
+    return true;
+#endif
+
+#ifdef __GCN5__
+  if (strcmp (isa, "gfx900") == 0 || strcmp (isa, "gfx906") == 0
+      || strcmp (isa, "gfx908") == 0)
+    return true;
+#endif
+
+  return false;
+}
diff --git a/libgomp/config/linux/selector.c b/libgomp/config/linux/selector.c
new file mode 100644
index 00000000000..84e59c7aabe
--- /dev/null
+++ b/libgomp/config/linux/selector.c
@@ -0,0 +1,43 @@
+/* Copyright (C) 2022 Free Software Foundation, Inc.
+   Contributed by Mentor, a Siemens Business.
+
+   This file is part of the GNU Offloading and Multi Processing Library
+   (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.
*/
+
+/* This file contains a generic implementation of
+   GOMP_evaluate_current_device when run on a Linux host.  */
+
+#include <string.h>
+#include "libgomp.h"
+
+bool
+GOMP_evaluate_current_device (const char *kind, const char *arch,
+                              const char *isa)
+{
+  if (kind && strcmp (kind, "cpu") != 0)
+    return false;
+
+  if (!arch && !isa)
+    return true;
+
+  return false;
+}
diff --git a/libgomp/config/linux/x86/selector.c b/libgomp/config/linux/x86/selector.c
new file mode 100644
index 00000000000..2b6c2ba165b
--- /dev/null
+++ b/libgomp/config/linux/x86/selector.c
@@ -0,0 +1,325 @@
+/* Copyright (C) 2022 Free Software Foundation, Inc.
+   Contributed by Mentor, a Siemens Business.
+
+   This file is part of the GNU Offloading and Multi Processing Library
+   (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* This file contains an implementation of GOMP_evaluate_current_device for
+   an x86/x64-based Linux host.
*/ + +#include <string.h> +#include "libgomp.h" + +bool +GOMP_evaluate_current_device (const char *kind, const char *arch, + const char *isa) +{ + if (kind && strcmp (kind, "cpu") != 0) + return false; + + if (arch + && strcmp (arch, "x86") != 0 + && strcmp (arch, "ia32") != 0 +#ifdef __x86_64__ + && strcmp (arch, "x86_64") != 0 +#endif +#ifdef __ILP32__ + && strcmp (arch, "x32") != 0 +#endif + && strcmp (arch, "i386") != 0 + && strcmp (arch, "i486") != 0 +#ifndef __i486__ + && strcmp (arch, "i586") != 0 +#endif +#if !defined (__i486__) && !defined (__i586__) + && strcmp (arch, "i686") != 0 +#endif + ) + return false; + + if (!isa) + return true; + +#ifdef __WBNOINVD__ + if (strcmp (isa, "wbnoinvd") == 0) return true; +#endif +#ifdef __AVX512VP2INTERSECT__ + if (strcmp (isa, "avx512vp2intersect") == 0) return true; +#endif +#ifdef __MMX__ + if (strcmp (isa, "mmx") == 0) return true; +#endif +#ifdef __3dNOW__ + if (strcmp (isa, "3dnow") == 0) return true; +#endif +#ifdef __3dNOW_A__ + if (strcmp (isa, "3dnowa") == 0) return true; +#endif +#ifdef __SSE__ + if (strcmp (isa, "sse") == 0) return true; +#endif +#ifdef __SSE2__ + if (strcmp (isa, "sse2") == 0) return true; +#endif +#ifdef __SSE3__ + if (strcmp (isa, "sse3") == 0) return true; +#endif +#ifdef __SSSE3__ + if (strcmp (isa, "ssse3") == 0) return true; +#endif +#ifdef __SSE4_1__ + if (strcmp (isa, "sse4.1") == 0) return true; +#endif +#ifdef __SSE4_2__ + if (strcmp (isa, "sse4") == 0 || strcmp (isa, "sse4.2") == 0) return true; +#endif +#ifdef __AES__ + if (strcmp (isa, "aes") == 0) return true; +#endif +#ifdef __SHA__ + if (strcmp (isa, "sha") == 0) return true; +#endif +#ifdef __PCLMUL__ + if (strcmp (isa, "pclmul") == 0) return true; +#endif +#ifdef __AVX__ + if (strcmp (isa, "avx") == 0) return true; +#endif +#ifdef __AVX2__ + if (strcmp (isa, "avx2") == 0) return true; +#endif +#ifdef __AVX512F__ + if (strcmp (isa, "avx512f") == 0) return true; +#endif +#ifdef __AVX512ER__ + if (strcmp (isa, "avx512er") == 0) return 
true; +#endif +#ifdef __AVX512CD__ + if (strcmp (isa, "avx512cd") == 0) return true; +#endif +#ifdef __AVX512PF__ + if (strcmp (isa, "avx512pf") == 0) return true; +#endif +#ifdef __AVX512DQ__ + if (strcmp (isa, "avx512dq") == 0) return true; +#endif +#ifdef __AVX512BW__ + if (strcmp (isa, "avx512bw") == 0) return true; +#endif +#ifdef __AVX512VL__ + if (strcmp (isa, "avx512vl") == 0) return true; +#endif +#ifdef __AVX512VBMI__ + if (strcmp (isa, "avx512vbmi") == 0) return true; +#endif +#ifdef __AVX512IFMA__ + if (strcmp (isa, "avx512ifma") == 0) return true; +#endif +#ifdef __AVX5124VNNIW__ + if (strcmp (isa, "avx5124vnniw") == 0) return true; +#endif +#ifdef __AVX512VBMI2__ + if (strcmp (isa, "avx512vbmi2") == 0) return true; +#endif +#ifdef __AVX512VNNI__ + if (strcmp (isa, "avx512vnni") == 0) return true; +#endif +#ifdef __PCONFIG__ + if (strcmp (isa, "pconfig") == 0) return true; +#endif +#ifdef __SGX__ + if (strcmp (isa, "sgx") == 0) return true; +#endif +#ifdef __AVX5124FMAPS__ + if (strcmp (isa, "avx5124fmaps") == 0) return true; +#endif +#ifdef __AVX512BITALG__ + if (strcmp (isa, "avx512bitalg") == 0) return true; +#endif +#ifdef __AVX512VPOPCNTDQ__ + if (strcmp (isa, "avx512vpopcntdq") == 0) return true; +#endif +#ifdef __FMA__ + if (strcmp (isa, "fma") == 0) return true; +#endif +#ifdef __RTM__ + if (strcmp (isa, "rtm") == 0) return true; +#endif +#ifdef __SSE4A__ + if (strcmp (isa, "sse4a") == 0) return true; +#endif +#ifdef __FMA4__ + if (strcmp (isa, "fma4") == 0) return true; +#endif +#ifdef __XOP__ + if (strcmp (isa, "xop") == 0) return true; +#endif +#ifdef __LWP__ + if (strcmp (isa, "lwp") == 0) return true; +#endif +#ifdef __ABM__ + if (strcmp (isa, "abm") == 0) return true; +#endif +#ifdef __BMI__ + if (strcmp (isa, "bmi") == 0) return true; +#endif +#ifdef __BMI2__ + if (strcmp (isa, "bmi2") == 0) return true; +#endif +#ifdef __LZCNT__ + if (strcmp (isa, "lzcnt") == 0) return true; +#endif +#ifdef __TBM__ + if (strcmp (isa, "tbm") == 0) return 
true; +#endif +#ifdef __CRC32__ + if (strcmp (isa, "crc32") == 0) return true; +#endif +#ifdef __POPCNT__ + if (strcmp (isa, "popcnt") == 0) return true; +#endif +#ifdef __FSGSBASE__ + if (strcmp (isa, "fsgsbase") == 0) return true; +#endif +#ifdef __RDRND__ + if (strcmp (isa, "rdrnd") == 0) return true; +#endif +#ifdef __F16C__ + if (strcmp (isa, "f16c") == 0) return true; +#endif +#ifdef __RDSEED__ + if (strcmp (isa, "rdseed") == 0) return true; +#endif +#ifdef __PRFCHW__ + if (strcmp (isa, "prfchw") == 0) return true; +#endif +#ifdef __ADX__ + if (strcmp (isa, "adx") == 0) return true; +#endif +#ifdef __FXSR__ + if (strcmp (isa, "fxsr") == 0) return true; +#endif +#ifdef __XSAVE__ + if (strcmp (isa, "xsave") == 0) return true; +#endif +#ifdef __XSAVEOPT__ + if (strcmp (isa, "xsaveopt") == 0) return true; +#endif +#ifdef __PREFETCHWT1__ + if (strcmp (isa, "prefetchwt1") == 0) return true; +#endif +#ifdef __CLFLUSHOPT__ + if (strcmp (isa, "clflushopt") == 0) return true; +#endif +#ifdef __CLZERO__ + if (strcmp (isa, "clzero") == 0) return true; +#endif +#ifdef __XSAVEC__ + if (strcmp (isa, "xsavec") == 0) return true; +#endif +#ifdef __XSAVES__ + if (strcmp (isa, "xsaves") == 0) return true; +#endif +#ifdef __CLWB__ + if (strcmp (isa, "clwb") == 0) return true; +#endif +#ifdef __MWAITX__ + if (strcmp (isa, "mwaitx") == 0) return true; +#endif +#ifdef __PKU__ + if (strcmp (isa, "pku") == 0) return true; +#endif +#ifdef __RDPID__ + if (strcmp (isa, "rdpid") == 0) return true; +#endif +#ifdef __GFNI__ + if (strcmp (isa, "gfni") == 0) return true; +#endif +#ifdef __SHSTK__ + if (strcmp (isa, "shstk") == 0) return true; +#endif +#ifdef __VAES__ + if (strcmp (isa, "vaes") == 0) return true; +#endif +#ifdef __VPCLMULQDQ__ + if (strcmp (isa, "vpclmulqdq") == 0) return true; +#endif +#ifdef __MOVDIRI__ + if (strcmp (isa, "movdiri") == 0) return true; +#endif +#ifdef __MOVDIR64B__ + if (strcmp (isa, "movdir64b") == 0) return true; +#endif +#ifdef __WAITPKG__ + if (strcmp 
(isa, "waitpkg") == 0) return true; +#endif +#ifdef __CLDEMOTE__ + if (strcmp (isa, "cldemote") == 0) return true; +#endif +#ifdef __SERIALIZE__ + if (strcmp (isa, "serialize") == 0) return true; +#endif +#ifdef __PTWRITE__ + if (strcmp (isa, "ptwrite") == 0) return true; +#endif +#ifdef __AVX512BF16__ + if (strcmp (isa, "avx512bf16") == 0) return true; +#endif +#ifdef __AVX512FP16__ + if (strcmp (isa, "avx512fp16") == 0) return true; +#endif +#ifdef __ENQCMD__ + if (strcmp (isa, "enqcmd") == 0) return true; +#endif +#ifdef __TSXLDTRK__ + if (strcmp (isa, "tsxldtrk") == 0) return true; +#endif +#ifdef __AMX_TILE__ + if (strcmp (isa, "amx-tile") == 0) return true; +#endif +#ifdef __AMX_INT8__ + if (strcmp (isa, "amx-int8") == 0) return true; +#endif +#ifdef __AMX_BF16__ + if (strcmp (isa, "amx-bf16") == 0) return true; +#endif +#ifdef __LAHF_SAHF__ + if (strcmp (isa, "sahf") == 0) return true; +#endif +#ifdef __MOVBE__ + if (strcmp (isa, "movbe") == 0) return true; +#endif +#ifdef __UINTR__ + if (strcmp (isa, "uintr") == 0) return true; +#endif +#ifdef __HRESET__ + if (strcmp (isa, "hreset") == 0) return true; +#endif +#ifdef __KL__ + if (strcmp (isa, "kl") == 0) return true; +#endif +#ifdef __WIDEKL__ + if (strcmp (isa, "widekl") == 0) return true; +#endif + + return false; +} diff --git a/libgomp/config/nvptx/selector.c b/libgomp/config/nvptx/selector.c new file mode 100644 index 00000000000..50b5f9020ac --- /dev/null +++ b/libgomp/config/nvptx/selector.c @@ -0,0 +1,65 @@ +/* Copyright (C) 2022 Free Software Foundation, Inc. + Contributed by Mentor, a Siemens Business. + + This file is part of the GNU Offloading and Multi Processing Library + (libgomp). + + Libgomp is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. 
+ + Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY + WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS + FOR A PARTICULAR PURPOSE. See the GNU General Public License for + more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . */ + +/* This file contains an implementation of GOMP_evaluate_current_device for + a Nvidia GPU. */ + +#include "libgomp.h" +#include <string.h> + +bool +GOMP_evaluate_current_device (const char *kind, const char *arch, + const char *isa) +{ + if (kind && strcmp (kind, "gpu") != 0) + return false; + + if (arch && strcmp (arch, "nvptx") != 0) + return false; + + if (!isa) + return true; + + if (strcmp (isa, "sm_30") == 0) + return true; +#if __PTX_SM__ >= 350 + if (strcmp (isa, "sm_35") == 0) + return true; +#endif +#if __PTX_SM__ >= 530 + if (strcmp (isa, "sm_53") == 0) + return true; +#endif +#if __PTX_SM__ >= 750 + if (strcmp (isa, "sm_75") == 0) + return true; +#endif +#if __PTX_SM__ >= 800 + if (strcmp (isa, "sm_80") == 0) + return true; +#endif + + return false; +} diff --git a/libgomp/libgomp-plugin.h b/libgomp/libgomp-plugin.h index 07ab700b80c..2826548280a 100644 --- a/libgomp/libgomp-plugin.h +++ b/libgomp/libgomp-plugin.h @@ -140,6 +140,8 @@ extern bool GOMP_OFFLOAD_dev2dev (int, void *, const void *, size_t); extern bool GOMP_OFFLOAD_can_run (void *); extern void GOMP_OFFLOAD_run (int, void *, void *, void **); extern void GOMP_OFFLOAD_async_run (int, void *, void *, void **, void *); +extern bool GOMP_OFFLOAD_evaluate_device (int, const char *, const char *, + const char *); extern void GOMP_OFFLOAD_openacc_exec (void (*) (void 
*), size_t, void **, void **, unsigned *, void *); diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h index b9e03919993..5a31257c598 100644 --- a/libgomp/libgomp.h +++ b/libgomp/libgomp.h @@ -1244,6 +1244,7 @@ struct gomp_device_descr __typeof (GOMP_OFFLOAD_can_run) *can_run_func; __typeof (GOMP_OFFLOAD_run) *run_func; __typeof (GOMP_OFFLOAD_async_run) *async_run_func; + __typeof (GOMP_OFFLOAD_evaluate_device) *evaluate_device_func; /* Splay tree containing information about mapped memory regions. */ struct splay_tree_s mem_map; diff --git a/libgomp/libgomp.map b/libgomp/libgomp.map index 2ac58094169..9d94aba3f70 100644 --- a/libgomp/libgomp.map +++ b/libgomp/libgomp.map @@ -400,6 +400,7 @@ GOMP_5.1 { GOMP_scope_start; GOMP_warning; GOMP_teams4; + GOMP_evaluate_target_device; } GOMP_5.0.1; OACC_2.0 { diff --git a/libgomp/libgomp_g.h b/libgomp/libgomp_g.h index 09674a1225c..1567473d0c9 100644 --- a/libgomp/libgomp_g.h +++ b/libgomp/libgomp_g.h @@ -336,6 +336,11 @@ extern void GOMP_single_copy_end (void *); extern void GOMP_scope_start (uintptr_t *); +/* selector.c */ + +extern bool GOMP_evaluate_current_device (const char *, const char *, + const char *); + /* target.c */ extern void GOMP_target (int, void (*) (void *), const void *, @@ -357,6 +362,9 @@ extern void GOMP_target_enter_exit_data (int, size_t, void **, size_t *, extern void GOMP_teams (unsigned int, unsigned int); extern bool GOMP_teams4 (unsigned int, unsigned int, unsigned int, bool); +extern bool GOMP_evaluate_target_device (int, const char *, const char *, + const char *); + /* teams.c */ extern void GOMP_teams_reg (void (*) (void *), void *, unsigned, unsigned, diff --git a/libgomp/oacc-host.c b/libgomp/oacc-host.c index 5bb889926d3..cf5bc8d876c 100644 --- a/libgomp/oacc-host.c +++ b/libgomp/oacc-host.c @@ -134,6 +134,16 @@ host_run (int n __attribute__ ((unused)), void *fn_ptr, void *vars, fn (vars); } +static bool +host_evaluate_device (int device_num __attribute__ ((unused)), + const char *kind 
__attribute__ ((unused)), + const char *arch __attribute__ ((unused)), + const char *isa __attribute__ ((unused))) +{ + __builtin_unreachable (); + return false; +} + static void host_openacc_exec (void (*fn) (void *), size_t mapnum __attribute__ ((unused)), @@ -281,6 +291,7 @@ static struct gomp_device_descr host_dispatch = .dev2host_func = host_dev2host, .host2dev_func = host_host2dev, .run_func = host_run, + .evaluate_device_func = host_evaluate_device, .mem_map = { NULL }, /* .lock initialized in goacc_host_init. */ diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c index f305d726874..8242b38166c 100644 --- a/libgomp/plugin/plugin-gcn.c +++ b/libgomp/plugin/plugin-gcn.c @@ -3799,6 +3799,20 @@ GOMP_OFFLOAD_async_run (int device, void *tgt_fn, void *tgt_vars, GOMP_PLUGIN_target_task_completion, async_data); } +bool +GOMP_OFFLOAD_evaluate_device (int device_num, const char *kind, + const char *arch, const char *isa) +{ + struct agent_info *agent = get_agent_info (device_num); + + if (kind && strcmp (kind, "gpu") != 0) + return false; + if (arch && strcmp (arch, "gcn") != 0) + return false; + + return !isa || isa_code (isa) == agent->device_isa; +} + /* }}} */ /* {{{ OpenACC Plugin API */ diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c index b4f0a84d77a..7427677e69d 100644 --- a/libgomp/plugin/plugin-nvptx.c +++ b/libgomp/plugin/plugin-nvptx.c @@ -307,6 +307,7 @@ struct ptx_device int max_threads_per_block; int max_threads_per_multiprocessor; int default_dims[GOMP_DIM_MAX]; + int compute_major, compute_minor; /* Length as used by the CUDA Runtime API ('struct cudaDeviceProp'). 
*/ char name[256]; @@ -523,6 +524,14 @@ nvptx_open_device (int n) for (int i = 0; i != GOMP_DIM_MAX; i++) ptx_dev->default_dims[i] = 0; + CUDA_CALL_ERET (NULL, cuDeviceGetAttribute, &pi, + CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR, dev); + ptx_dev->compute_major = pi; + + CUDA_CALL_ERET (NULL, cuDeviceGetAttribute, &pi, + CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR, dev); + ptx_dev->compute_minor = pi; + CUDA_CALL_ERET (NULL, cuDeviceGetName, ptx_dev->name, sizeof ptx_dev->name, dev); @@ -2036,3 +2045,40 @@ GOMP_OFFLOAD_run (int ord, void *tgt_fn, void *tgt_vars, void **args) } /* TODO: Implement GOMP_OFFLOAD_async_run. */ + +#define CHECK_ISA(major, minor) \ + if (((device->compute_major == major && device->compute_minor >= minor) \ + || device->compute_major > major) \ + && strcmp (isa, "sm_"#major#minor) == 0) \ + return true + +bool +GOMP_OFFLOAD_evaluate_device (int device_num, const char *kind, + const char *arch, const char *isa) +{ + if (kind && strcmp (kind, "gpu") != 0) + return false; + if (arch && strcmp (arch, "nvptx") != 0) + return false; + if (!isa) + return true; + + struct ptx_device *device = ptx_devices[device_num]; + + CHECK_ISA (3, 0); + CHECK_ISA (3, 5); + CHECK_ISA (3, 7); + CHECK_ISA (5, 0); + CHECK_ISA (5, 2); + CHECK_ISA (5, 3); + CHECK_ISA (6, 0); + CHECK_ISA (6, 1); + CHECK_ISA (6, 2); + CHECK_ISA (7, 0); + CHECK_ISA (7, 2); + CHECK_ISA (7, 5); + CHECK_ISA (8, 0); + CHECK_ISA (8, 6); + + return false; +} diff --git a/libgomp/selector.c b/libgomp/selector.c new file mode 100644 index 00000000000..dc920ee065f --- /dev/null +++ b/libgomp/selector.c @@ -0,0 +1,36 @@ +/* Copyright (C) 2022 Free Software Foundation, Inc. + Contributed by Mentor, a Siemens Business. + + This file is part of the GNU Offloading and Multi Processing Library + (libgomp). 
+ + Libgomp is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. 
+ + Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY + WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS + FOR A PARTICULAR PURPOSE. See the GNU General Public License for + more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . */ + +/* This file contains a placeholder implementation of + GOMP_evaluate_current_device. */ + +#include "libgomp.h" + +bool +GOMP_evaluate_current_device (const char *kind, const char *arch, + const char *isa) +{ + return false; +} diff --git a/libgomp/target.c b/libgomp/target.c index 698ff14a05f..77e0b3f1c24 100644 --- a/libgomp/target.c +++ b/libgomp/target.c @@ -3690,6 +3690,43 @@ omp_pause_resource_all (omp_pause_resource_t kind) ialias (omp_pause_resource) ialias (omp_pause_resource_all) +bool +GOMP_evaluate_target_device (int device_num, const char *kind, + const char *arch, const char *isa) +{ + bool result = true; + + if (device_num < 0) + device_num = omp_get_default_device (); + + if (kind && strcmp (kind, "any") == 0) + kind = NULL; + + gomp_debug (1, "%s: device_num = %u, kind=%s, arch=%s, isa=%s", + __FUNCTION__, device_num, kind, arch, isa); + + if (omp_get_device_num () == device_num) + result = GOMP_evaluate_current_device (kind, arch, isa); + else + { + if (!omp_is_initial_device ()) + /* Accelerators are not expected to know about other devices. 
*/ + result = false; + else + { + struct gomp_device_descr *device = resolve_device (device_num); + if (device == NULL) + result = false; + else if (device->evaluate_device_func) + result = device->evaluate_device_func (device_num, kind, arch, + isa); + } + } + + gomp_debug (1, " -> %s\n", result ? "true" : "false"); + return result; +} + #ifdef PLUGIN_SUPPORT /* This function tries to load a plugin for DEVICE. Name of plugin is passed @@ -3742,6 +3779,7 @@ gomp_load_plugin_for_device (struct gomp_device_descr *device, DLSYM (free); DLSYM (dev2host); DLSYM (host2dev); + DLSYM (evaluate_device); device->capabilities = device->get_caps_func (); if (device->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400) { diff --git a/libgomp/testsuite/Makefile.in b/libgomp/testsuite/Makefile.in index e48c3f2f9b0..5eed05f5dde 100644 --- a/libgomp/testsuite/Makefile.in +++ b/libgomp/testsuite/Makefile.in @@ -284,6 +284,7 @@ pdfdir = @pdfdir@ prefix = @prefix@ program_transform_name = @program_transform_name@ psdir = @psdir@ +runstatedir = @runstatedir@ sbindir = @sbindir@ sharedstatedir = @sharedstatedir@ srcdir = @srcdir@ diff --git a/libgomp/testsuite/libgomp.c-c++-common/metadirective-5.c b/libgomp/testsuite/libgomp.c-c++-common/metadirective-5.c new file mode 100644 index 00000000000..e8ab7ccb166 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/metadirective-5.c @@ -0,0 +1,46 @@ +/* { dg-do run } */ + +#define N 100 + +#include +#include + +int f(int a[], int num) +{ + int on_device = 0; + int i; + + #pragma omp metadirective \ + when (target_device={device_num(num), kind("gpu")}: \ + target parallel for map(to: a[0:N]), map(from: on_device)) \ + default (parallel for private (on_device)) + for (i = 0; i < N; i++) + { + a[i] += i; + on_device = 1; + } + + return on_device; +} + +int main (void) +{ + int a[N]; + int on_device_count = 0; + int i; + + for (i = 0; i < N; i++) + a[i] = i; + + for (i = 0; i <= omp_get_num_devices (); i++) + on_device_count += f (a, i); + + if 
(on_device_count != omp_get_num_devices ()) + return 1; + + for (i = 0; i < N; i++) + if (a[i] != 2 * i) + return 2; + + return 0; +} diff --git a/libgomp/testsuite/libgomp.fortran/metadirective-5.f90 b/libgomp/testsuite/libgomp.fortran/metadirective-5.f90 new file mode 100644 index 00000000000..3992286dc08 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/metadirective-5.f90 @@ -0,0 +1,44 @@ +! { dg-do run } + +program main + use omp_lib + + implicit none + + integer, parameter :: N = 100 + integer :: a(N) + integer :: on_device_count = 0 + integer :: i + + do i = 1, N + a(i) = i + end do + + do i = 0, omp_get_num_devices () + on_device_count = on_device_count + f (a, i) + end do + + if (on_device_count .ne. omp_get_num_devices ()) stop 1 + + do i = 1, N + if (a(i) .ne. 2 * i) stop 2; + end do +contains + integer function f (a, num) + integer, intent(inout) :: a(N) + integer, intent(in) :: num + integer :: on_device + integer :: i + + on_device = 0 + !$omp metadirective & + !$omp& when (target_device={device_num(num), kind("gpu")}: & + !$omp& target parallel do map(to: a(1:N)), map(from: on_device)) & + !$omp& default (parallel do private(on_device)) + do i = 1, N + a(i) = a(i) + i + on_device = 1 + end do + f = on_device; + end function +end program
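
Note on the nvptx plugin's isa matching: a device should match "sm_XY" when its compute capability is at least X.Y, which is a lexicographic comparison on (major, minor) rather than two independent comparisons. The following is only a minimal standalone sketch of that rule, not part of the patch. The helper name sm_isa_supported is hypothetical, and unlike the plugin code it accepts any two-digit sm_XY string instead of a fixed list of known levels.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of libgomp: return true if a device with
   compute capability DEV_MAJOR.DEV_MINOR can run code built for ISA,
   where ISA is spelled "sm_XY" with single-digit X and Y.  */
static bool
sm_isa_supported (int dev_major, int dev_minor, const char *isa)
{
  int major, minor;

  /* Only the five-character "sm_XY" spellings are handled in this sketch.  */
  if (strlen (isa) != 5 || sscanf (isa, "sm_%1d%1d", &major, &minor) != 2)
    return false;

  /* Lexicographic comparison on (major, minor): a newer major version
     satisfies any minor version of an older major.  */
  return dev_major > major || (dev_major == major && dev_minor >= minor);
}
```

With this rule, sm_isa_supported (7, 0, "sm_35") is true, whereas testing major and minor independently would reject it because 0 < 5.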