From patchwork Wed Nov 16 11:42:16 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Tobias Burnus
X-Patchwork-Id: 60692
Message-ID: <31a32901-1912-988b-c641-1f23093e8563@codesourcery.com>
Date: Wed, 16 Nov 2022 12:42:16 +0100
To: Andrew Stubbs, gcc-patches
From: Tobias Burnus
Subject: [patch] gcn: Add __builtin_gcn_kernarg_ptr
List-Id: Gcc-patches mailing list

This is part of a patch by Andrew (hi!), namely the part that only adds
__builtin_gcn_kernarg_ptr. More is planned; see below.

The short-term benefit of this patch is to permit replacing hard-coded
numbers with a builtin, as in libgomp (see patch) or in newlib (not
submitted). It would also replace the 'asm("s8")' in the reverse offload
(GCN) patch, i.e.
https://gcc.gnu.org/pipermail/gcc-patches/2022-September/602339.html

However, this patch is only the very first step. The next one is to add
several additional builtins, namely those required for newlib:
newlib/libc/machine/amdgcn/mlock.c (sbrk) and
newlib/libc/machine/amdgcn/getreent.c (__getreent) use some additional
hard-coded values for heap and stack memory.

At some point, but only after newlib has been updated, we can think about
making stack variables non-private. That is a general goal, and in any case
it is required for reverse offload to be able to transfer data between the
host and on-device stack variables.

* * *

Regarding the patch: Besides the obvious change (addition of the builtin),
the change to the DEFAULT memory space is required to avoid a memory-space
conversion ICE when using the new builtin. The gcn_oacc_dim_size change is
mainly just picked from Andrew's patch, as it seems reasonable.
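
To illustrate the short-term benefit mentioned above, here is a minimal sketch (not part of the patch) of how device code might fetch the kernargs through the builtin instead of a pinned register. The names get_kernargs, team_arena_slice and host_kernargs_for_testing are hypothetical; the host-side stand-in array exists only so that the sketch also compiles off-device. The slot index 4 mirrors the libgomp hunk, and the __AMDGCN__ branch assumes a compiler that already contains this patch.

```c
#include <stddef.h>

/* Hypothetical host-side stand-in for the kernarg segment, used only
   when this sketch is built for something other than amdgcn.  */
static void *host_kernargs_for_testing[8];

/* Hypothetical helper: return the kernarg pointer without naming a
   hardware register such as s8.  */
static inline void **
get_kernargs (void)
{
#if defined (__AMDGCN__)
  /* Assumes a GCC that provides the builtin added by this patch.  */
  return (void **) __builtin_gcn_kernarg_ptr ();
#else
  return host_kernargs_for_testing;
#endif
}

/* Hypothetical helper: compute a team's slice of the arena whose base
   address is passed in kernarg slot 4, as in gomp_gcn_enter_kernel.  */
static inline char *
team_arena_slice (unsigned teamid, size_t arena_size)
{
  void **kernargs = get_kernargs ();
  return (char *) kernargs[4] + arena_size * teamid;
}
```

If the register assignment ever changes, only the compiler-side expansion would need to follow; code written like this should not.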

In terms of the libgomp testsuite, I did not spot anything except that the
-O2 run no longer fails with "libgomp: target function wasn't mapped" for
libgomp.oacc-fortran/kernels-map-1.f90, but I am not sure whether that is
related. In any case, the libgomp testsuite shows no new fails (only the
usual ones) with the attached patch.

OK for mainline?

Tobias

PS: The plan is to have at least all builtins in GCC and to use them in
newlib by the end of this year (i.e. in newlib's end-of-year snapshot, aka
the annual release).

PPS: I wonder whether
  [Patch] libgomp/gcn: Prepare for reverse-offload callback handling
  https://gcc.gnu.org/pipermail/gcc-patches/2022-September/602339.html
would be okay after this patch, with the asm("s8") replaced by the builtin,
or not. The code itself would be fine, but it is unreachable until
GOMP_OFFLOAD_get_num_devices accepts reverse offload, and the latter depends
on the support for non-private stack variables.
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955

gcn: Add __builtin_gcn_kernarg_ptr

Add __builtin_gcn_kernarg_ptr to avoid using hard-coded register values
and permit future ABI changes while keeping the API.

gcc/ChangeLog:

	* config/gcn/gcn-builtins.def (KERNARG_PTR): Add.
	* config/gcn/gcn.cc (gcn_init_builtin_types): Change siptr_type_node,
	sfptr_type_node and voidptr_type_node from FLAT to ADDR_SPACE_DEFAULT.
	(gcn_expand_builtin_1): Handle GCN_BUILTIN_KERNARG_PTR.
	(gcn_oacc_dim_size): Return in ADDR_SPACE_FLAT.

libgomp/ChangeLog:

	* config/gcn/team.c (gomp_gcn_enter_kernel): Use
	__builtin_gcn_kernarg_ptr instead of asm ("s8").

Co-Authored-By: Andrew Stubbs

 gcc/config/gcn/gcn-builtins.def |  4 ++++
 gcc/config/gcn/gcn.cc           | 24 ++++++++++++++++++++----
 libgomp/config/gcn/team.c       |  2 +-
 3 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/gcc/config/gcn/gcn-builtins.def b/gcc/config/gcn/gcn-builtins.def
index c50777bd..eeeaebf 100644
--- a/gcc/config/gcn/gcn-builtins.def
+++ b/gcc/config/gcn/gcn-builtins.def
@@ -158,6 +158,10 @@ DEF_BUILTIN (ACC_SINGLE_COPY_END, -1, "single_copy_end", B_INSN,
 DEF_BUILTIN (ACC_BARRIER, -1, "acc_barrier", B_INSN, _A1 (GCN_BTI_VOID),
 	     gcn_expand_builtin_1)
 
+/* Kernel inputs.  */
+
+DEF_BUILTIN (KERNARG_PTR, -1, "kernarg_ptr", B_INSN, _A1 (GCN_BTI_VOIDPTR),
+	     gcn_expand_builtin_1)
 
 #undef _A1
 #undef _A2
diff --git a/gcc/config/gcn/gcn.cc b/gcc/config/gcn/gcn.cc
index 5e6f3b8..b3814c2 100644
--- a/gcc/config/gcn/gcn.cc
+++ b/gcc/config/gcn/gcn.cc
@@ -4058,15 +4058,15 @@ gcn_init_builtin_types (void)
 					  (integer_type_node) */ , 64);
 
   tree tmp = build_distinct_type_copy (intSI_type_node);
-  TYPE_ADDR_SPACE (tmp) = ADDR_SPACE_FLAT;
+  TYPE_ADDR_SPACE (tmp) = ADDR_SPACE_DEFAULT;
   siptr_type_node = build_pointer_type (tmp);
 
   tmp = build_distinct_type_copy (float_type_node);
-  TYPE_ADDR_SPACE (tmp) = ADDR_SPACE_FLAT;
+  TYPE_ADDR_SPACE (tmp) = ADDR_SPACE_DEFAULT;
   sfptr_type_node = build_pointer_type (tmp);
 
   tmp = build_distinct_type_copy (void_type_node);
-  TYPE_ADDR_SPACE (tmp) = ADDR_SPACE_FLAT;
+  TYPE_ADDR_SPACE (tmp) = ADDR_SPACE_DEFAULT;
   voidptr_type_node = build_pointer_type (tmp);
 
   tmp = build_distinct_type_copy (void_type_node);
@@ -4493,6 +4493,20 @@ gcn_expand_builtin_1 (tree exp, rtx target, rtx /*subtarget */ ,
       emit_insn (gen_gcn_wavefront_barrier ());
       return target;
 
+    case GCN_BUILTIN_KERNARG_PTR:
+      {
+	rtx ptr;
+	if (cfun->machine->args.reg[KERNARG_SEGMENT_PTR_ARG] >= 0)
+	  ptr = gen_rtx_REG (DImode,
+			     cfun->machine->args.reg[KERNARG_SEGMENT_PTR_ARG]);
+	else
+	  {
+	    ptr = gen_reg_rtx (DImode);
+	    emit_move_insn (ptr, const0_rtx);
+	  }
+	return ptr;
+      }
+
     default:
       gcc_unreachable ();
     }
@@ -5700,7 +5714,9 @@ gcn_oacc_dim_size (int dim)
 				       cfun->machine->args.
 				       reg[DISPATCH_PTR_ARG]),
 			  GEN_INT (offset[dim]));
-  return gen_rtx_MEM (SImode, addr);
+  rtx mem = gen_rtx_MEM (SImode, addr);
+  set_mem_addr_space (mem, ADDR_SPACE_SCALAR_FLAT);
+  return mem;
 }
 
 /* Helper function for oacc_dim_pos instruction.
diff --git a/libgomp/config/gcn/team.c b/libgomp/config/gcn/team.c
index 254dd4d..4fc7b62 100644
--- a/libgomp/config/gcn/team.c
+++ b/libgomp/config/gcn/team.c
@@ -60,7 +60,7 @@ gomp_gcn_enter_kernel (void)
       /* Initialize the team arena for optimized memory allocation.
          The arena has been allocated on the host side, and the address
          passed in via the kernargs.  Each team takes a small slice of it.  */
-      register void **kernargs asm("s8");
+      void **kernargs = (void**) __builtin_gcn_kernarg_ptr ();
       void *team_arena = (kernargs[4] + TEAM_ARENA_SIZE*teamid);
       void * __lds *arena_start = (void * __lds *)TEAM_ARENA_START;
       void * __lds *arena_free = (void * __lds *)TEAM_ARENA_FREE;
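
One consequence of the GCN_BUILTIN_KERNARG_PTR expansion in gcn_expand_builtin_1 is perhaps worth spelling out: when the current function has no kernarg-segment register assigned, the builtin yields a zero pointer rather than failing. A cautious caller could guard against that, roughly as sketched here. kernarg_slot is a hypothetical helper for illustration only; this patch does not add it.

```c
#include <stddef.h>

/* Hypothetical helper: read one pointer-sized kernarg slot, returning
   NULL when no kernarg pointer is available (the builtin expands to a
   zero pointer in that case).  */
static inline void *
kernarg_slot (void **kernargs, size_t index)
{
  if (kernargs == NULL)
    return NULL;
  return kernargs[index];
}
```

On the device, the first argument would be the result of (void **) __builtin_gcn_kernarg_ptr ().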