From patchwork Tue Feb 22 17:00:18 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Thomas Schwinge X-Patchwork-Id: 51309 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 9F6583947414 for ; Tue, 22 Feb 2022 17:00:45 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from esa2.mentor.iphmx.com (esa2.mentor.iphmx.com [68.232.141.98]) by sourceware.org (Postfix) with ESMTPS id B2E633857818 for ; Tue, 22 Feb 2022 17:00:27 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org B2E633857818 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mentor.com IronPort-SDR: PpAiE0HFlkQVgA0TGEcxv8mC8qdcaFf2oQgso7oi240/ZlBjWwmNOTLALPef+bK5cujjw7EDpB fFbYuc8+nE4TwawX0OHG2yQwlA2jiwVQk+axTHva4i5GDDeCcrOyoOJrtYcrqRNn72LtMDML4v K1lz0+uXrvJXaolFdlEc5AOoTvc/gAwl4oipa1SngLUSHmhIwZ1us/Mql/Grp71KBlbdbUJAmh VtONSfJtE6DEOBd50vlIhEf2R047Q7619PrWthMRdoxHGNEV/K6oXR9XZ/+NxH2v66oPaDddCS GMWD7haF9ttwothbgx/Z8Ql2 X-IronPort-AV: E=Sophos;i="5.88,387,1635235200"; d="scan'208,223";a="72295069" Received: from orw-gwy-02-in.mentorg.com ([192.94.38.167]) by esa2.mentor.iphmx.com with ESMTP; 22 Feb 2022 09:00:26 -0800 IronPort-SDR: MrtSu3Sx0NJtHGqGSynZGzpxmqQoOa6U8L6h+npEA/hGV+Y9yZzta6SdiNsrWJEViLH6vv3i5h JW0PUQCN8PRHXvdEac8Grc/80mgJrBZtmMrFpKID0sfbzcu4rMTlFUIQZ06CFSiDrQ9riUdaPE PdkYluGQvcEn3VfZ8dhYSt4Ld1sFCGyeIpLJ3jHsiH+NxxLydx0W2ROHFYyWR+2KBUd59iGgQL smbCQh1MnXbculPmCMe2GYxWW8F6ScRX+y/iPVfEI+MYRoBWLJO8rwF4gXrbJdW0Fvoi3kuSc2 aoY= From: Thomas Schwinge To: Jakub Jelinek , Subject: Get rid of 'gcc/omp-oacc-neuter-broadcast.cc:oacc_build_component_ref' (was: Re-unify 'omp_build_component_ref' and 'oacc_build_component_ref') In-Reply-To: <87r1f2puss.fsf@euler.schwinge.homeip.net> References: <992c7c29-5773-45b6-6fb7-ffb71299a98f@mentor.com> <87r1f2puss.fsf@euler.schwinge.homeip.net> User-Agent: Notmuch/0.29.3+94~g74c3f1b (https://notmuchmail.org) Emacs/27.1 (x86_64-pc-linux-gnu) Date: Tue, 22 Feb 2022 18:00:18 +0100 Message-ID: <87sfsaomzh.fsf@euler.schwinge.homeip.net> MIME-Version: 1.0 X-Originating-IP: [137.202.0.90] X-ClientProxiedBy: svr-ies-mbx-05.mgc.mentorg.com (139.181.222.5) To svr-ies-mbx-01.mgc.mentorg.com (139.181.222.1) X-Spam-Status: No, score=-11.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, KAM_LOTSOFHASH, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Julian Brown , Kwok Cheung Yeung , Andrew Stubbs Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" Hi! On 2021-08-09T16:16:51+0200, I wrote: > This concerns a class of ICEs seen as of og10 branch with the > "openacc: Middle-end worker-partitioning support" and "amdgcn: > Enable OpenACC worker partitioning for AMD GCN" changes applied: I've determined that as of commit 2a3f9f6532bb21d8ab6f16fbe9ee603f6b1405f2 "openacc: Shared memory layout optimisation", we're no longer running into the vectorizer ICEs for '!ADDR_SPACE_GENERIC_P'. I have not researched if they've just gone latent (again), or whether that commit really changed something to avoid those (bug fix). Anyway: pushed to master branch commit 54f745023276e5025e34b2cc22530c78423a93cb "Get rid of 'gcc/omp-oacc-neuter-broadcast.cc:oacc_build_component_ref'", see attached. Grüße Thomas > On 2020-06-06T16:07:36+0100, Kwok Cheung Yeung wrote: >> On 01/06/2020 8:48 pm, Kwok Cheung Yeung wrote: >>> On 21/05/2020 10:23 pm, Kwok Cheung Yeung wrote: >>>> These all have the same failure mode: >>>> >>>> during RTL pass: expand >>>> [...]/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims.f90: In function 'MAIN__._omp_fn.1': >>>> [...]/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims.f90:86: internal compiler error: in convert_memory_address_addr_space_1, at explow.c:302 >>>> 0xc29f20 convert_memory_address_addr_space_1(scalar_int_mode, rtx_def*, unsigned char, bool, bool) >>>> [...]/gcc/explow.c:302 >>>> 0xc29f57 convert_memory_address_addr_space(scalar_int_mode, rtx_def*, unsigned char) >>>> [...]/gcc/explow.c:404 >>>> [...] > >>>> This occurs if the -ftree-slp-vectorize flag is specified (default at -O3). > >>> The problematic bit of Gimple code is this: >>> >>> .oacc_worker_o.44._120 = gangs_min_472; >>> .oacc_worker_o.44._122 = workers_min_473; >>> .oacc_worker_o.44._124 = vectors_min_474; >>> .oacc_worker_o.44._126 = gangs_max_475; >>> .oacc_worker_o.44._128 = workers_max_476; >>> .oacc_worker_o.44._130 = vectors_max_477; >>> .oacc_worker_o.44._132 = 0; >>> >>> With SLP vectorization enabled, it becomes this: >>> >>> _40 = {gangs_min_472, workers_min_473, vectors_min_474, gangs_max_475}; >>> ... >>> MEM [(int *)&.oacc_worker_o.44] = _40; >>> .oacc_worker_o.44._128 = workers_max_476; >>> .oacc_worker_o.44._130 = vectors_max_477; >>> .oacc_worker_o.44._132 = 0; >>> >>> The optimization is trying to transform 4 separate assignments into a single >>> memory operation. The trouble is that &o.acc_worker_o is an SImode pointer in >>> AS4 (LDS), while the memory expression appears to be in the default memory >>> space. The 'to' expression of the assignment is: >>> >>> >> type >> type >> size >>> unit-size >>> align:32 warn_if_not_align:0 symtab:0 alias-set 1 canonical-type 0x7ffff73195e8 precision:32 min max >>> pointer_to_this reference_to_this > >>> TI >>> size >>> unit-size >>> align:128 warn_if_not_align:0 symtab:0 alias-set 1 structural-equality nunits:4 >>> pointer_to_this > >>> >>> arg:0 >> type >>> public unsigned DI >>> size >>> unit-size >>> align:64 warn_if_not_align:0 symtab:0 alias-set 2 structural-equality> >>> constant >>> arg:0 >>> addressable used static ignored BLK [...]/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims.f90:86:0 >>> >>> size >>> unit-size >>> align:128 warn_if_not_align:0 >>> (mem/c:BLK (symbol_ref:SI (".oacc_worker_o.44.14") [flags 0x2] ) [9 .oacc_worker_o.44+0 S28 A128 AS4])>> >>> arg:1 constant 0>> >>> >>> In convert_memory_address_addr_space_1: >>> >>> #ifndef POINTERS_EXTEND_UNSIGNED >>> gcc_assert (GET_MODE (x) == to_mode || GET_MODE (x) == VOIDmode); >>> return x; >>> #else /* defined(POINTERS_EXTEND_UNSIGNED) */ >>> >>> POINTERS_EXTEND_UNSIGNED is not defined, so it hits the assert. The expected >>> to_mode is DI_mode, but x is SI_mode, so the assert fires. > >> I now have a fix for this. >> >> > MEM [(int *)&.oacc_worker_o.44] = _40; >> >> The ICE occurs because the SLP vectorization pass creates the new statement >> using the type of the expression '&.oacc_worker_o.44', which is a pointer to a >> component ref in the default address space. The expand pass gets confused >> because it is handed an SImode pointer (for LDS) when it is expecting a DImode >> pointer (for flat/global space). >> >> The underlying problem is that although .oacc_worker_o is in the correct address >> space, the component ref .oacc_worker_o is not. I fixed this by propagating the >> address space of .oacc_worker_o when the component ref is created. > >> static tree >> oacc_build_component_ref (tree obj, tree field) >> { >> - tree ret = build3 (COMPONENT_REF, TREE_TYPE (field), obj, field, NULL); >> + tree field_type = TREE_TYPE (field); >> + tree obj_type = TREE_TYPE (obj); >> + if (!ADDR_SPACE_GENERIC_P (TYPE_ADDR_SPACE (obj_type))) >> + field_type = build_qualified_type >> + (field_type, >> + KEEP_QUAL_ADDR_SPACE (TYPE_QUALS (obj_type))); >> + >> + tree ret = build3 (COMPONENT_REF, field_type, obj, field, NULL); >> if (TREE_THIS_VOLATILE (field)) >> TREE_THIS_VOLATILE (ret) |= 1; >> if (TREE_READONLY (field)) > > This code change has been included in the recent master branch commit > e2a58ed6dc5293602d0d168475109caa81ad0f0d "openacc: Middle-end > worker-partitioning support", which thus includes a > 'gcc/omp-oacc-neuter-broadcast.cc:oacc_build_component_ref' that is > slightly different from 'gcc/omp-low.c:omp_build_component_ref'. > > I'm confirming that with this reverted, we're seeing ICEs as follows: > > +FAIL: libgomp.oacc-fortran/gemm-2.f90 [...] -foffload=amdgcn-amdhsa -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (internal compiler error) > > +FAIL: libgomp.oacc-fortran/gemm-2.f90 [...] -foffload=amdgcn-amdhsa -O3 -g (internal compiler error) > > +FAIL: libgomp.oacc-fortran/gemm.f90 [...] -foffload=amdgcn-amdhsa -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (internal compiler error) > > +FAIL: libgomp.oacc-fortran/gemm.f90 [...] -foffload=amdgcn-amdhsa -O3 -g (internal compiler error) > > +FAIL: libgomp.oacc-fortran/optional-reduction.f90 [...] -foffload=amdgcn-amdhsa -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (internal compiler error) > > +FAIL: libgomp.oacc-fortran/optional-reduction.f90 [...] -foffload=amdgcn-amdhsa -O3 -g (internal compiler error) > > +FAIL: libgomp.oacc-fortran/private-variables.f90 [...] -foffload=amdgcn-amdhsa -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (internal compiler error) > > +FAIL: libgomp.oacc-fortran/private-variables.f90 [...] -foffload=amdgcn-amdhsa -O3 -g (internal compiler error) > > +FAIL: libgomp.oacc-fortran/reduction-1.f90 [...] -foffload=amdgcn-amdhsa -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (internal compiler error) > > +FAIL: libgomp.oacc-fortran/reduction-1.f90 [...] -foffload=amdgcn-amdhsa -O3 -g (internal compiler error) > > +FAIL: libgomp.oacc-fortran/reduction-5.f90 [...] -foffload=amdgcn-amdhsa -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (internal compiler error) > > +FAIL: libgomp.oacc-fortran/reduction-5.f90 [...] -foffload=amdgcn-amdhsa -O3 -g (internal compiler error) > > +FAIL: libgomp.oacc-fortran/reduction-6.f90 [...] -foffload=amdgcn-amdhsa -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (internal compiler error) > > +FAIL: libgomp.oacc-fortran/reduction-6.f90 [...] -foffload=amdgcn-amdhsa -O3 -g (internal compiler error) > > Concerning the current 'gcc/omp-low.c:omp_build_component_ref', for the > current set of offloading testcases, we never see a > '!ADDR_SPACE_GENERIC_P' there, so the address space handling doesn't seem > to be necessary there (but also won't do any harm: no-op). > > Would it make sense to "Re-unify 'omp_build_component_ref' and > 'oacc_build_component_ref'", see attached? > > > Grüße > Thomas ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 From 54f745023276e5025e34b2cc22530c78423a93cb Mon Sep 17 00:00:00 2001 From: Thomas Schwinge Date: Fri, 30 Jul 2021 16:15:25 +0200 Subject: [PATCH] Get rid of 'gcc/omp-oacc-neuter-broadcast.cc:oacc_build_component_ref' Clean-up for commit e2a58ed6dc5293602d0d168475109caa81ad0f0d "openacc: Middle-end worker-partitioning support": as of commit 2a3f9f6532bb21d8ab6f16fbe9ee603f6b1405f2 "openacc: Shared memory layout optimisation", we're no longer running into the vectorizer ICEs for '!ADDR_SPACE_GENERIC_P'. gcc/ * omp-low.cc (omp_build_component_ref): Move function... * omp-general.cc (omp_build_component_ref): ... here. Remove 'static'. * omp-general.h (omp_build_component_ref): Declare function. * omp-oacc-neuter-broadcast.cc (oacc_build_component_ref): Remove function. (build_receiver_ref, build_sender_ref): Call 'omp_build_component_ref' instead. --- gcc/omp-general.cc | 14 ++++++++++++++ gcc/omp-general.h | 2 ++ gcc/omp-low.cc | 15 --------------- gcc/omp-oacc-neuter-broadcast.cc | 26 ++------------------------ 4 files changed, 18 insertions(+), 39 deletions(-) diff --git a/gcc/omp-general.cc b/gcc/omp-general.cc index 19f40dc0b1d..a406c578f33 100644 --- a/gcc/omp-general.cc +++ b/gcc/omp-general.cc @@ -2980,4 +2980,18 @@ oacc_get_ifn_dim_arg (const gimple *stmt) return (int) axis; } +/* Build COMPONENT_REF and set TREE_THIS_VOLATILE and TREE_READONLY on it + as appropriate. */ + +tree +omp_build_component_ref (tree obj, tree field) +{ + tree ret = build3 (COMPONENT_REF, TREE_TYPE (field), obj, field, NULL); + if (TREE_THIS_VOLATILE (field)) + TREE_THIS_VOLATILE (ret) |= 1; + if (TREE_READONLY (field)) + TREE_READONLY (ret) |= 1; + return ret; +} + #include "gt-omp-general.h" diff --git a/gcc/omp-general.h b/gcc/omp-general.h index c0cf5f014cd..7a94831e8f5 100644 --- a/gcc/omp-general.h +++ b/gcc/omp-general.h @@ -149,4 +149,6 @@ get_openacc_privatization_dump_flags () return l_dump_flags; } +extern tree omp_build_component_ref (tree obj, tree field); + #endif /* GCC_OMP_GENERAL_H */ diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc index 77176efe715..2294456b27d 100644 --- a/gcc/omp-low.cc +++ b/gcc/omp-low.cc @@ -621,21 +621,6 @@ omp_copy_decl_1 (tree var, omp_context *ctx) return omp_copy_decl_2 (var, DECL_NAME (var), TREE_TYPE (var), ctx); } -/* Build COMPONENT_REF and set TREE_THIS_VOLATILE and TREE_READONLY on it - as appropriate. */ -/* See also 'gcc/omp-oacc-neuter-broadcast.cc:oacc_build_component_ref'. */ - -static tree -omp_build_component_ref (tree obj, tree field) -{ - tree ret = build3 (COMPONENT_REF, TREE_TYPE (field), obj, field, NULL); - if (TREE_THIS_VOLATILE (field)) - TREE_THIS_VOLATILE (ret) |= 1; - if (TREE_READONLY (field)) - TREE_READONLY (ret) |= 1; - return ret; -} - /* Build tree nodes to access the field for VAR on the receiver side. */ static tree diff --git a/gcc/omp-oacc-neuter-broadcast.cc b/gcc/omp-oacc-neuter-broadcast.cc index 314161e38f5..81e3223a94c 100644 --- a/gcc/omp-oacc-neuter-broadcast.cc +++ b/gcc/omp-oacc-neuter-broadcast.cc @@ -937,35 +937,13 @@ worker_single_simple (basic_block from, basic_block to, } } -/* Build COMPONENT_REF and set TREE_THIS_VOLATILE and TREE_READONLY on it - as appropriate. */ -/* Adapted from 'gcc/omp-low.cc:omp_build_component_ref'. */ - -static tree -oacc_build_component_ref (tree obj, tree field) -{ - tree field_type = TREE_TYPE (field); - tree obj_type = TREE_TYPE (obj); - if (!ADDR_SPACE_GENERIC_P (TYPE_ADDR_SPACE (obj_type))) - field_type = build_qualified_type - (field_type, - KEEP_QUAL_ADDR_SPACE (TYPE_QUALS (obj_type))); - - tree ret = build3 (COMPONENT_REF, field_type, obj, field, NULL); - if (TREE_THIS_VOLATILE (field)) - TREE_THIS_VOLATILE (ret) |= 1; - if (TREE_READONLY (field)) - TREE_READONLY (ret) |= 1; - return ret; -} - static tree build_receiver_ref (tree var, tree receiver_decl, field_map_t *fields) { tree x = build_simple_mem_ref (receiver_decl); tree field = *fields->get (var); TREE_THIS_NOTRAP (x) = 1; - x = oacc_build_component_ref (x, field); + x = omp_build_component_ref (x, field); return x; } @@ -975,7 +953,7 @@ build_sender_ref (tree var, tree sender_decl, field_map_t *fields) if (POINTER_TYPE_P (TREE_TYPE (sender_decl))) sender_decl = build_simple_mem_ref (sender_decl); tree field = *fields->get (var); - return oacc_build_component_ref (sender_decl, field); + return omp_build_component_ref (sender_decl, field); } static int -- 2.34.1