From patchwork Tue Apr 26 17:25:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Thomas Schwinge X-Patchwork-Id: 53235 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 4F2CE3857359 for ; Tue, 26 Apr 2022 17:26:05 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from esa3.mentor.iphmx.com (esa3.mentor.iphmx.com [68.232.137.180]) by sourceware.org (Postfix) with ESMTPS id 8D53F3858C53; Tue, 26 Apr 2022 17:25:39 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 8D53F3858C53 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mentor.com X-IronPort-AV: E=Sophos;i="5.90,291,1643702400"; d="scan'208,223";a="74777306" Received: from orw-gwy-02-in.mentorg.com ([192.94.38.167]) by esa3.mentor.iphmx.com with ESMTP; 26 Apr 2022 09:25:39 -0800 IronPort-SDR: K5Dv6Glkj6fu7sd6WHtJUuOUpRMptCSX7zO7o1kwVzCpbvvXXgjJYhAVS7tSGc9a1h3oXSeDQ1 fIoNtULn0xxjoTZyGfmMu/nRZ0+TrsE+NJ1Ca+zUONNUJ1ZiSB8mzpLNGi1u2BkRBnyQnnfpid jNHPohj+I9M4VEXAA3qZGkPOV0Qymg2dn8D7/idlmSgshtm/i66lUGWmXvmjkfrCntXA59ggOx +w1bB1VqbmNnCUxcE9BgCIPzDJYV4BI6BEJwh9ed5jELb8LwtFl9ygUE/YohUrztgocknJ2DQ6 NsQ= From: Thomas Schwinge To: , Julian Brown , Andrew Stubbs Subject: Fix up 'libgomp.oacc-fortran/print-1.f90' GCN offloading compilation [PR104717] (was: [PATCH] fortran: Fix up gfc_trans_oacc_construct [PR104717]) In-Reply-To: <87tuagsvxd.fsf@dem-tschwing-1.ger.mentorg.com> References: <87tuagsvxd.fsf@dem-tschwing-1.ger.mentorg.com> User-Agent: Notmuch/0.29.1+93~g67ed7df (https://notmuchmail.org) Emacs/26.3 (x86_64-pc-linux-gnu) Date: Tue, 26 Apr 2022 19:25:31 +0200 Message-ID: <87levrsqno.fsf@dem-tschwing-1.ger.mentorg.com> MIME-Version: 1.0 X-Originating-IP: [137.202.0.90] X-ClientProxiedBy: svr-ies-mbx-06.mgc.mentorg.com (139.181.222.6) To svr-ies-mbx-10.mgc.mentorg.com (139.181.222.10) X-Spam-Status: No, score=-12.0 required=5.0 tests=BAYES_00, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jakub Jelinek , Tobias Burnus , fortran@gcc.gnu.org Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" Hi! On 2022-04-25T23:19:26+0200, I wrote: > On 2022-04-20T19:06:17+0200, Jakub Jelinek wrote: >> So that move_sese_region_to_fn works properly, OpenMP/OpenACC constructs >> for which that function is invoked need an extra artificial BIND_EXPR >> around their body so that we move all variables of the bodies. >> >> The C/C++ FEs do that both for OpenMP constructs like OMP_PARALLEL, OMP_TASK >> or OMP_TARGET and for OpenACC constructs that behave similarly to >> OMP_TARGET, but the Fortran FE only does that for OpenMP constructs. >> >> The following patch does that for OpenACC constructs too. >> This fixes ICE on the attached testcase. > > ACK, thanks. >> Unfortunately, it also regresses >> FAIL: gfortran.dg/goacc/privatization-1-compute-loop.f90 -O (test for excess errors) >> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O0 (test for excess errors) >> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O1 (test for excess errors) >> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O2 (test for excess errors) >> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors) >> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O3 -g (test for excess errors) >> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -Os (test for excess errors) >> Those emits emit tons of various messages and now there are some extra ones, > > I've fixed these up. One more issue became apparent, where the code changes pushed actually do lead to a GCN offloading compilation failure: [...]/libgomp.oacc-fortran/print-1.f90: In function ‘MAIN__._omp_fn.0’: [...]/libgomp.oacc-fortran/print-1.f90:13:14: error: 512 bytes of gang-private data-share memory exhausted (increase with ‘-mgang-private-size=560’, for example) 13 | !$acc parallel | ^ In my configuration, I may indeed fix GCN offloading compilation with '-foffload-options=amdgcn-amdhsa=-mgang-private-size=560', but I don't think that's generally correct/sufficient, so in the the attached "Fix up 'libgomp.oacc-fortran/print-1.f90' GCN offloading compilation [PR104717]", I instead "raise '-mgang-private-size' to an arbitrary high value". This avoids having to route the actual 'sizeof' from GCC build down to the test suite harness (which ought to be doable, but non-trivial). OK to push that: +! For GCN offloading compilation, when gang-privatizing 'dt_parm.N' +! (see below), we run into an 'gang-private data-share memory exhausted' +! error: the default '-mgang-private-size' is too small. Per +! 'gcc/fortran/trans-io.cc'/'libgfortran/io/io.h', that one is +! 'struct st_parameter_dt', which indeed is rather big. Instead of +! working out its exact size (which may vary per GCC configuration), +! raise '-mgang-private-size' to an arbitrary high value. +! { dg-additional-options "-foffload-options=amdgcn-amdhsa=-mgang-private-size=13579" { target openacc_radeon_accel_selected } } ... to master branch? (This doubles the use/testing of the '-mgang-private-size' option!) ;-) We've currently not been doing OpenACC privatization scanning in 'libgomp.oacc-fortran/print-1.f90', which I've now added, to help document the issue; no need to review that. Of course, the issue could alternatively be fixed by adding more logic to the GCN back end to auto-scale the allocation, or be fixed by adding more logic to the compiler to avoid gang-privatizing varibales such as 'dt_parm.N' in such cases, but that's not something I'm going to look into at this point. Or, of course, be avoided by re-writing the test case to not require gang-privatizing 'dt_parm.N', but the test case is correct as it is. Grüße Thomas > PR fortran/104717 > gcc/fortran/ > * trans-openmp.cc (gfc_trans_oacc_construct): Wrap construct body > in an extra BIND_EXPR. > --- a/gcc/fortran/trans-openmp.cc > +++ b/gcc/fortran/trans-openmp.cc > @@ -4444,7 +4444,9 @@ gfc_trans_oacc_construct (gfc_code *code) > gfc_start_block (&block); > oacc_clauses = gfc_trans_omp_clauses (&block, code->ext.omp_clauses, > code->loc, false, true); > + pushlevel (); > stmt = gfc_trans_omp_code (code->block->next, true); > + stmt = build3_v (BIND_EXPR, NULL, stmt, poplevel (1, 0)); > stmt = build2_loc (gfc_get_location (&code->loc), construct_code, > void_type_node, stmt, oacc_clauses); > gfc_add_expr_to_block (&block, stmt); ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 From 3dfea06371aa9bcc84ad75a2bc821a45e131dca6 Mon Sep 17 00:00:00 2001 From: Thomas Schwinge Date: Tue, 26 Apr 2022 18:41:23 +0200 Subject: [PATCH] Fix up 'libgomp.oacc-fortran/print-1.f90' GCN offloading compilation [PR104717] That got broken by recent commit b2202431910e30d8505c94d1cb9341cac7080d10 "fortran: Fix up gfc_trans_oacc_construct [PR104717]". PR fortran/104717 libgomp/ * testsuite/libgomp.oacc-fortran/print-1.f90: Add OpenACC privatization scanning. For GCN offloading compilation, raise '-mgang-private-size'. --- .../libgomp.oacc-fortran/print-1.f90 | 30 ++++++++++++++++++- 1 file changed, 29 insertions(+), 1 deletion(-) diff --git a/libgomp/testsuite/libgomp.oacc-fortran/print-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/print-1.f90 index 7b7f73741fe..42a8538e1fb 100644 --- a/libgomp/testsuite/libgomp.oacc-fortran/print-1.f90 +++ b/libgomp/testsuite/libgomp.oacc-fortran/print-1.f90 @@ -6,11 +6,39 @@ ! Separate file 'print-1-nvptx.f90' for nvptx offloading. ! { dg-skip-if "separate file" { offload_target_nvptx } } +! For GCN offloading compilation, when gang-privatizing 'dt_parm.N' +! (see below), we run into an 'gang-private data-share memory exhausted' +! error: the default '-mgang-private-size' is too small. Per +! 'gcc/fortran/trans-io.cc'/'libgfortran/io/io.h', that one is +! 'struct st_parameter_dt', which indeed is rather big. Instead of +! working out its exact size (which may vary per GCC configuration), +! raise '-mgang-private-size' to an arbitrary high value. +! { dg-additional-options "-foffload-options=amdgcn-amdhsa=-mgang-private-size=13579" { target openacc_radeon_accel_selected } } + +! { dg-additional-options "-fopt-info-note-omp" } +! { dg-additional-options "-foffload=-fopt-info-note-omp" } + +! { dg-additional-options "--param=openacc-privatization=noisy" } +! { dg-additional-options "-foffload=--param=openacc-privatization=noisy" } +! Prune a few: uninteresting, and potentially varying depending on GCC configuration (data types): +! { dg-prune-output {note: variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} } */ + +! It's only with Tcl 8.5 (released in 2007) that "the variable 'varName' +! passed to 'incr' may be unset, and in that case, it will be set to [...]", +! so to maintain compatibility with earlier Tcl releases, we manually +! initialize counter variables: +! { dg-line l_dummy[variable c_compute 0] } +! { dg-message dummy {} { target iN-VAl-Id } l_dummy } to avoid +! "WARNING: dg-line var l_dummy defined, but not used". + program main implicit none integer :: var = 42 -!$acc parallel +!$acc parallel ! { dg-line l_compute[incr c_compute] } + ! { dg-note {variable 'dt_parm\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l_compute$c_compute } + ! { dg-note {variable 'dt_parm\.[0-9]+' ought to be adjusted for OpenACC privatization level: 'gang'} {} { target *-*-* } l_compute$c_compute } + ! { dg-note {variable 'dt_parm\.[0-9]+' adjusted for OpenACC privatization level: 'gang'} {} { target { ! openacc_host_selected } } l_compute$c_compute } write (0, '("The answer is ", I2)') var !$acc end parallel -- 2.25.1