From patchwork Wed Dec 7 08:08:09 2022
X-Patchwork-Submitter: Tobias Burnus
X-Patchwork-Id: 61620
Date: Wed, 7 Dec 2022 09:08:09 +0100
Subject: [Patch] libgomp.texi: Reverse-offload updates (was: [Patch] libgomp: Handle OpenMP's reverse offloads)
From: Tobias Burnus
To: gcc-patches, Jakub Jelinek
References: <0567b7c6-fede-72b8-63d1-1fc10dca36a0@codesourcery.com>
In-Reply-To: <0567b7c6-fede-72b8-63d1-1fc10dca36a0@codesourcery.com>

On 06.12.22 08:45, Tobias Burnus wrote:
> * As follow-up, libgomp.texi must be updated

That is what the attached patch does – obviously, it is depending on the
main patch.

OK (once the main patch is in)?

Tobias

libgomp.texi: Reverse-offload updates

libgomp/
	* libgomp.texi (5.0 Impl. Status): Update 'requires' and 'ancestor'.
	(GCN): Add item about 'omp requires'.
	(nvptx): Likewise; add item about reverse offload.

diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index efa7d956a33..e9ab079ecf5 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -192,8 +192,8 @@ The OpenMP 4.5 specification is fully supported.
       env variable @tab Y @tab
 @item Nested-parallel changes to @emph{max-active-levels-var} ICV @tab Y @tab
 @item @code{requires} directive @tab P
-      @tab complete but no non-host devices provides @code{unified_address},
-      @code{unified_shared_memory} or @code{reverse_offload}
+      @tab complete but no non-host devices provides @code{unified_address} or
+      @code{unified_shared_memory}
 @item @code{teams} construct outside an enclosing target region @tab Y @tab
 @item Non-rectangular loop nests @tab Y @tab
 @item @code{!=} as relational-op in canonical loop form for C/C++ @tab Y @tab
@@ -228,7 +228,7 @@ The OpenMP 4.5 specification is fully supported.
 @item @code{allocate} clause @tab P @tab Initial support
 @item @code{use_device_addr} clause on @code{target data} @tab Y @tab
 @item @code{ancestor} modifier on @code{device} clause
-      @tab Y @tab See comment for @code{requires}
+      @tab Y @tab Host fallback with GCN devices
 @item Implicit declare target directive @tab Y @tab
 @item Discontiguous array section with @code{target update} construct
       @tab N @tab
@@ -288,7 +288,7 @@ The OpenMP 4.5 specification is fully supported.
       @code{append_args} @tab N @tab
 @item @code{dispatch} construct @tab N @tab
 @item device-specific ICV settings with environment variables @tab Y @tab
-@item @code{assume} directive @tab Y @tab
+@item @code{assume} and @code{assumes} directives @tab Y @tab
 @item @code{nothing} directive @tab Y @tab
 @item @code{error} directive @tab Y @tab
 @item @code{masked} construct @tab Y @tab
@@ -4455,6 +4455,9 @@ The implementation remark:
 @item I/O within OpenMP target regions and OpenACC parallel/kernels is supported
       using the C library @code{printf} functions and the Fortran
       @code{print}/@code{write} statements.
+@item OpenMP code that has a requires directive with @code{unified_address},
+      @code{unified_shared_memory} or @code{reverse_offload} will remove
+      any GCN device from the list of available devices (``host fallback'').
 @end itemize
 
 
@@ -4504,6 +4507,13 @@ The implementation remark:
 @item Compilation OpenMP code that contains @code{requires reverse_offload}
       requires at least @code{-march=sm_35}, compiling for @code{-march=sm_30}
       is not supported.
+@item For code containing reverse offload (i.e. @code{target} regions with
+      @code{device(ancestor:1)}), there is a slight performance penalty
+      for @emph{all} target regions, consisting mostly of shutdown delay
+      between zero and one microsecond and a tiny device querying overhead.
+@item OpenMP code that has a requires directive with @code{unified_address}
+      or @code{unified_shared_memory} will remove any nvptx device from the
+      list of available devices (``host fallback'').
 @end itemize
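
For readers unfamiliar with the construct, here is a minimal sketch of the kind of code the new nvptx and GCN items describe: a translation unit with 'requires reverse_offload' and a 'target' region that uses 'device(ancestor:1)' to run a nested block back on the host. The function name and the arithmetic are hypothetical and only serve as illustration; this is not taken from the patch or the testsuite. Built without OpenMP support the pragmas are ignored and everything runs on the host, which gives the same result as the "host fallback" case.

```c
/* Illustrative sketch only; double_then_increment is a hypothetical name.  */

/* Must appear at file scope before any target construct in this unit.  */
#pragma omp requires reverse_offload

int
double_then_increment (int x)
{
  int result = 0;

  /* Outer region: offloaded to a device if one remains available,
     otherwise executed on the host (host fallback).  */
  #pragma omp target map(to: x) map(from: result)
  {
    int tmp = x * 2;

    /* Reverse offload: this block is executed by the parent device,
       i.e. the host, even when the enclosing region runs on a device.  */
    #pragma omp target device(ancestor: 1) map(tofrom: tmp)
    {
      tmp += 1;
    }

    result = tmp;
  }

  return result;
}
```

Calling double_then_increment (3) yields 7 regardless of where the outer region runs. With an offload-enabled GCC this would presumably be built with something like 'gcc -fopenmp -foffload=nvptx-none'; per the items above, an nvptx device stays in the device list for such code, while any GCN device is removed.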