From patchwork Sun Jan 14 23:15:32 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tobias Burnus
X-Patchwork-Id: 84067
Message-ID: <301e4198-4dbd-453f-8746-95d5d1ec2bf2@net-b.de>
Date:
Mon, 15 Jan 2024 00:15:32 +0100
Subject: [Patch] libgomp.texi: Document omp_pause_resource{,_all} and omp_target_memcpy* (was: [Patch] libgomp.texi: Document omp_pause_resource{,_all})
To: Sandra Loosemore, gcc-patches, Jakub Jelinek, Sandra Loosemore
References: <98216ca7-6a10-4342-b510-1f362127f619@net-b.de>
From: Tobias Burnus
X-BeenThere:
gcc-patches@gcc.gnu.org
List-Id: Gcc-patches mailing list

Hi Sandra, hi all,

Sandra Loosemore:
> On 1/14/24 07:26, Tobias Burnus wrote:
> I have some minor nits about typos and copy-editing.

Thanks. That's the downside of doing editing while being sleepy on a train.

Updated and extended version (documenting also omp_target_memcpy) is attached.
Warning: Still mostly done during a train ride.

> I assume the formatting of the interface syntax
> is consistent with how it's done elsewhere in the manual.

It should be consistent, but I think it eventually needs some cleanup, as the
indentation does not work for the continuation lines. And I think some other
improvements are needed, but that's a slow step-by-step process.

* * *

> Re the content, I see no documentation for omp_pause_resource_t or the
> equivalent in Fortran, or any hint about what the kind argument is for.

There is actually some Fortran documentation at
https://gcc.gnu.org/onlinedocs/gfortran/OpenMP-Modules-OMP_005fLIB-and-OMP_005fLIB_005fKINDS.html
(gcc/fortran/intrinsic.texi). But I concur that moving it to libgomp.texi and
adding a C version makes sense; see also PR110364 under "BTW" (2nd paragraph).

> If it's to explain implementation-specific
> features, then it should at least be documenting whether GCC supports
> additional pause kinds as permitted by the spec.

It doesn't - and it lacks the OpenMP 6.0 addition (post-TR12) as well.

Tobias

libgomp.texi: Document omp_pause_resource{,_all} and omp_target_memcpy*

libgomp/ChangeLog:

	* libgomp.texi (Runtime Library Routines): Document
	omp_pause_resource, omp_pause_resource_all and
	omp_target_memcpy{,_rect}{,_async}.
 libgomp/libgomp.texi | 329 ++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 314 insertions(+), 15 deletions(-)

diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 74d4ef34c43..d3adfd48545 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -561,7 +561,7 @@ specification in version 5.2.
 * Thread Affinity Routines::
 * Teams Region Routines::
 * Tasking Routines::
-@c * Resource Relinquishing Routines::
+* Resource Relinquishing Routines::
 * Device Information Routines::
 * Device Memory Routines::
 * Lock Routines::
@@ -1504,16 +1504,78 @@ and @code{false} represent their language-specific counterparts.
-@c @node Resource Relinquishing Routines
-@c @section Resource Relinquishing Routines
-@c
-@c Routines releasing resources used by the OpenMP runtime.
-@c They have C linkage and do not throw exceptions.
-@c
-@c @menu
-@c * omp_pause_resource::
-@c * omp_pause_resource_all::
-@c @end menu
+@node Resource Relinquishing Routines
+@section Resource Relinquishing Routines
+
+Routines releasing resources used by the OpenMP runtime.
+They have C linkage and do not throw exceptions.
+
+@menu
+* omp_pause_resource:: Release OpenMP resources on a device
+* omp_pause_resource_all:: Release OpenMP resources on all devices
+@end menu
+
+
+
+@node omp_pause_resource
+@subsection @code{omp_pause_resource} -- Release OpenMP resources on a device
+@table @asis
+@item @emph{Description}:
+Free resources used by the OpenMP program and the runtime library on and for the
+device specified by @var{device_num}; on success, zero is returned and non-zero
+otherwise.
+
+The value of @var{device_num} must be a conforming device number. The routine
+may not be called from within any explicit region, and all explicit threads that
+do not bind to the implicit parallel region must have finalized execution.
+
+@item @emph{C/C++}:
+@multitable @columnfractions .20 .80
+@item @emph{Prototype}: @tab @code{int omp_pause_resource(omp_pause_resource_t kind, int device_num);}
+@end multitable
+
+@item @emph{Fortran}:
+@multitable @columnfractions .20 .80
+@item @emph{Interface}: @tab @code{integer function omp_pause_resource(kind, device_num)}
+@item @tab @code{integer (kind=omp_pause_resource_kind) kind}
+@item @tab @code{integer device_num}
+@end multitable
+
+@item @emph{Reference}:
+@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 3.2.43.
+@end table
+
+
+
+@node omp_pause_resource_all
+@subsection @code{omp_pause_resource_all} -- Release OpenMP resources on all devices
+@table @asis
+@item @emph{Description}:
+Free resources used by the OpenMP program and the runtime library on all devices,
+including the host. On success, zero is returned and non-zero otherwise.
+
+The routine may not be called from within any explicit region, and all explicit
+threads that do not bind to the implicit parallel region must have finalized execution.
+
+@item @emph{C/C++}:
+@multitable @columnfractions .20 .80
+@item @emph{Prototype}: @tab @code{int omp_pause_resource_all(omp_pause_resource_t kind);}
+@end multitable
+
+@item @emph{Fortran}:
+@multitable @columnfractions .20 .80
+@item @emph{Interface}: @tab @code{integer function omp_pause_resource_all(kind)}
+@item @tab @code{integer (kind=omp_pause_resource_kind) kind}
+@end multitable
+
+@item @emph{See also}:
+@ref{omp_pause_resource}
+
+@item @emph{Reference}:
+@uref{https://www.openmp.org, OpenMP specification v5.0}, Section 3.2.44.
+@end table
+
+
 @node Device Information Routines
 @section Device Information Routines
@@ -1720,10 +1782,10 @@ pointers on devices. They have C linkage and do not throw exceptions.
 * omp_target_free:: Free device memory
 * omp_target_is_present:: Check whether storage is mapped
 * omp_target_is_accessible:: Check whether memory is device accessible
-@c * omp_target_memcpy::
-@c * omp_target_memcpy_rect::
-@c * omp_target_memcpy_async::
-@c * omp_target_memcpy_rect_async::
+* omp_target_memcpy:: Copy data between devices
+* omp_target_memcpy_rect:: Copy a subvolume of data between devices
+* omp_target_memcpy_async:: Copy data between devices asynchronously
+* omp_target_memcpy_rect_async:: Copy a subvolume of data between devices asynchronously
 @c * omp_target_memset:: /TR12
 @c * omp_target_memset_async:: /TR12
 * omp_target_associate_ptr:: Associate a device pointer with a host pointer
@@ -1899,6 +1961,243 @@ is not supported.
+@node omp_target_memcpy
+@subsection @code{omp_target_memcpy} -- Copy data between devices
+@table @asis
+@item @emph{Description}:
+This routine copies @var{length} bytes of data from the device
+identified by device number @var{src_device_num} to device @var{dst_device_num}.
+The data is copied from the source device from the address provided by
+@var{src}, shifted by the offset of @var{src_offset} bytes, to the destination
+device's @var{dst} address shifted by @var{dst_offset}. The routine returns
+zero on success and non-zero otherwise.
+
+Running this routine in a @code{target} region except on the initial device
+is not supported.
+
+@item @emph{C/C++}:
+@multitable @columnfractions .20 .80
+@item @emph{Prototype}: @tab @code{int omp_target_memcpy(void *dst,}
+@item @tab @code{ const void *src,}
+@item @tab @code{ size_t length,}
+@item @tab @code{ size_t dst_offset,}
+@item @tab @code{ size_t src_offset,}
+@item @tab @code{ int dst_device_num,}
+@item @tab @code{ int src_device_num)}
+@end multitable
+
+@item @emph{Fortran}:
+@multitable @columnfractions .20 .80
+@item @emph{Interface}: @tab @code{integer(c_int) function omp_target_memcpy( &}
+@item @tab @code{ dst, src, length, dst_offset, src_offset, &}
+@item @tab @code{ dst_device_num, src_device_num) bind(C)}
+@item @tab @code{use, intrinsic :: iso_c_binding, only: c_ptr, c_size_t, c_int}
+@item @tab @code{type(c_ptr), value :: dst, src}
+@item @tab @code{integer(c_size_t), value :: length, dst_offset, src_offset}
+@item @tab @code{integer(c_int), value :: dst_device_num, src_device_num}
+@end multitable
+
+@item @emph{See also}:
+@ref{omp_target_memcpy_async}, @ref{omp_target_memcpy_rect}
+
+@item @emph{Reference}:
+@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.8.5.
+@end table
+
+
+
+@node omp_target_memcpy_async
+@subsection @code{omp_target_memcpy_async} -- Copy data between devices asynchronously
+@table @asis
+@item @emph{Description}:
+This routine asynchronously copies @var{length} bytes of data from the
+device identified by device number @var{src_device_num} to device
+@var{dst_device_num}. The data is copied from the source device from the
+address provided by @var{src}, shifted by the offset of @var{src_offset} bytes,
+to the destination device's @var{dst} address shifted by @var{dst_offset}.
+Task dependence is expressed by passing an array of depend objects to
+@var{depobj_list}, where the number of array elements is passed as
+@var{depobj_count}; if the count is zero, the @var{depobj_list} argument is
+ignored.
The routine returns zero if the copying process has successfully
+been started and non-zero otherwise.
+
+Running this routine in a @code{target} region except on the initial device
+is not supported.
+
+@item @emph{C/C++}:
+@multitable @columnfractions .20 .80
+@item @emph{Prototype}: @tab @code{int omp_target_memcpy_async(void *dst,}
+@item @tab @code{ const void *src,}
+@item @tab @code{ size_t length,}
+@item @tab @code{ size_t dst_offset,}
+@item @tab @code{ size_t src_offset,}
+@item @tab @code{ int dst_device_num,}
+@item @tab @code{ int src_device_num,}
+@item @tab @code{ int depobj_count,}
+@item @tab @code{ omp_depend_t *depobj_list)}
+@end multitable
+
+@item @emph{Fortran}:
+@multitable @columnfractions .20 .80
+@item @emph{Interface}: @tab @code{integer(c_int) function omp_target_memcpy_async( &}
+@item @tab @code{ dst, src, length, dst_offset, src_offset, &}
+@item @tab @code{ dst_device_num, src_device_num, &}
+@item @tab @code{ depobj_count, depobj_list) bind(C)}
+@item @tab @code{use, intrinsic :: iso_c_binding, only: c_ptr, c_size_t, c_int}
+@item @tab @code{type(c_ptr), value :: dst, src}
+@item @tab @code{integer(c_size_t), value :: length, dst_offset, src_offset}
+@item @tab @code{integer(c_int), value :: dst_device_num, src_device_num, depobj_count}
+@item @tab @code{integer(omp_depend_kind), optional :: depobj_list(*)}
+@end multitable
+
+@item @emph{See also}:
+@ref{omp_target_memcpy}, @ref{omp_target_memcpy_rect_async}
+
+@item @emph{Reference}:
+@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.8.7.
+@end table
+
+
+
+@node omp_target_memcpy_rect
+@subsection @code{omp_target_memcpy_rect} -- Copy a subvolume of data between devices
+@table @asis
+@item @emph{Description}:
+This routine copies a subvolume of data from the device identified by
+device number @var{src_device_num} to device @var{dst_device_num}.
The
+subvolume is part of a multi-dimensional array with @var{num_dims} dimensions;
+each array element has a size of @var{element_size} bytes. The @var{volume}
+array specifies how many elements per dimension will be copied. The extent of
+the full array, in number of elements per dimension, is given by the
+@var{dst_dimensions} and @var{src_dimensions} arguments for the array on the
+destination and source device, respectively. The offset per dimension to the
+first element to be copied is given by the @var{dst_offset} and @var{src_offset}
+arguments. The routine returns zero on success and non-zero otherwise.
+
+The OpenMP specification only requires that @var{num_dims} up to three is
+supported. To find the implementation-specific maximum number of supported
+dimensions, invoke the routine with a NULL pointer for both the @var{dst} and
+@var{src} arguments; it then returns that value. As GCC supports arbitrary
+dimensions, it returns @code{INT_MAX}.
+
+The device-number arguments must be conforming device numbers; the @var{src} and
+@var{dst} must be either both NULL or all of the following must be fulfilled:
+@var{element_size} and @var{num_dims} must be positive, and the @var{volume},
+offset and dimension arrays must have at least @var{num_dims} elements.
+Running this routine in a @code{target} region except on the initial device
+is not supported.
+
+@item @emph{C/C++}:
+@multitable @columnfractions .20 .80
+@item @emph{Prototype}: @tab @code{int omp_target_memcpy_rect(void *dst,}
+@item @tab @code{ const void *src,}
+@item @tab @code{ size_t element_size,}
+@item @tab @code{ int num_dims,}
+@item @tab @code{ const size_t *volume,}
+@item @tab @code{ const size_t *dst_offset,}
+@item @tab @code{ const size_t *src_offset,}
+@item @tab @code{ const size_t *dst_dimensions,}
+@item @tab @code{ const size_t *src_dimensions,}
+@item @tab @code{ int dst_device_num,}
+@item @tab @code{ int src_device_num)}
+@end multitable
+
+@item @emph{Fortran}:
+@multitable @columnfractions .20 .80
+@item @emph{Interface}: @tab @code{integer(c_int) function omp_target_memcpy_rect( &}
+@item @tab @code{ dst, src, element_size, num_dims, volume, &}
+@item @tab @code{ dst_offset, src_offset, dst_dimensions, &}
+@item @tab @code{ src_dimensions, dst_device_num, src_device_num) bind(C)}
+@item @tab @code{use, intrinsic :: iso_c_binding, only: c_ptr, c_size_t, c_int}
+@item @tab @code{type(c_ptr), value :: dst, src}
+@item @tab @code{integer(c_size_t), value :: element_size, dst_offset, src_offset}
+@item @tab @code{integer(c_size_t), value :: volume, dst_dimensions, src_dimensions}
+@item @tab @code{integer(c_int), value :: num_dims, dst_device_num, src_device_num}
+@end multitable
+
+@item @emph{See also}:
+@ref{omp_target_memcpy_rect_async}, @ref{omp_target_memcpy}
+
+@item @emph{Reference}:
+@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.8.6.
+@end table
+
+
+
+@node omp_target_memcpy_rect_async
+@subsection @code{omp_target_memcpy_rect_async} -- Copy a subvolume of data between devices asynchronously
+@table @asis
+@item @emph{Description}:
+This routine asynchronously copies a subvolume of data from the device
+identified by device number @var{src_device_num} to device @var{dst_device_num}.
+The subvolume is part of a multi-dimensional array with @var{num_dims}
+dimensions; each array element has a size of @var{element_size} bytes. The
+@var{volume} array specifies how many elements per dimension will be copied.
+The extent of the full array, in number of elements per dimension, is given by
+the @var{dst_dimensions} and @var{src_dimensions} arguments for the array on the
+destination and source device, respectively. The offset per dimension to the
+first element to be copied is given by the @var{dst_offset} and @var{src_offset}
+arguments. The routine returns zero if the copying process has successfully
+been started and non-zero otherwise.
+
+The OpenMP specification only requires that @var{num_dims} up to three is
+supported. To find the implementation-specific maximum number of supported
+dimensions, invoke the routine with a NULL pointer for both the @var{dst} and
+@var{src} arguments; it then returns that value. As GCC supports arbitrary
+dimensions, it returns @code{INT_MAX}.
+
+The device-number arguments must be conforming device numbers; the @var{src} and
+@var{dst} must be either both NULL or all of the following must be fulfilled:
+@var{element_size} and @var{num_dims} must be positive, and the @var{volume},
+offset and dimension arrays must have at least @var{num_dims} elements.
+Running this routine in a @code{target} region except on the initial device
+is not supported.
+
+@item @emph{C/C++}:
+@multitable @columnfractions .20 .80
+@item @emph{Prototype}: @tab @code{int omp_target_memcpy_rect_async(void *dst,}
+@item @tab @code{ const void *src,}
+@item @tab @code{ size_t element_size,}
+@item @tab @code{ int num_dims,}
+@item @tab @code{ const size_t *volume,}
+@item @tab @code{ const size_t *dst_offset,}
+@item @tab @code{ const size_t *src_offset,}
+@item @tab @code{ const size_t *dst_dimensions,}
+@item @tab @code{ const size_t *src_dimensions,}
+@item @tab @code{ int dst_device_num,}
+@item @tab @code{ int src_device_num,}
+@item @tab @code{ int depobj_count,}
+@item @tab @code{ omp_depend_t *depobj_list)}
+@end multitable
+
+@item @emph{Fortran}:
+@multitable @columnfractions .20 .80
+@item @emph{Interface}: @tab @code{integer(c_int) function omp_target_memcpy_rect_async( &}
+@item @tab @code{ dst, src, element_size, num_dims, volume, &}
+@item @tab @code{ dst_offset, src_offset, dst_dimensions, &}
+@item @tab @code{ src_dimensions, dst_device_num, src_device_num, &}
+@item @tab @code{ depobj_count, depobj_list) bind(C)}
+@item @tab @code{use, intrinsic :: iso_c_binding, only: c_ptr, c_size_t, c_int}
+@item @tab @code{type(c_ptr), value :: dst, src}
+@item @tab @code{integer(c_size_t), value :: element_size, dst_offset, src_offset}
+@item @tab @code{integer(c_size_t), value :: volume, dst_dimensions, src_dimensions}
+@item @tab @code{integer(c_int), value :: num_dims, dst_device_num, src_device_num}
+@item @tab @code{integer(c_int), value :: depobj_count}
+@item @tab @code{integer(omp_depend_kind), optional :: depobj_list(*)}
+@end multitable
+
+@item @emph{See also}:
+@ref{omp_target_memcpy_rect}, @ref{omp_target_memcpy_async}
+
+@item @emph{Reference}:
+@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.8.8.
+@end table
+
+
+
 @node omp_target_associate_ptr
 @subsection @code{omp_target_associate_ptr} -- Associate a device pointer with a host pointer
 @table @asis