libgomp, OpenMP, nvptx: Low-latency memory allocator

Message ID: 25ad524d-f0d6-1970-b8e9-9b11b6cde68b@codesourcery.com
State: Superseded
Series: libgomp, OpenMP, nvptx: Low-latency memory allocator

Commit Message

Andrew Stubbs Dec. 20, 2021, 3:58 p.m. UTC
This patch is submitted now for review, and so that I can commit a backport
of it to the OG11 branch, but it isn't suitable for mainline until stage 1.

The patch implements support for omp_low_lat_mem_space and
omp_low_lat_mem_alloc on NVPTX offload devices. The omp_pteam_mem_alloc,
omp_cgroup_mem_alloc and omp_thread_mem_alloc allocators are also
configured to use this space (this is to match the current or intended
behaviour in other toolchains).

The memory is drawn from the ".shared" space that is accessible only 
from within the team in which it is allocated, and which effectively 
ceases to exist when the kernel exits.  By default, 8 KiB of space is 
reserved for each team at launch time. This can be adjusted, at runtime, 
via a new environment variable "GOMP_NVPTX_LOWLAT_POOL". Reserving a 
larger amount may limit the number of teams that can be run in parallel 
(due to hardware limitations). Conversely, reducing the allocation may 
increase the number of teams that can be run in parallel. (I have not 
yet attempted to tune the default too precisely.) The actual maximum 
size will vary according to the available hardware and the number of 
variables that the compiler has placed in .shared space.
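
For example, to reserve 16 KiB of low-latency space per team (the value is
in bytes; the program name is hypothetical):

  GOMP_NVPTX_LOWLAT_POOL=16384 ./my-openmp-program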

The allocator implementation is designed to add no more space overhead
than omp_alloc already does (aside from rounding allocations up to a
multiple of 8 bytes), so the internal free and realloc must be told
how big the original allocation was. The free algorithm maintains an
in-order linked list of free memory chunks. Memory is allocated on a
first-fit basis.
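
Concretely, the bookkeeping looks like this (excerpted from the
config/nvptx/allocator.c code below):

  /* Free-chunk descriptor: a 16-bit offset/size pair packed into one
     32-bit word, so the heap root can be read and updated atomically.
     The offset is relative to the base of the pool.  */
  typedef union {
    uint32_t raw;
    struct {
      uint16_t offset;
      uint16_t size;
    } desc;
  } heapdesc;

  /* All requests are rounded up to the 8-byte allocation granularity.  */
  size = (size + 7) & ~7;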

If the allocation fails, the NVPTX allocator returns NULL and omp_alloc
handles the fall-back. Since this is now likely to happen (low-latency
memory is small), this patch also implements appropriate fall-back modes
for the predefined allocators (fall-back for custom allocators already
worked).
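
For example, the behaviour a program observes (this mirrors what the
allocators-1.c test below exercises; a sketch, not new API):

  /* A request too large for the default 8 KiB pool: the low-latency
     allocation fails internally and omp_alloc transparently falls back
     to default memory, so p is still usable (assuming global memory is
     not exhausted).  */
  int *p = (int *) omp_alloc (100000 * sizeof (int), omp_low_lat_mem_alloc);
  omp_free (p, omp_low_lat_mem_alloc);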

In order to support the %dynamic_smem_size PTX feature it is necessary to
bump the minimum supported PTX version from 3.1 (~2013) to 4.1 (~2014).
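
The runtime uses that feature to discover the per-launch pool size, as in
the config/nvptx/team.c hunk below:

  uint32_t shared_pool_size;
  asm ("mov.u32\t%0, %%dynamic_smem_size;" : "=r" (shared_pool_size));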

OK for stage 1?

Andrew
libgomp, nvptx: low-latency memory allocator

This patch adds support for allocating low-latency ".shared" memory on
NVPTX GPU devices, via omp_low_lat_mem_space and omp_alloc.  The memory
can be allocated, reallocated, and freed using a basic but fast algorithm;
the allocator is thread safe, and the size of the low-latency heap can be
configured using the GOMP_NVPTX_LOWLAT_POOL environment variable.

The use of the PTX dynamic_smem_size feature means that the minimum version
requirement is now bumped to 4.1 (still old at this point).

gcc/ChangeLog:

	* config/nvptx/nvptx.c (nvptx_file_start): Bump minimum PTX version
	to 4.1.

libgomp/ChangeLog:

	* allocator.c (MEMSPACE_ALLOC): New macro.
	(MEMSPACE_CALLOC): New macro.
	(MEMSPACE_REALLOC): New macro.
	(MEMSPACE_FREE): New macro.
	(predefined_alloc_mapping): New array.
	(omp_alloc): Use MEMSPACE_ALLOC.
	Implement fall-backs for predefined allocators.
	(omp_free): Use MEMSPACE_FREE.
	(omp_calloc): Use MEMSPACE_CALLOC.
	Implement fall-backs for predefined allocators.
	(omp_realloc): Use MEMSPACE_REALLOC.
	Implement fall-backs for predefined allocators.
	* config/nvptx/team.c (__nvptx_lowlat_heap_root): New variable.
	(__nvptx_lowlat_pool): New asm variable.
	(gomp_nvptx_main): Initialize the low-latency heap.
	* plugin/plugin-nvptx.c (lowlat_pool_size): New variable.
	(GOMP_OFFLOAD_init_device): Read the GOMP_NVPTX_LOWLAT_POOL envvar.
	(GOMP_OFFLOAD_run): Apply lowlat_pool_size.
	* config/nvptx/allocator.c: New file.
	* testsuite/libgomp.c/allocators-1.c: New test.
	* testsuite/libgomp.c/allocators-2.c: New test.
	* testsuite/libgomp.c/allocators-3.c: New test.
	* testsuite/libgomp.c/allocators-4.c: New test.
	* testsuite/libgomp.c/allocators-5.c: New test.
	* testsuite/libgomp.c/allocators-6.c: New test.
  

Comments

Tom de Vries Jan. 5, 2022, 11:08 a.m. UTC | #1
On 12/20/21 16:58, Andrew Stubbs wrote:
> This patch is submitted now for review, and so that I can commit a backport
> of it to the OG11 branch, but it isn't suitable for mainline until stage 1.
> 
> The patch implements support for omp_low_lat_mem_space and
> omp_low_lat_mem_alloc on NVPTX offload devices. The omp_pteam_mem_alloc,
> omp_cgroup_mem_alloc and omp_thread_mem_alloc allocators are also
> configured to use this space (this is to match the current or intended
> behaviour in other toolchains).
> 
> The memory is drawn from the ".shared" space that is accessible only 
> from within the team in which it is allocated, and which effectively 
> ceases to exist when the kernel exits.  By default, 8 KiB of space is 
> reserved for each team at launch time. This can be adjusted, at runtime, 
> via a new environment variable "GOMP_NVPTX_LOWLAT_POOL". Reserving a 
> larger amount may limit the number of teams that can be run in parallel 
> (due to hardware limitations). Conversely, reducing the allocation may 
> increase the number of teams that can be run in parallel. (I have not 
> yet attempted to tune the default too precisely.) The actual maximum 
> size will vary according to the available hardware and the number of 
> variables that the compiler has placed in .shared space.
> 
> The allocator implementation is designed to add no more space overhead
> than omp_alloc already does (aside from rounding allocations up to a
> multiple of 8 bytes), so the internal free and realloc must be told
> how big the original allocation was. The free algorithm maintains an
> in-order linked list of free memory chunks. Memory is allocated on a
> first-fit basis.
> 
> If the allocation fails, the NVPTX allocator returns NULL and omp_alloc
> handles the fall-back. Since this is now likely to happen (low-latency
> memory is small), this patch also implements appropriate fall-back modes
> for the predefined allocators (fall-back for custom allocators already
> worked).
> 
> In order to support the %dynamic_smem_size PTX feature it is necessary
> to bump the minimum supported PTX version from 3.1 (~2013) to 4.1 (~2014).

I applied the patch (but used the libgomp/configure.tgt patch to force 
-mptx=4.1, rather than changing the default).

I ran into the following (using export GOMP_NVPTX_JIT=-O0 to work around 
known driver problems), and observed these extra FAILs:
...
FAIL: libgomp.c/../libgomp.c-c++-common/alloc-7.c execution test
FAIL: libgomp.c/../libgomp.c-c++-common/alloc-8.c execution test
FAIL: libgomp.c/allocators-1.c (test for excess errors)
FAIL: libgomp.c/allocators-2.c (test for excess errors)
FAIL: libgomp.c/allocators-3.c (test for excess errors)
FAIL: libgomp.c/allocators-4.c (test for excess errors)
FAIL: libgomp.c/allocators-5.c (test for excess errors)
FAIL: libgomp.c/allocators-6.c (test for excess errors)
FAIL: libgomp.c++/../libgomp.c-c++-common/alloc-7.c execution test
FAIL: libgomp.c++/../libgomp.c-c++-common/alloc-8.c execution test
FAIL: libgomp.fortran/alloc-10.f90   -O  execution test
FAIL: libgomp.fortran/alloc-9.f90   -O  execution test
...

The allocators-1.c test-case doesn't compile because:
...
FAIL: libgomp.c/allocators-1.c (test for excess errors)
Excess errors:
/home/vries/oacc/trunk/source-gcc/libgomp/testsuite/libgomp.c/allocators-1.c:7:22: 
sorry, unimplemented: 'dynamic_allocators' clause on 'requires' directive not supported yet

UNRESOLVED: libgomp.c/allocators-1.c compilation failed to produce 
executable
...

So, I suppose I need "[PATCH] OpenMP front-end: allow requires 
dynamic_allocators" as well, I'll try again with that applied.

The alloc-7.c execution test failure is a regression, AFAICT.  It fails 
here:
...
38        if ((((uintptr_t) p) % __alignof (int)) != 0 || p[0] || p[1] || p[2])
39          abort ();
...
because:
...
(gdb) p p[0]
$2 = 772014104
(gdb) p p[1]
$3 = 0
(gdb) p p[2]
$4 = 9
...

In other words, the pointer returned by omp_calloc does not point to 
zeroed out memory.

Thanks,
- Tom
  
Tom de Vries Jan. 5, 2022, 1:04 p.m. UTC | #2
On 1/5/22 12:08, Tom de Vries wrote:
> The allocators-1.c test-case doesn't compile because:
> ...
> FAIL: libgomp.c/allocators-1.c (test for excess errors)
> Excess errors:
> /home/vries/oacc/trunk/source-gcc/libgomp/testsuite/libgomp.c/allocators-1.c:7:22: 
> sorry, unimplemented: 'dynamic_allocators' clause on 'requires' directive
> not supported yet
> 
> UNRESOLVED: libgomp.c/allocators-1.c compilation failed to produce 
> executable
> ...
> 
> So, I suppose I need "[PATCH] OpenMP front-end: allow requires 
> dynamic_allocators" as well, I'll try again with that applied.

After applying that, I get:
...
WARNING: program timed out.
FAIL: libgomp.c/allocators-2.c execution test
WARNING: program timed out.
FAIL: libgomp.c/allocators-3.c execution test
...

Thanks,
- Tom
  
Andrew Stubbs Jan. 5, 2022, 2:21 p.m. UTC | #3
On 05/01/2022 11:08, Tom de Vries wrote:
> The alloc-7.c execution test failure is a regression, AFAICT.  It fails 
> here:
> ...
> 38        if ((((uintptr_t) p) % __alignof (int)) != 0 || p[0] || p[1] || p[2])
> 39          abort ();
> ...
> because:
> ...
> (gdb) p p[0]
> $2 = 772014104
> (gdb) p p[1]
> $3 = 0
> (gdb) p p[2]
> $4 = 9
> ...
> 
> In other words, the pointer returned by omp_calloc does not point to 
> zeroed out memory.

The version that was applied to OG11 had this bug fixed, but I didn't 
get around to posting the update because Christmas got in the way and 
it's gone out of my mind, sorry.

The attached patch has the fix and also removes the hunk related to the 
PTX update.

Andrew
libgomp, nvptx: low-latency memory allocator

This patch adds support for allocating low-latency ".shared" memory on
NVPTX GPU devices, via omp_low_lat_mem_space and omp_alloc.  The memory
can be allocated, reallocated, and freed using a basic but fast algorithm;
the allocator is thread safe, and the size of the low-latency heap can be
configured using the GOMP_NVPTX_LOWLAT_POOL environment variable.

The use of the PTX dynamic_smem_size feature means that the minimum version
requirement is now bumped to 4.1 (still old at this point).

libgomp/ChangeLog:

	* allocator.c (MEMSPACE_ALLOC): New macro.
	(MEMSPACE_CALLOC): New macro.
	(MEMSPACE_REALLOC): New macro.
	(MEMSPACE_FREE): New macro.
	(predefined_alloc_mapping): New array.
	(omp_alloc): Use MEMSPACE_ALLOC.
	Implement fall-backs for predefined allocators.
	(omp_free): Use MEMSPACE_FREE.
	(omp_calloc): Use MEMSPACE_CALLOC.
	Implement fall-backs for predefined allocators.
	(omp_realloc): Use MEMSPACE_REALLOC.
	Implement fall-backs for predefined allocators.
	* config/nvptx/team.c (__nvptx_lowlat_heap_root): New variable.
	(__nvptx_lowlat_pool): New asm variable.
	(gomp_nvptx_main): Initialize the low-latency heap.
	* plugin/plugin-nvptx.c (lowlat_pool_size): New variable.
	(GOMP_OFFLOAD_init_device): Read the GOMP_NVPTX_LOWLAT_POOL envvar.
	(GOMP_OFFLOAD_run): Apply lowlat_pool_size.
	* config/nvptx/allocator.c: New file.
	* testsuite/libgomp.c/allocators-1.c: New test.
	* testsuite/libgomp.c/allocators-2.c: New test.
	* testsuite/libgomp.c/allocators-3.c: New test.
	* testsuite/libgomp.c/allocators-4.c: New test.
	* testsuite/libgomp.c/allocators-5.c: New test.
	* testsuite/libgomp.c/allocators-6.c: New test.

diff --git a/libgomp/allocator.c b/libgomp/allocator.c
index 07a5645f4cc..b1f5fe0a5e2 100644
--- a/libgomp/allocator.c
+++ b/libgomp/allocator.c
@@ -34,6 +34,38 @@
 
 #define omp_max_predefined_alloc omp_thread_mem_alloc
 
+/* These macros may be overridden in config/<target>/allocator.c.  */
+#ifndef MEMSPACE_ALLOC
+#define MEMSPACE_ALLOC(MEMSPACE, SIZE) \
+  ((void)MEMSPACE, malloc (SIZE))
+#endif
+#ifndef MEMSPACE_CALLOC
+#define MEMSPACE_CALLOC(MEMSPACE, SIZE) \
+  ((void)MEMSPACE, calloc (1, SIZE))
+#endif
+#ifndef MEMSPACE_REALLOC
+#define MEMSPACE_REALLOC(MEMSPACE, ADDR, OLDSIZE, SIZE) \
+  ((void)MEMSPACE, (void)OLDSIZE, realloc (ADDR, SIZE))
+#endif
+#ifndef MEMSPACE_FREE
+#define MEMSPACE_FREE(MEMSPACE, ADDR, SIZE) \
+  ((void)MEMSPACE, (void)SIZE, free (ADDR))
+#endif
+
+/* Map the predefined allocators to the correct memory space.
+   The index to this table is the omp_allocator_handle_t enum value.  */
+static const omp_memspace_handle_t predefined_alloc_mapping[] = {
+  omp_default_mem_space,   /* omp_null_allocator. */
+  omp_default_mem_space,   /* omp_default_mem_alloc. */
+  omp_large_cap_mem_space, /* omp_large_cap_mem_alloc. */
+  omp_default_mem_space,   /* omp_const_mem_alloc. */
+  omp_high_bw_mem_space,   /* omp_high_bw_mem_alloc. */
+  omp_low_lat_mem_space,   /* omp_low_lat_mem_alloc. */
+  omp_low_lat_mem_space,   /* omp_cgroup_mem_alloc. */
+  omp_low_lat_mem_space,   /* omp_pteam_mem_alloc. */
+  omp_low_lat_mem_space,   /* omp_thread_mem_alloc. */
+};
+
 struct omp_allocator_data
 {
   omp_memspace_handle_t memspace;
@@ -281,7 +313,7 @@ retry:
       allocator_data->used_pool_size = used_pool_size;
       gomp_mutex_unlock (&allocator_data->lock);
 #endif
-      ptr = malloc (new_size);
+      ptr = MEMSPACE_ALLOC (allocator_data->memspace, new_size);
       if (ptr == NULL)
 	{
 #ifdef HAVE_SYNC_BUILTINS
@@ -297,7 +329,10 @@ retry:
     }
   else
     {
-      ptr = malloc (new_size);
+      omp_memspace_handle_t memspace = (allocator_data
+					? allocator_data->memspace
+					: predefined_alloc_mapping[allocator]);
+      ptr = MEMSPACE_ALLOC (memspace, new_size);
       if (ptr == NULL)
 	goto fail;
     }
@@ -315,32 +350,35 @@ retry:
   return ret;
 
 fail:
-  if (allocator_data)
+  int fallback = (allocator_data
+		  ? allocator_data->fallback
+		  : allocator == omp_default_mem_alloc
+		  ? omp_atv_null_fb
+		  : omp_atv_default_mem_fb);
+  switch (fallback)
     {
-      switch (allocator_data->fallback)
+    case omp_atv_default_mem_fb:
+      if ((new_alignment > sizeof (void *) && new_alignment > alignment)
+	  || (allocator_data
+	      && allocator_data->pool_size < ~(uintptr_t) 0)
+	  || !allocator_data)
 	{
-	case omp_atv_default_mem_fb:
-	  if ((new_alignment > sizeof (void *) && new_alignment > alignment)
-	      || (allocator_data
-		  && allocator_data->pool_size < ~(uintptr_t) 0))
-	    {
-	      allocator = omp_default_mem_alloc;
-	      goto retry;
-	    }
-	  /* Otherwise, we've already performed default mem allocation
-	     and if that failed, it won't succeed again (unless it was
-	     intermittent.  Return NULL then, as that is the fallback.  */
-	  break;
-	case omp_atv_null_fb:
-	  break;
-	default:
-	case omp_atv_abort_fb:
-	  gomp_fatal ("Out of memory allocating %lu bytes",
-		      (unsigned long) size);
-	case omp_atv_allocator_fb:
-	  allocator = allocator_data->fb_data;
+	  allocator = omp_default_mem_alloc;
 	  goto retry;
 	}
+      /* Otherwise, we've already performed default mem allocation
+	 and if that failed, it won't succeed again (unless it was
+	 intermittent.  Return NULL then, as that is the fallback.  */
+      break;
+    case omp_atv_null_fb:
+      break;
+    default:
+    case omp_atv_abort_fb:
+      gomp_fatal ("Out of memory allocating %lu bytes",
+		  (unsigned long) size);
+    case omp_atv_allocator_fb:
+      allocator = allocator_data->fb_data;
+      goto retry;
     }
   return NULL;
 }
@@ -373,6 +411,7 @@ void
 omp_free (void *ptr, omp_allocator_handle_t allocator)
 {
   struct omp_mem_header *data;
+  omp_memspace_handle_t memspace = omp_default_mem_space;
 
   if (ptr == NULL)
     return;
@@ -393,8 +432,13 @@ omp_free (void *ptr, omp_allocator_handle_t allocator)
 	  gomp_mutex_unlock (&allocator_data->lock);
 #endif
 	}
+
+      memspace = allocator_data->memspace;
     }
-  free (data->ptr);
+  else
+    memspace = predefined_alloc_mapping[data->allocator];
+
+  MEMSPACE_FREE (memspace, data->ptr, data->size);
 }
 
 ialias (omp_free)
@@ -482,7 +526,7 @@ retry:
       allocator_data->used_pool_size = used_pool_size;
       gomp_mutex_unlock (&allocator_data->lock);
 #endif
-      ptr = calloc (1, new_size);
+      ptr = MEMSPACE_CALLOC (allocator_data->memspace, new_size);
       if (ptr == NULL)
 	{
 #ifdef HAVE_SYNC_BUILTINS
@@ -498,7 +542,10 @@ retry:
     }
   else
     {
-      ptr = calloc (1, new_size);
+      omp_memspace_handle_t memspace = (allocator_data
+					? allocator_data->memspace
+					: predefined_alloc_mapping[allocator]);
+      ptr = MEMSPACE_CALLOC (memspace, new_size);
       if (ptr == NULL)
 	goto fail;
     }
@@ -516,32 +563,35 @@ retry:
   return ret;
 
 fail:
-  if (allocator_data)
+  int fallback = (allocator_data
+		  ? allocator_data->fallback
+		  : allocator == omp_default_mem_alloc
+		  ? omp_atv_null_fb
+		  : omp_atv_default_mem_fb);
+  switch (fallback)
     {
-      switch (allocator_data->fallback)
+    case omp_atv_default_mem_fb:
+      if ((new_alignment > sizeof (void *) && new_alignment > alignment)
+	  || (allocator_data
+	      && allocator_data->pool_size < ~(uintptr_t) 0)
+	  || !allocator_data)
 	{
-	case omp_atv_default_mem_fb:
-	  if ((new_alignment > sizeof (void *) && new_alignment > alignment)
-	      || (allocator_data
-		  && allocator_data->pool_size < ~(uintptr_t) 0))
-	    {
-	      allocator = omp_default_mem_alloc;
-	      goto retry;
-	    }
-	  /* Otherwise, we've already performed default mem allocation
-	     and if that failed, it won't succeed again (unless it was
-	     intermittent.  Return NULL then, as that is the fallback.  */
-	  break;
-	case omp_atv_null_fb:
-	  break;
-	default:
-	case omp_atv_abort_fb:
-	  gomp_fatal ("Out of memory allocating %lu bytes",
-		      (unsigned long) (size * nmemb));
-	case omp_atv_allocator_fb:
-	  allocator = allocator_data->fb_data;
+	  allocator = omp_default_mem_alloc;
 	  goto retry;
 	}
+      /* Otherwise, we've already performed default mem allocation
+	 and if that failed, it won't succeed again (unless it was
+	 intermittent.  Return NULL then, as that is the fallback.  */
+      break;
+    case omp_atv_null_fb:
+      break;
+    default:
+    case omp_atv_abort_fb:
+      gomp_fatal ("Out of memory allocating %lu bytes",
+		  (unsigned long) (size * nmemb));
+    case omp_atv_allocator_fb:
+      allocator = allocator_data->fb_data;
+      goto retry;
     }
   return NULL;
 }
@@ -660,7 +710,8 @@ retry:
       gomp_mutex_unlock (&allocator_data->lock);
 #endif
       if (prev_size)
-	new_ptr = realloc (data->ptr, new_size);
+	new_ptr = MEMSPACE_REALLOC (allocator_data->memspace, data->ptr,
+				    data->size, new_size);
       else
 	new_ptr = malloc (new_size);
       if (new_ptr == NULL)
@@ -690,7 +741,10 @@ retry:
 	   && (free_allocator_data == NULL
 	       || free_allocator_data->pool_size == ~(uintptr_t) 0))
     {
-      new_ptr = realloc (data->ptr, new_size);
+      omp_memspace_handle_t memspace = (allocator_data
+					? allocator_data->memspace
+					: predefined_alloc_mapping[allocator]);
+      new_ptr = MEMSPACE_REALLOC (memspace, data->ptr, data->size, new_size);
       if (new_ptr == NULL)
 	goto fail;
       ret = (char *) new_ptr + sizeof (struct omp_mem_header);
@@ -735,32 +789,35 @@ retry:
   return ret;
 
 fail:
-  if (allocator_data)
+  int fallback = (allocator_data
+		  ? allocator_data->fallback
+		  : allocator == omp_default_mem_alloc
+		  ? omp_atv_null_fb
+		  : omp_atv_default_mem_fb);
+  switch (fallback)
     {
-      switch (allocator_data->fallback)
+    case omp_atv_default_mem_fb:
+      if (new_alignment > sizeof (void *)
+	  || (allocator_data
+	      && allocator_data->pool_size < ~(uintptr_t) 0)
+	  || !allocator_data)
 	{
-	case omp_atv_default_mem_fb:
-	  if (new_alignment > sizeof (void *)
-	      || (allocator_data
-		  && allocator_data->pool_size < ~(uintptr_t) 0))
-	    {
-	      allocator = omp_default_mem_alloc;
-	      goto retry;
-	    }
-	  /* Otherwise, we've already performed default mem allocation
-	     and if that failed, it won't succeed again (unless it was
-	     intermittent.  Return NULL then, as that is the fallback.  */
-	  break;
-	case omp_atv_null_fb:
-	  break;
-	default:
-	case omp_atv_abort_fb:
-	  gomp_fatal ("Out of memory allocating %lu bytes",
-		      (unsigned long) size);
-	case omp_atv_allocator_fb:
-	  allocator = allocator_data->fb_data;
+	  allocator = omp_default_mem_alloc;
 	  goto retry;
 	}
+      /* Otherwise, we've already performed default mem allocation
+	 and if that failed, it won't succeed again (unless it was
+	 intermittent.  Return NULL then, as that is the fallback.  */
+      break;
+    case omp_atv_null_fb:
+      break;
+    default:
+    case omp_atv_abort_fb:
+      gomp_fatal ("Out of memory allocating %lu bytes",
+		  (unsigned long) size);
+    case omp_atv_allocator_fb:
+      allocator = allocator_data->fb_data;
+      goto retry;
     }
   return NULL;
 }
diff --git a/libgomp/config/nvptx/allocator.c b/libgomp/config/nvptx/allocator.c
new file mode 100644
index 00000000000..6bc2ea48043
--- /dev/null
+++ b/libgomp/config/nvptx/allocator.c
@@ -0,0 +1,370 @@
+/* Copyright (C) 2021 Free Software Foundation, Inc.
+
+   This file is part of the GNU Offloading and Multi Processing Library
+   (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* The low-latency allocators use space reserved in .shared memory when the
+   kernel is launched.  The heap is initialized in gomp_nvptx_main and all
+   allocations are forgotten when the kernel exits.  Allocations to other
+   memory spaces all use the system malloc syscall.
+
+   The root heap descriptor is stored elsewhere in shared memory, and each
+   free chunk contains a similar descriptor for the next free chunk in the
+   chain.
+
+   The descriptor is two 16-bit values: offset and size, which describe the
+   location of a chunk of memory available for allocation. The offset is
+   relative to the base of the heap.  The special value 0xffff, 0xffff
+   indicates that the heap is locked.  The descriptor is encoded into a
+   single 32-bit integer so that it may be easily accessed atomically.
+
+   Memory is allocated to the first free chunk that fits.  The free chain
+   is always stored in order of the offset to assist coalescing adjacent
+   chunks.  */
+
+#include "libgomp.h"
+#include <stdlib.h>
+
+/* There should be some .shared space reserved for us.  There's no way to
+   express this magic extern sizeless array in C so use asm.  */
+asm (".extern .shared .u8 __nvptx_lowlat_pool[];\n");
+
+extern uint32_t __nvptx_lowlat_heap_root __attribute__((shared,nocommon));
+
+typedef union {
+  uint32_t raw;
+  struct {
+    uint16_t offset;
+    uint16_t size;
+  } desc;
+} heapdesc;
+
+static void *
+nvptx_memspace_alloc (omp_memspace_handle_t memspace, size_t size)
+{
+  if (memspace == omp_low_lat_mem_space)
+    {
+      char *shared_pool;
+      asm ("cvta.shared.u64\t%0, __nvptx_lowlat_pool;" : "=r"(shared_pool));
+
+      /* Memory is allocated in 8-byte granularity.  */
+      size = (size + 7) & ~7;
+
+      /* Acquire a lock on the low-latency heap.  */
+      heapdesc root;
+      do
+	{
+	  root.raw = __atomic_exchange_n (&__nvptx_lowlat_heap_root,
+					  0xffffffff, MEMMODEL_ACQUIRE);
+	  if (root.raw != 0xffffffff)
+	    break;
+	  /* Spin.  */
+	}
+      while (1);
+
+      /* Walk the free chain.  */
+      heapdesc chunk = {root.raw};
+      uint32_t *prev_chunkptr = NULL;
+      uint32_t *chunkptr = (uint32_t*)(shared_pool + chunk.desc.offset);
+      heapdesc onward_chain = {chunkptr[0]};
+      while (chunk.desc.size != 0 && (uint32_t)size > chunk.desc.size)
+	{
+	  chunk.raw = onward_chain.raw;
+	  prev_chunkptr = chunkptr;
+	  chunkptr = (uint32_t*)(shared_pool + chunk.desc.offset);
+	  onward_chain.raw = chunkptr[0];
+	}
+
+      void *result = NULL;
+      if (chunk.desc.size != 0)
+	{
+	  /* Allocation successful.  */
+	  result = chunkptr;
+
+	  /* Update the free chain.  */
+	  heapdesc stillfree = {chunk.raw};
+	  stillfree.desc.offset += size;
+	  stillfree.desc.size -= size;
+	  uint32_t *stillfreeptr = (uint32_t*)(shared_pool
+					       + stillfree.desc.offset);
+
+	  if (stillfree.desc.size == 0)
+	    /* The whole chunk was used.  */
+	    stillfree.raw = onward_chain.raw;
+	  else
+	    /* The chunk was split, so restore the onward chain.  */
+	    stillfreeptr[0] = onward_chain.raw;
+
+	  /* The previous free slot or root now points to stillfree.  */
+	  if (prev_chunkptr)
+	    prev_chunkptr[0] = stillfree.raw;
+	  else
+	    root.raw = stillfree.raw;
+	}
+
+      /* Update the free chain root and release the lock.  */
+      __atomic_store_n (&__nvptx_lowlat_heap_root, root.raw, MEMMODEL_RELEASE);
+      return result;
+    }
+  else
+    return malloc (size);
+}
+
+static void *
+nvptx_memspace_calloc (omp_memspace_handle_t memspace, size_t size)
+{
+  if (memspace == omp_low_lat_mem_space)
+    {
+      /* Memory is allocated in 8-byte granularity.  */
+      size = (size + 7) & ~7;
+
+      uint64_t *result = nvptx_memspace_alloc (memspace, size);
+      if (result)
+	/* Inline memset in which we know size is a multiple of 8.  */
+	for (unsigned i = 0; i < (unsigned)size/8; i++)
+	  result[i] = 0;
+
+      return result;
+    }
+  else
+    return calloc (1, size);
+}
+
+static void
+nvptx_memspace_free (omp_memspace_handle_t memspace, void *addr, size_t size)
+{
+  if (memspace == omp_low_lat_mem_space)
+    {
+      char *shared_pool;
+      asm ("cvta.shared.u64\t%0, __nvptx_lowlat_pool;" : "=r"(shared_pool));
+
+      /* Memory is allocated in 8-byte granularity.  */
+      size = (size + 7) & ~7;
+
+      /* Acquire a lock on the low-latency heap.  */
+      heapdesc root;
+      do
+	{
+	  root.raw = __atomic_exchange_n (&__nvptx_lowlat_heap_root,
+					  0xffffffff, MEMMODEL_ACQUIRE);
+	  if (root.raw != 0xffffffff)
+	    break;
+	  /* Spin.  */
+	}
+      while (1);
+
+      /* Walk the free chain to find where to insert a new entry.  */
+      heapdesc chunk = {root.raw}, prev_chunk;
+      uint32_t *prev_chunkptr = NULL, *prevprev_chunkptr = NULL;
+      uint32_t *chunkptr = (uint32_t*)(shared_pool + chunk.desc.offset);
+      heapdesc onward_chain = {chunkptr[0]};
+      while (chunk.desc.size != 0 && addr > (void*)chunkptr)
+	{
+	  prev_chunk.raw = chunk.raw;
+	  chunk.raw = onward_chain.raw;
+	  prevprev_chunkptr = prev_chunkptr;
+	  prev_chunkptr = chunkptr;
+	  chunkptr = (uint32_t*)(shared_pool + chunk.desc.offset);
+	  onward_chain.raw = chunkptr[0];
+	}
+
+      /* Create the new chunk descriptor.  */
+      heapdesc newfreechunk;
+      newfreechunk.desc.offset = (uint16_t)((uintptr_t)addr
+					    - (uintptr_t)shared_pool);
+      newfreechunk.desc.size = (uint16_t)size;
+
+      /* Coalesce adjacent free chunks.  */
+      if (newfreechunk.desc.offset + size == chunk.desc.offset)
+	{
+	  /* Free chunk follows.  */
+	  newfreechunk.desc.size += chunk.desc.size;
+	  chunk.raw = onward_chain.raw;
+	}
+      if (prev_chunkptr)
+	{
+	  if (prev_chunk.desc.offset + prev_chunk.desc.size
+	      == newfreechunk.desc.offset)
+	    {
+	      /* Free chunk precedes.  */
+	      newfreechunk.desc.offset = prev_chunk.desc.offset;
+	      newfreechunk.desc.size += prev_chunk.desc.size;
+	      addr = shared_pool + prev_chunk.desc.offset;
+	      prev_chunkptr = prevprev_chunkptr;
+	    }
+	}
+
+      /* Update the free chain in the new and previous chunks.  */
+      ((uint32_t*)addr)[0] = chunk.raw;
+      if (prev_chunkptr)
+	prev_chunkptr[0] = newfreechunk.raw;
+      else
+	root.raw = newfreechunk.raw;
+
+      /* Update the free chain root and release the lock.  */
+      __atomic_store_n (&__nvptx_lowlat_heap_root, root.raw, MEMMODEL_RELEASE);
+    }
+  else
+    free (addr);
+}
+
+static void *
+nvptx_memspace_realloc (omp_memspace_handle_t memspace, void *addr,
+			size_t oldsize, size_t size)
+{
+  if (memspace == omp_low_lat_mem_space)
+    {
+      char *shared_pool;
+      asm ("cvta.shared.u64\t%0, __nvptx_lowlat_pool;" : "=r"(shared_pool));
+
+      /* Memory is allocated in 8-byte granularity.  */
+      oldsize = (oldsize + 7) & ~7;
+      size = (size + 7) & ~7;
+
+      if (oldsize == size)
+	return addr;
+
+      /* Acquire a lock on the low-latency heap.  */
+      heapdesc root;
+      do
+	{
+	  root.raw = __atomic_exchange_n (&__nvptx_lowlat_heap_root,
+					  0xffffffff, MEMMODEL_ACQUIRE);
+	  if (root.raw != 0xffffffff)
+	    break;
+	  /* Spin.  */
+	}
+      while (1);
+
+      /* Walk the free chain.  */
+      heapdesc chunk = {root.raw};
+      uint32_t *prev_chunkptr = NULL;
+      uint32_t *chunkptr = (uint32_t*)(shared_pool + chunk.desc.offset);
+      heapdesc onward_chain = {chunkptr[0]};
+      while (chunk.desc.size != 0 && (void*)chunkptr < addr)
+	{
+	  chunk.raw = onward_chain.raw;
+	  prev_chunkptr = chunkptr;
+	  chunkptr = (uint32_t*)(shared_pool + chunk.desc.offset);
+	  onward_chain.raw = chunkptr[0];
+	}
+
+      void *result = NULL;
+      if (size < oldsize)
+	{
+	  /* The new allocation is smaller than the old; we can always
+	     shrink an allocation in place.  */
+	  result = addr;
+
+	  uint32_t *nowfreeptr = (uint32_t*)(addr + size);
+
+	  /* Update the free chain.  */
+	  heapdesc nowfree;
+	  nowfree.desc.offset = (char*)nowfreeptr - shared_pool;
+	  nowfree.desc.size = oldsize - size;
+
+	  if (nowfree.desc.offset + nowfree.desc.size == chunk.desc.offset)
+	    {
+	      /* Coalesce following free chunk.  */
+	      nowfree.desc.size += chunk.desc.size;
+	      nowfreeptr[0] = onward_chain.raw;
+	    }
+	  else
+	    nowfreeptr[0] = chunk.raw;
+
+	  /* The previous free slot or root now points to nowfree.  */
+	  if (prev_chunkptr)
+	    prev_chunkptr[0] = nowfree.raw;
+	  else
+	    root.raw = nowfree.raw;
+	}
+      else if (chunk.desc.size != 0
+	       && (char *)addr + oldsize == (char *)chunkptr
+	       && chunk.desc.size >= size-oldsize)
+	{
+	  /* The new allocation is larger than the old, and we found a
+	     large enough free block right after the existing block,
+	     so we extend into that space.  */
+	  result = addr;
+
+	  uint16_t delta = size-oldsize;
+
+	  /* Update the free chain.  */
+	  heapdesc stillfree = {chunk.raw};
+	  stillfree.desc.offset += delta;
+	  stillfree.desc.size -= delta;
+	  uint32_t *stillfreeptr = (uint32_t*)(shared_pool
+					       + stillfree.desc.offset);
+
+	  if (stillfree.desc.size == 0)
+	    /* The whole chunk was used.  */
+	    stillfree.raw = onward_chain.raw;
+	  else
+	    /* The chunk was split, so restore the onward chain.  */
+	    stillfreeptr[0] = onward_chain.raw;
+
+	  /* The previous free slot or root now points to stillfree.  */
+	  if (prev_chunkptr)
+	    prev_chunkptr[0] = stillfree.raw;
+	  else
+	    root.raw = stillfree.raw;
+	}
+      /* Else realloc in-place has failed and result remains NULL.  */
+
+      /* Update the free chain root and release the lock.  */
+      __atomic_store_n (&__nvptx_lowlat_heap_root, root.raw, MEMMODEL_RELEASE);
+
+      if (result == NULL)
+	{
+	  /* The allocation could not be extended in place, so we simply
+	     allocate fresh memory and move the data.  If we can't allocate
+	     from low-latency memory then we leave the original allocation
+	     intact and return NULL.
+	     We could do a fall-back to main memory, but we don't know what
+	     the fall-back trait said to do.  */
+	  result = nvptx_memspace_alloc (memspace, size);
+	  if (result != NULL)
+	    {
+	      /* Inline memcpy in which we know oldsize is a multiple of 8.  */
+	      uint64_t *from = addr, *to = result;
+	      for (unsigned i = 0; i < (unsigned)oldsize/8; i++)
+		to[i] = from[i];
+
+	      nvptx_memspace_free (memspace, addr, oldsize);
+	    }
+	}
+      return result;
+    }
+  else
+    return realloc (addr, size);
+}
+
+#define MEMSPACE_ALLOC(MEMSPACE, SIZE) \
+  nvptx_memspace_alloc (MEMSPACE, SIZE)
+#define MEMSPACE_CALLOC(MEMSPACE, SIZE) \
+  nvptx_memspace_calloc (MEMSPACE, SIZE)
+#define MEMSPACE_REALLOC(MEMSPACE, ADDR, OLDSIZE, SIZE) \
+  nvptx_memspace_realloc (MEMSPACE, ADDR, OLDSIZE, SIZE)
+#define MEMSPACE_FREE(MEMSPACE, ADDR, SIZE) \
+  nvptx_memspace_free (MEMSPACE, ADDR, SIZE)
+
+#include "../../allocator.c"
diff --git a/libgomp/config/nvptx/team.c b/libgomp/config/nvptx/team.c
index 6923416fb4e..c7b2c70dfa6 100644
--- a/libgomp/config/nvptx/team.c
+++ b/libgomp/config/nvptx/team.c
@@ -33,9 +33,13 @@
 
 struct gomp_thread *nvptx_thrs __attribute__((shared,nocommon));
 int __gomp_team_num __attribute__((shared,nocommon));
+uint32_t __nvptx_lowlat_heap_root __attribute__((shared,nocommon));
 
 static void gomp_thread_start (struct gomp_thread_pool *);
 
+/* There should be some .shared space reserved for us.  There's no way to
+   express this magic extern sizeless array in C so use asm.  */
+asm (".extern .shared .u8 __nvptx_lowlat_pool[];\n");
 
 /* This externally visible function handles target region entry.  It
    sets up a per-team thread pool and transfers control by calling FN (FN_DATA)
@@ -63,6 +67,27 @@ gomp_nvptx_main (void (*fn) (void *), void *fn_data)
       nvptx_thrs = alloca (ntids * sizeof (*nvptx_thrs));
       memset (nvptx_thrs, 0, ntids * sizeof (*nvptx_thrs));
 
+      /* Find the low-latency heap details ....  */
+      uint32_t *shared_pool;
+      uint32_t shared_pool_size;
+      asm ("cvta.shared.u64\t%0, __nvptx_lowlat_pool;" : "=r"(shared_pool));
+      asm ("mov.u32\t%0, %%dynamic_smem_size;\n"
+	   : "=r"(shared_pool_size));
+
+      /* ... and initialize it with an empty free-chain.  */
+      union {
+	uint32_t raw;
+	struct {
+	  uint16_t offset;
+	  uint16_t size;
+	} desc;
+      } root;
+      root.desc.offset = 0;		 /* The first byte is free.  */
+      root.desc.size = shared_pool_size; /* The whole space is free.  */
+      __nvptx_lowlat_heap_root = root.raw;
+      shared_pool[0] = 0;		 /* Terminate free chain.  */
+
+      /* Initialize the thread pool.  */
       struct gomp_thread_pool *pool = alloca (sizeof (*pool));
       pool->threads = alloca (ntids * sizeof (*pool->threads));
       for (tid = 0; tid < ntids; tid++)
diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index b4f0a84d77a..1b9a5e95c07 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -330,6 +330,11 @@ struct ptx_device
 
 static struct ptx_device **ptx_devices;
 
+/* OpenMP kernels reserve a small amount of ".shared" space for use by
+   omp_alloc.  The size is configured using GOMP_NVPTX_LOWLAT_POOL, but the
+   default is set here.  */
+static unsigned lowlat_pool_size = 8*1024;
+
 static inline struct nvptx_thread *
 nvptx_thread (void)
 {
@@ -1196,6 +1201,22 @@ GOMP_OFFLOAD_init_device (int n)
       instantiated_devices++;
     }
 
+  const char *var_name = "GOMP_NVPTX_LOWLAT_POOL";
+  const char *env_var = secure_getenv (var_name);
+  notify_var (var_name, env_var);
+
+  if (env_var != NULL)
+    {
+      char *endptr;
+      unsigned long val = strtoul (env_var, &endptr, 10);
+      if (endptr == NULL || *endptr != '\0'
+	  || errno == ERANGE || errno == EINVAL
+	  || val > UINT_MAX)
+	GOMP_PLUGIN_error ("Error parsing %s", var_name);
+      else
+	lowlat_pool_size = val;
+    }
+
   pthread_mutex_unlock (&ptx_dev_lock);
 
   return dev != NULL;
@@ -2021,7 +2042,7 @@ GOMP_OFFLOAD_run (int ord, void *tgt_fn, void *tgt_vars, void **args)
 		     " [(teams: %u), 1, 1] [(lanes: 32), (threads: %u), 1]\n",
 		     __FUNCTION__, fn_name, teams, threads);
   r = CUDA_CALL_NOCHECK (cuLaunchKernel, function, teams, 1, 1,
-			 32, threads, 1, 0, NULL, NULL, config);
+			 32, threads, 1, lowlat_pool_size, NULL, NULL, config);
   if (r != CUDA_SUCCESS)
     GOMP_PLUGIN_fatal ("cuLaunchKernel error: %s", cuda_error (r));
 
diff --git a/libgomp/testsuite/libgomp.c/allocators-1.c b/libgomp/testsuite/libgomp.c/allocators-1.c
new file mode 100644
index 00000000000..04968e4c83d
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/allocators-1.c
@@ -0,0 +1,56 @@
+/* { dg-do run } */
+
+/* Test that omp_alloc returns usable memory.  */
+
+#include <omp.h>
+
+#pragma omp requires dynamic_allocators
+
+void
+test (int n, omp_allocator_handle_t allocator)
+{
+  #pragma omp target map(to:n) map(to:allocator)
+  {
+    int *a;
+    a = (int *) omp_alloc(n*sizeof(int), allocator);
+
+    #pragma omp parallel
+    for (int i = 0; i < n; i++)
+      a[i] = i;
+
+    for (int i = 0; i < n; i++)
+      if (a[i] != i)
+	{
+	  __builtin_printf ("data mismatch at %i\n", i);
+	  __builtin_abort ();
+	}
+
+    omp_free(a, allocator);
+  }
+}
+
+int
+main ()
+{
+  // Smaller than low-latency memory limit
+  test (10, omp_default_mem_alloc);
+  test (10, omp_large_cap_mem_alloc);
+  test (10, omp_const_mem_alloc);
+  test (10, omp_high_bw_mem_alloc);
+  test (10, omp_low_lat_mem_alloc);
+  test (10, omp_cgroup_mem_alloc);
+  test (10, omp_pteam_mem_alloc);
+  test (10, omp_thread_mem_alloc);
+
+  // Larger than low-latency memory limit
+  test (100000, omp_default_mem_alloc);
+  test (100000, omp_large_cap_mem_alloc);
+  test (100000, omp_const_mem_alloc);
+  test (100000, omp_high_bw_mem_alloc);
+  test (100000, omp_low_lat_mem_alloc);
+  test (100000, omp_cgroup_mem_alloc);
+  test (100000, omp_pteam_mem_alloc);
+  test (100000, omp_thread_mem_alloc);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c/allocators-2.c b/libgomp/testsuite/libgomp.c/allocators-2.c
new file mode 100644
index 00000000000..a98f1b4c05e
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/allocators-2.c
@@ -0,0 +1,64 @@
+/* { dg-do run } */
+
+/* Test concurrent and repeated allocations.  */
+
+#include <omp.h>
+
+#pragma omp requires dynamic_allocators
+
+void
+test (int n, omp_allocator_handle_t allocator)
+{
+  #pragma omp target map(to:n) map(to:allocator)
+  {
+    int **a;
+    a = (int **) omp_alloc(n*sizeof(int*), allocator);
+
+    #pragma omp parallel for
+    for (int i = 0; i < n; i++)
+      {
+	/* Use 10x to ensure we do activate the low-latency fall-back.  */
+	a[i] = omp_alloc(sizeof(int)*10, allocator);
+	a[i][0] = i;
+      }
+
+    for (int i = 0; i < n; i++)
+      if (a[i][0] != i)
+	{
+	  __builtin_printf ("data mismatch at %i\n", i);
+	  __builtin_abort ();
+	}
+
+    #pragma omp parallel for
+    for (int i = 0; i < n; i++)
+      omp_free(a[i], allocator);
+
+    omp_free (a, allocator);
+  }
+}
+
+int
+main ()
+{
+  // Smaller than low-latency memory limit
+  test (10, omp_default_mem_alloc);
+  test (10, omp_large_cap_mem_alloc);
+  test (10, omp_const_mem_alloc);
+  test (10, omp_high_bw_mem_alloc);
+  test (10, omp_low_lat_mem_alloc);
+  test (10, omp_cgroup_mem_alloc);
+  test (10, omp_pteam_mem_alloc);
+  test (10, omp_thread_mem_alloc);
+
+  // Larger than low-latency memory limit (on aggregate)
+  test (1000, omp_default_mem_alloc);
+  test (1000, omp_large_cap_mem_alloc);
+  test (1000, omp_const_mem_alloc);
+  test (1000, omp_high_bw_mem_alloc);
+  test (1000, omp_low_lat_mem_alloc);
+  test (1000, omp_cgroup_mem_alloc);
+  test (1000, omp_pteam_mem_alloc);
+  test (1000, omp_thread_mem_alloc);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c/allocators-3.c b/libgomp/testsuite/libgomp.c/allocators-3.c
new file mode 100644
index 00000000000..45514c2a088
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/allocators-3.c
@@ -0,0 +1,42 @@
+/* { dg-do run } */
+
+/* Stress-test omp_alloc/omp_free under concurrency.  */
+
+#include <omp.h>
+#include <stdio.h>
+#include <stdlib.h>
+
+#pragma omp requires dynamic_allocators
+
+#define N 1000
+
+void
+test (omp_allocator_handle_t allocator)
+{
+  #pragma omp target map(to:allocator)
+  {
+    #pragma omp parallel for
+    for (int i = 0; i < N; i++)
+      for (int j = 0; j < N; j++)
+	{
+	  int *p = omp_alloc(sizeof(int), allocator);
+	  omp_free(p, allocator);
+	}
+  }
+}
+
+int
+main ()
+{
+  // Smaller than low-latency memory limit
+  test (omp_default_mem_alloc);
+  test (omp_large_cap_mem_alloc);
+  test (omp_const_mem_alloc);
+  test (omp_high_bw_mem_alloc);
+  test (omp_low_lat_mem_alloc);
+  test (omp_cgroup_mem_alloc);
+  test (omp_pteam_mem_alloc);
+  test (omp_thread_mem_alloc);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c/allocators-4.c b/libgomp/testsuite/libgomp.c/allocators-4.c
new file mode 100644
index 00000000000..9fa6aa1624f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/allocators-4.c
@@ -0,0 +1,196 @@
+/* { dg-do run } */
+
+/* Test that low-latency free chains are sound.  */
+
+#include <stddef.h>
+#include <omp.h>
+
+#pragma omp requires dynamic_allocators
+
+void
+check (int cond, const char *msg)
+{
+  if (!cond)
+    {
+      __builtin_printf ("%s\n", msg);
+      __builtin_abort ();
+    }
+}
+
+int
+main ()
+{
+  #pragma omp target
+  {
+    /* Ensure that the memory we get *is* low-latency with a null-fallback.  */
+    omp_alloctrait_t traits[1]
+      = { { omp_atk_fallback, omp_atv_null_fb } };
+    omp_allocator_handle_t lowlat = omp_init_allocator (omp_low_lat_mem_space,
+							1, traits);
+
+    int size = 4;
+
+    char *a = omp_alloc(size, lowlat);
+    char *b = omp_alloc(size, lowlat);
+    char *c = omp_alloc(size, lowlat);
+    char *d = omp_alloc(size, lowlat);
+
+    /* There are headers and padding to account for.  */
+    int size2 = size + (b-a);
+    int size3 = size + (c-a);
+    int size4 = size + (d-a) + 100; /* Random larger amount.  */
+
+    check (a != NULL && b != NULL && c != NULL && d != NULL,
+	   "omp_alloc returned NULL\n");
+
+    omp_free(a, lowlat);
+    char *p = omp_alloc (size, lowlat);
+    check (p == a, "allocate did not reuse first chunk");
+
+    omp_free(b, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not reuse second chunk");
+
+    omp_free(c, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not reuse third chunk");
+
+    omp_free(a, lowlat);
+    omp_free(b, lowlat);
+    p = omp_alloc (size2, lowlat);
+    check (p == a, "allocate did not coalesce first two chunks");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == a, "allocate did not split first chunk (1)");
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split first chunk (2)");
+
+    omp_free(b, lowlat);
+    omp_free(c, lowlat);
+    p = omp_alloc (size2, lowlat);
+    check (p == b, "allocate did not coalesce middle two chunks");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split second chunk (1)");
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not split second chunk (2)");
+
+    omp_free(b, lowlat);
+    omp_free(a, lowlat);
+    p = omp_alloc (size2, lowlat);
+    check (p == a, "allocate did not coalesce first two chunks, reverse free");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == a, "allocate did not split first chunk (1), reverse free");
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split first chunk (2), reverse free");
+
+    omp_free(c, lowlat);
+    omp_free(b, lowlat);
+    p = omp_alloc (size2, lowlat);
+    check (p == b, "allocate did not coalesce second two chunks, reverse free");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split second chunk (1), reverse free");
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not split second chunk (2), reverse free");
+
+    omp_free(a, lowlat);
+    omp_free(b, lowlat);
+    omp_free(c, lowlat);
+    p = omp_alloc (size3, lowlat);
+    check (p == a, "allocate did not coalesce first three chunks");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == a, "allocate did not split first chunk (1)");
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split first chunk (2)");
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not split first chunk (3)");
+
+    omp_free(b, lowlat);
+    omp_free(c, lowlat);
+    omp_free(d, lowlat);
+    p = omp_alloc (size3, lowlat);
+    check (p == b, "allocate did not coalesce last three chunks");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split second chunk (1)");
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not split second chunk (2)");
+    p = omp_alloc (size, lowlat);
+    check (p == d, "allocate did not split second chunk (3)");
+
+    omp_free(c, lowlat);
+    omp_free(b, lowlat);
+    omp_free(a, lowlat);
+    p = omp_alloc (size3, lowlat);
+    check (p == a, "allocate did not coalesce first three chunks, reverse free");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == a, "allocate did not split first chunk (1), reverse free");
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split first chunk (2), reverse free");
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not split first chunk (3), reverse free");
+
+    omp_free(d, lowlat);
+    omp_free(c, lowlat);
+    omp_free(b, lowlat);
+    p = omp_alloc (size3, lowlat);
+    check (p == b, "allocate did not coalesce second three chunks, reverse free");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split second chunk (1), reverse free");
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not split second chunk (2), reverse free");
+    p = omp_alloc (size, lowlat);
+    check (p == d, "allocate did not split second chunk (3), reverse free");
+
+    omp_free(c, lowlat);
+    omp_free(a, lowlat);
+    omp_free(b, lowlat);
+    p = omp_alloc (size3, lowlat);
+    check (p == a, "allocate did not coalesce first three chunks, mixed free");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == a, "allocate did not split first chunk (1), mixed free");
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split first chunk (2), mixed free");
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not split first chunk (3), mixed free");
+
+    omp_free(d, lowlat);
+    omp_free(b, lowlat);
+    omp_free(c, lowlat);
+    p = omp_alloc (size3, lowlat);
+    check (p == b, "allocate did not coalesce second three chunks, mixed free");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split second chunk (1), mixed free");
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not split second chunk (2), mixed free");
+    p = omp_alloc (size, lowlat);
+    check (p == d, "allocate did not split second chunk (3), mixed free");
+
+    omp_free(a, lowlat);
+    omp_free(b, lowlat);
+    omp_free(c, lowlat);
+    omp_free(d, lowlat);
+    p = omp_alloc(size4, lowlat);
+    check (p == a, "allocate did not coalesce all memory");
+  }
+
+return 0;
+}
+
diff --git a/libgomp/testsuite/libgomp.c/allocators-5.c b/libgomp/testsuite/libgomp.c/allocators-5.c
new file mode 100644
index 00000000000..9694010cf1f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/allocators-5.c
@@ -0,0 +1,63 @@
+/* { dg-do run } */
+
+/* Test calloc with omp_alloc.  */
+
+#include <omp.h>
+
+#pragma omp requires dynamic_allocators
+
+void
+test (int n, omp_allocator_handle_t allocator)
+{
+  #pragma omp target map(to:n) map(to:allocator)
+  {
+    int *a;
+    a = (int *) omp_calloc(n, sizeof(int), allocator);
+
+    for (int i = 0; i < n; i++)
+      if (a[i] != 0)
+	{
+	  __builtin_printf ("memory not zeroed at %i\n", i);
+	  __builtin_abort ();
+	}
+
+    #pragma omp parallel
+    for (int i = 0; i < n; i++)
+      a[i] = i;
+
+    for (int i = 0; i < n; i++)
+      if (a[i] != i)
+	{
+	  __builtin_printf ("data mismatch at %i\n", i);
+	  __builtin_abort ();
+	}
+
+    omp_free(a, allocator);
+  }
+}
+
+int
+main ()
+{
+  // Smaller than low-latency memory limit
+  test (10, omp_default_mem_alloc);
+  test (10, omp_large_cap_mem_alloc);
+  test (10, omp_const_mem_alloc);
+  test (10, omp_high_bw_mem_alloc);
+  test (10, omp_low_lat_mem_alloc);
+  test (10, omp_cgroup_mem_alloc);
+  test (10, omp_pteam_mem_alloc);
+  test (10, omp_thread_mem_alloc);
+
+  // Larger than low-latency memory limit
+  test (100000, omp_default_mem_alloc);
+  test (100000, omp_large_cap_mem_alloc);
+  test (100000, omp_const_mem_alloc);
+  test (100000, omp_high_bw_mem_alloc);
+  test (100000, omp_low_lat_mem_alloc);
+  test (100000, omp_cgroup_mem_alloc);
+  test (100000, omp_pteam_mem_alloc);
+  test (100000, omp_thread_mem_alloc);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c/allocators-6.c b/libgomp/testsuite/libgomp.c/allocators-6.c
new file mode 100644
index 00000000000..90bf73095ef
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/allocators-6.c
@@ -0,0 +1,117 @@
+/* { dg-do run } */
+
+/* Test that low-latency realloc and free chains are sound.  */
+
+#include <stddef.h>
+#include <omp.h>
+
+#pragma omp requires dynamic_allocators
+
+void
+check (int cond, const char *msg)
+{
+  if (!cond)
+    {
+      __builtin_printf ("%s\n", msg);
+      __builtin_abort ();
+    }
+}
+
+int
+main ()
+{
+  #pragma omp target
+  {
+    /* Ensure that the memory we get *is* low-latency with a null-fallback.  */
+    omp_alloctrait_t traits[1]
+      = { { omp_atk_fallback, omp_atv_null_fb } };
+    omp_allocator_handle_t lowlat = omp_init_allocator (omp_low_lat_mem_space,
+							1, traits);
+
+    int size = 16;
+
+    char *a = (char *)omp_alloc(size, lowlat);
+    char *b = (char *)omp_alloc(size, lowlat);
+    char *c = (char *)omp_alloc(size, lowlat);
+    char *d = (char *)omp_alloc(size, lowlat);
+
+    /* There are headers and padding to account for.  */
+    int size2 = size + (b-a);
+    int size3 = size + (c-a);
+    int size4 = size + (d-a) + 100; /* Random larger amount.  */
+
+    check (a != NULL && b != NULL && c != NULL && d != NULL,
+	   "omp_alloc returned NULL\n");
+
+    char *p = omp_realloc (b, size, lowlat, lowlat);
+    check (p == b, "realloc did not reuse same size chunk, no space after");
+
+    p = omp_realloc (b, size-8, lowlat, lowlat);
+    check (p == b, "realloc did not reuse smaller chunk, no space after");
+
+    p = omp_realloc (b, size, lowlat, lowlat);
+    check (p == b, "realloc did not reuse original size chunk, no space after");
+
+    /* Make space after b.  */
+    omp_free(c, lowlat);
+
+    p = omp_realloc (b, size, lowlat, lowlat);
+    check (p == b, "realloc did not reuse same size chunk");
+
+    p = omp_realloc (b, size-8, lowlat, lowlat);
+    check (p == b, "realloc did not reuse smaller chunk");
+
+    p = omp_realloc (b, size, lowlat, lowlat);
+    check (p == b, "realloc did not reuse original size chunk");
+
+    p = omp_realloc (b, size+8, lowlat, lowlat);
+    check (p == b, "realloc did not extend in place by a little");
+
+    p = omp_realloc (b, size2, lowlat, lowlat);
+    check (p == b, "realloc did not extend into whole next chunk");
+
+    p = omp_realloc (b, size3, lowlat, lowlat);
+    check (p != b, "realloc did not move b elsewhere");
+    omp_free (p, lowlat);
+
+
+    p = omp_realloc (a, size, lowlat, lowlat);
+    check (p == a, "realloc did not reuse same size chunk, first position");
+
+    p = omp_realloc (a, size-8, lowlat, lowlat);
+    check (p == a, "realloc did not reuse smaller chunk, first position");
+
+    p = omp_realloc (a, size, lowlat, lowlat);
+    check (p == a, "realloc did not reuse original size chunk, first position");
+
+    p = omp_realloc (a, size+8, lowlat, lowlat);
+    check (p == a, "realloc did not extend in place by a little, first position");
+
+    p = omp_realloc (a, size3, lowlat, lowlat);
+    check (p == a, "realloc did not extend into whole next chunk, first position");
+
+    p = omp_realloc (a, size4, lowlat, lowlat);
+    check (p != a, "realloc did not move a elsewhere, first position");
+    omp_free (p, lowlat);
+
+
+    p = omp_realloc (d, size, lowlat, lowlat);
+    check (p == d, "realloc did not reuse same size chunk, last position");
+
+    p = omp_realloc (d, size-8, lowlat, lowlat);
+    check (p == d, "realloc did not reuse smaller chunk, last position");
+
+    p = omp_realloc (d, size, lowlat, lowlat);
+    check (p == d, "realloc did not reuse original size chunk, last position");
+
+    p = omp_realloc (d, size+8, lowlat, lowlat);
+    check (p == d, "realloc did not extend in place by d little, last position");
+
+    /* Larger than low latency memory.  */
+    p = omp_realloc(d, 100000000, lowlat, lowlat);
+    check (p == NULL, "realloc did not fail on OOM");
+  }
+
+return 0;
+}
+
  
Andrew Stubbs Jan. 5, 2022, 2:36 p.m. UTC | #4
On 05/01/2022 13:04, Tom de Vries wrote:
> On 1/5/22 12:08, Tom de Vries wrote:
>> The allocators-1.c test-case doesn't compile because:
>> ...
>> FAIL: libgomp.c/allocators-1.c (test for excess errors)
>> Excess errors:
>> /home/vries/oacc/trunk/source-gcc/libgomp/testsuite/libgomp.c/allocators-1.c:7:22: 
>> sorry, unimplemented: 'dynamic_allocators' clause on 'requires' directive
>> not supported yet
>>
>> UNRESOLVED: libgomp.c/allocators-1.c compilation failed to produce 
>> executable
>> ...
>>
>> So, I suppose I need "[PATCH] OpenMP front-end: allow requires 
>> dynamic_allocators" as well, I'll try again with that applied.
> 
> After applying that, I get:
> ...
> WARNING: program timed out.
> FAIL: libgomp.c/allocators-2.c execution test
> WARNING: program timed out.
> FAIL: libgomp.c/allocators-3.c execution test
> ...

It works for me.....

Those tests do a large number of allocations repeatedly and in parallel
to stress the atomics.  They're also slightly longer-running than the
other tests.
   - allocators-2 calls omp_alloc 8080 times, over 16 kernel launches, 
some of which will fall back to PTX malloc.
   - allocators-3 calls omp_alloc and omp_free 8 million times each, 
over 8 kernel launches, and takes about a minute to run on my device 
(whether that falls back depends entirely on how the free calls interleave).
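
(For reference, the arithmetic for allocators-2.c: main launches 16 kernels,
8 with n = 10 and 8 with n = 1000, and the inner parallel loops then perform
8*10 + 8*1000 = 8080 omp_alloc calls, plus one pointer-array allocation per
launch.)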

Either there is a flaw in the concurrency causing some kind of deadlock, 
or else your timeout is set too short for your device. I hope it's the 
latter. We may need to tweak this.

Andrew
  
Tom de Vries Jan. 6, 2022, 9:29 a.m. UTC | #5
On 1/5/22 15:36, Andrew Stubbs wrote:
> On 05/01/2022 13:04, Tom de Vries wrote:
>> On 1/5/22 12:08, Tom de Vries wrote:
>>> The allocators-1.c test-case doesn't compile because:
>>> ...
>>> FAIL: libgomp.c/allocators-1.c (test for excess errors)
>>> Excess errors:
>>> /home/vries/oacc/trunk/source-gcc/libgomp/testsuite/libgomp.c/allocators-1.c:7:22: 
>>> sorry, unimplemented: 'dynamic_allocators' clause on 'requires' directive
>>> not supported yet
>>>
>>> UNRESOLVED: libgomp.c/allocators-1.c compilation failed to produce 
>>> executable
>>> ...
>>>
>>> So, I suppose I need "[PATCH] OpenMP front-end: allow requires 
>>> dynamic_allocators" as well, I'll try again with that applied.
>>
>> After applying that, I get:
>> ...
>> WARNING: program timed out.
>> FAIL: libgomp.c/allocators-2.c execution test
>> WARNING: program timed out.
>> FAIL: libgomp.c/allocators-3.c execution test
>> ...
> 
> It works for me.....
> 
> Those tests are doing some large number of allocations repeatedly and in 
> parallel to stress the atomics. They're also slightly longer running 
> than the other tests.
>    - allocators-2 calls omp_alloc 8080 times, over 16 kernel launches, 
> some of which will fall back to PTX malloc.

I've minimized the test-case by enabling a single call in main at a
time.  All but the last 4 take about two seconds; the last 4 hang (and
time out at 5 min).

So, this already times out for me:
...
int
main ()
{
   test (1000, omp_low_lat_mem_alloc);
   return 0;
}
...

I tried playing around with n: roughly, there's no hang below 100, a hang
above 200, and in between there may or may not be a hang.

Again the same dynamic: if there's no hang, it just takes a few seconds.

>    - allocators-3 calls omp_alloc and omp_free 8 million times each, 
> over 8 kernel launches, and takes about a minute to run on my device 
> (whether that falls back depends entirely on how the free calls 
> interleave).
> 
> Either there is a flaw in the concurrency causing some kind of deadlock, 
> or else your timeout is set too short for your device. I hope it's the 
> latter. We may need to tweak this.

At first glance, the above behaviour doesn't look like the result of a 
too-short timeout.

[ FTR, I'm using a GT 1030 with production branch driver version 470.86 
(which is one version behind the latest 470.94) ]

Thanks,
- Tom
  
Tom de Vries Jan. 6, 2022, 5:53 p.m. UTC | #6
On 1/6/22 10:29, Tom de Vries wrote:
> At first glance, the above behaviour doesn't look like the result of a 
> too-short timeout.

Using the patch below, this passes for me; I'm currently doing a full 
build and test to confirm.

Looks like it has to do with:
...
For sm_6x and earlier architectures, atom operations on .shared state 
space do not guarantee atomicity with respect to normal store 
instructions to the same address. It is the programmer's responsibility 
to guarantee correctness of programs that use shared memory
atomic instructions, e.g., by inserting barriers between normal stores 
and atomic operations to a common address, or by using atom.exch to 
store to locations accessed by other atomic operations.
...

My current understanding is that this is a backend problem, and needs to 
be fixed by defining atomic_store patterns which take care of this 
peculiarity.
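
To illustrate (a hypothetical sketch of my own, not from any patch here):
switching to __atomic_exchange_n makes the release store to the heap root
an atom operation itself, roughly equivalent to this inline-PTX helper:

/* Hypothetical illustration only; the real fix belongs in the backend.  */
static inline void
lowlat_release_store (uint32_t *addr, uint32_t val)
{
  uint32_t old;  /* Result of the exchange, deliberately ignored.  */
  asm volatile ("membar.cta;\n\t"                /* Order prior writes.  */
                "atom.exch.b32\t%0, [%1], %2;"   /* Store via atom.exch.  */
                : "=r" (old) : "r" (addr), "r" (val) : "memory");
}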

Thanks,
- Tom

diff --git a/libgomp/config/nvptx/allocator.c b/libgomp/config/nvptx/allocator.c
index 6bc2ea48043..4524122b3e7 100644
--- a/libgomp/config/nvptx/allocator.c
+++ b/libgomp/config/nvptx/allocator.c
@@ -122,7 +122,8 @@ nvptx_memspace_alloc (omp_memspace_handle_t memspace, size_t size)
 	}
 
       /* Update the free chain root and release the lock.  */
-      __atomic_store_n (&__nvptx_lowlat_heap_root, root.raw, MEMMODEL_RELEASE);
+      __atomic_exchange_n (&__nvptx_lowlat_heap_root,
+			   root.raw, MEMMODEL_RELEASE);
       return result;
     }
   else
@@ -221,7 +222,8 @@ nvptx_memspace_free (omp_memspace_handle_t memspace, void *addr, size_t size)
 	root.raw = newfreechunk.raw;
 
       /* Update the free chain root and release the lock.  */
-      __atomic_store_n (&__nvptx_lowlat_heap_root, root.raw, MEMMODEL_RELEASE);
+      __atomic_exchange_n (&__nvptx_lowlat_heap_root,
+			   root.raw, MEMMODEL_RELEASE);
     }
   else
     free (addr);
@@ -331,7 +333,8 @@ nvptx_memspace_realloc (omp_memspace_handle_t memspace, void *addr,
       /* Else realloc in-place has failed and result remains NULL.  */
 
       /* Update the free chain root and release the lock.  */
-      __atomic_store_n (&__nvptx_lowlat_heap_root, root.raw, MEMMODEL_RELEASE);
+      __atomic_exchange_n (&__nvptx_lowlat_heap_root,
+			   root.raw, MEMMODEL_RELEASE);
 
       if (result == NULL)
 	{
  
Andrew Stubbs Jan. 7, 2022, 2:14 p.m. UTC | #7
On 06/01/2022 17:53, Tom de Vries wrote:
> My current understanding is that this is a backend problem, and needs to 
> be fixed by defining atomic_store patterns which take care of this 
> peculiarity.

You mentioned on IRC that I ought to initialize the free chain using 
atomics also, and that you are working on an atomic store implementation.

This patch fixes the initialization issue. It works with my device, I 
think. Please test it with your device when the backend issue is fixed.

Thanks very much!

Andrew
libgomp, nvptx: low-latency memory allocator

This patch adds support for allocating low-latency ".shared" memory on
NVPTX GPU devices, via omp_low_lat_mem_space and omp_alloc.  The memory
can be allocated, reallocated, and freed using a basic but fast algorithm;
the allocator is thread safe, and the size of the low-latency heap can be
configured using the GOMP_NVPTX_LOWLAT_POOL environment variable.

The use of the PTX dynamic_smem_size feature means that the minimum version
requirement is now bumped to 4.1 (still old at this point).
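
For example (illustrative invocation only; "a.out" stands for any program
with offloaded regions), reserving 16 KiB of .shared space per team
instead of the default 8 KiB:

    GOMP_NVPTX_LOWLAT_POOL=16384 ./a.out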

libgomp/ChangeLog:

	* allocator.c (MEMSPACE_ALLOC): New macro.
	(MEMSPACE_CALLOC): New macro.
	(MEMSPACE_REALLOC): New macro.
	(MEMSPACE_FREE): New macro.
	(dynamic_smem_size): New constant.
	(omp_alloc): Use MEMSPACE_ALLOC.
	Implement fall-backs for predefined allocators.
	(omp_free): Use MEMSPACE_FREE.
	(omp_calloc): Use MEMSPACE_CALLOC.
	Implement fall-backs for predefined allocators.
	(omp_realloc): Use MEMSPACE_REALLOC.
	Implement fall-backs for predefined allocators.
	* config/nvptx/team.c (__nvptx_lowlat_heap_root): New variable.
	(__nvptx_lowlat_pool): New asm variable.
	(gomp_nvptx_main): Initialize the low-latency heap.
	* plugin/plugin-nvptx.c (lowlat_pool_size): New variable.
	(GOMP_OFFLOAD_init_device): Read the GOMP_NVPTX_LOWLAT_POOL envvar.
	(GOMP_OFFLOAD_run): Apply lowlat_pool_size.
	* config/nvptx/allocator.c: New file.
	* testsuite/libgomp.c/allocators-1.c: New test.
	* testsuite/libgomp.c/allocators-2.c: New test.
	* testsuite/libgomp.c/allocators-3.c: New test.
	* testsuite/libgomp.c/allocators-4.c: New test.
	* testsuite/libgomp.c/allocators-5.c: New test.
	* testsuite/libgomp.c/allocators-6.c: New test.

diff --git a/libgomp/allocator.c b/libgomp/allocator.c
index 07a5645f4cc..b1f5fe0a5e2 100644
--- a/libgomp/allocator.c
+++ b/libgomp/allocator.c
@@ -34,6 +34,38 @@
 
 #define omp_max_predefined_alloc omp_thread_mem_alloc
 
+/* These macros may be overridden in config/<target>/allocator.c.  */
+#ifndef MEMSPACE_ALLOC
+#define MEMSPACE_ALLOC(MEMSPACE, SIZE) \
+  ((void)MEMSPACE, malloc (SIZE))
+#endif
+#ifndef MEMSPACE_CALLOC
+#define MEMSPACE_CALLOC(MEMSPACE, SIZE) \
+  ((void)MEMSPACE, calloc (1, SIZE))
+#endif
+#ifndef MEMSPACE_REALLOC
+#define MEMSPACE_REALLOC(MEMSPACE, ADDR, OLDSIZE, SIZE) \
+  ((void)MEMSPACE, (void)OLDSIZE, realloc (ADDR, SIZE))
+#endif
+#ifndef MEMSPACE_FREE
+#define MEMSPACE_FREE(MEMSPACE, ADDR, SIZE) \
+  ((void)MEMSPACE, (void)SIZE, free (ADDR))
+#endif
+
+/* Map the predefined allocators to the correct memory space.
+   The index to this table is the omp_allocator_handle_t enum value.  */
+static const omp_memspace_handle_t predefined_alloc_mapping[] = {
+  omp_default_mem_space,   /* omp_null_allocator. */
+  omp_default_mem_space,   /* omp_default_mem_alloc. */
+  omp_large_cap_mem_space, /* omp_large_cap_mem_alloc. */
+  omp_default_mem_space,   /* omp_const_mem_alloc. */
+  omp_high_bw_mem_space,   /* omp_high_bw_mem_alloc. */
+  omp_low_lat_mem_space,   /* omp_low_lat_mem_alloc. */
+  omp_low_lat_mem_space,   /* omp_cgroup_mem_alloc. */
+  omp_low_lat_mem_space,   /* omp_pteam_mem_alloc. */
+  omp_low_lat_mem_space,   /* omp_thread_mem_alloc. */
+};
+
 struct omp_allocator_data
 {
   omp_memspace_handle_t memspace;
@@ -281,7 +313,7 @@ retry:
       allocator_data->used_pool_size = used_pool_size;
       gomp_mutex_unlock (&allocator_data->lock);
 #endif
-      ptr = malloc (new_size);
+      ptr = MEMSPACE_ALLOC (allocator_data->memspace, new_size);
       if (ptr == NULL)
 	{
 #ifdef HAVE_SYNC_BUILTINS
@@ -297,7 +329,10 @@ retry:
     }
   else
     {
-      ptr = malloc (new_size);
+      omp_memspace_handle_t memspace = (allocator_data
+					? allocator_data->memspace
+					: predefined_alloc_mapping[allocator]);
+      ptr = MEMSPACE_ALLOC (memspace, new_size);
       if (ptr == NULL)
 	goto fail;
     }
@@ -315,32 +350,35 @@ retry:
   return ret;
 
 fail:
-  if (allocator_data)
+  int fallback = (allocator_data
+		  ? allocator_data->fallback
+		  : allocator == omp_default_mem_alloc
+		  ? omp_atv_null_fb
+		  : omp_atv_default_mem_fb);
+  switch (fallback)
     {
-      switch (allocator_data->fallback)
+    case omp_atv_default_mem_fb:
+      if ((new_alignment > sizeof (void *) && new_alignment > alignment)
+	  || (allocator_data
+	      && allocator_data->pool_size < ~(uintptr_t) 0)
+	  || !allocator_data)
 	{
-	case omp_atv_default_mem_fb:
-	  if ((new_alignment > sizeof (void *) && new_alignment > alignment)
-	      || (allocator_data
-		  && allocator_data->pool_size < ~(uintptr_t) 0))
-	    {
-	      allocator = omp_default_mem_alloc;
-	      goto retry;
-	    }
-	  /* Otherwise, we've already performed default mem allocation
-	     and if that failed, it won't succeed again (unless it was
-	     intermittent.  Return NULL then, as that is the fallback.  */
-	  break;
-	case omp_atv_null_fb:
-	  break;
-	default:
-	case omp_atv_abort_fb:
-	  gomp_fatal ("Out of memory allocating %lu bytes",
-		      (unsigned long) size);
-	case omp_atv_allocator_fb:
-	  allocator = allocator_data->fb_data;
+	  allocator = omp_default_mem_alloc;
 	  goto retry;
 	}
+      /* Otherwise, we've already performed default mem allocation
+	 and if that failed, it won't succeed again (unless it was
+	 intermittent.  Return NULL then, as that is the fallback.  */
+      break;
+    case omp_atv_null_fb:
+      break;
+    default:
+    case omp_atv_abort_fb:
+      gomp_fatal ("Out of memory allocating %lu bytes",
+		  (unsigned long) size);
+    case omp_atv_allocator_fb:
+      allocator = allocator_data->fb_data;
+      goto retry;
     }
   return NULL;
 }
@@ -373,6 +411,7 @@ void
 omp_free (void *ptr, omp_allocator_handle_t allocator)
 {
   struct omp_mem_header *data;
+  omp_memspace_handle_t memspace = omp_default_mem_space;
 
   if (ptr == NULL)
     return;
@@ -393,8 +432,13 @@ omp_free (void *ptr, omp_allocator_handle_t allocator)
 	  gomp_mutex_unlock (&allocator_data->lock);
 #endif
 	}
+
+      memspace = allocator_data->memspace;
     }
-  free (data->ptr);
+  else
+    memspace = predefined_alloc_mapping[data->allocator];
+
+  MEMSPACE_FREE (memspace, data->ptr, data->size);
 }
 
 ialias (omp_free)
@@ -482,7 +526,7 @@ retry:
       allocator_data->used_pool_size = used_pool_size;
       gomp_mutex_unlock (&allocator_data->lock);
 #endif
-      ptr = calloc (1, new_size);
+      ptr = MEMSPACE_CALLOC (allocator_data->memspace, new_size);
       if (ptr == NULL)
 	{
 #ifdef HAVE_SYNC_BUILTINS
@@ -498,7 +542,10 @@ retry:
     }
   else
     {
-      ptr = calloc (1, new_size);
+      omp_memspace_handle_t memspace = (allocator_data
+					? allocator_data->memspace
+					: predefined_alloc_mapping[allocator]);
+      ptr = MEMSPACE_CALLOC (memspace, new_size);
       if (ptr == NULL)
 	goto fail;
     }
@@ -516,32 +563,35 @@ retry:
   return ret;
 
 fail:
-  if (allocator_data)
+  int fallback = (allocator_data
+		  ? allocator_data->fallback
+		  : allocator == omp_default_mem_alloc
+		  ? omp_atv_null_fb
+		  : omp_atv_default_mem_fb);
+  switch (fallback)
     {
-      switch (allocator_data->fallback)
+    case omp_atv_default_mem_fb:
+      if ((new_alignment > sizeof (void *) && new_alignment > alignment)
+	  || (allocator_data
+	      && allocator_data->pool_size < ~(uintptr_t) 0)
+	  || !allocator_data)
 	{
-	case omp_atv_default_mem_fb:
-	  if ((new_alignment > sizeof (void *) && new_alignment > alignment)
-	      || (allocator_data
-		  && allocator_data->pool_size < ~(uintptr_t) 0))
-	    {
-	      allocator = omp_default_mem_alloc;
-	      goto retry;
-	    }
-	  /* Otherwise, we've already performed default mem allocation
-	     and if that failed, it won't succeed again (unless it was
-	     intermittent.  Return NULL then, as that is the fallback.  */
-	  break;
-	case omp_atv_null_fb:
-	  break;
-	default:
-	case omp_atv_abort_fb:
-	  gomp_fatal ("Out of memory allocating %lu bytes",
-		      (unsigned long) (size * nmemb));
-	case omp_atv_allocator_fb:
-	  allocator = allocator_data->fb_data;
+	  allocator = omp_default_mem_alloc;
 	  goto retry;
 	}
+      /* Otherwise, we've already performed default mem allocation
+	 and if that failed, it won't succeed again (unless it was
+	 intermittent.  Return NULL then, as that is the fallback.  */
+      break;
+    case omp_atv_null_fb:
+      break;
+    default:
+    case omp_atv_abort_fb:
+      gomp_fatal ("Out of memory allocating %lu bytes",
+		  (unsigned long) (size * nmemb));
+    case omp_atv_allocator_fb:
+      allocator = allocator_data->fb_data;
+      goto retry;
     }
   return NULL;
 }
@@ -660,7 +710,8 @@ retry:
       gomp_mutex_unlock (&allocator_data->lock);
 #endif
       if (prev_size)
-	new_ptr = realloc (data->ptr, new_size);
+	new_ptr = MEMSPACE_REALLOC (allocator_data->memspace, data->ptr,
+				    data->size, new_size);
       else
 	new_ptr = malloc (new_size);
       if (new_ptr == NULL)
@@ -690,7 +741,10 @@ retry:
 	   && (free_allocator_data == NULL
 	       || free_allocator_data->pool_size == ~(uintptr_t) 0))
     {
-      new_ptr = realloc (data->ptr, new_size);
+      omp_memspace_handle_t memspace = (allocator_data
+					? allocator_data->memspace
+					: predefined_alloc_mapping[allocator]);
+      new_ptr = MEMSPACE_REALLOC (memspace, data->ptr, data->size, new_size);
       if (new_ptr == NULL)
 	goto fail;
       ret = (char *) new_ptr + sizeof (struct omp_mem_header);
@@ -735,32 +789,35 @@ retry:
   return ret;
 
 fail:
-  if (allocator_data)
+  int fallback = (allocator_data
+		  ? allocator_data->fallback
+		  : allocator == omp_default_mem_alloc
+		  ? omp_atv_null_fb
+		  : omp_atv_default_mem_fb);
+  switch (fallback)
     {
-      switch (allocator_data->fallback)
+    case omp_atv_default_mem_fb:
+      if (new_alignment > sizeof (void *)
+	  || (allocator_data
+	      && allocator_data->pool_size < ~(uintptr_t) 0)
+	  || !allocator_data)
 	{
-	case omp_atv_default_mem_fb:
-	  if (new_alignment > sizeof (void *)
-	      || (allocator_data
-		  && allocator_data->pool_size < ~(uintptr_t) 0))
-	    {
-	      allocator = omp_default_mem_alloc;
-	      goto retry;
-	    }
-	  /* Otherwise, we've already performed default mem allocation
-	     and if that failed, it won't succeed again (unless it was
-	     intermittent.  Return NULL then, as that is the fallback.  */
-	  break;
-	case omp_atv_null_fb:
-	  break;
-	default:
-	case omp_atv_abort_fb:
-	  gomp_fatal ("Out of memory allocating %lu bytes",
-		      (unsigned long) size);
-	case omp_atv_allocator_fb:
-	  allocator = allocator_data->fb_data;
+	  allocator = omp_default_mem_alloc;
 	  goto retry;
 	}
+      /* Otherwise, we've already performed default mem allocation
+	 and if that failed, it won't succeed again (unless it was
+	 intermittent.  Return NULL then, as that is the fallback.  */
+      break;
+    case omp_atv_null_fb:
+      break;
+    default:
+    case omp_atv_abort_fb:
+      gomp_fatal ("Out of memory allocating %lu bytes",
+		  (unsigned long) size);
+    case omp_atv_allocator_fb:
+      allocator = allocator_data->fb_data;
+      goto retry;
     }
   return NULL;
 }
diff --git a/libgomp/config/nvptx/allocator.c b/libgomp/config/nvptx/allocator.c
new file mode 100644
index 00000000000..6bc2ea48043
--- /dev/null
+++ b/libgomp/config/nvptx/allocator.c
@@ -0,0 +1,370 @@
+/* Copyright (C) 2021 Free Software Foundation, Inc.
+
+   This file is part of the GNU Offloading and Multi Processing Library
+   (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* The low-latency allocators use space reserved in .shared memory when the
+   kernel is launched.  The heap is initialized in gomp_nvptx_main and all
+   allocations are forgotten when the kernel exits.  Allocations to other
+   memory spaces all use the system malloc syscall.
+
+   The root heap descriptor is stored elsewhere in shared memory, and each
+   free chunk contains a similar descriptor for the next free chunk in the
+   chain.
+
+   The descriptor is two 16-bit values: offset and size, which describe the
+   location of a chunk of memory available for allocation. The offset is
+   relative to the base of the heap.  The special value 0xffff, 0xffff
+   indicates that the heap is locked.  The descriptor is encoded into a
+   single 32-bit integer so that it may be easily accessed atomically.
+
+   Memory is allocated to the first free chunk that fits.  The free chain
+   is always stored in order of the offset to assist coalescing adjacent
+   chunks.  */
+
+#include "libgomp.h"
+#include <stdlib.h>
+
+/* There should be some .shared space reserved for us.  There's no way to
+   express this magic extern sizeless array in C so use asm.  */
+asm (".extern .shared .u8 __nvptx_lowlat_pool[];\n");
+
+extern uint32_t __nvptx_lowlat_heap_root __attribute__((shared,nocommon));
+
+typedef union {
+  uint32_t raw;
+  struct {
+    uint16_t offset;
+    uint16_t size;
+  } desc;
+} heapdesc;
+
+static void *
+nvptx_memspace_alloc (omp_memspace_handle_t memspace, size_t size)
+{
+  if (memspace == omp_low_lat_mem_space)
+    {
+      char *shared_pool;
+      asm ("cvta.shared.u64\t%0, __nvptx_lowlat_pool;" : "=r"(shared_pool));
+
+      /* Memory is allocated in 8-byte granularity.  */
+      size = (size + 7) & ~7;
+
+      /* Acquire a lock on the low-latency heap.  */
+      heapdesc root;
+      do
+	{
+	  root.raw = __atomic_exchange_n (&__nvptx_lowlat_heap_root,
+					  0xffffffff, MEMMODEL_ACQUIRE);
+	  if (root.raw != 0xffffffff)
+	    break;
+	  /* Spin.  */
+	}
+      while (1);
+
+      /* Walk the free chain.  */
+      heapdesc chunk = {root.raw};
+      uint32_t *prev_chunkptr = NULL;
+      uint32_t *chunkptr = (uint32_t*)(shared_pool + chunk.desc.offset);
+      heapdesc onward_chain = {chunkptr[0]};
+      while (chunk.desc.size != 0 && (uint32_t)size > chunk.desc.size)
+	{
+	  chunk.raw = onward_chain.raw;
+	  prev_chunkptr = chunkptr;
+	  chunkptr = (uint32_t*)(shared_pool + chunk.desc.offset);
+	  onward_chain.raw = chunkptr[0];
+	}
+
+      void *result = NULL;
+      if (chunk.desc.size != 0)
+	{
+	  /* Allocation successful.  */
+	  result = chunkptr;
+
+	  /* Update the free chain.  */
+	  heapdesc stillfree = {chunk.raw};
+	  stillfree.desc.offset += size;
+	  stillfree.desc.size -= size;
+	  uint32_t *stillfreeptr = (uint32_t*)(shared_pool
+					       + stillfree.desc.offset);
+
+	  if (stillfree.desc.size == 0)
+	    /* The whole chunk was used.  */
+	    stillfree.raw = onward_chain.raw;
+	  else
+	    /* The chunk was split, so restore the onward chain.  */
+	    stillfreeptr[0] = onward_chain.raw;
+
+	  /* The previous free slot or root now points to stillfree.  */
+	  if (prev_chunkptr)
+	    prev_chunkptr[0] = stillfree.raw;
+	  else
+	    root.raw = stillfree.raw;
+	}
+
+      /* Update the free chain root and release the lock.  */
+      __atomic_store_n (&__nvptx_lowlat_heap_root, root.raw, MEMMODEL_RELEASE);
+      return result;
+    }
+  else
+    return malloc (size);
+}
+
+static void *
+nvptx_memspace_calloc (omp_memspace_handle_t memspace, size_t size)
+{
+  if (memspace == omp_low_lat_mem_space)
+    {
+      /* Memory is allocated in 8-byte granularity.  */
+      size = (size + 7) & ~7;
+
+      uint64_t *result = nvptx_memspace_alloc (memspace, size);
+      if (result)
+	/* Inline memset in which we know size is a multiple of 8.  */
+	for (unsigned i = 0; i < (unsigned)size/8; i++)
+	  result[i] = 0;
+
+      return result;
+    }
+  else
+    return calloc (1, size);
+}
+
+static void
+nvptx_memspace_free (omp_memspace_handle_t memspace, void *addr, size_t size)
+{
+  if (memspace == omp_low_lat_mem_space)
+    {
+      char *shared_pool;
+      asm ("cvta.shared.u64\t%0, __nvptx_lowlat_pool;" : "=r"(shared_pool));
+
+      /* Memory is allocated in 8-byte granularity.  */
+      size = (size + 7) & ~7;
+
+      /* Acquire a lock on the low-latency heap.  */
+      heapdesc root;
+      do
+	{
+	  root.raw = __atomic_exchange_n (&__nvptx_lowlat_heap_root,
+					  0xffffffff, MEMMODEL_ACQUIRE);
+	  if (root.raw != 0xffffffff)
+	    break;
+	  /* Spin.  */
+	}
+      while (1);
+
+      /* Walk the free chain to find where to insert a new entry.  */
+      heapdesc chunk = {root.raw}, prev_chunk;
+      uint32_t *prev_chunkptr = NULL, *prevprev_chunkptr = NULL;
+      uint32_t *chunkptr = (uint32_t*)(shared_pool + chunk.desc.offset);
+      heapdesc onward_chain = {chunkptr[0]};
+      while (chunk.desc.size != 0 && addr > (void*)chunkptr)
+	{
+	  prev_chunk.raw = chunk.raw;
+	  chunk.raw = onward_chain.raw;
+	  prevprev_chunkptr = prev_chunkptr;
+	  prev_chunkptr = chunkptr;
+	  chunkptr = (uint32_t*)(shared_pool + chunk.desc.offset);
+	  onward_chain.raw = chunkptr[0];
+	}
+
+      /* Create the new chunk descriptor.  */
+      heapdesc newfreechunk;
+      newfreechunk.desc.offset = (uint16_t)((uintptr_t)addr
+					    - (uintptr_t)shared_pool);
+      newfreechunk.desc.size = (uint16_t)size;
+
+      /* Coalesce adjacent free chunks.  */
+      if (newfreechunk.desc.offset + size == chunk.desc.offset)
+	{
+	  /* Free chunk follows.  */
+	  newfreechunk.desc.size += chunk.desc.size;
+	  chunk.raw = onward_chain.raw;
+	}
+      if (prev_chunkptr)
+	{
+	  if (prev_chunk.desc.offset + prev_chunk.desc.size
+	      == newfreechunk.desc.offset)
+	    {
+	      /* Free chunk precedes.  */
+	      newfreechunk.desc.offset = prev_chunk.desc.offset;
+	      newfreechunk.desc.size += prev_chunk.desc.size;
+	      addr = shared_pool + prev_chunk.desc.offset;
+	      prev_chunkptr = prevprev_chunkptr;
+	    }
+	}
+
+      /* Update the free chain in the new and previous chunks.  */
+      ((uint32_t*)addr)[0] = chunk.raw;
+      if (prev_chunkptr)
+	prev_chunkptr[0] = newfreechunk.raw;
+      else
+	root.raw = newfreechunk.raw;
+
+      /* Update the free chain root and release the lock.  */
+      __atomic_store_n (&__nvptx_lowlat_heap_root, root.raw, MEMMODEL_RELEASE);
+    }
+  else
+    free (addr);
+}
+
+static void *
+nvptx_memspace_realloc (omp_memspace_handle_t memspace, void *addr,
+			size_t oldsize, size_t size)
+{
+  if (memspace == omp_low_lat_mem_space)
+    {
+      char *shared_pool;
+      asm ("cvta.shared.u64\t%0, __nvptx_lowlat_pool;" : "=r"(shared_pool));
+
+      /* Memory is allocated in 8-byte granularity.  */
+      oldsize = (oldsize + 7) & ~7;
+      size = (size + 7) & ~7;
+
+      if (oldsize == size)
+	return addr;
+
+      /* Acquire a lock on the low-latency heap.  */
+      heapdesc root;
+      do
+	{
+	  root.raw = __atomic_exchange_n (&__nvptx_lowlat_heap_root,
+					  0xffffffff, MEMMODEL_ACQUIRE);
+	  if (root.raw != 0xffffffff)
+	    break;
+	  /* Spin.  */
+	}
+      while (1);
+
+      /* Walk the free chain.  */
+      heapdesc chunk = {root.raw};
+      uint32_t *prev_chunkptr = NULL;
+      uint32_t *chunkptr = (uint32_t*)(shared_pool + chunk.desc.offset);
+      heapdesc onward_chain = {chunkptr[0]};
+      while (chunk.desc.size != 0 && (void*)chunkptr < addr)
+	{
+	  chunk.raw = onward_chain.raw;
+	  prev_chunkptr = chunkptr;
+	  chunkptr = (uint32_t*)(shared_pool + chunk.desc.offset);
+	  onward_chain.raw = chunkptr[0];
+	}
+
+      void *result = NULL;
+      if (size < oldsize)
+	{
+	  /* The new allocation is smaller than the old; we can always
+	     shrink an allocation in place.  */
+	  result = addr;
+
+	  uint32_t *nowfreeptr = (uint32_t*)(addr + size);
+
+	  /* Update the free chain.  */
+	  heapdesc nowfree;
+	  nowfree.desc.offset = (char*)nowfreeptr - shared_pool;
+	  nowfree.desc.size = oldsize - size;
+
+	  if (nowfree.desc.offset + nowfree.desc.size == chunk.desc.offset)
+	    {
+	      /* Coalesce following free chunk.  */
+	      nowfree.desc.size += chunk.desc.size;
+	      nowfreeptr[0] = onward_chain.raw;
+	    }
+	  else
+	    nowfreeptr[0] = chunk.raw;
+
+	  /* The previous free slot or root now points to nowfree.  */
+	  if (prev_chunkptr)
+	    prev_chunkptr[0] = nowfree.raw;
+	  else
+	    root.raw = nowfree.raw;
+	}
+      else if (chunk.desc.size != 0
+	       && (char *)addr + oldsize == (char *)chunkptr
+	       && chunk.desc.size >= size-oldsize)
+	{
+	  /* The new allocation is larger than the old, and we found a
+	     large enough free block right after the existing block,
+	     so we extend into that space.  */
+	  result = addr;
+
+	  uint16_t delta = size-oldsize;
+
+	  /* Update the free chain.  */
+	  heapdesc stillfree = {chunk.raw};
+	  stillfree.desc.offset += delta;
+	  stillfree.desc.size -= delta;
+	  uint32_t *stillfreeptr = (uint32_t*)(shared_pool
+					       + stillfree.desc.offset);
+
+	  if (stillfree.desc.size == 0)
+	    /* The whole chunk was used.  */
+	    stillfree.raw = onward_chain.raw;
+	  else
+	    /* The chunk was split, so restore the onward chain.  */
+	    stillfreeptr[0] = onward_chain.raw;
+
+	  /* The previous free slot or root now points to stillfree.  */
+	  if (prev_chunkptr)
+	    prev_chunkptr[0] = stillfree.raw;
+	  else
+	    root.raw = stillfree.raw;
+	}
+      /* Else realloc in-place has failed and result remains NULL.  */
+
+      /* Update the free chain root and release the lock.  */
+      __atomic_store_n (&__nvptx_lowlat_heap_root, root.raw, MEMMODEL_RELEASE);
+
+      if (result == NULL)
+	{
+	  /* The allocation could not be extended in place, so we simply
+	     allocate fresh memory and move the data.  If we can't allocate
+	     from low-latency memory then we leave the original allocation
+	     intact and return NULL.
+	     We could do a fall-back to main memory, but we don't know what
+	     the fall-back trait said to do.  */
+	  result = nvptx_memspace_alloc (memspace, size);
+	  if (result != NULL)
+	    {
+	      /* Inline memcpy in which we know oldsize is a multiple of 8.  */
+	      uint64_t *from = addr, *to = result;
+	      for (unsigned i = 0; i < (unsigned)oldsize/8; i++)
+		to[i] = from[i];
+
+	      nvptx_memspace_free (memspace, addr, oldsize);
+	    }
+	}
+      return result;
+    }
+  else
+    return realloc (addr, size);
+}
+
+#define MEMSPACE_ALLOC(MEMSPACE, SIZE) \
+  nvptx_memspace_alloc (MEMSPACE, SIZE)
+#define MEMSPACE_CALLOC(MEMSPACE, SIZE) \
+  nvptx_memspace_calloc (MEMSPACE, SIZE)
+#define MEMSPACE_REALLOC(MEMSPACE, ADDR, OLDSIZE, SIZE) \
+  nvptx_memspace_realloc (MEMSPACE, ADDR, OLDSIZE, SIZE)
+#define MEMSPACE_FREE(MEMSPACE, ADDR, SIZE) \
+  nvptx_memspace_free (MEMSPACE, ADDR, SIZE)
+
+#include "../../allocator.c"
diff --git a/libgomp/config/nvptx/team.c b/libgomp/config/nvptx/team.c
index 6923416fb4e..f43d63df57d 100644
--- a/libgomp/config/nvptx/team.c
+++ b/libgomp/config/nvptx/team.c
@@ -33,9 +33,13 @@
 
 struct gomp_thread *nvptx_thrs __attribute__((shared,nocommon));
 int __gomp_team_num __attribute__((shared,nocommon));
+uint32_t __nvptx_lowlat_heap_root __attribute__((shared,nocommon));
 
 static void gomp_thread_start (struct gomp_thread_pool *);
 
+/* There should be some .shared space reserved for us.  There's no way to
+   express this magic extern sizeless array in C so use asm.  */
+asm (".extern .shared .u8 __nvptx_lowlat_pool[];\n");
 
 /* This externally visible function handles target region entry.  It
    sets up a per-team thread pool and transfers control by calling FN (FN_DATA)
@@ -63,6 +67,27 @@ gomp_nvptx_main (void (*fn) (void *), void *fn_data)
       nvptx_thrs = alloca (ntids * sizeof (*nvptx_thrs));
       memset (nvptx_thrs, 0, ntids * sizeof (*nvptx_thrs));
 
+      /* Find the low-latency heap details ....  */
+      uint32_t *shared_pool;
+      uint32_t shared_pool_size;
+      asm ("cvta.shared.u64\t%0, __nvptx_lowlat_pool;" : "=r"(shared_pool));
+      asm ("mov.u32\t%0, %%dynamic_smem_size;\n"
+	   : "=r"(shared_pool_size));
+
+      /* ... and initialize it with an empty free-chain.  */
+      union {
+	uint32_t raw;
+	struct {
+	  uint16_t offset;
+	  uint16_t size;
+	} desc;
+      } root;
+      root.desc.offset = 0;		 /* The first byte is free.  */
+      root.desc.size = shared_pool_size; /* The whole space is free.  */
+      shared_pool[0] = 0;		 /* Terminate free chain.  */
+      __atomic_store_n (&__nvptx_lowlat_heap_root, root.raw, MEMMODEL_RELEASE);
+
+      /* Initialize the thread pool.  */
       struct gomp_thread_pool *pool = alloca (sizeof (*pool));
       pool->threads = alloca (ntids * sizeof (*pool->threads));
       for (tid = 0; tid < ntids; tid++)
diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index b4f0a84d77a..1b9a5e95c07 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -330,6 +330,11 @@ struct ptx_device
 
 static struct ptx_device **ptx_devices;
 
+/* OpenMP kernels reserve a small amount of ".shared" space for use by
+   omp_alloc.  The size is configured using GOMP_NVPTX_LOWLAT_POOL, but the
+   default is set here.  */
+static unsigned lowlat_pool_size = 8*1024;
+
 static inline struct nvptx_thread *
 nvptx_thread (void)
 {
@@ -1196,6 +1201,22 @@ GOMP_OFFLOAD_init_device (int n)
       instantiated_devices++;
     }
 
+  const char *var_name = "GOMP_NVPTX_LOWLAT_POOL";
+  const char *env_var = secure_getenv (var_name);
+  notify_var (var_name, env_var);
+
+  if (env_var != NULL)
+    {
+      char *endptr;
+      unsigned long val = strtoul (env_var, &endptr, 10);
+      if (endptr == NULL || *endptr != '\0'
+	  || errno == ERANGE || errno == EINVAL
+	  || val > UINT_MAX)
+	GOMP_PLUGIN_error ("Error parsing %s", var_name);
+      else
+	lowlat_pool_size = val;
+    }
+
   pthread_mutex_unlock (&ptx_dev_lock);
 
   return dev != NULL;
@@ -2021,7 +2042,7 @@ GOMP_OFFLOAD_run (int ord, void *tgt_fn, void *tgt_vars, void **args)
 		     " [(teams: %u), 1, 1] [(lanes: 32), (threads: %u), 1]\n",
 		     __FUNCTION__, fn_name, teams, threads);
   r = CUDA_CALL_NOCHECK (cuLaunchKernel, function, teams, 1, 1,
-			 32, threads, 1, 0, NULL, NULL, config);
+			 32, threads, 1, lowlat_pool_size, NULL, NULL, config);
   if (r != CUDA_SUCCESS)
     GOMP_PLUGIN_fatal ("cuLaunchKernel error: %s", cuda_error (r));
 
diff --git a/libgomp/testsuite/libgomp.c/allocators-1.c b/libgomp/testsuite/libgomp.c/allocators-1.c
new file mode 100644
index 00000000000..04968e4c83d
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/allocators-1.c
@@ -0,0 +1,56 @@
+/* { dg-do run } */
+
+/* Test that omp_alloc returns usable memory.  */
+
+#include <omp.h>
+
+#pragma omp requires dynamic_allocators
+
+void
+test (int n, omp_allocator_handle_t allocator)
+{
+  #pragma omp target map(to:n) map(to:allocator)
+  {
+    int *a;
+    a = (int *) omp_alloc(n*sizeof(int), allocator);
+
+    #pragma omp parallel
+    for (int i = 0; i < n; i++)
+      a[i] = i;
+
+    for (int i = 0; i < n; i++)
+      if (a[i] != i)
+	{
+	  __builtin_printf ("data mismatch at %i\n", i);
+	  __builtin_abort ();
+	}
+
+    omp_free(a, allocator);
+  }
+}
+
+int
+main ()
+{
+  // Smaller than low-latency memory limit
+  test (10, omp_default_mem_alloc);
+  test (10, omp_large_cap_mem_alloc);
+  test (10, omp_const_mem_alloc);
+  test (10, omp_high_bw_mem_alloc);
+  test (10, omp_low_lat_mem_alloc);
+  test (10, omp_cgroup_mem_alloc);
+  test (10, omp_pteam_mem_alloc);
+  test (10, omp_thread_mem_alloc);
+
+  // Larger than low-latency memory limit
+  test (100000, omp_default_mem_alloc);
+  test (100000, omp_large_cap_mem_alloc);
+  test (100000, omp_const_mem_alloc);
+  test (100000, omp_high_bw_mem_alloc);
+  test (100000, omp_low_lat_mem_alloc);
+  test (100000, omp_cgroup_mem_alloc);
+  test (100000, omp_pteam_mem_alloc);
+  test (100000, omp_thread_mem_alloc);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c/allocators-2.c b/libgomp/testsuite/libgomp.c/allocators-2.c
new file mode 100644
index 00000000000..a98f1b4c05e
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/allocators-2.c
@@ -0,0 +1,64 @@
+/* { dg-do run } */
+
+/* Test concurrent and repeated allocations.  */
+
+#include <omp.h>
+
+#pragma omp requires dynamic_allocators
+
+void
+test (int n, omp_allocator_handle_t allocator)
+{
+  #pragma omp target map(to:n) map(to:allocator)
+  {
+    int **a;
+    a = (int **) omp_alloc(n*sizeof(int*), allocator);
+
+    #pragma omp parallel for
+    for (int i = 0; i < n; i++)
+      {
+	/* Use 10x the size to ensure we do activate the low-latency
+	   fall-back.  */
+	a[i] = omp_alloc(sizeof(int)*10, allocator);
+	a[i][0] = i;
+      }
+
+    for (int i = 0; i < n; i++)
+      if (a[i][0] != i)
+	{
+	  __builtin_printf ("data mismatch at %i\n", i);
+	  __builtin_abort ();
+	}
+
+    #pragma omp parallel for
+    for (int i = 0; i < n; i++)
+      omp_free(a[i], allocator);
+
+    omp_free (a, allocator);
+  }
+}
+
+int
+main ()
+{
+  // Smaller than low-latency memory limit
+  test (10, omp_default_mem_alloc);
+  test (10, omp_large_cap_mem_alloc);
+  test (10, omp_const_mem_alloc);
+  test (10, omp_high_bw_mem_alloc);
+  test (10, omp_low_lat_mem_alloc);
+  test (10, omp_cgroup_mem_alloc);
+  test (10, omp_pteam_mem_alloc);
+  test (10, omp_thread_mem_alloc);
+
+  // Larger than low-latency memory limit (on aggregate)
+  test (1000, omp_default_mem_alloc);
+  test (1000, omp_large_cap_mem_alloc);
+  test (1000, omp_const_mem_alloc);
+  test (1000, omp_high_bw_mem_alloc);
+  test (1000, omp_low_lat_mem_alloc);
+  test (1000, omp_cgroup_mem_alloc);
+  test (1000, omp_pteam_mem_alloc);
+  test (1000, omp_thread_mem_alloc);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c/allocators-3.c b/libgomp/testsuite/libgomp.c/allocators-3.c
new file mode 100644
index 00000000000..45514c2a088
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/allocators-3.c
@@ -0,0 +1,42 @@
+/* { dg-do run } */
+
+/* Stress-test omp_alloc/omp_free under concurrency.  */
+
+#include <omp.h>
+#include <stdio.h>
+#include <stdlib.h>
+
+#pragma omp requires dynamic_allocators
+
+#define N 1000
+
+void
+test (omp_allocator_handle_t allocator)
+{
+  #pragma omp target map(to:allocator)
+  {
+    #pragma omp parallel for
+    for (int i = 0; i < N; i++)
+      for (int j = 0; j < N; j++)
+	{
+	  int *p = omp_alloc(sizeof(int), allocator);
+	  omp_free(p, allocator);
+	}
+  }
+}
+
+int
+main ()
+{
+  // Smaller than low-latency memory limit
+  test (omp_default_mem_alloc);
+  test (omp_large_cap_mem_alloc);
+  test (omp_const_mem_alloc);
+  test (omp_high_bw_mem_alloc);
+  test (omp_low_lat_mem_alloc);
+  test (omp_cgroup_mem_alloc);
+  test (omp_pteam_mem_alloc);
+  test (omp_thread_mem_alloc);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c/allocators-4.c b/libgomp/testsuite/libgomp.c/allocators-4.c
new file mode 100644
index 00000000000..9fa6aa1624f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/allocators-4.c
@@ -0,0 +1,196 @@
+/* { dg-do run } */
+
+/* Test that low-latency free chains are sound.  */
+
+#include <stddef.h>
+#include <omp.h>
+
+#pragma omp requires dynamic_allocators
+
+void
+check (int cond, const char *msg)
+{
+  if (!cond)
+    {
+      __builtin_printf ("%s\n", msg);
+      __builtin_abort ();
+    }
+}
+
+int
+main ()
+{
+  #pragma omp target
+  {
+    /* Ensure that the memory we get *is* low-latency with a null-fallback.  */
+    omp_alloctrait_t traits[1]
+      = { { omp_atk_fallback, omp_atv_null_fb } };
+    omp_allocator_handle_t lowlat = omp_init_allocator (omp_low_lat_mem_space,
+							1, traits);
+
+    int size = 4;
+
+    char *a = omp_alloc(size, lowlat);
+    char *b = omp_alloc(size, lowlat);
+    char *c = omp_alloc(size, lowlat);
+    char *d = omp_alloc(size, lowlat);
+
+    /* There are headers and padding to account for.  */
+    int size2 = size + (b-a);
+    int size3 = size + (c-a);
+    int size4 = size + (d-a) + 100; /* Random larger amount.  */
+
+    check (a != NULL && b != NULL && c != NULL && d != NULL,
+	   "omp_alloc returned NULL\n");
+
+    omp_free(a, lowlat);
+    char *p = omp_alloc (size, lowlat);
+    check (p == a, "allocate did not reuse first chunk");
+
+    omp_free(b, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not reuse second chunk");
+
+    omp_free(c, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not reuse third chunk");
+
+    omp_free(a, lowlat);
+    omp_free(b, lowlat);
+    p = omp_alloc (size2, lowlat);
+    check (p == a, "allocate did not coalesce first two chunks");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == a, "allocate did not split first chunk (1)");
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split first chunk (2)");
+
+    omp_free(b, lowlat);
+    omp_free(c, lowlat);
+    p = omp_alloc (size2, lowlat);
+    check (p == b, "allocate did not coalesce middle two chunks");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split second chunk (1)");
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not split second chunk (2)");
+
+    omp_free(b, lowlat);
+    omp_free(a, lowlat);
+    p = omp_alloc (size2, lowlat);
+    check (p == a, "allocate did not coalesce first two chunks, reverse free");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == a, "allocate did not split first chunk (1), reverse free");
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split first chunk (2), reverse free");
+
+    omp_free(c, lowlat);
+    omp_free(b, lowlat);
+    p = omp_alloc (size2, lowlat);
+    check (p == b, "allocate did not coalesce second two chunks, reverse free");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split second chunk (1), reverse free");
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not split second chunk (2), reverse free");
+
+    omp_free(a, lowlat);
+    omp_free(b, lowlat);
+    omp_free(c, lowlat);
+    p = omp_alloc (size3, lowlat);
+    check (p == a, "allocate did not coalesce first three chunks");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == a, "allocate did not split first chunk (1)");
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split first chunk (2)");
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not split first chunk (3)");
+
+    omp_free(b, lowlat);
+    omp_free(c, lowlat);
+    omp_free(d, lowlat);
+    p = omp_alloc (size3, lowlat);
+    check (p == b, "allocate did not coalesce last three chunks");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split second chunk (1)");
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not split second chunk (2)");
+    p = omp_alloc (size, lowlat);
+    check (p == d, "allocate did not split second chunk (3)");
+
+    omp_free(c, lowlat);
+    omp_free(b, lowlat);
+    omp_free(a, lowlat);
+    p = omp_alloc (size3, lowlat);
+    check (p == a, "allocate did not coalesce first three chunks, reverse free");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == a, "allocate did not split first chunk (1), reverse free");
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split first chunk (2), reverse free");
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not split first chunk (3), reverse free");
+
+    omp_free(d, lowlat);
+    omp_free(c, lowlat);
+    omp_free(b, lowlat);
+    p = omp_alloc (size3, lowlat);
+    check (p == b, "allocate did not coalesce second three chunks, reverse free");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split second chunk (1), reverse free");
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not split second chunk (2), reverse free");
+    p = omp_alloc (size, lowlat);
+    check (p == d, "allocate did not split second chunk (3), reverse free");
+
+    omp_free(c, lowlat);
+    omp_free(a, lowlat);
+    omp_free(b, lowlat);
+    p = omp_alloc (size3, lowlat);
+    check (p == a, "allocate did not coalesce first three chunks, mixed free");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == a, "allocate did not split first chunk (1), mixed free");
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split first chunk (2), mixed free");
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not split first chunk (3), mixed free");
+
+    omp_free(d, lowlat);
+    omp_free(b, lowlat);
+    omp_free(c, lowlat);
+    p = omp_alloc (size3, lowlat);
+    check (p == b, "allocate did not coalesce second three chunks, mixed free");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split second chunk (1), mixed free");
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not split second chunk (2), mixed free");
+    p = omp_alloc (size, lowlat);
+    check (p == d, "allocate did not split second chunk (3), mixed free");
+
+    omp_free(a, lowlat);
+    omp_free(b, lowlat);
+    omp_free(c, lowlat);
+    omp_free(d, lowlat);
+    p = omp_alloc(size4, lowlat);
+    check (p == a, "allocate did not coalesce all memory");
+  }
+
+  return 0;
+}
+
diff --git a/libgomp/testsuite/libgomp.c/allocators-5.c b/libgomp/testsuite/libgomp.c/allocators-5.c
new file mode 100644
index 00000000000..9694010cf1f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/allocators-5.c
@@ -0,0 +1,63 @@
+/* { dg-do run } */
+
+/* Test calloc with omp_alloc.  */
+
+#include <omp.h>
+
+#pragma omp requires dynamic_allocators
+
+void
+test (int n, omp_allocator_handle_t allocator)
+{
+  #pragma omp target map(to:n) map(to:allocator)
+  {
+    int *a;
+    a = (int *) omp_calloc(n, sizeof(int), allocator);
+
+    for (int i = 0; i < n; i++)
+      if (a[i] != 0)
+	{
+	  __builtin_printf ("memory not zeroed at %i\n", i);
+	  __builtin_abort ();
+	}
+
+    #pragma omp parallel
+    for (int i = 0; i < n; i++)
+      a[i] = i;
+
+    for (int i = 0; i < n; i++)
+      if (a[i] != i)
+	{
+	  __builtin_printf ("data mismatch at %i\n", i);
+	  __builtin_abort ();
+	}
+
+    omp_free(a, allocator);
+  }
+}
+
+int
+main ()
+{
+  // Smaller than low-latency memory limit
+  test (10, omp_default_mem_alloc);
+  test (10, omp_large_cap_mem_alloc);
+  test (10, omp_const_mem_alloc);
+  test (10, omp_high_bw_mem_alloc);
+  test (10, omp_low_lat_mem_alloc);
+  test (10, omp_cgroup_mem_alloc);
+  test (10, omp_pteam_mem_alloc);
+  test (10, omp_thread_mem_alloc);
+
+  // Larger than low-latency memory limit
+  test (100000, omp_default_mem_alloc);
+  test (100000, omp_large_cap_mem_alloc);
+  test (100000, omp_const_mem_alloc);
+  test (100000, omp_high_bw_mem_alloc);
+  test (100000, omp_low_lat_mem_alloc);
+  test (100000, omp_cgroup_mem_alloc);
+  test (100000, omp_pteam_mem_alloc);
+  test (100000, omp_thread_mem_alloc);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c/allocators-6.c b/libgomp/testsuite/libgomp.c/allocators-6.c
new file mode 100644
index 00000000000..90bf73095ef
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/allocators-6.c
@@ -0,0 +1,117 @@
+/* { dg-do run } */
+
+/* Test that low-latency realloc and free chains are sound.  */
+
+#include <stddef.h>
+#include <omp.h>
+
+#pragma omp requires dynamic_allocators
+
+void
+check (int cond, const char *msg)
+{
+  if (!cond)
+    {
+      __builtin_printf ("%s\n", msg);
+      __builtin_abort ();
+    }
+}
+
+int
+main ()
+{
+  #pragma omp target
+  {
+    /* Ensure that the memory we get *is* low-latency with a null-fallback.  */
+    omp_alloctrait_t traits[1]
+      = { { omp_atk_fallback, omp_atv_null_fb } };
+    omp_allocator_handle_t lowlat = omp_init_allocator (omp_low_lat_mem_space,
+							1, traits);
+
+    int size = 16;
+
+    char *a = (char *)omp_alloc(size, lowlat);
+    char *b = (char *)omp_alloc(size, lowlat);
+    char *c = (char *)omp_alloc(size, lowlat);
+    char *d = (char *)omp_alloc(size, lowlat);
+
+    /* There are headers and padding to account for.  */
+    int size2 = size + (b-a);
+    int size3 = size + (c-a);
+    int size4 = size + (d-a) + 100; /* Random larger amount.  */
+
+    check (a != NULL && b != NULL && c != NULL && d != NULL,
+	   "omp_alloc returned NULL\n");
+
+    char *p = omp_realloc (b, size, lowlat, lowlat);
+    check (p == b, "realloc did not reuse same size chunk, no space after");
+
+    p = omp_realloc (b, size-8, lowlat, lowlat);
+    check (p == b, "realloc did not reuse smaller chunk, no space after");
+
+    p = omp_realloc (b, size, lowlat, lowlat);
+    check (p == b, "realloc did not reuse original size chunk, no space after");
+
+    /* Make space after b.  */
+    omp_free(c, lowlat);
+
+    p = omp_realloc (b, size, lowlat, lowlat);
+    check (p == b, "realloc did not reuse same size chunk");
+
+    p = omp_realloc (b, size-8, lowlat, lowlat);
+    check (p == b, "realloc did not reuse smaller chunk");
+
+    p = omp_realloc (b, size, lowlat, lowlat);
+    check (p == b, "realloc did not reuse original size chunk");
+
+    p = omp_realloc (b, size+8, lowlat, lowlat);
+    check (p == b, "realloc did not extend in place by a little");
+
+    p = omp_realloc (b, size2, lowlat, lowlat);
+    check (p == b, "realloc did not extend into whole next chunk");
+
+    p = omp_realloc (b, size3, lowlat, lowlat);
+    check (p != b, "realloc did not move b elsewhere");
+    omp_free (p, lowlat);
+
+
+    p = omp_realloc (a, size, lowlat, lowlat);
+    check (p == a, "realloc did not reuse same size chunk, first position");
+
+    p = omp_realloc (a, size-8, lowlat, lowlat);
+    check (p == a, "realloc did not reuse smaller chunk, first position");
+
+    p = omp_realloc (a, size, lowlat, lowlat);
+    check (p == a, "realloc did not reuse original size chunk, first position");
+
+    p = omp_realloc (a, size+8, lowlat, lowlat);
+    check (p == a, "realloc did not extend in place by a little, first position");
+
+    p = omp_realloc (a, size3, lowlat, lowlat);
+    check (p == a, "realloc did not extend into whole next chunk, first position");
+
+    p = omp_realloc (a, size4, lowlat, lowlat);
+    check (p != a, "realloc did not move a elsewhere, first position");
+    omp_free (p, lowlat);
+
+
+    p = omp_realloc (d, size, lowlat, lowlat);
+    check (p == d, "realloc did not reuse same size chunk, last position");
+
+    p = omp_realloc (d, size-8, lowlat, lowlat);
+    check (p == d, "realloc did not reuse smaller chunk, last position");
+
+    p = omp_realloc (d, size, lowlat, lowlat);
+    check (p == d, "realloc did not reuse original size chunk, last position");
+
+    p = omp_realloc (d, size+8, lowlat, lowlat);
+    check (p == d, "realloc did not extend in place by d little, last position");
+
+    /* Larger than low latency memory.  */
+    p = omp_realloc(d, 100000000, lowlat, lowlat);
+    check (p == NULL, "realloc did not fail on OOM");
+  }
+
+  return 0;
+}
+
  
Andrew Stubbs Jan. 13, 2022, 11:13 a.m. UTC | #8
Updated patch: this version fixes some missed uses of malloc in the 
realloc implementation. It also reworks the unused-variable workarounds 
so that they work better with my reworked pinned-memory patches, which 
I've not posted yet.

Andrew
libgomp, nvptx: low-latency memory allocator

This patch adds support for allocating low-latency ".shared" memory on
NVPTX GPU devices, via omp_low_lat_mem_space and omp_alloc.  The memory
can be allocated, reallocated, and freed using a basic but fast algorithm;
the allocator is thread safe, and the size of the low-latency heap can be
configured using the GOMP_NVPTX_LOWLAT_POOL environment variable.

The use of the PTX dynamic_smem_size feature means that the minimum version
requirement is now bumped to 4.1 (still old at this point).

libgomp/ChangeLog:

	* allocator.c (MEMSPACE_ALLOC): New macro.
	(MEMSPACE_CALLOC): New macro.
	(MEMSPACE_REALLOC): New macro.
	(MEMSPACE_FREE): New macro.
	(dynamic_smem_size): New constant.
	(omp_alloc): Use MEMSPACE_ALLOC.
	Implement fall-backs for predefined allocators.
	(omp_free): Use MEMSPACE_FREE.
	(omp_calloc): Use MEMSPACE_CALLOC.
	Implement fall-backs for predefined allocators.
	(omp_realloc): Use MEMSPACE_REALLOC and MEMSPACE_ALLOC.
	Implement fall-backs for predefined allocators.
	* config/nvptx/team.c (__nvptx_lowlat_heap_root): New variable.
	(__nvptx_lowlat_pool): New asm variable.
	(gomp_nvptx_main): Initialize the low-latency heap.
	* plugin/plugin-nvptx.c (lowlat_pool_size): New variable.
	(GOMP_OFFLOAD_init_device): Read the GOMP_NVPTX_LOWLAT_POOL envvar.
	(GOMP_OFFLOAD_run): Apply lowlat_pool_size.
	* config/nvptx/allocator.c: New file.
	* testsuite/libgomp.c/allocators-1.c: New test.
	* testsuite/libgomp.c/allocators-2.c: New test.
	* testsuite/libgomp.c/allocators-3.c: New test.
	* testsuite/libgomp.c/allocators-4.c: New test.
	* testsuite/libgomp.c/allocators-5.c: New test.
	* testsuite/libgomp.c/allocators-6.c: New test.

diff --git a/libgomp/allocator.c b/libgomp/allocator.c
index 07a5645f4cc..1cc7486fc4c 100644
--- a/libgomp/allocator.c
+++ b/libgomp/allocator.c
@@ -34,6 +34,34 @@
 
 #define omp_max_predefined_alloc omp_thread_mem_alloc
 
+/* These macros may be overridden in config/<target>/allocator.c.  */
+#ifndef MEMSPACE_ALLOC
+#define MEMSPACE_ALLOC(MEMSPACE, SIZE) malloc (SIZE)
+#endif
+#ifndef MEMSPACE_CALLOC
+#define MEMSPACE_CALLOC(MEMSPACE, SIZE) calloc (1, SIZE)
+#endif
+#ifndef MEMSPACE_REALLOC
+#define MEMSPACE_REALLOC(MEMSPACE, ADDR, OLDSIZE, SIZE) realloc (ADDR, SIZE)
+#endif
+#ifndef MEMSPACE_FREE
+#define MEMSPACE_FREE(MEMSPACE, ADDR, SIZE) free (ADDR)
+#endif
+
+/* Map the predefined allocators to the correct memory space.
+   The index to this table is the omp_allocator_handle_t enum value.  */
+static const omp_memspace_handle_t predefined_alloc_mapping[] = {
+  omp_default_mem_space,   /* omp_null_allocator. */
+  omp_default_mem_space,   /* omp_default_mem_alloc. */
+  omp_large_cap_mem_space, /* omp_large_cap_mem_alloc. */
+  omp_default_mem_space,   /* omp_const_mem_alloc. */
+  omp_high_bw_mem_space,   /* omp_high_bw_mem_alloc. */
+  omp_low_lat_mem_space,   /* omp_low_lat_mem_alloc. */
+  omp_low_lat_mem_space,   /* omp_cgroup_mem_alloc. */
+  omp_low_lat_mem_space,   /* omp_pteam_mem_alloc. */
+  omp_low_lat_mem_space,   /* omp_thread_mem_alloc. */
+};
+
 struct omp_allocator_data
 {
   omp_memspace_handle_t memspace;
@@ -281,7 +309,7 @@ retry:
       allocator_data->used_pool_size = used_pool_size;
       gomp_mutex_unlock (&allocator_data->lock);
 #endif
-      ptr = malloc (new_size);
+      ptr = MEMSPACE_ALLOC (allocator_data->memspace, new_size);
       if (ptr == NULL)
 	{
 #ifdef HAVE_SYNC_BUILTINS
@@ -297,7 +325,11 @@ retry:
     }
   else
     {
-      ptr = malloc (new_size);
+      omp_memspace_handle_t memspace __attribute__((unused))
+	= (allocator_data
+	   ? allocator_data->memspace
+	   : predefined_alloc_mapping[allocator]);
+      ptr = MEMSPACE_ALLOC (memspace, new_size);
       if (ptr == NULL)
 	goto fail;
     }
@@ -315,32 +347,35 @@ retry:
   return ret;
 
 fail:
-  if (allocator_data)
+  int fallback = (allocator_data
+		  ? allocator_data->fallback
+		  : allocator == omp_default_mem_alloc
+		  ? omp_atv_null_fb
+		  : omp_atv_default_mem_fb);
+  switch (fallback)
     {
-      switch (allocator_data->fallback)
+    case omp_atv_default_mem_fb:
+      if ((new_alignment > sizeof (void *) && new_alignment > alignment)
+	  || (allocator_data
+	      && allocator_data->pool_size < ~(uintptr_t) 0)
+	  || !allocator_data)
 	{
-	case omp_atv_default_mem_fb:
-	  if ((new_alignment > sizeof (void *) && new_alignment > alignment)
-	      || (allocator_data
-		  && allocator_data->pool_size < ~(uintptr_t) 0))
-	    {
-	      allocator = omp_default_mem_alloc;
-	      goto retry;
-	    }
-	  /* Otherwise, we've already performed default mem allocation
-	     and if that failed, it won't succeed again (unless it was
-	     intermittent.  Return NULL then, as that is the fallback.  */
-	  break;
-	case omp_atv_null_fb:
-	  break;
-	default:
-	case omp_atv_abort_fb:
-	  gomp_fatal ("Out of memory allocating %lu bytes",
-		      (unsigned long) size);
-	case omp_atv_allocator_fb:
-	  allocator = allocator_data->fb_data;
+	  allocator = omp_default_mem_alloc;
 	  goto retry;
 	}
+      /* Otherwise, we've already performed default mem allocation
+	 and if that failed, it won't succeed again (unless it was
+	 intermittent.  Return NULL then, as that is the fallback.  */
+      break;
+    case omp_atv_null_fb:
+      break;
+    default:
+    case omp_atv_abort_fb:
+      gomp_fatal ("Out of memory allocating %lu bytes",
+		  (unsigned long) size);
+    case omp_atv_allocator_fb:
+      allocator = allocator_data->fb_data;
+      goto retry;
     }
   return NULL;
 }
@@ -373,6 +408,8 @@ void
 omp_free (void *ptr, omp_allocator_handle_t allocator)
 {
   struct omp_mem_header *data;
+  omp_memspace_handle_t memspace __attribute__((unused))
+    = omp_default_mem_space;
 
   if (ptr == NULL)
     return;
@@ -393,8 +430,13 @@ omp_free (void *ptr, omp_allocator_handle_t allocator)
 	  gomp_mutex_unlock (&allocator_data->lock);
 #endif
 	}
+
+      memspace = allocator_data->memspace;
     }
-  free (data->ptr);
+  else
+    memspace = predefined_alloc_mapping[data->allocator];
+
+  MEMSPACE_FREE (memspace, data->ptr, data->size);
 }
 
 ialias (omp_free)
@@ -482,7 +524,7 @@ retry:
       allocator_data->used_pool_size = used_pool_size;
       gomp_mutex_unlock (&allocator_data->lock);
 #endif
-      ptr = calloc (1, new_size);
+      ptr = MEMSPACE_CALLOC (allocator_data->memspace, new_size);
       if (ptr == NULL)
 	{
 #ifdef HAVE_SYNC_BUILTINS
@@ -498,7 +540,11 @@ retry:
     }
   else
     {
-      ptr = calloc (1, new_size);
+      omp_memspace_handle_t memspace __attribute__((unused))
+	= (allocator_data
+	   ? allocator_data->memspace
+	   : predefined_alloc_mapping[allocator]);
+      ptr = MEMSPACE_CALLOC (memspace, new_size);
       if (ptr == NULL)
 	goto fail;
     }
@@ -516,32 +562,35 @@ retry:
   return ret;
 
 fail:
-  if (allocator_data)
+  int fallback = (allocator_data
+		  ? allocator_data->fallback
+		  : allocator == omp_default_mem_alloc
+		  ? omp_atv_null_fb
+		  : omp_atv_default_mem_fb);
+  switch (fallback)
     {
-      switch (allocator_data->fallback)
+    case omp_atv_default_mem_fb:
+      if ((new_alignment > sizeof (void *) && new_alignment > alignment)
+	  || (allocator_data
+	      && allocator_data->pool_size < ~(uintptr_t) 0)
+	  || !allocator_data)
 	{
-	case omp_atv_default_mem_fb:
-	  if ((new_alignment > sizeof (void *) && new_alignment > alignment)
-	      || (allocator_data
-		  && allocator_data->pool_size < ~(uintptr_t) 0))
-	    {
-	      allocator = omp_default_mem_alloc;
-	      goto retry;
-	    }
-	  /* Otherwise, we've already performed default mem allocation
-	     and if that failed, it won't succeed again (unless it was
-	     intermittent.  Return NULL then, as that is the fallback.  */
-	  break;
-	case omp_atv_null_fb:
-	  break;
-	default:
-	case omp_atv_abort_fb:
-	  gomp_fatal ("Out of memory allocating %lu bytes",
-		      (unsigned long) (size * nmemb));
-	case omp_atv_allocator_fb:
-	  allocator = allocator_data->fb_data;
+	  allocator = omp_default_mem_alloc;
 	  goto retry;
 	}
+      /* Otherwise, we've already performed default mem allocation
+	 and if that failed, it won't succeed again (unless it was
+	 intermittent.  Return NULL then, as that is the fallback.  */
+      break;
+    case omp_atv_null_fb:
+      break;
+    default:
+    case omp_atv_abort_fb:
+      gomp_fatal ("Out of memory allocating %lu bytes",
+		  (unsigned long) (size * nmemb));
+    case omp_atv_allocator_fb:
+      allocator = allocator_data->fb_data;
+      goto retry;
     }
   return NULL;
 }
@@ -660,9 +709,10 @@ retry:
       gomp_mutex_unlock (&allocator_data->lock);
 #endif
       if (prev_size)
-	new_ptr = realloc (data->ptr, new_size);
+	new_ptr = MEMSPACE_REALLOC (allocator_data->memspace, data->ptr,
+				    data->size, new_size);
       else
-	new_ptr = malloc (new_size);
+	new_ptr = MEMSPACE_ALLOC (allocator_data->memspace, new_size);
       if (new_ptr == NULL)
 	{
 #ifdef HAVE_SYNC_BUILTINS
@@ -690,7 +740,11 @@ retry:
 	   && (free_allocator_data == NULL
 	       || free_allocator_data->pool_size == ~(uintptr_t) 0))
     {
-      new_ptr = realloc (data->ptr, new_size);
+      omp_memspace_handle_t memspace __attribute__((unused))
+	= (allocator_data
+	   ? allocator_data->memspace
+	   : predefined_alloc_mapping[allocator]);
+      new_ptr = MEMSPACE_REALLOC (memspace, data->ptr, data->size, new_size);
       if (new_ptr == NULL)
 	goto fail;
       ret = (char *) new_ptr + sizeof (struct omp_mem_header);
@@ -701,7 +755,11 @@ retry:
     }
   else
     {
-      new_ptr = malloc (new_size);
+      omp_memspace_handle_t memspace __attribute__((unused))
+	= (allocator_data
+	   ? allocator_data->memspace
+	   : predefined_alloc_mapping[allocator]);
+      new_ptr = MEMSPACE_ALLOC (memspace, new_size);
       if (new_ptr == NULL)
 	goto fail;
     }
@@ -735,32 +793,35 @@ retry:
   return ret;
 
 fail:
-  if (allocator_data)
+  int fallback = (allocator_data
+		  ? allocator_data->fallback
+		  : allocator == omp_default_mem_alloc
+		  ? omp_atv_null_fb
+		  : omp_atv_default_mem_fb);
+  switch (fallback)
     {
-      switch (allocator_data->fallback)
+    case omp_atv_default_mem_fb:
+      if (new_alignment > sizeof (void *)
+	  || (allocator_data
+	      && allocator_data->pool_size < ~(uintptr_t) 0)
+	  || !allocator_data)
 	{
-	case omp_atv_default_mem_fb:
-	  if (new_alignment > sizeof (void *)
-	      || (allocator_data
-		  && allocator_data->pool_size < ~(uintptr_t) 0))
-	    {
-	      allocator = omp_default_mem_alloc;
-	      goto retry;
-	    }
-	  /* Otherwise, we've already performed default mem allocation
-	     and if that failed, it won't succeed again (unless it was
-	     intermittent.  Return NULL then, as that is the fallback.  */
-	  break;
-	case omp_atv_null_fb:
-	  break;
-	default:
-	case omp_atv_abort_fb:
-	  gomp_fatal ("Out of memory allocating %lu bytes",
-		      (unsigned long) size);
-	case omp_atv_allocator_fb:
-	  allocator = allocator_data->fb_data;
+	  allocator = omp_default_mem_alloc;
 	  goto retry;
 	}
+      /* Otherwise, we've already performed default mem allocation
+	 and if that failed, it won't succeed again (unless it was
+	 intermittent).  Return NULL then, as that is the fallback.  */
+      break;
+    case omp_atv_null_fb:
+      break;
+    default:
+    case omp_atv_abort_fb:
+      gomp_fatal ("Out of memory allocating %lu bytes",
+		  (unsigned long) size);
+    case omp_atv_allocator_fb:
+      allocator = allocator_data->fb_data;
+      goto retry;
     }
   return NULL;
 }
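
The net effect of the fall-back rework above: a custom allocator still
honours its omp_atk_fallback trait, while a predefined allocator now falls
back to default memory, except omp_default_mem_alloc itself, which returns
NULL.  A standalone restatement of that decision follows; choose_fallback
and its parameters are invented for illustration and are not part of the
patch.

#include <omp.h>

/* Sketch of the fall-back selection omp_alloc, omp_calloc and omp_realloc
   perform when MEMSPACE_ALLOC and friends return NULL.  FALLBACK_TRAIT is
   the custom allocator's omp_atk_fallback trait, or -1 when the handle
   names a predefined allocator.  */
static int
choose_fallback (int fallback_trait, omp_allocator_handle_t allocator)
{
  if (fallback_trait >= 0)
    return fallback_trait;	   /* Custom allocator: honour the trait.  */
  if (allocator == omp_default_mem_alloc)
    return omp_atv_null_fb;	   /* Default memory already failed.  */
  return omp_atv_default_mem_fb;   /* Other predefined allocators retry in
				      default memory.  */
}
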
diff --git a/libgomp/config/nvptx/allocator.c b/libgomp/config/nvptx/allocator.c
new file mode 100644
index 00000000000..6bc2ea48043
--- /dev/null
+++ b/libgomp/config/nvptx/allocator.c
@@ -0,0 +1,370 @@
+/* Copyright (C) 2021 Free Software Foundation, Inc.
+
+   This file is part of the GNU Offloading and Multi Processing Library
+   (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* The low-latency allocators use space reserved in .shared memory when the
+   kernel is launched.  The heap is initialized in gomp_nvptx_main and all
+   allocations are forgotten when the kernel exits.  Allocations to other
+   memory spaces all use the system malloc syscall.
+
+   The root heap descriptor is stored elsewhere in shared memory, and each
+   free chunk contains a similar descriptor for the next free chunk in the
+   chain.
+
+   The descriptor is two 16-bit values: offset and size, which describe the
+   location of a chunk of memory available for allocation. The offset is
+   relative to the base of the heap.  The special value 0xffff, 0xffff
+   indicates that the heap is locked.  The descriptor is encoded into a
+   single 32-bit integer so that it may be easily accessed atomically.
+
+   Memory is allocated to the first free chunk that fits.  The free chain
+   is always stored in order of the offset to assist coalescing adjacent
+   chunks.  */
+
+#include "libgomp.h"
+#include <stdlib.h>
+
+/* There should be some .shared space reserved for us.  There's no way to
+   express this magic extern sizeless array in C so use asm.  */
+asm (".extern .shared .u8 __nvptx_lowlat_pool[];\n");
+
+extern uint32_t __nvptx_lowlat_heap_root __attribute__((shared,nocommon));
+
+typedef union {
+  uint32_t raw;
+  struct {
+    uint16_t offset;
+    uint16_t size;
+  } desc;
+} heapdesc;
+
+static void *
+nvptx_memspace_alloc (omp_memspace_handle_t memspace, size_t size)
+{
+  if (memspace == omp_low_lat_mem_space)
+    {
+      char *shared_pool;
+      asm ("cvta.shared.u64\t%0, __nvptx_lowlat_pool;" : "=r"(shared_pool));
+
+      /* Memory is allocated in 8-byte granularity.  */
+      size = (size + 7) & ~7;
+
+      /* Acquire a lock on the low-latency heap.  */
+      heapdesc root;
+      do
+	{
+	  root.raw = __atomic_exchange_n (&__nvptx_lowlat_heap_root,
+					  0xffffffff, MEMMODEL_ACQUIRE);
+	  if (root.raw != 0xffffffff)
+	    break;
+	  /* Spin.  */
+	}
+      while (1);
+
+      /* Walk the free chain.  */
+      heapdesc chunk = {root.raw};
+      uint32_t *prev_chunkptr = NULL;
+      uint32_t *chunkptr = (uint32_t*)(shared_pool + chunk.desc.offset);
+      heapdesc onward_chain = {chunkptr[0]};
+      while (chunk.desc.size != 0 && (uint32_t)size > chunk.desc.size)
+	{
+	  chunk.raw = onward_chain.raw;
+	  prev_chunkptr = chunkptr;
+	  chunkptr = (uint32_t*)(shared_pool + chunk.desc.offset);
+	  onward_chain.raw = chunkptr[0];
+	}
+
+      void *result = NULL;
+      if (chunk.desc.size != 0)
+	{
+	  /* Allocation successful.  */
+	  result = chunkptr;
+
+	  /* Update the free chain.  */
+	  heapdesc stillfree = {chunk.raw};
+	  stillfree.desc.offset += size;
+	  stillfree.desc.size -= size;
+	  uint32_t *stillfreeptr = (uint32_t*)(shared_pool
+					       + stillfree.desc.offset);
+
+	  if (stillfree.desc.size == 0)
+	    /* The whole chunk was used.  */
+	    stillfree.raw = onward_chain.raw;
+	  else
+	    /* The chunk was split, so restore the onward chain.  */
+	    stillfreeptr[0] = onward_chain.raw;
+
+	  /* The previous free slot or root now points to stillfree.  */
+	  if (prev_chunkptr)
+	    prev_chunkptr[0] = stillfree.raw;
+	  else
+	    root.raw = stillfree.raw;
+	}
+
+      /* Update the free chain root and release the lock.  */
+      __atomic_store_n (&__nvptx_lowlat_heap_root, root.raw, MEMMODEL_RELEASE);
+      return result;
+    }
+  else
+    return malloc (size);
+}
+
+static void *
+nvptx_memspace_calloc (omp_memspace_handle_t memspace, size_t size)
+{
+  if (memspace == omp_low_lat_mem_space)
+    {
+      /* Memory is allocated in 8-byte granularity.  */
+      size = (size + 7) & ~7;
+
+      uint64_t *result = nvptx_memspace_alloc (memspace, size);
+      if (result)
+	/* Inline memset in which we know size is a multiple of 8.  */
+	for (unsigned i = 0; i < (unsigned)size/8; i++)
+	  result[i] = 0;
+
+      return result;
+    }
+  else
+    return calloc (1, size);
+}
+
+static void
+nvptx_memspace_free (omp_memspace_handle_t memspace, void *addr, size_t size)
+{
+  if (memspace == omp_low_lat_mem_space)
+    {
+      char *shared_pool;
+      asm ("cvta.shared.u64\t%0, __nvptx_lowlat_pool;" : "=r"(shared_pool));
+
+      /* Memory is allocated in 8-byte granularity.  */
+      size = (size + 7) & ~7;
+
+      /* Acquire a lock on the low-latency heap.  */
+      heapdesc root;
+      do
+	{
+	  root.raw = __atomic_exchange_n (&__nvptx_lowlat_heap_root,
+					  0xffffffff, MEMMODEL_ACQUIRE);
+	  if (root.raw != 0xffffffff)
+	    break;
+	  /* Spin.  */
+	}
+      while (1);
+
+      /* Walk the free chain to find where to insert a new entry.  */
+      heapdesc chunk = {root.raw}, prev_chunk;
+      uint32_t *prev_chunkptr = NULL, *prevprev_chunkptr = NULL;
+      uint32_t *chunkptr = (uint32_t*)(shared_pool + chunk.desc.offset);
+      heapdesc onward_chain = {chunkptr[0]};
+      while (chunk.desc.size != 0 && addr > (void*)chunkptr)
+	{
+	  prev_chunk.raw = chunk.raw;
+	  chunk.raw = onward_chain.raw;
+	  prevprev_chunkptr = prev_chunkptr;
+	  prev_chunkptr = chunkptr;
+	  chunkptr = (uint32_t*)(shared_pool + chunk.desc.offset);
+	  onward_chain.raw = chunkptr[0];
+	}
+
+      /* Create the new chunk descriptor.  */
+      heapdesc newfreechunk;
+      newfreechunk.desc.offset = (uint16_t)((uintptr_t)addr
+					    - (uintptr_t)shared_pool);
+      newfreechunk.desc.size = (uint16_t)size;
+
+      /* Coalesce adjacent free chunks.  */
+      if (newfreechunk.desc.offset + size == chunk.desc.offset)
+	{
+	  /* Free chunk follows.  */
+	  newfreechunk.desc.size += chunk.desc.size;
+	  chunk.raw = onward_chain.raw;
+	}
+      if (prev_chunkptr)
+	{
+	  if (prev_chunk.desc.offset + prev_chunk.desc.size
+	      == newfreechunk.desc.offset)
+	    {
+	      /* Free chunk precedes.  */
+	      newfreechunk.desc.offset = prev_chunk.desc.offset;
+	      newfreechunk.desc.size += prev_chunk.desc.size;
+	      addr = shared_pool + prev_chunk.desc.offset;
+	      prev_chunkptr = prevprev_chunkptr;
+	    }
+	}
+
+      /* Update the free chain in the new and previous chunks.  */
+      ((uint32_t*)addr)[0] = chunk.raw;
+      if (prev_chunkptr)
+	prev_chunkptr[0] = newfreechunk.raw;
+      else
+	root.raw = newfreechunk.raw;
+
+      /* Update the free chain root and release the lock.  */
+      __atomic_store_n (&__nvptx_lowlat_heap_root, root.raw, MEMMODEL_RELEASE);
+    }
+  else
+    free (addr);
+}
+
+static void *
+nvptx_memspace_realloc (omp_memspace_handle_t memspace, void *addr,
+			size_t oldsize, size_t size)
+{
+  if (memspace == omp_low_lat_mem_space)
+    {
+      char *shared_pool;
+      asm ("cvta.shared.u64\t%0, __nvptx_lowlat_pool;" : "=r"(shared_pool));
+
+      /* Memory is allocated in 8-byte granularity.  */
+      oldsize = (oldsize + 7) & ~7;
+      size = (size + 7) & ~7;
+
+      if (oldsize == size)
+	return addr;
+
+      /* Acquire a lock on the low-latency heap.  */
+      heapdesc root;
+      do
+	{
+	  root.raw = __atomic_exchange_n (&__nvptx_lowlat_heap_root,
+					  0xffffffff, MEMMODEL_ACQUIRE);
+	  if (root.raw != 0xffffffff)
+	    break;
+	  /* Spin.  */
+	}
+      while (1);
+
+      /* Walk the free chain.  */
+      heapdesc chunk = {root.raw};
+      uint32_t *prev_chunkptr = NULL;
+      uint32_t *chunkptr = (uint32_t*)(shared_pool + chunk.desc.offset);
+      heapdesc onward_chain = {chunkptr[0]};
+      while (chunk.desc.size != 0 && (void*)chunkptr < addr)
+	{
+	  chunk.raw = onward_chain.raw;
+	  prev_chunkptr = chunkptr;
+	  chunkptr = (uint32_t*)(shared_pool + chunk.desc.offset);
+	  onward_chain.raw = chunkptr[0];
+	}
+
+      void *result = NULL;
+      if (size < oldsize)
+	{
+	  /* The new allocation is smaller than the old; we can always
+	     shrink an allocation in place.  */
+	  result = addr;
+
+	  uint32_t *nowfreeptr = (uint32_t*)(addr + size);
+
+	  /* Update the free chain.  */
+	  heapdesc nowfree;
+	  nowfree.desc.offset = (char*)nowfreeptr - shared_pool;
+	  nowfree.desc.size = oldsize - size;
+
+	  if (nowfree.desc.offset + nowfree.desc.size == chunk.desc.offset)
+	    {
+	      /* Coalesce following free chunk.  */
+	      nowfree.desc.size += chunk.desc.size;
+	      nowfreeptr[0] = onward_chain.raw;
+	    }
+	  else
+	    nowfreeptr[0] = chunk.raw;
+
+	  /* The previous free slot or root now points to nowfree.  */
+	  if (prev_chunkptr)
+	    prev_chunkptr[0] = nowfree.raw;
+	  else
+	    root.raw = nowfree.raw;
+	}
+      else if (chunk.desc.size != 0
+	       && (char *)addr + oldsize == (char *)chunkptr
+	       && chunk.desc.size >= size-oldsize)
+	{
+	  /* The new allocation is larger than the old, and we found a
+	     large enough free block right after the existing block,
+	     so we extend into that space.  */
+	  result = addr;
+
+	  uint16_t delta = size-oldsize;
+
+	  /* Update the free chain.  */
+	  heapdesc stillfree = {chunk.raw};
+	  stillfree.desc.offset += delta;
+	  stillfree.desc.size -= delta;
+	  uint32_t *stillfreeptr = (uint32_t*)(shared_pool
+					       + stillfree.desc.offset);
+
+	  if (stillfree.desc.size == 0)
+	    /* The whole chunk was used.  */
+	    stillfree.raw = onward_chain.raw;
+	  else
+	    /* The chunk was split, so restore the onward chain.  */
+	    stillfreeptr[0] = onward_chain.raw;
+
+	  /* The previous free slot or root now points to stillfree.  */
+	  if (prev_chunkptr)
+	    prev_chunkptr[0] = stillfree.raw;
+	  else
+	    root.raw = stillfree.raw;
+	}
+      /* Else realloc in-place has failed and result remains NULL.  */
+
+      /* Update the free chain root and release the lock.  */
+      __atomic_store_n (&__nvptx_lowlat_heap_root, root.raw, MEMMODEL_RELEASE);
+
+      if (result == NULL)
+	{
+	  /* The allocation could not be extended in place, so we simply
+	     allocate fresh memory and move the data.  If we can't allocate
+	     from low-latency memory, then we leave the original allocation
+	     intact and return NULL.
+	     We could do a fall-back to main memory, but we don't know what
+	     the fall-back trait said to do.  */
+	  result = nvptx_memspace_alloc (memspace, size);
+	  if (result != NULL)
+	    {
+	      /* Inline memcpy in which we know oldsize is a multiple of 8.  */
+	      uint64_t *from = addr, *to = result;
+	      for (unsigned i = 0; i < (unsigned)oldsize/8; i++)
+		to[i] = from[i];
+
+	      nvptx_memspace_free (memspace, addr, oldsize);
+	    }
+	}
+      return result;
+    }
+  else
+    return realloc (addr, size);
+}
+
+#define MEMSPACE_ALLOC(MEMSPACE, SIZE) \
+  nvptx_memspace_alloc (MEMSPACE, SIZE)
+#define MEMSPACE_CALLOC(MEMSPACE, SIZE) \
+  nvptx_memspace_calloc (MEMSPACE, SIZE)
+#define MEMSPACE_REALLOC(MEMSPACE, ADDR, OLDSIZE, SIZE) \
+  nvptx_memspace_realloc (MEMSPACE, ADDR, OLDSIZE, SIZE)
+#define MEMSPACE_FREE(MEMSPACE, ADDR, SIZE) \
+  nvptx_memspace_free (MEMSPACE, ADDR, SIZE)
+
+#include "../../allocator.c"
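
To make the descriptor scheme concrete, here is a minimal host-side sketch
of the 32-bit offset/size packing and the first-fit walk over a mock pool.
This is illustrative only, not part of the patch: the lock, the allocation
header, and the onward-chain restoration on split are omitted (the pool is
zeroed, so the remainder's first word is already the end-of-chain
terminator).

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Mirror of the patch's heapdesc: an offset/size pair packed into one
   32-bit word so the free-chain root can be swapped atomically.  */
typedef union {
  uint32_t raw;
  struct { uint16_t offset; uint16_t size; } desc;
} heapdesc;

static char pool[256];	/* Stand-in for __nvptx_lowlat_pool.  */

int
main (void)
{
  /* Empty heap: one free chunk covering the whole pool.  */
  memset (pool, 0, sizeof pool);
  heapdesc root = { .desc = { .offset = 0, .size = sizeof pool } };

  /* First-fit walk for a request already rounded to 8-byte granularity.  */
  uint32_t want = 24;
  heapdesc chunk = { root.raw };
  uint32_t *chunkptr = (uint32_t *) (pool + chunk.desc.offset);
  heapdesc onward = { chunkptr[0] };
  while (chunk.desc.size != 0 && want > chunk.desc.size)
    {
      chunk.raw = onward.raw;
      chunkptr = (uint32_t *) (pool + chunk.desc.offset);
      onward.raw = chunkptr[0];
    }

  if (chunk.desc.size != 0)
    {
      /* Split the chunk; the remainder stays on the free chain.  */
      heapdesc stillfree = { chunk.raw };
      stillfree.desc.offset += want;
      stillfree.desc.size -= want;
      root.raw = stillfree.raw;
      printf ("allocated %u bytes at offset %u; %u bytes free at offset %u\n",
	      (unsigned) want, (unsigned) chunk.desc.offset,
	      (unsigned) stillfree.desc.size,
	      (unsigned) stillfree.desc.offset);
    }
  return 0;
}
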
diff --git a/libgomp/config/nvptx/team.c b/libgomp/config/nvptx/team.c
index 6923416fb4e..f43d63df57d 100644
--- a/libgomp/config/nvptx/team.c
+++ b/libgomp/config/nvptx/team.c
@@ -33,9 +33,13 @@
 
 struct gomp_thread *nvptx_thrs __attribute__((shared,nocommon));
 int __gomp_team_num __attribute__((shared,nocommon));
+uint32_t __nvptx_lowlat_heap_root __attribute__((shared,nocommon));
 
 static void gomp_thread_start (struct gomp_thread_pool *);
 
+/* There should be some .shared space reserved for us.  There's no way to
+   express this magic extern sizeless array in C so use asm.  */
+asm (".extern .shared .u8 __nvptx_lowlat_pool[];\n");
 
 /* This externally visible function handles target region entry.  It
    sets up a per-team thread pool and transfers control by calling FN (FN_DATA)
@@ -63,6 +67,27 @@ gomp_nvptx_main (void (*fn) (void *), void *fn_data)
       nvptx_thrs = alloca (ntids * sizeof (*nvptx_thrs));
       memset (nvptx_thrs, 0, ntids * sizeof (*nvptx_thrs));
 
+      /* Find the low-latency heap details ....  */
+      uint32_t *shared_pool;
+      uint32_t shared_pool_size;
+      asm ("cvta.shared.u64\t%0, __nvptx_lowlat_pool;" : "=r"(shared_pool));
+      asm ("mov.u32\t%0, %%dynamic_smem_size;\n"
+	   : "=r"(shared_pool_size));
+
+      /* ... and initialize it with an empty free-chain.  */
+      union {
+	uint32_t raw;
+	struct {
+	  uint16_t offset;
+	  uint16_t size;
+	} desc;
+      } root;
+      root.desc.offset = 0;		 /* The first byte is free.  */
+      root.desc.size = shared_pool_size; /* The whole space is free.  */
+      shared_pool[0] = 0;		 /* Terminate free chain.  */
+      __atomic_store_n (&__nvptx_lowlat_heap_root, root.raw, MEMMODEL_RELEASE);
+
+      /* Initialize the thread pool.  */
       struct gomp_thread_pool *pool = alloca (sizeof (*pool));
       pool->threads = alloca (ntids * sizeof (*pool->threads));
       for (tid = 0; tid < ntids; tid++)
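
The atomic store above also establishes the locking convention used by
every entry point in config/nvptx/allocator.c: the root descriptor doubles
as a spin lock, with 0xffffffff (the impossible offset/size pair
0xffff,0xffff) as the locked sentinel.  Below is a standalone sketch of the
acquire/release pair, using the raw __ATOMIC_* models in place of libgomp's
MEMMODEL_* wrappers; the names are illustrative only.

#include <stdint.h>

static uint32_t heap_root;	/* Stands in for __nvptx_lowlat_heap_root.  */

/* Acquire: atomically swap in the locked sentinel.  If another thread
   already holds the lock we read the sentinel back and must spin.  */
static uint32_t
lowlat_lock (void)
{
  uint32_t root;
  do
    root = __atomic_exchange_n (&heap_root, 0xffffffff, __ATOMIC_ACQUIRE);
  while (root == 0xffffffff);
  return root;	/* The caller owns the free chain until unlock.  */
}

/* Release: publish the (possibly updated) root descriptor.  */
static void
lowlat_unlock (uint32_t root)
{
  __atomic_store_n (&heap_root, root, __ATOMIC_RELEASE);
}

int
main (void)
{
  uint32_t root = lowlat_lock ();
  /* ... inspect or rewrite the free chain here ...  */
  lowlat_unlock (root);
  return 0;
}
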
diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index b4f0a84d77a..1b9a5e95c07 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -330,6 +330,11 @@ struct ptx_device
 
 static struct ptx_device **ptx_devices;
 
+/* OpenMP kernels reserve a small amount of ".shared" space for use by
+   omp_alloc.  The size is configured using GOMP_NVPTX_LOWLAT_POOL, but the
+   default is set here.  */
+static unsigned lowlat_pool_size = 8*1024;
+
 static inline struct nvptx_thread *
 nvptx_thread (void)
 {
@@ -1196,6 +1201,22 @@ GOMP_OFFLOAD_init_device (int n)
       instantiated_devices++;
     }
 
+  const char *var_name = "GOMP_NVPTX_LOWLAT_POOL";
+  const char *env_var = secure_getenv (var_name);
+  notify_var (var_name, env_var);
+
+  if (env_var != NULL)
+    {
+      char *endptr;
+      unsigned long val = strtoul (env_var, &endptr, 10);
+      if (endptr == NULL || *endptr != '\0'
+	  || errno == ERANGE || errno == EINVAL
+	  || val > UINT_MAX)
+	GOMP_PLUGIN_error ("Error parsing %s", var_name);
+      else
+	lowlat_pool_size = val;
+    }
+
   pthread_mutex_unlock (&ptx_dev_lock);
 
   return dev != NULL;
@@ -2021,7 +2042,7 @@ GOMP_OFFLOAD_run (int ord, void *tgt_fn, void *tgt_vars, void **args)
 		     " [(teams: %u), 1, 1] [(lanes: 32), (threads: %u), 1]\n",
 		     __FUNCTION__, fn_name, teams, threads);
   r = CUDA_CALL_NOCHECK (cuLaunchKernel, function, teams, 1, 1,
-			 32, threads, 1, 0, NULL, NULL, config);
+			 32, threads, 1, lowlat_pool_size, NULL, NULL, config);
   if (r != CUDA_SUCCESS)
     GOMP_PLUGIN_fatal ("cuLaunchKernel error: %s", cuda_error (r));
 
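
From the user's side nothing new is required beyond the standard OpenMP
allocator API; the pool size is the only knob.  A minimal example in the
style of the tests below, run with, say, GOMP_NVPTX_LOWLAT_POOL=16384 set
in the environment:

#include <omp.h>
#include <stdio.h>

#pragma omp requires dynamic_allocators

int
main (void)
{
  int ok = 0;
  #pragma omp target map(from:ok)
  {
    /* Low-latency allocation; on failure the predefined allocator falls
       back to default memory as described above.  */
    int *p = (int *) omp_alloc (16 * sizeof (int), omp_low_lat_mem_alloc);
    if (p)
      {
	for (int i = 0; i < 16; i++)
	  p[i] = i;
	ok = (p[15] == 15);
	omp_free (p, omp_low_lat_mem_alloc);
      }
  }
  printf ("%s\n", ok ? "PASS" : "FAIL");
  return 0;
}
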
diff --git a/libgomp/testsuite/libgomp.c/allocators-1.c b/libgomp/testsuite/libgomp.c/allocators-1.c
new file mode 100644
index 00000000000..04968e4c83d
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/allocators-1.c
@@ -0,0 +1,56 @@
+/* { dg-do run } */
+
+/* Test that omp_alloc returns usable memory.  */
+
+#include <omp.h>
+
+#pragma omp requires dynamic_allocators
+
+void
+test (int n, omp_allocator_handle_t allocator)
+{
+  #pragma omp target map(to:n) map(to:allocator)
+  {
+    int *a;
+    a = (int *) omp_alloc(n*sizeof(int), allocator);
+
+    #pragma omp parallel
+    for (int i = 0; i < n; i++)
+      a[i] = i;
+
+    for (int i = 0; i < n; i++)
+      if (a[i] != i)
+	{
+	  __builtin_printf ("data mismatch at %i\n", i);
+	  __builtin_abort ();
+	}
+
+    omp_free(a, allocator);
+  }
+}
+
+int
+main ()
+{
+  // Smaller than low-latency memory limit
+  test (10, omp_default_mem_alloc);
+  test (10, omp_large_cap_mem_alloc);
+  test (10, omp_const_mem_alloc);
+  test (10, omp_high_bw_mem_alloc);
+  test (10, omp_low_lat_mem_alloc);
+  test (10, omp_cgroup_mem_alloc);
+  test (10, omp_pteam_mem_alloc);
+  test (10, omp_thread_mem_alloc);
+
+  // Larger than low-latency memory limit
+  test (100000, omp_default_mem_alloc);
+  test (100000, omp_large_cap_mem_alloc);
+  test (100000, omp_const_mem_alloc);
+  test (100000, omp_high_bw_mem_alloc);
+  test (100000, omp_low_lat_mem_alloc);
+  test (100000, omp_cgroup_mem_alloc);
+  test (100000, omp_pteam_mem_alloc);
+  test (100000, omp_thread_mem_alloc);
+
+  return 0;
+}
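
(For scale: 100000 ints is roughly 400 KB, far beyond the default 8 KiB
.shared pool, so the larger tests above necessarily exercise the fall-back
path on the device.)
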
diff --git a/libgomp/testsuite/libgomp.c/allocators-2.c b/libgomp/testsuite/libgomp.c/allocators-2.c
new file mode 100644
index 00000000000..a98f1b4c05e
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/allocators-2.c
@@ -0,0 +1,64 @@
+/* { dg-do run } */
+
+/* Test concurrent and repeated allocations.  */
+
+#include <omp.h>
+
+#pragma omp requires dynamic_allocators
+
+void
+test (int n, omp_allocator_handle_t allocator)
+{
+  #pragma omp target map(to:n) map(to:allocator)
+  {
+    int **a;
+    a = (int **) omp_alloc(n*sizeof(int*), allocator);
+
+    #pragma omp parallel for
+    for (int i = 0; i < n; i++)
+      {
+	/* Use 10x to ensure we activate the low-latency fall-back.  */
+	a[i] = omp_alloc(sizeof(int)*10, allocator);
+	a[i][0] = i;
+      }
+
+    for (int i = 0; i < n; i++)
+      if (a[i][0] != i)
+	{
+	  __builtin_printf ("data mismatch at %i\n", i);
+	  __builtin_abort ();
+	}
+
+    #pragma omp parallel for
+    for (int i = 0; i < n; i++)
+      omp_free(a[i], allocator);
+
+    omp_free (a, allocator);
+  }
+}
+
+int
+main ()
+{
+  // Smaller than low-latency memory limit
+  test (10, omp_default_mem_alloc);
+  test (10, omp_large_cap_mem_alloc);
+  test (10, omp_const_mem_alloc);
+  test (10, omp_high_bw_mem_alloc);
+  test (10, omp_low_lat_mem_alloc);
+  test (10, omp_cgroup_mem_alloc);
+  test (10, omp_pteam_mem_alloc);
+  test (10, omp_thread_mem_alloc);
+
+  // Larger than low-latency memory limit (on aggregate)
+  test (1000, omp_default_mem_alloc);
+  test (1000, omp_large_cap_mem_alloc);
+  test (1000, omp_const_mem_alloc);
+  test (1000, omp_high_bw_mem_alloc);
+  test (1000, omp_low_lat_mem_alloc);
+  test (1000, omp_cgroup_mem_alloc);
+  test (1000, omp_pteam_mem_alloc);
+  test (1000, omp_thread_mem_alloc);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c/allocators-3.c b/libgomp/testsuite/libgomp.c/allocators-3.c
new file mode 100644
index 00000000000..45514c2a088
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/allocators-3.c
@@ -0,0 +1,42 @@
+/* { dg-do run } */
+
+/* Stress-test omp_alloc/omp_free under concurrency.  */
+
+#include <omp.h>
+#include <stdio.h>
+#include <stdlib.h>
+
+#pragma omp requires dynamic_allocators
+
+#define N 1000
+
+void
+test (omp_allocator_handle_t allocator)
+{
+  #pragma omp target map(to:allocator)
+  {
+    #pragma omp parallel for
+    for (int i = 0; i < N; i++)
+      for (int j = 0; j < N; j++)
+	{
+	  int *p = omp_alloc(sizeof(int), allocator);
+	  omp_free(p, allocator);
+	}
+  }
+}
+
+int
+main ()
+{
+  // Smaller than low-latency memory limit
+  test (omp_default_mem_alloc);
+  test (omp_large_cap_mem_alloc);
+  test (omp_const_mem_alloc);
+  test (omp_high_bw_mem_alloc);
+  test (omp_low_lat_mem_alloc);
+  test (omp_cgroup_mem_alloc);
+  test (omp_pteam_mem_alloc);
+  test (omp_thread_mem_alloc);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c/allocators-4.c b/libgomp/testsuite/libgomp.c/allocators-4.c
new file mode 100644
index 00000000000..9fa6aa1624f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/allocators-4.c
@@ -0,0 +1,196 @@
+/* { dg-do run } */
+
+/* Test that low-latency free chains are sound.  */
+
+#include <stddef.h>
+#include <omp.h>
+
+#pragma omp requires dynamic_allocators
+
+void
+check (int cond, const char *msg)
+{
+  if (!cond)
+    {
+      __builtin_printf ("%s\n", msg);
+      __builtin_abort ();
+    }
+}
+
+int
+main ()
+{
+  #pragma omp target
+  {
+    /* Ensure that the memory we get *is* low-latency with a null-fallback.  */
+    omp_alloctrait_t traits[1]
+      = { { omp_atk_fallback, omp_atv_null_fb } };
+    omp_allocator_handle_t lowlat = omp_init_allocator (omp_low_lat_mem_space,
+							1, traits);
+
+    int size = 4;
+
+    char *a = omp_alloc(size, lowlat);
+    char *b = omp_alloc(size, lowlat);
+    char *c = omp_alloc(size, lowlat);
+    char *d = omp_alloc(size, lowlat);
+
+    /* There are headers and padding to account for.  */
+    int size2 = size + (b-a);
+    int size3 = size + (c-a);
+    int size4 = size + (d-a) + 100; /* Arbitrary larger amount.  */
+
+    check (a != NULL && b != NULL && c != NULL && d != NULL,
+	   "omp_alloc returned NULL");
+
+    omp_free(a, lowlat);
+    char *p = omp_alloc (size, lowlat);
+    check (p == a, "allocate did not reuse first chunk");
+
+    omp_free(b, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not reuse second chunk");
+
+    omp_free(c, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not reuse third chunk");
+
+    omp_free(a, lowlat);
+    omp_free(b, lowlat);
+    p = omp_alloc (size2, lowlat);
+    check (p == a, "allocate did not coalesce first two chunks");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == a, "allocate did not split first chunk (1)");
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split first chunk (2)");
+
+    omp_free(b, lowlat);
+    omp_free(c, lowlat);
+    p = omp_alloc (size2, lowlat);
+    check (p == b, "allocate did not coalesce middle two chunks");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split second chunk (1)");
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not split second chunk (2)");
+
+    omp_free(b, lowlat);
+    omp_free(a, lowlat);
+    p = omp_alloc (size2, lowlat);
+    check (p == a, "allocate did not coalesce first two chunks, reverse free");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == a, "allocate did not split first chunk (1), reverse free");
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split first chunk (2), reverse free");
+
+    omp_free(c, lowlat);
+    omp_free(b, lowlat);
+    p = omp_alloc (size2, lowlat);
+    check (p == b, "allocate did not coalesce second two chunks, reverse free");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split second chunk (1), reverse free");
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not split second chunk (2), reverse free");
+
+    omp_free(a, lowlat);
+    omp_free(b, lowlat);
+    omp_free(c, lowlat);
+    p = omp_alloc (size3, lowlat);
+    check (p == a, "allocate did not coalesce first three chunks");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == a, "allocate did not split first chunk (1)");
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split first chunk (2)");
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not split first chunk (3)");
+
+    omp_free(b, lowlat);
+    omp_free(c, lowlat);
+    omp_free(d, lowlat);
+    p = omp_alloc (size3, lowlat);
+    check (p == b, "allocate did not coalesce last three chunks");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split second chunk (1)");
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not split second chunk (2)");
+    p = omp_alloc (size, lowlat);
+    check (p == d, "allocate did not split second chunk (3)");
+
+    omp_free(c, lowlat);
+    omp_free(b, lowlat);
+    omp_free(a, lowlat);
+    p = omp_alloc (size3, lowlat);
+    check (p == a, "allocate did not coalesce first three chunks, reverse free");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == a, "allocate did not split first chunk (1), reverse free");
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split first chunk (2), reverse free");
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not split first chunk (3), reverse free");
+
+    omp_free(d, lowlat);
+    omp_free(c, lowlat);
+    omp_free(b, lowlat);
+    p = omp_alloc (size3, lowlat);
+    check (p == b, "allocate did not coalesce second three chunks, reverse free");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split second chunk (1), reverse free");
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not split second chunk (2), reverse free");
+    p = omp_alloc (size, lowlat);
+    check (p == d, "allocate did not split second chunk (3), reverse free");
+
+    omp_free(c, lowlat);
+    omp_free(a, lowlat);
+    omp_free(b, lowlat);
+    p = omp_alloc (size3, lowlat);
+    check (p == a, "allocate did not coalesce first three chunks, mixed free");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == a, "allocate did not split first chunk (1), mixed free");
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split first chunk (2), mixed free");
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not split first chunk (3), mixed free");
+
+    omp_free(d, lowlat);
+    omp_free(b, lowlat);
+    omp_free(c, lowlat);
+    p = omp_alloc (size3, lowlat);
+    check (p == b, "allocate did not coalesce second three chunks, mixed free");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split second chunk (1), mixed free");
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not split second chunk (2), mixed free");
+    p = omp_alloc (size, lowlat);
+    check (p == d, "allocate did not split second chunk (3), mixed free");
+
+    omp_free(a, lowlat);
+    omp_free(b, lowlat);
+    omp_free(c, lowlat);
+    omp_free(d, lowlat);
+    p = omp_alloc(size4, lowlat);
+    check (p == a, "allocate did not coalesce all memory");
+  }
+
+  return 0;
+}
+
diff --git a/libgomp/testsuite/libgomp.c/allocators-5.c b/libgomp/testsuite/libgomp.c/allocators-5.c
new file mode 100644
index 00000000000..9694010cf1f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/allocators-5.c
@@ -0,0 +1,63 @@
+/* { dg-do run } */
+
+/* Test calloc with omp_alloc.  */
+
+#include <omp.h>
+
+#pragma omp requires dynamic_allocators
+
+void
+test (int n, omp_allocator_handle_t allocator)
+{
+  #pragma omp target map(to:n) map(to:allocator)
+  {
+    int *a;
+    a = (int *) omp_calloc(n, sizeof(int), allocator);
+
+    for (int i = 0; i < n; i++)
+      if (a[i] != 0)
+	{
+	  __builtin_printf ("memory not zeroed at %i\n", i);
+	  __builtin_abort ();
+	}
+
+    #pragma omp parallel
+    for (int i = 0; i < n; i++)
+      a[i] = i;
+
+    for (int i = 0; i < n; i++)
+      if (a[i] != i)
+	{
+	  __builtin_printf ("data mismatch at %i\n", i);
+	  __builtin_abort ();
+	}
+
+    omp_free(a, allocator);
+  }
+}
+
+int
+main ()
+{
+  // Smaller than low-latency memory limit
+  test (10, omp_default_mem_alloc);
+  test (10, omp_large_cap_mem_alloc);
+  test (10, omp_const_mem_alloc);
+  test (10, omp_high_bw_mem_alloc);
+  test (10, omp_low_lat_mem_alloc);
+  test (10, omp_cgroup_mem_alloc);
+  test (10, omp_pteam_mem_alloc);
+  test (10, omp_thread_mem_alloc);
+
+  // Larger than low-latency memory limit
+  test (100000, omp_default_mem_alloc);
+  test (100000, omp_large_cap_mem_alloc);
+  test (100000, omp_const_mem_alloc);
+  test (100000, omp_high_bw_mem_alloc);
+  test (100000, omp_low_lat_mem_alloc);
+  test (100000, omp_cgroup_mem_alloc);
+  test (100000, omp_pteam_mem_alloc);
+  test (100000, omp_thread_mem_alloc);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c/allocators-6.c b/libgomp/testsuite/libgomp.c/allocators-6.c
new file mode 100644
index 00000000000..90bf73095ef
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/allocators-6.c
@@ -0,0 +1,117 @@
+/* { dg-do run } */
+
+/* Test that low-latency realloc and free chains are sound.  */
+
+#include <stddef.h>
+#include <omp.h>
+
+#pragma omp requires dynamic_allocators
+
+void
+check (int cond, const char *msg)
+{
+  if (!cond)
+    {
+      __builtin_printf ("%s\n", msg);
+      __builtin_abort ();
+    }
+}
+
+int
+main ()
+{
+  #pragma omp target
+  {
+    /* Ensure that the memory we get *is* low-latency with a null-fallback.  */
+    omp_alloctrait_t traits[1]
+      = { { omp_atk_fallback, omp_atv_null_fb } };
+    omp_allocator_handle_t lowlat = omp_init_allocator (omp_low_lat_mem_space,
+							1, traits);
+
+    int size = 16;
+
+    char *a = (char *)omp_alloc(size, lowlat);
+    char *b = (char *)omp_alloc(size, lowlat);
+    char *c = (char *)omp_alloc(size, lowlat);
+    char *d = (char *)omp_alloc(size, lowlat);
+
+    /* There are headers and padding to account for.  */
+    int size2 = size + (b-a);
+    int size3 = size + (c-a);
+    int size4 = size + (d-a) + 100; /* Arbitrary larger amount.  */
+
+    check (a != NULL && b != NULL && c != NULL && d != NULL,
+	   "omp_alloc returned NULL");
+
+    char *p = omp_realloc (b, size, lowlat, lowlat);
+    check (p == b, "realloc did not reuse same size chunk, no space after");
+
+    p = omp_realloc (b, size-8, lowlat, lowlat);
+    check (p == b, "realloc did not reuse smaller chunk, no space after");
+
+    p = omp_realloc (b, size, lowlat, lowlat);
+    check (p == b, "realloc did not reuse original size chunk, no space after");
+
+    /* Make space after b.  */
+    omp_free(c, lowlat);
+
+    p = omp_realloc (b, size, lowlat, lowlat);
+    check (p == b, "realloc did not reuse same size chunk");
+
+    p = omp_realloc (b, size-8, lowlat, lowlat);
+    check (p == b, "realloc did not reuse smaller chunk");
+
+    p = omp_realloc (b, size, lowlat, lowlat);
+    check (p == b, "realloc did not reuse original size chunk");
+
+    p = omp_realloc (b, size+8, lowlat, lowlat);
+    check (p == b, "realloc did not extend in place by a little");
+
+    p = omp_realloc (b, size2, lowlat, lowlat);
+    check (p == b, "realloc did not extend into whole next chunk");
+
+    p = omp_realloc (b, size3, lowlat, lowlat);
+    check (p != b, "realloc did not move b elsewhere");
+    omp_free (p, lowlat);
+
+
+    p = omp_realloc (a, size, lowlat, lowlat);
+    check (p == a, "realloc did not reuse same size chunk, first position");
+
+    p = omp_realloc (a, size-8, lowlat, lowlat);
+    check (p == a, "realloc did not reuse smaller chunk, first position");
+
+    p = omp_realloc (a, size, lowlat, lowlat);
+    check (p == a, "realloc did not reuse original size chunk, first position");
+
+    p = omp_realloc (a, size+8, lowlat, lowlat);
+    check (p == a, "realloc did not extend in place by a little, first position");
+
+    p = omp_realloc (a, size3, lowlat, lowlat);
+    check (p == a, "realloc did not extend into whole next chunk, first position");
+
+    p = omp_realloc (a, size4, lowlat, lowlat);
+    check (p != a, "realloc did not move a elsewhere, first position");
+    omp_free (p, lowlat);
+
+
+    p = omp_realloc (d, size, lowlat, lowlat);
+    check (p == d, "realloc did not reuse same size chunk, last position");
+
+    p = omp_realloc (d, size-8, lowlat, lowlat);
+    check (p == d, "realloc did not reuse smaller chunk, last position");
+
+    p = omp_realloc (d, size, lowlat, lowlat);
+    check (p == d, "realloc did not reuse original size chunk, last position");
+
+    p = omp_realloc (d, size+8, lowlat, lowlat);
+    check (p == d, "realloc did not extend in place by a little, last position");
+
+    /* Larger than low latency memory.  */
+    p = omp_realloc(d, 100000000, lowlat, lowlat);
+    check (p == NULL, "realloc did not fail on OOM");
+  }
+
+  return 0;
+}
+
  

Patch

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index ff44d9fdbef..9bc26d7de0c 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -5409,7 +5409,7 @@  nvptx_file_start (void)
   else if (TARGET_PTX_6_3)
     fputs ("\t.version\t6.3\n", asm_out_file);
   else
-    fputs ("\t.version\t3.1\n", asm_out_file);
+    fputs ("\t.version\t4.1\n", asm_out_file);
   if (TARGET_SM80)
     fputs ("\t.target\tsm_80\n", asm_out_file);
   else if (TARGET_SM75)
diff --git a/libgomp/allocator.c b/libgomp/allocator.c
index deebb6a79fa..b14f991c148 100644
--- a/libgomp/allocator.c
+++ b/libgomp/allocator.c
@@ -34,6 +34,38 @@ 
 
 #define omp_max_predefined_alloc omp_thread_mem_alloc
 
+/* These macros may be overridden in config/<target>/allocator.c.  */
+#ifndef MEMSPACE_ALLOC
+#define MEMSPACE_ALLOC(MEMSPACE, SIZE) \
+  ((void)MEMSPACE, malloc (SIZE))
+#endif
+#ifndef MEMSPACE_CALLOC
+#define MEMSPACE_CALLOC(MEMSPACE, SIZE) \
+  ((void)MEMSPACE, malloc (SIZE))
+#endif
+#ifndef MEMSPACE_REALLOC
+#define MEMSPACE_REALLOC(MEMSPACE, ADDR, OLDSIZE, SIZE) \
+  ((void)MEMSPACE, (void)OLDSIZE, realloc (ADDR, SIZE))
+#endif
+#ifndef MEMSPACE_FREE
+#define MEMSPACE_FREE(MEMSPACE, ADDR, SIZE) \
+  ((void)MEMSPACE, (void)SIZE, free (ADDR))
+#endif
+
+/* Map the predefined allocators to the correct memory space.
+   The index to this table is the omp_allocator_handle_t enum value.  */
+static const omp_memspace_handle_t predefined_alloc_mapping[] = {
+  omp_default_mem_space,   /* omp_null_allocator. */
+  omp_default_mem_space,   /* omp_default_mem_alloc. */
+  omp_large_cap_mem_space, /* omp_large_cap_mem_alloc. */
+  omp_default_mem_space,   /* omp_const_mem_alloc. */
+  omp_high_bw_mem_space,   /* omp_high_bw_mem_alloc. */
+  omp_low_lat_mem_space,   /* omp_low_lat_mem_alloc. */
+  omp_low_lat_mem_space,   /* omp_cgroup_mem_alloc. */
+  omp_low_lat_mem_space,   /* omp_pteam_mem_alloc. */
+  omp_low_lat_mem_space,   /* omp_thread_mem_alloc. */
+};
+
 struct omp_allocator_data
 {
   omp_memspace_handle_t memspace;
@@ -281,7 +313,7 @@  retry:
       allocator_data->used_pool_size = used_pool_size;
       gomp_mutex_unlock (&allocator_data->lock);
 #endif
-      ptr = malloc (new_size);
+      ptr = MEMSPACE_ALLOC (allocator_data->memspace, new_size);
       if (ptr == NULL)
 	{
 #ifdef HAVE_SYNC_BUILTINS
@@ -297,7 +329,10 @@  retry:
     }
   else
     {
-      ptr = malloc (new_size);
+      omp_memspace_handle_t memspace = (allocator_data
+					? allocator_data->memspace
+					: predefined_alloc_mapping[allocator]);
+      ptr = MEMSPACE_ALLOC (memspace, new_size);
       if (ptr == NULL)
 	goto fail;
     }
@@ -315,32 +350,35 @@  retry:
   return ret;
 
 fail:
-  if (allocator_data)
+  int fallback = (allocator_data
+		  ? allocator_data->fallback
+		  : allocator == omp_default_mem_alloc
+		  ? omp_atv_null_fb
+		  : omp_atv_default_mem_fb);
+  switch (fallback)
     {
-      switch (allocator_data->fallback)
+    case omp_atv_default_mem_fb:
+      if ((new_alignment > sizeof (void *) && new_alignment > alignment)
+	  || (allocator_data
+	      && allocator_data->pool_size < ~(uintptr_t) 0)
+	  || !allocator_data)
 	{
-	case omp_atv_default_mem_fb:
-	  if ((new_alignment > sizeof (void *) && new_alignment > alignment)
-	      || (allocator_data
-		  && allocator_data->pool_size < ~(uintptr_t) 0))
-	    {
-	      allocator = omp_default_mem_alloc;
-	      goto retry;
-	    }
-	  /* Otherwise, we've already performed default mem allocation
-	     and if that failed, it won't succeed again (unless it was
-	     intermittent.  Return NULL then, as that is the fallback.  */
-	  break;
-	case omp_atv_null_fb:
-	  break;
-	default:
-	case omp_atv_abort_fb:
-	  gomp_fatal ("Out of memory allocating %lu bytes",
-		      (unsigned long) size);
-	case omp_atv_allocator_fb:
-	  allocator = allocator_data->fb_data;
+	  allocator = omp_default_mem_alloc;
 	  goto retry;
 	}
+      /* Otherwise, we've already performed default mem allocation
+	 and if that failed, it won't succeed again (unless it was
+	 intermittent).  Return NULL then, as that is the fallback.  */
+      break;
+    case omp_atv_null_fb:
+      break;
+    default:
+    case omp_atv_abort_fb:
+      gomp_fatal ("Out of memory allocating %lu bytes",
+		  (unsigned long) size);
+    case omp_atv_allocator_fb:
+      allocator = allocator_data->fb_data;
+      goto retry;
     }
   return NULL;
 }
@@ -373,6 +411,7 @@  void
 omp_free (void *ptr, omp_allocator_handle_t allocator)
 {
   struct omp_mem_header *data;
+  omp_memspace_handle_t memspace = omp_default_mem_space;
 
   if (ptr == NULL)
     return;
@@ -393,8 +432,13 @@  omp_free (void *ptr, omp_allocator_handle_t allocator)
 	  gomp_mutex_unlock (&allocator_data->lock);
 #endif
 	}
+
+      memspace = allocator_data->memspace;
     }
-  free (data->ptr);
+  else
+    memspace = predefined_alloc_mapping[data->allocator];
+
+  MEMSPACE_FREE (memspace, data->ptr, data->size);
 }
 
 ialias (omp_free)
@@ -482,7 +526,7 @@  retry:
       allocator_data->used_pool_size = used_pool_size;
       gomp_mutex_unlock (&allocator_data->lock);
 #endif
-      ptr = calloc (1, new_size);
+      ptr = MEMSPACE_CALLOC (allocator_data->memspace, new_size);
       if (ptr == NULL)
 	{
 #ifdef HAVE_SYNC_BUILTINS
@@ -498,7 +542,10 @@  retry:
     }
   else
     {
-      ptr = calloc (1, new_size);
+      omp_memspace_handle_t memspace = (allocator_data
+					? allocator_data->memspace
+					: predefined_alloc_mapping[allocator]);
+      ptr = MEMSPACE_CALLOC (memspace, new_size);
       if (ptr == NULL)
 	goto fail;
     }
@@ -516,32 +563,35 @@  retry:
   return ret;
 
 fail:
-  if (allocator_data)
+  int fallback = (allocator_data
+		  ? allocator_data->fallback
+		  : allocator == omp_default_mem_alloc
+		  ? omp_atv_null_fb
+		  : omp_atv_default_mem_fb);
+  switch (fallback)
     {
-      switch (allocator_data->fallback)
+    case omp_atv_default_mem_fb:
+      if ((new_alignment > sizeof (void *) && new_alignment > alignment)
+	  || (allocator_data
+	      && allocator_data->pool_size < ~(uintptr_t) 0)
+	  || !allocator_data)
 	{
-	case omp_atv_default_mem_fb:
-	  if ((new_alignment > sizeof (void *) && new_alignment > alignment)
-	      || (allocator_data
-		  && allocator_data->pool_size < ~(uintptr_t) 0))
-	    {
-	      allocator = omp_default_mem_alloc;
-	      goto retry;
-	    }
-	  /* Otherwise, we've already performed default mem allocation
-	     and if that failed, it won't succeed again (unless it was
-	     intermittent.  Return NULL then, as that is the fallback.  */
-	  break;
-	case omp_atv_null_fb:
-	  break;
-	default:
-	case omp_atv_abort_fb:
-	  gomp_fatal ("Out of memory allocating %lu bytes",
-		      (unsigned long) (size * nmemb));
-	case omp_atv_allocator_fb:
-	  allocator = allocator_data->fb_data;
+	  allocator = omp_default_mem_alloc;
 	  goto retry;
 	}
+      /* Otherwise, we've already performed default mem allocation
+	 and if that failed, it won't succeed again (unless it was
+	 intermittent).  Return NULL then, as that is the fallback.  */
+      break;
+    case omp_atv_null_fb:
+      break;
+    default:
+    case omp_atv_abort_fb:
+      gomp_fatal ("Out of memory allocating %lu bytes",
+		  (unsigned long) (size * nmemb));
+    case omp_atv_allocator_fb:
+      allocator = allocator_data->fb_data;
+      goto retry;
     }
   return NULL;
 }
@@ -660,7 +710,8 @@  retry:
       gomp_mutex_unlock (&allocator_data->lock);
 #endif
       if (prev_size)
-	new_ptr = realloc (data->ptr, new_size);
+	new_ptr = MEMSPACE_REALLOC (allocator_data->memspace, data->ptr,
+				    data->size, new_size);
       else
 	new_ptr = malloc (new_size);
       if (new_ptr == NULL)
@@ -690,7 +741,10 @@  retry:
 	   && (free_allocator_data == NULL
 	       || free_allocator_data->pool_size == ~(uintptr_t) 0))
     {
-      new_ptr = realloc (data->ptr, new_size);
+      omp_memspace_handle_t memspace = (allocator_data
+					? allocator_data->memspace
+					: predefined_alloc_mapping[allocator]);
+      new_ptr = MEMSPACE_REALLOC (memspace, data->ptr, data->size, new_size);
       if (new_ptr == NULL)
 	goto fail;
       ret = (char *) new_ptr + sizeof (struct omp_mem_header);
@@ -735,32 +789,35 @@  retry:
   return ret;
 
 fail:
-  if (allocator_data)
+  int fallback = (allocator_data
+		  ? allocator_data->fallback
+		  : allocator == omp_default_mem_alloc
+		  ? omp_atv_null_fb
+		  : omp_atv_default_mem_fb);
+  switch (fallback)
     {
-      switch (allocator_data->fallback)
+    case omp_atv_default_mem_fb:
+      if (new_alignment > sizeof (void *)
+	  || (allocator_data
+	      && allocator_data->pool_size < ~(uintptr_t) 0)
+	  || !allocator_data)
 	{
-	case omp_atv_default_mem_fb:
-	  if (new_alignment > sizeof (void *)
-	      || (allocator_data
-		  && allocator_data->pool_size < ~(uintptr_t) 0))
-	    {
-	      allocator = omp_default_mem_alloc;
-	      goto retry;
-	    }
-	  /* Otherwise, we've already performed default mem allocation
-	     and if that failed, it won't succeed again (unless it was
-	     intermittent.  Return NULL then, as that is the fallback.  */
-	  break;
-	case omp_atv_null_fb:
-	  break;
-	default:
-	case omp_atv_abort_fb:
-	  gomp_fatal ("Out of memory allocating %lu bytes",
-		      (unsigned long) size);
-	case omp_atv_allocator_fb:
-	  allocator = allocator_data->fb_data;
+	  allocator = omp_default_mem_alloc;
 	  goto retry;
 	}
+      /* Otherwise, we've already performed default mem allocation
+	 and if that failed, it won't succeed again (unless it was
+	 intermittent).  Return NULL then, as that is the fallback.  */
+      break;
+    case omp_atv_null_fb:
+      break;
+    default:
+    case omp_atv_abort_fb:
+      gomp_fatal ("Out of memory allocating %lu bytes",
+		  (unsigned long) size);
+    case omp_atv_allocator_fb:
+      allocator = allocator_data->fb_data;
+      goto retry;
     }
   return NULL;
 }
diff --git a/libgomp/config/nvptx/allocator.c b/libgomp/config/nvptx/allocator.c
new file mode 100644
index 00000000000..6bc2ea48043
--- /dev/null
+++ b/libgomp/config/nvptx/allocator.c
@@ -0,0 +1,370 @@ 
+/* Copyright (C) 2021 Free Software Foundation, Inc.
+
+   This file is part of the GNU Offloading and Multi Processing Library
+   (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* The low-latency allocators use space reserved in .shared memory when the
+   kernel is launched.  The heap is initialized in gomp_nvptx_main and all
+   allocations are forgotten when the kernel exits.  Allocations to other
+   memory spaces all use the system malloc syscall.
+
+   The root heap descriptor is stored elsewhere in shared memory, and each
+   free chunk contains a similar descriptor for the next free chunk in the
+   chain.
+
+   The descriptor is two 16-bit values: offset and size, which describe the
+   location of a chunk of memory available for allocation. The offset is
+   relative to the base of the heap.  The special value 0xffff, 0xffff
+   indicates that the heap is locked.  The descriptor is encoded into a
+   single 32-bit integer so that it may be easily accessed atomically.
+
+   Memory is allocated to the first free chunk that fits.  The free chain
+   is always stored in order of the offset to assist coalescing adjacent
+   chunks.  */
+
+#include "libgomp.h"
+#include <stdlib.h>
+
+/* There should be some .shared space reserved for us.  There's no way to
+   express this magic extern sizeless array in C so use asm.  */
+asm (".extern .shared .u8 __nvptx_lowlat_pool[];\n");
+
+extern uint32_t __nvptx_lowlat_heap_root __attribute__((shared,nocommon));
+
+typedef union {
+  uint32_t raw;
+  struct {
+    uint16_t offset;
+    uint16_t size;
+  } desc;
+} heapdesc;
+
+static void *
+nvptx_memspace_alloc (omp_memspace_handle_t memspace, size_t size)
+{
+  if (memspace == omp_low_lat_mem_space)
+    {
+      char *shared_pool;
+      asm ("cvta.shared.u64\t%0, __nvptx_lowlat_pool;" : "=r"(shared_pool));
+
+      /* Memory is allocated in 8-byte granularity.  */
+      size = (size + 7) & ~7;
+
+      /* Acquire a lock on the low-latency heap.  */
+      heapdesc root;
+      do
+	{
+	  root.raw = __atomic_exchange_n (&__nvptx_lowlat_heap_root,
+					  0xffffffff, MEMMODEL_ACQUIRE);
+	  if (root.raw != 0xffffffff)
+	    break;
+	  /* Spin.  */
+	}
+      while (1);
+
+      /* Walk the free chain.  */
+      heapdesc chunk = {root.raw};
+      uint32_t *prev_chunkptr = NULL;
+      uint32_t *chunkptr = (uint32_t*)(shared_pool + chunk.desc.offset);
+      heapdesc onward_chain = {chunkptr[0]};
+      while (chunk.desc.size != 0 && (uint32_t)size > chunk.desc.size)
+	{
+	  chunk.raw = onward_chain.raw;
+	  prev_chunkptr = chunkptr;
+	  chunkptr = (uint32_t*)(shared_pool + chunk.desc.offset);
+	  onward_chain.raw = chunkptr[0];
+	}
+
+      void *result = NULL;
+      if (chunk.desc.size != 0)
+	{
+	  /* Allocation successful.  */
+	  result = chunkptr;
+
+	  /* Update the free chain.  */
+	  heapdesc stillfree = {chunk.raw};
+	  stillfree.desc.offset += size;
+	  stillfree.desc.size -= size;
+	  uint32_t *stillfreeptr = (uint32_t*)(shared_pool
+					       + stillfree.desc.offset);
+
+	  if (stillfree.desc.size == 0)
+	    /* The whole chunk was used.  */
+	    stillfree.raw = onward_chain.raw;
+	  else
+	    /* The chunk was split, so restore the onward chain.  */
+	    stillfreeptr[0] = onward_chain.raw;
+
+	  /* The previous free slot or root now points to stillfree.  */
+	  if (prev_chunkptr)
+	    prev_chunkptr[0] = stillfree.raw;
+	  else
+	    root.raw = stillfree.raw;
+	}
+
+      /* Update the free chain root and release the lock.  */
+      __atomic_store_n (&__nvptx_lowlat_heap_root, root.raw, MEMMODEL_RELEASE);
+      return result;
+    }
+  else
+    return malloc (size);
+}
+
+static void *
+nvptx_memspace_calloc (omp_memspace_handle_t memspace, size_t size)
+{
+  if (memspace == omp_low_lat_mem_space)
+    {
+      /* Memory is allocated in 8-byte granularity.  */
+      size = (size + 7) & ~7;
+
+      uint64_t *result = nvptx_memspace_alloc (memspace, size);
+      if (result)
+	/* Inline memset in which we know size is a multiple of 8.  */
+	for (unsigned i = 0; i < (unsigned)size/8; i++)
+	  result[i] = 0;
+
+      return result;
+    }
+  else
+    return calloc (1, size);
+}
+
+static void
+nvptx_memspace_free (omp_memspace_handle_t memspace, void *addr, size_t size)
+{
+  if (memspace == omp_low_lat_mem_space)
+    {
+      char *shared_pool;
+      asm ("cvta.shared.u64\t%0, __nvptx_lowlat_pool;" : "=r"(shared_pool));
+
+      /* Memory is allocated in 8-byte granularity.  */
+      size = (size + 7) & ~7;
+
+      /* Acquire a lock on the low-latency heap.  */
+      heapdesc root;
+      do
+	{
+	  root.raw = __atomic_exchange_n (&__nvptx_lowlat_heap_root,
+					  0xffffffff, MEMMODEL_ACQUIRE);
+	  if (root.raw != 0xffffffff)
+	    break;
+	  /* Spin.  */
+	}
+      while (1);
+
+      /* Walk the free chain to find where to insert a new entry.  */
+      heapdesc chunk = {root.raw}, prev_chunk;
+      uint32_t *prev_chunkptr = NULL, *prevprev_chunkptr = NULL;
+      uint32_t *chunkptr = (uint32_t*)(shared_pool + chunk.desc.offset);
+      heapdesc onward_chain = {chunkptr[0]};
+      while (chunk.desc.size != 0 && addr > (void*)chunkptr)
+	{
+	  prev_chunk.raw = chunk.raw;
+	  chunk.raw = onward_chain.raw;
+	  prevprev_chunkptr = prev_chunkptr;
+	  prev_chunkptr = chunkptr;
+	  chunkptr = (uint32_t*)(shared_pool + chunk.desc.offset);
+	  onward_chain.raw = chunkptr[0];
+	}
+
+      /* Create the new chunk descriptor.  */
+      heapdesc newfreechunk;
+      newfreechunk.desc.offset = (uint16_t)((uintptr_t)addr
+					    - (uintptr_t)shared_pool);
+      newfreechunk.desc.size = (uint16_t)size;
+
+      /* Coalesce adjacent free chunks.  */
+      if (newfreechunk.desc.offset + size == chunk.desc.offset)
+	{
+	  /* Free chunk follows.  */
+	  newfreechunk.desc.size += chunk.desc.size;
+	  chunk.raw = onward_chain.raw;
+	}
+      if (prev_chunkptr)
+	{
+	  if (prev_chunk.desc.offset + prev_chunk.desc.size
+	      == newfreechunk.desc.offset)
+	    {
+	      /* Free chunk precedes.  */
+	      newfreechunk.desc.offset = prev_chunk.desc.offset;
+	      newfreechunk.desc.size += prev_chunk.desc.size;
+	      addr = shared_pool + prev_chunk.desc.offset;
+	      prev_chunkptr = prevprev_chunkptr;
+	    }
+	}
+
+      /* Update the free chain in the new and previous chunks.  */
+      ((uint32_t*)addr)[0] = chunk.raw;
+      if (prev_chunkptr)
+	prev_chunkptr[0] = newfreechunk.raw;
+      else
+	root.raw = newfreechunk.raw;
+
+      /* Update the free chain root and release the lock.  */
+      __atomic_store_n (&__nvptx_lowlat_heap_root, root.raw, MEMMODEL_RELEASE);
+    }
+  else
+    free (addr);
+}
+
+static void *
+nvptx_memspace_realloc (omp_memspace_handle_t memspace, void *addr,
+			size_t oldsize, size_t size)
+{
+  if (memspace == omp_low_lat_mem_space)
+    {
+      char *shared_pool;
+      asm ("cvta.shared.u64\t%0, __nvptx_lowlat_pool;" : "=r"(shared_pool));
+
+      /* Memory is allocated in 8-byte granularity.  */
+      oldsize = (oldsize + 7) & ~7;
+      size = (size + 7) & ~7;
+
+      if (oldsize == size)
+	return addr;
+
+      /* Acquire a lock on the low-latency heap.  */
+      heapdesc root;
+      do
+	{
+	  root.raw = __atomic_exchange_n (&__nvptx_lowlat_heap_root,
+					  0xffffffff, MEMMODEL_ACQUIRE);
+	  if (root.raw != 0xffffffff)
+	    break;
+	  /* Spin.  */
+	}
+      while (1);
+
+      /* Walk the free chain.  */
+      heapdesc chunk = {root.raw};
+      uint32_t *prev_chunkptr = NULL;
+      uint32_t *chunkptr = (uint32_t*)(shared_pool + chunk.desc.offset);
+      heapdesc onward_chain = {chunkptr[0]};
+      while (chunk.desc.size != 0 && (void*)chunkptr < addr)
+	{
+	  chunk.raw = onward_chain.raw;
+	  prev_chunkptr = chunkptr;
+	  chunkptr = (uint32_t*)(shared_pool + chunk.desc.offset);
+	  onward_chain.raw = chunkptr[0];
+	}
+
+      void *result = NULL;
+      if (size < oldsize)
+	{
+	  /* The new allocation is smaller than the old; we can always
+	     shrink an allocation in place.  */
+	  result = addr;
+
+	  uint32_t *nowfreeptr = (uint32_t*)(addr + size);
+
+	  /* Update the free chain.  */
+	  heapdesc nowfree;
+	  nowfree.desc.offset = (char*)nowfreeptr - shared_pool;
+	  nowfree.desc.size = oldsize - size;
+
+	  if (nowfree.desc.offset + nowfree.desc.size == chunk.desc.offset)
+	    {
+	      /* Coalesce following free chunk.  */
+	      nowfree.desc.size += chunk.desc.size;
+	      nowfreeptr[0] = onward_chain.raw;
+	    }
+	  else
+	    nowfreeptr[0] = chunk.raw;
+
+	  /* The previous free slot or root now points to nowfree.  */
+	  if (prev_chunkptr)
+	    prev_chunkptr[0] = nowfree.raw;
+	  else
+	    root.raw = nowfree.raw;
+	}
+      else if (chunk.desc.size != 0
+	       && (char *)addr + oldsize == (char *)chunkptr
+	       && chunk.desc.size >= size-oldsize)
+	{
+	  /* The new allocation is larger than the old, and we found a
+	     large enough free block right after the existing block,
+	     so we extend into that space.  */
+	  result = addr;
+
+	  uint16_t delta = size-oldsize;
+
+	  /* Update the free chain.  */
+	  heapdesc stillfree = {chunk.raw};
+	  stillfree.desc.offset += delta;
+	  stillfree.desc.size -= delta;
+	  uint32_t *stillfreeptr = (uint32_t*)(shared_pool
+					       + stillfree.desc.offset);
+
+	  if (stillfree.desc.size == 0)
+	    /* The whole chunk was used.  */
+	    stillfree.raw = onward_chain.raw;
+	  else
+	    /* The chunk was split, so restore the onward chain.  */
+	    stillfreeptr[0] = onward_chain.raw;
+
+	  /* The previous free slot or root now points to stillfree.  */
+	  if (prev_chunkptr)
+	    prev_chunkptr[0] = stillfree.raw;
+	  else
+	    root.raw = stillfree.raw;
+	}
+      /* Else realloc in-place has failed and result remains NULL.  */
+
+      /* Update the free chain root and release the lock.  */
+      __atomic_store_n (&__nvptx_lowlat_heap_root, root.raw, MEMMODEL_RELEASE);
+
+      if (result == NULL)
+	{
+	  /* The allocation could not be extended in place, so we simply
+	     allocate fresh memory and move the data.  If we can't allocate
+	     from low-latency memory, then we leave the original allocation
+	     intact and return NULL.
+	     We could do a fall-back to main memory, but we don't know what
+	     the fall-back trait said to do.  */
+	  result = nvptx_memspace_alloc (memspace, size);
+	  if (result != NULL)
+	    {
+	      /* Inline memcpy in which we know oldsize is a multiple of 8.  */
+	      uint64_t *from = addr, *to = result;
+	      for (unsigned i = 0; i < (unsigned)oldsize/8; i++)
+		to[i] = from[i];
+
+	      nvptx_memspace_free (memspace, addr, oldsize);
+	    }
+	}
+      return result;
+    }
+  else
+    return realloc (addr, size);
+}
+
+#define MEMSPACE_ALLOC(MEMSPACE, SIZE) \
+  nvptx_memspace_alloc (MEMSPACE, SIZE)
+#define MEMSPACE_CALLOC(MEMSPACE, SIZE) \
+  nvptx_memspace_calloc (MEMSPACE, SIZE)
+#define MEMSPACE_REALLOC(MEMSPACE, ADDR, OLDSIZE, SIZE) \
+  nvptx_memspace_realloc (MEMSPACE, ADDR, OLDSIZE, SIZE)
+#define MEMSPACE_FREE(MEMSPACE, ADDR, SIZE) \
+  nvptx_memspace_free (MEMSPACE, ADDR, SIZE)
+
+#include "../../allocator.c"
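
For review purposes, note that the heap root doubles as the heap lock: a
thread claims the whole free chain by atomically swapping the sentinel
0xffffffff into __nvptx_lowlat_heap_root, and releases it by storing the
updated root descriptor back.  Here is a minimal host-side model of that
protocol, using C11 atomics in place of the __atomic built-ins; the
*_demo names are invented for this sketch and it is not part of the patch:

  #include <stdatomic.h>
  #include <stdint.h>

  static _Atomic uint32_t heap_root_demo;

  static uint32_t
  heap_lock_demo (void)
  {
    uint32_t root;
    /* Swap in the sentinel; whoever reads back a non-sentinel value owns
       the free chain until it stores a real descriptor back.  */
    while ((root = atomic_exchange_explicit (&heap_root_demo, 0xffffffffu,
                                             memory_order_acquire))
           == 0xffffffffu)
      ;  /* Spin.  */
    return root;
  }

  static void
  heap_unlock_demo (uint32_t root)
  {
    atomic_store_explicit (&heap_root_demo, root, memory_order_release);
  }

  int
  main (void)
  {
    uint32_t root = heap_lock_demo ();
    /* ... walk or update the free chain while holding the lock ... */
    heap_unlock_demo (root);
    return 0;
  }
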
diff --git a/libgomp/config/nvptx/team.c b/libgomp/config/nvptx/team.c
index 310eb283293..637584a70a0 100644
--- a/libgomp/config/nvptx/team.c
+++ b/libgomp/config/nvptx/team.c
@@ -33,9 +33,13 @@ 
 
 struct gomp_thread *nvptx_thrs __attribute__((shared,nocommon));
 int __gomp_team_num __attribute__((shared,nocommon));
+uint32_t __nvptx_lowlat_heap_root __attribute__((shared,nocommon));
 
 static void gomp_thread_start (struct gomp_thread_pool *);
 
+/* There should be some .shared space reserved for us.  There's no way to
+   express this magic extern sizeless array in C, so use asm.  */
+asm (".extern .shared .u8 __nvptx_lowlat_pool[];\n");
 
 /* This externally visible function handles target region entry.  It
    sets up a per-team thread pool and transfers control by calling FN (FN_DATA)
@@ -63,6 +67,27 @@  gomp_nvptx_main (void (*fn) (void *), void *fn_data)
       nvptx_thrs = alloca (ntids * sizeof (*nvptx_thrs));
       memset (nvptx_thrs, 0, ntids * sizeof (*nvptx_thrs));
 
+      /* Find the low-latency heap details ...  */
+      uint32_t *shared_pool;
+      uint32_t shared_pool_size;
+      asm ("cvta.shared.u64\t%0, __nvptx_lowlat_pool;" : "=r"(shared_pool));
+      asm ("mov.u32\t%0, %%dynamic_smem_size;\n"
+	   : "=r"(shared_pool_size));
+
+      /* ... and initialize it with an empty free-chain.  */
+      union {
+	uint32_t raw;
+	struct {
+	  uint16_t offset;
+	  uint16_t size;
+	} desc;
+      } root;
+      root.desc.offset = 0;		 /* Free space starts at byte 0.  */
+      root.desc.size = shared_pool_size; /* The whole space is free.  */
+      __nvptx_lowlat_heap_root = root.raw;
+      shared_pool[0] = 0;		 /* Terminate free chain.  */
+
+      /* Initialize the thread pool.  */
       struct gomp_thread_pool *pool = alloca (sizeof (*pool));
       pool->threads = alloca (ntids * sizeof (*pool->threads));
       for (tid = 0; tid < ntids; tid++)
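
The descriptor packing itself is straightforward: the root holds the
offset and size of the first free chunk, and the first 32-bit word of
each free chunk holds the descriptor of the next one, with size == 0
terminating the chain.  A self-contained host-side model of the encoding
and of the first-fit walk the allocator performs (again, the *_demo names
are invented; this is illustration, not the patch):

  #include <stdint.h>
  #include <stdio.h>

  typedef union
  {
    uint32_t raw;
    struct
    {
      uint16_t offset;  /* Byte offset of a free chunk within the pool.  */
      uint16_t size;    /* Size of that free chunk, in bytes.  */
    } desc;
  } heapdesc_demo;

  int
  main (void)
  {
    uint32_t pool_words[256];          /* 1 KiB stand-in for the pool.  */
    char *pool = (char *) pool_words;

    /* Empty heap: one free chunk spanning the whole pool; its first word
       is the chain terminator (a descriptor with size == 0).  */
    heapdesc_demo root;
    root.desc.offset = 0;
    root.desc.size = sizeof pool_words;
    pool_words[0] = 0;

    /* First fit: follow the chain until a big-enough chunk appears.  */
    uint32_t want = 64;
    heapdesc_demo chunk = { root.raw };
    while (chunk.desc.size != 0 && chunk.desc.size < want)
      chunk.raw = *(uint32_t *) (pool + chunk.desc.offset);

    if (chunk.desc.size != 0)
      printf ("first fit at offset %u, chunk size %u\n",
              (unsigned) chunk.desc.offset, (unsigned) chunk.desc.size);
    return 0;
  }
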
diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index 0f16e1cf00d..77c8587335c 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -330,6 +330,11 @@  struct ptx_device
 
 static struct ptx_device **ptx_devices;
 
+/* OpenMP kernels reserve a small amount of ".shared" space for use by
+   omp_alloc.  The size is configured using GOMP_NVPTX_LOWLAT_POOL, but the
+   default is set here.  */
+static unsigned lowlat_pool_size = 8*1024;
+
 static inline struct nvptx_thread *
 nvptx_thread (void)
 {
@@ -1196,6 +1201,23 @@  GOMP_OFFLOAD_init_device (int n)
       instantiated_devices++;
     }
 
+  const char *var_name = "GOMP_NVPTX_LOWLAT_POOL";
+  const char *env_var = secure_getenv (var_name);
+  notify_var (var_name, env_var);
+
+  if (env_var != NULL)
+    {
+      char *endptr;
+      errno = 0;
+      unsigned long val = strtoul (env_var, &endptr, 10);
+      if (endptr == env_var || *endptr != '\0'
+	  || errno == ERANGE || errno == EINVAL
+	  || val > UINT_MAX)
+	GOMP_PLUGIN_error ("Error parsing %s", var_name);
+      else
+	lowlat_pool_size = val;
+    }
+
   pthread_mutex_unlock (&ptx_dev_lock);
 
   return dev != NULL;
@@ -2021,7 +2042,7 @@  GOMP_OFFLOAD_run (int ord, void *tgt_fn, void *tgt_vars, void **args)
 		     " [(teams: %u), 1, 1] [(lanes: 32), (threads: %u), 1]\n",
 		     __FUNCTION__, fn_name, teams, threads);
   r = CUDA_CALL_NOCHECK (cuLaunchKernel, function, teams, 1, 1,
-			 32, threads, 1, 0, NULL, NULL, config);
+			 32, threads, 1, lowlat_pool_size, NULL, NULL, config);
   if (r != CUDA_SUCCESS)
     GOMP_PLUGIN_fatal ("cuLaunchKernel error: %s", cuda_error (r));
 
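Not part of the patch, but to see the effect of GOMP_NVPTX_LOWLAT_POOL end
to end, a probe along these lines (an invented example) reports the largest
single low-latency allocation that succeeds; running it with, say,
GOMP_NVPTX_LOWLAT_POOL=16384 versus the 8 KiB default should move the
reported figure accordingly:

  #include <omp.h>
  #include <stdio.h>

  #pragma omp requires dynamic_allocators

  int
  main (void)
  {
    int avail = 0;
    #pragma omp target map(from:avail)
    {
      /* Null fall-back, so a failed low-latency allocation returns NULL
         instead of falling back to global memory.  */
      omp_alloctrait_t traits[1] = { { omp_atk_fallback, omp_atv_null_fb } };
      omp_allocator_handle_t lowlat
        = omp_init_allocator (omp_low_lat_mem_space, 1, traits);

      /* Grow the request until allocation fails; the last success
         approximates the usable pool, minus bookkeeping overhead.  */
      for (int size = 8; size < 1 << 20; size += 8)
        {
          void *p = omp_alloc (size, lowlat);
          if (p == NULL)
            break;
          omp_free (p, lowlat);
          avail = size;
        }
      omp_destroy_allocator (lowlat);
    }
    printf ("largest low-latency allocation: %d bytes\n", avail);
    return 0;
  }
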
diff --git a/libgomp/testsuite/libgomp.c/allocators-1.c b/libgomp/testsuite/libgomp.c/allocators-1.c
new file mode 100644
index 00000000000..04968e4c83d
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/allocators-1.c
@@ -0,0 +1,56 @@ 
+/* { dg-do run } */
+
+/* Test that omp_alloc returns usable memory.  */
+
+#include <omp.h>
+
+#pragma omp requires dynamic_allocators
+
+void
+test (int n, omp_allocator_handle_t allocator)
+{
+  #pragma omp target map(to:n) map(to:allocator)
+  {
+    int *a;
+    a = (int *) omp_alloc(n*sizeof(int), allocator);
+
+    #pragma omp parallel
+    for (int i = 0; i < n; i++)
+      a[i] = i;
+
+    for (int i = 0; i < n; i++)
+      if (a[i] != i)
+	{
+	  __builtin_printf ("data mismatch at %i\n", i);
+	  __builtin_abort ();
+	}
+
+    omp_free(a, allocator);
+  }
+}
+
+int
+main ()
+{
+  // Smaller than low-latency memory limit
+  test (10, omp_default_mem_alloc);
+  test (10, omp_large_cap_mem_alloc);
+  test (10, omp_const_mem_alloc);
+  test (10, omp_high_bw_mem_alloc);
+  test (10, omp_low_lat_mem_alloc);
+  test (10, omp_cgroup_mem_alloc);
+  test (10, omp_pteam_mem_alloc);
+  test (10, omp_thread_mem_alloc);
+
+  // Larger than low-latency memory limit
+  test (100000, omp_default_mem_alloc);
+  test (100000, omp_large_cap_mem_alloc);
+  test (100000, omp_const_mem_alloc);
+  test (100000, omp_high_bw_mem_alloc);
+  test (100000, omp_low_lat_mem_alloc);
+  test (100000, omp_cgroup_mem_alloc);
+  test (100000, omp_pteam_mem_alloc);
+  test (100000, omp_thread_mem_alloc);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c/allocators-2.c b/libgomp/testsuite/libgomp.c/allocators-2.c
new file mode 100644
index 00000000000..a98f1b4c05e
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/allocators-2.c
@@ -0,0 +1,64 @@ 
+/* { dg-do run } */
+
+/* Test concurrent and repeated allocations.  */
+
+#include <omp.h>
+
+#pragma omp requires dynamic_allocators
+
+void
+test (int n, omp_allocator_handle_t allocator)
+{
+  #pragma omp target map(to:n) map(to:allocator)
+  {
+    int **a;
+    a = (int **) omp_alloc(n*sizeof(int*), allocator);
+
+    #pragma omp parallel for
+    for (int i = 0; i < n; i++)
+      {
+	/* Use 10x the size to trigger the low-latency fall-back.  */
+	a[i] = omp_alloc(sizeof(int)*10, allocator);
+	a[i][0] = i;
+      }
+
+    for (int i = 0; i < n; i++)
+      if (a[i][0] != i)
+	{
+	  __builtin_printf ("data mismatch at %i\n", i);
+	  __builtin_abort ();
+	}
+
+    #pragma omp parallel for
+    for (int i = 0; i < n; i++)
+      omp_free(a[i], allocator);
+
+    omp_free (a, allocator);
+  }
+}
+
+int
+main ()
+{
+  // Smaller than low-latency memory limit
+  test (10, omp_default_mem_alloc);
+  test (10, omp_large_cap_mem_alloc);
+  test (10, omp_const_mem_alloc);
+  test (10, omp_high_bw_mem_alloc);
+  test (10, omp_low_lat_mem_alloc);
+  test (10, omp_cgroup_mem_alloc);
+  test (10, omp_pteam_mem_alloc);
+  test (10, omp_thread_mem_alloc);
+
+  // Larger than low-latency memory limit (on aggregate)
+  test (1000, omp_default_mem_alloc);
+  test (1000, omp_large_cap_mem_alloc);
+  test (1000, omp_const_mem_alloc);
+  test (1000, omp_high_bw_mem_alloc);
+  test (1000, omp_low_lat_mem_alloc);
+  test (1000, omp_cgroup_mem_alloc);
+  test (1000, omp_pteam_mem_alloc);
+  test (1000, omp_thread_mem_alloc);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c/allocators-3.c b/libgomp/testsuite/libgomp.c/allocators-3.c
new file mode 100644
index 00000000000..45514c2a088
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/allocators-3.c
@@ -0,0 +1,42 @@ 
+/* { dg-do run } */
+
+/* Stress-test omp_alloc/omp_free under concurrency.  */
+
+#include <omp.h>
+#include <stdio.h>
+#include <stdlib.h>
+
+#pragma omp requires dynamic_allocators
+
+#define N 1000
+
+void
+test (omp_allocator_handle_t allocator)
+{
+  #pragma omp target map(to:allocator)
+  {
+    #pragma omp parallel for
+    for (int i = 0; i < N; i++)
+      for (int j = 0; j < N; j++)
+	{
+	  int *p = omp_alloc(sizeof(int), allocator);
+	  omp_free(p, allocator);
+	}
+  }
+}
+
+int
+main ()
+{
+  // Smaller than low-latency memory limit
+  test (omp_default_mem_alloc);
+  test (omp_large_cap_mem_alloc);
+  test (omp_const_mem_alloc);
+  test (omp_high_bw_mem_alloc);
+  test (omp_low_lat_mem_alloc);
+  test (omp_cgroup_mem_alloc);
+  test (omp_pteam_mem_alloc);
+  test (omp_thread_mem_alloc);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c/allocators-4.c b/libgomp/testsuite/libgomp.c/allocators-4.c
new file mode 100644
index 00000000000..9fa6aa1624f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/allocators-4.c
@@ -0,0 +1,196 @@ 
+/* { dg-do run } */
+
+/* Test that low-latency free chains are sound.  */
+
+#include <stddef.h>
+#include <omp.h>
+
+#pragma omp requires dynamic_allocators
+
+void
+check (int cond, const char *msg)
+{
+  if (!cond)
+    {
+      __builtin_printf ("%s\n", msg);
+      __builtin_abort ();
+    }
+}
+
+int
+main ()
+{
+  #pragma omp target
+  {
+    /* Ensure that the memory we get *is* low-latency with a null-fallback.  */
+    omp_alloctrait_t traits[1]
+      = { { omp_atk_fallback, omp_atv_null_fb } };
+    omp_allocator_handle_t lowlat = omp_init_allocator (omp_low_lat_mem_space,
+							1, traits);
+
+    int size = 4;
+
+    char *a = omp_alloc(size, lowlat);
+    char *b = omp_alloc(size, lowlat);
+    char *c = omp_alloc(size, lowlat);
+    char *d = omp_alloc(size, lowlat);
+
+    /* There are headers and padding to account for.  */
+    int size2 = size + (b-a);
+    int size3 = size + (c-a);
+    int size4 = size + (d-a) + 100; /* Arbitrary larger amount.  */
+
+    check (a != NULL && b != NULL && c != NULL && d != NULL,
+	   "omp_alloc returned NULL");
+
+    omp_free(a, lowlat);
+    char *p = omp_alloc (size, lowlat);
+    check (p == a, "allocate did not reuse first chunk");
+
+    omp_free(b, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not reuse second chunk");
+
+    omp_free(c, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not reuse third chunk");
+
+    omp_free(a, lowlat);
+    omp_free(b, lowlat);
+    p = omp_alloc (size2, lowlat);
+    check (p == a, "allocate did not coalesce first two chunks");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == a, "allocate did not split first chunk (1)");
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split first chunk (2)");
+
+    omp_free(b, lowlat);
+    omp_free(c, lowlat);
+    p = omp_alloc (size2, lowlat);
+    check (p == b, "allocate did not coalesce middle two chunks");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split second chunk (1)");
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not split second chunk (2)");
+
+    omp_free(b, lowlat);
+    omp_free(a, lowlat);
+    p = omp_alloc (size2, lowlat);
+    check (p == a, "allocate did not coalesce first two chunks, reverse free");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == a, "allocate did not split first chunk (1), reverse free");
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split first chunk (2), reverse free");
+
+    omp_free(c, lowlat);
+    omp_free(b, lowlat);
+    p = omp_alloc (size2, lowlat);
+    check (p == b, "allocate did not coalesce second two chunks, reverse free");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split second chunk (1), reverse free");
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not split second chunk (2), reverse free");
+
+    omp_free(a, lowlat);
+    omp_free(b, lowlat);
+    omp_free(c, lowlat);
+    p = omp_alloc (size3, lowlat);
+    check (p == a, "allocate did not coalesce first three chunks");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == a, "allocate did not split first chunk (1)");
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split first chunk (2)");
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not split first chunk (3)");
+
+    omp_free(b, lowlat);
+    omp_free(c, lowlat);
+    omp_free(d, lowlat);
+    p = omp_alloc (size3, lowlat);
+    check (p == b, "allocate did not coalesce last three chunks");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split second chunk (1)");
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not split second chunk (2)");
+    p = omp_alloc (size, lowlat);
+    check (p == d, "allocate did not split second chunk (3)");
+
+    omp_free(c, lowlat);
+    omp_free(b, lowlat);
+    omp_free(a, lowlat);
+    p = omp_alloc (size3, lowlat);
+    check (p == a, "allocate did not coalesce first three chunks, reverse free");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == a, "allocate did not split first chunk (1), reverse free");
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split first chunk (2), reverse free");
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not split first chunk (3), reverse free");
+
+    omp_free(d, lowlat);
+    omp_free(c, lowlat);
+    omp_free(b, lowlat);
+    p = omp_alloc (size3, lowlat);
+    check (p == b, "allocate did not coalesce second three chunks, reverse free");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split second chunk (1), reverse free");
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not split second chunk (2), reverse free");
+    p = omp_alloc (size, lowlat);
+    check (p == d, "allocate did not split second chunk (3), reverse free");
+
+    omp_free(c, lowlat);
+    omp_free(a, lowlat);
+    omp_free(b, lowlat);
+    p = omp_alloc (size3, lowlat);
+    check (p == a, "allocate did not coalesce first three chunks, mixed free");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == a, "allocate did not split first chunk (1), mixed free");
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split first chunk (2), mixed free");
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not split first chunk (3), mixed free");
+
+    omp_free(d, lowlat);
+    omp_free(b, lowlat);
+    omp_free(c, lowlat);
+    p = omp_alloc (size3, lowlat);
+    check (p == b, "allocate did not coalesce second three chunks, mixed free");
+
+    omp_free(p, lowlat);
+    p = omp_alloc (size, lowlat);
+    check (p == b, "allocate did not split second chunk (1), mixed free");
+    p = omp_alloc (size, lowlat);
+    check (p == c, "allocate did not split second chunk (2), mixed free");
+    p = omp_alloc (size, lowlat);
+    check (p == d, "allocate did not split second chunk (3), mixed free");
+
+    omp_free(a, lowlat);
+    omp_free(b, lowlat);
+    omp_free(c, lowlat);
+    omp_free(d, lowlat);
+    p = omp_alloc(size4, lowlat);
+    check (p == a, "allocate did not coalesce all memory");
+  }
+
+  return 0;
+}
+
diff --git a/libgomp/testsuite/libgomp.c/allocators-5.c b/libgomp/testsuite/libgomp.c/allocators-5.c
new file mode 100644
index 00000000000..9694010cf1f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/allocators-5.c
@@ -0,0 +1,63 @@ 
+/* { dg-do run } */
+
+/* Test calloc with omp_alloc.  */
+
+#include <omp.h>
+
+#pragma omp requires dynamic_allocators
+
+void
+test (int n, omp_allocator_handle_t allocator)
+{
+  #pragma omp target map(to:n) map(to:allocator)
+  {
+    int *a;
+    a = (int *) omp_calloc(n, sizeof(int), allocator);
+
+    for (int i = 0; i < n; i++)
+      if (a[i] != 0)
+	{
+	  __builtin_printf ("memory not zeroed at %i\n", i);
+	  __builtin_abort ();
+	}
+
+    #pragma omp parallel
+    for (int i = 0; i < n; i++)
+      a[i] = i;
+
+    for (int i = 0; i < n; i++)
+      if (a[i] != i)
+	{
+	  __builtin_printf ("data mismatch at %i\n", i);
+	  __builtin_abort ();
+	}
+
+    omp_free(a, allocator);
+  }
+}
+
+int
+main ()
+{
+  // Smaller than low-latency memory limit
+  test (10, omp_default_mem_alloc);
+  test (10, omp_large_cap_mem_alloc);
+  test (10, omp_const_mem_alloc);
+  test (10, omp_high_bw_mem_alloc);
+  test (10, omp_low_lat_mem_alloc);
+  test (10, omp_cgroup_mem_alloc);
+  test (10, omp_pteam_mem_alloc);
+  test (10, omp_thread_mem_alloc);
+
+  // Larger than low-latency memory limit
+  test (100000, omp_default_mem_alloc);
+  test (100000, omp_large_cap_mem_alloc);
+  test (100000, omp_const_mem_alloc);
+  test (100000, omp_high_bw_mem_alloc);
+  test (100000, omp_low_lat_mem_alloc);
+  test (100000, omp_cgroup_mem_alloc);
+  test (100000, omp_pteam_mem_alloc);
+  test (100000, omp_thread_mem_alloc);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c/allocators-6.c b/libgomp/testsuite/libgomp.c/allocators-6.c
new file mode 100644
index 00000000000..90bf73095ef
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/allocators-6.c
@@ -0,0 +1,117 @@ 
+/* { dg-do run } */
+
+/* Test that low-latency realloc and free chains are sound.  */
+
+#include <stddef.h>
+#include <omp.h>
+
+#pragma omp requires dynamic_allocators
+
+void
+check (int cond, const char *msg)
+{
+  if (!cond)
+    {
+      __builtin_printf ("%s\n", msg);
+      __builtin_abort ();
+    }
+}
+
+int
+main ()
+{
+  #pragma omp target
+  {
+    /* Ensure that the memory we get *is* low-latency with a null-fallback.  */
+    omp_alloctrait_t traits[1]
+      = { { omp_atk_fallback, omp_atv_null_fb } };
+    omp_allocator_handle_t lowlat = omp_init_allocator (omp_low_lat_mem_space,
+							1, traits);
+
+    int size = 16;
+
+    char *a = (char *)omp_alloc(size, lowlat);
+    char *b = (char *)omp_alloc(size, lowlat);
+    char *c = (char *)omp_alloc(size, lowlat);
+    char *d = (char *)omp_alloc(size, lowlat);
+
+    /* There are headers and padding to account for.  */
+    int size2 = size + (b-a);
+    int size3 = size + (c-a);
+    int size4 = size + (d-a) + 100; /* Arbitrary larger amount.  */
+
+    check (a != NULL && b != NULL && c != NULL && d != NULL,
+	   "omp_alloc returned NULL");
+
+    char *p = omp_realloc (b, size, lowlat, lowlat);
+    check (p == b, "realloc did not reuse same size chunk, no space after");
+
+    p = omp_realloc (b, size-8, lowlat, lowlat);
+    check (p == b, "realloc did not reuse smaller chunk, no space after");
+
+    p = omp_realloc (b, size, lowlat, lowlat);
+    check (p == b, "realloc did not reuse original size chunk, no space after");
+
+    /* Make space after b.  */
+    omp_free(c, lowlat);
+
+    p = omp_realloc (b, size, lowlat, lowlat);
+    check (p == b, "realloc did not reuse same size chunk");
+
+    p = omp_realloc (b, size-8, lowlat, lowlat);
+    check (p == b, "realloc did not reuse smaller chunk");
+
+    p = omp_realloc (b, size, lowlat, lowlat);
+    check (p == b, "realloc did not reuse original size chunk");
+
+    p = omp_realloc (b, size+8, lowlat, lowlat);
+    check (p == b, "realloc did not extend in place by a little");
+
+    p = omp_realloc (b, size2, lowlat, lowlat);
+    check (p == b, "realloc did not extend into whole next chunk");
+
+    p = omp_realloc (b, size3, lowlat, lowlat);
+    check (p != b, "realloc did not move b elsewhere");
+    omp_free (p, lowlat);
+
+
+    p = omp_realloc (a, size, lowlat, lowlat);
+    check (p == a, "realloc did not reuse same size chunk, first position");
+
+    p = omp_realloc (a, size-8, lowlat, lowlat);
+    check (p == a, "realloc did not reuse smaller chunk, first position");
+
+    p = omp_realloc (a, size, lowlat, lowlat);
+    check (p == a, "realloc did not reuse original size chunk, first position");
+
+    p = omp_realloc (a, size+8, lowlat, lowlat);
+    check (p == a, "realloc did not extend in place by a little, first position");
+
+    p = omp_realloc (a, size3, lowlat, lowlat);
+    check (p == a, "realloc did not extend into whole next chunk, first position");
+
+    p = omp_realloc (a, size4, lowlat, lowlat);
+    check (p != a, "realloc did not move a elsewhere, first position");
+    omp_free (p, lowlat);
+
+
+    p = omp_realloc (d, size, lowlat, lowlat);
+    check (p == d, "realloc did not reuse same size chunk, last position");
+
+    p = omp_realloc (d, size-8, lowlat, lowlat);
+    check (p == d, "realloc did not reuse smaller chunk, last position");
+
+    p = omp_realloc (d, size, lowlat, lowlat);
+    check (p == d, "realloc did not reuse original size chunk, last position");
+
+    p = omp_realloc (d, size+8, lowlat, lowlat);
+    check (p == d, "realloc did not extend in place by a little, last position");
+
+    /* Larger than low latency memory.  */
+    p = omp_realloc(d, 100000000, lowlat, lowlat);
+    check (p == NULL, "realloc did not fail on OOM");
+  }
+
+  return 0;
+}
+