From patchwork Tue Mar 8 11:30:59 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Abid Qadeer X-Patchwork-Id: 51784 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 9B627385DC10 for ; Tue, 8 Mar 2022 11:35:32 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from esa3.mentor.iphmx.com (esa3.mentor.iphmx.com [68.232.137.180]) by sourceware.org (Postfix) with ESMTPS id 55D19385C421; Tue, 8 Mar 2022 11:31:50 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 55D19385C421 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mentor.com X-IronPort-AV: E=Sophos;i="5.90,164,1643702400"; d="scan'208";a="72701969" Received: from orw-gwy-01-in.mentorg.com ([192.94.38.165]) by esa3.mentor.iphmx.com with ESMTP; 08 Mar 2022 03:31:47 -0800 IronPort-SDR: VysnImb/mOnku8KOVHXKgv+cbZjziKIobGLyDBPGdUyxKb6qpL1nahNmH8YXIAaYhg/zTNUi8S eUsv2Hiw6mGMWIwLqw2RSL4WTQ+lzwZEH2XH1pOnxtfJi1otPmqTv05RauPovi0YzfUl2KR67B P2EUnL5Lt2CjuGwTI8vqPu5y14qYPmIJ0Rlf3+HM+B+g8wPf1IqnyoDT22vtMuxbkwHaMdtmTV WSMbwlNEh8P/0xBzSxyCFCw36sj6BUORBb9BLeACZqDHFcazMrkv6zHBX1+EpTjMq0Bwq1gJBf Qg0= From: Hafiz Abid Qadeer To: , Subject: [PATCH 5/5] openmp: -foffload-memory=pinned Date: Tue, 8 Mar 2022 11:30:59 +0000 Message-ID: <20220308113059.688551-6-abidh@codesourcery.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20220308113059.688551-1-abidh@codesourcery.com> References: <20220308113059.688551-1-abidh@codesourcery.com> MIME-Version: 1.0 X-Originating-IP: [137.202.0.90] X-ClientProxiedBy: svr-ies-mbx-05.mgc.mentorg.com (139.181.222.5) To SVR-IES-MBX-03.mgc.mentorg.com (139.181.222.3) X-Spam-Status: No, score=-12.5 required=5.0 tests=BAYES_00, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, SPF_HELO_PASS, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: jakub@redhat.com, ams@codesourcery.com, joseph@codesourcery.com Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" From: Andrew Stubbs Implement the -foffload-memory=pinned option such that libgomp is instructed to enable fully-pinned memory at start-up. The option is intended to provide a performance boost to certain offload programs without modifying the code. This feature only works on Linux, at present, and simply calls mlockall to enable always-on memory pinning. It requires that the ulimit feature is set high enough to accommodate all the program's memory usage. In this mode the ompx_pinned_memory_alloc feature is disabled as it is not needed and may conflict. gcc/ChangeLog: * omp-low.cc (omp_enable_pinned_mode): New function. (execute_lower_omp): Call omp_enable_pinned_mode. libgomp/ChangeLog: * config/linux/allocator.c (always_pinned_mode): New variable. (GOMP_enable_pinned_mode): New function. (linux_memspace_alloc): Disable pinning when always_pinned_mode set. (linux_memspace_calloc): Likewise. (linux_memspace_free): Likewise. (linux_memspace_realloc): Likewise. * libgomp.map (GOMP_5.1.1): New version space with GOMP_enable_pinned_mode. * testsuite/libgomp.c/alloc-pinned-7.c: New test. gcc/testsuite/ChangeLog: * c-c++-common/gomp/alloc-pinned-1.c: New test. --- gcc/omp-low.cc | 68 +++++++++++++++++++ .../c-c++-common/gomp/alloc-pinned-1.c | 28 ++++++++ libgomp/config/linux/allocator.c | 26 +++++++ libgomp/libgomp.map | 5 ++ libgomp/testsuite/libgomp.c/alloc-pinned-7.c | 66 ++++++++++++++++++ 5 files changed, 193 insertions(+) create mode 100644 gcc/testsuite/c-c++-common/gomp/alloc-pinned-1.c create mode 100644 libgomp/testsuite/libgomp.c/alloc-pinned-7.c diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc index ec08d59f676..ce21b3bd6f8 100644 --- a/gcc/omp-low.cc +++ b/gcc/omp-low.cc @@ -14441,6 +14441,70 @@ lower_omp (gimple_seq *body, omp_context *ctx) input_location = saved_location; } +/* Emit a constructor function to enable -foffload-memory=pinned + at runtime. Libgomp handles the OS mode setting, but we need to trigger + it by calling GOMP_enable_pinned mode before the program proper runs. */ + +static void +omp_enable_pinned_mode () +{ + static bool visited = false; + if (visited) + return; + visited = true; + + /* Create a new function like this: + + static void __attribute__((constructor)) + __set_pinned_mode () + { + GOMP_enable_pinned_mode (); + } + */ + + tree name = get_identifier ("__set_pinned_mode"); + tree voidfntype = build_function_type_list (void_type_node, NULL_TREE); + tree decl = build_decl (UNKNOWN_LOCATION, FUNCTION_DECL, name, voidfntype); + + TREE_STATIC (decl) = 1; + TREE_USED (decl) = 1; + DECL_ARTIFICIAL (decl) = 1; + DECL_IGNORED_P (decl) = 0; + TREE_PUBLIC (decl) = 0; + DECL_UNINLINABLE (decl) = 1; + DECL_EXTERNAL (decl) = 0; + DECL_CONTEXT (decl) = NULL_TREE; + DECL_INITIAL (decl) = make_node (BLOCK); + BLOCK_SUPERCONTEXT (DECL_INITIAL (decl)) = decl; + DECL_STATIC_CONSTRUCTOR (decl) = 1; + DECL_ATTRIBUTES (decl) = tree_cons (get_identifier ("constructor"), + NULL_TREE, NULL_TREE); + + tree t = build_decl (UNKNOWN_LOCATION, RESULT_DECL, NULL_TREE, + void_type_node); + DECL_ARTIFICIAL (t) = 1; + DECL_IGNORED_P (t) = 1; + DECL_CONTEXT (t) = decl; + DECL_RESULT (decl) = t; + + push_struct_function (decl); + init_tree_ssa (cfun); + + tree callname = get_identifier ("GOMP_enable_pinned_mode"); + tree calldecl = build_decl (UNKNOWN_LOCATION, FUNCTION_DECL, callname, + voidfntype); + gcall *call = gimple_build_call (calldecl, 0); + + gimple_seq seq = NULL; + gimple_seq_add_stmt (&seq, call); + gimple_set_body (decl, gimple_build_bind (NULL_TREE, seq, NULL)); + + cfun->function_end_locus = UNKNOWN_LOCATION; + cfun->curr_properties |= PROP_gimple_any; + pop_cfun (); + cgraph_node::add_new_function (decl, true); +} + /* Main entry point. */ static unsigned int @@ -14497,6 +14561,10 @@ execute_lower_omp (void) for (auto task_stmt : task_cpyfns) finalize_task_copyfn (task_stmt); task_cpyfns.release (); + + if (flag_offload_memory == OFFLOAD_MEMORY_PINNED) + omp_enable_pinned_mode (); + return 0; } diff --git a/gcc/testsuite/c-c++-common/gomp/alloc-pinned-1.c b/gcc/testsuite/c-c++-common/gomp/alloc-pinned-1.c new file mode 100644 index 00000000000..e0e08019bff --- /dev/null +++ b/gcc/testsuite/c-c++-common/gomp/alloc-pinned-1.c @@ -0,0 +1,28 @@ +/* { dg-do run } */ +/* { dg-additional-options "-foffload-memory=pinned" } */ +/* { dg-xfail-run-if "Pinning not implemented on this host" { ! *-*-linux-gnu } } */ + +#if __cplusplus +#define EXTERNC extern "C" +#else +#define EXTERNC +#endif + +/* Intercept the libgomp initialization call to check it happens. */ + +int good = 0; + +EXTERNC void +GOMP_enable_pinned_mode () +{ + good = 1; +} + +int +main () +{ + if (!good) + __builtin_exit (1); + + return 0; +} diff --git a/libgomp/config/linux/allocator.c b/libgomp/config/linux/allocator.c index face524259c..4bd5bd6b930 100644 --- a/libgomp/config/linux/allocator.c +++ b/libgomp/config/linux/allocator.c @@ -39,9 +39,26 @@ #include #include "libgomp.h" +static bool always_pinned_mode = false; + +/* This function is called by the compiler when -foffload-memory=pinned + is used. */ + +void +GOMP_enable_pinned_mode () +{ + if (mlockall (MCL_CURRENT | MCL_FUTURE) != 0) + gomp_error ("failed to pin all memory (ulimit too low?)"); + else + always_pinned_mode = true; +} + static void * linux_memspace_alloc (omp_memspace_handle_t memspace, size_t size, int pin) { + /* Explicit pinning may not be required. */ + pin = pin && !always_pinned_mode; + if (memspace == ompx_unified_shared_mem_space) { return gomp_usm_alloc (size, GOMP_DEVICE_ICV); @@ -69,6 +86,9 @@ linux_memspace_alloc (omp_memspace_handle_t memspace, size_t size, int pin) static void * linux_memspace_calloc (omp_memspace_handle_t memspace, size_t size, int pin) { + /* Explicit pinning may not be required. */ + pin = pin && !always_pinned_mode; + if (memspace == ompx_unified_shared_mem_space) { void *ret = gomp_usm_alloc (size, GOMP_DEVICE_ICV); @@ -86,6 +106,9 @@ static void linux_memspace_free (omp_memspace_handle_t memspace, void *addr, size_t size, int pin) { + /* Explicit pinning may not be required. */ + pin = pin && !always_pinned_mode; + if (memspace == ompx_unified_shared_mem_space) gomp_usm_free (addr, GOMP_DEVICE_ICV); else if (pin) @@ -98,6 +121,9 @@ static void * linux_memspace_realloc (omp_memspace_handle_t memspace, void *addr, size_t oldsize, size_t size, int oldpin, int pin) { + /* Explicit pinning may not be required. */ + pin = pin && !always_pinned_mode; + if (memspace == ompx_unified_shared_mem_space) goto manual_realloc; else if (oldpin && pin) diff --git a/libgomp/libgomp.map b/libgomp/libgomp.map index 2ac58094169..40402dc9893 100644 --- a/libgomp/libgomp.map +++ b/libgomp/libgomp.map @@ -402,6 +402,11 @@ GOMP_5.1 { GOMP_teams4; } GOMP_5.0.1; +GOMP_5.1.1 { + global: + GOMP_enable_pinned_mode; +} GOMP_5.1; + OACC_2.0 { global: acc_get_num_devices; diff --git a/libgomp/testsuite/libgomp.c/alloc-pinned-7.c b/libgomp/testsuite/libgomp.c/alloc-pinned-7.c new file mode 100644 index 00000000000..6fd19b46a5c --- /dev/null +++ b/libgomp/testsuite/libgomp.c/alloc-pinned-7.c @@ -0,0 +1,66 @@ +/* { dg-do run } */ +/* { dg-additional-options "-foffload-memory=pinned" } */ + +/* { dg-xfail-run-if "Pinning not implemented on this host" { ! *-*-linux-gnu } } */ + +/* Test that pinned memory works. */ + +#ifdef __linux__ +#include +#include +#include +#include + +#include + +int +get_pinned_mem () +{ + int pid = getpid (); + char buf[100]; + sprintf (buf, "/proc/%d/status", pid); + + FILE *proc = fopen (buf, "r"); + if (!proc) + abort (); + while (fgets (buf, 100, proc)) + { + int val; + if (sscanf (buf, "VmLck: %d", &val)) + { + printf ("lock %d\n", val); + fclose (proc); + return val; + } + } + abort (); +} +#else +int +get_pinned_mem () +{ + return 0; +} + +#define mlockall(...) 0 +#endif + +#include + +/* Allocate more than a page each time, but stay within the ulimit. */ +#define SIZE 10*1024 + +int +main () +{ + // Sanity check + if (get_pinned_mem () == 0) + { + /* -foffload-memory=pinned has failed, but maybe that's because + isufficient pinned memory was available. */ + if (mlockall (MCL_CURRENT | MCL_FUTURE) == 0) + abort (); + } + + return 0; +}