[v3,1/6] libgomp: basic pinned memory on Linux

Message ID 20231211170405.2538247-2-ams@codesourcery.com
State Committed
Commit 348874f0baac0f22c98ab11abbfa65fd172f6bdd
Series libgomp: OpenMP pinned memory omp_alloc

Commit Message

Andrew Stubbs Dec. 11, 2023, 5:04 p.m. UTC
  Implement the OpenMP pinned memory trait on Linux hosts using the mlock
syscall.  Pinned allocations are performed using mmap, not malloc, to ensure
that they can be unpinned safely when freed.

This implementation will work OK for page-scale allocations, and finer-grained
allocations will be implemented in a future patch.
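
For readers who want the idea in isolation, the following is a minimal sketch of the mmap-plus-mlock approach described above. It is not the code from this patch: the names pinned_alloc and pinned_free are hypothetical, and the sketch assumes a Linux host with MAP_ANONYMOUS available.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper (not part of libgomp): map SIZE bytes of private
   anonymous memory and lock it into RAM.  Returns NULL on failure.  */
static void *
pinned_alloc (size_t size)
{
  void *addr = mmap (NULL, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (addr == MAP_FAILED)
    return NULL;

  /* mlock fails when the locked-memory limit (RLIMIT_MEMLOCK) is too low
     for the request.  */
  if (mlock (addr, size) != 0)
    {
      munmap (addr, size);
      return NULL;
    }
  return addr;
}

/* Unmapping releases the lock together with the pages, so no munlock call
   is needed and no other allocation sharing a page can be unpinned.  */
static void
pinned_free (void *addr, size_t size)
{
  if (addr != NULL)
    munmap (addr, size);
}
```

Because every mapping starts on its own page, freeing one pinned block cannot affect the pinning of another, which is the property the commit message relies on. The cost is that each allocation consumes at least one whole page.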

libgomp/ChangeLog:

	* allocator.c (MEMSPACE_ALLOC): Add PIN.
	(MEMSPACE_CALLOC): Add PIN.
	(MEMSPACE_REALLOC): Add PIN.
	(MEMSPACE_FREE): Add PIN.
	(MEMSPACE_VALIDATE): Add PIN.
	(omp_init_allocator): Use MEMSPACE_VALIDATE to check pinning.
	(omp_aligned_alloc): Add pinning to all MEMSPACE_* calls.
	(omp_aligned_calloc): Likewise.
	(omp_realloc): Likewise.
	(omp_free): Likewise.
	* config/linux/allocator.c: New file.
	* config/nvptx/allocator.c (MEMSPACE_ALLOC): Add PIN.
	(MEMSPACE_CALLOC): Add PIN.
	(MEMSPACE_REALLOC): Add PIN.
	(MEMSPACE_FREE): Add PIN.
	(MEMSPACE_VALIDATE): Add PIN.
	* config/gcn/allocator.c (MEMSPACE_ALLOC): Add PIN.
	(MEMSPACE_CALLOC): Add PIN.
	(MEMSPACE_REALLOC): Add PIN.
	(MEMSPACE_FREE): Add PIN.
	(MEMSPACE_VALIDATE): Add PIN.
	* libgomp.texi: Switch pinned trait to supported.
	* testsuite/libgomp.c/alloc-pinned-1.c: New test.
	* testsuite/libgomp.c/alloc-pinned-2.c: New test.
	* testsuite/libgomp.c/alloc-pinned-3.c: New test.
	* testsuite/libgomp.c/alloc-pinned-4.c: New test.

Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
---
 libgomp/allocator.c                          |  65 +++++---
 libgomp/config/gcn/allocator.c               |  21 +--
 libgomp/config/linux/allocator.c             | 111 +++++++++++++
 libgomp/config/nvptx/allocator.c             |  21 +--
 libgomp/libgomp.texi                         |   3 +-
 libgomp/testsuite/libgomp.c/alloc-pinned-1.c | 115 ++++++++++++++
 libgomp/testsuite/libgomp.c/alloc-pinned-2.c | 120 ++++++++++++++
 libgomp/testsuite/libgomp.c/alloc-pinned-3.c | 156 +++++++++++++++++++
 libgomp/testsuite/libgomp.c/alloc-pinned-4.c | 150 ++++++++++++++++++
 9 files changed, 716 insertions(+), 46 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.c/alloc-pinned-1.c
 create mode 100644 libgomp/testsuite/libgomp.c/alloc-pinned-2.c
 create mode 100644 libgomp/testsuite/libgomp.c/alloc-pinned-3.c
 create mode 100644 libgomp/testsuite/libgomp.c/alloc-pinned-4.c
  

Comments

Tobias Burnus Dec. 12, 2023, 9:02 a.m. UTC | #1
On 11.12.23 18:04, Andrew Stubbs wrote:
> Implement the OpenMP pinned memory trait on Linux hosts using the mlock
> syscall.  Pinned allocations are performed using mmap, not malloc, to ensure
> that they can be unpinned safely when freed.
>
> This implementation will work OK for page-scale allocations, and finer-grained
> allocations will be implemented in a future patch.

LGTM.

Thanks,

Tobias

  
Andrew Stubbs Dec. 13, 2023, 2:28 p.m. UTC | #2
On 12/12/2023 09:02, Tobias Burnus wrote:
> On 11.12.23 18:04, Andrew Stubbs wrote:
>> Implement the OpenMP pinned memory trait on Linux hosts using the mlock
>> syscall.  Pinned allocations are performed using mmap, not malloc, to 
>> ensure
>> that they can be unpinned safely when freed.
>>
>> This implementation will work OK for page-scale allocations, and 
>> finer-grained
>> allocations will be implemented in a future patch.
> 
> LGTM.
> 
> Thanks,
> 
> Tobias

Thank you, this one is now pushed.

Andrew
  

Patch

diff --git a/libgomp/allocator.c b/libgomp/allocator.c
index a8a80f8028d..666adf9a3a9 100644
--- a/libgomp/allocator.c
+++ b/libgomp/allocator.c
@@ -38,27 +38,30 @@ 
 #define omp_max_predefined_alloc omp_thread_mem_alloc
 
 /* These macros may be overridden in config/<target>/allocator.c.
+   The defaults (no override) are to return NULL for pinned memory requests
+   and pass through to the regular OS calls otherwise.
    The following definitions (ab)use comma operators to avoid unused
    variable errors.  */
 #ifndef MEMSPACE_ALLOC
-#define MEMSPACE_ALLOC(MEMSPACE, SIZE) \
-  malloc (((void)(MEMSPACE), (SIZE)))
+#define MEMSPACE_ALLOC(MEMSPACE, SIZE, PIN) \
+  (PIN ? NULL : malloc (((void)(MEMSPACE), (SIZE))))
 #endif
 #ifndef MEMSPACE_CALLOC
-#define MEMSPACE_CALLOC(MEMSPACE, SIZE) \
-  calloc (1, (((void)(MEMSPACE), (SIZE))))
+#define MEMSPACE_CALLOC(MEMSPACE, SIZE, PIN) \
+  (PIN ? NULL : calloc (1, (((void)(MEMSPACE), (SIZE)))))
 #endif
 #ifndef MEMSPACE_REALLOC
-#define MEMSPACE_REALLOC(MEMSPACE, ADDR, OLDSIZE, SIZE) \
-  realloc (ADDR, (((void)(MEMSPACE), (void)(OLDSIZE), (SIZE))))
+#define MEMSPACE_REALLOC(MEMSPACE, ADDR, OLDSIZE, SIZE, OLDPIN, PIN) \
+   ((PIN) || (OLDPIN) ? NULL \
+   : realloc (ADDR, (((void)(MEMSPACE), (void)(OLDSIZE), (SIZE)))))
 #endif
 #ifndef MEMSPACE_FREE
-#define MEMSPACE_FREE(MEMSPACE, ADDR, SIZE) \
-  free (((void)(MEMSPACE), (void)(SIZE), (ADDR)))
+#define MEMSPACE_FREE(MEMSPACE, ADDR, SIZE, PIN) \
+  if (!(PIN)) free (((void)(MEMSPACE), (void)(SIZE), (ADDR)))
 #endif
 #ifndef MEMSPACE_VALIDATE
-#define MEMSPACE_VALIDATE(MEMSPACE, ACCESS) \
-  (((void)(MEMSPACE), (void)(ACCESS), 1))
+#define MEMSPACE_VALIDATE(MEMSPACE, ACCESS, PIN) \
+  (PIN ? 0 : ((void)(MEMSPACE), (void)(ACCESS), 1))
 #endif
 
 /* Map the predefined allocators to the correct memory space.
@@ -439,12 +442,8 @@  omp_init_allocator (omp_memspace_handle_t memspace, int ntraits,
     }
 #endif
 
-  /* No support for this so far.  */
-  if (data.pinned)
-    return omp_null_allocator;
-
   /* Reject unsupported memory spaces.  */
-  if (!MEMSPACE_VALIDATE (data.memspace, data.access))
+  if (!MEMSPACE_VALIDATE (data.memspace, data.access, data.pinned))
     return omp_null_allocator;
 
   ret = gomp_malloc (sizeof (struct omp_allocator_data));
@@ -586,7 +585,8 @@  retry:
 	}
       else
 #endif
-	ptr = MEMSPACE_ALLOC (allocator_data->memspace, new_size);
+	ptr = MEMSPACE_ALLOC (allocator_data->memspace, new_size,
+			      allocator_data->pinned);
       if (ptr == NULL)
 	{
 #ifdef HAVE_SYNC_BUILTINS
@@ -623,7 +623,8 @@  retry:
 	  memspace = (allocator_data
 		      ? allocator_data->memspace
 		      : predefined_alloc_mapping[allocator]);
-	  ptr = MEMSPACE_ALLOC (memspace, new_size);
+	  ptr = MEMSPACE_ALLOC (memspace, new_size,
+				allocator_data && allocator_data->pinned);
 	}
       if (ptr == NULL)
 	goto fail;
@@ -694,6 +695,7 @@  omp_free (void *ptr, omp_allocator_handle_t allocator)
 {
   struct omp_mem_header *data;
   omp_memspace_handle_t memspace = omp_default_mem_space;
+  int pinned = false;
 
   if (ptr == NULL)
     return;
@@ -735,6 +737,7 @@  omp_free (void *ptr, omp_allocator_handle_t allocator)
 #endif
 
       memspace = allocator_data->memspace;
+      pinned = allocator_data->pinned;
     }
   else
     {
@@ -759,7 +762,7 @@  omp_free (void *ptr, omp_allocator_handle_t allocator)
       memspace = predefined_alloc_mapping[data->allocator];
     }
 
-  MEMSPACE_FREE (memspace, data->ptr, data->size);
+  MEMSPACE_FREE (memspace, data->ptr, data->size, pinned);
 }
 
 ialias (omp_free)
@@ -890,7 +893,8 @@  retry:
 	}
       else
 #endif
-	ptr = MEMSPACE_CALLOC (allocator_data->memspace, new_size);
+	ptr = MEMSPACE_CALLOC (allocator_data->memspace, new_size,
+			       allocator_data->pinned);
       if (ptr == NULL)
 	{
 #ifdef HAVE_SYNC_BUILTINS
@@ -929,7 +933,8 @@  retry:
 	  memspace = (allocator_data
 		      ? allocator_data->memspace
 		      : predefined_alloc_mapping[allocator]);
-	  ptr = MEMSPACE_CALLOC (memspace, new_size);
+	  ptr = MEMSPACE_CALLOC (memspace, new_size,
+				 allocator_data && allocator_data->pinned);
 	}
       if (ptr == NULL)
 	goto fail;
@@ -1161,9 +1166,13 @@  retry:
 #endif
       if (prev_size)
 	new_ptr = MEMSPACE_REALLOC (allocator_data->memspace, data->ptr,
-				    data->size, new_size);
+				    data->size, new_size,
+				    (free_allocator_data
+				     && free_allocator_data->pinned),
+				    allocator_data->pinned);
       else
-	new_ptr = MEMSPACE_ALLOC (allocator_data->memspace, new_size);
+	new_ptr = MEMSPACE_ALLOC (allocator_data->memspace, new_size,
+				  allocator_data->pinned);
       if (new_ptr == NULL)
 	{
 #ifdef HAVE_SYNC_BUILTINS
@@ -1216,10 +1225,14 @@  retry:
 	  memspace = (allocator_data
 		      ? allocator_data->memspace
 		      : predefined_alloc_mapping[allocator]);
-	  new_ptr = MEMSPACE_REALLOC (memspace, data->ptr, data->size, new_size);
+	  new_ptr = MEMSPACE_REALLOC (memspace, data->ptr, data->size, new_size,
+				      (free_allocator_data
+				       && free_allocator_data->pinned),
+				      allocator_data && allocator_data->pinned);
 	}
       if (new_ptr == NULL)
 	goto fail;
+
       ret = (char *) new_ptr + sizeof (struct omp_mem_header);
       ((struct omp_mem_header *) ret)[-1].ptr = new_ptr;
       ((struct omp_mem_header *) ret)[-1].size = new_size;
@@ -1249,7 +1262,8 @@  retry:
 	  memspace = (allocator_data
 		      ? allocator_data->memspace
 		      : predefined_alloc_mapping[allocator]);
-	  new_ptr = MEMSPACE_ALLOC (memspace, new_size);
+	  new_ptr = MEMSPACE_ALLOC (memspace, new_size,
+				    allocator_data && allocator_data->pinned);
 	}
       if (new_ptr == NULL)
 	goto fail;
@@ -1304,7 +1318,8 @@  retry:
     was_memspace = (free_allocator_data
 		    ? free_allocator_data->memspace
 		    : predefined_alloc_mapping[free_allocator]);
-    MEMSPACE_FREE (was_memspace, data->ptr, data->size);
+    int was_pinned = (free_allocator_data && free_allocator_data->pinned);
+    MEMSPACE_FREE (was_memspace, data->ptr, data->size, was_pinned);
   }
   return ret;
 
diff --git a/libgomp/config/gcn/allocator.c b/libgomp/config/gcn/allocator.c
index e9a95d683f9..679218f08d2 100644
--- a/libgomp/config/gcn/allocator.c
+++ b/libgomp/config/gcn/allocator.c
@@ -109,16 +109,17 @@  gcn_memspace_validate (omp_memspace_handle_t memspace, unsigned access)
 	  || access != omp_atv_all);
 }
 
-#define MEMSPACE_ALLOC(MEMSPACE, SIZE) \
-  gcn_memspace_alloc (MEMSPACE, SIZE)
-#define MEMSPACE_CALLOC(MEMSPACE, SIZE) \
-  gcn_memspace_calloc (MEMSPACE, SIZE)
-#define MEMSPACE_REALLOC(MEMSPACE, ADDR, OLDSIZE, SIZE) \
-  gcn_memspace_realloc (MEMSPACE, ADDR, OLDSIZE, SIZE)
-#define MEMSPACE_FREE(MEMSPACE, ADDR, SIZE) \
-  gcn_memspace_free (MEMSPACE, ADDR, SIZE)
-#define MEMSPACE_VALIDATE(MEMSPACE, ACCESS) \
-  gcn_memspace_validate (MEMSPACE, ACCESS)
+#define MEMSPACE_ALLOC(MEMSPACE, SIZE, PIN) \
+  gcn_memspace_alloc (MEMSPACE, ((void)(PIN), (SIZE)))
+#define MEMSPACE_CALLOC(MEMSPACE, SIZE, PIN) \
+  gcn_memspace_calloc (MEMSPACE, ((void)(PIN), (SIZE)))
+#define MEMSPACE_REALLOC(MEMSPACE, ADDR, OLDSIZE, SIZE, OLDPIN, PIN) \
+  gcn_memspace_realloc (MEMSPACE, ADDR, OLDSIZE, \
+			((void)(PIN), (void)(OLDPIN), (SIZE)))
+#define MEMSPACE_FREE(MEMSPACE, ADDR, SIZE, PIN) \
+  gcn_memspace_free (MEMSPACE, ADDR, ((void)(PIN), (SIZE)))
+#define MEMSPACE_VALIDATE(MEMSPACE, ACCESS, PIN) \
+  gcn_memspace_validate (MEMSPACE, ((void)(PIN), (ACCESS)))
 
 /* The default low-latency memspace implies omp_atv_all, which is incompatible
    with the LDS memory space.  */
diff --git a/libgomp/config/linux/allocator.c b/libgomp/config/linux/allocator.c
index 64b1b4b9623..269d0d607d8 100644
--- a/libgomp/config/linux/allocator.c
+++ b/libgomp/config/linux/allocator.c
@@ -34,4 +34,115 @@ 
 #define LIBGOMP_USE_LIBNUMA
 #endif
 
+/* Implement malloc routines that can handle pinned memory on Linux.
+   
+   It's possible to use mlock on any heap memory, but using munlock is
+   problematic if there are multiple pinned allocations on the same page.
+   Tracking all that manually would be possible, but adds overhead. This may
+   be worth it if there are a lot of small allocations getting pinned, but
+   this seems less likely in an HPC application.
+
+   Instead we optimize for large pinned allocations, and use mmap to ensure
+   that two pinned allocations don't share the same page.  This also means
+   that large allocations don't pin extra pages by being poorly aligned.  */
+
+#define _GNU_SOURCE
+#include <sys/mman.h>
+#include <string.h>
+#include "libgomp.h"
+
+static void *
+linux_memspace_alloc (omp_memspace_handle_t memspace, size_t size, int pin)
+{
+  (void)memspace;
+
+  if (pin)
+    {
+      /* Note that mmap always returns zeroed memory and is therefore also a
+	 suitable implementation of calloc.  */
+      void *addr = mmap (NULL, size, PROT_READ | PROT_WRITE,
+			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+      if (addr == MAP_FAILED)
+	return NULL;
+
+      if (mlock (addr, size))
+	{
+	  gomp_debug (0, "libgomp: failed to pin %zu bytes of"
+		      " memory (ulimit too low?)\n", size);
+	  munmap (addr, size);
+	  return NULL;
+	}
+
+      return addr;
+    }
+  else
+    return malloc (size);
+}
+
+static void *
+linux_memspace_calloc (omp_memspace_handle_t memspace, size_t size, int pin)
+{
+  if (pin)
+    return linux_memspace_alloc (memspace, size, pin);
+  else
+    return calloc (1, size);
+}
+
+static void
+linux_memspace_free (omp_memspace_handle_t memspace, void *addr, size_t size,
+		     int pin)
+{
+  (void)memspace;
+
+  if (pin)
+    munmap (addr, size);
+  else
+    free (addr);
+}
+
+static void *
+linux_memspace_realloc (omp_memspace_handle_t memspace, void *addr,
+			size_t oldsize, size_t size, int oldpin, int pin)
+{
+  if (oldpin && pin)
+    {
+      void *newaddr = mremap (addr, oldsize, size, MREMAP_MAYMOVE);
+      if (newaddr == MAP_FAILED)
+	return NULL;
+
+      return newaddr;
+    }
+  else if (oldpin || pin)
+    {
+      void *newaddr = linux_memspace_alloc (memspace, size, pin);
+      if (newaddr)
+	{
+	  memcpy (newaddr, addr, oldsize < size ? oldsize : size);
+	  linux_memspace_free (memspace, addr, oldsize, oldpin);
+	}
+
+      return newaddr;
+    }
+  else
+    return realloc (addr, size);
+}
+
+static int
+linux_memspace_validate (omp_memspace_handle_t, unsigned, int)
+{
+  /* Everything should be accepted on Linux, including pinning.  */
+  return 1;
+}
+
+#define MEMSPACE_ALLOC(MEMSPACE, SIZE, PIN) \
+  linux_memspace_alloc (MEMSPACE, SIZE, PIN)
+#define MEMSPACE_CALLOC(MEMSPACE, SIZE, PIN) \
+  linux_memspace_calloc (MEMSPACE, SIZE, PIN)
+#define MEMSPACE_REALLOC(MEMSPACE, ADDR, OLDSIZE, SIZE, OLDPIN, PIN) \
+  linux_memspace_realloc (MEMSPACE, ADDR, OLDSIZE, SIZE, OLDPIN, PIN)
+#define MEMSPACE_FREE(MEMSPACE, ADDR, SIZE, PIN) \
+  linux_memspace_free (MEMSPACE, ADDR, SIZE, PIN)
+#define MEMSPACE_VALIDATE(MEMSPACE, ACCESS, PIN) \
+  linux_memspace_validate (MEMSPACE, ACCESS, PIN)
+
 #include "../../allocator.c"
diff --git a/libgomp/config/nvptx/allocator.c b/libgomp/config/nvptx/allocator.c
index a3302411bcb..6a226a81b75 100644
--- a/libgomp/config/nvptx/allocator.c
+++ b/libgomp/config/nvptx/allocator.c
@@ -123,16 +123,17 @@  nvptx_memspace_validate (omp_memspace_handle_t memspace, unsigned access)
 #endif
 }
 
-#define MEMSPACE_ALLOC(MEMSPACE, SIZE) \
-  nvptx_memspace_alloc (MEMSPACE, SIZE)
-#define MEMSPACE_CALLOC(MEMSPACE, SIZE) \
-  nvptx_memspace_calloc (MEMSPACE, SIZE)
-#define MEMSPACE_REALLOC(MEMSPACE, ADDR, OLDSIZE, SIZE) \
-  nvptx_memspace_realloc (MEMSPACE, ADDR, OLDSIZE, SIZE)
-#define MEMSPACE_FREE(MEMSPACE, ADDR, SIZE) \
-  nvptx_memspace_free (MEMSPACE, ADDR, SIZE)
-#define MEMSPACE_VALIDATE(MEMSPACE, ACCESS) \
-  nvptx_memspace_validate (MEMSPACE, ACCESS)
+#define MEMSPACE_ALLOC(MEMSPACE, SIZE, PIN) \
+  nvptx_memspace_alloc (MEMSPACE, ((void)(PIN), (SIZE)))
+#define MEMSPACE_CALLOC(MEMSPACE, SIZE, PIN) \
+  nvptx_memspace_calloc (MEMSPACE, ((void)(PIN), (SIZE)))
+#define MEMSPACE_REALLOC(MEMSPACE, ADDR, OLDSIZE, SIZE, OLDPIN, PIN) \
+  nvptx_memspace_realloc (MEMSPACE, ADDR, OLDSIZE, \
+			  ((void)(OLDPIN), (void)(PIN), (SIZE)))
+#define MEMSPACE_FREE(MEMSPACE, ADDR, SIZE, PIN) \
+  nvptx_memspace_free (MEMSPACE, ADDR, ((void)(PIN), (SIZE)))
+#define MEMSPACE_VALIDATE(MEMSPACE, ACCESS, PIN) \
+  nvptx_memspace_validate (MEMSPACE, ((void)(PIN), (ACCESS)))
 
 /* The default low-latency memspace implies omp_atv_all, which is incompatible
    with the .shared memory space.  */
diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 67a111265a0..5838311b7d8 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -5751,7 +5751,8 @@  a @code{nearest} allocation.
 
 Additional notes regarding the traits:
 @itemize
-@item The @code{pinned} trait is unsupported.
+@item The @code{pinned} trait is supported on Linux hosts, but is subject to
+      the OS @code{ulimit}/@code{rlimit} locked memory settings.
 @item The default for the @code{pool_size} trait is no pool and for every
       (re)allocation the associated library routine is called, which might
       internally use a memory pool.
diff --git a/libgomp/testsuite/libgomp.c/alloc-pinned-1.c b/libgomp/testsuite/libgomp.c/alloc-pinned-1.c
new file mode 100644
index 00000000000..e17a21f0a6c
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/alloc-pinned-1.c
@@ -0,0 +1,115 @@ 
+/* { dg-do run } */
+
+/* { dg-xfail-run-if "Pinning not implemented on this host" { ! *-*-linux-gnu } } */
+
+/* Test that pinned memory works.  */
+
+#include <stdio.h>
+#include <stdlib.h>
+
+#ifdef __linux__
+#include <sys/types.h>
+#include <unistd.h>
+
+#include <sys/mman.h>
+#include <sys/resource.h>
+
+#define PAGE_SIZE sysconf(_SC_PAGESIZE)
+#define CHECK_SIZE(SIZE) { \
+  struct rlimit limit; \
+  if (getrlimit (RLIMIT_MEMLOCK, &limit) \
+      || limit.rlim_cur <= SIZE) \
+    fprintf (stderr, "insufficient lockable memory; please increase ulimit\n"); \
+  }
+
+int
+get_pinned_mem ()
+{
+  int pid = getpid ();
+  char buf[100];
+  sprintf (buf, "/proc/%d/status", pid);
+
+  FILE *proc = fopen (buf, "r");
+  if (!proc)
+    abort ();
+  while (fgets (buf, 100, proc))
+    {
+      int val;
+      if (sscanf (buf, "VmLck: %d", &val))
+	{
+	  fclose (proc);
+	  return val;
+	}
+    }
+  abort ();
+}
+#else
+#define PAGE_SIZE 1024 /* unknown */
+#define CHECK_SIZE(SIZE) fprintf (stderr, "OS unsupported\n");
+#define EXPECT_OMP_NULL_ALLOCATOR
+
+int
+get_pinned_mem ()
+{
+  return 0;
+}
+#endif
+
+static void
+verify0 (char *p, size_t s)
+{
+  for (size_t i = 0; i < s; ++i)
+    if (p[i] != 0)
+      abort ();
+}
+
+#include <omp.h>
+
+int
+main ()
+{
+  /* Allocate at least a page each time, allowing space for overhead,
+     but stay within the ulimit.  */
+  const int SIZE = PAGE_SIZE - 128;
+  CHECK_SIZE (SIZE * 5);  // This is intended to help diagnose failures
+
+  const omp_alloctrait_t traits[] = {
+      { omp_atk_pinned, 1 }
+  };
+  omp_allocator_handle_t allocator = omp_init_allocator (omp_default_mem_space,
+							 1, traits);
+
+#ifdef EXPECT_OMP_NULL_ALLOCATOR
+  if (allocator == omp_null_allocator)
+    return 0;
+#endif
+
+  // Sanity check
+  if (get_pinned_mem () != 0)
+    abort ();
+
+  void *p = omp_alloc (SIZE, allocator);
+  if (!p)
+    abort ();
+
+  int amount = get_pinned_mem ();
+  if (amount == 0)
+    abort ();
+
+  p = omp_realloc (p, SIZE * 2, allocator, allocator);
+
+  int amount2 = get_pinned_mem ();
+  if (amount2 <= amount)
+    abort ();
+
+  /* SIZE*2 ensures that it doesn't slot into the space possibly
+     vacated by realloc.  */
+  p = omp_calloc (1, SIZE * 2, allocator);
+
+  if (get_pinned_mem () <= amount2)
+    abort ();
+
+  verify0 (p, SIZE * 2);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c/alloc-pinned-2.c b/libgomp/testsuite/libgomp.c/alloc-pinned-2.c
new file mode 100644
index 00000000000..3cf322cfbc8
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/alloc-pinned-2.c
@@ -0,0 +1,120 @@ 
+/* { dg-do run } */
+
+/* { dg-xfail-run-if "Pinning not implemented on this host" { ! *-*-linux-gnu } } */
+
+/* Test that pinned memory works (pool_size code path).  */
+
+#include <stdio.h>
+#include <stdlib.h>
+
+#ifdef __linux__
+#include <sys/types.h>
+#include <unistd.h>
+
+#include <sys/mman.h>
+#include <sys/resource.h>
+
+#define PAGE_SIZE sysconf(_SC_PAGESIZE)
+#define CHECK_SIZE(SIZE) { \
+  struct rlimit limit; \
+  if (getrlimit (RLIMIT_MEMLOCK, &limit) \
+      || limit.rlim_cur <= SIZE) \
+    fprintf (stderr, "insufficient lockable memory; please increase ulimit\n"); \
+  }
+
+int
+get_pinned_mem ()
+{
+  int pid = getpid ();
+  char buf[100];
+  sprintf (buf, "/proc/%d/status", pid);
+
+  FILE *proc = fopen (buf, "r");
+  if (!proc)
+    abort ();
+  while (fgets (buf, 100, proc))
+    {
+      int val;
+      if (sscanf (buf, "VmLck: %d", &val))
+	{
+	  fclose (proc);
+	  return val;
+	}
+    }
+  abort ();
+}
+#else
+#define PAGE_SIZE 1024 /* unknown */
+#define CHECK_SIZE(SIZE) fprintf (stderr, "OS unsupported\n");
+#define EXPECT_OMP_NULL_ALLOCATOR
+
+int
+get_pinned_mem ()
+{
+  return 0;
+}
+#endif
+
+static void
+verify0 (char *p, size_t s)
+{
+  for (size_t i = 0; i < s; ++i)
+    if (p[i] != 0)
+      abort ();
+}
+
+#include <omp.h>
+
+int
+main ()
+{
+  /* Allocate at least a page each time, allowing space for overhead,
+     but stay within the ulimit.  */
+  const int SIZE = PAGE_SIZE - 128;
+  CHECK_SIZE (SIZE * 5);  // This is intended to help diagnose failures
+
+  const omp_alloctrait_t traits[] = {
+      { omp_atk_pinned, 1 },
+      { omp_atk_pool_size, SIZE * 8 }
+  };
+  omp_allocator_handle_t allocator = omp_init_allocator (omp_default_mem_space,
+							 2, traits);
+
+#ifdef EXPECT_OMP_NULL_ALLOCATOR
+  if (allocator == omp_null_allocator)
+    return 0;
+#endif
+
+  // Sanity check
+  if (get_pinned_mem () != 0)
+    abort ();
+
+  void *p = omp_alloc (SIZE, allocator);
+  if (!p)
+    abort ();
+
+  int amount = get_pinned_mem ();
+  if (amount == 0)
+    abort ();
+
+  p = omp_realloc (p, SIZE * 2, allocator, allocator);
+  if (!p)
+    abort ();
+
+  int amount2 = get_pinned_mem ();
+  if (amount2 <= amount)
+    abort ();
+
+  /* SIZE*2 ensures that it doesn't slot into the space possibly
+     vacated by realloc.  */
+  p = omp_calloc (1, SIZE * 2, allocator);
+  if (!p)
+    abort ();
+
+  if (get_pinned_mem () <= amount2)
+    abort ();
+
+  verify0 (p, SIZE * 2);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c/alloc-pinned-3.c b/libgomp/testsuite/libgomp.c/alloc-pinned-3.c
new file mode 100644
index 00000000000..53e4720cc9c
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/alloc-pinned-3.c
@@ -0,0 +1,156 @@ 
+/* { dg-do run } */
+
+/* Test that pinned memory fails correctly.  */
+
+#include <stdio.h>
+#include <stdlib.h>
+
+#ifdef __linux__
+#include <sys/types.h>
+#include <unistd.h>
+
+#include <sys/mman.h>
+#include <sys/resource.h>
+
+#define PAGE_SIZE sysconf(_SC_PAGESIZE)
+
+int
+get_pinned_mem ()
+{
+  int pid = getpid ();
+  char buf[100];
+  sprintf (buf, "/proc/%d/status", pid);
+
+  FILE *proc = fopen (buf, "r");
+  if (!proc)
+    abort ();
+  while (fgets (buf, 100, proc))
+    {
+      int val;
+      if (sscanf (buf, "VmLck: %d", &val))
+	{
+	  fclose (proc);
+	  return val;
+	}
+    }
+  abort ();
+}
+
+void
+set_pin_limit (int size)
+{
+  struct rlimit limit;
+  if (getrlimit (RLIMIT_MEMLOCK, &limit))
+    abort ();
+  limit.rlim_cur = (limit.rlim_max < size ? limit.rlim_max : size);
+  if (setrlimit (RLIMIT_MEMLOCK, &limit))
+    abort ();
+}
+#else
+#define PAGE_SIZE 10000 * 1024 /* unknown */
+#define EXPECT_OMP_NULL_ALLOCATOR
+
+int
+get_pinned_mem ()
+{
+  return 0;
+}
+
+void
+set_pin_limit ()
+{
+}
+#endif
+
+static void
+verify0 (char *p, size_t s)
+{
+  for (size_t i = 0; i < s; ++i)
+    if (p[i] != 0)
+      abort ();
+}
+
+#include <omp.h>
+
+int
+main ()
+{
+  /* This needs to be large enough to cover multiple pages.  */
+  const int SIZE = PAGE_SIZE * 4;
+
+  /* Pinned memory, no fallback.  */
+  const omp_alloctrait_t traits1[] = {
+      { omp_atk_pinned, 1 },
+      { omp_atk_fallback, omp_atv_null_fb }
+  };
+  omp_allocator_handle_t allocator1 = omp_init_allocator (omp_default_mem_space,
+							  2, traits1);
+
+  /* Pinned memory, plain memory fallback.  */
+  const omp_alloctrait_t traits2[] = {
+      { omp_atk_pinned, 1 },
+      { omp_atk_fallback, omp_atv_default_mem_fb }
+  };
+  omp_allocator_handle_t allocator2 = omp_init_allocator (omp_default_mem_space,
+							  2, traits2);
+
+#ifdef EXPECT_OMP_NULL_ALLOCATOR
+  if (allocator1 == omp_null_allocator
+      && allocator2 == omp_null_allocator)
+    return 0;
+#endif
+
+  /* Ensure that the limit is smaller than the allocation.  */
+  set_pin_limit (SIZE / 2);
+
+  // Sanity check
+  if (get_pinned_mem () != 0)
+    abort ();
+
+  // Should fail
+  void *p1 = omp_alloc (SIZE, allocator1);
+  if (p1)
+    abort ();
+
+  // Should fail
+  void *p2 = omp_calloc (1, SIZE, allocator1);
+  if (p2)
+    abort ();
+
+  // Should fall back
+  void *p3 = omp_alloc (SIZE, allocator2);
+  if (!p3)
+    abort ();
+
+  // Should fall back
+  void *p4 = omp_calloc (1, SIZE, allocator2);
+  if (!p4)
+    abort ();
+  verify0 (p4, SIZE);
+
+  // Should fail to realloc
+  void *notpinned = omp_alloc (SIZE, omp_default_mem_alloc);
+  void *p5 = omp_realloc (notpinned, SIZE, allocator1, omp_default_mem_alloc);
+  if (!notpinned || p5)
+    abort ();
+
+  // Should fall back to no realloc needed
+  void *p6 = omp_realloc (notpinned, SIZE, allocator2, omp_default_mem_alloc);
+  if (p6 != notpinned)
+    abort ();
+
+  // No memory should have been pinned
+  int amount = get_pinned_mem ();
+  if (amount != 0)
+    abort ();
+
+  // Ensure free works correctly
+  if (p1) omp_free (p1, allocator1);
+  if (p2) omp_free (p2, allocator1);
+  if (p3) omp_free (p3, allocator2);
+  if (p4) omp_free (p4, allocator2);
+  // p5 and notpinned have been reallocated
+  if (p6) omp_free (p6, omp_default_mem_alloc);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c/alloc-pinned-4.c b/libgomp/testsuite/libgomp.c/alloc-pinned-4.c
new file mode 100644
index 00000000000..9d850c23e4b
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/alloc-pinned-4.c
@@ -0,0 +1,150 @@ 
+/* { dg-do run } */
+
+/* Test that pinned memory fails correctly, pool_size code path.  */
+
+#include <stdio.h>
+#include <stdlib.h>
+
+#ifdef __linux__
+#include <sys/types.h>
+#include <unistd.h>
+
+#include <sys/mman.h>
+#include <sys/resource.h>
+
+#define PAGE_SIZE sysconf(_SC_PAGESIZE)
+
+int
+get_pinned_mem ()
+{
+  int pid = getpid ();
+  char buf[100];
+  sprintf (buf, "/proc/%d/status", pid);
+
+  FILE *proc = fopen (buf, "r");
+  if (!proc)
+    abort ();
+  while (fgets (buf, 100, proc))
+    {
+      int val;
+      if (sscanf (buf, "VmLck: %d", &val))
+	{
+	  fclose (proc);
+	  return val;
+	}
+    }
+  abort ();
+}
+
+void
+set_pin_limit (int size)
+{
+  struct rlimit limit;
+  if (getrlimit (RLIMIT_MEMLOCK, &limit))
+    abort ();
+  limit.rlim_cur = (limit.rlim_max < size ? limit.rlim_max : size);
+  if (setrlimit (RLIMIT_MEMLOCK, &limit))
+    abort ();
+}
+#else
+#define PAGE_SIZE 10000 * 1024 /* unknown */
+#define EXPECT_OMP_NULL_ALLOCATOR
+
+int
+get_pinned_mem ()
+{
+  return 0;
+}
+
+void
+set_pin_limit ()
+{
+}
+#endif
+
+static void
+verify0 (char *p, size_t s)
+{
+  for (size_t i = 0; i < s; ++i)
+    if (p[i] != 0)
+      abort ();
+}
+
+#include <omp.h>
+
+int
+main ()
+{
+  /* This needs to be large enough to cover multiple pages.  */
+  const int SIZE = PAGE_SIZE * 4;
+
+  /* Pinned memory, no fallback.  */
+  const omp_alloctrait_t traits1[] = {
+      { omp_atk_pinned, 1 },
+      { omp_atk_fallback, omp_atv_null_fb },
+      { omp_atk_pool_size, SIZE * 8 }
+  };
+  omp_allocator_handle_t allocator1 = omp_init_allocator (omp_default_mem_space,
+							  3, traits1);
+
+  /* Pinned memory, plain memory fallback.  */
+  const omp_alloctrait_t traits2[] = {
+      { omp_atk_pinned, 1 },
+      { omp_atk_fallback, omp_atv_default_mem_fb },
+      { omp_atk_pool_size, SIZE * 8 }
+  };
+  omp_allocator_handle_t allocator2 = omp_init_allocator (omp_default_mem_space,
+							  3, traits2);
+
+#ifdef EXPECT_OMP_NULL_ALLOCATOR
+  if (allocator1 == omp_null_allocator
+      && allocator2 == omp_null_allocator)
+    return 0;
+#endif
+
+  /* Ensure that the limit is smaller than the allocation.  */
+  set_pin_limit (SIZE / 2);
+
+  // Sanity check
+  if (get_pinned_mem () != 0)
+    abort ();
+
+  // Should fail
+  void *p = omp_alloc (SIZE, allocator1);
+  if (p)
+    abort ();
+
+  // Should fail
+  p = omp_calloc (1, SIZE, allocator1);
+  if (p)
+    abort ();
+
+  // Should fall back
+  p = omp_alloc (SIZE, allocator2);
+  if (!p)
+    abort ();
+
+  // Should fall back
+  p = omp_calloc (1, SIZE, allocator2);
+  if (!p)
+    abort ();
+  verify0 (p, SIZE);
+
+  // Should fail to realloc
+  void *notpinned = omp_alloc (SIZE, omp_default_mem_alloc);
+  p = omp_realloc (notpinned, SIZE, allocator1, omp_default_mem_alloc);
+  if (!notpinned || p)
+    abort ();
+
+  // Should fall back to no realloc needed
+  p = omp_realloc (notpinned, SIZE, allocator2, omp_default_mem_alloc);
+  if (p != notpinned)
+    abort ();
+
+  // No memory should have been pinned
+  int amount = get_pinned_mem ();
+  if (amount != 0)
+    abort ();
+
+  return 0;
+}
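
The four tests above check pinning by reading the VmLck field of the process status file. As a standalone illustration of that technique, here is a hedged sketch. The helper name locked_kb is hypothetical, it reads /proc/self/status rather than building the path from getpid as the tests do, and it returns -1 instead of aborting when the field is unavailable.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: return the VmLck value (in kB) reported for the
   current process, or -1 if it cannot be determined.  */
static long
locked_kb (void)
{
  FILE *f = fopen ("/proc/self/status", "r");
  if (f == NULL)
    return -1;

  char line[256];
  long kb = -1;
  while (fgets (line, sizeof line, f) != NULL)
    /* Only a line that starts with "VmLck:" and carries a number matches;
       on any other line kb is left untouched.  */
    if (sscanf (line, "VmLck: %ld", &kb) == 1)
      break;

  fclose (f);
  return kb;
}
```

Comparing the return value before and after an allocation, as the tests do with get_pinned_mem, shows whether the locked total grew. Checking that sscanf returned exactly 1 avoids treating an EOF result as a match.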