Add and use new glibc-internal futex API.

Message ID 1433960847.10071.96.camel@triegel.csb
State Dropped

Commit Message

Torvald Riegel June 10, 2015, 6:27 p.m. UTC
On Mon, 2015-06-08 at 22:42 +0200, Torvald Riegel wrote:
> This adds new functions for futex operations, starting with wait,
> abstimed_wait, reltimed_wait, wake.  They add documentation and error
> checking according to the current draft of the Linux kernel futex
> manpage.
> 
> Waiting with absolute or relative timeouts is split into separate
> functions.  This allows for removing a few cases of code duplication in
> pthreads code, which uses absolute timeouts; also, it allows us to put
> platform-specific code to go from an absolute to a relative timeout into
> the platform-specific futex abstractions.  The latter is done by adding
> lll_futex_abstimed_wait.  I expect that we will refactor this later on,
> depending on how we do the lll_ parts.
> 
> Futex operations that can be canceled are also split out into separate
> functions suffixed by "_cancelable".
> 
> This is a revision of
> https://sourceware.org/ml/libc-alpha/2015-01/msg00215.html
> 
> I have transformed all the lll_futex_* uses that I'm aware of except the
> following:
> * Anything related to lowlevellock or mutexes.
> * sparc-specific files: I'll send a follow-up patch so this can be
> reviewed and tested separately.
> * pthread condvar: I'll send a follow-up patch on top of my revised
> condvar implementation.
> * All tls.h files: Siddhesh wants to look at this area in more detail,
> so I'll work with him to see how to best move this to the new API.
> 
> Interacting with futex words requires atomic accesses, which isn't done
> by most of glibc's current futex callers.  I did not fix these in this
> patch to keep the patch easier to review: Using the new futex API is in
> most cases a pretty mechanical change, which I didn't want to obfuscate
> by lots of other changes to atomics.  The core motivation behind this
> patch is to add error handling and improve the internal futex API, not
> to change any synchronization.
> Nonetheless, using atomics where they are needed is on my list of things
> to do, so don't worry :)
> Specifically, this is already done in my new condvar implementation
> and in the new semaphore.  It will get done for rwlock in the new
> implementation I'm working on.  I also plan to update the barrier
> implementation.  For TLS, this is something Siddhesh has on his radar,
> AFAIK.  Adhemerval is working on a new cancellation scheme.
> 
> I kept the old semaphore code unchanged for now because I don't have a
> testing setup for this ready.
> 
> Adhemerval, is this API compatible with the new cancellation scheme you
> are working on?
> 
> Roland, okay for NaCl?
> 
> Tested on x86_64-linux.
> 
> 
> 2015-06-08  Torvald Riegel  <triegel@redhat.com>
> 
> 	* nptl/futex-internal.h: New file.
> 	* nptl/allocatestack.c (setxid_mark_thread): Use futex wrappers with
> 	error checking.
> 	(setxid_unmark_thread): Likewise.
> 	(__nptl_setxid): Likewise.
> 	(__wait_lookup_done): Likewise.
> 	* nptl/cancellation.c (__pthread_disable_asynccancel): Likewise.
> 	* nptl/nptl-init.c (sighandler_setxid): Likewise.
> 	* nptl/pthread_create.c (START_THREAD_DEFN): Likewise.
> 	* nptl/pthread_once.c (clear_once_control): Likewise.
> 	(__pthread_once_slow): Likewise.
> 	* nptl/pthread_rwlock_rdlock.c (__pthread_rwlock_rdlock_slow):
> 	Likewise.
> 	(__pthread_rwlock_rdlock): Likewise.
> 	* nptl/pthread_rwlock_timedrdlock.c (pthread_rwlock_timedrdlock):
> 	Likewise.
> 	* nptl/pthread_rwlock_timedwrlock.c (pthread_rwlock_timedwrlock):
> 	Likewise.
> 	* nptl/pthread_rwlock_tryrdlock.c (__pthread_rwlock_tryrdlock):
> 	Likewise.
> 	* nptl/pthread_rwlock_unlock.c (__pthread_rwlock_unlock): Likewise.
> 	* nptl/pthread_rwlock_wrlock.c (__pthread_rwlock_wrlock_slow):
> 	Likewise.
> 	* nptl/unregister-atfork.c (__unregister_atfork): Likewise.
> 	* sysdeps/nacl/exit-thread.h (__exit_thread): Likewise.
> 	* sysdeps/nptl/aio_misc.h (AIO_MISC_NOTIFY, AIO_MISC_WAIT): Likewise.
> 	* sysdeps/nptl/fork.c (__libc_fork): Likewise.
> 	* sysdeps/nptl/gai_misc.h (GAI_MISC_NOTIFY, GAI_MISC_WAIT): Likewise.
> 	* nptl/pthread_barrier_wait.c (pthread_barrier_wait): Likewise.
> 	* nptl/pthread_barrier_init.c (pthread_barrier_init): Add comment.
> 	* nptl/sem_init.c (futex_private_if_supported): Remove.
> 	* nptl/sem_post.c (futex_wake): Likewise.
> 	* nptl/sem_waitcommon.c (futex_abstimed_wait, futex_wake): Likewise.
> 	(do_futex_wait): Use futex wrappers with error checking.
> 	* nptl/sem_open.c (sem_open): Use FUTEX_SHARED.
> 	* sysdeps/nptl/lowlevellock-futex.h (lll_futex_abstimed_wait): New.
> 	* sysdeps/unix/sysv/linux/lowlevellock-futex.h
> 	(lll_futex_abstimed_wait): New.
> 	* sysdeps/nacl/lowlevellock-futex.h (lll_futex_abstimed_wait): New.
> 
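
To illustrate the absolute-to-relative timeout conversion that the quoted description moves into the platform-specific layer, here is a minimal sketch. The helper name abs_to_rel_timeout and its signature are assumptions made for illustration; the patch itself does this work behind lll_futex_abstimed_wait, and the real code obtains the current time itself rather than taking it as a parameter.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Hypothetical helper (not part of the patch): convert the absolute
   deadline ABSTIME into a relative timeout REL, given the current time NOW.
   Returns 0 on success and ETIMEDOUT if the deadline has already passed.
   Both inputs must be normalized, i.e. 0 <= tv_nsec < 1000000000.  */
static int
abs_to_rel_timeout (const struct timespec *abstime,
                    const struct timespec *now, struct timespec *rel)
{
  assert (abstime->tv_nsec >= 0 && abstime->tv_nsec < 1000000000L);
  assert (now->tv_nsec >= 0 && now->tv_nsec < 1000000000L);

  rel->tv_sec = abstime->tv_sec - now->tv_sec;
  rel->tv_nsec = abstime->tv_nsec - now->tv_nsec;
  if (rel->tv_nsec < 0)
    {
      /* Borrow one second so that tv_nsec is non-negative again.  */
      rel->tv_nsec += 1000000000L;
      --rel->tv_sec;
    }

  /* A negative relative timeout means the deadline is in the past; report
     expiry here instead of passing a negative value to the kernel.  */
  if (rel->tv_sec < 0)
    return ETIMEDOUT;
  return 0;
}
```

Passing the current time in as a parameter keeps the sketch deterministic and easy to test.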

I've revised this patch to address Roland's feedback regarding the
"ignore_value (futex_wait ())" occurrences.  I've added a
futex_wait_simple call that returns void and explains the use case.
There are comments in cases where the pattern is not visible in the
immediate surroundings.

futex_wait_simple was the best name I could come up with.  It's meant to
be for the simplest futex_wait use there is: Just using it as an
additional point of potential blocking in what otherwise looks like a
simple busy-waiting loop.  See the comments for futex_wait_simple.
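
For readers without the glibc tree at hand, the futex_wait_simple pattern can be sketched outside glibc with the raw Linux futex syscall. This is only an approximation under stated assumptions: wait_simple, wake_all and wait_while_equal are hypothetical names, the sketch omits the error checking that the real wrappers perform, and it assumes that atomic_uint has the same size and alignment as a 32-bit futex word (true for GCC on Linux).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <limits.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical stand-in for futex_wait_simple: block only if *WORD still
   equals EXPECTED, and deliberately ignore why the call returned.  */
static void
wait_simple (atomic_uint *word, unsigned int expected)
{
  (void) syscall (SYS_futex, word, FUTEX_WAIT_PRIVATE, expected,
                  NULL, NULL, 0);
}

/* Hypothetical stand-in for futex_wake that wakes every waiter.  */
static void
wake_all (atomic_uint *word)
{
  (void) syscall (SYS_futex, word, FUTEX_WAKE_PRIVATE, INT_MAX,
                  NULL, NULL, 0);
}

/* The pattern itself: what looks like a busy-waiting loop, with the futex
   wait as an additional point of potential blocking.  The futex word is
   re-checked after every return, so spurious wake-ups, signals, and value
   mismatches all simply lead to another iteration.  */
static void
wait_while_equal (atomic_uint *word, unsigned int value)
{
  assert (word != NULL);
  while (atomic_load_explicit (word, memory_order_relaxed) == value)
    wait_simple (word, value);
}
```

A thread that changes the condition would store the new value to the futex word atomically and then call wake_all on it; the waiter never needs to know which wake-up released it.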

Otherwise, no changes except a comment on some synchronization code that
looked weird and that we should look at in detail in the future.

Patch

commit ac6c54657cb221f94e76df350a152914f4f1319b
Author: Torvald Riegel <triegel@redhat.com>
Date:   Thu Dec 4 14:12:23 2014 +0100

    Add and use new glibc-internal futex API.
    
    This adds new functions for futex operations, starting with wait,
    abstimed_wait, reltimed_wait, wake.  They add documentation and error
    checking according to the current draft of the Linux kernel futex manpage.
    
    Waiting with absolute or relative timeouts is split into separate functions.
    This allows for removing a few cases of code duplication in pthreads code,
    which uses absolute timeouts; also, it allows us to put platform-specific
    code to go from an absolute to a relative timeout into the platform-specific
    futex abstractions.  The latter is done by adding lll_futex_abstimed_wait.
    I expect that we will refactor this later on, depending on how we do the
    lll_ parts.
    
    Futex operations that can be canceled are also split out into separate
    functions suffixed by "_cancelable".
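
The error-checking policy described above can be summarized in a small sketch for the untimed wait case. The function name wait_error_is_expected is an assumption made for illustration; in the patch the same decision is written as a switch statement inside each wrapper, which returns the positive error code for expected results and calls abort for everything else.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical classifier mirroring the policy for untimed waits.  ERR is
   zero or a negated errno value, as returned by the low-level futex call.
   Returns true if the result can arise in a correct execution and should be
   handed to the caller, false if the wrapper should abort instead.  */
static bool
wait_error_is_expected (int err)
{
  assert (err <= 0);
  switch (err)
    {
    case 0:        /* Woken by a futex operation or spuriously.  */
    case -EAGAIN:  /* Futex word did not match the expected value.  */
    case -EINTR:   /* Interrupted by a signal.  */
      return true;
    default:       /* Bug in glibc or the application, or unknown code.  */
      return false;
    }
}
```

The timed variants would additionally accept -ETIMEDOUT as an expected result.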

diff --git a/nptl/allocatestack.c b/nptl/allocatestack.c
index 8e620c4..7595186 100644
--- a/nptl/allocatestack.c
+++ b/nptl/allocatestack.c
@@ -29,6 +29,7 @@ 
 #include <tls.h>
 #include <list.h>
 #include <lowlevellock.h>
+#include <futex-internal.h>
 #include <kernel-features.h>
 #include <stack-aliasing.h>
 
@@ -987,7 +988,8 @@  setxid_mark_thread (struct xid_command *cmdp, struct pthread *t)
   if (t->setxid_futex == -1
       && ! atomic_compare_and_exchange_bool_acq (&t->setxid_futex, -2, -1))
     do
-      lll_futex_wait (&t->setxid_futex, -2, LLL_PRIVATE);
+      futex_wait_simple ((unsigned int *) &t->setxid_futex, -2,
+			 FUTEX_PRIVATE);
     while (t->setxid_futex == -2);
 
   /* Don't let the thread exit before the setxid handler runs.  */
@@ -1005,7 +1007,7 @@  setxid_mark_thread (struct xid_command *cmdp, struct pthread *t)
 	  if ((ch & SETXID_BITMASK) == 0)
 	    {
 	      t->setxid_futex = 1;
-	      lll_futex_wake (&t->setxid_futex, 1, LLL_PRIVATE);
+	      futex_wake ((unsigned int *) &t->setxid_futex, 1, FUTEX_PRIVATE);
 	    }
 	  return;
 	}
@@ -1032,7 +1034,7 @@  setxid_unmark_thread (struct xid_command *cmdp, struct pthread *t)
 
   /* Release the futex just in case.  */
   t->setxid_futex = 1;
-  lll_futex_wake (&t->setxid_futex, 1, LLL_PRIVATE);
+  futex_wake ((unsigned int *) &t->setxid_futex, 1, FUTEX_PRIVATE);
 }
 
 
@@ -1141,7 +1143,8 @@  __nptl_setxid (struct xid_command *cmdp)
       int cur = cmdp->cntr;
       while (cur != 0)
 	{
-	  lll_futex_wait (&cmdp->cntr, cur, LLL_PRIVATE);
+	  futex_wait_simple ((unsigned int *) &cmdp->cntr, cur,
+			     FUTEX_PRIVATE);
 	  cur = cmdp->cntr;
 	}
     }
@@ -1251,7 +1254,8 @@  __wait_lookup_done (void)
 	continue;
 
       do
-	lll_futex_wait (gscope_flagp, THREAD_GSCOPE_FLAG_WAIT, LLL_PRIVATE);
+	futex_wait_simple ((unsigned int *) gscope_flagp,
+			   THREAD_GSCOPE_FLAG_WAIT, FUTEX_PRIVATE);
       while (*gscope_flagp == THREAD_GSCOPE_FLAG_WAIT);
     }
 
@@ -1273,7 +1277,8 @@  __wait_lookup_done (void)
 	continue;
 
       do
-	lll_futex_wait (gscope_flagp, THREAD_GSCOPE_FLAG_WAIT, LLL_PRIVATE);
+	futex_wait_simple ((unsigned int *) gscope_flagp,
+			   THREAD_GSCOPE_FLAG_WAIT, FUTEX_PRIVATE);
       while (*gscope_flagp == THREAD_GSCOPE_FLAG_WAIT);
     }
 
diff --git a/nptl/cancellation.c b/nptl/cancellation.c
index deac1eb..ea1bbc9 100644
--- a/nptl/cancellation.c
+++ b/nptl/cancellation.c
@@ -19,6 +19,7 @@ 
 #include <setjmp.h>
 #include <stdlib.h>
 #include "pthreadP.h"
+#include <futex-internal.h>
 
 
 /* The next two functions are similar to pthread_setcanceltype() but
@@ -93,7 +94,7 @@  __pthread_disable_asynccancel (int oldtype)
   while (__builtin_expect ((newval & (CANCELING_BITMASK | CANCELED_BITMASK))
 			   == CANCELING_BITMASK, 0))
     {
-      lll_futex_wait (&self->cancelhandling, newval, LLL_PRIVATE);
+      futex_wait_simple (&self->cancelhandling, newval, FUTEX_PRIVATE);
       newval = THREAD_GETMEM (self, cancelhandling);
     }
 }
diff --git a/nptl/futex-internal.h b/nptl/futex-internal.h
new file mode 100644
index 0000000..b415760
--- /dev/null
+++ b/nptl/futex-internal.h
@@ -0,0 +1,341 @@ 
+/* futex operations for glibc-internal use.
+   Copyright (C) 2014-2015 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.	 See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef FUTEX_INTERNAL_H
+#define FUTEX_INTERNAL_H
+
+#include <errno.h>
+#include <lowlevellock-futex.h>
+#include <sys/time.h>
+#include <stdlib.h>
+#include <stdbool.h>
+#include <nptl/pthreadP.h>
+#include <libc-internal.h>
+
+/* This file defines futex operations used internally in glibc.  A futex
+   consists of the so-called futex word in userspace, which is of type int
+   and represents an application-specific condition, and kernel state
+   associated with this particular futex word (e.g., wait queues).  The futex
+   operations we provide are wrappers for the futex syscalls and add
+   glibc-specific error checking of the syscall return value.  We abort on
+   error codes that are caused by bugs in glibc or in the calling application,
+   or when an error code is not known.  We return error codes that can arise
+   in correct executions to the caller.  Each operation calls out exactly the
+   return values that callers need to handle.
+
+   The private flag must be either FUTEX_PRIVATE or FUTEX_SHARED.
+
+   We expect callers to only use these operations if futexes are supported.
+
+   Given that waking other threads waiting on a futex involves concurrent
+   accesses to the futex word, you must use atomic operations to access the
+   futex word.
+
+   Due to POSIX requirements on when synchronization data structures such
+   as mutexes or semaphores can be destroyed and due to the futex design
+   having separate fast/slow paths for wake-ups, we need to consider that
+   futex_wake calls might effectively target a data structure that has been
+   destroyed and reused for another object, or unmapped; thus, some
+   errors or spurious wake-ups can happen in correct executions that would
+   not be possible in a program using just a single futex whose lifetime
+   does not end before the program terminates.  For background, see:
+   https://sourceware.org/ml/libc-alpha/2014-04/msg00075.html
+   https://lkml.org/lkml/2014/11/27/472
+
+   We also expect a Linux kernel version of 2.6.22 or more recent (since this
+   version, EINTR is not returned on spurious wake-ups anymore).  */
+
+#define FUTEX_PRIVATE LLL_PRIVATE
+#define FUTEX_SHARED  LLL_SHARED
+
+/* Returns FUTEX_PRIVATE if pshared is zero and private futexes are supported;
+   returns FUTEX_SHARED otherwise.  */
+static __always_inline int
+futex_private_if_supported (int pshared)
+{
+  if (pshared != 0)
+    return FUTEX_SHARED;
+#ifdef __ASSUME_PRIVATE_FUTEX
+  return FUTEX_PRIVATE;
+#else
+  return THREAD_GETMEM (THREAD_SELF, header.private_futex)
+      ^ FUTEX_PRIVATE_FLAG;
+#endif
+}
+
+
+/* Atomically wrt other futex operations on the same futex, this blocks iff
+   the value *FUTEX_WORD matches the expected value.  This is
+   semantically equivalent to:
+     l = <get lock associated with futex> (FUTEX_WORD);
+     wait_flag = <get wait_flag associated with futex> (FUTEX_WORD);
+     lock (l);
+     val = atomic_load_relaxed (FUTEX_WORD);
+     if (val != expected) { unlock (l); return EAGAIN; }
+     atomic_store_relaxed (wait_flag, true);
+     unlock (l);
+     // Now block; can time out in futex_reltimed_wait (see below)
+     while (atomic_load_relaxed(wait_flag) && !<spurious wake-up>);
+
+   Note that no guarantee of a happens-before relation between a woken
+   futex_wait and a futex_wake is documented; however, this does not matter
+   in practice because we have to consider spurious wake-ups (see below),
+   and thus would not be able to reliably reason about which futex_wake woke
+   us.
+
+   Returns 0 if woken by a futex operation or spuriously.  (Note that due to
+   the POSIX requirements mentioned above, we need to conservatively assume
+   that unrelated futex_wake operations could wake this futex; it is easiest
+   to just be prepared for spurious wake-ups.)
+   Returns EAGAIN if the futex word did not match the expected value.
+   Returns EINTR if waiting was interrupted by a signal.
+
+   Note that some previous code in glibc assumed the underlying futex
+   operation (e.g., syscall) to start with or include the equivalent of a
+   seq_cst fence; this allows one to avoid an explicit seq_cst fence before
+   a futex_wait call when synchronizing similar to Dekker synchronization.
+   However, we make no such guarantee here.
+   */
+static __always_inline int
+futex_wait (unsigned int *futex_word, unsigned int expected, int private)
+{
+  int err = lll_futex_timed_wait (futex_word, expected, NULL, private);
+  switch (err)
+    {
+    case 0:
+    case -EAGAIN:
+    case -EINTR:
+      return -err;
+
+    case -ETIMEDOUT: /* Cannot have happened as we provided no timeout.  */
+    case -EFAULT: /* Must have been caused by a glibc or application bug.  */
+    case -EINVAL: /* Either due to wrong alignment or due to the timeout not
+		     being normalized.  Must have been caused by a glibc or
+		     application bug.  */
+    case -ENOSYS: /* Must have been caused by a glibc bug.  */
+    /* No other errors are documented at this time.  */
+    default:
+      abort ();
+    }
+}
+
+/* Like futex_wait but does not provide any indication why we stopped waiting.
+   Thus, when this function returns, you have to always check FUTEX_WORD to
+   determine whether you need to continue waiting, and you cannot detect
+   whether the waiting was interrupted by a signal.  Example use:
+     while (atomic_load_relaxed (&futex_word) == 23)
+       futex_wait_simple (&futex_word, 23, FUTEX_PRIVATE);
+   This is common enough to make providing this wrapper worthwhile.  */
+static __always_inline void
+futex_wait_simple (unsigned int *futex_word, unsigned int expected,
+		   int private)
+{
+  ignore_value (futex_wait (futex_word, expected, private));
+}
+
+/* Like futex_wait but cancelable.  */
+static __always_inline int
+futex_wait_cancelable (unsigned int *futex_word, unsigned int expected,
+		       int private)
+{
+  int oldtype;
+  oldtype = __pthread_enable_asynccancel ();
+  int err = lll_futex_timed_wait (futex_word, expected, NULL, private);
+  __pthread_disable_asynccancel (oldtype);
+  switch (err)
+    {
+    case 0:
+    case -EAGAIN:
+    case -EINTR:
+      return -err;
+
+    case -ETIMEDOUT: /* Cannot have happened as we provided no timeout.  */
+    case -EFAULT: /* Must have been caused by a glibc or application bug.  */
+    case -EINVAL: /* Either due to wrong alignment or due to the timeout not
+		     being normalized.  Must have been caused by a glibc or
+		     application bug.  */
+    case -ENOSYS: /* Must have been caused by a glibc bug.  */
+    /* No other errors are documented at this time.  */
+    default:
+      abort ();
+    }
+}
+
+/* Like futex_wait, but will eventually time out (i.e., stop being
+   blocked) after the duration of time provided (i.e., RELTIME) has
+   passed.  The caller must provide a normalized RELTIME.  RELTIME can also
+   be NULL, in which case this function is equivalent to futex_wait.
+
+   Returns 0 if woken by a futex operation or spuriously.  (Note that due to
+   the POSIX requirements mentioned above, we need to conservatively assume
+   that unrelated futex_wake operations could wake this futex; it is easiest
+   to just be prepared for spurious wake-ups.)
+   Returns EAGAIN if the futex word did not match the expected value.
+   Returns EINTR if waiting was interrupted by a signal.
+   Returns ETIMEDOUT if the timeout expired.
+   */
+static __always_inline int
+futex_reltimed_wait (unsigned int* futex_word, unsigned int expected,
+		     const struct timespec* reltime, int private)
+{
+  int err = lll_futex_timed_wait (futex_word, expected, reltime, private);
+  switch (err)
+    {
+    case 0:
+    case -EAGAIN:
+    case -EINTR:
+    case -ETIMEDOUT:
+      return -err;
+
+    case -EFAULT: /* Must have been caused by a glibc or application bug.  */
+    case -EINVAL: /* Either due to wrong alignment or due to the timeout not
+		     being normalized.  Must have been caused by a glibc or
+		     application bug.  */
+    case -ENOSYS: /* Must have been caused by a glibc bug.  */
+    /* No other errors are documented at this time.  */
+    default:
+      abort ();
+    }
+}
+
+/* Like futex_reltimed_wait, but cancelable.  */
+static __always_inline int
+futex_reltimed_wait_cancelable (unsigned int* futex_word,
+				unsigned int expected,
+			        const struct timespec* reltime, int private)
+{
+  int oldtype;
+  oldtype = __pthread_enable_asynccancel ();
+  int err = lll_futex_timed_wait (futex_word, expected, reltime, private);
+  __pthread_disable_asynccancel (oldtype);
+  switch (err)
+    {
+    case 0:
+    case -EAGAIN:
+    case -EINTR:
+    case -ETIMEDOUT:
+      return -err;
+
+    case -EFAULT: /* Must have been caused by a glibc or application bug.  */
+    case -EINVAL: /* Either due to wrong alignment or due to the timeout not
+		     being normalized.  Must have been caused by a glibc or
+		     application bug.  */
+    case -ENOSYS: /* Must have been caused by a glibc bug.  */
+    /* No other errors are documented at this time.  */
+    default:
+      abort ();
+    }
+}
+
+/* Like futex_reltimed_wait, but the provided timeout (ABSTIME) is an
+   absolute point in time; a call will time out after this point in time.  */
+static __always_inline int
+futex_abstimed_wait (unsigned int* futex_word, unsigned int expected,
+		     const struct timespec* abstime, int private)
+{
+  int err = lll_futex_abstimed_wait (futex_word, expected, abstime, private);
+  switch (err)
+    {
+    case 0:
+    case -EAGAIN:
+    case -EINTR:
+    case -ETIMEDOUT:
+      return -err;
+
+    case -EFAULT: /* Must have been caused by a glibc or application bug.  */
+    case -EINVAL: /* Either due to wrong alignment or due to the timeout not
+		     being normalized.  Must have been caused by a glibc or
+		     application bug.  */
+    case -ENOSYS: /* Must have been caused by a glibc bug.  */
+    /* No other errors are documented at this time.  */
+    default:
+      abort ();
+    }
+}
+
+/* Like futex_abstimed_wait, but cancelable.  */
+static __always_inline int
+futex_abstimed_wait_cancelable (unsigned int* futex_word,
+				unsigned int expected,
+			        const struct timespec* abstime, int private)
+{
+  int oldtype;
+  oldtype = __pthread_enable_asynccancel ();
+  int err = lll_futex_abstimed_wait (futex_word, expected, abstime, private);
+  __pthread_disable_asynccancel (oldtype);
+  switch (err)
+    {
+    case 0:
+    case -EAGAIN:
+    case -EINTR:
+    case -ETIMEDOUT:
+      return -err;
+
+    case -EFAULT: /* Must have been caused by a glibc or application bug.  */
+    case -EINVAL: /* Either due to wrong alignment or due to the timeout not
+		     being normalized.  Must have been caused by a glibc or
+		     application bug.  */
+    case -ENOSYS: /* Must have been caused by a glibc bug.  */
+    /* No other errors are documented at this time.  */
+    default:
+      abort ();
+    }
+}
+
+/* Atomically wrt other futex operations on the same futex, this unblocks the
+   specified number of processes, or all processes blocked on this futex if
+   there are fewer than the specified number.  Semantically, this is
+   equivalent to:
+     l = <get lock associated with futex> (FUTEX_WORD);
+     lock (l);
+     for (res = 0; PROCESSES_TO_WAKE > 0; PROCESSES_TO_WAKE--, res++) {
+       if (<no process blocked on futex>) break;
+       wf = <get wait_flag of a process blocked on futex> (FUTEX_WORD);
+       // No happens-before guarantee with woken futex_wait (see above)
+       atomic_store_relaxed (wf, 0);
+     }
+     return res;
+
+   Note that we need to support futex_wake calls to past futexes whose memory
+   has potentially been reused due to POSIX' requirements on synchronization
+   object destruction (see above); therefore, we must not report or abort
+   on most errors.  */
+static __always_inline void
+futex_wake (unsigned int* futex_word, int processes_to_wake, int private)
+{
+  int res = lll_futex_wake (futex_word, processes_to_wake, private);
+  /* No error.  Ignore the number of woken processes.  */
+  if (res >= 0)
+    return;
+  switch (res)
+    {
+    case -EFAULT: /* Could have happened due to memory reuse.  */
+    case -EINVAL: /* Could be either due to incorrect alignment (a bug in
+		     glibc or in the application) or due to memory being
+		     reused for a PI futex.  We cannot distinguish between the
+		     two causes, and one of them is correct use, so we do not
+		     act in this case.  */
+      return;
+    case -ENOSYS: /* Must have been caused by a glibc bug.  */
+    /* No other errors are documented at this time.  */
+    default:
+      abort ();
+    }
+}
+
+#endif  /* futex-internal.h */
diff --git a/nptl/nptl-init.c b/nptl/nptl-init.c
index 8a51161..c875d3d 100644
--- a/nptl/nptl-init.c
+++ b/nptl/nptl-init.c
@@ -34,6 +34,7 @@ 
 #include <shlib-compat.h>
 #include <smp.h>
 #include <lowlevellock.h>
+#include <futex-internal.h>
 #include <kernel-features.h>
 #include <libc-internal.h>
 #include <pthread-pids.h>
@@ -279,10 +280,10 @@  sighandler_setxid (int sig, siginfo_t *si, void *ctx)
 
   /* And release the futex.  */
   self->setxid_futex = 1;
-  lll_futex_wake (&self->setxid_futex, 1, LLL_PRIVATE);
+  futex_wake ((unsigned int *) &self->setxid_futex, 1, FUTEX_PRIVATE);
 
   if (atomic_decrement_val (&__xidcmd->cntr) == 0)
-    lll_futex_wake (&__xidcmd->cntr, 1, LLL_PRIVATE);
+    futex_wake ((unsigned int *) &__xidcmd->cntr, 1, FUTEX_PRIVATE);
 }
 #endif
 
diff --git a/nptl/pthread_barrier_init.c b/nptl/pthread_barrier_init.c
index 82e13fb..9d09ea6 100644
--- a/nptl/pthread_barrier_init.c
+++ b/nptl/pthread_barrier_init.c
@@ -57,6 +57,8 @@  pthread_barrier_init (barrier, attr, count)
   ibarrier->init_count = count;
   ibarrier->curr_event = 0;
 
+  /* XXX Don't use futex_private_if_supported here as long as there are still
+     assembly implementations that expect the value determined below.  */
 #ifdef __ASSUME_PRIVATE_FUTEX
   ibarrier->private = (iattr->pshared != PTHREAD_PROCESS_PRIVATE
 		       ? 0 : FUTEX_PRIVATE_FLAG);
diff --git a/nptl/pthread_barrier_wait.c b/nptl/pthread_barrier_wait.c
index 9e7090f..b2fed86 100644
--- a/nptl/pthread_barrier_wait.c
+++ b/nptl/pthread_barrier_wait.c
@@ -19,6 +19,7 @@ 
 #include <errno.h>
 #include <sysdep.h>
 #include <lowlevellock.h>
+#include <futex-internal.h>
 #include <pthreadP.h>
 
 
@@ -29,9 +30,12 @@  pthread_barrier_wait (barrier)
 {
   struct pthread_barrier *ibarrier = (struct pthread_barrier *) barrier;
   int result = 0;
+  int lll_private = ibarrier->private ^ FUTEX_PRIVATE_FLAG;
+  int futex_private = (lll_private == LLL_PRIVATE)
+		      ? FUTEX_PRIVATE : FUTEX_SHARED;
 
   /* Make sure we are alone.  */
-  lll_lock (ibarrier->lock, ibarrier->private ^ FUTEX_PRIVATE_FLAG);
+  lll_lock (ibarrier->lock, lll_private);
 
   /* One more arrival.  */
   --ibarrier->left;
@@ -44,8 +48,7 @@  pthread_barrier_wait (barrier)
       ++ibarrier->curr_event;
 
       /* Wake up everybody.  */
-      lll_futex_wake (&ibarrier->curr_event, INT_MAX,
-		      ibarrier->private ^ FUTEX_PRIVATE_FLAG);
+      futex_wake (&ibarrier->curr_event, INT_MAX, futex_private);
 
       /* This is the thread which finished the serialization.  */
       result = PTHREAD_BARRIER_SERIAL_THREAD;
@@ -57,12 +60,11 @@  pthread_barrier_wait (barrier)
       unsigned int event = ibarrier->curr_event;
 
       /* Before suspending, make the barrier available to others.  */
-      lll_unlock (ibarrier->lock, ibarrier->private ^ FUTEX_PRIVATE_FLAG);
+      lll_unlock (ibarrier->lock, lll_private);
 
       /* Wait for the event counter of the barrier to change.  */
       do
-	lll_futex_wait (&ibarrier->curr_event, event,
-			ibarrier->private ^ FUTEX_PRIVATE_FLAG);
+	futex_wait_simple (&ibarrier->curr_event, event, futex_private);
       while (event == ibarrier->curr_event);
     }
 
@@ -72,7 +74,7 @@  pthread_barrier_wait (barrier)
   /* If this was the last woken thread, unlock.  */
   if (atomic_increment_val (&ibarrier->left) == init_count)
     /* We are done.  */
-    lll_unlock (ibarrier->lock, ibarrier->private ^ FUTEX_PRIVATE_FLAG);
+    lll_unlock (ibarrier->lock, lll_private);
 
   return result;
 }
diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
index 71a5619..b23deb2 100644
--- a/nptl/pthread_create.c
+++ b/nptl/pthread_create.c
@@ -31,6 +31,7 @@ 
 #include <kernel-features.h>
 #include <exit-thread.h>
 #include <default-sched.h>
+#include <futex-internal.h>
 
 #include <shlib-compat.h>
 
@@ -269,7 +270,7 @@  START_THREAD_DEFN
 
   /* Allow setxid from now onwards.  */
   if (__glibc_unlikely (atomic_exchange_acq (&pd->setxid_futex, 0) == -2))
-    lll_futex_wake (&pd->setxid_futex, 1, LLL_PRIVATE);
+    futex_wake ((unsigned int *) &pd->setxid_futex, 1, FUTEX_PRIVATE);
 
 #ifdef __NR_set_robust_list
 # ifndef __ASSUME_SET_ROBUST_LIST
@@ -414,7 +415,7 @@  START_THREAD_DEFN
 	  this->__list.__next = NULL;
 
 	  atomic_or (&this->__lock, FUTEX_OWNER_DIED);
-	  lll_futex_wake (&this->__lock, 1, /* XYZ */ LLL_SHARED);
+	  futex_wake (&this->__lock, 1, /* XYZ */ FUTEX_SHARED);
 	}
       while (robust != (void *) &pd->robust_head);
     }
@@ -442,7 +443,12 @@  START_THREAD_DEFN
       /* Some other thread might call any of the setXid functions and expect
 	 us to reply.  In this case wait until we did that.  */
       do
-	lll_futex_wait (&pd->setxid_futex, 0, LLL_PRIVATE);
+	/* XXX This differs from the typical futex_wait_simple pattern in that
+	   the futex_wait condition (setxid_futex) is different from the
+	   condition used in the surrounding loop (cancelhandling).  We need
+	   to check and document why this is correct.  */
+	futex_wait_simple ((unsigned int *) &pd->setxid_futex, 0,
+			   FUTEX_PRIVATE);
       while (pd->cancelhandling & SETXID_BITMASK);
 
       /* Reset the value so that the stack can be reused.  */
@@ -683,7 +689,7 @@  __pthread_create_2_1 (newthread, attr, start_routine, arg)
 	     stillborn thread.  */
 	  if (__glibc_unlikely (atomic_exchange_acq (&pd->setxid_futex, 0)
 				== -2))
-	    lll_futex_wake (&pd->setxid_futex, 1, LLL_PRIVATE);
+	    futex_wake ((unsigned int *) &pd->setxid_futex, 1, FUTEX_PRIVATE);
 
 	  /* Free the resources.  */
 	  __deallocate_stack (pd);
diff --git a/nptl/pthread_once.c b/nptl/pthread_once.c
index fe6d923..642730b 100644
--- a/nptl/pthread_once.c
+++ b/nptl/pthread_once.c
@@ -17,7 +17,7 @@ 
    <http://www.gnu.org/licenses/>.  */
 
 #include "pthreadP.h"
-#include <lowlevellock.h>
+#include <futex-internal.h>
 #include <atomic.h>
 
 
@@ -35,7 +35,7 @@  clear_once_control (void *arg)
      get interrupted (see __pthread_once), so all we need to relay to other
      threads is the state being reset again.  */
   atomic_store_relaxed (once_control, 0);
-  lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE);
+  futex_wake ((unsigned int *) once_control, INT_MAX, FUTEX_PRIVATE);
 }
 
 
@@ -100,8 +100,10 @@  __pthread_once_slow (pthread_once_t *once_control, void (*init_routine) (void))
 	     is set and __PTHREAD_ONCE_DONE is not.  */
 	  if (val == newval)
 	    {
-	      /* Same generation, some other thread was faster. Wait.  */
-	      lll_futex_wait (once_control, newval, LLL_PRIVATE);
+	      /* Same generation, some other thread was faster.  Wait and
+		 retry.  */
+	      futex_wait_simple ((unsigned int *)once_control,
+				 (unsigned int) newval, FUTEX_PRIVATE);
 	      continue;
 	    }
 	}
@@ -122,7 +124,7 @@  __pthread_once_slow (pthread_once_t *once_control, void (*init_routine) (void))
       atomic_store_release (once_control, __PTHREAD_ONCE_DONE);
 
       /* Wake up all other threads.  */
-      lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE);
+      futex_wake ((unsigned int *) once_control, INT_MAX, FUTEX_PRIVATE);
       break;
     }
 
diff --git a/nptl/pthread_rwlock_rdlock.c b/nptl/pthread_rwlock_rdlock.c
index 004a386..aa1593d 100644
--- a/nptl/pthread_rwlock_rdlock.c
+++ b/nptl/pthread_rwlock_rdlock.c
@@ -19,6 +19,7 @@ 
 #include <errno.h>
 #include <sysdep.h>
 #include <lowlevellock.h>
+#include <futex-internal.h>
 #include <pthread.h>
 #include <pthreadP.h>
 #include <stap-probe.h>
@@ -32,6 +33,8 @@  __pthread_rwlock_rdlock_slow (pthread_rwlock_t *rwlock)
 {
   int result = 0;
   bool wake = false;
+  int futex_shared =
+      (rwlock->__data.__shared == LLL_PRIVATE) ? FUTEX_PRIVATE : FUTEX_SHARED;
 
   /* Lock is taken in caller.  */
 
@@ -60,9 +63,10 @@  __pthread_rwlock_rdlock_slow (pthread_rwlock_t *rwlock)
       /* Free the lock.  */
       lll_unlock (rwlock->__data.__lock, rwlock->__data.__shared);
 
-      /* Wait for the writer to finish.  */
-      lll_futex_wait (&rwlock->__data.__readers_wakeup, waitval,
-		      rwlock->__data.__shared);
+      /* Wait for the writer to finish.  We do not check the return value
+	 because we decide how to continue based on the state of the rwlock.  */
+      futex_wait_simple (&rwlock->__data.__readers_wakeup, waitval,
+			 futex_shared);
 
       /* Get the lock.  */
       lll_lock (rwlock->__data.__lock, rwlock->__data.__shared);
@@ -103,8 +107,7 @@  __pthread_rwlock_rdlock_slow (pthread_rwlock_t *rwlock)
   lll_unlock (rwlock->__data.__lock, rwlock->__data.__shared);
 
   if (wake)
-    lll_futex_wake (&rwlock->__data.__readers_wakeup, INT_MAX,
-		    rwlock->__data.__shared);
+    futex_wake (&rwlock->__data.__readers_wakeup, INT_MAX, futex_shared);
 
   return result;
 }
@@ -117,6 +120,8 @@  __pthread_rwlock_rdlock (pthread_rwlock_t *rwlock)
 {
   int result = 0;
   bool wake = false;
+  int futex_shared =
+      (rwlock->__data.__shared == LLL_PRIVATE) ? FUTEX_PRIVATE : FUTEX_SHARED;
 
   LIBC_PROBE (rdlock_entry, 1, rwlock);
 
@@ -164,8 +169,7 @@  __pthread_rwlock_rdlock (pthread_rwlock_t *rwlock)
       lll_unlock (rwlock->__data.__lock, rwlock->__data.__shared);
 
       if (wake)
-	lll_futex_wake (&rwlock->__data.__readers_wakeup, INT_MAX,
-			rwlock->__data.__shared);
+	futex_wake (&rwlock->__data.__readers_wakeup, INT_MAX, futex_shared);
 
       return result;
     }
diff --git a/nptl/pthread_rwlock_timedrdlock.c b/nptl/pthread_rwlock_timedrdlock.c
index 63fb313..207918e 100644
--- a/nptl/pthread_rwlock_timedrdlock.c
+++ b/nptl/pthread_rwlock_timedrdlock.c
@@ -19,10 +19,10 @@ 
 #include <errno.h>
 #include <sysdep.h>
 #include <lowlevellock.h>
+#include <futex-internal.h>
 #include <pthread.h>
 #include <pthreadP.h>
 #include <sys/time.h>
-#include <kernel-features.h>
 #include <stdbool.h>
 
 
@@ -34,6 +34,8 @@  pthread_rwlock_timedrdlock (rwlock, abstime)
 {
   int result = 0;
   bool wake = false;
+  int futex_shared =
+      (rwlock->__data.__shared == LLL_PRIVATE) ? FUTEX_PRIVATE : FUTEX_SHARED;
 
   /* Make sure we are alone.  */
   lll_lock(rwlock->__data.__lock, rwlock->__data.__shared);
@@ -91,38 +93,6 @@  pthread_rwlock_timedrdlock (rwlock, abstime)
 	  break;
 	}
 
-      /* Work around the fact that the kernel rejects negative timeout values
-	 despite them being valid.  */
-      if (__glibc_unlikely (abstime->tv_sec < 0))
-	{
-	  result = ETIMEDOUT;
-	  break;
-	}
-
-#if (!defined __ASSUME_FUTEX_CLOCK_REALTIME \
-     || !defined lll_futex_timed_wait_bitset)
-      /* Get the current time.  So far we support only one clock.  */
-      struct timeval tv;
-      (void) __gettimeofday (&tv, NULL);
-
-      /* Convert the absolute timeout value to a relative timeout.  */
-      struct timespec rt;
-      rt.tv_sec = abstime->tv_sec - tv.tv_sec;
-      rt.tv_nsec = abstime->tv_nsec - tv.tv_usec * 1000;
-      if (rt.tv_nsec < 0)
-	{
-	  rt.tv_nsec += 1000000000;
-	  --rt.tv_sec;
-	}
-      /* Did we already time out?  */
-      if (rt.tv_sec < 0)
-	{
-	  /* Yep, return with an appropriate error.  */
-	  result = ETIMEDOUT;
-	  break;
-	}
-#endif
-
       /* Remember that we are a reader.  */
       if (++rwlock->__data.__nr_readers_queued == 0)
 	{
@@ -137,17 +107,11 @@  pthread_rwlock_timedrdlock (rwlock, abstime)
       /* Free the lock.  */
       lll_unlock (rwlock->__data.__lock, rwlock->__data.__shared);
 
-      /* Wait for the writer to finish.  */
-#if (!defined __ASSUME_FUTEX_CLOCK_REALTIME \
-     || !defined lll_futex_timed_wait_bitset)
-      err = lll_futex_timed_wait (&rwlock->__data.__readers_wakeup,
-				  waitval, &rt, rwlock->__data.__shared);
-#else
-      err = lll_futex_timed_wait_bitset (&rwlock->__data.__readers_wakeup,
-					 waitval, abstime,
-					 FUTEX_CLOCK_REALTIME,
-					 rwlock->__data.__shared);
-#endif
+      /* Wait for the writer to finish.  We handle ETIMEDOUT below; on other
+	 return values, we decide how to continue based on the state of the
+	 rwlock.  */
+      err = futex_abstimed_wait (&rwlock->__data.__readers_wakeup, waitval,
+				 abstime, futex_shared);
 
       /* Get the lock.  */
       lll_lock (rwlock->__data.__lock, rwlock->__data.__shared);
@@ -155,7 +119,7 @@  pthread_rwlock_timedrdlock (rwlock, abstime)
       --rwlock->__data.__nr_readers_queued;
 
       /* Did the futex call time out?  */
-      if (err == -ETIMEDOUT)
+      if (err == ETIMEDOUT)
 	{
 	  /* Yep, report it.  */
 	  result = ETIMEDOUT;
@@ -167,8 +131,7 @@  pthread_rwlock_timedrdlock (rwlock, abstime)
   lll_unlock (rwlock->__data.__lock, rwlock->__data.__shared);
 
   if (wake)
-    lll_futex_wake (&rwlock->__data.__readers_wakeup, INT_MAX,
-		    rwlock->__data.__shared);
+    futex_wake (&rwlock->__data.__readers_wakeup, INT_MAX, futex_shared);
 
   return result;
 }
diff --git a/nptl/pthread_rwlock_timedwrlock.c b/nptl/pthread_rwlock_timedwrlock.c
index c542534..2f30022 100644
--- a/nptl/pthread_rwlock_timedwrlock.c
+++ b/nptl/pthread_rwlock_timedwrlock.c
@@ -19,10 +19,10 @@ 
 #include <errno.h>
 #include <sysdep.h>
 #include <lowlevellock.h>
+#include <futex-internal.h>
 #include <pthread.h>
 #include <pthreadP.h>
 #include <sys/time.h>
-#include <kernel-features.h>
 #include <stdbool.h>
 
 
@@ -34,6 +34,8 @@  pthread_rwlock_timedwrlock (rwlock, abstime)
 {
   int result = 0;
   bool wake_readers = false;
+  int futex_shared =
+      (rwlock->__data.__shared == LLL_PRIVATE) ? FUTEX_PRIVATE : FUTEX_SHARED;
 
   /* Make sure we are alone.  */
   lll_lock (rwlock->__data.__lock, rwlock->__data.__shared);
@@ -71,37 +73,6 @@  pthread_rwlock_timedwrlock (rwlock, abstime)
 	  break;
 	}
 
-      /* Work around the fact that the kernel rejects negative timeout values
-	 despite them being valid.  */
-      if (__glibc_unlikely (abstime->tv_sec < 0))
-	{
-	  result = ETIMEDOUT;
-	  break;
-	}
-
-#if (!defined __ASSUME_FUTEX_CLOCK_REALTIME \
-     || !defined lll_futex_timed_wait_bitset)
-      /* Get the current time.  So far we support only one clock.  */
-      struct timeval tv;
-      (void) __gettimeofday (&tv, NULL);
-
-      /* Convert the absolute timeout value to a relative timeout.  */
-      struct timespec rt;
-      rt.tv_sec = abstime->tv_sec - tv.tv_sec;
-      rt.tv_nsec = abstime->tv_nsec - tv.tv_usec * 1000;
-      if (rt.tv_nsec < 0)
-	{
-	  rt.tv_nsec += 1000000000;
-	  --rt.tv_sec;
-	}
-      /* Did we already time out?  */
-      if (rt.tv_sec < 0)
-	{
-	  result = ETIMEDOUT;
-	  break;
-	}
-#endif
-
       /* Remember that we are a writer.  */
       if (++rwlock->__data.__nr_writers_queued == 0)
 	{
@@ -116,17 +87,11 @@  pthread_rwlock_timedwrlock (rwlock, abstime)
       /* Free the lock.  */
       lll_unlock (rwlock->__data.__lock, rwlock->__data.__shared);
 
-      /* Wait for the writer or reader(s) to finish.  */
-#if (!defined __ASSUME_FUTEX_CLOCK_REALTIME \
-     || !defined lll_futex_timed_wait_bitset)
-      err = lll_futex_timed_wait (&rwlock->__data.__writer_wakeup,
-				  waitval, &rt, rwlock->__data.__shared);
-#else
-      err = lll_futex_timed_wait_bitset (&rwlock->__data.__writer_wakeup,
-					 waitval, abstime,
-					 FUTEX_CLOCK_REALTIME,
-					 rwlock->__data.__shared);
-#endif
+      /* Wait for the writer or reader(s) to finish.  We handle ETIMEDOUT
+	 below; on other return values, we decide how to continue based on
+	 the state of the rwlock.  */
+      err = futex_abstimed_wait (&rwlock->__data.__writer_wakeup, waitval,
+				 abstime, futex_shared);
 
       /* Get the lock.  */
       lll_lock (rwlock->__data.__lock, rwlock->__data.__shared);
@@ -135,7 +100,7 @@  pthread_rwlock_timedwrlock (rwlock, abstime)
       --rwlock->__data.__nr_writers_queued;
 
       /* Did the futex call time out?  */
-      if (err == -ETIMEDOUT)
+      if (err == ETIMEDOUT)
 	{
 	  result = ETIMEDOUT;
 	  /* If we prefer writers, it can have happened that readers blocked
@@ -166,8 +131,7 @@  pthread_rwlock_timedwrlock (rwlock, abstime)
 
   /* Might be required after timeouts.  */
   if (wake_readers)
-    lll_futex_wake (&rwlock->__data.__readers_wakeup, INT_MAX,
-	rwlock->__data.__shared);
+    futex_wake (&rwlock->__data.__readers_wakeup, INT_MAX, futex_shared);
 
   return result;
 }
diff --git a/nptl/pthread_rwlock_tryrdlock.c b/nptl/pthread_rwlock_tryrdlock.c
index cde123f..ee0ab1f 100644
--- a/nptl/pthread_rwlock_tryrdlock.c
+++ b/nptl/pthread_rwlock_tryrdlock.c
@@ -19,6 +19,7 @@ 
 #include <errno.h>
 #include "pthreadP.h"
 #include <lowlevellock.h>
+#include <futex-internal.h>
 #include <elide.h>
 #include <stdbool.h>
 
@@ -28,6 +29,8 @@  __pthread_rwlock_tryrdlock (pthread_rwlock_t *rwlock)
 {
   int result = EBUSY;
   bool wake = false;
+  int futex_shared =
+      (rwlock->__data.__shared == LLL_PRIVATE) ? FUTEX_PRIVATE : FUTEX_SHARED;
 
   if (ELIDE_TRYLOCK (rwlock->__data.__rwelision,
 		     rwlock->__data.__lock == 0
@@ -63,8 +66,7 @@  __pthread_rwlock_tryrdlock (pthread_rwlock_t *rwlock)
   lll_unlock (rwlock->__data.__lock, rwlock->__data.__shared);
 
   if (wake)
-    lll_futex_wake (&rwlock->__data.__readers_wakeup, INT_MAX,
-		    rwlock->__data.__shared);
+    futex_wake (&rwlock->__data.__readers_wakeup, INT_MAX, futex_shared);
 
   return result;
 }
diff --git a/nptl/pthread_rwlock_unlock.c b/nptl/pthread_rwlock_unlock.c
index d2ad4b0..b41c6ba 100644
--- a/nptl/pthread_rwlock_unlock.c
+++ b/nptl/pthread_rwlock_unlock.c
@@ -19,6 +19,7 @@ 
 #include <errno.h>
 #include <sysdep.h>
 #include <lowlevellock.h>
+#include <futex-internal.h>
 #include <pthread.h>
 #include <pthreadP.h>
 #include <stap-probe.h>
@@ -29,6 +30,9 @@ 
 int
 __pthread_rwlock_unlock (pthread_rwlock_t *rwlock)
 {
+  int futex_shared =
+      (rwlock->__data.__shared == LLL_PRIVATE) ? FUTEX_PRIVATE : FUTEX_SHARED;
+
   LIBC_PROBE (rwlock_unlock, 1, rwlock);
 
   if (ELIDE_UNLOCK (rwlock->__data.__writer == 0
@@ -51,16 +55,15 @@  __pthread_rwlock_unlock (pthread_rwlock_t *rwlock)
 	{
 	  ++rwlock->__data.__writer_wakeup;
 	  lll_unlock (rwlock->__data.__lock, rwlock->__data.__shared);
-	  lll_futex_wake (&rwlock->__data.__writer_wakeup, 1,
-			  rwlock->__data.__shared);
+	  futex_wake (&rwlock->__data.__writer_wakeup, 1, futex_shared);
 	  return 0;
 	}
       else if (rwlock->__data.__nr_readers_queued)
 	{
 	  ++rwlock->__data.__readers_wakeup;
 	  lll_unlock (rwlock->__data.__lock, rwlock->__data.__shared);
-	  lll_futex_wake (&rwlock->__data.__readers_wakeup, INT_MAX,
-			  rwlock->__data.__shared);
+	  futex_wake (&rwlock->__data.__readers_wakeup, INT_MAX,
+		      futex_shared);
 	  return 0;
 	}
     }
diff --git a/nptl/pthread_rwlock_wrlock.c b/nptl/pthread_rwlock_wrlock.c
index 835a62f..9c495d8 100644
--- a/nptl/pthread_rwlock_wrlock.c
+++ b/nptl/pthread_rwlock_wrlock.c
@@ -19,6 +19,7 @@ 
 #include <errno.h>
 #include <sysdep.h>
 #include <lowlevellock.h>
+#include <futex-internal.h>
 #include <pthread.h>
 #include <pthreadP.h>
 #include <stap-probe.h>
@@ -30,6 +31,8 @@  static int __attribute__((noinline))
 __pthread_rwlock_wrlock_slow (pthread_rwlock_t *rwlock)
 {
   int result = 0;
+  int futex_shared =
+      (rwlock->__data.__shared == LLL_PRIVATE) ? FUTEX_PRIVATE : FUTEX_SHARED;
 
   /* Caller has taken the lock.  */
 
@@ -58,9 +61,11 @@  __pthread_rwlock_wrlock_slow (pthread_rwlock_t *rwlock)
       /* Free the lock.  */
       lll_unlock (rwlock->__data.__lock, rwlock->__data.__shared);
 
-      /* Wait for the writer or reader(s) to finish.  */
-      lll_futex_wait (&rwlock->__data.__writer_wakeup, waitval,
-		      rwlock->__data.__shared);
+      /* Wait for the writer or reader(s) to finish.  We do not check the
+	 return value because we decide how to continue based on the state of
+	 the rwlock.  */
+      futex_wait_simple (&rwlock->__data.__writer_wakeup, waitval,
+			 futex_shared);
 
       /* Get the lock.  */
       lll_lock (rwlock->__data.__lock, rwlock->__data.__shared);
diff --git a/nptl/sem_init.c b/nptl/sem_init.c
index 575b661..6dbbb64 100644
--- a/nptl/sem_init.c
+++ b/nptl/sem_init.c
@@ -21,22 +21,7 @@ 
 #include <shlib-compat.h>
 #include "semaphoreP.h"
 #include <kernel-features.h>
-
-/* Returns FUTEX_PRIVATE if pshared is zero and private futexes are supported;
-   returns FUTEX_SHARED otherwise.
-   TODO Remove when cleaning up the futex API throughout glibc.  */
-static __always_inline int
-futex_private_if_supported (int pshared)
-{
-  if (pshared != 0)
-    return LLL_SHARED;
-#ifdef __ASSUME_PRIVATE_FUTEX
-  return LLL_PRIVATE;
-#else
-  return THREAD_GETMEM (THREAD_SELF, header.private_futex)
-      ^ FUTEX_PRIVATE_FLAG;
-#endif
-}
+#include <futex-internal.h>
 
 
 int
diff --git a/nptl/sem_open.c b/nptl/sem_open.c
index bfd2dea..2e053ad 100644
--- a/nptl/sem_open.c
+++ b/nptl/sem_open.c
@@ -30,6 +30,7 @@ 
 #include <sys/stat.h>
 #include "semaphoreP.h"
 #include <shm-directory.h>
+#include <futex-internal.h>
 
 
 /* Comparison function for search of existing mapping.  */
@@ -200,7 +201,7 @@  sem_open (const char *name, int oflag, ...)
       sem.newsem.nwaiters = 0;
 #endif
       /* This always is a shared semaphore.  */
-      sem.newsem.private = LLL_SHARED;
+      sem.newsem.private = FUTEX_SHARED;
 
       /* Initialize the remaining bytes as well.  */
       memset ((char *) &sem.initsem + sizeof (struct new_sem), '\0',
diff --git a/nptl/sem_post.c b/nptl/sem_post.c
index b6d30b5..06d8359 100644
--- a/nptl/sem_post.c
+++ b/nptl/sem_post.c
@@ -20,37 +20,13 @@ 
 #include <atomic.h>
 #include <errno.h>
 #include <sysdep.h>
-#include <lowlevellock.h>
+#include <lowlevellock.h>	/* lll_futex* used by the old code.  */
+#include <futex-internal.h>
 #include <internaltypes.h>
 #include <semaphore.h>
 
 #include <shlib-compat.h>
 
-/* Wrapper for lll_futex_wake, with error checking.
-   TODO Remove when cleaning up the futex API throughout glibc.  */
-static __always_inline void
-futex_wake (unsigned int* futex, int processes_to_wake, int private)
-{
-  int res = lll_futex_wake (futex, processes_to_wake, private);
-  /* No error.  Ignore the number of woken processes.  */
-  if (res >= 0)
-    return;
-  switch (res)
-    {
-    case -EFAULT: /* Could have happened due to memory reuse.  */
-    case -EINVAL: /* Could be either due to incorrect alignment (a bug in
-		     glibc or in the application) or due to memory being
-		     reused for a PI futex.  We cannot distinguish between the
-		     two causes, and one of them is correct use, so we do not
-		     act in this case.  */
-      return;
-    case -ENOSYS: /* Must have been caused by a glibc bug.  */
-    /* No other errors are documented at this time.  */
-    default:
-      abort ();
-    }
-}
-
 
 /* See sem_wait for an explanation of the algorithm.  */
 int
diff --git a/nptl/sem_wait.c b/nptl/sem_wait.c
index c1fd10c..fce7ed4 100644
--- a/nptl/sem_wait.c
+++ b/nptl/sem_wait.c
@@ -17,6 +17,7 @@ 
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
+#include <lowlevellock.h>	/* lll_futex* used by the old code.  */
 #include "sem_waitcommon.c"
 
 int
diff --git a/nptl/sem_waitcommon.c b/nptl/sem_waitcommon.c
index 772425d..d3702c7 100644
--- a/nptl/sem_waitcommon.c
+++ b/nptl/sem_waitcommon.c
@@ -20,7 +20,7 @@ 
 #include <kernel-features.h>
 #include <errno.h>
 #include <sysdep.h>
-#include <lowlevellock.h>
+#include <futex-internal.h>
 #include <internaltypes.h>
 #include <semaphore.h>
 #include <sys/time.h>
@@ -29,110 +29,6 @@ 
 #include <shlib-compat.h>
 #include <atomic.h>
 
-/* Wrapper for lll_futex_wait with absolute timeout and error checking.
-   TODO Remove when cleaning up the futex API throughout glibc.  */
-static __always_inline int
-futex_abstimed_wait (unsigned int* futex, unsigned int expected,
-		     const struct timespec* abstime, int private, bool cancel)
-{
-  int err, oldtype;
-  if (abstime == NULL)
-    {
-      if (cancel)
-	oldtype = __pthread_enable_asynccancel ();
-      err = lll_futex_wait (futex, expected, private);
-      if (cancel)
-	__pthread_disable_asynccancel (oldtype);
-    }
-  else
-    {
-#if (defined __ASSUME_FUTEX_CLOCK_REALTIME	\
-     && defined lll_futex_timed_wait_bitset)
-      /* The Linux kernel returns EINVAL for this, but in userspace
-	 such a value is valid.  */
-      if (abstime->tv_sec < 0)
-	return ETIMEDOUT;
-#else
-      struct timeval tv;
-      struct timespec rt;
-      int sec, nsec;
-
-      /* Get the current time.  */
-      __gettimeofday (&tv, NULL);
-
-      /* Compute relative timeout.  */
-      sec = abstime->tv_sec - tv.tv_sec;
-      nsec = abstime->tv_nsec - tv.tv_usec * 1000;
-      if (nsec < 0)
-        {
-          nsec += 1000000000;
-          --sec;
-        }
-
-      /* Already timed out?  */
-      if (sec < 0)
-        return ETIMEDOUT;
-
-      /* Do wait.  */
-      rt.tv_sec = sec;
-      rt.tv_nsec = nsec;
-#endif
-      if (cancel)
-	oldtype = __pthread_enable_asynccancel ();
-#if (defined __ASSUME_FUTEX_CLOCK_REALTIME	\
-     && defined lll_futex_timed_wait_bitset)
-      err = lll_futex_timed_wait_bitset (futex, expected, abstime,
-					 FUTEX_CLOCK_REALTIME, private);
-#else
-      err = lll_futex_timed_wait (futex, expected, &rt, private);
-#endif
-      if (cancel)
-	__pthread_disable_asynccancel (oldtype);
-    }
-  switch (err)
-    {
-    case 0:
-    case -EAGAIN:
-    case -EINTR:
-    case -ETIMEDOUT:
-      return -err;
-
-    case -EFAULT: /* Must have been caused by a glibc or application bug.  */
-    case -EINVAL: /* Either due to wrong alignment or due to the timeout not
-		     being normalized.  Must have been caused by a glibc or
-		     application bug.  */
-    case -ENOSYS: /* Must have been caused by a glibc bug.  */
-    /* No other errors are documented at this time.  */
-    default:
-      abort ();
-    }
-}
-
-/* Wrapper for lll_futex_wake, with error checking.
-   TODO Remove when cleaning up the futex API throughout glibc.  */
-static __always_inline void
-futex_wake (unsigned int* futex, int processes_to_wake, int private)
-{
-  int res = lll_futex_wake (futex, processes_to_wake, private);
-  /* No error.  Ignore the number of woken processes.  */
-  if (res >= 0)
-    return;
-  switch (res)
-    {
-    case -EFAULT: /* Could have happened due to memory reuse.  */
-    case -EINVAL: /* Could be either due to incorrect alignment (a bug in
-		     glibc or in the application) or due to memory being
-		     reused for a PI futex.  We cannot distinguish between the
-		     two causes, and one of them is correct use, so we do not
-		     act in this case.  */
-      return;
-    case -ENOSYS: /* Must have been caused by a glibc bug.  */
-    /* No other errors are documented at this time.  */
-    default:
-      abort ();
-    }
-}
-
 
 /* The semaphore provides two main operations: sem_post adds a token to the
    semaphore; sem_wait grabs a token from the semaphore, potentially waiting
@@ -220,11 +116,12 @@  do_futex_wait (struct new_sem *sem, const struct timespec *abstime)
   int err;
 
 #if __HAVE_64B_ATOMICS
-  err = futex_abstimed_wait ((unsigned int *) &sem->data + SEM_VALUE_OFFSET, 0,
-			     abstime, sem->private, true);
+  err = futex_abstimed_wait_cancelable (
+      (unsigned int *) &sem->data + SEM_VALUE_OFFSET, 0, abstime,
+      sem->private);
 #else
-  err = futex_abstimed_wait (&sem->value, SEM_NWAITERS_MASK, abstime,
-			     sem->private, true);
+  err = futex_abstimed_wait_cancelable (&sem->value, SEM_NWAITERS_MASK,
+					abstime, sem->private);
 #endif
 
   return err;
diff --git a/nptl/unregister-atfork.c b/nptl/unregister-atfork.c
index 3838cb7..6d08ed7 100644
--- a/nptl/unregister-atfork.c
+++ b/nptl/unregister-atfork.c
@@ -20,6 +20,7 @@ 
 #include <stdlib.h>
 #include <fork.h>
 #include <atomic.h>
+#include <futex-internal.h>
 
 
 void
@@ -114,7 +115,7 @@  __unregister_atfork (dso_handle)
       atomic_decrement (&deleted->handler->refcntr);
       unsigned int val;
       while ((val = deleted->handler->refcntr) != 0)
-	lll_futex_wait (&deleted->handler->refcntr, val, LLL_PRIVATE);
+	futex_wait_simple (&deleted->handler->refcntr, val, FUTEX_PRIVATE);
 
       deleted = deleted->next;
     }
diff --git a/sysdeps/nacl/exit-thread.h b/sysdeps/nacl/exit-thread.h
index c809405..a092462 100644
--- a/sysdeps/nacl/exit-thread.h
+++ b/sysdeps/nacl/exit-thread.h
@@ -18,7 +18,7 @@ 
 
 #include <assert.h>
 #include <atomic.h>
-#include <lowlevellock.h>
+#include <futex-internal.h>
 #include <nacl-interfaces.h>
 #include <nptl/pthreadP.h>
 
@@ -64,7 +64,7 @@  __exit_thread (void)
       assert (NACL_EXITING_TID > 0);
 
       atomic_store_relaxed (&pd->tid, NACL_EXITING_TID);
-      lll_futex_wake (&pd->tid, 1, LLL_PRIVATE);
+      futex_wake (&pd->tid, 1, FUTEX_PRIVATE);
     }
 
   /* This clears PD->tid some time after the thread stack can never
diff --git a/sysdeps/nacl/lowlevellock-futex.h b/sysdeps/nacl/lowlevellock-futex.h
index b614ac8..c8fd0d8 100644
--- a/sysdeps/nacl/lowlevellock-futex.h
+++ b/sysdeps/nacl/lowlevellock-futex.h
@@ -63,6 +63,11 @@ 
     -_err;                                                              \
   })
 
+/* Wait until a lll_futex_wake call on FUTEXP, or time ABSTIME has passed.  */
+#define lll_futex_abstimed_wait(futexp, val, abstime, private)   \
+  (- __nacl_irt_futex.futex_wait_abs ((volatile int *) (futexp), \
+				      val, (abstime)))
+
 /* Wake up up to NR waiters on FUTEXP.  */
 #define lll_futex_wake(futexp, nr, private)                     \
   ({                                                            \
diff --git a/sysdeps/nptl/aio_misc.h b/sysdeps/nptl/aio_misc.h
index fb69b0f..672c2ad 100644
--- a/sysdeps/nptl/aio_misc.h
+++ b/sysdeps/nptl/aio_misc.h
@@ -22,14 +22,14 @@ 
 
 #include <assert.h>
 #include <nptl/pthreadP.h>
-#include <lowlevellock.h>
+#include <nptl/futex-internal.h>
 
 #define DONT_NEED_AIO_MISC_COND	1
 
 #define AIO_MISC_NOTIFY(waitlist) \
   do {									      \
     if (*waitlist->counterp > 0 && --*waitlist->counterp == 0)		      \
-      lll_futex_wake (waitlist->counterp, 1, LLL_PRIVATE);		      \
+      futex_wake ((unsigned int *) waitlist->counterp, 1, FUTEX_PRIVATE);     \
   } while (0)
 
 #define AIO_MISC_WAIT(result, futex, timeout, cancel)			      \
@@ -48,9 +48,9 @@ 
 	int status;							      \
 	do								      \
 	  {								      \
-	    status = lll_futex_timed_wait (futexaddr, oldval, timeout,	      \
-					   LLL_PRIVATE);		      \
-	    if (status != -EWOULDBLOCK)					      \
+	    status = futex_reltimed_wait ((unsigned int *) futexaddr,	      \
+					  oldval, timeout, FUTEX_PRIVATE);    \
+	    if (status != EAGAIN)					      \
 	      break;							      \
 									      \
 	    oldval = *futexaddr;					      \
@@ -60,12 +60,12 @@ 
 	if (cancel)							      \
 	  LIBC_CANCEL_RESET (oldtype);					      \
 									      \
-	if (status == -EINTR)						      \
+	if (status == EINTR)						      \
 	  result = EINTR;						      \
-	else if (status == -ETIMEDOUT)					      \
+	else if (status == ETIMEDOUT)					      \
 	  result = EAGAIN;						      \
 	else								      \
-	  assert (status == 0 || status == -EWOULDBLOCK);		      \
+	  assert (status == 0 || status == EAGAIN);			      \
 									      \
 	pthread_mutex_lock (&__aio_requests_mutex);			      \
       }									      \
diff --git a/sysdeps/nptl/fork.c b/sysdeps/nptl/fork.c
index 74482b7..f540bde 100644
--- a/sysdeps/nptl/fork.c
+++ b/sysdeps/nptl/fork.c
@@ -30,6 +30,7 @@ 
 #include <nptl/pthreadP.h>
 #include <fork.h>
 #include <arch-fork.h>
+#include <nptl/futex-internal.h>
 
 
 static void
@@ -219,7 +220,7 @@  __libc_fork (void)
 
 	  if (atomic_decrement_and_test (&allp->handler->refcntr)
 	      && allp->handler->need_signal)
-	    lll_futex_wake (&allp->handler->refcntr, 1, LLL_PRIVATE);
+	    futex_wake (&allp->handler->refcntr, 1, FUTEX_PRIVATE);
 
 	  allp = allp->next;
 	}
diff --git a/sysdeps/nptl/gai_misc.h b/sysdeps/nptl/gai_misc.h
index bb83dca..f8af189 100644
--- a/sysdeps/nptl/gai_misc.h
+++ b/sysdeps/nptl/gai_misc.h
@@ -23,14 +23,14 @@ 
 #include <assert.h>
 #include <signal.h>
 #include <nptl/pthreadP.h>
-#include <lowlevellock.h>
+#include <nptl/futex-internal.h>
 
 #define DONT_NEED_GAI_MISC_COND	1
 
 #define GAI_MISC_NOTIFY(waitlist) \
   do {									      \
     if (*waitlist->counterp > 0 && --*waitlist->counterp == 0)		      \
-      lll_futex_wake (waitlist->counterp, 1, LLL_PRIVATE);		      \
+      futex_wake ((unsigned int *) waitlist->counterp, 1, FUTEX_PRIVATE);     \
   } while (0)
 
 #define GAI_MISC_WAIT(result, futex, timeout, cancel) \
@@ -49,9 +49,9 @@ 
 	int status;							      \
 	do								      \
 	  {								      \
-	    status = lll_futex_timed_wait (futexaddr, oldval, timeout,	      \
-					   LLL_PRIVATE);		      \
-	    if (status != -EWOULDBLOCK)					      \
+	    status = futex_reltimed_wait ((unsigned int *) futexaddr,	      \
+					  oldval, timeout, FUTEX_PRIVATE);    \
+	    if (status != EAGAIN)					      \
 	      break;							      \
 									      \
 	    oldval = *futexaddr;					      \
@@ -61,12 +61,12 @@ 
 	if (cancel)							      \
 	  LIBC_CANCEL_RESET (oldtype);					      \
 									      \
-	if (status == -EINTR)						      \
+	if (status == EINTR)						      \
 	  result = EINTR;						      \
-	else if (status == -ETIMEDOUT)					      \
+	else if (status == ETIMEDOUT)					      \
 	  result = EAGAIN;						      \
 	else								      \
-	  assert (status == 0 || status == -EWOULDBLOCK);		      \
+	  assert (status == 0 || status == EAGAIN);			      \
 									      \
 	pthread_mutex_lock (&__gai_requests_mutex);			      \
       }									      \
diff --git a/sysdeps/nptl/lowlevellock-futex.h b/sysdeps/nptl/lowlevellock-futex.h
index a095ad9..883f144 100644
--- a/sysdeps/nptl/lowlevellock-futex.h
+++ b/sysdeps/nptl/lowlevellock-futex.h
@@ -43,6 +43,11 @@ 
 #define lll_futex_timed_wait(futexp, val, timeout, private)             \
   -ENOSYS
 
+/* Wait until a lll_futex_wake call on FUTEXP, or ABSTIME, a point in time
+   counted by CLOCK_REALTIME, has passed.  ABSTIME must be normalized.  */
+#define lll_futex_abstimed_wait(futexp, val, abstime, private) \
+  -ENOSYS
+
 /* This macro should be defined only if FUTEX_CLOCK_REALTIME is also defined.
    If CLOCKBIT is zero, this is identical to lll_futex_timed_wait.
    If CLOCKBIT has FUTEX_CLOCK_REALTIME set, then it's the same but
diff --git a/sysdeps/unix/sysv/linux/lowlevellock-futex.h b/sysdeps/unix/sysv/linux/lowlevellock-futex.h
index 59f6627..202b7ae 100644
--- a/sysdeps/unix/sysv/linux/lowlevellock-futex.h
+++ b/sysdeps/unix/sysv/linux/lowlevellock-futex.h
@@ -22,6 +22,7 @@ 
 #ifndef __ASSEMBLER__
 #include <sysdep.h>
 #include <tls.h>
+#include <sys/time.h>
 #include <kernel-features.h>
 #endif
 
@@ -92,6 +93,19 @@ 
 		     __lll_private_flag (FUTEX_WAIT, private),  \
 		     val, timeout)
 
+#define lll_futex_abstimed_wait(futexp, val, abstime, private)		    \
+  ({									    \
+    /* Work around the fact that the kernel rejects negative timeout values \
+       despite them being valid.  */					    \
+    int ret;								    \
+    if (__glibc_unlikely (((abstime) != NULL) && ((abstime)->tv_sec < 0)))  \
+      ret = -ETIMEDOUT;							    \
+    else								    \
+      ret = lll_futex_timed_wait_bitset (futexp, val, abstime,		    \
+					 FUTEX_CLOCK_REALTIME, private);    \
+    ret;								    \
+  })
+
 #define lll_futex_timed_wait_bitset(futexp, val, timeout, clockbit, private) \
   lll_futex_syscall (6, futexp,                                         \
 		     __lll_private_flag (FUTEX_WAIT_BITSET | (clockbit), \