[roland/lll-futex] Separate Linuxisms from lowlevellock.h, make a generic one

Message ID 20140702231404.F04F32C3978@topped-with-meat.com
State Committed
Headers

Commit Message

Roland McGrath July 2, 2014, 11:14 p.m. UTC
  This makes lowlevellock.h generic, and splits out the Linux-specific bits
into a new file lowlevellock-futex.h.  Aside from the split, the generic
file is very similar to Bernie's linux-generic version.  (I actually
started with the ARM file and worked from there, but then looked at the
differences from Bernie's file.)

I think Bernie's version introduced a regression that's hard to test for.
In his file, all the macros return the "result" of the syscall, i.e. either
a negated errno code or the non-error return value of the syscall.
However, at least one or two places rely on the non-error return value of
at least some of these macros being exactly zero.  For example,
lll_futex_requeue is expected to return 0, but the syscall when successful
will usually return a positive number.  The failure mode of this regression
is that the code will fall back to thundering-herd wakeups, which is just a
performance regression (but a very important one).  So we need to take
great care in looking for those kinds of changes to the compiled code, not
simply verify that the test suite passes.

Unfortunately, the status quo ante is rather haphazard treatment of the
syscall return value in the different macros.  So cleaning the macros up
causes more changes to the generated code than I'd like.

I'm still looking at the testing (on arm-linux-gnueabihf).  There are some
nontrivial differences I wasn't expecting, and I need to examine them all
more thoroughly.  Some are just mysterious, where the compiler made a
different code-generation choice for no apparent reason (like swapping the
order of operands in a comparison and inverting the associated conditional
branch).  Other are understandable but stupid, where it does more checking
on the syscall return value than it actually cares about (an example of
this costs three or four extra instructions and an extra branch, but that's
after doing a blocking syscall, so it's lost deeply in the noise).  Others
are probably from the new uniformity of the syscall return value handling
but should be harmless.  I need to examine them all before claiming they
don't matter, and I won't get to that until next week.

Joseph is away and I don't expect him to approve the change for ARM very
soon.  After I've finished the testing and if Bernie and others are happy
with the new code, I figure I'll commit the new files next week but not
remove the ARM file.

I note that the AAarch64 file is well nigh identical to the ARM file and
several others probably are too.  So those machines' maintainers could (and
should) remove their files and test the new generic code as soon as they
manage.


2014-07-02  Roland McGrath  <roland@hack.frob.com>

	* sysdeps/nptl/lowlevellock.h: New file.
	* sysdeps/unix/sysv/linux/lowlevellock-futex.h: New file.
	* sysdeps/nptl/lowlevellock-futex.h: New file.
	* sysdeps/unix/sysv/linux/arm/lowlevellock.h: File removed.
  

Comments

Torvald Riegel July 3, 2014, 1:40 p.m. UTC | #1
On Wed, 2014-07-02 at 16:14 -0700, Roland McGrath wrote:
> diff --git a/sysdeps/nptl/lowlevellock-futex.h b/sysdeps/nptl/lowlevellock-futex.h
> new file mode 100644
> index 0000000..12f3876
> --- /dev/null
> +++ b/sysdeps/nptl/lowlevellock-futex.h
> @@ -0,0 +1,86 @@
> +/* Low-level locking access to futex facilities.  Stub version.
> +   Copyright (C) 2014 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library.  If not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#ifndef _LOWLEVELLOCK_FUTEX_H
> +#define _LOWLEVELLOCK_FUTEX_H   1
> +
> +#include <errno.h>
> +
> +
> +/* Values for 'private' parameter of locking macros.  Note pthreadP.h
> +   optimizes for these exact values, though they are not required.  */
> +#define LLL_PRIVATE     0
> +#define LLL_SHARED      128
> +
> +
> +/* For most of these macros, the return value is never really used.
> +   Nevertheless, the protocol is that each one returns a negated errno
> +   code for failure or zero for success.  (Note that the corresponding
> +   Linux system calls can sometimes return positive values for success
> +   cases too.  We never use those values.)  */
> +
> +
> +/* Wait while *FUTEXP == VAL for an lll_futex_wake call on FUTEXP.  */
> +#define lll_futex_wait(futexp, val, private) \
> +  lll_futex_timed_wait (futexp, val, NULL, private)
> +
> +/* Wait until a lll_futex_wake call on FUTEXP, or TIMEOUT elapses.  */
> +#define lll_futex_timed_wait(futexp, val, timeout, private)             \
> +  -ENOSYS

I'd prefer if the comment would explicitly mention spurious wake-ups;
what you write could be implementable but is not precisely how we use
futexes.  Maybe just say that it waits until a real or a spurious
wake-up occurs?

> +/* This macro should be defined only if FUTEX_CLOCK_REALTIME is also defined.
> +   If CLOCKBIT is zero, this is identical to lll_futex_timed_wait.
> +   If CLOCKBIT has FUTEX_CLOCK_REALTIME set, then it's the same but
> +   TIMEOUT is counted by CLOCK_REALTIME rather than CLOCK_MONOTONIC.  */
> +#define lll_futex_timed_wait_bitset(futexp, val, timeout, clockbit, private) \
> +  -ENOSYS
> +
> +/* Wake up up to NR waiters on FUTEXP.  */
> +#define lll_futex_wake(futexp, nr, private)                             \
> +  -ENOSYS
> +
> +/* Wake up up to NR_WAKE waiters on FUTEXP.  Move up to NR_MOVE of the
> +   rest from waiting on FUTEXP to waiting on MUTEX (a different futex).  */
> +#define lll_futex_requeue(futexp, nr_wake, nr_move, mutex, val, private) \
> +  -ENOSYS

Call mutex futexp2 (or something else) instead?  (It's just a parameter
name, but it's also not one of our mutexes, really...)

> +
> +/* Wake up up to NR_WAKE waiters on FUTEXP and NR_WAKE2 on FUTEXP2.  */
> +#define lll_futex_wake_unlock(futexp, nr_wake, nr_wake2, futexp2, private) \
> +  -ENOSYS
> +
> +
> +/* Like lll_futex_wait (FUTEXP, VAL, PRIVATE) but with the expectation
> +   that lll_futex_cmp_requeue_pi (FUTEXP, _, _, MUTEX, _, PRIVATE) will
> +   be used to do the wakeup.  Confers priority-inheritance behavior on
> +   the waiter.  */
> +#define lll_futex_wait_requeue_pi(futexp, val, mutex, private) \
> +  lll_futex_timed_wait_requeue_pi (futexp, val, NULL, 0, mutex, private)
> +
> +/* Like lll_futex_wait_requeue_pi, but with a timeout.  */
> +#define lll_futex_timed_wait_requeue_pi(futexp, val, timeout, clockbit, \
> +                                        mutex, private)                 \
> +  -ENOSYS
> +
> +/* Like lll_futex_requeue, but pairs with lll_futex_wait_requeue_pi
> +   and inherits priority from the waiter.  */
> +#define lll_futex_cmp_requeue_pi(futexp, nr_wake, nr_move, mutex,       \
> +                                 val, private)                          \
> +  -ENOSYS
> +
> +
> +#endif  /* lowlevellock-futex.h */
> diff --git a/sysdeps/nptl/lowlevellock.h b/sysdeps/nptl/lowlevellock.h
> new file mode 100644
> index 0000000..e440bb4
> --- /dev/null
> +++ b/sysdeps/nptl/lowlevellock.h
> @@ -0,0 +1,201 @@
> +/* Low-level lock implementation.  Generic futex-based version.
> +   Copyright (C) 2005-2014 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.	 See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library.  If not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#ifndef _LOWLEVELLOCK_H
> +#define _LOWLEVELLOCK_H	1
> +
> +#include <atomic.h>
> +#include <lowlevellock-futex.h>

I believe it would be good if we could document these later on; now that
there's a generic version, there's (more or less) just one place we'd
need documentation at.

> +#define lll_robust_dead(futexv, private) \
> +  do									      \
> +    {									      \
> +      int *__futexp = &(futexv);					      \
> +      atomic_or (__futexp, FUTEX_OWNER_DIED);				      \
> +      lll_futex_wake (__futexp, 1, private);				      \
> +    }									      \
> +  while (0)
> +
> +#define lll_trylock(lock)	\
> +  atomic_compare_and_exchange_bool_acq (&(lock), 1, 0)
> +
> +#define lll_cond_trylock(lock)	\
> +  atomic_compare_and_exchange_bool_acq (&(lock), 2, 0)
> +
> +#define __lll_robust_trylock(futex, id) \
> +  (atomic_compare_and_exchange_val_acq (futex, id, 0) != 0)

Use atomic_compare_and_exchange_bool_acq (or clean that up later)?

> +#define lll_robust_trylock(lock, id) \
> +  __lll_robust_trylock (&(lock), id)
> +
> +extern void __lll_lock_wait_private (int *futex) attribute_hidden;
> +extern void __lll_lock_wait (int *futex, int private) attribute_hidden;
> +extern int __lll_robust_lock_wait (int *futex, int private) attribute_hidden;
> +
> +/* This is an expression rather than a statement even though its value is
> +   void, so that it can be used in a comma expression or as an expression
> +   that's cast to void.  */
> +#define __lll_lock(futex, private)                                      \
> +  ((void)                                                               \
> +   ({                                                                   \
> +     int *__futex = (futex);                                            \
> +     if (__glibc_unlikely                                               \
> +         (atomic_compare_and_exchange_bool_acq (__futex, 1, 0)))        \
> +       {                                                                \
> +         if (__builtin_constant_p (private) && (private) == LLL_PRIVATE) \
> +           __lll_lock_wait_private (__futex);                           \
> +         else                                                           \
> +           __lll_lock_wait (__futex, private);                          \
> +       }                                                                \
> +   }))
> +#define lll_lock(futex, private)	\
> +  __lll_lock (&(futex), private)
> +
> +
> +#define __lll_robust_lock(futex, id, private)                           \
> +  ({                                                                    \
> +    int *__futex = (futex);                                             \
> +    int __val = 0;                                                      \
> +                                                                        \
> +    if (__glibc_unlikely                                                \
> +        (atomic_compare_and_exchange_bool_acq (__futex, id, 0)))        \
> +      __val = __lll_robust_lock_wait (__futex, private);                \
> +    __val;                                                              \
> +  })
> +#define lll_robust_lock(futex, id, private)     \
> +  __lll_robust_lock (&(futex), id, private)
> +
> +
> +/* This is an expression rather than a statement even though its value is
> +   void, so that it can be used in a comma expression or as an expression
> +   that's cast to void.  */
> +#define __lll_cond_lock(futex, private)                         \
> +  ((void)                                                       \
> +   ({                                                           \
> +     int *__futex = (futex);                                    \
> +     if (__glibc_unlikely (atomic_exchange_acq (__futex, 2)))   \
> +       __lll_lock_wait (__futex, private);                      \
> +   }))
> +#define lll_cond_lock(futex, private) __lll_cond_lock (&(futex), private)
> +
> +
> +#define lll_robust_cond_lock(futex, id, private)	\
> +  __lll_robust_lock (&(futex), (id) | FUTEX_WAITERS, private)
> +
> +
> +extern int __lll_timedlock_wait (int *futex, const struct timespec *,
> +				 int private) attribute_hidden;
> +extern int __lll_robust_timedlock_wait (int *futex, const struct timespec *,
> +					int private) attribute_hidden;
> +
> +#define __lll_timedlock(futex, abstime, private)                \
> +  ({                                                            \
> +    int *__futex = (futex);                                     \
> +    int __val = 0;                                              \
> +                                                                \
> +    if (__glibc_unlikely (atomic_exchange_acq (__futex, 1)))    \
> +      __val = __lll_timedlock_wait (__futex, abstime, private); \
> +    __val;                                                      \
> +  })
> +#define lll_timedlock(futex, abstime, private)  \
> +  __lll_timedlock (&(futex), abstime, private)

This differs from __lll_lock (e.g., CAS vs. atomic_exchange).  I suppose
that resembles the current code we have; nonetheless, I find it
surprising (e.g., lll_lock will not overwrite a value of 2 in the lock
(ie, contended), whereas timedlock will).  That changes whether a
subsequent lll_unlock will do a lll_futex_wake or not.

> +
> +
> +#define __lll_robust_timedlock(futex, abstime, id, private)             \
> +  ({                                                                    \
> +    int *__futex = (futex);                                             \
> +    int __val = 0;                                                      \
> +                                                                        \
> +    if (__glibc_unlikely                                                \
> +        (atomic_compare_and_exchange_bool_acq (__futex, id, 0)))        \
> +      __val = __lll_robust_timedlock_wait (__futex, abstime, private);  \
> +    __val;                                                              \
> +  })
> +#define lll_robust_timedlock(futex, abstime, id, private)       \
> +  __lll_robust_timedlock (&(futex), abstime, id, private)
> +
> +
> +/* This is an expression rather than a statement even though its value is
> +   void, so that it can be used in a comma expression or as an expression
> +   that's cast to void.  */
> +#define __lll_unlock(futex, private)                    \
> +  ((void)                                               \
> +   ({                                                   \
> +     int *__futex = (futex);                            \
> +     int __oldval = atomic_exchange_rel (__futex, 0);   \
> +     if (__glibc_unlikely (__oldval > 1))               \
> +       lll_futex_wake (__futex, 1, private);            \
> +   }))
> +#define lll_unlock(futex, private)	\
> +  __lll_unlock (&(futex), private)
> +
> +
> +/* This is an expression rather than a statement even though its value is
> +   void, so that it can be used in a comma expression or as an expression
> +   that's cast to void.  */
> +#define __lll_robust_unlock(futex, private)             \
> +  ((void)                                               \
> +   ({                                                   \
> +     int *__futex = (futex);                            \
> +     int __oldval = atomic_exchange_rel (__futex, 0);   \
> +     if (__glibc_unlikely (__oldval & FUTEX_WAITERS))	\
> +       lll_futex_wake (__futex, 1, private);            \
> +   }))
> +#define lll_robust_unlock(futex, private)       \
> +  __lll_robust_unlock (&(futex), private)
> +
> +
> +#define lll_islocked(futex) \
> +  ((futex) != LLL_LOCK_INITIALIZER)
> +
> +
> +/* Our internal lock implementation is identical to the binary-compatible
> +   mutex implementation. */
> +
> +/* Initializers for lock.  */
> +#define LLL_LOCK_INITIALIZER		(0)
> +#define LLL_LOCK_INITIALIZER_LOCKED	(1)
> +
> +/* The states of a lock are:
> +    0  -  untaken
> +    1  -  taken by one user

I guess "(not) acquired" is a better term here?

> +   >1  -  taken by more users */

A mutex is never acquired by more than one thread.  IIRC, this state
means that the lock is acquired by one thread, and other threads _might_
be trying to acquire the lock as well (including being blocked on the
futex).

> +
> +/* The kernel notifies a process which uses CLONE_CHILD_CLEARTID via futex
> +   wakeup when the clone terminates.  The memory location contains the
> +   thread ID while the clone is running and is reset to zero
> +   afterwards.	*/
> +#define lll_wait_tid(tid) \
> +  do {					\
> +    __typeof (tid) __tid;		\
> +    while ((__tid = (tid)) != 0)	\
> +      lll_futex_wait (&(tid), __tid, LLL_SHARED);\
> +  } while (0)
> +
> +extern int __lll_timedwait_tid (int *, const struct timespec *)
> +     attribute_hidden;
> +
> +#define lll_timedwait_tid(tid, abstime) \
> +  ({							\
> +    int __res = 0;					\
> +    if ((tid) != 0)					\
> +      __res = __lll_timedwait_tid (&(tid), (abstime));	\
> +    __res;						\
> +  })
> +
> +
> +#endif	/* lowlevellock.h */
  
Roland McGrath July 4, 2014, 3:09 a.m. UTC | #2
> I'd prefer if the comment would explicitly mention spurious wake-ups;
> what you write could be implementable but is not precisely how we use
> futexes.  Maybe just say that it waits until a real or a spurious
> wake-up occurs?

I made a modicum of effort to document the macros' protocols in the stub
file because it's the generally right thing to do.  But I was just barely
cobbling together rough and terse descriptions of what I wasn't quite sure
things actually did.  I'd love it if you could provide detailed replacement
comments for the stub file that explain the contract the genericish NPTL
source code expects from these macros.  Separately, if there's something
useful to say about how the Linux implementation bridges the gap between
that contract and the kernel's exact contract to put into comments in the
Linux file, I'd appreciate that very much as well.

> Call mutex futexp2 (or something else) instead?  (It's just a parameter
> name, but it's also not one of our mutexes, really...)

I took the old ARM file as the model and didn't change most names.
But I can change these however you think looks best.

> I believe it would be good if we could document these later on; now that
> there's a generic version, there's (more or less) just one place we'd
> need documentation at.

Feel free! :-)

> > +#define __lll_robust_trylock(futex, id) \
> > +  (atomic_compare_and_exchange_val_acq (futex, id, 0) != 0)
> 
> Use atomic_compare_and_exchange_bool_acq (or clean that up later)?

I changed a couple of cases that were val in the old ARM file to be bool
where it was obvious to me that the intended semantics were clearly bool
(which did not affect the generated code for ARM).  This one I was less
sure about, but I guess the semantics of that value test are exactly the
same here as the bool version so I should change it too.

> This differs from __lll_lock (e.g., CAS vs. atomic_exchange).  I suppose
> that resembles the current code we have; nonetheless, I find it
> surprising (e.g., lll_lock will not overwrite a value of 2 in the lock
> (ie, contended), whereas timedlock will).  That changes whether a
> subsequent lll_unlock will do a lll_futex_wake or not.

This change is intended not to materially change the compiled code at all
if possible.  Once we have many more machines consolidated on using a
generic implementation, then you should by all means go to town on
substantive cleanups like this.

> > +/* The states of a lock are:
> > +    0  -  untaken
> > +    1  -  taken by one user
> 
> I guess "(not) acquired" is a better term here?
> 
> > +   >1  -  taken by more users */
> 
> A mutex is never acquired by more than one thread.  IIRC, this state
> means that the lock is acquired by one thread, and other threads _might_
> be trying to acquire the lock as well (including being blocked on the
> futex).

Here I just preserved the existing comments from the old code.  Once my
change lands, please send your own changes to do comment cleanups and the
like.  (A pure comment change like this that is indisputably clarifying
things can go in as "obvious enough" without needing formal review.)


Thanks,
Roland
  
Bernard Ogden July 7, 2014, 11:26 a.m. UTC | #3
Ugh - yes, I missed that some syscalls would return non-zero for success.

I have a few comments, 2 about other things I missed before, and 1 about the bug I was trying to fix.

On 3 Jul 2014, at 00:14, Roland McGrath <roland@hack.frob.com> wrote:

> +#define __lll_robust_trylock(futex, id) \
> +  (atomic_compare_and_exchange_val_acq (futex, id, 0) != 0)

My first thought was that the != 0 is superfluous. Either way you return 0 if you changed the memory, and non-zero if you didn't.

But looking at the call in pthread_mutex_trylock.c, I wonder if both patches have this wrong. Should we be returning the old value here? That's what (at least) m68k does, so there's a difference that'll need resolving.

> +#define __lll_cond_lock(futex, private)                         \
> +  ((void)                                                       \
> +   ({                                                           \
> +     int *__futex = (futex);                                    \
> +     if (__glibc_unlikely (atomic_exchange_acq (__futex, 2)))   \
> +       __lll_lock_wait (__futex, private);                      \
> +   }))

This is unconditionally setting the futex to 2, where my patch (based on Tile) sets it to 2 only if it was previously 0. Again, the platforms are inconsistent so it'll need resolving.

> +#define __lll_timedlock(futex, abstime, private)                \
> +  ({                                                            \
> +    int *__futex = (futex);                                     \
> +    int __val = 0;                                              \
> +                                                                \
> +    if (__glibc_unlikely (atomic_exchange_acq (__futex, 1)))    \
> +      __val = __lll_timedlock_wait (__futex, abstime, private); \
> +    __val;                                                      \
> +  })

This'll spread BZ 16892 (and be a binary difference on the currently unaffected platforms).

Should be something like:

if (__glibc_unlikely (atomic_compare_and_exchange_bool_acq (__futex, 1, 0)))
  
Andreas Schwab July 7, 2014, 12:03 p.m. UTC | #4
Bernard Ogden <bernie.ogden@linaro.org> writes:

> On 3 Jul 2014, at 00:14, Roland McGrath <roland@hack.frob.com> wrote:
>
>> +#define __lll_robust_trylock(futex, id) \
>> +  (atomic_compare_and_exchange_val_acq (futex, id, 0) != 0)
>
> My first thought was that the != 0 is superfluous. Either way you return 0
> if you changed the memory, and non-zero if you didn't.
>
> But looking at the call in pthread_mutex_trylock.c, I wonder if both
> patches have this wrong. Should we be returning the old value here? That's
> what (at least) m68k does, so there's a difference that'll need resolving.

Looks like x86 and powerpc do that too, and all other archs got this
wrong.

Andreas.
  
Roland McGrath July 9, 2014, 9:54 p.m. UTC | #5
> I have a few comments, 2 about other things I missed before, and 1 about
> the bug I was trying to fix.

Thanks very much for the review.

My inclination is to start out with a version like mine that doesn't change
any code from what it does now on at least one platform.  (I chose ARM just
because that's what I'm working on first for the non-Linux port where this
refactoring matters.)  Once it's in and being used by ARM (and/or one or
more other machines where the code matches today), then we can do these
other fixes incrementally.  That is: first refactor without change; then fix.

> On 3 Jul 2014, at 00:14, Roland McGrath <roland@hack.frob.com> wrote:
> 
> > +#define __lll_robust_trylock(futex, id) \
> > +  (atomic_compare_and_exchange_val_acq (futex, id, 0) != 0)
> 
> My first thought was that the != 0 is superfluous. Either way you return
> 0 if you changed the memory, and non-zero if you didn't.
> 
> But looking at the call in pthread_mutex_trylock.c, I wonder if both
> patches have this wrong. Should we be returning the old value here?
> That's what (at least) m68k does, so there's a difference that'll need
> resolving.

Indeed this seems clearly wrong.  Unfortunately it looks to me like the
scenario where it makes a difference can only come up in a race.  The logic
in pthread_mutex_trylock is somewhat intricate, so I'm not completely sure.
Can you see a way to come up with a (non-racy) regression test for this?

I've looked at all the machines' implementations of lll_robust_trylock.
Most look just like ARM, i.e. using atomic_compare_and_exchange_val_acq
and comparing against zero.

s390 open-codes in assembly exactly what the
atomic_compare_and_exchange_val_acq would do, and then does the
erroneous comparison in C.

sh open-codes in assembly exactly what the
atomic_compare_and_exchange_val_acq along with the erroneous comparison
would do.

i386, x86_64, and powerpc open-code in assembly exactly what the
atomic_compare_and_exchange_val_acq would do, and don't do the erroneous
comparison.

m68k uses atomic_compare_and_exchange_val_acq without the erroneous
comparison.

The sole use of lll_robust_trylock is in pthread_mutex_trylock, right
after code that just uses atomic_compare_and_exchange_val_acq directly
on the same memory location.

So I think what we should really do is just remove lll_robust_trylock
entirely.  It doesn't let the machine encapsulate any additional magic,
because the generic code already assumes exactly what the use of the
location has to be.  This could be done either before or after my patch.

> > +#define __lll_cond_lock(futex, private)                         \
> > +  ((void)                                                       \
> > +   ({                                                           \
> > +     int *__futex = (futex);                                    \
> > +     if (__glibc_unlikely (atomic_exchange_acq (__futex, 2)))   \
> > +       __lll_lock_wait (__futex, private);                      \
> > +   }))
> 
> This is unconditionally setting the futex to 2, where my patch (based
> on Tile) sets it to 2 only if it was previously 0. Again, the
> platforms are inconsistent so it'll need resolving.

The lay of the land is pretty similar for this case.  Many have exactly
equivalent code.  Many others have code that uses one of
atomic_compare_and_exchange_{bool,val}_acq and so match the behavior of
yours rather than mine.  A few others open-code assembly that is
equivalent to one or the other.

Unlike lll_robust_trylock, there is one outlier: sparc is almost the
same (as yours), but uses atomic_compare_and_exchange_val_24_acq (a
special sparcism).  So it's not a candidate for removing entirely even
though it too has only one user (pthread_mutex_cond_lock.c).

Off hand I don't understand the use here well enough to be completely
sure which is correct.  In the contended case, they all get to
__lll_lock_wait.  The generic C and the x86 assembly versions of that do
the equivalent of the unconditional atomic_exchange_acq.  I suspect that
is actually what's correct: if it was 0 before that's the
unlocked/uncontended case; if it was 1 before, your code leaves it 1
(until it gets into __lll_lock_wait, anyway) but there is now
contention, so it should be 2; when it gets into __lll_lock_wait, that
will do an unconditional atomic_exchange_acq anyway.

Given this control flow, I think it's impossible to write any test that
can tell the difference.  Your version is just going to be slightly
suboptimal in the contended case, because it will skip installing 2 and
then do an additional atomic operation in __lll_lock_wait to install it
before going into the kernel.

So we just have to reason it out.  Do you agree with my reasoning above?

I've updated my version on the branch to have an explicit != 0 in the
comparison to make it slightly more clear.  (All this stuff could use
vastly more comments.)

> > +#define __lll_timedlock(futex, abstime, private)                \
> > +  ({                                                            \
> > +    int *__futex = (futex);                                     \
> > +    int __val = 0;                                              \
> > +                                                                \
> > +    if (__glibc_unlikely (atomic_exchange_acq (__futex, 1)))    \
> > +      __val = __lll_timedlock_wait (__futex, abstime, private); \
> > +    __val;                                                      \
> > +  })
> 
> This'll spread BZ 16892 (and be a binary difference on the currently
> unaffected platforms).
> 
> Should be something like:
> 
> if (__glibc_unlikely (atomic_compare_and_exchange_bool_acq (__futex, 1, 0)))

OK.  I concur with the rationale described in the bug.  But as I said
above, I'm inclined to separate that fix to be done after the pure
refactoring has gone in.

I think what I'll do now is proceed with working up a change to
eliminate lll_robust_trylock in the status quo ante.  I'm going on the
assumption that we'll conclude it's impossible to test for that issue,
but please do put your mind to coming up with a useful test case (or
else contributing to the belief that it can't be done).

That will leave just the latter two issues to be resolved after the
refactoring goes in.  For lll_cond_lock, I think what we want is no
change from my version (and the status quo ante for ARM), but that needs
to be confirmed by other people contemplating the logic.  For
lll_timedlock, I do think your change is correct and we'll do it after.

Actually changing ARM has to wait for Joseph to come back next week and
approve it.  By the time he's back, I'll be travelling for Cauldron and
then I can't be relied on to do anything before the 28th (though I might
well do some things while travelling).

I pointed out earlier that the aarch64 file is nearly identical to the
arm file.  So if Marcus is around now and able to review and approve
before Joseph is back, we could make aarch64 the first to use the new
generic file instead of arm.  (I can do a build for aarch64 and verify
that it has no effect on the compiled code.)  Then I think we could
proceed with the follow-on fixes and let ARM and other machines drop
their now-redundant files at their maintainers' leisure to pick up those
fixes.

Assuming nobody tells me why any of this is a bad idea, I'll proceed in
that direction now.


Thanks,
Roland
  
Bernard Ogden July 14, 2014, 2:55 p.m. UTC | #6
On 9 Jul 2014, at 22:54, Roland McGrath <roland@hack.frob.com> wrote:

>> On 3 Jul 2014, at 00:14, Roland McGrath <roland@hack.frob.com> wrote:
>> 
>>> +#define __lll_robust_trylock(futex, id) \
>>> +  (atomic_compare_and_exchange_val_acq (futex, id, 0) != 0)
>> 
>> My first thought was that the != 0 is superfluous. Either way you return
>> 0 if you changed the memory, and non-zero if you didn't.
>> 
>> But looking at the call in pthread_mutex_trylock.c, I wonder if both
>> patches have this wrong. Should we be returning the old value here?
>> That's what (at least) m68k does, so there's a difference that'll need
>> resolving.
> 
> Indeed this seems clearly wrong.  Unfortunately it looks to me like the
> scenario where it makes a difference can only come up in a race.  The logic
> in pthread_mutex_trylock is somewhat intricate, so I'm not completely sure.
> Can you see a way to come up with a (non-racy) regression test for this?

To test that particular case, it looks like we'd have to force a thread to die in an unclean way in the middle of the function. I don't think we can do that.

Perhaps there should be a test that we get EOWNERDEAD if a lock holder dies in an unclean way before the call to __pthread_mutex_trylock - I didn't see one and I guess that would be doable with a pair of threads. But that's getting further away from issues relating to this patch.

> 
>>> +#define __lll_cond_lock(futex, private)                         \
>>> +  ((void)                                                       \
>>> +   ({                                                           \
>>> +     int *__futex = (futex);                                    \
>>> +     if (__glibc_unlikely (atomic_exchange_acq (__futex, 2)))   \
>>> +       __lll_lock_wait (__futex, private);                      \
>>> +   }))
>> 
>> This is unconditionally setting the futex to 2, where my patch (based
>> on Tile) sets it to 2 only if it was previously 0. Again, the
>> platforms are inconsistent so it'll need resolving.
> 
> The lay of the land is pretty similar for this case.  Many have exactly
> equivalent code.  Many others have code that uses one of
> atomic_compare_and_exchange_{bool,val}_acq and so match the behavior of
> yours rather than mine.  A few others open-code assembly that is
> equivalent to one or the other.
> 
> Unlike lll_robust_trylock, there is one outlier: sparc is almost the
> same (as yours), but uses atomic_compare_and_exchange_val_24_acq (a
> special sparcism).  So it's not a candidate for removing entirely even
> though it too has only one user (pthread_mutex_cond_lock.c).
> 
> Off hand I don't understand the use here well enough to be completely
> sure which is correct.  In the contended case, they all get to
> __lll_lock_wait.  The generic C and the x86 assembly versions of that do
> the equivalent of the unconditional atomic_exchange_acq.  I suspect that
> is actually what's correct: if it was 0 before that's the
> unlocked/uncontended case; if it was 1 before, your code leaves it 1
> (until it gets into __lll_lock_wait, anyway) but there is now
> contention, so it should be 2; when it gets into __lll_lock_wait, that
> will do an unconditional atomic_exchange_acq anyway.
> 
> Given this control flow, I think it's impossible to write any test that
> can tell the difference.  Your version is just going to be slightly
> suboptimal in the contended case, because it will skip installing 2 and
> then do an additional atomic operation in __lll_lock_wait to install it
> before going into the kernel.
> 
> So we just have to reason it out.  Do you agree with my reasoning above?

I only learned about condition variables at all through looking at this patch, but I think your reasoning is correct.

> 
>>> +#define __lll_timedlock(futex, abstime, private)                \
>>> +  ({                                                            \
>>> +    int *__futex = (futex);                                     \
>>> +    int __val = 0;                                              \
>>> +                                                                \
>>> +    if (__glibc_unlikely (atomic_exchange_acq (__futex, 1)))    \
>>> +      __val = __lll_timedlock_wait (__futex, abstime, private); \
>>> +    __val;                                                      \
>>> +  })
>> 
>> This'll spread BZ 16892 (and be a binary difference on the currently
>> unaffected platforms).
>> 
>> Should be something like:
>> 
>> if (__glibc_unlikely (atomic_compare_and_exchange_bool_acq (__futex, 1, 0)))
> 
> OK.  I concur with the rationale described in the bug.  But as I said
> above, I'm inclined to separate that fix to be done after the pure
> refactoring has gone in.

That makes sense.

So in summary:
__lll_robust_trylock: Has already been removed, we can't test.
__lll_cond_lock: Your patch looks fine.
__lll_timedlock: To be fixed once the refactoring has gone in.
  
Bernard Ogden July 15, 2014, 7:38 a.m. UTC | #7
> Perhaps there should be a test that we get EOWNERDEAD if a lock holder dies in an unclean way before the call to __pthread_mutex_trylock - I didn't see one and I guess that would be doable with a pair of threads. But that's getting further away from issues relating to this patch.

Actually there is already a test for EOWNERDEAD (tst-robust2 and
tst-robust5). Just not in the racy case.
  

Patch

diff --git a/sysdeps/nptl/lowlevellock-futex.h b/sysdeps/nptl/lowlevellock-futex.h
new file mode 100644
index 0000000..12f3876
--- /dev/null
+++ b/sysdeps/nptl/lowlevellock-futex.h
@@ -0,0 +1,86 @@ 
+/* Low-level locking access to futex facilities.  Stub version.
+   Copyright (C) 2014 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _LOWLEVELLOCK_FUTEX_H
+#define _LOWLEVELLOCK_FUTEX_H   1
+
+#include <errno.h>
+
+
+/* Values for 'private' parameter of locking macros.  Note pthreadP.h
+   optimizes for these exact values, though they are not required.  */
+#define LLL_PRIVATE     0
+#define LLL_SHARED      128
+
+
+/* For most of these macros, the return value is never really used.
+   Nevertheless, the protocol is that each one returns a negated errno
+   code for failure or zero for success.  (Note that the corresponding
+   Linux system calls can sometimes return positive values for success
+   cases too.  We never use those values.)  */
+
+
+/* Wait while *FUTEXP == VAL for an lll_futex_wake call on FUTEXP.  */
+#define lll_futex_wait(futexp, val, private) \
+  lll_futex_timed_wait (futexp, val, NULL, private)
+
+/* Wait until a lll_futex_wake call on FUTEXP, or TIMEOUT elapses.  */
+#define lll_futex_timed_wait(futexp, val, timeout, private)             \
+  -ENOSYS
+
+/* This macro should be defined only if FUTEX_CLOCK_REALTIME is also defined.
+   If CLOCKBIT is zero, this is identical to lll_futex_timed_wait.
+   If CLOCKBIT has FUTEX_CLOCK_REALTIME set, then it's the same but
+   TIMEOUT is counted by CLOCK_REALTIME rather than CLOCK_MONOTONIC.  */
+#define lll_futex_timed_wait_bitset(futexp, val, timeout, clockbit, private) \
+  -ENOSYS
+
+/* Wake up up to NR waiters on FUTEXP.  */
+#define lll_futex_wake(futexp, nr, private)                             \
+  -ENOSYS
+
+/* Wake up up to NR_WAKE waiters on FUTEXP.  Move up to NR_MOVE of the
+   rest from waiting on FUTEXP to waiting on MUTEX (a different futex).  */
+#define lll_futex_requeue(futexp, nr_wake, nr_move, mutex, val, private) \
+  -ENOSYS
+
+/* Wake up up to NR_WAKE waiters on FUTEXP and NR_WAKE2 on FUTEXP2.  */
+#define lll_futex_wake_unlock(futexp, nr_wake, nr_wake2, futexp2, private) \
+  -ENOSYS
+
+
+/* Like lll_futex_wait (FUTEXP, VAL, PRIVATE) but with the expectation
+   that lll_futex_cmp_requeue_pi (FUTEXP, _, _, MUTEX, _, PRIVATE) will
+   be used to do the wakeup.  Confers priority-inheritance behavior on
+   the waiter.  */
+#define lll_futex_wait_requeue_pi(futexp, val, mutex, private) \
+  lll_futex_timed_wait_requeue_pi (futexp, val, NULL, 0, mutex, private)
+
+/* Like lll_futex_wait_requeue_pi, but with a timeout.  */
+#define lll_futex_timed_wait_requeue_pi(futexp, val, timeout, clockbit, \
+                                        mutex, private)                 \
+  -ENOSYS
+
+/* Like lll_futex_requeue, but pairs with lll_futex_wait_requeue_pi
+   and inherits priority from the waiter.  */
+#define lll_futex_cmp_requeue_pi(futexp, nr_wake, nr_move, mutex,       \
+                                 val, private)                          \
+  -ENOSYS
+
+
+#endif  /* lowlevellock-futex.h */
diff --git a/sysdeps/nptl/lowlevellock.h b/sysdeps/nptl/lowlevellock.h
new file mode 100644
index 0000000..e440bb4
--- /dev/null
+++ b/sysdeps/nptl/lowlevellock.h
@@ -0,0 +1,201 @@ 
+/* Low-level lock implementation.  Generic futex-based version.
+   Copyright (C) 2005-2014 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.	 See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _LOWLEVELLOCK_H
+#define _LOWLEVELLOCK_H	1
+
+#include <atomic.h>
+#include <lowlevellock-futex.h>
+
+#define lll_robust_dead(futexv, private) \
+  do									      \
+    {									      \
+      int *__futexp = &(futexv);					      \
+      atomic_or (__futexp, FUTEX_OWNER_DIED);				      \
+      lll_futex_wake (__futexp, 1, private);				      \
+    }									      \
+  while (0)
+
+#define lll_trylock(lock)	\
+  atomic_compare_and_exchange_bool_acq (&(lock), 1, 0)
+
+#define lll_cond_trylock(lock)	\
+  atomic_compare_and_exchange_bool_acq (&(lock), 2, 0)
+
+#define __lll_robust_trylock(futex, id) \
+  (atomic_compare_and_exchange_val_acq (futex, id, 0) != 0)
+#define lll_robust_trylock(lock, id) \
+  __lll_robust_trylock (&(lock), id)
+
+extern void __lll_lock_wait_private (int *futex) attribute_hidden;
+extern void __lll_lock_wait (int *futex, int private) attribute_hidden;
+extern int __lll_robust_lock_wait (int *futex, int private) attribute_hidden;
+
+/* This is an expression rather than a statement even though its value is
+   void, so that it can be used in a comma expression or as an expression
+   that's cast to void.  */
+#define __lll_lock(futex, private)                                      \
+  ((void)                                                               \
+   ({                                                                   \
+     int *__futex = (futex);                                            \
+     if (__glibc_unlikely                                               \
+         (atomic_compare_and_exchange_bool_acq (__futex, 1, 0)))        \
+       {                                                                \
+         if (__builtin_constant_p (private) && (private) == LLL_PRIVATE) \
+           __lll_lock_wait_private (__futex);                           \
+         else                                                           \
+           __lll_lock_wait (__futex, private);                          \
+       }                                                                \
+   }))
+#define lll_lock(futex, private)	\
+  __lll_lock (&(futex), private)
+
+
+#define __lll_robust_lock(futex, id, private)                           \
+  ({                                                                    \
+    int *__futex = (futex);                                             \
+    int __val = 0;                                                      \
+                                                                        \
+    if (__glibc_unlikely                                                \
+        (atomic_compare_and_exchange_bool_acq (__futex, id, 0)))        \
+      __val = __lll_robust_lock_wait (__futex, private);                \
+    __val;                                                              \
+  })
+#define lll_robust_lock(futex, id, private)     \
+  __lll_robust_lock (&(futex), id, private)
+
+
+/* This is an expression rather than a statement even though its value is
+   void, so that it can be used in a comma expression or as an expression
+   that's cast to void.  */
+#define __lll_cond_lock(futex, private)                         \
+  ((void)                                                       \
+   ({                                                           \
+     int *__futex = (futex);                                    \
+     if (__glibc_unlikely (atomic_exchange_acq (__futex, 2)))   \
+       __lll_lock_wait (__futex, private);                      \
+   }))
+#define lll_cond_lock(futex, private) __lll_cond_lock (&(futex), private)
+
+
+#define lll_robust_cond_lock(futex, id, private)	\
+  __lll_robust_lock (&(futex), (id) | FUTEX_WAITERS, private)
+
+
+extern int __lll_timedlock_wait (int *futex, const struct timespec *,
+				 int private) attribute_hidden;
+extern int __lll_robust_timedlock_wait (int *futex, const struct timespec *,
+					int private) attribute_hidden;
+
+#define __lll_timedlock(futex, abstime, private)                \
+  ({                                                            \
+    int *__futex = (futex);                                     \
+    int __val = 0;                                              \
+                                                                \
+    if (__glibc_unlikely (atomic_exchange_acq (__futex, 1)))    \
+      __val = __lll_timedlock_wait (__futex, abstime, private); \
+    __val;                                                      \
+  })
+#define lll_timedlock(futex, abstime, private)  \
+  __lll_timedlock (&(futex), abstime, private)
+
+
+#define __lll_robust_timedlock(futex, abstime, id, private)             \
+  ({                                                                    \
+    int *__futex = (futex);                                             \
+    int __val = 0;                                                      \
+                                                                        \
+    if (__glibc_unlikely                                                \
+        (atomic_compare_and_exchange_bool_acq (__futex, id, 0)))        \
+      __val = __lll_robust_timedlock_wait (__futex, abstime, private);  \
+    __val;                                                              \
+  })
+#define lll_robust_timedlock(futex, abstime, id, private)       \
+  __lll_robust_timedlock (&(futex), abstime, id, private)
+
+
+/* This is an expression rather than a statement even though its value is
+   void, so that it can be used in a comma expression or as an expression
+   that's cast to void.  */
+#define __lll_unlock(futex, private)                    \
+  ((void)                                               \
+   ({                                                   \
+     int *__futex = (futex);                            \
+     int __oldval = atomic_exchange_rel (__futex, 0);   \
+     if (__glibc_unlikely (__oldval > 1))               \
+       lll_futex_wake (__futex, 1, private);            \
+   }))
+#define lll_unlock(futex, private)	\
+  __lll_unlock (&(futex), private)
+
+
+/* This is an expression rather than a statement even though its value is
+   void, so that it can be used in a comma expression or as an expression
+   that's cast to void.  */
+#define __lll_robust_unlock(futex, private)             \
+  ((void)                                               \
+   ({                                                   \
+     int *__futex = (futex);                            \
+     int __oldval = atomic_exchange_rel (__futex, 0);   \
+     if (__glibc_unlikely (__oldval & FUTEX_WAITERS))	\
+       lll_futex_wake (__futex, 1, private);            \
+   }))
+#define lll_robust_unlock(futex, private)       \
+  __lll_robust_unlock (&(futex), private)
+
+
+#define lll_islocked(futex) \
+  ((futex) != LLL_LOCK_INITIALIZER)
+
+
+/* Our internal lock implementation is identical to the binary-compatible
+   mutex implementation. */
+
+/* Initializers for lock.  */
+#define LLL_LOCK_INITIALIZER		(0)
+#define LLL_LOCK_INITIALIZER_LOCKED	(1)
+
+/* The states of a lock are:
+    0  -  untaken
+    1  -  taken by one user
+   >1  -  taken by more users */
+
+/* The kernel notifies a process which uses CLONE_CHILD_CLEARTID via futex
+   wakeup when the clone terminates.  The memory location contains the
+   thread ID while the clone is running and is reset to zero
+   afterwards.	*/
+#define lll_wait_tid(tid) \
+  do {					\
+    __typeof (tid) __tid;		\
+    while ((__tid = (tid)) != 0)	\
+      lll_futex_wait (&(tid), __tid, LLL_SHARED);\
+  } while (0)
+
+extern int __lll_timedwait_tid (int *, const struct timespec *)
+     attribute_hidden;
+
+#define lll_timedwait_tid(tid, abstime) \
+  ({							\
+    int __res = 0;					\
+    if ((tid) != 0)					\
+      __res = __lll_timedwait_tid (&(tid), (abstime));	\
+    __res;						\
+  })
+
+
+#endif	/* lowlevellock.h */
diff --git a/sysdeps/unix/sysv/linux/arm/lowlevellock.h b/sysdeps/unix/sysv/linux/arm/lowlevellock.h
deleted file mode 100644
index 5d19434..0000000
--- a/sysdeps/unix/sysv/linux/arm/lowlevellock.h
+++ /dev/null
@@ -1,321 +0,0 @@ 
-/* Copyright (C) 2005-2014 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.	 See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library.  If not, see
-   <http://www.gnu.org/licenses/>.  */
-
-#ifndef _LOWLEVELLOCK_H
-#define _LOWLEVELLOCK_H	1
-
-#include <time.h>
-#include <sys/param.h>
-#include <bits/pthreadtypes.h>
-#include <atomic.h>
-#include <sysdep.h>
-#include <kernel-features.h>
-
-#define FUTEX_WAIT		0
-#define FUTEX_WAKE		1
-#define FUTEX_REQUEUE		3
-#define FUTEX_CMP_REQUEUE	4
-#define FUTEX_WAKE_OP		5
-#define FUTEX_OP_CLEAR_WAKE_IF_GT_ONE	((4 << 24) | 1)
-#define FUTEX_LOCK_PI		6
-#define FUTEX_UNLOCK_PI		7
-#define FUTEX_TRYLOCK_PI	8
-#define FUTEX_WAIT_BITSET	9
-#define FUTEX_WAKE_BITSET	10
-#define FUTEX_WAIT_REQUEUE_PI   11
-#define FUTEX_CMP_REQUEUE_PI    12
-#define FUTEX_PRIVATE_FLAG	128
-#define FUTEX_CLOCK_REALTIME	256
-
-#define FUTEX_BITSET_MATCH_ANY	0xffffffff
-
-/* Values for 'private' parameter of locking macros.  Yes, the
-   definition seems to be backwards.  But it is not.  The bit will be
-   reversed before passing to the system call.  */
-#define LLL_PRIVATE	0
-#define LLL_SHARED	FUTEX_PRIVATE_FLAG
-
-
-#if !defined NOT_IN_libc || defined IS_IN_rtld
-/* In libc.so or ld.so all futexes are private.  */
-# ifdef __ASSUME_PRIVATE_FUTEX
-#  define __lll_private_flag(fl, private) \
-  ((fl) | FUTEX_PRIVATE_FLAG)
-# else
-#  define __lll_private_flag(fl, private) \
-  ((fl) | THREAD_GETMEM (THREAD_SELF, header.private_futex))
-# endif
-#else
-# ifdef __ASSUME_PRIVATE_FUTEX
-#  define __lll_private_flag(fl, private) \
-  (((fl) | FUTEX_PRIVATE_FLAG) ^ (private))
-# else
-#  define __lll_private_flag(fl, private) \
-  (__builtin_constant_p (private)					      \
-   ? ((private) == 0							      \
-      ? ((fl) | THREAD_GETMEM (THREAD_SELF, header.private_futex))	      \
-      : (fl))								      \
-   : ((fl) | (((private) ^ FUTEX_PRIVATE_FLAG)				      \
-	      & THREAD_GETMEM (THREAD_SELF, header.private_futex))))
-# endif
-#endif
-
-
-#define lll_futex_wait(futexp, val, private) \
-  lll_futex_timed_wait(futexp, val, NULL, private)
-
-#define lll_futex_timed_wait(futexp, val, timespec, private) \
-  ({									      \
-    INTERNAL_SYSCALL_DECL (__err);					      \
-    long int __ret;							      \
-    __ret = INTERNAL_SYSCALL (futex, __err, 4, (futexp),		      \
-			      __lll_private_flag (FUTEX_WAIT, private),	      \
-			      (val), (timespec));			      \
-    __ret;								      \
-  })
-
-#define lll_futex_timed_wait_bitset(futexp, val, timespec, clockbit, private) \
-  ({									\
-    INTERNAL_SYSCALL_DECL (__err);					\
-    long int __ret;							\
-    int __op = FUTEX_WAIT_BITSET | clockbit;				\
-    __ret = INTERNAL_SYSCALL (futex, __err, 6, (futexp),		\
-			      __lll_private_flag (__op, private),	\
-			      (val), (timespec), NULL /* Unused.  */,	\
-			      FUTEX_BITSET_MATCH_ANY);			\
-    __ret;								\
-  })
-
-#define lll_futex_wake(futexp, nr, private) \
-  ({									      \
-    INTERNAL_SYSCALL_DECL (__err);					      \
-    long int __ret;							      \
-    __ret = INTERNAL_SYSCALL (futex, __err, 4, (futexp),		      \
-			      __lll_private_flag (FUTEX_WAKE, private),	      \
-			      (nr), 0);					      \
-    __ret;								      \
-  })
-
-#define lll_robust_dead(futexv, private) \
-  do									      \
-    {									      \
-      int *__futexp = &(futexv);					      \
-      atomic_or (__futexp, FUTEX_OWNER_DIED);				      \
-      lll_futex_wake (__futexp, 1, private);				      \
-    }									      \
-  while (0)
-
-/* Returns non-zero if error happened, zero if success.  */
-#define lll_futex_requeue(futexp, nr_wake, nr_move, mutex, val, private) \
-  ({									      \
-    INTERNAL_SYSCALL_DECL (__err);					      \
-    long int __ret;							      \
-    __ret = INTERNAL_SYSCALL (futex, __err, 6, (futexp),		      \
-			      __lll_private_flag (FUTEX_CMP_REQUEUE, private),\
-			      (nr_wake), (nr_move), (mutex), (val));	      \
-    INTERNAL_SYSCALL_ERROR_P (__ret, __err);				      \
-  })
-
-
-/* Returns non-zero if error happened, zero if success.  */
-#define lll_futex_wake_unlock(futexp, nr_wake, nr_wake2, futexp2, private) \
-  ({									      \
-    INTERNAL_SYSCALL_DECL (__err);					      \
-    long int __ret;							      \
-    __ret = INTERNAL_SYSCALL (futex, __err, 6, (futexp),		      \
-			      __lll_private_flag (FUTEX_WAKE_OP, private),    \
-			      (nr_wake), (nr_wake2), (futexp2),		      \
-			      FUTEX_OP_CLEAR_WAKE_IF_GT_ONE);		      \
-    INTERNAL_SYSCALL_ERROR_P (__ret, __err);				      \
-  })
-
-/* Priority Inheritance support.  */
-#define lll_futex_wait_requeue_pi(futexp, val, mutex, private) \
-  lll_futex_timed_wait_requeue_pi (futexp, val, NULL, 0, mutex, private)
-
-#define lll_futex_timed_wait_requeue_pi(futexp, val, timespec, clockbit,      \
-					mutex, private)			      \
-  ({									      \
-    INTERNAL_SYSCALL_DECL (__err);					      \
-    int __op = FUTEX_WAIT_REQUEUE_PI | clockbit;			      \
-									      \
-    INTERNAL_SYSCALL (futex, __err, 5, (futexp),			      \
-		      __lll_private_flag (__op, private),		      \
-		      (val), (timespec), mutex); 			      \
-  })
-
-#define lll_futex_cmp_requeue_pi(futexp, nr_wake, nr_move, mutex, val, priv)  \
-  ({									      \
-    INTERNAL_SYSCALL_DECL (__err);					      \
-    long int __ret;							      \
-									      \
-    __ret = INTERNAL_SYSCALL (futex, __err, 6, (futexp),		      \
-			      __lll_private_flag (FUTEX_CMP_REQUEUE_PI, priv),\
-			      (nr_wake), (nr_move), (mutex), (val));	      \
-    INTERNAL_SYSCALL_ERROR_P (__ret, __err);				      \
-  })
-
-
-#define lll_trylock(lock)	\
-  atomic_compare_and_exchange_val_acq(&(lock), 1, 0)
-
-#define lll_cond_trylock(lock)	\
-  atomic_compare_and_exchange_val_acq(&(lock), 2, 0)
-
-#define __lll_robust_trylock(futex, id) \
-  (atomic_compare_and_exchange_val_acq (futex, id, 0) != 0)
-#define lll_robust_trylock(lock, id) \
-  __lll_robust_trylock (&(lock), id)
-
-extern void __lll_lock_wait_private (int *futex) attribute_hidden;
-extern void __lll_lock_wait (int *futex, int private) attribute_hidden;
-extern int __lll_robust_lock_wait (int *futex, int private) attribute_hidden;
-
-#define __lll_lock(futex, private)					      \
-  ((void) ({								      \
-    int *__futex = (futex);						      \
-    if (__builtin_expect (atomic_compare_and_exchange_val_acq (__futex,       \
-								1, 0), 0))    \
-      {									      \
-	if (__builtin_constant_p (private) && (private) == LLL_PRIVATE)	      \
-	  __lll_lock_wait_private (__futex);				      \
-	else								      \
-	  __lll_lock_wait (__futex, private);				      \
-      }									      \
-  }))
-#define lll_lock(futex, private) __lll_lock (&(futex), private)
-
-
-#define __lll_robust_lock(futex, id, private)				      \
-  ({									      \
-    int *__futex = (futex);						      \
-    int __val = 0;							      \
-									      \
-    if (__builtin_expect (atomic_compare_and_exchange_bool_acq (__futex, id,  \
-								0), 0))	      \
-      __val = __lll_robust_lock_wait (__futex, private);		      \
-    __val;								      \
-  })
-#define lll_robust_lock(futex, id, private) \
-  __lll_robust_lock (&(futex), id, private)
-
-
-#define __lll_cond_lock(futex, private)					      \
-  ((void) ({								      \
-    int *__futex = (futex);						      \
-    if (__builtin_expect (atomic_exchange_acq (__futex, 2), 0))		      \
-      __lll_lock_wait (__futex, private);				      \
-  }))
-#define lll_cond_lock(futex, private) __lll_cond_lock (&(futex), private)
-
-
-#define lll_robust_cond_lock(futex, id, private) \
-  __lll_robust_lock (&(futex), (id) | FUTEX_WAITERS, private)
-
-
-extern int __lll_timedlock_wait (int *futex, const struct timespec *,
-				 int private) attribute_hidden;
-extern int __lll_robust_timedlock_wait (int *futex, const struct timespec *,
-					int private) attribute_hidden;
-
-#define __lll_timedlock(futex, abstime, private)			      \
-  ({									      \
-     int *__futex = (futex);						      \
-     int __val = 0;							      \
-									      \
-     if (__builtin_expect (atomic_exchange_acq (__futex, 1), 0))	      \
-       __val = __lll_timedlock_wait (__futex, abstime, private);	      \
-     __val;								      \
-  })
-#define lll_timedlock(futex, abstime, private) \
-  __lll_timedlock (&(futex), abstime, private)
-
-
-#define __lll_robust_timedlock(futex, abstime, id, private)		      \
-  ({									      \
-    int *__futex = (futex);						      \
-    int __val = 0;							      \
-									      \
-    if (__builtin_expect (atomic_compare_and_exchange_bool_acq (__futex, id,  \
-								0), 0))	      \
-      __val = __lll_robust_timedlock_wait (__futex, abstime, private);	      \
-    __val;								      \
-  })
-#define lll_robust_timedlock(futex, abstime, id, private) \
-  __lll_robust_timedlock (&(futex), abstime, id, private)
-
-
-#define __lll_unlock(futex, private) \
-  (void)							\
-    ({ int *__futex = (futex);					\
-       int __oldval = atomic_exchange_rel (__futex, 0);		\
-       if (__builtin_expect (__oldval > 1, 0))			\
-	 lll_futex_wake (__futex, 1, private);			\
-    })
-#define lll_unlock(futex, private) __lll_unlock(&(futex), private)
-
-
-#define __lll_robust_unlock(futex, private) \
-  (void)							\
-    ({ int *__futex = (futex);					\
-       int __oldval = atomic_exchange_rel (__futex, 0);		\
-       if (__builtin_expect (__oldval & FUTEX_WAITERS, 0))	\
-	 lll_futex_wake (__futex, 1, private);			\
-    })
-#define lll_robust_unlock(futex, private) \
-  __lll_robust_unlock(&(futex), private)
-
-
-#define lll_islocked(futex) \
-  (futex != 0)
-
-
-/* Our internal lock implementation is identical to the binary-compatible
-   mutex implementation. */
-
-/* Initializers for lock.  */
-#define LLL_LOCK_INITIALIZER		(0)
-#define LLL_LOCK_INITIALIZER_LOCKED	(1)
-
-/* The states of a lock are:
-    0  -  untaken
-    1  -  taken by one user
-   >1  -  taken by more users */
-
-/* The kernel notifies a process which uses CLONE_CHILD_CLEARTID via futex
-   wakeup when the clone terminates.  The memory location contains the
-   thread ID while the clone is running and is reset to zero
-   afterwards.	*/
-#define lll_wait_tid(tid) \
-  do {					\
-    __typeof (tid) __tid;		\
-    while ((__tid = (tid)) != 0)	\
-      lll_futex_wait (&(tid), __tid, LLL_SHARED);\
-  } while (0)
-
-extern int __lll_timedwait_tid (int *, const struct timespec *)
-     attribute_hidden;
-
-#define lll_timedwait_tid(tid, abstime) \
-  ({							\
-    int __res = 0;					\
-    if ((tid) != 0)					\
-      __res = __lll_timedwait_tid (&(tid), (abstime));	\
-    __res;						\
-  })
-
-#endif	/* lowlevellock.h */
diff --git a/sysdeps/unix/sysv/linux/lowlevellock-futex.h b/sysdeps/unix/sysv/linux/lowlevellock-futex.h
new file mode 100644
index 0000000..343afd6
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/lowlevellock-futex.h
@@ -0,0 +1,137 @@ 
+/* Low-level locking access to futex facilities.  Linux version.
+   Copyright (C) 2005-2014 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.	 See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _LOWLEVELLOCK_FUTEX_H
+#define _LOWLEVELLOCK_FUTEX_H	1
+
+#include <sysdep.h>
+#include <tls.h>
+#include <kernel-features.h>
+
+
+#define FUTEX_WAIT		0
+#define FUTEX_WAKE		1
+#define FUTEX_REQUEUE		3
+#define FUTEX_CMP_REQUEUE	4
+#define FUTEX_WAKE_OP		5
+#define FUTEX_OP_CLEAR_WAKE_IF_GT_ONE	((4 << 24) | 1)
+#define FUTEX_LOCK_PI		6
+#define FUTEX_UNLOCK_PI		7
+#define FUTEX_TRYLOCK_PI	8
+#define FUTEX_WAIT_BITSET	9
+#define FUTEX_WAKE_BITSET	10
+#define FUTEX_WAIT_REQUEUE_PI   11
+#define FUTEX_CMP_REQUEUE_PI    12
+#define FUTEX_PRIVATE_FLAG	128
+#define FUTEX_CLOCK_REALTIME	256
+
+#define FUTEX_BITSET_MATCH_ANY	0xffffffff
+
+/* Values for 'private' parameter of locking macros.  Yes, the
+   definition seems to be backwards.  But it is not.  The bit will be
+   reversed before passing to the system call.  */
+#define LLL_PRIVATE	0
+#define LLL_SHARED	FUTEX_PRIVATE_FLAG
+
+
+#if !defined NOT_IN_libc || defined IS_IN_rtld
+/* In libc.so or ld.so all futexes are private.  */
+# ifdef __ASSUME_PRIVATE_FUTEX
+#  define __lll_private_flag(fl, private) \
+  ((fl) | FUTEX_PRIVATE_FLAG)
+# else
+#  define __lll_private_flag(fl, private) \
+  ((fl) | THREAD_GETMEM (THREAD_SELF, header.private_futex))
+# endif
+#else
+# ifdef __ASSUME_PRIVATE_FUTEX
+#  define __lll_private_flag(fl, private) \
+  (((fl) | FUTEX_PRIVATE_FLAG) ^ (private))
+# else
+#  define __lll_private_flag(fl, private) \
+  (__builtin_constant_p (private)					      \
+   ? ((private) == 0							      \
+      ? ((fl) | THREAD_GETMEM (THREAD_SELF, header.private_futex))	      \
+      : (fl))								      \
+   : ((fl) | (((private) ^ FUTEX_PRIVATE_FLAG)				      \
+	      & THREAD_GETMEM (THREAD_SELF, header.private_futex))))
+# endif
+#endif
+
+#define lll_futex_syscall(nargs, futexp, op, ...)                       \
+  ({                                                                    \
+    INTERNAL_SYSCALL_DECL (__err);                                      \
+    long int __ret = INTERNAL_SYSCALL (futex, __err, nargs, futexp, op, \
+				       __VA_ARGS__);                    \
+    (__glibc_unlikely (INTERNAL_SYSCALL_ERROR_P (__ret, __err))         \
+     ? -INTERNAL_SYSCALL_ERRNO (__ret, __err) : 0);                     \
+  })
+
+#define lll_futex_wait(futexp, val, private) \
+  lll_futex_timed_wait (futexp, val, NULL, private)
+
+#define lll_futex_timed_wait(futexp, val, timeout, private)     \
+  lll_futex_syscall (4, futexp,                                 \
+		     __lll_private_flag (FUTEX_WAIT, private),  \
+		     val, timeout)
+
+#define lll_futex_timed_wait_bitset(futexp, val, timeout, clockbit, private) \
+  lll_futex_syscall (6, futexp,                                         \
+		     __lll_private_flag (FUTEX_WAIT_BITSET | (clockbit), \
+					 private),                      \
+		     val, timeout, NULL /* Unused.  */,                 \
+		     FUTEX_BITSET_MATCH_ANY)
+
+#define lll_futex_wake(futexp, nr, private)                             \
+  lll_futex_syscall (4, futexp,                                         \
+		     __lll_private_flag (FUTEX_WAKE, private), nr, 0)
+
+/* Returns non-zero if error happened, zero if success.  */
+#define lll_futex_requeue(futexp, nr_wake, nr_move, mutex, val, private) \
+  lll_futex_syscall (6, futexp,                                         \
+		     __lll_private_flag (FUTEX_CMP_REQUEUE, private),   \
+		     nr_wake, nr_move, mutex, val)
+
+/* Returns non-zero if error happened, zero if success.  */
+#define lll_futex_wake_unlock(futexp, nr_wake, nr_wake2, futexp2, private) \
+  lll_futex_syscall (6, futexp,                                         \
+		     __lll_private_flag (FUTEX_WAKE_OP, private),       \
+		     nr_wake, nr_wake2, futexp2,                        \
+		     FUTEX_OP_CLEAR_WAKE_IF_GT_ONE)
+
+/* Priority Inheritance support.  */
+#define lll_futex_wait_requeue_pi(futexp, val, mutex, private) \
+  lll_futex_timed_wait_requeue_pi (futexp, val, NULL, 0, mutex, private)
+
+#define lll_futex_timed_wait_requeue_pi(futexp, val, timeout, clockbit, \
+					mutex, private)                 \
+  lll_futex_syscall (5, futexp,                                         \
+		     __lll_private_flag (FUTEX_WAIT_REQUEUE_PI          \
+					 | (clockbit), private),        \
+		     val, timeout, mutex)
+
+
+#define lll_futex_cmp_requeue_pi(futexp, nr_wake, nr_move, mutex,       \
+				 val, private)                          \
+  lll_futex_syscall (6, futexp,                                         \
+		     __lll_private_flag (FUTEX_CMP_REQUEUE_PI,          \
+					 private),                      \
+		     nr_wake, nr_move, mutex, val)
+
+
+#endif  /* lowlevellock-futex.h */