[1/2] Add futex wrappers with error checking

Message ID 1418774081.7165.71.camel@triegel.csb
State Dropped
Headers

Commit Message

Torvald Riegel Dec. 16, 2014, 11:54 p.m. UTC
  On Thu, 2014-12-11 at 17:18 -0800, Roland McGrath wrote:
> > The second is that I haven't looked through all the lowlevellock cases
> > yet, so didn't want to touch that; it seemed moving lll_futex* callers
> > over to futex* callers wouldn't be an issue later on.
> 
> Fair enough.  I'm happy as long as the end state has no duplication of
> code/logic (one internal futex interface to rule them all) and you commit
> to driving all the corners of the cleanup so we get to that end state in
> this cycle.

I want to push back on this a little for two reasons.  First, while I
agree that we don't want to have two interfaces for the same thing, this
isn't exactly the case: lll_futex_ has no error checking and is just a
wrapper for whatever the underlying kernel/... provides.  The futex
wrappers I introduced do have error checking.
Second, it's hard to easily commit to doing something that isn't clearly
defined, especially so close to the freeze.  I agree that we shouldn't
let introduce junk or duplicated functionality, but it may be
unrealistic to try to get all-or-nothing instead of incremental changes.

I've introduced the futex wrappers because I wanted to do proper error
checking for the semaphores, and doing so with a reusable header seemed
to be The Right Thing.

> > Third, some of the lll_futex* definitions are in headers that are also
> > used from asm files; I guess that would mean I'd need to use macros
> > instead of C functions.
> 
> #ifdef __ASSEMBLY__ if need be.

Right.  Thanks.

> But we can also just clean things up so
> that's no longer the case.  There's no reason I can see why assembly code
> should want lowlevellock-futex.h.

There is assembly that calls futex syscalls, which needs at least the
macros for the different futex ops, and the syscall number in certain
cases.  Those things are currently in lowlevellock.h.
lowlevellock-futex.h needs the same information, and it should not
include lowlevellock.h.

We have other #ifndef __ASSEMBLY__ in lowlelvellock.h for a similar
reason already, it seems.

> > Fourth, I need some way to get to the arch-specific futex syscalls.  I
> > didn't know whether sysdeps/unix/sysv/linux/lowlevellock-futex.h would
> > work on every arch, so I just used what works for the locks.
> 
> What's an arch-specific futex syscall?  AFAIK
> sysdeps/unix/sysv/linux/lowlevellock-futex.h should be fine for all
> machines.

It doesn't work currently on i386 due to six-argument syscalls not being
supported, AFAIU.  If someone can add that I'd appreciate it; it would
save me finding out how to do that properly.

> Indeed sysdeps/unix/sysv/linux/lowlevellock.h should be fine for
> all machines too, and the only reason we still have any machine-specific
> files is conservatism about making sure that removing each one doesn't
> degrade any performance or semantics (so someone just needs to look at the
> generated code and compare for each machine).

I agree that this is what we want to do in the long run.

> > > I don't
> > > think we want to have both layers as such in the long run,
> > 
> > Maybe not.  If we want to expose our own futex abstraction to users,
> > we'd need a separate version that does less of the error checking we do,
> > as there may be cases where certain errors would need to be handled
> > differently.  You point out something similar below; checking that the
> > kernel (or whatever below provides the futex functionality) didn't
> > return errors we haven't specified in our futex abstraction.
> 
> I think the best approach for now is not to think about any new user API.
> Just do all the cleanup of our internal futex use thoroughly so we think
> it's very good and very maintainable.  When/if we come up with a new user
> API later, we can refactor as needed to implement it.

OK.

> > I didn't think about clean-up as much.  What I wanted is something we
> > can use today to get the futex error handling correct in pthread_once
> > and the the semaphores I'm about to submit, for example.
> 
> I'm going to insist on cleanup so we aren't growing redundant internal
> APIs.

Understood, but see above (e.g., I disagree it's fully redundant; more
details below).

> > I think I have a pretty good understanding for what the futex semantics
> > of the abstraction that we use internally should be.  I don't have a
> > good feel for how to best clean up all the existing code we have related
> > to that.
> 
> If you start with a strawman proposal for the complete new internal API,
> then we can all work together to figure out how to clean up existing code.

I'd start with just futex (timed)wait and wake, so what's in my patch.
That covers most of the uses.  The other ops are mostly for the
low-level locks, and I need to make a pass over the error handling for
these (but this was already discussed in the futex error handling
thread).

> > > > +#include <lowlevellock.h>
> > > 
> > > Include only what you need: lowlevellock-futex.h here.  That changes
> > > which code you're getting today, because all the machine-specific
> > > lowlevellock.h files still need to be removed.  But we should be
> > > finishing that cleanup this cycle anyway (though everyone seems to
> > > have forgotten).
> > 
> > I tried that now, but that doesn't work because it redefines lll_futex*,
> > and it's hard to avoid including lowlevellock.h through some other
> > header.  Therefore, I left this unchanged for now.
> 
> OK.  Perhaps you'd like to take on eliminating at least the x86 versions of
> lowlevellock.h?  (I think we'll really need to eliminate all of them before
> all futex-related cleanup is done.)

I agree that the existance custom lowlevellock.h and the related
assembly files is an issue we want to fix.  But I think we can start
making the futex facility more generic independently of that.

> > > So, now I'm seeing a potential reason to have this layer exist
> > > distinct from the OS-encapsulation layer.  Perhaps we should have the
> > > checks for expected errno values be in an OS-independent layer rather
> > > than just saying in the specification of the OS-encapsulation layer
> > > that it must yield only the particular set.
> > 
> > I'm not sure I can quite follow you.  I could see why the
> > OS-encapsulation layer would want to check that the set of return values
> > is only those we support in higher layers, but that's not what you're
> > after, or is it?
> 
> If the generic (higher) layers require a certain protocol about errno
> values and we want code to enforce/sanity-check the underlying OS calls for
> that (which seems to be the consensus for Linux), then it is duplicative
> and error-prone for each OS-specific layer to repeat that checking logic.

Yes.  That what I implicitly had in my mind too when thinking about
exposing futex to users; we'd need the OS call to expose it as-is, and
we'd need an internal interface that does the common style of error
checking, which should be OS-independent (assuming the OS calls on
different OSes return compatible sets of errors).

> > Updated patch is attached.  Is this one okay, or do you want to see
> > further changes to it and/or more of the full problem being addressed?
> 
> I guess I'd like to be closer to a full plan for cleaning it all up--that
> is, at least a more full sense of what the complete end state will look
> like, if not all the details about how to reach it--before we start
> committing.

OK.  So, what I'd suggest is this:

1) Have an OS-call interface that just does that.  This is what
lowlevellock-futex.h is currently, AFAIU.  This is implemented
differently on each OS.
Make sure that there is a lowlevellock-futex.h on each arch.  The
attached patch does this for x86_64 and i386.  More details below.

2) Have an OS-independent internal futex interface, with error checking.
That's what's in my patch.  This uses the interface from 1).

3) Move generic, non-low-level-lock code over to using the interface
from 2).  The new semaphore does that.  I have a new condvar
implementation which is just missing the PI handling, which would do the
same.  I'm working on a new rwlock that would use it. Etc.

4) Use the interface from 2) in generic low-level lock.  This should be
fine and without significant performance implications because all futex
ops are on the slow path, with the exception of the PI mutexes.  But if
you're doing a syscall anyway, doing a few more instructions more or
less won't matter that much.

5) Remove custom low-level lock implementations after reviewing the
performance implications of such removals.

Does this sound reasonable?  Are you OK with doing steps 1 and 2 before
the freeze, and do as much of 3) as possible before the freeze?


Regarding the attached patch:

This uses generic lowlevellock-futex.h for x86_64.  It also adds a few
#ifdef __ASSEMBLER__ to the generic Linux one to make this work.

With i386, this doesn't work because of lack of support for six-argument
syscalls (see above); thus, this patch just splits out all the
lll_futex_* calls and related stuff into a i386-specific
lowlevellock-futex.h file.  This has fewer features than the generic
one, but the only users are the current interface from 2), which just
have futex (timed)wait and wake.  And it works currently, so this should
be fine and we can add stuff as necessary later on (or, better, move to
the generic lowlevellock-futex.h).

There are a few more archs with custom futex features (not handled in
the attached patch, but I could add them if this approach is fine for
you):
* microblaze, s390, hppa all use INTERNAL_SYSCALL;  I believe those
could just use generic Linux lowlevellock-futex.h.
* ia64 uses DO_INLINE_SYSCALL.  Not sure whether that could use generic
Linux futexes too.
* sh has custom asm.  Not sure what to do about that.
* sparc uses INTERNAL_SYSCALL, so could be moved to generic Linux, but
there is a change (whose purpose/motivation I currently don't
understand):
  /* Returns non-zero if error happened, zero if success.  */
  #ifdef __sparc32_atomic_do_lock
  /* Avoid FUTEX_WAKE_OP if supporting pre-v9 CPUs.  */
  # define lll_futex_wake_unlock(futexp, nr_wake, nr_wake2, futexp2,\
     private) 1
  #else
  # define lll_futex_wake_unlock(futexp, nr_wake, nr_wake2, futexp2,\
  ...

Thoughts?
  

Comments

Roland McGrath Dec. 17, 2014, 1:07 a.m. UTC | #1
> I want to push back on this a little for two reasons.  First, while I
> agree that we don't want to have two interfaces for the same thing, this
> isn't exactly the case: lll_futex_ has no error checking and is just a
> wrapper for whatever the underlying kernel/... provides.  The futex
> wrappers I introduced do have error checking.

The lack of error checking in lll_futex_* (and their callers) is a bug, as
agreed here previously in discussion with kernel folks.  Frankly, I thought
that introducing error checking uniformly to our futex uses was the
motivation for your changes.  

Do you actually think that we want to reach an end state in which we have
some uses of futex doing error checking and some uses not doing it?
I find it hard to imagine we want that.

> Second, it's hard to easily commit to doing something that isn't clearly
> defined, especially so close to the freeze.  I agree that we shouldn't
> let introduce junk or duplicated functionality, but it may be
> unrealistic to try to get all-or-nothing instead of incremental changes.

I didn't say the changes should be all-or-nothing.  
Almost certainly they should not be.  

I said I wanted you to commit to changing all futex uses as part of the
whole effort.  That has nothing to do with making it all one change.  It's
simply about your personal commitment to keep working on the issue until
all the changes are finished.

As to the proximity of the freeze, firstly we don't yet have any actual
date for the freeze.  Secondly, I think this kind of cleanup work is best
done without big delays in the middle and so probably should be all done in
the same cycle.  If it's too much for you to finish in this cycle, then
perhaps the whole thing should wait until the next cycle.  Personally, I'd
rather we just get on with it now and if that pushes the next release
freeze out a few weeks, that's fine with me.

> There is assembly that calls futex syscalls, which needs at least the
> macros for the different futex ops, and the syscall number in certain
> cases.  Those things are currently in lowlevellock.h.
> lowlevellock-futex.h needs the same information, and it should not
> include lowlevellock.h.

Do we have assembly code making futex syscalls that we actually want to
keep?  My impression was that we wanted to get rid of pretty much all of
the assembly files for this stuff, and it's simply us moving slowly for
each machine that's why we haven't done that yet.

> It doesn't work currently on i386 due to six-argument syscalls not being
> supported, AFAIU.  If someone can add that I'd appreciate it; it would
> save me finding out how to do that properly.

Ah.  I think there are several of us who could tackle that.

> > If you start with a strawman proposal for the complete new internal API,
> > then we can all work together to figure out how to clean up existing code.
> 
> I'd start with just futex (timed)wait and wake, so what's in my patch.
> That covers most of the uses.  The other ops are mostly for the
> low-level locks, and I need to make a pass over the error handling for
> these (but this was already discussed in the futex error handling
> thread).

As long as you have a rough idea how everything else will fit in so that we
can be confident that dealing with the other operations won't make us want
to reconsider the whole organization of the internal API, that seems OK.

> I agree that the existance custom lowlevellock.h and the related
> assembly files is an issue we want to fix.  But I think we can start
> making the futex facility more generic independently of that.

Clearly we can start.  But the existence of the old assembly code seems to
be a significant barrier to doing all the cleanup in the most
straightforward ways.  I think it is probably just going to be net easier
at the end of the day if we take some time first to clean out all the old
assembly code and just don't have those wrinkles to think about when
revamping the rest.

> 1) Have an OS-call interface that just does that.  This is what
> lowlevellock-futex.h is currently, AFAIU.  This is implemented
> differently on each OS.

That's roughly what it is currently.  But it's underspecified and it
doesn't have a very clean internal API.  This should be replaced by
something that isn't organized and named as an implementation detail of the
lll layer, cleans up the private flag stuff, and cleans up the API overall
wrt things I've mentioned before like the stupid negated errno protocol and
using type-checking inlines rather than loosy-goosy macros.

That's not to say that this cleanup is what must happen first.  But an end
state where we still have lowlevellock-futex.h at all in anything like its
current form is not desireable.

> Make sure that there is a lowlevellock-futex.h on each arch.  The
> attached patch does this for x86_64 and i386.  More details below.

I'm not clear on what that patch is accomplishing that's materially
different from just using the existing linux/lowlevellock-futex.h on x86_64
and i386.  But please post that patch in its own thread so it can get
proper review and discussion separate from the rest of this discussion.

> 2) Have an OS-independent internal futex interface, with error checking.
> That's what's in my patch.  This uses the interface from 1).

OK.

> 3) Move generic, non-low-level-lock code over to using the interface
> from 2).  The new semaphore does that.  I have a new condvar
> implementation which is just missing the PI handling, which would do the
> same.  I'm working on a new rwlock that would use it. Etc.

OK.

> 4) Use the interface from 2) in generic low-level lock.  This should be
> fine and without significant performance implications because all futex
> ops are on the slow path, with the exception of the PI mutexes.  But if
> you're doing a syscall anyway, doing a few more instructions more or
> less won't matter that much.

OK.  3 and 4 could happen in either order, or in parallel, right?

> 5) Remove custom low-level lock implementations after reviewing the
> performance implications of such removals.

This need not wait for any other step.  Aside from the i386 issue, this can
be done today and IMHO the sooner it's done the better.

> Does this sound reasonable?  Are you OK with doing steps 1 and 2 before
> the freeze, and do as much of 3) as possible before the freeze?

At the moment I'm feeling ambivalent about splitting any of thus work up
across a freeze.  But each step as described is sufficiently self-contained
that it's probably fine just to do each as it's ready (with very careful
and thorough review and testing) regardless of where that falls relative to
a freeze deadline.

> This uses generic lowlevellock-futex.h for x86_64.  It also adds a few
> #ifdef __ASSEMBLER__ to the generic Linux one to make this work.

Post that separately by itself.  Sounds trivial.

> With i386, this doesn't work because of lack of support for six-argument
> syscalls (see above); thus, this patch just splits out all the
> lll_futex_* calls and related stuff into a i386-specific
> lowlevellock-futex.h file.  This has fewer features than the generic
> one, but the only users are the current interface from 2), which just
> have futex (timed)wait and wake.  And it works currently, so this should
> be fine and we can add stuff as necessary later on (or, better, move to
> the generic lowlevellock-futex.h).

Post that separately by itself.  It sounds like it might be a fine
intermediate change we could do right away.  Making i386 INTERNAL_SYSCALL
handle the 6-argument case will probably be fiddly enough that it takes
a bit longer.

> * microblaze, s390, hppa all use INTERNAL_SYSCALL;  I believe those
> could just use generic Linux lowlevellock-futex.h.

Yes.  I think the right approach here is just to post a patch doing the
removal and let the machine maintainers agree or raise issues.  If you are
in a position to do a performance comparison, either empirically and/or by
eyeballing generated code, then I'm sure that's helpful to report.  But
ultimately the responsibility is with each machine maintainer.  I suspect
that at least e.g. microblaze and hppa maintainers will be happy to do the
removal just based on no-semantic-regressions testing and not bother to
fret about feared performance issues.

> * ia64 uses DO_INLINE_SYSCALL.  Not sure whether that could use generic
> Linux futexes too.

ia64's INTERNAL_SYSCALL is a pretty trivial wrapper around
DO_INLINE_SYSCALL, so I think this is almost certainly just fine.

> * sh has custom asm.  Not sure what to do about that.

Suggest removing it and see if the maintainer even cares to worry about
whether the performance is affected.

> * sparc uses INTERNAL_SYSCALL, so could be moved to generic Linux, but
> there is a change (whose purpose/motivation I currently don't
> understand):

Bug Dave to clarify the requirements for that.


Thanks,
Roland
  
Carlos O'Donell Dec. 17, 2014, 4:40 p.m. UTC | #2
On 12/16/2014 08:07 PM, Roland McGrath wrote:
> As to the proximity of the freeze, firstly we don't yet have any actual
> date for the freeze.  Secondly, I think this kind of cleanup work is best
> done without big delays in the middle and so probably should be all done in
> the same cycle.  If it's too much for you to finish in this cycle, then
> perhaps the whole thing should wait until the next cycle.  Personally, I'd
> rather we just get on with it now and if that pushes the next release
> freeze out a few weeks, that's fine with me.

I have tentatively set the freeze date is set as January 9th.

Cheers,
Carlos.
  
Torvald Riegel Dec. 17, 2014, 5 p.m. UTC | #3
On Tue, 2014-12-16 at 17:07 -0800, Roland McGrath wrote:
> > I want to push back on this a little for two reasons.  First, while I
> > agree that we don't want to have two interfaces for the same thing, this
> > isn't exactly the case: lll_futex_ has no error checking and is just a
> > wrapper for whatever the underlying kernel/... provides.  The futex
> > wrappers I introduced do have error checking.
> 
> The lack of error checking in lll_futex_* (and their callers) is a bug, as
> agreed here previously in discussion with kernel folks.  Frankly, I thought
> that introducing error checking uniformly to our futex uses was the
> motivation for your changes.  

It is.  I do want to have the low-level locks check the return values of
their futex calls.  To me, that doesn't conflict with having another
interface that is just what the underlying kernels/... provide.  We can
also get rid of this other interface and do error checking in each of
the per kernel/... implementations.  Does that clarify what I meant?

> Do you actually think that we want to reach an end state in which we have
> some uses of futex doing error checking and some uses not doing it?
> I find it hard to imagine we want that.

No, but having layered interfaces, where only the top-most one does
error checking and is used throughout glibc, doesn't seem obviously bad
to me.

> > Second, it's hard to easily commit to doing something that isn't clearly
> > defined, especially so close to the freeze.  I agree that we shouldn't
> > let introduce junk or duplicated functionality, but it may be
> > unrealistic to try to get all-or-nothing instead of incremental changes.
> 
> I didn't say the changes should be all-or-nothing.  
> Almost certainly they should not be.  
> 
> I said I wanted you to commit to changing all futex uses as part of the
> whole effort.  That has nothing to do with making it all one change.  It's
> simply about your personal commitment to keep working on the issue until
> all the changes are finished.
> 
> As to the proximity of the freeze, firstly we don't yet have any actual
> date for the freeze.

The message Carlos sent out states otherwise.  I believed we do want to
have fixed release dates, and I assumed that the release date would be
Dec 30.  The Jan 9 date Carlos mentioned gives a bit more time, but not
much.

> Secondly, I think this kind of cleanup work is best
> done without big delays in the middle and so probably should be all done in
> the same cycle.  If it's too much for you to finish in this cycle, then
> perhaps the whole thing should wait until the next cycle.

What would this mean for the new semaphore I posted?  Are you okay with
it using the current lll_futex_* operations and custom error checking?
Or do you want something like the futex-with-builtin-error-checking
interface that I posted?  In the latter case, this would block the
semaphore from going in.

> Personally, I'd
> rather we just get on with it now and if that pushes the next release
> freeze out a few weeks, that's fine with me.
> 
> > There is assembly that calls futex syscalls, which needs at least the
> > macros for the different futex ops, and the syscall number in certain
> > cases.  Those things are currently in lowlevellock.h.
> > lowlevellock-futex.h needs the same information, and it should not
> > include lowlevellock.h.
> 
> Do we have assembly code making futex syscalls that we actually want to
> keep?  My impression was that we wanted to get rid of pretty much all of
> the assembly files for this stuff, and it's simply us moving slowly for
> each machine that's why we haven't done that yet.

I would also hope that we can remove all of them.  But actually doing
that may be a bigger project.  The futex-using assembly files I see are
on x86, x86_64, and sh.  Its the low-level lock stuff (including
robust), pthread barrier, pthread condvar, pthread rwlock, pthread
semaphore, cancellation.

I've posted a new semaphore, and as I've said, am working on condvar and
rwlock.

Cancellation may be something that ends up doing futex calls from
assembly.  But given that futexes are on the slow path, the assembly
could probably just call C futex function instead.

> > It doesn't work currently on i386 due to six-argument syscalls not being
> > supported, AFAIU.  If someone can add that I'd appreciate it; it would
> > save me finding out how to do that properly.
> 
> Ah.  I think there are several of us who could tackle that.

Thanks.

> > > If you start with a strawman proposal for the complete new internal API,
> > > then we can all work together to figure out how to clean up existing code.
> > 
> > I'd start with just futex (timed)wait and wake, so what's in my patch.
> > That covers most of the uses.  The other ops are mostly for the
> > low-level locks, and I need to make a pass over the error handling for
> > these (but this was already discussed in the futex error handling
> > thread).
> 
> As long as you have a rough idea how everything else will fit in so that we
> can be confident that dealing with the other operations won't make us want
> to reconsider the whole organization of the internal API, that seems OK.
> 
> > I agree that the existance custom lowlevellock.h and the related
> > assembly files is an issue we want to fix.  But I think we can start
> > making the futex facility more generic independently of that.
> 
> Clearly we can start.  But the existence of the old assembly code seems to
> be a significant barrier to doing all the cleanup in the most
> straightforward ways.  I think it is probably just going to be net easier
> at the end of the day if we take some time first to clean out all the old
> assembly code and just don't have those wrinkles to think about when
> revamping the rest.
> 
> > 1) Have an OS-call interface that just does that.  This is what
> > lowlevellock-futex.h is currently, AFAIU.  This is implemented
> > differently on each OS.
> 
> That's roughly what it is currently.  But it's underspecified and it
> doesn't have a very clean internal API.  This should be replaced by
> something that isn't organized and named as an implementation detail of the
> lll layer,

Agreed that the name needs to be changed.

> cleans up the private flag stuff,

Yes, that would be good.

> and cleans up the API overall
> wrt things I've mentioned before like the stupid negated errno protocol and
> using type-checking inlines rather than loosy-goosy macros.
> 
> That's not to say that this cleanup is what must happen first.  But an end
> state where we still have lowlevellock-futex.h at all in anything like its
> current form is not desireable.

OK.

One thing we need to agree on is whether we need this interface in the
end at all, or whether we make the interface from 2) (see below) the one
that each OS implements.  What do you think?

Keeping 1) would make it easier for us to expose it to users in the
future (I do remember you not wanting to care about this now, though).
Keeping 1) isn't helping for internal use if the errors that it can
return are incompatible between the different OS implementations.

What is NaCl's error specification for the futex ops?  Given that much
of this refactoring is to make non-Linux futex possible, it would be
helpful if you can summarize what you need and want.

> > Make sure that there is a lowlevellock-futex.h on each arch.  The
> > attached patch does this for x86_64 and i386.  More details below.
> 
> I'm not clear on what that patch is accomplishing that's materially
> different from just using the existing linux/lowlevellock-futex.h on x86_64
> and i386.  But please post that patch in its own thread so it can get
> proper review and discussion separate from the rest of this discussion.

OK.

> > 2) Have an OS-independent internal futex interface, with error checking.
> > That's what's in my patch.  This uses the interface from 1).
> 
> OK.
> 
> > 3) Move generic, non-low-level-lock code over to using the interface
> > from 2).  The new semaphore does that.  I have a new condvar
> > implementation which is just missing the PI handling, which would do the
> > same.  I'm working on a new rwlock that would use it. Etc.
> 
> OK.
> 
> > 4) Use the interface from 2) in generic low-level lock.  This should be
> > fine and without significant performance implications because all futex
> > ops are on the slow path, with the exception of the PI mutexes.  But if
> > you're doing a syscall anyway, doing a few more instructions more or
> > less won't matter that much.
> 
> OK.  3 and 4 could happen in either order, or in parallel, right?

Yes.

> > 5) Remove custom low-level lock implementations after reviewing the
> > performance implications of such removals.
> 
> This need not wait for any other step.  Aside from the i386 issue, this can
> be done today and IMHO the sooner it's done the better.

That's true, but in contrast to the other steps, this may require more
time to assess the impact on performance.

> > This uses generic lowlevellock-futex.h for x86_64.  It also adds a few
> > #ifdef __ASSEMBLER__ to the generic Linux one to make this work.
> 
> Post that separately by itself.  Sounds trivial.

OK.

> > With i386, this doesn't work because of lack of support for six-argument
> > syscalls (see above); thus, this patch just splits out all the
> > lll_futex_* calls and related stuff into a i386-specific
> > lowlevellock-futex.h file.  This has fewer features than the generic
> > one, but the only users are the current interface from 2), which just
> > have futex (timed)wait and wake.  And it works currently, so this should
> > be fine and we can add stuff as necessary later on (or, better, move to
> > the generic lowlevellock-futex.h).
> 
> Post that separately by itself.  It sounds like it might be a fine
> intermediate change we could do right away.  Making i386 INTERNAL_SYSCALL
> handle the 6-argument case will probably be fiddly enough that it takes
> a bit longer.

OK.

> > * microblaze, s390, hppa all use INTERNAL_SYSCALL;  I believe those
> > could just use generic Linux lowlevellock-futex.h.
> 
> Yes.  I think the right approach here is just to post a patch doing the
> removal and let the machine maintainers agree or raise issues.  If you are
> in a position to do a performance comparison, either empirically and/or by
> eyeballing generated code, then I'm sure that's helpful to report.  But
> ultimately the responsibility is with each machine maintainer.  I suspect
> that at least e.g. microblaze and hppa maintainers will be happy to do the
> removal just based on no-semantic-regressions testing and not bother to
> fret about feared performance issues.

OK.  The futex ops are on the slow path, so performance differences
won't matter much (in contrast to lowlevellock.h, which contains the
fast paths too).

> > * ia64 uses DO_INLINE_SYSCALL.  Not sure whether that could use generic
> > Linux futexes too.
> 
> ia64's INTERNAL_SYSCALL is a pretty trivial wrapper around
> DO_INLINE_SYSCALL, so I think this is almost certainly just fine.

OK.

> > * sh has custom asm.  Not sure what to do about that.
> 
> Suggest removing it and see if the maintainer even cares to worry about
> whether the performance is affected.

OK.

> > * sparc uses INTERNAL_SYSCALL, so could be moved to generic Linux, but
> > there is a change (whose purpose/motivation I currently don't
> > understand):
> 
> Bug Dave to clarify the requirements for that.

OK.
  
Roland McGrath Dec. 17, 2014, 11:34 p.m. UTC | #4
> It is.  I do want to have the low-level locks check the return values of
> their futex calls.  To me, that doesn't conflict with having another
> interface that is just what the underlying kernels/... provide.  We can
> also get rid of this other interface and do error checking in each of
> the per kernel/... implementations.  Does that clarify what I meant?

Yes, that makes sense.  On further reflection, I think it does make sense
to have the error-checking layer be outside the OS-encapsulation layer so
that we have foolproof consistency of the expected error protocol of the OS
layer.

> The message Carlos sent out states otherwise.  I believed we do want to
> have fixed release dates, and I assumed that the release date would be
> Dec 30.  The Jan 9 date Carlos mentioned gives a bit more time, but not
> much.

Indeed.  I don't particularly advocate the fixed date approach, but I
haven't objected to it, and won't.

> What would this mean for the new semaphore I posted?  Are you okay with
> it using the current lll_futex_* operations and custom error checking?

If that is fixing some real, observable bug independent of the futex error
handling question, then I think it's fine as an intermediate step to do
now.  If it is nothing but internal reorganization or the only kind of bug
it's fixing is the lack of futex error handling that is a more pervasive
problem (and there's nothing really notable about the semaphore instance of
that problem)--or is a purely speculative theoretical bug--then I don't
think it's worth perturbing things before the release.

> I would also hope that we can remove all of them.  But actually doing
> that may be a bigger project.  The futex-using assembly files I see are
> on x86, x86_64, and sh.  Its the low-level lock stuff (including
> robust), pthread barrier, pthread condvar, pthread rwlock, pthread
> semaphore, cancellation.

The 6-argument syscall issue on i386 is the only blocker we know of, right?
But indeed we wanted to be conservative about perturbing performance issues
for i386/x86_64, and just establishing confidence about that will take some
time.  I had hoped someone like you would have done that during this cycle,
but since actual focus on it is only beginning about now, it might well
need to wait at least until right after the release.

> Cancellation may be something that ends up doing futex calls from
> assembly.

Can you point to particular code you're thinking of?

> One thing we need to agree on is whether we need this interface in the
> end at all, or whether we make the interface from 2) (see below) the one
> that each OS implements.  What do you think?

On the one hand I want to avoid extra layers when possible.  On the other
hand, I am adamant about having any sanity-check enforcement of error
protocols be done in code that is OS-independent so that we do not risk
skew between OS-specific implementations of the sanity checks.

Perhaps that right balance is to have some common code that does the error
protocol assertions for each piece of the internal interface, but as
utility code that the OS-specific layer calls rather than as an extra layer
around it.  Either of those two approaches should be fairly easy to
refactor into the other later.

> Keeping 1) would make it easier for us to expose it to users in the
> future (I do remember you not wanting to care about this now, though).
> Keeping 1) isn't helping for internal use if the errors that it can
> return are incompatible between the different OS implementations.

Right.  As I said before, I want to ignore speculation about a new user API
if considering it would complicate the ideal internal cleanup we can do in
its absence.  Off hand, 

> What is NaCl's error specification for the futex ops?  Given that much
> of this refactoring is to make non-Linux futex possible, it would be
> helpful if you can summarize what you need and want.

It's not formally specified.  Currently it's quite simple: only basic wake
and wait/timed-wait are supported, and the only errors possible are EFAULT
and EAGAIN (just for value mismatch).  It's likely that it will grow over
time to cover a larger subset of the Linux features and their nontrivial
failure modes.  It will always be the intent that it align pretty well with
the Linux semantics.

> > > 5) Remove custom low-level lock implementations after reviewing the
> > > performance implications of such removals.
> > 
> > This need not wait for any other step.  Aside from the i386 issue, this can
> > be done today and IMHO the sooner it's done the better.
> 
> That's true, but in contrast to the other steps, this may require more
> time to assess the impact on performance.

Agreed.


Thanks,
Roland
  
Torvald Riegel Jan. 5, 2015, 10:53 p.m. UTC | #5
On Wed, 2014-12-17 at 15:34 -0800, Roland McGrath wrote:
> > It is.  I do want to have the low-level locks check the return values of
> > their futex calls.  To me, that doesn't conflict with having another
> > interface that is just what the underlying kernels/... provide.  We can
> > also get rid of this other interface and do error checking in each of
> > the per kernel/... implementations.  Does that clarify what I meant?
> 
> Yes, that makes sense.  On further reflection, I think it does make sense
> to have the error-checking layer be outside the OS-encapsulation layer so
> that we have foolproof consistency of the expected error protocol of the OS
> layer.

OK.

> > What would this mean for the new semaphore I posted?  Are you okay with
> > it using the current lll_futex_* operations and custom error checking?
> 
> If that is fixing some real, observable bug independent of the futex error
> handling question, then I think it's fine as an intermediate step to do
> now.

It is fixing a correctness bug.

Carlos said he's currently reviewing that, so I'll try to focus on doing
more of the futex clean-up now, hoping that I get enough of it done
before the freeze (and with time for you to review) to make adding the
new internal futex interface fine.  Sounds good?

I'm traveling this week and thus don't have lots of time to spend on
this, but I'll see how far I can get.

> > I would also hope that we can remove all of them.  But actually doing
> > that may be a bigger project.  The futex-using assembly files I see are
> > on x86, x86_64, and sh.  Its the low-level lock stuff (including
> > robust), pthread barrier, pthread condvar, pthread rwlock, pthread
> > semaphore, cancellation.
> 
> The 6-argument syscall issue on i386 is the only blocker we know of, right?
> But indeed we wanted to be conservative about perturbing performance issues
> for i386/x86_64, and just establishing confidence about that will take some
> time.  I had hoped someone like you would have done that during this cycle,
> but since actual focus on it is only beginning about now, it might well
> need to wait at least until right after the release.

No, I haven't, sorry.

i386 6-argument syscall is a blocker in terms of using Linux-generic
lowlevellock-futex.h everywhere.  The other obstacle is that I haven't
heard back from the hppa, microblaze, and ia64 maintainers; it seems
okay to remove the custom lowlevellock.h on these archs, but I can't
test that easily.  Without that removal, I can't do certain follow-up
cleanups like adding a futex_wait that accepts absolute timeouts in just
Linux-generic lowlevellock-futex.h.

> > Cancellation may be something that ends up doing futex calls from
> > assembly.
> 
> Can you point to particular code you're thinking of?

AFAIR the plan we had for how to change cancellation, we wanted it to be
based on checking whether the instruction pointer falls into a certain
code region, not using the enable/disable calls we have now.  We'd put
the syscall into a tightly-controlled code region whose addresses we'd
know, and then would be able to check whether cancellation happened in
there or not.  I don't have pointers to the emails handy, but I'll look
for it.

OTOH, maybe that could just use a cancellable version of
INTERNAL_SYSCALL, and we'd be fine. 

> > One thing we need to agree on is whether we need this interface in the
> > end at all, or whether we make the interface from 2) (see below) the one
> > that each OS implements.  What do you think?
> 
> On the one hand I want to avoid extra layers when possible.  On the other
> hand, I am adamant about having any sanity-check enforcement of error
> protocols be done in code that is OS-independent so that we do not risk
> skew between OS-specific implementations of the sanity checks.

OK.

> Perhaps that right balance is to have some common code that does the error
> protocol assertions for each piece of the internal interface, but as
> utility code that the OS-specific layer calls rather than as an extra layer
> around it.  Either of those two approaches should be fairly easy to
> refactor into the other later.

OK, and I agree that such refactoring seems straightforward.  I'd start
with having the layering that I have now (with error checking in the
futex_* wrappers in my patches, that call the lll_futex_* functions for
now), if that's fine for you.  This just seems to be easier to implement
incrementally.  Also, we get to move uses off from the lll_futex_* names
into just futex_*, whcih seems right.


The next patches I plan to prepare are splitting the current
futex_timed_wait into two variants for relative and absolute timeouts,
respectively (absolute is used in pthread_*, relative is in aio_suspend,
for example).  That needs an additional lll_futex_timed_wait variant,
but the upside is that we can use this to consolidate much duplicated
code in pthread_*, and start using the CLOCK_REALTIME optimization with
futex_timed_wait_bitset everywhere.

Next, I plan to transition more uses of lll_futex_* to futex_*.

If any of that sounds wrong to you, please let me know soon.
  

Patch

commit a2848a4447dfb84635002e9dfa77031b8bbcac79
Author: Torvald Riegel <triegel@redhat.com>
Date:   Wed Dec 17 00:02:45 2014 +0100

    [WIP] Split out futex functions from lowlevellock.h.

diff --git a/sysdeps/unix/sysv/linux/i386/lowlevellock-futex.h b/sysdeps/unix/sysv/linux/i386/lowlevellock-futex.h
new file mode 100644
index 0000000..2b5ae53
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/i386/lowlevellock-futex.h
@@ -0,0 +1,136 @@ 
+/* Copyright (C) 2014 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _LOWLEVELLOCK_FUTEX_H
+#define _LOWLEVELLOCK_FUTEX_H	1
+
+#define FUTEX_WAIT		0
+#define FUTEX_WAKE		1
+#define FUTEX_CMP_REQUEUE	4
+#define FUTEX_WAKE_OP		5
+#define FUTEX_LOCK_PI		6
+#define FUTEX_UNLOCK_PI		7
+#define FUTEX_TRYLOCK_PI	8
+#define FUTEX_WAIT_BITSET	9
+#define FUTEX_WAKE_BITSET	10
+#define FUTEX_WAIT_REQUEUE_PI	11
+#define FUTEX_CMP_REQUEUE_PI	12
+#define FUTEX_PRIVATE_FLAG	128
+#define FUTEX_CLOCK_REALTIME	256
+
+#define FUTEX_BITSET_MATCH_ANY	0xffffffff
+
+#define FUTEX_OP_CLEAR_WAKE_IF_GT_ONE	((4 << 24) | 1)
+
+/* Values for 'private' parameter of locking macros.  Yes, the
+   definition seems to be backwards.  But it is not.  The bit will be
+   reversed before passing to the system call.  */
+#define LLL_PRIVATE	0
+#define LLL_SHARED	FUTEX_PRIVATE_FLAG
+
+
+#if IS_IN (libc) || IS_IN (rtld)
+/* In libc.so or ld.so all futexes are private.  */
+# ifdef __ASSUME_PRIVATE_FUTEX
+#  define __lll_private_flag(fl, private) \
+  ((fl) | FUTEX_PRIVATE_FLAG)
+# else
+#  define __lll_private_flag(fl, private) \
+  ((fl) | THREAD_GETMEM (THREAD_SELF, header.private_futex))
+# endif
+#else
+# ifdef __ASSUME_PRIVATE_FUTEX
+#  define __lll_private_flag(fl, private) \
+  (((fl) | FUTEX_PRIVATE_FLAG) ^ (private))
+# else
+#  define __lll_private_flag(fl, private) \
+  (__builtin_constant_p (private)					      \
+   ? ((private) == 0							      \
+      ? ((fl) | THREAD_GETMEM (THREAD_SELF, header.private_futex))	      \
+      : (fl))								      \
+   : ({ unsigned int __fl = ((private) ^ FUTEX_PRIVATE_FLAG);		      \
+	asm ("andl %%gs:%P1, %0" : "+r" (__fl)				      \
+	     : "i" (offsetof (struct pthread, header.private_futex)));	      \
+	__fl | (fl); }))
+# endif
+#endif
+
+
+#ifndef __ASSEMBLER__
+
+/* To avoid naming conflicts with lowlevellock.h, use a different prefix
+   here.  */
+#ifdef PIC
+# define LLLF_EBX_LOAD	"xchgl %2, %%ebx\n"
+# define LLLF_EBX_REG	"D"
+#else
+# define LLLF_EBX_LOAD
+# define LLLF_EBX_REG	"b"
+#endif
+
+#ifdef I386_USE_SYSENTER
+# ifdef SHARED
+#  define LLLF_ENTER_KERNEL	"call *%%gs:%P6\n\t"
+# else
+#  define LLLF_ENTER_KERNEL	"call *_dl_sysinfo\n\t"
+# endif
+#else
+# define LLLF_ENTER_KERNEL	"int $0x80\n\t"
+#endif
+
+
+#define lll_futex_wait(futex, val, private) \
+  lll_futex_timed_wait (futex, val, NULL, private)
+
+
+#define lll_futex_timed_wait(futex, val, timeout, private) \
+  ({									      \
+    int __status;							      \
+    register __typeof (val) _val asm ("edx") = (val);			      \
+    __asm __volatile (LLLF_EBX_LOAD					      \
+		      LLLF_ENTER_KERNEL					      \
+		      LLLF_EBX_LOAD					      \
+		      : "=a" (__status)					      \
+		      : "0" (SYS_futex), LLLF_EBX_REG (futex), "S" (timeout),  \
+			"c" (__lll_private_flag (FUTEX_WAIT, private)),	      \
+			"d" (_val), "i" (offsetof (tcbhead_t, sysinfo))	      \
+		      : "memory");					      \
+    __status;								      \
+  })
+
+
+#define lll_futex_wake(futex, nr, private) \
+  ({									      \
+    int __status;							      \
+    register __typeof (nr) _nr asm ("edx") = (nr);			      \
+    LIBC_PROBE (lll_futex_wake, 3, futex, nr, private);                       \
+    __asm __volatile (LLLF_EBX_LOAD					      \
+		      LLLF_ENTER_KERNEL					      \
+		      LLLF_EBX_LOAD					      \
+		      : "=a" (__status)					      \
+		      : "0" (SYS_futex), LLLF_EBX_REG (futex),		      \
+			"c" (__lll_private_flag (FUTEX_WAKE, private)),	      \
+			"d" (_nr),					      \
+			"i" (0) /* phony, to align next arg's number */,      \
+			"i" (offsetof (tcbhead_t, sysinfo)));		      \
+    __status;								      \
+  })
+
+
+#endif  /* !__ASSEMBLER__ */
+
+#endif	/* lowlevellock-futex.h */
diff --git a/sysdeps/unix/sysv/linux/i386/lowlevellock.h b/sysdeps/unix/sysv/linux/i386/lowlevellock.h
index 1032f4b..c3528a8 100644
--- a/sysdeps/unix/sysv/linux/i386/lowlevellock.h
+++ b/sysdeps/unix/sysv/linux/i386/lowlevellock.h
@@ -45,57 +45,10 @@ 
 # endif
 #endif
 
+#include <lowlevellock-futex.h>
+
+/* XXX Remove when no assembler code uses futexes anymore.  */
 #define SYS_futex		240
-#define FUTEX_WAIT		0
-#define FUTEX_WAKE		1
-#define FUTEX_CMP_REQUEUE	4
-#define FUTEX_WAKE_OP		5
-#define FUTEX_LOCK_PI		6
-#define FUTEX_UNLOCK_PI		7
-#define FUTEX_TRYLOCK_PI	8
-#define FUTEX_WAIT_BITSET	9
-#define FUTEX_WAKE_BITSET	10
-#define FUTEX_WAIT_REQUEUE_PI	11
-#define FUTEX_CMP_REQUEUE_PI	12
-#define FUTEX_PRIVATE_FLAG	128
-#define FUTEX_CLOCK_REALTIME	256
-
-#define FUTEX_BITSET_MATCH_ANY	0xffffffff
-
-#define FUTEX_OP_CLEAR_WAKE_IF_GT_ONE	((4 << 24) | 1)
-
-/* Values for 'private' parameter of locking macros.  Yes, the
-   definition seems to be backwards.  But it is not.  The bit will be
-   reversed before passing to the system call.  */
-#define LLL_PRIVATE	0
-#define LLL_SHARED	FUTEX_PRIVATE_FLAG
-
-
-#if IS_IN (libc) || IS_IN (rtld)
-/* In libc.so or ld.so all futexes are private.  */
-# ifdef __ASSUME_PRIVATE_FUTEX
-#  define __lll_private_flag(fl, private) \
-  ((fl) | FUTEX_PRIVATE_FLAG)
-# else
-#  define __lll_private_flag(fl, private) \
-  ((fl) | THREAD_GETMEM (THREAD_SELF, header.private_futex))
-# endif
-#else
-# ifdef __ASSUME_PRIVATE_FUTEX
-#  define __lll_private_flag(fl, private) \
-  (((fl) | FUTEX_PRIVATE_FLAG) ^ (private))
-# else
-#  define __lll_private_flag(fl, private) \
-  (__builtin_constant_p (private)					      \
-   ? ((private) == 0							      \
-      ? ((fl) | THREAD_GETMEM (THREAD_SELF, header.private_futex))	      \
-      : (fl))								      \
-   : ({ unsigned int __fl = ((private) ^ FUTEX_PRIVATE_FLAG);		      \
-	asm ("andl %%gs:%P1, %0" : "+r" (__fl)				      \
-	     : "i" (offsetof (struct pthread, header.private_futex)));	      \
-	__fl | (fl); }))
-# endif
-#endif
 
 #ifndef __ASSEMBLER__
 
@@ -126,43 +79,6 @@ 
 /* Delay in spinlock loop.  */
 #define BUSY_WAIT_NOP	asm ("rep; nop")
 
-#define lll_futex_wait(futex, val, private) \
-  lll_futex_timed_wait (futex, val, NULL, private)
-
-
-#define lll_futex_timed_wait(futex, val, timeout, private) \
-  ({									      \
-    int __status;							      \
-    register __typeof (val) _val asm ("edx") = (val);			      \
-    __asm __volatile (LLL_EBX_LOAD					      \
-		      LLL_ENTER_KERNEL					      \
-		      LLL_EBX_LOAD					      \
-		      : "=a" (__status)					      \
-		      : "0" (SYS_futex), LLL_EBX_REG (futex), "S" (timeout),  \
-			"c" (__lll_private_flag (FUTEX_WAIT, private)),	      \
-			"d" (_val), "i" (offsetof (tcbhead_t, sysinfo))	      \
-		      : "memory");					      \
-    __status;								      \
-  })
-
-
-#define lll_futex_wake(futex, nr, private) \
-  ({									      \
-    int __status;							      \
-    register __typeof (nr) _nr asm ("edx") = (nr);			      \
-    LIBC_PROBE (lll_futex_wake, 3, futex, nr, private);                       \
-    __asm __volatile (LLL_EBX_LOAD					      \
-		      LLL_ENTER_KERNEL					      \
-		      LLL_EBX_LOAD					      \
-		      : "=a" (__status)					      \
-		      : "0" (SYS_futex), LLL_EBX_REG (futex),		      \
-			"c" (__lll_private_flag (FUTEX_WAKE, private)),	      \
-			"d" (_nr),					      \
-			"i" (0) /* phony, to align next arg's number */,      \
-			"i" (offsetof (tcbhead_t, sysinfo)));		      \
-    __status;								      \
-  })
-
 
 /* NB: in the lll_trylock macro we simply return the value in %eax
    after the cmpxchg instruction.  In case the operation succeded this
@@ -381,43 +297,37 @@  extern int __lll_timedlock_elision (int *futex, short *adapt_count,
   (futex != LLL_LOCK_INITIALIZER)
 
 /* The kernel notifies a process which uses CLONE_CHILD_CLEARTID via futex
-   wakeup when the clone terminates.  The memory location contains the
-   thread ID while the clone is running and is reset to zero
-   afterwards.
-
-   The macro parameter must not have any side effect.  */
+   wake-up when the clone terminates.  The memory location contains the
+   thread ID while the clone is running and is reset to zero by the kernel
+   afterwards.  The kernel up to version 3.16.3 does not use the private futex
+   operations for futex wake-up when the clone terminates.  */
 #define lll_wait_tid(tid) \
-  do {									      \
-    int __ignore;							      \
-    register __typeof (tid) _tid asm ("edx") = (tid);			      \
-    if (_tid != 0)							      \
-      __asm __volatile (LLL_EBX_LOAD					      \
-			"1:\tmovl %1, %%eax\n\t"			      \
-			LLL_ENTER_KERNEL				      \
-			"cmpl $0, (%%ebx)\n\t"				      \
-			"jne 1b\n\t"					      \
-			LLL_EBX_LOAD					      \
-			: "=&a" (__ignore)				      \
-			: "i" (SYS_futex), LLL_EBX_REG (&tid), "S" (0),	      \
-			  "c" (FUTEX_WAIT), "d" (_tid),			      \
-			  "i" (offsetof (tcbhead_t, sysinfo))		      \
-			: "memory");					      \
+  do {					\
+    __typeof (tid) __tid;		\
+    while ((__tid = (tid)) != 0)	\
+      lll_futex_wait (&(tid), __tid, LLL_SHARED);\
   } while (0)
 
 extern int __lll_timedwait_tid (int *tid, const struct timespec *abstime)
      __attribute__ ((regparm (2))) attribute_hidden;
+
+/* As lll_wait_tid, but with a timeout.  If the timeout occurs then return
+   ETIMEDOUT.  If ABSTIME is invalid, return EINVAL.
+   XXX Note that this differs from the generic version in that we do the
+   error checking here and not in __lll_timedwait_tid.  */
 #define lll_timedwait_tid(tid, abstime) \
   ({									      \
     int __result = 0;							      \
-    if (tid != 0)							      \
+    if ((tid) != 0)							      \
       {									      \
-	if (abstime->tv_nsec < 0 || abstime->tv_nsec >= 1000000000)	      \
+	if ((abstime)->tv_nsec < 0 || (abstime)->tv_nsec >= 1000000000)	      \
 	  __result = EINVAL;						      \
 	else								      \
-	  __result = __lll_timedwait_tid (&tid, abstime);		      \
+	  __result = __lll_timedwait_tid (&(tid), (abstime));		      \
       }									      \
     __result; })
 
+
 extern int __lll_lock_elision (int *futex, short *adapt_count, int private)
   attribute_hidden;
 
diff --git a/sysdeps/unix/sysv/linux/lowlevellock-futex.h b/sysdeps/unix/sysv/linux/lowlevellock-futex.h
index 8927661..907b4c7 100644
--- a/sysdeps/unix/sysv/linux/lowlevellock-futex.h
+++ b/sysdeps/unix/sysv/linux/lowlevellock-futex.h
@@ -19,10 +19,11 @@ 
 #ifndef _LOWLEVELLOCK_FUTEX_H
 #define _LOWLEVELLOCK_FUTEX_H	1
 
+#ifndef __ASSEMBLER__
 #include <sysdep.h>
 #include <tls.h>
 #include <kernel-features.h>
-
+#endif
 
 #define FUTEX_WAIT		0
 #define FUTEX_WAKE		1
@@ -48,6 +49,7 @@ 
 #define LLL_PRIVATE	0
 #define LLL_SHARED	FUTEX_PRIVATE_FLAG
 
+#ifndef __ASSEMBLER__
 
 #if IS_IN (libc) || IS_IN (rtld)
 /* In libc.so or ld.so all futexes are private.  */
@@ -133,5 +135,6 @@ 
 					 private),                      \
 		     nr_wake, nr_move, mutex, val)
 
+#endif  /* !__ASSEMBLER__  */
 
 #endif  /* lowlevellock-futex.h */
diff --git a/sysdeps/unix/sysv/linux/x86_64/lowlevellock.h b/sysdeps/unix/sysv/linux/x86_64/lowlevellock.h
index 2f0cf5c..56570ee 100644
--- a/sysdeps/unix/sysv/linux/x86_64/lowlevellock.h
+++ b/sysdeps/unix/sysv/linux/x86_64/lowlevellock.h
@@ -45,59 +45,13 @@ 
 # endif
 #endif
 
+#include <lowlevellock-futex.h>
+
+/* XXX Remove when no assembler code uses futexes anymore.  */
 #define SYS_futex		__NR_futex
-#define FUTEX_WAIT		0
-#define FUTEX_WAKE		1
-#define FUTEX_CMP_REQUEUE	4
-#define FUTEX_WAKE_OP		5
-#define FUTEX_LOCK_PI		6
-#define FUTEX_UNLOCK_PI		7
-#define FUTEX_TRYLOCK_PI	8
-#define FUTEX_WAIT_BITSET	9
-#define FUTEX_WAKE_BITSET	10
-#define FUTEX_WAIT_REQUEUE_PI	11
-#define FUTEX_CMP_REQUEUE_PI	12
-#define FUTEX_PRIVATE_FLAG	128
-#define FUTEX_CLOCK_REALTIME	256
-
-#define FUTEX_BITSET_MATCH_ANY	0xffffffff
-
-#define FUTEX_OP_CLEAR_WAKE_IF_GT_ONE	((4 << 24) | 1)
-
-/* Values for 'private' parameter of locking macros.  Yes, the
-   definition seems to be backwards.  But it is not.  The bit will be
-   reversed before passing to the system call.  */
-#define LLL_PRIVATE	0
-#define LLL_SHARED	FUTEX_PRIVATE_FLAG
 
 #ifndef __ASSEMBLER__
 
-#if IS_IN (libc) || IS_IN (rtld)
-/* In libc.so or ld.so all futexes are private.  */
-# ifdef __ASSUME_PRIVATE_FUTEX
-#  define __lll_private_flag(fl, private) \
-  ((fl) | FUTEX_PRIVATE_FLAG)
-# else
-#  define __lll_private_flag(fl, private) \
-  ((fl) | THREAD_GETMEM (THREAD_SELF, header.private_futex))
-# endif
-#else
-# ifdef __ASSUME_PRIVATE_FUTEX
-#  define __lll_private_flag(fl, private) \
-  (((fl) | FUTEX_PRIVATE_FLAG) ^ (private))
-# else
-#  define __lll_private_flag(fl, private) \
-  (__builtin_constant_p (private)					      \
-   ? ((private) == 0							      \
-      ? ((fl) | THREAD_GETMEM (THREAD_SELF, header.private_futex))	      \
-      : (fl))								      \
-   : ({ unsigned int __fl = ((private) ^ FUTEX_PRIVATE_FLAG);		      \
-	asm ("andl %%fs:%P1, %0" : "+r" (__fl)				      \
-	     : "i" (offsetof (struct pthread, header.private_futex)));	      \
-	__fl | (fl); }))
-# endif
-#endif
-
 /* Initializer for lock.  */
 #define LLL_LOCK_INITIALIZER		(0)
 #define LLL_LOCK_INITIALIZER_LOCKED	(1)
@@ -106,39 +60,6 @@ 
 /* Delay in spinlock loop.  */
 #define BUSY_WAIT_NOP	  asm ("rep; nop")
 
-#define lll_futex_wait(futex, val, private) \
-  lll_futex_timed_wait(futex, val, NULL, private)
-
-
-#define lll_futex_timed_wait(futex, val, timeout, private) \
-  ({									      \
-    register const struct timespec *__to __asm ("r10") = timeout;	      \
-    int __status;							      \
-    register __typeof (val) _val __asm ("edx") = (val);			      \
-    __asm __volatile ("syscall"						      \
-		      : "=a" (__status)					      \
-		      : "0" (SYS_futex), "D" (futex),			      \
-			"S" (__lll_private_flag (FUTEX_WAIT, private)),	      \
-			"d" (_val), "r" (__to)				      \
-		      : "memory", "cc", "r11", "cx");			      \
-    __status;								      \
-  })
-
-
-#define lll_futex_wake(futex, nr, private) \
-  ({									      \
-    int __status;							      \
-    register __typeof (nr) _nr __asm ("edx") = (nr);			      \
-    LIBC_PROBE (lll_futex_wake, 3, futex, nr, private);                       \
-    __asm __volatile ("syscall"						      \
-		      : "=a" (__status)					      \
-		      : "0" (SYS_futex), "D" (futex),			      \
-			"S" (__lll_private_flag (FUTEX_WAKE, private)),	      \
-			"d" (_nr)					      \
-		      : "memory", "cc", "r10", "r11", "cx");		      \
-    __status;								      \
-  })
-
 
 /* NB: in the lll_trylock macro we simply return the value in %eax
    after the cmpxchg instruction.  In case the operation succeded this
@@ -378,58 +299,38 @@  extern int __lll_timedlock_elision (int *futex, short *adapt_count,
     }									      \
   while (0)
 
-/* Returns non-zero if error happened, zero if success.  */
-#define lll_futex_requeue(ftx, nr_wake, nr_move, mutex, val, private) \
-  ({ int __res;								      \
-     register int __nr_move __asm ("r10") = nr_move;			      \
-     register void *__mutex __asm ("r8") = mutex;			      \
-     register int __val __asm ("r9") = val;				      \
-     __asm __volatile ("syscall"					      \
-		       : "=a" (__res)					      \
-		       : "0" (__NR_futex), "D" ((void *) ftx),		      \
-			 "S" (__lll_private_flag (FUTEX_CMP_REQUEUE,	      \
-						  private)), "d" (nr_wake),   \
-			 "r" (__nr_move), "r" (__mutex), "r" (__val)	      \
-		       : "cx", "r11", "cc", "memory");			      \
-     __res < 0; })
-
 #define lll_islocked(futex) \
   (futex != LLL_LOCK_INITIALIZER)
 
 
 /* The kernel notifies a process which uses CLONE_CHILD_CLEARTID via futex
-   wakeup when the clone terminates.  The memory location contains the
-   thread ID while the clone is running and is reset to zero
-   afterwards.
-
-   The macro parameter must not have any side effect.  */
+   wake-up when the clone terminates.  The memory location contains the
+   thread ID while the clone is running and is reset to zero by the kernel
+   afterwards.  The kernel up to version 3.16.3 does not use the private futex
+   operations for futex wake-up when the clone terminates.  */
 #define lll_wait_tid(tid) \
-  do {									      \
-    int __ignore;							      \
-    register __typeof (tid) _tid asm ("edx") = (tid);			      \
-    if (_tid != 0)							      \
-      __asm __volatile ("xorq %%r10, %%r10\n\t"				      \
-			"1:\tmovq %2, %%rax\n\t"			      \
-			"syscall\n\t"					      \
-			"cmpl $0, (%%rdi)\n\t"				      \
-			"jne 1b"					      \
-			: "=&a" (__ignore)				      \
-			: "S" (FUTEX_WAIT), "i" (SYS_futex), "D" (&tid),      \
-			  "d" (_tid)					      \
-			: "memory", "cc", "r10", "r11", "cx");		      \
+  do {					\
+    __typeof (tid) __tid;		\
+    while ((__tid = (tid)) != 0)	\
+      lll_futex_wait (&(tid), __tid, LLL_SHARED);\
   } while (0)
 
-extern int __lll_timedwait_tid (int *tid, const struct timespec *abstime)
+extern int __lll_timedwait_tid (int *, const struct timespec *)
      attribute_hidden;
+
+/* As lll_wait_tid, but with a timeout.  If the timeout occurs then return
+   ETIMEDOUT.  If ABSTIME is invalid, return EINVAL.
+   XXX Note that this differs from the generic version in that we do the
+   error checking here and not in __lll_timedwait_tid.  */
 #define lll_timedwait_tid(tid, abstime) \
   ({									      \
     int __result = 0;							      \
-    if (tid != 0)							      \
+    if ((tid) != 0)							      \
       {									      \
-	if (abstime->tv_nsec < 0 || abstime->tv_nsec >= 1000000000)	      \
+	if ((abstime)->tv_nsec < 0 || (abstime)->tv_nsec >= 1000000000)	      \
 	  __result = EINVAL;						      \
 	else								      \
-	  __result = __lll_timedwait_tid (&tid, abstime);		      \
+	  __result = __lll_timedwait_tid (&(tid), (abstime));		      \
       }									      \
     __result; })