
nptl: Fix pthread_cond_signal missing a sleeper (bug 25847)

Message ID 20201209031522.22759-1-malteskarupke@web.de
State New
Series nptl: Fix pthread_cond_signal missing a sleeper (bug 25847)

Commit Message

Malte Skarupke Dec. 9, 2020, 3:15 a.m. UTC
There was a subtle, rare bug in the code that tried to handle the
case when a thread took a long time to leave pthread_cond_wait,
and accidentally stole a signal from a later group. As a result
you could get a situation where pthread_cond_signal doesn't wake
a thread even though there is a sleeping thread on the condition
variable.

The fix is to make it impossible for a thread to ever get into
that state, allowing us to remove the code that handles potential
stealing of signals.

pthread_cond_signal would already wait for all sleeping threads
to wake up before closing a group, but pthread_cond_wait had a
shortcut code-path that didn't increment/decrement g_refs, which
meant that pthread_cond_signal wouldn't wait for the thread in
pthread_cond_wait to leave. If pthread_cond_signal then closed the
group too early, and the thread in pthread_cond_wait took a while
longer to leave the function, it could end up stealing a wake-up
signal from a later group, which led to a state where a sleeping
thread on the condition variable would not wake up. There was
already code in place to try to fix this, but that code had a
subtle bug.

In this change I ensure that each thread always increments and
decrements g_refs, so that pthread_cond_signal can't accidentally
close a group while there is still a thread in pthread_cond_wait.
As a result, stealing of signals can't happen, and the code with
the subtle bug could be removed.

This also meant that the increment/decrement of wrefs was no
longer necessary, since g_refs now covers the same scope. I
removed the increment and decrement of wrefs, which means that
this change should not affect performance.

Because wrefs now changed meaning, I also renamed the variable
from __wrefs to __flags.

Unfortunately I couldn't entirely get rid of all logic related to
wrefs: There is one edge case related to pthread_cancel where I
couldn't broaden the scope of g_refs as much as I would have liked
to. In that case I kept the scope of g_refs narrow, and needed
some other logic to synchronize with pthread_cond_destroy. I chose
to keep the wrefs-related logic in pthread_cond_destroy around for
that case. I still think the renaming of the variable is correct,
because it only serves its old purpose when coordinating between
pthread_cancel and pthread_cond_destroy, which is more of a corner
case instead of the core function of the variable.

I had reproduced the bug in the formal specification language TLA+
to better understand what is happening. I wrote a blog post about
the experience here:

https://probablydance.com/2020/10/31/using-tla-in-the-real-world-to-understand-a-glibc-bug/

This fix is also verified in TLA+. The TLA+ model is not included
in this patch because I don't know what the format for that should
be. Let me know if there is a good place for it.
---
 nptl/nptl-printers.py                   |  12 ++-
 nptl/nptl_lock_constants.pysym          |   2 +-
 nptl/pthread_cond_broadcast.c           |   9 +-
 nptl/pthread_cond_common.c              |   4 +-
 nptl/pthread_cond_destroy.c             |  35 +++++--
 nptl/pthread_cond_init.c                |   4 +-
 nptl/pthread_cond_signal.c              |   9 +-
 nptl/pthread_cond_wait.c                | 124 ++++++------------------
 nptl/test-cond-printers.c               |  57 +++++++++--
 nptl/test-cond-printers.py              |   4 +
 nptl/tst-cond22.c                       |   4 +-
 sysdeps/nptl/bits/thread-shared-types.h |   2 +-
 12 files changed, 134 insertions(+), 132 deletions(-)

--
2.17.1

Comments

Lucas A. M. Magalhaes Dec. 9, 2020, 1:30 p.m. UTC | #1
Hi Malte,
Thanks for working on this bug. I will make some suggestions.

Usually when a patch fixes a bug, the commit subject has the
bugzilla number specified in the format "(#BZ XXXX)". So your patch
subject should be something like:

nptl: Fix pthread_cond_signal missing a sleeper (#BZ 25847)

> There was a subtle, rare bug in the code that tried to handle the
> case when a thread took a long time to leave pthread_cond_wait,
> and accidentally stole a signal from a later group. As a result
> you could get a situation where pthread_cond_signal doesn't wake
> a thread even though there is a sleeping thread on the condition
> variable.
> 
> The fix is to make it impossible for a thread to ever get into
> that state, allowing us to remove the code that handles potential
> stealing of signals.
> 
> pthread_cond_signal would already wait for all sleeping threads
> to wake up before closing a group, but pthread_cond_wait had a
> shortcut code-path that didn't increment/decrement g_refs, which
> meant that pthread_cond_signal wouldn't wait for the thread in
> pthread_cond_wait to leave. If pthread_cond_signal then closed the
> group too early, and the thread in pthread_cond_wait took a while
> longer to leave the function, it could end up stealing a wake-up
> signal from a later group, which led to a state where a sleeping
> thread on the condition variable would not wake up. There was
> already code in place to try to fix this, but that code had a
> subtle bug.
> 
> In this change I ensure that each thread always increments and
> decrements g_refs, so that pthread_cond_signal can't accidentally
> close a group while there is still a thread in pthread_cond_wait.
> As a result, stealing of signals can't happen, and the code with
> the subtle bug could be removed.
> 
> This also meant that the increment/decrement of wrefs was no
> longer necessary, since g_refs now covers the same scope. I
> removed the increment and decrement of wrefs, which means that
> this change should not affect performance.
> 
> Because wrefs now changed meaning, I also renamed the variable
> from __wrefs to __flags.
> 

I found it difficult to separate the fix from the renaming parts.
IMHO this patch could be split to make it easier to review.
It would also improve git log readability and organization.

> Unfortunately I couldn't entirely get rid of all logic related to
> wrefs: There is one edge case related to pthread_cancel where I
> couldn't broaden the scope of g_refs as much as I would have liked
> to. In that case I kept the scope of g_refs narrow, and needed
> some other logic to synchronize with pthread_cond_destroy. I chose
> to keep the wrefs-related logic in pthread_cond_destroy around for
> that case. I still think the renaming of the variable is correct,
> because it only serves its old purpose when coordinating between
> pthread_cancel and pthread_cond_destroy, which is more of a corner
> case instead of the core function of the variable.
> 
> I had reproduced the bug in the formal specification language TLA+
> to better understand what is happening. I wrote a blog post about
> the experience here:
> 
> https://probablydance.com/2020/10/31/using-tla-in-the-real-world-to-understand-a-glibc-bug/
> 
Nice work. Looking forward to reading it.

> This fix is also verified in TLA+. The TLA+ model is not included
> in this patch because I don't know what the format for that should
> be. Let me know if there is a good place for it.

IDK if it's possible, but we could have a testcase for that.

---
Lucas A. M. Magalhães
Torvald Riegel Dec. 23, 2020, 6:09 p.m. UTC | #2
On Tue, 2020-12-08 at 22:15 -0500, Malte Skarupke wrote:
> I wrote a blog post about
> the experience here:
> 
> https://probablydance.com/2020/10/31/using-tla-in-the-real-world-to-understand-a-glibc-bug/

Thanks for your work on this bug.

I'm finally getting some uninterrupted time to look into this again
(after having switched to a different project a couple of years ago). 
First, I have a few comments regarding your blog.

(1) The simple condvar implementation that you show in the blog does
not satisfy the requirements of condvars as specified by ISO C++ and
also by POSIX.  These condvars require that waiters are woken up in an
order that is consistent with the happens-before order established
through the critical section that is associated with the condvar. 
Programs can test whether that is the case, which also means that they
can rely on it.  The condvar rewrite was necessary to fix this problem.

One issue with the simple algorithm is that Linux futexes don't
guarantee ordered wake-up.  That's not a bug because if there's no
ordering requirement, it's certainly cheaper for the implementation to
be able to wake up any of the available waiters (even if current
implementations don't exploit this, future ones may want to do so).

Even if futexes did provide such a guarantee, one would have to make
the order in which threads start waiting through the futexes consistent
with the order established through the critical section, which would
mean that one has to somehow merge mutex unlock and futex_wait into one
atomic step.  Adding futex operations to do so isn't a good option.

You don't seem to be checking for proper wake-up order in your TLA+
code, so it doesn't seem as if you have fully validated against the
condvar semantics we need.

condvar signals are also allowed within the critical section, which may
be something else to check.

A result of this is that when you say "This fix is also verified in
TLA+.", I'd argue that this actually means that it is only verified
against the simplified requirements you made, and it actually verifies
the TLA+ representation of the algorithm, not the actual fix in C. 
Don't get me wrong, kudos to you for making use of verification tools,
but it's important to be precise when talking about validation.
 

(2) For efficiency, we don't want to require kernel involvement to get
correctness; instead, we want to be able to synchronize in userspace
only (even if doing that to the full extent would require more
spinning-related optimizations and tuning in glibc).  Another
efficiency problem that exists in the simple algorithm is that any
newly added signal will prevent concurrent futex_waits from blocking,
regardless of whether this signal is concurrently consumed by one of
the threads; this forces other waiters back through the critical
section, thus increasing contention.


(3) ABA conditions on large numbers are a difficult trade-off. "Very
unlikely" does not equal "will never happen".  While I agree that a 64b
overflow will take more time than we expect systems to be alive and
running, making that argument gets tougher for 32b, especially if one
thinks about processors with lots of HW threads and efficient HW
support for atomics, like recent GPUs.  Thus, I wanted to avoid 32b ABA
issues.


(4) Yes, the current condvar code is complex.  But it's a difficult
problem to solve a required ordered wake-up with a fixed amount of
memory (think process-shared condvars) and without killing performance
(eg, always waking up lots of waiters, or just falling back to
spinning, etc.).  Having at least two separate groups of waiters is
required to solve the ordered wake-up.  And if one adds requirements
such as cancellation, this just adds complexity to the problem.


I'll continue investigating the bug and the proposed fixes.
Torvald Riegel Dec. 25, 2020, 5:18 p.m. UTC | #3
Thank you for investigating, the patch and the write-up in the blog. 
Comments below.

On Tue, 2020-12-08 at 22:15 -0500, Malte Skarupke wrote:
> There was a subtle, rare bug in the code that tried to handle the
> case when a thread took a long time to leave pthread_cond_wait,
> and accidentally stole a signal from a later group. As a result
> you could get a situation where pthread_cond_signal doesn't wake
> a thread even though there is a sleeping thread on the condition
> variable.

Yes, the code that tries to mitigate potential stealing didn't fix up
the size of the group as well, so both could diverge, leading to other
signalers sinking a signal into a group that didn't have any waiters
anymore.

> The fix is to make it impossible for a thread to ever get into
> that state, allowing us to remove the code that handles potential
> stealing of signals.
> 
> pthread_cond_signal would already wait for all sleeping threads
> to wake up before closing a group, but pthread_cond_wait had a
> shortcut code-path that didn't increment/decrement g_refs, which
> meant that pthread_cond_signal wouldn't wait for the thread in
> pthread_cond_wait to leave.

Limiting the scope of waiting to those threads that are trying to block
through futexes was intentional.  To be efficient in high-contention
scenarios where threads can synchronize using just shared memory, we
try to make blocking through futexes an "optional" last resort for
cases where wait times may actually be longer (assuming that programs
don't oversubscribe their CPU cores).  Thus, __g_refs was intended to
prevent ABAs on futex words.

> If pthread_cond_signal then closed the
> group too early, and the thread in pthread_cond_wait took a while
> longer to leave the function, it could end up stealing a wake-up
> signal from a later group, which led to a state where a sleeping
> thread on the condition variable would not wake up. There was
> already code in place to try to fix this, but that code had a
> subtle bug.
> 
> In this change I ensure that each thread always increments and
> decrements g_refs, so that pthread_cond_signal can't accidentally
> close a group while there is still a thread in pthread_cond_wait.
> As a result, stealing of signals can't happen, and the code with
> the subtle bug could be removed.

I agree that, if done right, we could have all waiters add a reference
to their group, so that signalers that switch groups wait for all
waiters to quiesce.

However, that may have performance implications because now this
applies to all waiters, even those that didn't use futexes to block
for a longer time.

That said, the alternative fix of adding a signal the proper way (ie,
including updating group size) will also be less efficient than the
current (buggy) code.  IIRC, you said that you tried this but it
deadlocked; did you just put a call to pthread_cond_signal in, or did
you adapt it so that it hits the right group slot?  Any idea why it
deadlocked?
This alternative may be a simpler fix than changing the purpose of
__g_refs, unless I'm missing something.

> This also meant that the increment/decrement of wrefs was no
> longer necessary, since g_refs now covers the same scope.

This isn't exactly the same scope, as you also indicate below.  __wrefs
is larger in scope because it was used just for condvar destruction
safety, whereas group switching waits on __g_refs.  Releasing one's
ref-count in __g_refs can't be the last action on the condvar instance
in case of cancellation, which is why you have to add the work-around.

> I
> removed the increment and decrement of wrefs, which means that
> this change should not affect performance.

I think it would be better to start with a patch that just changes the
scope and use of __g_refs, and keeps __wrefs as it is.  This way, you
avoid having to re-introduce __wrefs but just for cancellation, and you
can still use the fast-path check on __wrefs (see below).

I'd be less concerned about performance overheads of an additional
atomic read-modify-write, given that it would be followed by a __g_refs
RMW right after it, and both may be on the same cache line.

> Because wrefs now changed meaning, I also renamed the variable
> from __wrefs to __flags.

I think the name change isn't great.  It still has a ref count and
flags in your patch, but now for just cancellation.

> diff --git a/nptl/pthread_cond_broadcast.c b/nptl/pthread_cond_broadcast.c
> index 8d887aab93..e1da07bfe0 100644
> --- a/nptl/pthread_cond_broadcast.c
> +++ b/nptl/pthread_cond_broadcast.c
> @@ -40,10 +40,13 @@ __pthread_cond_broadcast (pthread_cond_t *cond)
>  {
>    LIBC_PROBE (cond_broadcast, 1, cond);
> 
> -  unsigned int wrefs = atomic_load_relaxed (&cond->__data.__wrefs);
> -  if (wrefs >> 3 == 0)
> +  unsigned int grefs0 = atomic_load_relaxed(cond->__data.__g_refs);
> +  unsigned int grefs1 = atomic_load_relaxed(cond->__data.__g_refs + 1);
> +  if ((grefs0 >> 1) == 0 && (grefs1 >> 1) == 0)
>      return 0;

I don't think this works because you now have two separate loads.  The
prior code picks one position in happens-before at which the broadcast
could have happened and loads __wrefs there, making this consistent.
The new code could be unlucky and check an empty G2 in both cases,
meaning that it couldn't be sure that there were no waiters in any
group at the time of either of the two loads.

If you'd keep __wrefs as it is, this problem would go away.

> diff --git a/nptl/pthread_cond_signal.c b/nptl/pthread_cond_signal.c
> index 4281ad4d3b..a59151ce72 100644
> --- a/nptl/pthread_cond_signal.c
> +++ b/nptl/pthread_cond_signal.c
> @@ -39,10 +39,13 @@ __pthread_cond_signal (pthread_cond_t *cond)
>    /* First check whether there are waiters.  Relaxed MO is fine for that for
>       the same reasons that relaxed MO is fine when observing __wseq (see
>       below).  */
> -  unsigned int wrefs = atomic_load_relaxed (&cond->__data.__wrefs);
> -  if (wrefs >> 3 == 0)
> +  unsigned int grefs0 = atomic_load_relaxed(cond->__data.__g_refs);
> +  unsigned int grefs1 = atomic_load_relaxed(cond->__data.__g_refs + 1);
> +  if ((grefs0 >> 1) == 0 && (grefs1 >> 1) == 0)
>      return 0;

See above.

> diff --git a/nptl/pthread_cond_wait.c b/nptl/pthread_cond_wait.c
> index 02d11c61db..64ce8ea8ee 100644
> --- a/nptl/pthread_cond_wait.c
> +++ b/nptl/pthread_cond_wait.c
> @@ -43,18 +43,6 @@ struct _condvar_cleanup_buffer
>  };
> 
> 
> -/* Decrease the waiter reference count.  */
> -static void
> -__condvar_confirm_wakeup (pthread_cond_t *cond, int private)
> -{
> -  /* If destruction is pending (i.e., the wake-request flag is nonzero) and we
> -     are the last waiter (prior value of __wrefs was 1 << 3), then wake any
> -     threads waiting in pthread_cond_destroy.  Release MO to synchronize with
> -     these threads.  Don't bother clearing the wake-up request flag.  */
> -  if ((atomic_fetch_add_release (&cond->__data.__wrefs, -8) >> 2) == 3)
> -    futex_wake (&cond->__data.__wrefs, INT_MAX, private);
> -}
> -
> 
>  /* Cancel waiting after having registered as a waiter previously.  SEQ is our
>     position and G is our group index.
> @@ -81,7 +69,7 @@ __condvar_cancel_waiting (pthread_cond_t *cond, uint64_t seq, unsigned int g,
>  {
>    bool consumed_signal = false;
> 
> -  /* No deadlock with group switching is possible here because we have do
> +  /* No deadlock with group switching is possible here because we do
>       not hold a reference on the group.  */
>    __condvar_acquire_lock (cond, private);
> 
> @@ -146,7 +134,7 @@ __condvar_cancel_waiting (pthread_cond_t *cond, uint64_t seq, unsigned int g,
> 
>  /* Wake up any signalers that might be waiting.  */
>  static void
> -__condvar_dec_grefs (pthread_cond_t *cond, unsigned int g, int private)
> +__condvar_confirm_wakeup (pthread_cond_t *cond, unsigned int g, int private)
>  {
>    /* Release MO to synchronize-with the acquire load in
>       __condvar_quiesce_and_switch_g1.  */
> @@ -172,7 +160,16 @@ __condvar_cleanup_waiting (void *arg)
>    pthread_cond_t *cond = cbuffer->cond;
>    unsigned g = cbuffer->wseq & 1;
> 
> -  __condvar_dec_grefs (cond, g, cbuffer->private);
> +  /* Normally we are not allowed to touch cond any more after calling
> +     confirm_wakeup, because pthread_cond_destroy looks at __g_refs to
> +     determine when all waiters have woken. Since we will do more work in this
> +     function, we are using an extra channel to communicate to
> +     pthread_cond_destroy that it is not allowed to finish yet: We increment
> +     the fourth bit on __flags. Relaxed MO is enough. The synchronization
> +     happens because confirm_wakeup uses release MO. */
> +  atomic_fetch_add_relaxed (&cond->__data.__flags, 8);
> +
> +  __condvar_confirm_wakeup (cond, g, cbuffer->private);
> 
>    __condvar_cancel_waiting (cond, cbuffer->wseq >> 1, g, cbuffer->private);
>    /* FIXME With the current cancellation implementation, it is possible that
> @@ -183,7 +180,12 @@ __condvar_cleanup_waiting (void *arg)
>       conservatively.  */
>    futex_wake (cond->__data.__g_signals + g, 1, cbuffer->private);
> 
> -  __condvar_confirm_wakeup (cond, cbuffer->private);
> +  /* If destruction is pending (i.e., the wake-request flag is nonzero) and we
> +     are the last waiter (prior value of __flags was 1 << 3), then wake any
> +     threads waiting in pthread_cond_destroy.  Release MO to synchronize with
> +     these threads.  Don't bother clearing the wake-up request flag.  */
> +  if ((atomic_fetch_add_release (&cond->__data.__flags, -8) >> 2) == 3)
> +    futex_wake (&cond->__data.__flags, INT_MAX, cbuffer->private);
> 
>    /* XXX If locking the mutex fails, should we just stop execution?  This
>       might be better than silently ignoring the error.  */
> @@ -287,13 +289,14 @@ __condvar_cleanup_waiting (void *arg)
>     __g1_orig_size: Initial size of G1
>       * The two least-significant bits represent the condvar-internal lock.
>       * Only accessed while having acquired the condvar-internal lock.
> -   __wrefs: Waiter reference counter.
> +   __flags: Flags and reference count to handle pthread_cleanup_push/pop.
> +     * Bit 3 and above is the count of threads that had pthread_cancel()
> +       called on them. pthread_cond_destroy waits for this to reach 0.

Waiters can cancel waiting because of pthread_cancel or because of
normal timeouts.

>       * Bit 2 is true if waiters should run futex_wake when they remove the
>         last reference.  pthread_cond_destroy uses this as futex word.
>       * Bit 1 is the clock ID (0 == CLOCK_REALTIME, 1 == CLOCK_MONOTONIC).
>       * Bit 0 is true iff this is a process-shared condvar.
> -     * Simple reference count used by both waiters and pthread_cond_destroy.
> -     (If the format of __wrefs is changed, update nptl_lock_constants.pysym
> +     (If the format of __flags is changed, update nptl_lock_constants.pysym
>        and the pretty printers.)
>     For each of the two groups, we have:
>     __g_refs: Futex waiter reference count.

This now isn't just the ref count for futex waiters, but for all
waiters.

I think it would improve the patch if the comments would be updated to
match the code, and ideally explain what the intent is behind the new
code in more detail.

> @@ -301,6 +304,8 @@ __condvar_cleanup_waiting (void *arg)
>         last reference.
>       * Reference count used by waiters concurrently with signalers that have
>         acquired the condvar-internal lock.
> +     (If the format of __g_refs is changed, update nptl_lock_constants.pysym
> +      and the pretty printers.)
>     __g_signals: The number of signals that can still be consumed.
>       * Used as a futex word by waiters.  Used concurrently by waiters and
>         signalers.
> @@ -407,7 +412,8 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
> 
>    /* Increase the waiter reference count.  Relaxed MO is sufficient because
>       we only need to synchronize when decrementing the reference count.  */
> -  unsigned int flags = atomic_fetch_add_relaxed (&cond->__data.__wrefs, 8);
> +  atomic_fetch_add_acquire (cond->__data.__g_refs + g, 2);

Please explain why a particular MO is used in the comments.

> +  unsigned int flags = atomic_load_relaxed (&cond->__data.__flags);
>    int private = __condvar_get_private (flags);
>
>    /* Now that we are registered as a waiter, we can release the mutex.
> @@ -421,7 +427,7 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
>    if (__glibc_unlikely (err != 0))
>      {
>        __condvar_cancel_waiting (cond, seq, g, private);
> -      __condvar_confirm_wakeup (cond, private);
> +      __condvar_confirm_wakeup (cond, g, private);
>        return err;
>      }
> 
> @@ -482,13 +488,9 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
>  	     release MO when decrementing the reference count because we use
>  	     an atomic read-modify-write operation and thus extend the release
>  	     sequence.  */
> -	  atomic_fetch_add_acquire (cond->__data.__g_refs + g, 2);

See above.  Please check that the memory orders are correct and still
work, and update the comments accordingly.  I haven't checked them
myself yet, so I don't yet have an opinion either way.

>  	  if (((atomic_load_acquire (cond->__data.__g_signals + g) & 1) != 0)
>  	      || (seq < (__condvar_load_g1_start_relaxed (cond) >> 1)))
>  	    {
> -	      /* Our group is closed.  Wake up any signalers that might be
> -		 waiting.  */

The group is still closed, so I'd retain that part of the comment.

> -	      __condvar_dec_grefs (cond, g, private);
>  	      goto done;
>  	    }
> 
> @@ -508,7 +510,6 @@ __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
> 
>  	  if (__glibc_unlikely (err == ETIMEDOUT || err == EOVERFLOW))
>  	    {
> -	      __condvar_dec_grefs (cond, g, private);

This is cancelling waiting too, so you need to add code similar to
what you have done for cancellation because of pthread_cancel.

IIRC, you don't model cancellation in your TLA+ version yet, right?
If so, this is an example of why I said in my earlier response that
any verification is only as good as the model that one is using.

I believe you also need to handle your change to __g_refs on line 423
of pthread_cond_wait.c.

Patch

diff --git a/nptl/nptl-printers.py b/nptl/nptl-printers.py
index e794034d83..ef909c92db 100644
--- a/nptl/nptl-printers.py
+++ b/nptl/nptl-printers.py
@@ -312,7 +312,8 @@  class ConditionVariablePrinter(object):
         """

         data = cond['__data']
-        self.wrefs = data['__wrefs']
+        self.flags = data['__flags']
+        self.grefs = data['__g_refs']
         self.values = []

         self.read_values()
@@ -349,19 +350,20 @@  class ConditionVariablePrinter(object):
         This method reads whether the condvar is destroyed and how many threads
         are waiting for it.
         """
-
+        num_readers_g0 = self.grefs[0] >> PTHREAD_COND_GREFS_SHIFT
+        num_readers_g1 = self.grefs[1] >> PTHREAD_COND_GREFS_SHIFT
         self.values.append(('Threads known to still execute a wait function',
-                            self.wrefs >> PTHREAD_COND_WREFS_SHIFT))
+                            num_readers_g0 + num_readers_g1))

     def read_attributes(self):
         """Read the condvar's attributes."""

-        if (self.wrefs & PTHREAD_COND_CLOCK_MONOTONIC_MASK) != 0:
+        if (self.flags & PTHREAD_COND_CLOCK_MONOTONIC_MASK) != 0:
             self.values.append(('Clock ID', 'CLOCK_MONOTONIC'))
         else:
             self.values.append(('Clock ID', 'CLOCK_REALTIME'))

-        if (self.wrefs & PTHREAD_COND_SHARED_MASK) != 0:
+        if (self.flags & PTHREAD_COND_SHARED_MASK) != 0:
             self.values.append(('Shared', 'Yes'))
         else:
             self.values.append(('Shared', 'No'))
diff --git a/nptl/nptl_lock_constants.pysym b/nptl/nptl_lock_constants.pysym
index ade4398e0c..2141cfa1f0 100644
--- a/nptl/nptl_lock_constants.pysym
+++ b/nptl/nptl_lock_constants.pysym
@@ -50,7 +50,7 @@  PTHREAD_COND_SHARED_MASK          __PTHREAD_COND_SHARED_MASK
 PTHREAD_COND_CLOCK_MONOTONIC_MASK __PTHREAD_COND_CLOCK_MONOTONIC_MASK
 COND_CLOCK_BITS
 -- These values are hardcoded:
-PTHREAD_COND_WREFS_SHIFT          3
+PTHREAD_COND_GREFS_SHIFT          1

 -- Rwlock attributes
 PTHREAD_RWLOCK_PREFER_READER_NP
diff --git a/nptl/pthread_cond_broadcast.c b/nptl/pthread_cond_broadcast.c
index 8d887aab93..e1da07bfe0 100644
--- a/nptl/pthread_cond_broadcast.c
+++ b/nptl/pthread_cond_broadcast.c
@@ -40,10 +40,13 @@  __pthread_cond_broadcast (pthread_cond_t *cond)
 {
   LIBC_PROBE (cond_broadcast, 1, cond);

-  unsigned int wrefs = atomic_load_relaxed (&cond->__data.__wrefs);
-  if (wrefs >> 3 == 0)
+  unsigned int grefs0 = atomic_load_relaxed(cond->__data.__g_refs);
+  unsigned int grefs1 = atomic_load_relaxed(cond->__data.__g_refs + 1);
+  if ((grefs0 >> 1) == 0 && (grefs1 >> 1) == 0)
     return 0;
-  int private = __condvar_get_private (wrefs);
+
+  unsigned int flags = atomic_load_relaxed (&cond->__data.__flags);
+  int private = __condvar_get_private (flags);

   __condvar_acquire_lock (cond, private);

diff --git a/nptl/pthread_cond_common.c b/nptl/pthread_cond_common.c
index 3251c7f0ec..22270e8805 100644
--- a/nptl/pthread_cond_common.c
+++ b/nptl/pthread_cond_common.c
@@ -20,7 +20,7 @@ 
 #include <stdint.h>
 #include <pthread.h>

-/* We need 3 least-significant bits on __wrefs for something else.  */
+/* We need 3 least-significant bits on __flags for something else.  */
 #define __PTHREAD_COND_MAX_GROUP_SIZE ((unsigned) 1 << 29)

 #if __HAVE_64B_ATOMICS == 1
@@ -318,7 +318,7 @@  __condvar_set_orig_size (pthread_cond_t *cond, unsigned int size)
     atomic_store_relaxed (&cond->__data.__g1_orig_size, (size << 2) | 2);
 }

-/* Returns FUTEX_SHARED or FUTEX_PRIVATE based on the provided __wrefs
+/* Returns FUTEX_SHARED or FUTEX_PRIVATE based on the provided __flags
    value.  */
 static int __attribute__ ((unused))
 __condvar_get_private (int flags)
diff --git a/nptl/pthread_cond_destroy.c b/nptl/pthread_cond_destroy.c
index 31034905d1..63aafae156 100644
--- a/nptl/pthread_cond_destroy.c
+++ b/nptl/pthread_cond_destroy.c
@@ -37,24 +37,39 @@ 
    signal or broadcast calls.
    Thus, we can assume that all waiters that are still accessing the condvar
    have been woken.  We wait until they have confirmed to have woken up by
-   decrementing __wrefs.  */
+   decrementing __g_refs.  */
 int
 __pthread_cond_destroy (pthread_cond_t *cond)
 {
   LIBC_PROBE (cond_destroy, 1, cond);

-  /* Set the wake request flag.  We could also spin, but destruction that is
-     concurrent with still-active waiters is probably neither common nor
-     performance critical.  Acquire MO to synchronize with waiters confirming
-     that they finished.  */
-  unsigned int wrefs = atomic_fetch_or_acquire (&cond->__data.__wrefs, 4);
-  int private = __condvar_get_private (wrefs);
+  unsigned int flags = atomic_load_relaxed (&cond->__data.__flags);
+  int private = __condvar_get_private (flags);
+
+  for (unsigned g = 0; g < 2; ++g)
+    {
+      while (true)
+        {
+          /* Set the wake request flag.  We could also spin, but destruction
+             that is concurrent with still-active waiters is probably neither
+             common nor performance critical.  Acquire MO to synchronize with
+             waiters confirming that they finished.  */
+          unsigned r = atomic_fetch_or_acquire (cond->__data.__g_refs + g, 1) | 1;
+          if (r == 1)
+            break;
+          futex_wait_simple (cond->__data.__g_refs + g, r, private);
+        }
+    }
+
+   /* Same as above, except to synchronize with canceled threads.  This wake
+      flag never gets cleared, so it's enough to set it once.  */
+  unsigned int wrefs = atomic_fetch_or_acquire (&cond->__data.__flags, 4) | 4;
   while (wrefs >> 3 != 0)
     {
-      futex_wait_simple (&cond->__data.__wrefs, wrefs, private);
-      /* See above.  */
-      wrefs = atomic_load_acquire (&cond->__data.__wrefs);
+      futex_wait_simple (&cond->__data.__flags, wrefs, private);
+      wrefs = atomic_load_acquire (&cond->__data.__flags);
     }
+
   /* The memory the condvar occupies can now be reused.  */
   return 0;
 }
diff --git a/nptl/pthread_cond_init.c b/nptl/pthread_cond_init.c
index 595b1b3528..3031d52a42 100644
--- a/nptl/pthread_cond_init.c
+++ b/nptl/pthread_cond_init.c
@@ -37,13 +37,13 @@  __pthread_cond_init (pthread_cond_t *cond, const pthread_condattr_t *cond_attr)

   /* Iff not equal to ~0l, this is a PTHREAD_PROCESS_PRIVATE condvar.  */
   if (icond_attr != NULL && (icond_attr->value & 1) != 0)
-    cond->__data.__wrefs |= __PTHREAD_COND_SHARED_MASK;
+    cond->__data.__flags |= __PTHREAD_COND_SHARED_MASK;
   int clockid = (icond_attr != NULL
 		 ? ((icond_attr->value >> 1) & ((1 << COND_CLOCK_BITS) - 1))
 		 : CLOCK_REALTIME);
   /* If 0, CLOCK_REALTIME is used; CLOCK_MONOTONIC otherwise.  */
   if (clockid != CLOCK_REALTIME)
-    cond->__data.__wrefs |= __PTHREAD_COND_CLOCK_MONOTONIC_MASK;
+    cond->__data.__flags |= __PTHREAD_COND_CLOCK_MONOTONIC_MASK;

   LIBC_PROBE (cond_init, 2, cond, cond_attr);

diff --git a/nptl/pthread_cond_signal.c b/nptl/pthread_cond_signal.c
index 4281ad4d3b..a59151ce72 100644
--- a/nptl/pthread_cond_signal.c
+++ b/nptl/pthread_cond_signal.c
@@ -39,10 +39,13 @@  __pthread_cond_signal (pthread_cond_t *cond)
   /* First check whether there are waiters.  Relaxed MO is fine for that for
      the same reasons that relaxed MO is fine when observing __wseq (see
      below).  */
-  unsigned int wrefs = atomic_load_relaxed (&cond->__data.__wrefs);
-  if (wrefs >> 3 == 0)
+  unsigned int grefs0 = atomic_load_relaxed (cond->__data.__g_refs);
+  unsigned int grefs1 = atomic_load_relaxed (cond->__data.__g_refs + 1);
+  if ((grefs0 >> 1) == 0 && (grefs1 >> 1) == 0)
     return 0;
-  int private = __condvar_get_private (wrefs);
+
+  unsigned int flags = atomic_load_relaxed (&cond->__data.__flags);
+  int private = __condvar_get_private (flags);

   __condvar_acquire_lock (cond, private);

diff --git a/nptl/pthread_cond_wait.c b/nptl/pthread_cond_wait.c
index 02d11c61db..64ce8ea8ee 100644
--- a/nptl/pthread_cond_wait.c
+++ b/nptl/pthread_cond_wait.c
@@ -43,18 +43,6 @@  struct _condvar_cleanup_buffer
 };


-/* Decrease the waiter reference count.  */
-static void
-__condvar_confirm_wakeup (pthread_cond_t *cond, int private)
-{
-  /* If destruction is pending (i.e., the wake-request flag is nonzero) and we
-     are the last waiter (prior value of __wrefs was 1 << 3), then wake any
-     threads waiting in pthread_cond_destroy.  Release MO to synchronize with
-     these threads.  Don't bother clearing the wake-up request flag.  */
-  if ((atomic_fetch_add_release (&cond->__data.__wrefs, -8) >> 2) == 3)
-    futex_wake (&cond->__data.__wrefs, INT_MAX, private);
-}
-

 /* Cancel waiting after having registered as a waiter previously.  SEQ is our
    position and G is our group index.
@@ -81,7 +69,7 @@  __condvar_cancel_waiting (pthread_cond_t *cond, uint64_t seq, unsigned int g,
 {
   bool consumed_signal = false;

-  /* No deadlock with group switching is possible here because we have do
+  /* No deadlock with group switching is possible here because we do
      not hold a reference on the group.  */
   __condvar_acquire_lock (cond, private);

@@ -146,7 +134,7 @@  __condvar_cancel_waiting (pthread_cond_t *cond, uint64_t seq, unsigned int g,

 /* Wake up any signalers that might be waiting.  */
 static void
-__condvar_dec_grefs (pthread_cond_t *cond, unsigned int g, int private)
+__condvar_confirm_wakeup (pthread_cond_t *cond, unsigned int g, int private)
 {
   /* Release MO to synchronize-with the acquire load in
      __condvar_quiesce_and_switch_g1.  */
@@ -172,7 +160,16 @@  __condvar_cleanup_waiting (void *arg)
   pthread_cond_t *cond = cbuffer->cond;
   unsigned g = cbuffer->wseq & 1;

-  __condvar_dec_grefs (cond, g, cbuffer->private);
+  /* Normally we are not allowed to touch cond any more after calling
+     confirm_wakeup, because pthread_cond_destroy looks at __g_refs to
+     determine when all waiters have woken.  Since we will do more work in
+     this function, we use an extra channel to tell pthread_cond_destroy
+     that it is not allowed to finish yet: we increment the cleanup count
+     stored in bits 3 and above of __flags.  Relaxed MO is enough because
+     confirm_wakeup's release-MO decrement of __g_refs orders it.  */
+  atomic_fetch_add_relaxed (&cond->__data.__flags, 8);
+
+  __condvar_confirm_wakeup (cond, g, cbuffer->private);

   __condvar_cancel_waiting (cond, cbuffer->wseq >> 1, g, cbuffer->private);
   /* FIXME With the current cancellation implementation, it is possible that
@@ -183,7 +180,12 @@  __condvar_cleanup_waiting (void *arg)
      conservatively.  */
   futex_wake (cond->__data.__g_signals + g, 1, cbuffer->private);

-  __condvar_confirm_wakeup (cond, cbuffer->private);
+  /* If destruction is pending (i.e., the wake-request flag is nonzero) and we
+     are the last cleanup handler (the prior count in bits 3 and above was 1),
+     then wake any threads waiting in pthread_cond_destroy.  Release MO to
+     synchronize with these threads.  Don't bother clearing the flag.  */
+  if ((atomic_fetch_add_release (&cond->__data.__flags, -8) >> 2) == 3)
+    futex_wake (&cond->__data.__flags, INT_MAX, cbuffer->private);

   /* XXX If locking the mutex fails, should we just stop execution?  This
      might be better than silently ignoring the error.  */
@@ -287,13 +289,14 @@  __condvar_cleanup_waiting (void *arg)
    __g1_orig_size: Initial size of G1
      * The two least-significant bits represent the condvar-internal lock.
      * Only accessed while having acquired the condvar-internal lock.
-   __wrefs: Waiter reference counter.
+   __flags: Flags and a reference count used during cancellation cleanup.
+     * Bits 3 and above are a count of canceled threads still running their
+       cleanup handler; pthread_cond_destroy waits for this to reach 0.
      * Bit 2 is true if waiters should run futex_wake when they remove the
        last reference.  pthread_cond_destroy uses this as futex word.
      * Bit 1 is the clock ID (0 == CLOCK_REALTIME, 1 == CLOCK_MONOTONIC).
      * Bit 0 is true iff this is a process-shared condvar.
-     * Simple reference count used by both waiters and pthread_cond_destroy.
-     (If the format of __wrefs is changed, update nptl_lock_constants.pysym
+     (If the format of __flags is changed, update nptl_lock_constants.pysym
       and the pretty printers.)
    For each of the two groups, we have:
    __g_refs: Futex waiter reference count.
@@ -301,6 +304,8 @@  __condvar_cleanup_waiting (void *arg)
        last reference.
      * Reference count used by waiters concurrently with signalers that have
        acquired the condvar-internal lock.
+     (If the format of __g_refs is changed, update nptl_lock_constants.pysym
+      and the pretty printers.)
    __g_signals: The number of signals that can still be consumed.
      * Used as a futex word by waiters.  Used concurrently by waiters and
        signalers.
@@ -407,7 +412,8 @@  __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,

-  /* Increase the waiter reference count.  Relaxed MO is sufficient because
-     we only need to synchronize when decrementing the reference count.  */
-  unsigned int flags = atomic_fetch_add_relaxed (&cond->__data.__wrefs, 8);
+  /* Increase the waiter reference count for our group; pthread_cond_destroy
+     and __condvar_quiesce_and_switch_g1 wait for this count to drain.  */
+  atomic_fetch_add_acquire (cond->__data.__g_refs + g, 2);
+  unsigned int flags = atomic_load_relaxed (&cond->__data.__flags);
   int private = __condvar_get_private (flags);

   /* Now that we are registered as a waiter, we can release the mutex.
@@ -421,7 +427,7 @@  __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
   if (__glibc_unlikely (err != 0))
     {
       __condvar_cancel_waiting (cond, seq, g, private);
-      __condvar_confirm_wakeup (cond, private);
+      __condvar_confirm_wakeup (cond, g, private);
       return err;
     }

@@ -482,13 +488,9 @@  __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
 	     release MO when decrementing the reference count because we use
 	     an atomic read-modify-write operation and thus extend the release
 	     sequence.  */
-	  atomic_fetch_add_acquire (cond->__data.__g_refs + g, 2);
 	  if (((atomic_load_acquire (cond->__data.__g_signals + g) & 1) != 0)
 	      || (seq < (__condvar_load_g1_start_relaxed (cond) >> 1)))
 	    {
-	      /* Our group is closed.  Wake up any signalers that might be
-		 waiting.  */
-	      __condvar_dec_grefs (cond, g, private);
 	      goto done;
 	    }

@@ -508,7 +510,6 @@  __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,

 	  if (__glibc_unlikely (err == ETIMEDOUT || err == EOVERFLOW))
 	    {
-	      __condvar_dec_grefs (cond, g, private);
 	      /* If we timed out, we effectively cancel waiting.  Note that
 		 we have decremented __g_refs before cancellation, so that a
 		 deadlock between waiting for quiescence of our group in
@@ -518,8 +519,6 @@  __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
 	      result = err;
 	      goto done;
 	    }
-	  else
-	    __condvar_dec_grefs (cond, g, private);

 	  /* Reload signals.  See above for MO.  */
 	  signals = atomic_load_acquire (cond->__data.__g_signals + g);
@@ -533,75 +532,12 @@  __pthread_cond_wait_common (pthread_cond_t *cond, pthread_mutex_t *mutex,
   while (!atomic_compare_exchange_weak_acquire (cond->__data.__g_signals + g,
 						&signals, signals - 2));

-  /* We consumed a signal but we could have consumed from a more recent group
-     that aliased with ours due to being in the same group slot.  If this
-     might be the case our group must be closed as visible through
-     __g1_start.  */
-  uint64_t g1_start = __condvar_load_g1_start_relaxed (cond);
-  if (seq < (g1_start >> 1))
-    {
-      /* We potentially stole a signal from a more recent group but we do not
-	 know which group we really consumed from.
-	 We do not care about groups older than current G1 because they are
-	 closed; we could have stolen from these, but then we just add a
-	 spurious wake-up for the current groups.
-	 We will never steal a signal from current G2 that was really intended
-	 for G2 because G2 never receives signals (until it becomes G1).  We
-	 could have stolen a signal from G2 that was conservatively added by a
-	 previous waiter that also thought it stole a signal -- but given that
-	 that signal was added unnecessarily, it's not a problem if we steal
-	 it.
-	 Thus, the remaining case is that we could have stolen from the current
-	 G1, where "current" means the __g1_start value we observed.  However,
-	 if the current G1 does not have the same slot index as we do, we did
-	 not steal from it and do not need to undo that.  This is the reason
-	 for putting a bit with G2's index into__g1_start as well.  */
-      if (((g1_start & 1) ^ 1) == g)
-	{
-	  /* We have to conservatively undo our potential mistake of stealing
-	     a signal.  We can stop trying to do that when the current G1
-	     changes because other spinning waiters will notice this too and
-	     __condvar_quiesce_and_switch_g1 has checked that there are no
-	     futex waiters anymore before switching G1.
-	     Relaxed MO is fine for the __g1_start load because we need to
-	     merely be able to observe this fact and not have to observe
-	     something else as well.
-	     ??? Would it help to spin for a little while to see whether the
-	     current G1 gets closed?  This might be worthwhile if the group is
-	     small or close to being closed.  */
-	  unsigned int s = atomic_load_relaxed (cond->__data.__g_signals + g);
-	  while (__condvar_load_g1_start_relaxed (cond) == g1_start)
-	    {
-	      /* Try to add a signal.  We don't need to acquire the lock
-		 because at worst we can cause a spurious wake-up.  If the
-		 group is in the process of being closed (LSB is true), this
-		 has an effect similar to us adding a signal.  */
-	      if (((s & 1) != 0)
-		  || atomic_compare_exchange_weak_relaxed
-		       (cond->__data.__g_signals + g, &s, s + 2))
-		{
-		  /* If we added a signal, we also need to add a wake-up on
-		     the futex.  We also need to do that if we skipped adding
-		     a signal because the group is being closed because
-		     while __condvar_quiesce_and_switch_g1 could have closed
-		     the group, it might stil be waiting for futex waiters to
-		     leave (and one of those waiters might be the one we stole
-		     the signal from, which cause it to block using the
-		     futex).  */
-		  futex_wake (cond->__data.__g_signals + g, 1, private);
-		  break;
-		}
-	      /* TODO Back off.  */
-	    }
-	}
-    }
-
  done:

   /* Confirm that we have been woken.  We do that before acquiring the mutex
      to allow for execution of pthread_cond_destroy while having acquired the
      mutex.  */
-  __condvar_confirm_wakeup (cond, private);
+  __condvar_confirm_wakeup (cond, g, private);

   /* Woken up; now re-acquire the mutex.  If this doesn't fail, return RESULT,
      which is set to ETIMEDOUT if a timeout occured, or zero otherwise.  */
@@ -629,9 +565,7 @@  __pthread_cond_timedwait64 (pthread_cond_t *cond, pthread_mutex_t *mutex,
   if (! valid_nanoseconds (abstime->tv_nsec))
     return EINVAL;

-  /* Relaxed MO is suffice because clock ID bit is only modified
-     in condition creation.  */
-  unsigned int flags = atomic_load_relaxed (&cond->__data.__wrefs);
+  unsigned int flags = atomic_load_relaxed (&cond->__data.__flags);
   clockid_t clockid = (flags & __PTHREAD_COND_CLOCK_MONOTONIC_MASK)
                     ? CLOCK_MONOTONIC : CLOCK_REALTIME;
   return __pthread_cond_wait_common (cond, mutex, clockid, abstime);
diff --git a/nptl/test-cond-printers.c b/nptl/test-cond-printers.c
index 4b6db831f9..6117bc5e1a 100644
--- a/nptl/test-cond-printers.c
+++ b/nptl/test-cond-printers.c
@@ -26,7 +26,14 @@ 
 #define PASS 0
 #define FAIL 1

-static int test_status_destroyed (pthread_cond_t *condvar);
+static int test_status (pthread_cond_t *condvar);
+
+typedef struct
+{
+  pthread_mutex_t *mutex;
+  pthread_cond_t *condvar;
+  int *wait_thread_asleep;
+} test_state;

 int
 main (void)
@@ -36,22 +43,56 @@  main (void)
   int result = FAIL;

   if (pthread_condattr_init (&attr) == 0
-      && test_status_destroyed (&condvar) == PASS)
+      && test_status (&condvar) == PASS)
     result = PASS;
   /* Else, one of the pthread_cond* functions failed.  */

   return result;
 }

-/* Initializes CONDVAR, then destroys it.  */
+static void *
+wait (void *arg)
+{
+  test_state *state = (test_state *) arg;
+  void *result = PASS;
+  if (pthread_mutex_lock (state->mutex) != 0)
+    result = (void *) FAIL;
+  *state->wait_thread_asleep = 1;
+  if (pthread_cond_signal (state->condvar) != 0)
+    result = (void *) FAIL;
+  if (pthread_cond_wait (state->condvar, state->mutex) != 0)
+    result = (void *) FAIL;
+  if (pthread_mutex_unlock (state->mutex) != 0)
+    result = (void *) FAIL;
+  return result;
+}
+
 static int
-test_status_destroyed (pthread_cond_t *condvar)
+test_status (pthread_cond_t *condvar)
 {
-  int result = FAIL;
+  int result = PASS;

-  if (pthread_cond_init (condvar, NULL) == 0
-      && pthread_cond_destroy (condvar) == 0)
-    result = PASS; /* Test status (destroyed).  */
+  pthread_mutex_t mutex;
+  result |= pthread_mutex_init (&mutex, NULL);
+  result |= pthread_cond_init (condvar, NULL);
+  int wait_thread_asleep = 0;
+  test_state state = { &mutex, condvar, &wait_thread_asleep };
+  result |= pthread_mutex_lock (&mutex);
+  pthread_t thread;
+  result |= pthread_create (&thread, NULL, wait, &state);
+  while (!wait_thread_asleep)
+    {
+      result |= pthread_cond_wait (condvar, &mutex);
+    }
+  result |= pthread_cond_signal (condvar); /* Test about to signal */
+  result |= pthread_mutex_unlock (&mutex);
+  result |= pthread_cond_destroy (condvar);
+  void *retval = NULL;
+  result |= pthread_join (thread, &retval);  /* Test status (destroyed).  */
+  result |= pthread_mutex_destroy (&mutex);
+  result = result ? FAIL : PASS;
+  if (retval != NULL)
+    result = FAIL;

   return result;
 }
diff --git a/nptl/test-cond-printers.py b/nptl/test-cond-printers.py
index 38e2da4269..19d9e2cac8 100644
--- a/nptl/test-cond-printers.py
+++ b/nptl/test-cond-printers.py
@@ -33,6 +33,10 @@  try:
     var = 'condvar'
     to_string = 'pthread_cond_t'

+    break_at(test_source, 'Test about to signal')
+    continue_cmd() # Go to test_status
+    test_printer(var, to_string, {'Threads known to still execute a wait function': '1'})
+
     break_at(test_source, 'Test status (destroyed)')
     continue_cmd() # Go to test_status_destroyed
     test_printer(var, to_string, {'Threads known to still execute a wait function': '0'})
diff --git a/nptl/tst-cond22.c b/nptl/tst-cond22.c
index 64f19ea0a5..e1338ebf94 100644
--- a/nptl/tst-cond22.c
+++ b/nptl/tst-cond22.c
@@ -110,7 +110,7 @@  do_test (void)
 	  c.__data.__wseq, c.__data.__g1_start,
 	  c.__data.__g_signals[0], c.__data.__g_refs[0], c.__data.__g_size[0],
 	  c.__data.__g_signals[1], c.__data.__g_refs[1], c.__data.__g_size[1],
-	  c.__data.__g1_orig_size, c.__data.__wrefs);
+	  c.__data.__g1_orig_size, c.__data.__flags);

   if (pthread_create (&th, NULL, tf, (void *) 1l) != 0)
     {
@@ -153,7 +153,7 @@  do_test (void)
 	  c.__data.__wseq, c.__data.__g1_start,
 	  c.__data.__g_signals[0], c.__data.__g_refs[0], c.__data.__g_size[0],
 	  c.__data.__g_signals[1], c.__data.__g_refs[1], c.__data.__g_size[1],
-	  c.__data.__g1_orig_size, c.__data.__wrefs);
+	  c.__data.__g1_orig_size, c.__data.__flags);

   return status;
 }
diff --git a/sysdeps/nptl/bits/thread-shared-types.h b/sysdeps/nptl/bits/thread-shared-types.h
index fbbdd0bb36..84cedfcaa0 100644
--- a/sysdeps/nptl/bits/thread-shared-types.h
+++ b/sysdeps/nptl/bits/thread-shared-types.h
@@ -112,7 +112,7 @@  struct __pthread_cond_s
   unsigned int __g_refs[2] __LOCK_ALIGNMENT;
   unsigned int __g_size[2];
   unsigned int __g1_orig_size;
-  unsigned int __wrefs;
+  unsigned int __flags;
   unsigned int __g_signals[2];
 };