[v4,3/9] nptl: Remove unnecessary catch-all-wake in condvar group switch

Message ID 20230509175558.10014-4-malteskarupke@fastmail.fm
State Superseded
Series Patch to fix glibc condition variable bug (BZ 25847)

Checks

Context                Check    Description
dj/TryBot-apply_patch  success  Patch applied to master at the time it was sent

Commit Message

develop--- via Libc-alpha May 9, 2023, 5:55 p.m. UTC
  From: Malte Skarupke <malteskarupke@fastmail.fm>

This wake is unnecessary. We only switch groups after every sleeper in a group
has been woken. They may take a while to actually wake up, and may still hold
a reference, but waking them a second time doesn't speed that up. Instead,
it just makes the code more complicated and may hide problems.
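
For illustration only, here is a minimal standalone sketch (not glibc code;
the futex word and the value passed in are made up) of the close-out-and-wake
pattern that the removed block performed: store a value large enough that
late waiters won't block, then wake everyone already blocked.

  #include <linux/futex.h>
  #include <limits.h>
  #include <stdatomic.h>
  #include <sys/syscall.h>
  #include <unistd.h>

  static _Atomic unsigned int futex_word;

  static void
  close_out_and_wake (unsigned int lowseq)
  {
    /* Make the futex word large enough that any waiter re-checking it
       sees a signal available and does not block.  */
    atomic_store_explicit (&futex_word, lowseq, memory_order_release);
    /* Wake every waiter already blocked on the word.  */
    syscall (SYS_futex, &futex_word, FUTEX_WAKE, INT_MAX, NULL, NULL, 0);
  }

  int
  main (void)
  {
    close_out_and_wake (14);  /* no waiters here, so this is a no-op */
    return 0;
  }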

In particular, this safety wake wouldn't even have helped with the bug that
was fixed by Barrus' patch: the bug there was that pthread_cond_signal would
not switch g1 when it should have, so this code path wouldn't even have been
entered.

Signed-off-by: Malte Skarupke <malteskarupke@fastmail.fm>
---
 nptl/pthread_cond_common.c | 31 +------------------------------
 1 file changed, 1 insertion(+), 30 deletions(-)
  

Patch

diff --git a/nptl/pthread_cond_common.c b/nptl/pthread_cond_common.c
index 350a16fab2..f976a533a1 100644
--- a/nptl/pthread_cond_common.c
+++ b/nptl/pthread_cond_common.c
@@ -221,13 +221,7 @@  __condvar_quiesce_and_switch_g1 (pthread_cond_t *cond, uint64_t wseq,
      * New waiters arriving concurrently with the group switching will all go
        into G2 until we atomically make the switch.  Waiters existing in G2
        are not affected.
-     * Waiters in G1 have already received a signal and been woken. If they
-       haven't woken yet, they will be closed out immediately by the advancing
-       of __g_signals to the next "lowseq" (low 31 bits of the new g1_start),
-       which will prevent waiters from blocking using a futex on
-       __g_signals since it provides enough signals for all possible
-       remaining waiters.  As a result, they can each consume a signal
-       and they will eventually remove their group reference.  */
+     * Waiters in G1 have already received a signal and been woken.  */
 
   /* Update __g1_start, which finishes closing this group.  The value we add
      will never be negative because old_orig_size can only be zero when we
@@ -240,29 +234,6 @@  __condvar_quiesce_and_switch_g1 (pthread_cond_t *cond, uint64_t wseq,
 
   unsigned int lowseq = ((old_g1_start + old_orig_size) << 1) & ~1U;
 
-  /* If any waiters still hold group references (and thus could be blocked),
-     then wake them all up now and prevent any running ones from blocking.
-     This is effectively a catch-all for any possible current or future
-     bugs that can allow the group size to reach 0 before all G1 waiters
-     have been awakened or at least given signals to consume, or any
-     other case that can leave blocked (or about to block) older waiters..  */
-  if ((atomic_fetch_or_release (cond->__data.__g_refs + g1, 0) >> 1) > 0)
-   {
-    /* First advance signals to the end of the group (i.e. enough signals
-       for the entire G1 group) to ensure that waiters which have not
-       yet blocked in the futex will not block.
-       Note that in the vast majority of cases, this should never
-       actually be necessary, since __g_signals will have enough
-       signals for the remaining g_refs waiters.  As an optimization,
-       we could check this first before proceeding, although that
-       could still leave the potential for futex lost wakeup bugs
-       if the signal count was non-zero but the futex wakeup
-       was somehow lost.  */
-    atomic_store_release (cond->__data.__g_signals + g1, lowseq);
-
-    futex_wake (cond->__data.__g_signals + g1, INT_MAX, private);
-   }
-
   /* At this point, the old G1 is now a valid new G2 (but not in use yet).
      No old waiter can neither grab a signal nor acquire a reference without
      noticing that __g1_start is larger.
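
To make the lowseq arithmetic in the retained line above concrete, here is a
toy computation (hypothetical values, not glibc code; it assumes the series'
convention that the lowest bit of __g_signals is reserved as a flag bit,
which is why the value is shifted left and the low bit kept clear):

  #include <stdint.h>
  #include <stdio.h>

  int
  main (void)
  {
    uint64_t old_g1_start = 4;       /* hypothetical group start */
    unsigned int old_orig_size = 3;  /* hypothetical original group size */

    /* New g1_start is old_g1_start + old_orig_size; shift left one for
       the flag bit and clear that bit with ~1U.  */
    unsigned int lowseq = ((old_g1_start + old_orig_size) << 1) & ~1U;

    printf ("lowseq = %u\n", lowseq);  /* prints 14, i.e. 7 << 1 */
    return 0;
  }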