[1/2] nptl: Simplify condition variables to fix pthread_cond_signal (BZ 25847)

From: Malte Skarupke <malteskarupke@fastmail.fm>

Bug BZ 25847 proved hard to fix, mainly because the condition variable
implementation is hard to reason about. Here is an attempt at describing
the problem:

Problem 1:
There used to be a problem where pthread_cond_signal could wake up future
waiters. E.g. if we are certain that these happened in order:
1. Thread A calls pthread_cond_wait
2. Thread B calls pthread_cond_signal
3. Thread C calls pthread_cond_wait
then it used to be possible that thread C gets woken up and thread A remains
asleep even if we are certain that thread C went to sleep after thread B
signaled.

Solution 1:
As a fix there are now two groups, g1 and g2. Threads always go to sleep in
g2, and signals always signal to g1. When no threads remain in g1, the two
groups switch. This prevents the above problem: Thread C can't be in the same
group that thread B signaled in. (whichever group it signals in is not going
to be g2 afterwards, so thread C goes to sleep in a different group)

Problem 2:
The problem with this is that if a sleeper takes a while to wake up, we can
run into an ABA problem if the group has switched twice. If we were also
currently trying to wake up a sleeper in the newer version of g1, the old
sleeper can cut in and steal the signal. The thread that was supposed to wake
up then goes back to sleep.

Solution 2:
The code tried to detect this and fix the problem by adding an extra signal.

Problem 3:
That extra signal can lead to more problems in complex interleavings, which
causes the current bug, BZ 25847.

New solution:
All this context is here to explain that there are several different ways of
attacking this. We could pop the stack down to any layer of this explanation:
If we could make Problem 1 go away, we wouldn't need to worry
about Problem 2 or 3.

In this patch I choose to pop the stack down to Problem 2: Make the ABA
problem harmless. You may have noticed that I said "the thread that was
supposed to wake up goes back to sleep". This is rather silly. It receives a
FUTEX_WAKE, then notices that there are 0 available signals and decides to
call FUTEX_WAIT again. Why? Spurious wakes are allowed. If we just allow that
thread to leave pthread_cond_wait, the bug goes away. (In fact, if there was
signal stealing, it wasn't even a spurious wake: both threads were supposed
to wake up.)

So in this patch I just allow the thread to stay awake. If a thread has been
woken with FUTEX_WAKE, it will leave pthread_cond_wait. This means if we
spuriously woke up from the futex sleep, the user will notice, but that's
allowed.
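
Spurious wakes are harmless as long as callers follow the usual rule of
re-checking their predicate in a loop. A minimal sketch of that caller-side
pattern follows. The struct mailbox type and the mailbox_* helpers are
hypothetical names made up for illustration, they are not part of this patch:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared state for illustration only.  */
struct mailbox
{
  pthread_mutex_t lock;
  pthread_cond_t cond;
  bool ready;   /* The predicate that waiters re-check.  */
  int value;
};

static void
mailbox_init (struct mailbox *m)
{
  pthread_mutex_init (&m->lock, NULL);
  pthread_cond_init (&m->cond, NULL);
  m->ready = false;
  m->value = 0;
}

static void
mailbox_put (struct mailbox *m, int v)
{
  pthread_mutex_lock (&m->lock);
  m->value = v;
  m->ready = true;
  pthread_mutex_unlock (&m->lock);
  pthread_cond_signal (&m->cond);
}

static int
mailbox_take (struct mailbox *m)
{
  pthread_mutex_lock (&m->lock);
  /* A return from pthread_cond_wait does not imply that the predicate
     holds, so loop until it does.  A spurious wake costs one extra
     iteration and nothing else.  */
  while (!m->ready)
    pthread_cond_wait (&m->cond, &m->lock);
  assert (m->ready);
  m->ready = false;
  int v = m->value;
  pthread_mutex_unlock (&m->lock);
  return v;
}
```

Code written this way behaves the same whether or not the implementation
puts spuriously woken threads back to sleep.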

After that change our group counts will be wrong if we ran into the ABA
problem from Problem 2 above. So my solution is to not keep group counts.
But how do we know when to switch groups? FUTEX_WAKE can tell us if it woke
someone. If it didn't, that means we're on an empty group and can switch.

This means that on a group switch there will be two calls to FUTEX_WAKE: One
call that does nothing, and one call that actually wakes someone. We still
have the __wrefs count to allow us to early-out if nobody is sleeping in g2,
or in either group.
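
To illustrate the switching rule, here is a single-threaded toy model. It is
a sketch under stated assumptions, not the code in this patch: the names
model_condvar, model_wait, model_signal and model_futex_wake_one are made up,
the sleepers array stands in for the kernel's futex wait queues (the real
condvar would not store these counts), and the early-out only covers the
"nobody in either group" case.

```c
#include <assert.h>

/* Toy model.  sleepers[i] plays the role of the kernel's wait queue for
   group i; it is not state that the real condvar would keep.  */
struct model_condvar
{
  unsigned int sleepers[2];
  unsigned int g1;      /* Index of the group that signals go to.  */
  unsigned int wrefs;   /* Threads currently inside the wait.  */
};

/* Stand-in for FUTEX_WAKE with a count of 1.  Returns how many
   sleepers were woken (0 or 1).  */
static unsigned int
model_futex_wake_one (struct model_condvar *cv, unsigned int group)
{
  if (cv->sleepers[group] == 0)
    return 0;
  cv->sleepers[group]--;
  cv->wrefs--;
  return 1;
}

/* Waiters always go to sleep in g2, the group that is not g1.  */
static void
model_wait (struct model_condvar *cv)
{
  cv->wrefs++;
  cv->sleepers[cv->g1 ^ 1]++;
}

/* Returns the number of wake calls that were issued.  */
static unsigned int
model_signal (struct model_condvar *cv)
{
  if (cv->wrefs == 0)
    return 0;   /* Early-out: nobody is sleeping in either group.  */
  if (model_futex_wake_one (cv, cv->g1) == 1)
    return 1;   /* g1 still had a sleeper.  */
  /* The wake did nothing, so g1 is empty: switch groups and retry.  */
  cv->g1 ^= 1;
  unsigned int woken = model_futex_wake_one (cv, cv->g1);
  /* In this model wrefs > 0 and an empty old g1 imply a sleeper here.  */
  assert (woken == 1);
  return 2;
}
```

Replaying Problem 1 in this model: thread A waits and lands in g2, thread B's
signal issues one empty wake, switches groups and wakes A, and thread C then
goes to sleep in the other group, so it cannot take B's signal.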

The result is a much simpler implementation of condition variables that still
solves Problem 1, above.

Benefits:
1. Much easier to reason about. All the pthread_cond_* functions have very
straightforward control flow now. Just try to reason through all the edge
cases in __condvar_cancel_waiting (now deleted). The old code just looked
like code that has more bugs hiding in it.
2. Tested for correctness in TLA+. (The usual caveats apply: bugs may have
snuck in during translation back to C, so this still needs review.)
3. Less memory usage. For now I have reserved the same number of bytes
though, to allow future growth without breaking binary compatibility.

Downsides:
1. More spurious wakes. This code makes no attempt to put threads back to
sleep that were woken spuriously.
2. Smaller tolerance for ABA problems: If there were (1 << 30) calls to
pthread_cond_wait in the time that it takes one thread to go to sleep, that
thread will go to sleep even though it shouldn't have. I think this is fine
because it seems very unlikely to get a billion calls.
Note that calls to pthread_cond_signal are not enough. You need waiters to
cause the problem. Meaning if you don't have a billion threads, you need a
lot of calls to pthread_cond_wait and matching wakeups in the time that it
takes one thread to go to sleep.
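
To put a number on that: (1 << 30) is 1,073,741,824. The sketch below shows
the general shape of such a wraparound. It assumes, purely for illustration,
that the tolerance comes from a 30-bit counter which a thread snapshots and
later compares. The counter, its width and the model_* names are assumptions
and do not describe the actual fields in this patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed width of a hypothetical wrapping counter.  */
#define MODEL_COUNTER_BITS 30u
#define MODEL_COUNTER_MASK ((UINT32_C (1) << MODEL_COUNTER_BITS) - 1u)

/* Advance the counter by STEPS, wrapping at 30 bits.  */
static uint32_t
model_advance (uint32_t counter, uint64_t steps)
{
  assert (counter <= MODEL_COUNTER_MASK);
  return (uint32_t) ((counter + steps) & MODEL_COUNTER_MASK);
}

/* A thread that compares an old snapshot against the current value
   cannot tell "nothing happened" from "exactly 2^30 steps happened".  */
static bool
model_looks_unchanged (uint32_t snapshot, uint32_t current)
{
  return snapshot == current;
}
```

Any number of steps that is not a multiple of (1 << 30) is detected, which is
why the failure needs about a billion waits inside a single thread's window.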

Neutral:
1. Same speed. As a benchmark I ran pthread_cond_repro.c from BZ 25847 and
this new implementation seems to be 0.5% faster, which might just be noisy
measurements.

Overall I think the benefits outweigh the downsides. Now that the code is
easier to reason about, we could add some complexity back in to make the
downsides less bad. But I don't think they are a big concern.
---
 nptl/pthread_cond_broadcast.c           |  54 +--
 nptl/pthread_cond_common.c              | 259 +-------------
 nptl/pthread_cond_signal.c              |  32 +-
 nptl/pthread_cond_wait.c                | 450 +++---------------------
 nptl/tst-cond22.c                       |  30 +-
 sysdeps/nptl/bits/thread-shared-types.h |  15 +-
 sysdeps/nptl/futex-internal.h           |   8 +-
 sysdeps/nptl/lowlevellock-futex.h       |  19 +-
 sysdeps/nptl/pthread.h                  |   2 +-
 9 files changed, 138 insertions(+), 731 deletions(-)

Message ID	20221015195305.1322087-1-malteskarupke@fastmail.fm
State	Superseded
Date	Sat, 15 Oct 2022 15:53:04 -0400
To	libc-alpha@sourceware.org
Series	[1/2] nptl: Simplify condition variables to fix pthread_cond_signal (BZ 25847); [2/2] nptl: Simplifying condvar-internal mutex
