Patchwork [PATCHv2] nptl: Fix pthread_rwlock_try*lock stalls [BZ #23844]

login
register
mail settings
Submitter Carlos O'Donell
Date Jan. 22, 2019, 6:12 p.m.
Message ID <390f7db7-8e72-e718-f0eb-673eb91199c0@redhat.com>
Download mbox | patch
Permalink /patch/31162/
State New
Headers show

Comments

Carlos O'Donell - Jan. 22, 2019, 6:12 p.m.
On 1/22/19 7:10 AM, Florian Weimer wrote:
> * Carlos O'Donell:
> 
>> +/* For a full analysis see comment:
>> +   https://sourceware.org/bugzilla/show_bug.cgi?id=23844#c14
> 
> In the past, we had issues with bug trackers going away and such
> references becoming stale.

Fixed.

Added text to first test.

>> +
>> +   The stall is caused by pthread_rwlock_tryrdlock failing to check
>> +   that PTHREAD_RWLOCK_FUTEX_USED is set in the __wrphase_futex futex
>> +   and then waking the futex.
>> +
>> +   The fix for bug 23844 ensures that waiters on __wrphase_futex are
>> +   correctly woken.  Before the fix the test stalls as readers can
>> +   wait forever on __wrphase_futex.  */
>> +
>> +#include <stdio.h>
>> +#include <stdlib.h>
>> +#include <unistd.h>
>> +#include <pthread.h>
>> +#include <support/xthread.h>
>> +#include <errno.h>
>> +
>> +/* We need only one lock to reproduce the issue. We will need multiple
>> +   threads to get the exact case where we have a read, try, and unlock
>> +   all interleaving to produce the case where the readers are waiting
>> +   and the try fails to wake them.  */
>> +pthread_rwlock_t onelock;
>> +
>> +/* The number of threads is arbitrary but empirically chosen to have
>> +   enough threads that we see the condition where waiting readers are
>> +   not woken by a successful tryrdlock.  */
>> +#define NTHREADS 32
>> +
>> +_Atomic int do_exit;
>> +
>> +void *
>> +run_loop (void *arg)
>> +{
>> +  int i = 0, ret, rw;
>> +  while (!do_exit)
>> +    {
>> +      /* Arbitrarily choose if we are the writer or reader.  Choose a
>> +	 high enough ratio of readers to writers to make it likely
>> +	 that readers block (and eventually are susceptable to
>> +	 stalling).  */
>> +      rw = i % 8;
>> +      /* If we are a writer, take the write lock, and then unlock.
>> +	 If we are a reader, try the lock, then lock, then unlock.  */
>> +      if (rw)
> 
> rw != 0 (or (i % 8) != 0).  Note sure if the variable actually increases
> clarity here.

Fixed.

I'll remove rw and avoid boolean coercion.

>> +  /* Run for some amount of time.  Empirically speaking exercising
>> +     the stall via pthread_rwlock_tryrdlock is much harder, and on
>> +     a 3.5GHz 4 core x86_64 VM system it takes somewhere around
>> +     20-200s to stall, approaching 100% stall past 200s.  We can't
>> +     wait that long for a regression test so we just test for 20s,
>> +     and expect the stall to happen with a 5-10% chance (enough for
>> +     developers to see).  */
>> +  sleep (20);
> 
> Use nanosleep or usleep to avoid signals.

Not fixed.

I'm going to leave this as sleep for now.

I'm going to change this to xnanosleep after release.

I'll add a support/xnanosleep.c with a nanosleep that checks for errors.

I will also fixup the new tst-rwloc-pwn.c test too which uses sleep.

>> +  /* Then exit.  */
>> +  printf ("INFO: Exiting...");
>> +  fflush (stdout);
> 
> Why no trailing \n?  It will just result in garbled output if there is a
> problem.

Fixed.

Because I wanted a nice "Exiting...done" message. It matches the "done."
later in the output which has a \n.

I'll fix this up though, you raise a good design point, that all messages
should immediately get output and be seen, rather than delay them. It's
simpler to just use:

printf ("INFO: Exiting...\n");
...
printf ("INFO: Done.\n");

>> +  do_exit = 1;
>> +  /* If any readers stalled then we will timeout waiting for them.  */
>> +  for (i = 0; i < NTHREADS; i++)
>> +    xpthread_join (tids[i]);
>> +  printf ("done.\n");
>> +  xpthread_rwlock_destroy (&onelock);
>> +  printf ("PASS: No pthread_rwlock_tryrdlock stalls detected.\n");
>> +  return 0;
>> +}
>> +
>> +#define TIMEOUT (30)
> 
> Parentheses are unnecessary.

Fixed.

> Most of these comments also apply to ther other test.

Fixed in both places.

> I have not reviewed the code.
> 
> Thanks,
> Florian
> 

v2
- Added analysis comment to tryrdlock test.
- Removed rw var and avoided boolean coercion.
- Avoided incomplete test output to stdout.

From 0768c51e2930f922071b4b77f96bb15d1ddae004 Mon Sep 17 00:00:00 2001
From: Carlos O'Donell <carlos@redhat.com>
Date: Mon, 21 Jan 2019 22:50:12 -0500
Subject: [PATCH] nptl: Fix pthread_rwlock_try*lock stalls (Bug 23844)

For a full analysis of both the pthread_rwlock_tryrdlock() stall
and the pthread_rwlock_trywrlock() stall see:
https://sourceware.org/bugzilla/show_bug.cgi?id=23844#c14

In the pthread_rwlock_trydlock() function we fail to inspect for
PTHREAD_RWLOCK_FUTEX_USED in __wrphase_futex and wake the waiting
readers.

In the pthread_rwlock_trywrlock() function we write 1 to
__wrphase_futex and loose the setting of the PTHREAD_RWLOCK_FUTEX_USED
bit, again failing to wake waiting readers during unlock.

The fix in the case of pthread_rwlock_trydlock() is to check for
PTHREAD_RWLOCK_FUTEX_USED and wake the readers.

The fix in the case of pthread_rwlock_trywrlock() is to only write
1 to __wrphase_futex if we installed the write phase, since all other
readers would be spinning waiting for this step.

We add two new tests, one exercises the stall for
pthread_rwlock_trywrlock() which is easy to exercise, and one exercises
the stall for pthread_rwlock_trydlock() which is harder to exercise.

The pthread_rwlock_trywrlock() test fails consistently without the fix,
and passes after. The pthread_rwlock_tryrdlock() test fails roughly
5-10% of the time without the fix, and passes all the time after.

Signed-off-by: Carlos O'Donell <carlos@redhat.com>
Signed-off-by: Torvald Riegel <triegel@redhat.com>
Signed-off-by: Rik Prohaska <prohaska7@gmail.com>
---
 ChangeLog                         |  17 ++
 nptl/Makefile                     |   3 +-
 nptl/pthread_rwlock_tryrdlock.c   |  25 ++-
 nptl/pthread_rwlock_trywrlock.c   |   9 +-
 nptl/tst-rwlock-tryrdlock-stall.c | 355 ++++++++++++++++++++++++++++++
 nptl/tst-rwlock-trywrlock-stall.c | 108 +++++++++
 support/Makefile                  |   1 +
 support/xpthread_rwlock_destroy.c |  26 +++
 support/xthread.h                 |   1 +
 9 files changed, 534 insertions(+), 11 deletions(-)
 create mode 100644 nptl/tst-rwlock-tryrdlock-stall.c
 create mode 100644 nptl/tst-rwlock-trywrlock-stall.c
 create mode 100644 support/xpthread_rwlock_destroy.c
Carlos O'Donell - Jan. 25, 2019, 1:44 a.m.
On 1/22/19 1:12 PM, Carlos O'Donell wrote:
> v2
> - Added analysis comment to tryrdlock test.
> - Removed rw var and avoided boolean coercion.
> - Avoided incomplete test output to stdout.

Siddhesh,

Both Torvald and Rik have OK'd my use of Signed-off-by.

Florian has done a review of tests.

Torvald, Rik and myself have reviewed and tested the fix.

Rik restated that the current fix I have here fixes his test case.

I plan to do a "quick" b-m-g run to look for issues and then push
this fix as my last fix for 2.29.

> From 0768c51e2930f922071b4b77f96bb15d1ddae004 Mon Sep 17 00:00:00 2001
> From: Carlos O'Donell <carlos@redhat.com>
> Date: Mon, 21 Jan 2019 22:50:12 -0500
> Subject: [PATCH] nptl: Fix pthread_rwlock_try*lock stalls (Bug 23844)
> 
> For a full analysis of both the pthread_rwlock_tryrdlock() stall
> and the pthread_rwlock_trywrlock() stall see:
> https://sourceware.org/bugzilla/show_bug.cgi?id=23844#c14
> 
> In the pthread_rwlock_trydlock() function we fail to inspect for
> PTHREAD_RWLOCK_FUTEX_USED in __wrphase_futex and wake the waiting
> readers.
> 
> In the pthread_rwlock_trywrlock() function we write 1 to
> __wrphase_futex and loose the setting of the PTHREAD_RWLOCK_FUTEX_USED
> bit, again failing to wake waiting readers during unlock.
> 
> The fix in the case of pthread_rwlock_trydlock() is to check for
> PTHREAD_RWLOCK_FUTEX_USED and wake the readers.
> 
> The fix in the case of pthread_rwlock_trywrlock() is to only write
> 1 to __wrphase_futex if we installed the write phase, since all other
> readers would be spinning waiting for this step.
> 
> We add two new tests, one exercises the stall for
> pthread_rwlock_trywrlock() which is easy to exercise, and one exercises
> the stall for pthread_rwlock_trydlock() which is harder to exercise.
> 
> The pthread_rwlock_trywrlock() test fails consistently without the fix,
> and passes after. The pthread_rwlock_tryrdlock() test fails roughly
> 5-10% of the time without the fix, and passes all the time after.
> 
> Signed-off-by: Carlos O'Donell <carlos@redhat.com>
> Signed-off-by: Torvald Riegel <triegel@redhat.com>
> Signed-off-by: Rik Prohaska <prohaska7@gmail.com>
Siddhesh Poyarekar - Jan. 25, 2019, 1:52 a.m.
On 25/01/19 7:14 AM, Carlos O'Donell wrote:
> On 1/22/19 1:12 PM, Carlos O'Donell wrote:
>> v2
>> - Added analysis comment to tryrdlock test.
>> - Removed rw var and avoided boolean coercion.
>> - Avoided incomplete test output to stdout.
> 
> Siddhesh,
> 
> Both Torvald and Rik have OK'd my use of Signed-off-by.
> 
> Florian has done a review of tests.
> 
> Torvald, Rik and myself have reviewed and tested the fix.
> 
> Rik restated that the current fix I have here fixes his test case.
> 
> I plan to do a "quick" b-m-g run to look for issues and then push
> this fix as my last fix for 2.29.

Sure, please do this before the end of the week and if anything breaks,
revert and retry in 2.30.

Siddhesh
Carlos O'Donell - Jan. 25, 2019, 1:53 a.m.
On 1/24/19 8:52 PM, Siddhesh Poyarekar wrote:
> On 25/01/19 7:14 AM, Carlos O'Donell wrote:
>> On 1/22/19 1:12 PM, Carlos O'Donell wrote:
>>> v2
>>> - Added analysis comment to tryrdlock test.
>>> - Removed rw var and avoided boolean coercion.
>>> - Avoided incomplete test output to stdout.
>>
>> Siddhesh,
>>
>> Both Torvald and Rik have OK'd my use of Signed-off-by.
>>
>> Florian has done a review of tests.
>>
>> Torvald, Rik and myself have reviewed and tested the fix.
>>
>> Rik restated that the current fix I have here fixes his test case.
>>
>> I plan to do a "quick" b-m-g run to look for issues and then push
>> this fix as my last fix for 2.29.
> 
> Sure, please do this before the end of the week and if anything breaks,
> revert and retry in 2.30.

Will do. In the next 24h I should have a commit or a reversion waiting
for 2.30. Thanks for the guidance.

Patch

diff --git a/ChangeLog b/ChangeLog
index c4fc9ab22a..c8eb7a397d 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,20 @@ 
+2019-01-22  Carlos O'Donell  <carlos@redhat.com>
+	    Torvald Riegel  <triegel@redhat.com>
+	    Rik Prohaska  <prohaska7@gmail.com>
+
+	[BZ# 23844]
+	* nptl/Makefile (tests): Add tst-rwlock-tryrdlock-stall, and
+	tst-rwlock-trywrlock-stall.
+	* nptl/pthread_rwlock_tryrdlock.c (__pthread_rwlock_tryrdlock):
+	Wake waiters if PTHREAD_RWLOCK_FUTEX_USED is set.
+	* nptl/pthread_rwlock_trywrlock.c (__pthread_rwlock_trywrlock):
+	Set __wrphase_fute to 1 only if we started the write phase.
+	* nptl/tst-rwlock-tryrdlock-stall.c: New file.
+	* nptl/tst-rwlock-trywrlock-stall.c: New file.
+	* support/Makefile (libsupport-routines): Add xpthread_rwlock_destroy.
+	* support/xpthread_rwlock_destroy.c: New file.
+	* support/xthread.h: Declare xpthread_rwlock_destroy.
+
 2019-01-21  Joseph Myers  <joseph@codesourcery.com>
 
 	* scripts/build-many-glibcs.py (Context.checkout): Default
diff --git a/nptl/Makefile b/nptl/Makefile
index 340282c6cb..0e316edfac 100644
--- a/nptl/Makefile
+++ b/nptl/Makefile
@@ -319,7 +319,8 @@  tests = tst-attr1 tst-attr2 tst-attr3 tst-default-attr \
 	tst-cnd-basic tst-mtx-trylock tst-cnd-broadcast \
 	tst-cnd-timedwait tst-thrd-detach tst-mtx-basic tst-thrd-sleep \
 	tst-mtx-recursive tst-tss-basic tst-call-once tst-mtx-timedlock \
-	tst-rwlock-pwn
+	tst-rwlock-pwn \
+	tst-rwlock-tryrdlock-stall tst-rwlock-trywrlock-stall
 
 tests-internal := tst-rwlock19 tst-rwlock20 \
 		  tst-sem11 tst-sem12 tst-sem13 \
diff --git a/nptl/pthread_rwlock_tryrdlock.c b/nptl/pthread_rwlock_tryrdlock.c
index 368862ff07..2f94f17f36 100644
--- a/nptl/pthread_rwlock_tryrdlock.c
+++ b/nptl/pthread_rwlock_tryrdlock.c
@@ -94,15 +94,22 @@  __pthread_rwlock_tryrdlock (pthread_rwlock_t *rwlock)
       /* Same as in __pthread_rwlock_rdlock_full:
 	 We started the read phase, so we are also responsible for
 	 updating the write-phase futex.  Relaxed MO is sufficient.
-	 Note that there can be no other reader that we have to wake
-	 because all other readers will see the read phase started by us
-	 (or they will try to start it themselves); if a writer started
-	 the read phase, we cannot have started it.  Furthermore, we
-	 cannot discard a PTHREAD_RWLOCK_FUTEX_USED flag because we will
-	 overwrite the value set by the most recent writer (or the readers
-	 before it in case of explicit hand-over) and we know that there
-	 are no waiting readers.  */
-      atomic_store_relaxed (&rwlock->__data.__wrphase_futex, 0);
+	 We have to do the same steps as a writer would when handing over the
+	 read phase to use because other readers cannot distinguish between
+	 us and the writer.
+	 Note that __pthread_rwlock_tryrdlock callers will not have to be
+	 woken up because they will either see the read phase started by us
+	 or they will try to start it themselves; however, callers of
+	 __pthread_rwlock_rdlock_full just increase the reader count and then
+	 check what state the lock is in, so they cannot distinguish between
+	 us and a writer that acquired and released the lock in the
+	 meantime.  */
+      if ((atomic_exchange_relaxed (&rwlock->__data.__wrphase_futex, 0)
+	  & PTHREAD_RWLOCK_FUTEX_USED) != 0)
+	{
+	  int private = __pthread_rwlock_get_private (rwlock);
+	  futex_wake (&rwlock->__data.__wrphase_futex, INT_MAX, private);
+	}
     }
 
   return 0;
diff --git a/nptl/pthread_rwlock_trywrlock.c b/nptl/pthread_rwlock_trywrlock.c
index fd37a71ce4..fae475cc70 100644
--- a/nptl/pthread_rwlock_trywrlock.c
+++ b/nptl/pthread_rwlock_trywrlock.c
@@ -46,8 +46,15 @@  __pthread_rwlock_trywrlock (pthread_rwlock_t *rwlock)
 	  &rwlock->__data.__readers, &r,
 	  r | PTHREAD_RWLOCK_WRPHASE | PTHREAD_RWLOCK_WRLOCKED))
 	{
+	  /* We have become the primary writer and we cannot have shared
+	     the PTHREAD_RWLOCK_FUTEX_USED flag with someone else, so we
+	     can simply enable blocking (see full wrlock code).  */
 	  atomic_store_relaxed (&rwlock->__data.__writers_futex, 1);
-	  atomic_store_relaxed (&rwlock->__data.__wrphase_futex, 1);
+	  /* If we started a write phase, we need to enable readers to
+	     wait.  If we did not, we must not change it because other threads
+	     may have set the PTHREAD_RWLOCK_FUTEX_USED in the meantime.  */
+	  if ((r & PTHREAD_RWLOCK_WRPHASE) == 0)
+	    atomic_store_relaxed (&rwlock->__data.__wrphase_futex, 1);
 	  atomic_store_relaxed (&rwlock->__data.__cur_writer,
 	      THREAD_GETMEM (THREAD_SELF, tid));
 	  return 0;
diff --git a/nptl/tst-rwlock-tryrdlock-stall.c b/nptl/tst-rwlock-tryrdlock-stall.c
new file mode 100644
index 0000000000..9351ad3338
--- /dev/null
+++ b/nptl/tst-rwlock-tryrdlock-stall.c
@@ -0,0 +1,355 @@ 
+/* Bug 23844: Test for pthread_rwlock_tryrdlock stalls.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* For a full analysis see comment:
+   https://sourceware.org/bugzilla/show_bug.cgi?id=23844#c14
+
+   Provided here for reference:
+
+   --- Analysis of pthread_rwlock_tryrdlock() stall ---
+   A read lock begins to execute.
+
+   In __pthread_rwlock_rdlock_full:
+
+   We can attempt a read lock, but find that the lock is
+   in a write phase (PTHREAD_RWLOCK_WRPHASE, or WP-bit
+   is set), and the lock is held by a primary writer
+   (PTHREAD_RWLOCK_WRLOCKED is set). In this case we must
+   wait for explicit hand over from the writer to us or
+   one of the other waiters. The read lock threads are
+   about to execute:
+
+   341   r = (atomic_fetch_add_acquire (&rwlock->__data.__readers,
+   342                                  (1 << PTHREAD_RWLOCK_READER_SHIFT))
+   343        + (1 << PTHREAD_RWLOCK_READER_SHIFT));
+
+   An unlock beings to execute.
+
+   Then in __pthread_rwlock_wrunlock:
+
+   547   unsigned int r = atomic_load_relaxed (&rwlock->__data.__readers);
+   ...
+   549   while (!atomic_compare_exchange_weak_release
+   550          (&rwlock->__data.__readers, &r,
+   551           ((r ^ PTHREAD_RWLOCK_WRLOCKED)
+   552            ^ ((r >> PTHREAD_RWLOCK_READER_SHIFT) == 0 ? 0
+   553               : PTHREAD_RWLOCK_WRPHASE))))
+   554     {
+   ...
+   556     }
+
+   We clear PTHREAD_RWLOCK_WRLOCKED, and if there are
+   no readers so we leave the lock in PTHRAD_RWLOCK_WRPHASE.
+
+   Back in the read lock.
+
+   The read lock adjusts __readres as above.
+
+   383   while ((r & PTHREAD_RWLOCK_WRPHASE) != 0
+   384          && (r & PTHREAD_RWLOCK_WRLOCKED) == 0)
+   385     {
+   ...
+   390       if (atomic_compare_exchange_weak_acquire (&rwlock->__data.__readers, &r,
+   391                                                 r ^ PTHREAD_RWLOCK_WRPHASE))
+   392         {
+
+   And then attemps to start the read phase.
+
+   Assume there happens to be a tryrdlock at this point, noting
+   that PTHREAD_RWLOCK_WRLOCKED is clear, and PTHREAD_RWLOCK_WRPHASE
+   is 1. So the try lock attemps to start the read phase.
+
+   In __pthread_rwlock_tryrdlock:
+
+    44       if ((r & PTHREAD_RWLOCK_WRPHASE) == 0)
+    45         {
+   ...
+    49           if (((r & PTHREAD_RWLOCK_WRLOCKED) != 0)
+    50               && (rwlock->__data.__flags
+    51                   == PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP))
+    52             return EBUSY;
+    53           rnew = r + (1 << PTHREAD_RWLOCK_READER_SHIFT);
+    54         }
+   ...
+    89   while (!atomic_compare_exchange_weak_acquire (&rwlock->__data.__readers,
+    90       &r, rnew));
+
+   And succeeds.
+
+   Back in the write unlock:
+
+   557   if ((r >> PTHREAD_RWLOCK_READER_SHIFT) != 0)
+   558     {
+   ...
+   563       if ((atomic_exchange_relaxed (&rwlock->__data.__wrphase_futex, 0)
+   564            & PTHREAD_RWLOCK_FUTEX_USED) != 0)
+   565         futex_wake (&rwlock->__data.__wrphase_futex, INT_MAX, private);
+   566     }
+
+   We note that PTHREAD_RWLOCK_FUTEX_USED is non-zero
+   and don't wake anyone. This is OK because we handed
+   over to the trylock. It will be the trylock's responsibility
+   to wake any waiters.
+
+   Back in the read lock:
+
+   The read lock fails to install PTHRAD_REWLOCK_WRPHASE as 0 because
+   the __readers value was adjusted by the trylock, and so it falls through
+   to waiting on the lock for explicit handover from either a new writer
+   or a new reader.
+
+   448           int err = futex_abstimed_wait (&rwlock->__data.__wrphase_futex,
+   449                                          1 | PTHREAD_RWLOCK_FUTEX_USED,
+   450                                          abstime, private);
+
+   We use PTHREAD_RWLOCK_FUTEX_USED to indicate the futex
+   is in use.
+
+   At this point we have readers waiting on the read lock
+   to unlock. The wrlock is done. The trylock is finishing
+   the installation of the read phase.
+
+    92   if ((r & PTHREAD_RWLOCK_WRPHASE) != 0)
+    93     {
+   ...
+   105       atomic_store_relaxed (&rwlock->__data.__wrphase_futex, 0);
+   106     }
+
+   The trylock does note that we were the one that
+   installed the read phase, but the comments are not
+   correct, the execution ordering above shows that
+   readers might indeed be waiting, and they are.
+
+   The atomic_store_relaxed throws away PTHREAD_RWLOCK_FUTEX_USED,
+   and the waiting reader is never worken becuase as noted
+   above it is conditional on the futex being used.
+
+   The solution is for the trylock thread to inspect
+   PTHREAD_RWLOCK_FUTEX_USED and wake the waiting readers.
+
+   --- Analysis of pthread_rwlock_trywrlock() stall ---
+
+   A write lock begins to execute, takes the write lock,
+   and then releases the lock...
+
+   In pthread_rwlock_wrunlock():
+
+   547   unsigned int r = atomic_load_relaxed (&rwlock->__data.__readers);
+   ...
+   549   while (!atomic_compare_exchange_weak_release
+   550          (&rwlock->__data.__readers, &r,
+   551           ((r ^ PTHREAD_RWLOCK_WRLOCKED)
+   552            ^ ((r >> PTHREAD_RWLOCK_READER_SHIFT) == 0 ? 0
+   553               : PTHREAD_RWLOCK_WRPHASE))))
+   554     {
+   ...
+   556     }
+
+   ... leaving it in the write phase with zero readers
+   (the case where we leave the write phase in place
+   during a write unlock).
+
+   A write trylock begins to execute.
+
+   In __pthread_rwlock_trywrlock:
+
+    40   while (((r & PTHREAD_RWLOCK_WRLOCKED) == 0)
+    41       && (((r >> PTHREAD_RWLOCK_READER_SHIFT) == 0)
+    42           || (prefer_writer && ((r & PTHREAD_RWLOCK_WRPHASE) != 0))))
+    43     {
+
+   The lock is not locked.
+
+   There are no readers.
+
+    45       if (atomic_compare_exchange_weak_acquire (
+    46           &rwlock->__data.__readers, &r,
+    47           r | PTHREAD_RWLOCK_WRPHASE | PTHREAD_RWLOCK_WRLOCKED))
+
+   We atomically install the write phase and we take the
+   exclusive write lock.
+
+    48         {
+    49           atomic_store_relaxed (&rwlock->__data.__writers_futex, 1);
+
+   We get this far.
+
+   A reader lock begins to execute.
+
+   In pthread_rwlock_rdlock:
+
+   437   for (;;)
+   438     {
+   439       while (((wpf = atomic_load_relaxed (&rwlock->__data.__wrphase_futex))
+   440               | PTHREAD_RWLOCK_FUTEX_USED) == (1 | PTHREAD_RWLOCK_FUTEX_USED))
+   441         {
+   442           int private = __pthread_rwlock_get_private (rwlock);
+   443           if (((wpf & PTHREAD_RWLOCK_FUTEX_USED) == 0)
+   444               && (!atomic_compare_exchange_weak_relaxed
+   445                   (&rwlock->__data.__wrphase_futex,
+   446                    &wpf, wpf | PTHREAD_RWLOCK_FUTEX_USED)))
+   447             continue;
+   448           int err = futex_abstimed_wait (&rwlock->__data.__wrphase_futex,
+   449                                          1 | PTHREAD_RWLOCK_FUTEX_USED,
+   450                                          abstime, private);
+
+   We are in a write phase, so the while() on line 439 is true.
+
+   The value of wpf does not have PTHREAD_RWLOCK_FUTEX_USED set
+   since this is the first reader to lock.
+
+   The atomic operation sets wpf with PTHREAD_RELOCK_FUTEX_USED
+   on the expectation that this reader will be woken during
+   the handoff.
+
+   Back in pthread_rwlock_trywrlock:
+
+    50           atomic_store_relaxed (&rwlock->__data.__wrphase_futex, 1);
+    51           atomic_store_relaxed (&rwlock->__data.__cur_writer,
+    52               THREAD_GETMEM (THREAD_SELF, tid));
+    53           return 0;
+    54         }
+   ...
+    57     }
+
+   We write 1 to __wrphase_futex discarding PTHREAD_RWLOCK_FUTEX_USED,
+   and so in the unlock we will not awaken the waiting reader.
+
+   The solution to this is to realize that if we did not start the write
+   phase we need not write 1 or any other value to __wrphase_futex.
+   This ensures that any readers (which saw __wrphase_futex != 0) can
+   set PTHREAD_RWLOCK_FUTEX_USED and this can be used at unlock to
+   wake them.
+
+   If we installed the write phase then all other readers are looping
+   here:
+
+   In __pthread_rwlock_rdlock_full:
+
+   437   for (;;)
+   438     {
+   439       while (((wpf = atomic_load_relaxed (&rwlock->__data.__wrphase_futex))
+   440               | PTHREAD_RWLOCK_FUTEX_USED) == (1 | PTHREAD_RWLOCK_FUTEX_USED))
+   441         {
+   ...
+   508     }
+
+   waiting for the write phase to be installed or removed before they
+   can begin waiting on __wrphase_futex (part of the algorithm), or
+   taking a concurrent read lock, and thus we can safely write 1 to
+   __wrphase_futex.
+
+   If we did not install the write phase then the readers may already
+   be waiting on the futex, the original writer wrote 1 to __wrphase_futex
+   as part of starting the write phase, and we cannot also write 1
+   without loosing the PTHREAD_RWLOCK_FUTEX_USED bit.
+
+   ---
+
+   Summary for the pthread_rwlock_tryrdlock() stall:
+
+   The stall is caused by pthread_rwlock_tryrdlock failing to check
+   that PTHREAD_RWLOCK_FUTEX_USED is set in the __wrphase_futex futex
+   and then waking the futex.
+
+   The fix for bug 23844 ensures that waiters on __wrphase_futex are
+   correctly woken.  Before the fix the test stalls as readers can
+   wait forever on __wrphase_futex.  */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <pthread.h>
+#include <support/xthread.h>
+#include <errno.h>
+
+/* We need only one lock to reproduce the issue. We will need multiple
+   threads to get the exact case where we have a read, try, and unlock
+   all interleaving to produce the case where the readers are waiting
+   and the try fails to wake them.  */
+pthread_rwlock_t onelock;
+
+/* The number of threads is arbitrary but empirically chosen to have
+   enough threads that we see the condition where waiting readers are
+   not woken by a successful tryrdlock.  */
+#define NTHREADS 32
+
+_Atomic int do_exit;
+
+void *
+run_loop (void *arg)
+{
+  int i = 0, ret;
+  while (!do_exit)
+    {
+      /* Arbitrarily choose if we are the writer or reader.  Choose a
+	 high enough ratio of readers to writers to make it likely
+	 that readers block (and eventually are susceptable to
+	 stalling).
+
+         If we are a writer, take the write lock, and then unlock.
+	 If we are a reader, try the lock, then lock, then unlock.  */
+      if ((i % 8) != 0)
+	xpthread_rwlock_wrlock (&onelock);
+      else
+	{
+	  if ((ret = pthread_rwlock_tryrdlock (&onelock)) != 0)
+	    {
+	      if (ret == EBUSY)
+		xpthread_rwlock_rdlock (&onelock);
+	      else
+		exit (EXIT_FAILURE);
+	    }
+	}
+      /* Thread does some work and then unlocks.  */
+      xpthread_rwlock_unlock (&onelock);
+      i++;
+    }
+  return NULL;
+}
+
+int
+do_test (void)
+{
+  int i;
+  pthread_t tids[NTHREADS];
+  xpthread_rwlock_init (&onelock, NULL);
+  for (i = 0; i < NTHREADS; i++)
+    tids[i] = xpthread_create (NULL, run_loop, NULL);
+  /* Run for some amount of time.  Empirically speaking exercising
+     the stall via pthread_rwlock_tryrdlock is much harder, and on
+     a 3.5GHz 4 core x86_64 VM system it takes somewhere around
+     20-200s to stall, approaching 100% stall past 200s.  We can't
+     wait that long for a regression test so we just test for 20s,
+     and expect the stall to happen with a 5-10% chance (enough for
+     developers to see).  */
+  sleep (20);
+  /* Then exit.  */
+  printf ("INFO: Exiting...\n");
+  do_exit = 1;
+  /* If any readers stalled then we will timeout waiting for them.  */
+  for (i = 0; i < NTHREADS; i++)
+    xpthread_join (tids[i]);
+  printf ("INFO: Done.\n");
+  xpthread_rwlock_destroy (&onelock);
+  printf ("PASS: No pthread_rwlock_tryrdlock stalls detected.\n");
+  return 0;
+}
+
+#define TIMEOUT 30
+#include <support/test-driver.c>
diff --git a/nptl/tst-rwlock-trywrlock-stall.c b/nptl/tst-rwlock-trywrlock-stall.c
new file mode 100644
index 0000000000..14d27cbcbc
--- /dev/null
+++ b/nptl/tst-rwlock-trywrlock-stall.c
@@ -0,0 +1,108 @@ 
+/* Bug 23844: Test for pthread_rwlock_trywrlock stalls.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* For a full analysis see comments in tst-rwlock-tryrdlock-stall.c.
+
+   Summary for the pthread_rwlock_trywrlock() stall:
+
+   The stall is caused by pthread_rwlock_trywrlock setting
+   __wrphase_futex futex to 1 and loosing the
+   PTHREAD_RWLOCK_FUTEX_USED bit.
+
+   The fix for bug 23844 ensures that waiters on __wrphase_futex are
+   correctly woken.  Before the fix the test stalls as readers can
+   wait forever on  __wrphase_futex.  */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <pthread.h>
+#include <support/xthread.h>
+#include <errno.h>
+
+/* We need only one lock to reproduce the issue. We will need multiple
+   threads to get the exact case where we have a read, try, and unlock
+   all interleaving to produce the case where the readers are waiting
+   and the try clears the PTHREAD_RWLOCK_FUTEX_USED bit and a
+   subsequent unlock fails to wake them.  */
+pthread_rwlock_t onelock;
+
+/* The number of threads is arbitrary but empirically chosen to have
+   enough threads that we see the condition where waiting readers are
+   not woken by a successful unlock.  */
+#define NTHREADS 32
+
+_Atomic int do_exit;
+
+void *
+run_loop (void *arg)
+{
+  int i = 0, ret;
+  while (!do_exit)
+    {
+      /* Arbitrarily choose if we are the writer or reader.  Choose a
+	 high enough ratio of readers to writers to make it likely
+	 that readers block (and eventually are susceptable to
+	 stalling).
+
+         If we are a writer, take the write lock, and then unlock.
+	 If we are a reader, try the lock, then lock, then unlock.  */
+      if ((i % 8) != 0)
+	{
+	  if ((ret = pthread_rwlock_trywrlock (&onelock)) != 0)
+	    {
+	      if (ret == EBUSY)
+		xpthread_rwlock_wrlock (&onelock);
+	      else
+		exit (EXIT_FAILURE);
+	    }
+	}
+      else
+	xpthread_rwlock_rdlock (&onelock);
+      /* Thread does some work and then unlocks.  */
+      xpthread_rwlock_unlock (&onelock);
+      i++;
+    }
+  return NULL;
+}
+
+int
+do_test (void)
+{
+  int i;
+  pthread_t tids[NTHREADS];
+  xpthread_rwlock_init (&onelock, NULL);
+  for (i = 0; i < NTHREADS; i++)
+    tids[i] = xpthread_create (NULL, run_loop, NULL);
+  /* Run for some amount of time.  The pthread_rwlock_tryrwlock stall
+     is very easy to trigger and happens in seconds under the test
+     conditions.  */
+  sleep (10);
+  /* Then exit.  */
+  printf ("INFO: Exiting...\n");
+  do_exit = 1;
+  /* If any readers stalled then we will timeout waiting for them.  */
+  for (i = 0; i < NTHREADS; i++)
+    xpthread_join (tids[i]);
+  printf ("INFO: Done.\n");
+  xpthread_rwlock_destroy (&onelock);
+  printf ("PASS: No pthread_rwlock_tryrwlock stalls detected.\n");
+  return 0;
+}
+
+#include <support/test-driver.c>
diff --git a/support/Makefile b/support/Makefile
index 432cf2fe6c..c15b93647c 100644
--- a/support/Makefile
+++ b/support/Makefile
@@ -129,6 +129,7 @@  libsupport-routines = \
   xpthread_mutexattr_settype \
   xpthread_once \
   xpthread_rwlock_init \
+  xpthread_rwlock_destroy \
   xpthread_rwlock_rdlock \
   xpthread_rwlock_unlock \
   xpthread_rwlock_wrlock \
diff --git a/support/xpthread_rwlock_destroy.c b/support/xpthread_rwlock_destroy.c
new file mode 100644
index 0000000000..6d6e953569
--- /dev/null
+++ b/support/xpthread_rwlock_destroy.c
@@ -0,0 +1,26 @@ 
+/* pthread_rwlock_destroy with error checking.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <support/xthread.h>
+
+void
+xpthread_rwlock_destroy (pthread_rwlock_t *rwlock)
+{
+  xpthread_check_return ("pthread_rwlock_destroy",
+                         pthread_rwlock_destroy (rwlock));
+}
diff --git a/support/xthread.h b/support/xthread.h
index 47c23235f3..9fe1f68b3b 100644
--- a/support/xthread.h
+++ b/support/xthread.h
@@ -84,6 +84,7 @@  void xpthread_rwlockattr_setkind_np (pthread_rwlockattr_t *attr, int pref);
 void xpthread_rwlock_wrlock (pthread_rwlock_t *rwlock);
 void xpthread_rwlock_rdlock (pthread_rwlock_t *rwlock);
 void xpthread_rwlock_unlock (pthread_rwlock_t *rwlock);
+void xpthread_rwlock_destroy (pthread_rwlock_t *rwlock);
 
 __END_DECLS