Fix rwlock stall with PREFER_WRITER_NONRECURSIVE_NP (bug 23861)

Message ID mvmr2eo87jr.fsf@suse.de
State New, archived

Commit Message

Andreas Schwab Dec. 11, 2018, 2:25 p.m. UTC
  [BZ #23861]
	* nptl/pthread_rwlock_common.c (__pthread_rwlock_rdlock_full):
	Update expected value for __readers while waiting on
	PTHREAD_RWLOCK_RWAITING.
	* nptl/tst-rwlock-pwn.c: New file.
	* nptl/Makefile (tests): Add tst-rwlock-pwn.
---
 nptl/Makefile                |  3 +-
 nptl/pthread_rwlock_common.c |  2 +-
 nptl/tst-rwlock-pwn.c        | 82 ++++++++++++++++++++++++++++++++++++
 3 files changed, 85 insertions(+), 2 deletions(-)
 create mode 100644 nptl/tst-rwlock-pwn.c
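
The ChangeLog entry above speaks of updating the "expected value" for
__readers. The following is a simplified model of why that can matter, not
the glibc code itself. It assumes that r is later handed to a futex-style
wait that sleeps only while the word still equals the expected value. The
names MODEL_RWAITING, would_sleep, wait_step_stale, wait_step_refreshed and
check_model are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bit, standing in for PTHREAD_RWLOCK_RWAITING.  */
#define MODEL_RWAITING 4u

/* Model of a futex-style wait: the caller sleeps only if the word still
   holds the value it expects.  Returns true if it would sleep.  */
static bool
would_sleep (const unsigned int *word, unsigned int expected)
{
  return __atomic_load_n (word, __ATOMIC_RELAXED) == expected;
}

/* Stale variant: the flag is tested on a fresh load, but R keeps the
   value read before the flag was set, so the wait never sleeps.  */
bool
wait_step_stale (const unsigned int *word, unsigned int r)
{
  if ((__atomic_load_n (word, __ATOMIC_RELAXED) & MODEL_RWAITING) != 0)
    return would_sleep (word, r);
  return false;
}

/* Refreshed variant: the load that tests the flag also updates R, which
   is the shape of the one-line change in the patch.  */
bool
wait_step_refreshed (const unsigned int *word, unsigned int r)
{
  if (((r = __atomic_load_n (word, __ATOMIC_RELAXED)) & MODEL_RWAITING) != 0)
    return would_sleep (word, r);
  return false;
}

/* Self-check: R was read as 8 before the flag bit was added.  */
void
check_model (void)
{
  unsigned int word = 8u | MODEL_RWAITING;
  assert (!wait_step_stale (&word, 8u));
  assert (wait_step_refreshed (&word, 8u));
}
```

In this model the stale variant never matches the current word, so the
caller keeps retrying instead of sleeping. The refreshed variant passes the
value it has just observed.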
  

Comments

Florian Weimer Dec. 11, 2018, 3:33 p.m. UTC | #1
* Andreas Schwab:

> +volatile int do_exit;

Please no volatile, and use the GCC __atomic builtins instead (not the
glibc-internal ones, which are inaccessible).  I think the minimum GCC
version does not yet support proper C11 atomics.

Thanks,
Florian
  
Andreas Schwab Dec. 11, 2018, 3:55 p.m. UTC | #2
On Dez 11 2018, Florian Weimer <fweimer@redhat.com> wrote:

> * Andreas Schwab:
>
>> +volatile int do_exit;
>
> Please no volatile, and use the GCC __atomic builtins instead (not the
> glibc-internal ones, which are inaccessible).

It's only about the compiler not optimizing the access, no atomicity
required.

Andreas.
  
Florian Weimer Dec. 11, 2018, 3:58 p.m. UTC | #3
* Andreas Schwab:

> On Dez 11 2018, Florian Weimer <fweimer@redhat.com> wrote:
>
>> * Andreas Schwab:
>>
>>> +volatile int do_exit;
>>
>> Please no volatile, and use the GCC __atomic builtins instead (not the
>> glibc-internal ones, which are inaccessible).
>
> It's only about the compiler not optimizing the access, no atomicity
> required.

No, without atomics, the code has a data race and undefined behavior.

Thanks,
Florian
  
Andreas Schwab Dec. 11, 2018, 4:03 p.m. UTC | #4
On Dez 11 2018, Florian Weimer <fweimer@redhat.com> wrote:

> * Andreas Schwab:
>
>> On Dez 11 2018, Florian Weimer <fweimer@redhat.com> wrote:
>>
>>> * Andreas Schwab:
>>>
>>>> +volatile int do_exit;
>>>
>>> Please no volatile, and use the GCC __atomic builtins instead (not the
>>> glibc-internal ones, which are inaccessible).
>>
>> It's only about the compiler not optimizing the access, no atomicity
>> required.
>
> No, without atomics, the code has a data race and undefined behavior.

In which way?  There is a single setter, and it doesn't matter how long
the thread takes to see the transition from zero to non-zero.

Andreas.
  
Florian Weimer Dec. 11, 2018, 4:08 p.m. UTC | #5
* Andreas Schwab:

>> No, without atomics, the code has a data race and undefined behavior.
>
> In which way?  There is a single setter, and it doesn't matter how long
> the thread takes to see the transition from zero to non-zero.

| The execution of a program contains a data race if it contains two
| conflicting actions in different threads, at least one of which is not
| atomic, and neither happens before the other. Any such data race
| results in undefined behavior.

Florian
  
Andreas Schwab Dec. 11, 2018, 4:11 p.m. UTC | #6
On Dez 11 2018, Florian Weimer <fweimer@redhat.com> wrote:

> * Andreas Schwab:
>
>>> No, without atomics, the code has a data race and undefined behavior.
>>
>> In which way?  There is a single setter, and it doesn't matter how long
>> the thread takes to see the transition from zero to non-zero.
>
> | The execution of a program contains a data race if it contains two
> | conflicting actions in different threads,

In which way do the actions conflict?

Andreas.
  
Florian Weimer Dec. 11, 2018, 4:13 p.m. UTC | #7
* Andreas Schwab:

> On Dez 11 2018, Florian Weimer <fweimer@redhat.com> wrote:
>
>> * Andreas Schwab:
>>
>>>> No, without atomics, the code has a data race and undefined behavior.
>>>
>>> In which way?  There is a single setter, and it doesn't matter how long
>>> the thread takes to see the transition from zero to non-zero.
>>
>> | The execution of a program contains a data race if it contains two
>> | conflicting actions in different threads,
>
> In which way do the actions conflict?

| Two expression evaluations conflict if one of them modifies a memory
| location and the other one reads or modifies the same memory location.

Thanks,
Florian
  
Andreas Schwab Dec. 11, 2018, 4:16 p.m. UTC | #8
On Dez 11 2018, Florian Weimer <fweimer@redhat.com> wrote:

> * Andreas Schwab:
>
>> On Dez 11 2018, Florian Weimer <fweimer@redhat.com> wrote:
>>
>>> * Andreas Schwab:
>>>
>>>>> No, without atomics, the code has a data race and undefined behavior.
>>>>
>>>> In which way?  There is a single setter, and it doesn't matter how long
>>>> the thread takes to see the transition from zero to non-zero.
>>>
>>> | The execution of a program contains a data race if it contains two
>>> | conflicting actions in different threads,
>>
>> In which way do the actions conflict?
>
> | Two expression evaluations conflict if one of them modifies a memory
> | location and the other one reads or modifies the same memory location.

And how is that a problem in this particular case?

Andreas.
  
Carlos O'Donell Dec. 11, 2018, 4:20 p.m. UTC | #9
On 12/11/18 11:16 AM, Andreas Schwab wrote:
> On Dez 11 2018, Florian Weimer <fweimer@redhat.com> wrote:
> 
>> * Andreas Schwab:
>>
>>> On Dez 11 2018, Florian Weimer <fweimer@redhat.com> wrote:
>>>
>>>> * Andreas Schwab:
>>>>
>>>>>> No, without atomics, the code has a data race and undefined behavior.
>>>>>
>>>>> In which way?  There is a single setter, and it doesn't matter how long
>>>>> the thread takes to see the transition from zero to non-zero.
>>>>
>>>> | The execution of a program contains a data race if it contains two
>>>> | conflicting actions in different threads,
>>>
>>> In which way do the actions conflict?
>>
>> | Two expression evaluations conflict if one of them modifies a memory
>> | location and the other one reads or modifies the same memory location.
> 
> And how is that a problem in this particular case?

Some data races can be benign, but they are still data races and they
still render the program invalid (undefined behaviour).

Use relaxed MO loads/stores if you need them.
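
A minimal sketch of what the suggestion could look like for a stop flag,
using the GCC __atomic builtins mentioned earlier in the thread with relaxed
memory order. It is an illustration only, not the revised test: the names
worker and run_and_stop are hypothetical, and plain pthread_create and
pthread_join are used instead of the support/xthread.h wrappers so that the
block compiles outside the glibc tree.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Plain int, accessed only through the __atomic builtins below.  */
static int do_exit;

static void *
worker (void *arg)
{
  unsigned long *spins = arg;
  /* Relaxed load: atomic, so it does not race with the store, but it
     imposes no ordering on other memory accesses.  */
  while (!__atomic_load_n (&do_exit, __ATOMIC_RELAXED))
    ++*spins;
  return NULL;
}

/* Start NTHREADS workers, ask them to stop, and join them.  Returns the
   number of threads joined, or -1 if any create or join failed.  */
int
run_and_stop (int nthreads)
{
  enum { MAX_THREADS = 16 };
  pthread_t tids[MAX_THREADS];
  unsigned long spins[MAX_THREADS] = { 0 };

  assert (nthreads >= 0 && nthreads <= MAX_THREADS);

  /* Reset the flag before any worker exists.  */
  __atomic_store_n (&do_exit, 0, __ATOMIC_RELAXED);

  int started = 0;
  for (; started < nthreads; started++)
    if (pthread_create (&tids[started], NULL, worker, &spins[started]) != 0)
      break;

  /* Relaxed store: the workers only need to see the change eventually.  */
  __atomic_store_n (&do_exit, 1, __ATOMIC_RELAXED);

  int joined = 0;
  for (int i = 0; i < started; i++)
    if (pthread_join (tids[i], NULL) == 0)
      joined++;

  return joined == nthreads ? joined : -1;
}
```

Calling run_and_stop (3) is expected to return 3 once all workers have seen
the flag. Relaxed order is enough here because the flag carries no other
data with it; the joins provide the synchronization at the end.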
  

Patch

diff --git a/nptl/Makefile b/nptl/Makefile
index 34ae830276..b01f2b0626 100644
--- a/nptl/Makefile
+++ b/nptl/Makefile
@@ -318,7 +318,8 @@  tests = tst-attr1 tst-attr2 tst-attr3 tst-default-attr \
 	tst-minstack-throw \
 	tst-cnd-basic tst-mtx-trylock tst-cnd-broadcast \
 	tst-cnd-timedwait tst-thrd-detach tst-mtx-basic tst-thrd-sleep \
-	tst-mtx-recursive tst-tss-basic tst-call-once tst-mtx-timedlock
+	tst-mtx-recursive tst-tss-basic tst-call-once tst-mtx-timedlock \
+	tst-rwlock-pwn
 
 tests-internal := tst-rwlock19 tst-rwlock20 \
 		  tst-sem11 tst-sem12 tst-sem13 \
diff --git a/nptl/pthread_rwlock_common.c b/nptl/pthread_rwlock_common.c
index 6662536812..e55bf5431a 100644
--- a/nptl/pthread_rwlock_common.c
+++ b/nptl/pthread_rwlock_common.c
@@ -314,7 +314,7 @@  __pthread_rwlock_rdlock_full (pthread_rwlock_t *rwlock,
 		 harmless because the flag is just about the state of
 		 __readers, and all threads set the flag under the same
 		 conditions.  */
-	      while ((atomic_load_relaxed (&rwlock->__data.__readers)
+	      while (((r = atomic_load_relaxed (&rwlock->__data.__readers))
 		      & PTHREAD_RWLOCK_RWAITING) != 0)
 		{
 		  int private = __pthread_rwlock_get_private (rwlock);
diff --git a/nptl/tst-rwlock-pwn.c b/nptl/tst-rwlock-pwn.c
new file mode 100644
index 0000000000..57a5034515
--- /dev/null
+++ b/nptl/tst-rwlock-pwn.c
@@ -0,0 +1,82 @@ 
+/* Test rwlock with PREFER_WRITER_NONRECURSIVE_NP.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <pthread.h>
+#include <support/xthread.h>
+
+/* We choose 10 iterations and 3 threads because this happens to be able
+   to trigger the stall today on modern hardware.  */
+#define LOOPS 10
+#define NTHREADS 3
+
+volatile int do_exit;
+pthread_rwlockattr_t mylock_attr;
+pthread_rwlock_t mylock;
+
+void *
+run_loop (void *a)
+{
+  while (!do_exit)
+    {
+      if (random () & 1)
+	{
+	  xpthread_rwlock_wrlock (&mylock);
+	  xpthread_rwlock_unlock (&mylock);
+	}
+      else
+	{
+	  xpthread_rwlock_rdlock (&mylock);
+	  xpthread_rwlock_unlock (&mylock);
+	}
+    }
+  return NULL;
+}
+
+int
+do_test (void)
+{
+  xpthread_rwlockattr_init (&mylock_attr);
+  xpthread_rwlockattr_setkind_np (&mylock_attr,
+				  PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP);
+  xpthread_rwlock_init (&mylock, &mylock_attr);
+
+  for (int n = 0; n < LOOPS; n++)
+    {
+      pthread_t tids[NTHREADS];
+      do_exit = 0;
+      for (int i = 0; i < NTHREADS; i++)
+	tids[i] = xpthread_create (NULL, run_loop, NULL);
+      /* Let the threads run for some time.  */
+      sleep (1);
+      printf ("Exiting...");
+      fflush (stdout);
+      do_exit = 1;
+      for (int i = 0; i < NTHREADS; i++)
+	xpthread_join (tids[i]);
+      printf ("done.\n");
+    }
+  pthread_rwlock_destroy (&mylock);
+  pthread_rwlockattr_destroy (&mylock_attr);
+  return 0;
+}
+
+#define TIMEOUT (DEFAULT_TIMEOUT + 3 * LOOPS)
+#include <support/test-driver.c>