[PATCHv4] powerpc: Fix write-after-destroy in lock elision
Commit Message
Changes since v3:
- Use atomics to update adapt_count in __lll_unlock_elision.
- Improved the source code comments in __lll_unlock_elision to mention
the mutex destroy requirements.
Changes since v2:
- Fixed coding style.
Changes since v1:
- Replaced the references to a data race with the actual error: write-after-destroy.
- Enclosed adapt_count accesses in C11 atomics.
--- 8< ---
The update of *adapt_count after the release of the lock causes a race
condition when thread A unlocks, thread B continues and destroys the
mutex, and thread A writes to *adapt_count.
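[Editorial sketch: the ordering problem in the commit message can be illustrated in plain C. The `mini_mutex` struct and both functions below are hypothetical stand-ins, not glibc's actual layout or code; the transactional path is omitted.]

```c
#include <assert.h>

/* Hypothetical stand-in for the elided mutex state; the field names
   mirror the patch, not glibc's real struct layout.  */
struct mini_mutex { int lock; short adapt_count; };

/* Pre-patch ordering: the adapt_count hint is stored AFTER the lock
   word is released.  Once released, another thread may legally
   acquire, unlock and destroy (free) the mutex, so the trailing
   store can land in freed memory -- the write-after-destroy of
   BZ 20822.  */
static void unlock_buggy (struct mini_mutex *m)
{
  m->lock = 0;          /* lock released here ...                 */
  /* <-- window: pthread_mutex_destroy may run in another thread */
  if (m->adapt_count > 0)
    m->adapt_count--;   /* ... this store races with destruction  */
}

/* Patched ordering: update the hint while still holding the
   fallback lock, so no store touches the mutex after the release
   that makes destruction legal.  */
static void unlock_fixed (struct mini_mutex *m)
{
  if (m->adapt_count > 0)
    m->adapt_count--;   /* still inside the critical section      */
  m->lock = 0;          /* the release is the last access         */
}
```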
2017-01-03 Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com>
Steven Munroe <sjmunroe@us.ibm.com>
Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>
[BZ #20822]
* sysdeps/unix/sysv/linux/powerpc/elision-lock.c
(__lll_lock_elision): Access adapt_count via C11 atomics.
* sysdeps/unix/sysv/linux/powerpc/elision-trylock.c
(__lll_trylock_elision): Likewise.
* sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
(__lll_unlock_elision): Update adapt_count variable inside the
critical section using C11 atomics.
---
sysdeps/unix/sysv/linux/powerpc/elision-lock.c | 8 +++++---
sysdeps/unix/sysv/linux/powerpc/elision-trylock.c | 7 ++++---
sysdeps/unix/sysv/linux/powerpc/elision-unlock.c | 15 +++++++++------
3 files changed, 18 insertions(+), 12 deletions(-)
Comments
On Tue, 2017-01-03 at 10:34 -0200, Tulio Magno Quites Machado Filho
wrote:
> Changes since v3:
> - Use atomics to update adapt_count in __lll_unlock_elision.
> - Improved the source code comments in __lll_unlock_elision to mention
> the mutex destroy requirements.
>
> Changes since v2:
> - Fixed coding style.
>
> Changes since v1:
> - Replaced the references to a data race with the actual error: write-after-destroy.
> - Enclosed adapt_count accesses in C11 atomics.
>
> --- 8< ---
This is okay, thanks.
For the next time, or when you change non-arch-specific code, please
remember to document memory orders; a brief comment is often sufficient
but reveals the intent behind a specific bit of implementation. For
example:
> diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-lock.c b/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
> index 4589491..044803a 100644
> --- a/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
> +++ b/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
> @@ -45,7 +45,7 @@
> int
> __lll_lock_elision (int *lock, short *adapt_count, EXTRAARG int pshared)
> {
> - if (*adapt_count > 0)
> + if (atomic_load_relaxed (adapt_count) > 0)
Here, you could have just added:
/* adapt_count is accessed concurrently but is just a hint. Thus,
use atomic accesses but relaxed MO is sufficient. */
This helps others to more quickly understand what was intended. This
isn't critical in this case because we have well-documented similar code
for other architectures, but generally it is better to document why a
choice was made if it is not immediately obvious, or cannot be checked
for consistency based on just local information (eg, the source code
lines around the statement in question); concurrent code obviously can't
be cross-checked with just local information...
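[Editorial sketch of the point being made: standard C11 atomics stand in for glibc's atomic_load_relaxed/atomic_store_relaxed macros, and the function names below are illustrative, not glibc's.]

```c
#include <assert.h>
#include <stdatomic.h>

/* adapt_count is only a heuristic: a stale read merely makes a
   thread choose between elision and the fallback lock suboptimally,
   never incorrectly.  Atomic accesses are still needed so the
   compiler cannot tear or duplicate the plain loads/stores, but
   relaxed memory order is sufficient -- exactly the intent the
   suggested comment documents.  */
static _Atomic short adapt_count;

static int
should_skip_elision (void)
{
  return atomic_load_explicit (&adapt_count, memory_order_relaxed) > 0;
}

static void
penalize_elision (short skip)
{
  /* Mirrors e.g. the adapt_count = aconf.skip_lock_busy store.  */
  atomic_store_explicit (&adapt_count, skip, memory_order_relaxed);
}
```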
Torvald Riegel <triegel@redhat.com> writes:
> For the next time, or when you change non-arch-specific code, please
> remember to document memory orders; a brief comment is often sufficient
> but reveals the intent behind a specific bit of implementation. For
> example:
>
>> diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-lock.c b/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
>> index 4589491..044803a 100644
>> --- a/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
>> +++ b/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
>> @@ -45,7 +45,7 @@
>> int
>> __lll_lock_elision (int *lock, short *adapt_count, EXTRAARG int pshared)
>> {
>> - if (*adapt_count > 0)
>> + if (atomic_load_relaxed (adapt_count) > 0)
>
> Here, you could have just added:
> /* adapt_count is accessed concurrently but is just a hint. Thus,
> use atomic accesses but relaxed MO is sufficient. */
>
> This helps others to more quickly understand what was intended. This
> isn't critical in this case because we have well-documented similar code
> for other architectures, but generally it is better to document why a
> choice was made if it is not immediately obvious, or cannot be checked
> for consistency based on just local information (eg, the source code
> lines around the statement in question); concurrent code obviously can't
> be cross-checked with just local information...
I agree with you.
I added this comment in this patch and pushed it as e9a96ea.
Thanks!
On Jan 11 2017, "Ulrich Weigand" <uweigand@de.ibm.com> wrote:
> Tulio Magno Quites Machado Filho wrote:
>
>> + /* Update adapt_count in the critical section to prevent a
>> + write-after-destroy error as mentioned in BZ 20822. The
>> + following update of adapt_count has to be contained within
>> + the critical region of the fall-back lock in order to not violate
>> + the mutex destruction requirements. */
>> + short __tmp = atomic_load_relaxed (adapt_count);
>> + if (__tmp > 0)
>> + atomic_store_relaxed (adapt_count, __tmp--);
>
> Doesn't this have to be --__tmp to have any effect?
It should be just __tmp - 1.
Andreas.
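[Editorial note: the bug Ulrich spotted is plain post-decrement semantics -- `__tmp--` evaluates to the old value, so the quoted store writes adapt_count back unchanged and the decrement dies with the local variable. A minimal demonstration, with illustrative names:]

```c
#include <assert.h>

/* Mimics the quoted hunk with plain variables.  */
static void
store_with_postdec (short *adapt_count)
{
  short tmp = *adapt_count;
  if (tmp > 0)
    *adapt_count = tmp--;   /* stores the OLD tmp; only the local
                               copy is decremented, then discarded */
}

/* What the code needs to store instead.  */
static void
store_minus_one (short *adapt_count)
{
  short tmp = *adapt_count;
  if (tmp > 0)
    *adapt_count = tmp - 1; /* Andreas' "__tmp - 1" */
}
```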
--- a/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
+++ b/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
@@ -45,7 +45,7 @@
int
__lll_lock_elision (int *lock, short *adapt_count, EXTRAARG int pshared)
{
- if (*adapt_count > 0)
+ if (atomic_load_relaxed (adapt_count) > 0)
{
goto use_lock;
}
@@ -67,7 +67,8 @@ __lll_lock_elision (int *lock, short *adapt_count, EXTRAARG int pshared)
if (_TEXASRU_FAILURE_PERSISTENT (__builtin_get_texasru ()))
{
if (aconf.skip_lock_internal_abort > 0)
- *adapt_count = aconf.skip_lock_internal_abort;
+ atomic_store_relaxed (adapt_count,
+ aconf.skip_lock_internal_abort);
goto use_lock;
}
}
@@ -75,7 +76,8 @@ __lll_lock_elision (int *lock, short *adapt_count, EXTRAARG int pshared)
/* Fall back to locks for a bit if retries have been exhausted */
if (aconf.try_tbegin > 0 && aconf.skip_lock_out_of_tbegin_retries > 0)
- *adapt_count = aconf.skip_lock_out_of_tbegin_retries;
+ atomic_store_relaxed (adapt_count,
+ aconf.skip_lock_out_of_tbegin_retries);
use_lock:
return LLL_LOCK ((*lock), pshared);
--- a/sysdeps/unix/sysv/linux/powerpc/elision-trylock.c
+++ b/sysdeps/unix/sysv/linux/powerpc/elision-trylock.c
@@ -34,7 +34,7 @@ __lll_trylock_elision (int *futex, short *adapt_count)
__libc_tabort (_ABORT_NESTED_TRYLOCK);
/* Only try a transaction if it's worth it. */
- if (*adapt_count > 0)
+ if (atomic_load_relaxed (adapt_count) > 0)
{
goto use_lock;
}
@@ -49,7 +49,7 @@ __lll_trylock_elision (int *futex, short *adapt_count)
__libc_tend (0);
if (aconf.skip_lock_busy > 0)
- *adapt_count = aconf.skip_lock_busy;
+ atomic_store_relaxed (adapt_count, aconf.skip_lock_busy);
}
else
{
@@ -59,7 +59,8 @@ __lll_trylock_elision (int *futex, short *adapt_count)
result in another failure. Use normal locking now and
for the next couple of calls. */
if (aconf.skip_trylock_internal_abort > 0)
- *adapt_count = aconf.skip_trylock_internal_abort;
+ atomic_store_relaxed (adapt_count,
+ aconf.skip_trylock_internal_abort);
}
}
--- a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
+++ b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
@@ -28,13 +28,16 @@ __lll_unlock_elision (int *lock, short *adapt_count, int pshared)
__libc_tend (0);
else
{
- lll_unlock ((*lock), pshared);
+ /* Update adapt_count in the critical section to prevent a
+ write-after-destroy error as mentioned in BZ 20822. The
+ following update of adapt_count has to be contained within
+ the critical region of the fall-back lock in order to not violate
+ the mutex destruction requirements. */
+ short __tmp = atomic_load_relaxed (adapt_count);
+ if (__tmp > 0)
+ atomic_store_relaxed (adapt_count, __tmp--);
- /* Update the adapt count AFTER completing the critical section.
- Doing this here prevents unneeded stalling when entering
- a critical section. Saving about 8% runtime on P8. */
- if (*adapt_count > 0)
- (*adapt_count)--;
+ lll_unlock ((*lock), pshared);
}
return 0;
}
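[Editorial sketch: how the fallback branch of __lll_unlock_elision reads with both the v4 reordering and the follow-up `__tmp - 1` correction folded in. C11 atomics and a hypothetical `elided_lock` struct stand in for glibc's macros and lll_unlock; this is not the committed code.]

```c
#include <assert.h>
#include <stdatomic.h>

struct elided_lock { int lock; _Atomic short adapt_count; };

static int
unlock_elision_fallback (struct elided_lock *l, int pshared)
{
  (void) pshared;   /* the real lll_unlock consumes this */

  /* Update adapt_count inside the critical section (BZ 20822), and
     store tmp - 1 rather than tmp-- so the decrement actually
     reaches memory.  Relaxed MO: the counter is only a hint.  */
  short tmp = atomic_load_explicit (&l->adapt_count,
                                    memory_order_relaxed);
  if (tmp > 0)
    atomic_store_explicit (&l->adapt_count, tmp - 1,
                           memory_order_relaxed);

  l->lock = 0;      /* stand-in for lll_unlock: nothing after this
                       touches the (possibly destroyed) mutex */
  return 0;
}
```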