[PATCHv3] powerpc: Spinlock optimization and cleanup
Commit Message
Changes from V2:
* Cleanup whitespace
* Apply similar changes to pthread_spin_trylock
* Remove initial test as it doesn't impact P8 performance,
and degrades P7 performance.
* Remove out-of-date comments; someone cleaned up the
declarations in the meantime.
I'm still working on getting the benchmark utility in shape
for initial submission.
---8<---
This patch optimizes the powerpc spinlock implementation by:
* Using the correct EH hint bit on the larx for supported ISAs. For lock
acquisition, the thread that acquired the lock with a successful stcx
does not want to give away the write ownership on the cacheline. The
idea is to make the load reservation "sticky" about retaining write
authority to the line. That way, the store that must inevitably come
to release the lock can succeed quickly and not contend with other
threads issuing lwarx. If another thread does a store to the line
(false sharing), the winning thread must give up write authority.
The proper value of EH for the larx for a lock acquisition is 1.
* Increasing contended lock performance by up to 40%, with no measurable
impact on uncontended locks on P8.
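
As a rough illustration of the EH hint item above, the sketch below shows how a hint macro can be spliced into an asm template through string literal concatenation, which is the mechanism the diff uses with MUTEX_HINT_ACQ. The SKETCH_* names and both functions are hypothetical. The definitions are assumptions made for this example: that the hint is gated on GCC's _ARCH_PWR6 macro, and that the acquire instruction is isync (the instruction the diff replaces with __ARCH_ACQ_INSTR). They are not copied from the glibc headers.

```c
/* Hypothetical stand-ins for MUTEX_HINT_ACQ and __ARCH_ACQ_INSTR.
   Assumption: the EH operand is emitted only when the compiler targets
   POWER6 or later (GCC defines _ARCH_PWR6 there); otherwise the macro
   expands to nothing and the three-operand lwarx is used.  */
#ifdef _ARCH_PWR6
# define SKETCH_HINT_ACQ ",1"
#else
# define SKETCH_HINT_ACQ
#endif
#define SKETCH_ACQ_INSTR "isync"

/* Build the acquire sequence for a given hint string.  Adjacent string
   literals are concatenated by the compiler, so the hint becomes an
   optional fourth operand of lwarx.  */
#define SKETCH_ACQUIRE_ASM(hint)   \
  "1:\tlwarx\t%0,0,%1" hint "\n"   \
  "\tcmpwi\t0,%0,0\n"              \
  "\tbne-\t2f\n"                   \
  "\tstwcx.\t%2,0,%1\n"            \
  "\tbne-\t2f\n"                   \
  "\t" SKETCH_ACQ_INSTR "\n"

/* Return the template with or without the EH=1 hint, so that both
   expansions can be inspected on any host.  */
const char *
sketch_acquire_asm (int with_hint)
{
  return with_hint ? SKETCH_ACQUIRE_ASM (",1") : SKETCH_ACQUIRE_ASM ("");
}

/* Return the template the current compilation target would get.  */
const char *
sketch_acquire_asm_for_target (void)
{
  return SKETCH_ACQUIRE_ASM (SKETCH_HINT_ACQ);
}
```

Printing sketch_acquire_asm (1) shows the first line as lwarx %0,0,%1,1, while sketch_acquire_asm (0) leaves the instruction in its three-operand form. Keeping the hint in a macro means one source file serves both older and newer ISA levels.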
Thanks to Adhemerval Zanella who did most of the work. I've run some
tests, and addressed some minor feedback.
2015-10-28 Adhemerval Zanella <azanella@linux.vnet.ibm.com>
Paul E. Murphy <murphyp@linux.vnet.ibm.com>
* sysdeps/powerpc/nptl/pthread_spin_lock.c (pthread_spin_lock):
Add lwarx hint, and use macro for acquire instruction.
* sysdeps/powerpc/nptl/pthread_spin_trylock.c (pthread_spin_trylock):
Likewise.
* sysdeps/unix/sysv/linux/powerpc/pthread_spin_unlock.c: Move to ...
* sysdeps/powerpc/nptl/pthread_spin_unlock.c: ... here, and
update to use new atomic macros.
---
sysdeps/powerpc/nptl/pthread_spin_lock.c | 4 ++--
sysdeps/powerpc/nptl/pthread_spin_trylock.c | 4 ++--
sysdeps/powerpc/nptl/pthread_spin_unlock.c | 27 +++++++++++++++++++++
.../unix/sysv/linux/powerpc/pthread_spin_unlock.c | 28 ----------------------
4 files changed, 31 insertions(+), 32 deletions(-)
create mode 100644 sysdeps/powerpc/nptl/pthread_spin_unlock.c
delete mode 100644 sysdeps/unix/sysv/linux/powerpc/pthread_spin_unlock.c
Comments
"Paul E. Murphy" <murphyp@linux.vnet.ibm.com> writes:
> Changes from V2:
>
> * Cleanup whitespace
> * Apply similar changes to pthread_spin_trylock
> * Remove initial test as it doesn't impact P8 performance,
> and degrades P7 performance.
> * Remove out-of-date comments; someone cleaned up the
> declarations in the meantime.
>
> I'm still working on getting the benchmark utility in shape
> for initial submission.
>
> ---8<---
> This patch optimizes the powerpc spinlock implementation by:
>
> * Using the correct EH hint bit on the larx for supported ISAs. For lock
> acquisition, the thread that acquired the lock with a successful stcx
> does not want to give away the write ownership on the cacheline. The
> idea is to make the load reservation "sticky" about retaining write
> authority to the line. That way, the store that must inevitably come
> to release the lock can succeed quickly and not contend with other
> threads issuing lwarx. If another thread does a store to the line
> (false sharing), the winning thread must give up write authority.
> The proper value of EH for the larx for a lock acquisition is 1.
>
> * Increasing contended lock performance by up to 40%, with no measurable
> impact on uncontended locks on P8.
>
> Thanks to Adhemerval Zanella who did most of the work. I've run some
> tests, and addressed some minor feedback.
>
> 2015-10-28 Adhemerval Zanella <azanella@linux.vnet.ibm.com>
> Paul E. Murphy <murphyp@linux.vnet.ibm.com>
>
> * sysdeps/powerpc/nptl/pthread_spin_lock.c (pthread_spin_lock):
> Add lwarx hint, and use macro for acquire instruction.
> * sysdeps/powerpc/nptl/pthread_spin_trylock.c (pthread_spin_trylock):
> Likewise.
> * sysdeps/unix/sysv/linux/powerpc/pthread_spin_unlock.c: Move to ...
> * sysdeps/powerpc/nptl/pthread_spin_unlock.c: ... here, and
> update to use new atomic macros.
LGTM.
I'm pushing it.
Thanks!
@@ -24,12 +24,12 @@ pthread_spin_lock (pthread_spinlock_t *lock)
unsigned int __tmp;
asm volatile (
- "1: lwarx %0,0,%1\n"
+ "1: lwarx %0,0,%1" MUTEX_HINT_ACQ "\n"
" cmpwi 0,%0,0\n"
" bne- 2f\n"
" stwcx. %2,0,%1\n"
" bne- 2f\n"
- " isync\n"
+ __ARCH_ACQ_INSTR "\n"
" .subsection 1\n"
"2: lwzx %0,0,%1\n"
" cmpwi 0,%0,0\n"
@@ -25,13 +25,13 @@ pthread_spin_trylock (pthread_spinlock_t *lock)
unsigned int old;
int err = EBUSY;
- asm ("1: lwarx %0,0,%2\n"
+ asm ("1: lwarx %0,0,%2" MUTEX_HINT_ACQ "\n"
" cmpwi 0,%0,0\n"
" bne 2f\n"
" stwcx. %3,0,%2\n"
" bne- 1b\n"
" li %1,0\n"
- " isync\n"
+ __ARCH_ACQ_INSTR "\n"
"2: "
: "=&r" (old), "=&r" (err)
: "r" (lock), "r" (1), "1" (err)
new file mode 100644
@@ -0,0 +1,27 @@
+/* pthread_spin_unlock -- unlock a spin lock. PowerPC version.
+ Copyright (C) 2007-2015 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <http://www.gnu.org/licenses/>. */
+
+#include "pthreadP.h"
+#include <lowlevellock.h>
+
+int
+pthread_spin_unlock (pthread_spinlock_t *lock)
+{
+ atomic_store_release (lock, 0);
+ return 0;
+}
deleted file mode 100644
@@ -1,28 +0,0 @@
-/* pthread_spin_unlock -- unlock a spin lock. PowerPC version.
- Copyright (C) 2007-2015 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <http://www.gnu.org/licenses/>. */
-
-#include "pthreadP.h"
-#include <lowlevellock.h>
-
-int
-pthread_spin_unlock (pthread_spinlock_t *lock)
-{
- __asm __volatile (__ARCH_REL_INSTR ::: "memory");
- *lock = 0;
- return 0;
-}
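
For readers less familiar with the PowerPC sequences in the diff above, the following is a minimal portable sketch of the same lock, trylock and unlock shapes written with C11 atomics. It is not the glibc implementation: the sketch_* names and the sketch_spinlock_t type are hypothetical, and the mapping of the acquire compare-and-swap to lwarx/stwcx. plus the acquire instruction is an approximation for illustration only. The two unlock variants contrast the removed code (release barrier followed by a plain store) with the added code (a single release store).

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical lock type for this sketch; 0 means free, 1 means held.  */
typedef atomic_int sketch_spinlock_t;

/* Try once to take the lock.  Acquire ordering on success plays the role
   of the acquire instruction that follows a successful stwcx.  */
int
sketch_spin_trylock (sketch_spinlock_t *lock)
{
  int expected = 0;
  if (atomic_compare_exchange_strong_explicit (lock, &expected, 1,
                                               memory_order_acquire,
                                               memory_order_relaxed))
    return 0;
  return EBUSY;
}

/* Spin until the lock is taken.  While the lock is held, wait with plain
   relaxed loads, similar in spirit to the lwzx loop in subsection 1.  */
int
sketch_spin_lock (sketch_spinlock_t *lock)
{
  while (sketch_spin_trylock (lock) != 0)
    while (atomic_load_explicit (lock, memory_order_relaxed) != 0)
      ;
  return 0;
}

/* Shape of the removed unlock: a release barrier, then a plain store.  */
int
sketch_spin_unlock_old_style (sketch_spinlock_t *lock)
{
  atomic_thread_fence (memory_order_release);
  atomic_store_explicit (lock, 0, memory_order_relaxed);
  return 0;
}

/* Shape of the added unlock: one store with release semantics.  */
int
sketch_spin_unlock (sketch_spinlock_t *lock)
{
  atomic_store_explicit (lock, 0, memory_order_release);
  return 0;
}
```

A caller would pair sketch_spin_lock with sketch_spin_unlock around a critical section. Both unlock variants publish earlier writes before the lock is seen as free; the single release store states that intent directly.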