From patchwork Tue Sep 1 18:59:47 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. Murphy" X-Patchwork-Id: 8548 X-Patchwork-Delegate: azanella@linux.vnet.ibm.com Received: (qmail 79682 invoked by alias); 1 Sep 2015 18:59:54 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 79671 invoked by uid 89); 1 Sep 2015 18:59:53 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.1 required=5.0 tests=AWL, BAYES_00, KAM_LAZY_DOMAIN_SECURITY, T_RP_MATCHES_RCVD autolearn=no version=3.3.2 X-HELO: e17.ny.us.ibm.com X-MailFrom: murphyp@linux.vnet.ibm.com X-RcptTo: libc-alpha@sourceware.org Message-ID: <55E5F5A3.5020105@linux.vnet.ibm.com> Date: Tue, 01 Sep 2015 13:59:47 -0500 From: "Paul E. Murphy" User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.8.0 MIME-Version: 1.0 To: "libc-alpha@sourceware.org" , Adhemerval Zanella , Tulio Magno Quites Machado Filho Subject: [PATCH] powerpc: Optimize lock elision for pthread_mutex_t X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 15090118-0041-0000-0000-00000151DF60 With TLE enabled, the adapt count variable update incurs an 8% overhead before entering the critical section of an elided mutex. Instead, if it is done right after leaving the critical section, this serialization can be avoided. This alters the existing behavior of __lll_trylock_elision as it will only decrement the adapt_count if it successfully acquires the lock. 2015-09-01 Paul E. Murphy * sysdeps/unix/sysv/linux/powerpc/elision-lock.c (__lll_lock_elision): Remove adapt_count decrement... * sysdeps/unix/sysv/linux/powerpc/elision-trylock.c (__lll_trylock_elision): Likewise. * sysdeps/unix/sysv/linux/powerpc/elision-unlock.c (__lll_unlock_elision): ... to here. --- sysdeps/unix/sysv/linux/powerpc/elision-lock.c | 1 - sysdeps/unix/sysv/linux/powerpc/elision-trylock.c | 1 - sysdeps/unix/sysv/linux/powerpc/elision-unlock.c | 10 ++++++++++ 3 files changed, 10 insertions(+), 2 deletions(-) diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-lock.c b/sysdeps/unix/sysv/linux/powerpc/elision-lock.c index 26d272e..3762732 100644 --- a/sysdeps/unix/sysv/linux/powerpc/elision-lock.c +++ b/sysdeps/unix/sysv/linux/powerpc/elision-lock.c @@ -47,7 +47,6 @@ __lll_lock_elision (int *lock, short *adapt_count, EXTRAARG int pshared) { if (*adapt_count > 0) { - (*adapt_count)--; goto use_lock; } diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-trylock.c b/sysdeps/unix/sysv/linux/powerpc/elision-trylock.c index 7b6d1b9..440939c 100644 --- a/sysdeps/unix/sysv/linux/powerpc/elision-trylock.c +++ b/sysdeps/unix/sysv/linux/powerpc/elision-trylock.c @@ -36,7 +36,6 @@ __lll_trylock_elision (int *futex, short *adapt_count) /* Only try a transaction if it's worth it. */ if (*adapt_count > 0) { - (*adapt_count)--; goto use_lock; } diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c index f04c339..88b6f5b 100644 --- a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c +++ b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c @@ -27,6 +27,16 @@ __lll_unlock_elision(int *lock, int pshared) if (*lock == 0) __builtin_tend (0); else + { + /* Note assumption that the lock _is_ the first element of the structure */ + short *adapt_count = &((pthread_mutex_t*)(lock))->__data.__elision; lll_unlock ((*lock), pshared); + + /* Update the adapt count AFTER completing the critical section. + Doing this here prevents unneeded stalling when entering + a critical section. Saving about 8% runtime on P8. */ + if (*adapt_count > 0) + (*adapt_count)--; + } return 0; }