From patchwork Thu Sep 22 14:14:11 2016
X-Patchwork-Submitter: John David Anglin
X-Patchwork-Id: 15922
From: John David Anglin
Date: Thu, 22 Sep 2016 10:14:11 -0400
Subject: [PATCH] hppa: Optimize atomic_compare_and_exchange_val_acq
Cc: deller@kernel.org, Carlos O'Donell, Mike Frysinger, Aurelien Jarno
To: GNU C Library
Message-Id: <58B70052-B987-4C41-B603-F3AAB2FDE34B@bell.net>

The attached patch replaces the conditional branch tests in
atomic_compare_and_exchange_val_acq with conditional instruction
nullification.  This avoids the stalls associated with conditional
branches, and the resulting code is shorter.  There are no branches in
the fast path when the operation is successful.

The change was intended as an optimization, but tst-stack4 now passes.

Please install.

Thanks,
Dave
---
John David Anglin	dave.anglin@bell.net

2016-09-22  John David Anglin

	* sysdeps/unix/sysv/linux/hppa/atomic-machine.h: Don't include
	abort-instr.h.
	(EFAULT): Remove conditional define.
	(ENOSYS): Likewise.
	(atomic_compare_and_exchange_val_acq): Use instruction nullification
	instead of conditional branch instructions.

Index: glibc-2.23/sysdeps/unix/sysv/linux/hppa/atomic-machine.h
===================================================================
--- glibc-2.23.orig/sysdeps/unix/sysv/linux/hppa/atomic-machine.h
+++ glibc-2.23/sysdeps/unix/sysv/linux/hppa/atomic-machine.h
@@ -17,13 +17,6 @@
    . */
 
 #include /* Required for type definitions e.g. uint8_t. */
-#include /* Required for ABORT_INSTRUCTIUON. */
-
-/* We need EFAULT, ENONSYS */
-#if !defined EFAULT && !defined ENOSYS
-#define EFAULT 14
-#define ENOSYS 251
-#endif
 
 #ifndef _ATOMIC_MACHINE_H
 #define _ATOMIC_MACHINE_H 1
@@ -62,7 +55,7 @@ typedef uintmax_t uatomic_max_t;
 #define _ASM_EDEADLOCK "-45"
 
 /* The only basic operation needed is compare and exchange.  The mem
-   pointer must be word aligned.  */
+   pointer must be word aligned.  We no longer loop on deadlock.  */
 #define atomic_compare_and_exchange_val_acq(mem, newval, oldval)      \
   ({                                                                  \
      register long lws_errno asm("r21");                              \
@@ -74,20 +67,15 @@ typedef uintmax_t uatomic_max_t;
        "0:                                     \n\t"                  \
        "ble    " _LWS "(%%sr2, %%r0)           \n\t"                  \
        "ldi    " _LWS_CAS ", %%r20             \n\t"                  \
-       "ldi    " _ASM_EAGAIN ", %%r20          \n\t"                  \
-       "cmpb,=,n %%r20, %%r21, 0b              \n\t"                  \
-       "nop                                    \n\t"                  \
-       "ldi    " _ASM_EDEADLOCK ", %%r20       \n\t"                  \
-       "cmpb,=,n %%r20, %%r21, 0b              \n\t"                  \
-       "nop                                    \n\t"                  \
+       "cmpiclr,<> " _ASM_EAGAIN ", %%r21, %%r0\n\t"                  \
+       "b,n 0b                                 \n\t"                  \
+       "cmpclr,= %%r0, %%r21, %%r0             \n\t"                  \
+       "iitlbp %%r0,(%%sr0, %%r0)              \n\t"                  \
        : "=r" (lws_ret), "=r" (lws_errno)                             \
        : "r" (lws_mem), "r" (lws_old), "r" (lws_new)                  \
        : _LWS_CLOBBER                                                 \
      );                                                               \
                                                                      \
-     if (lws_errno == -EFAULT || lws_errno == -ENOSYS)                \
-       ABORT_INSTRUCTION;                                             \
-                                                                     \
      (__typeof (oldval)) lws_ret;                                     \
    })
 
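
For readers who do not read PA-RISC assembly, the error handling before
and after this change can be modelled in plain C.  This is only a sketch,
not code from glibc: the names lws_action, old_action, new_action and the
MODEL_* constants are invented for illustration, and the value 11 for
EAGAIN is an assumption (the diff shows only _ASM_EDEADLOCK as "-45" and
the removed defines for EFAULT and ENOSYS).  It also assumes the usual
reading of the two instructions: cmpiclr,<> nullifies the following
"b,n 0b" unless r21 equals the EAGAIN immediate, and cmpclr,= nullifies
the following iitlbp only when r21 is zero.

```c
/* Hypothetical model of the errno handling after the LWS call.
   None of these names exist in glibc.  */
enum lws_action { LWS_DONE, LWS_RETRY, LWS_TRAP };

#define MODEL_EAGAIN     11   /* assumed value; not shown in the diff */
#define MODEL_EDEADLOCK  45   /* from _ASM_EDEADLOCK "-45" */
#define MODEL_EFAULT     14   /* from the removed define */
#define MODEL_ENOSYS    251   /* from the removed define */

/* Old sequence: two cmpb branches back to 0b, followed by a C-level
   test that aborts only on EFAULT or ENOSYS.  */
static enum lws_action
old_action (long lws_errno)
{
  if (lws_errno == -MODEL_EAGAIN || lws_errno == -MODEL_EDEADLOCK)
    return LWS_RETRY;
  if (lws_errno == -MODEL_EFAULT || lws_errno == -MODEL_ENOSYS)
    return LWS_TRAP;
  return LWS_DONE;
}

/* New sequence: the branch back to 0b stays live only for EAGAIN, and
   the trapping iitlbp is skipped only when the error value is zero.  */
static enum lws_action
new_action (long lws_errno)
{
  if (lws_errno == -MODEL_EAGAIN)
    return LWS_RETRY;
  if (lws_errno != 0)
    return LWS_TRAP;
  return LWS_DONE;
}
```

Under this model the two versions agree for success, EAGAIN, EFAULT and
ENOSYS.  They differ for EDEADLOCK, which the old code retried and the
new code traps on (matching the "We no longer loop on deadlock" comment),
and for any other nonzero error value, which the old code let through
and the new code traps on.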