From patchwork Wed Jun 11 14:59:41 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wilco Dijkstra
X-Patchwork-Id: 1435
Received: (qmail 2679 invoked by alias); 11 Jun 2014 14:59:56 -0000
Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
Precedence: bulk
Sender: libc-alpha-owner@sourceware.org
Delivered-To: mailing list libc-alpha@sourceware.org
Received: (qmail 2668 invoked by uid 89); 11 Jun 2014 14:59:55 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-2.4 required=5.0 tests=AWL, BAYES_00, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=ham version=3.3.2
X-HELO: service87.mimecast.com
From: "Wilco"
To: "GNU C Library"
Subject: [PATCH] AArch64: Add libc_feholdsetround_noex_aarch64_ctx
Date: Wed, 11 Jun 2014 15:59:41 +0100
Message-ID: <004301cf8585$c6248360$526d8a20$@com>
X-MC-Unique: 114061115595004601

Hi,

This patch adds a new function, libc_feholdsetround_noex_aarch64_ctx,
which enables a further optimization: libc_feholdsetround_aarch64_ctx now
only needs to read the FPCR in the typical case, avoiding a redundant
FPSR read. Performance results show a good improvement (5-10% on sin())
on cores with expensive FPCR/FPSR instructions.

OK for commit?

Wilco

ChangeLog:

2014-06-11  Wilco

	* sysdeps/aarch64/fpu/math_private.h
	(libc_feholdsetround_noex_aarch64_ctx): New function.
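To make the saving described above concrete, here is a minimal sketch that counts simulated register accesses for the two variants. It is not the glibc code: struct sim_fpu, struct sim_ctx and the read_/write_ helpers are hypothetical stand-ins for the real FPCR/FPSR accessors, and the body of the non-noex variant is assumed to match the noex one minus the FPSR handling, which is what the hunk suggests.

```c
#include <stdint.h>

/* FPCR rounding-mode field (bits 23:22 on AArch64). */
#define RM_MASK 0x00c00000u

/* Hypothetical stand-in for the FPU: two registers plus an access counter. */
struct sim_fpu { uint32_t fpcr, fpsr; unsigned accesses; };

/* Hypothetical stand-in for struct rm_ctx. */
struct sim_ctx { uint32_t saved_fpcr, saved_fpsr; int updated; };

static uint32_t read_fpcr (struct sim_fpu *f) { f->accesses++; return f->fpcr; }
static uint32_t read_fpsr (struct sim_fpu *f) { f->accesses++; return f->fpsr; }
static void write_fpcr (struct sim_fpu *f, uint32_t v) { f->accesses++; f->fpcr = v; }

/* Sketch of the patched non-noex variant: reads FPCR only.  */
static void
hold_set_round (struct sim_fpu *f, struct sim_ctx *ctx, uint32_t r)
{
  uint32_t fpcr = read_fpcr (f);
  uint32_t round = (fpcr ^ r) & RM_MASK;

  ctx->updated = round != 0;
  if (round != 0)
    {
      ctx->saved_fpcr = fpcr;
      write_fpcr (f, fpcr ^ round);
    }
}

/* Sketch of the noex variant: also reads and saves FPSR so that the
   exception flags can be restored later.  */
static void
hold_set_round_noex (struct sim_fpu *f, struct sim_ctx *ctx, uint32_t r)
{
  uint32_t fpcr = read_fpcr (f);
  ctx->saved_fpsr = read_fpsr (f);
  uint32_t round = (fpcr ^ r) & RM_MASK;

  ctx->updated = round != 0;
  if (round != 0)
    {
      ctx->saved_fpcr = fpcr;
      write_fpcr (f, fpcr ^ round);
    }
}
```

In this model, when the requested rounding mode already matches the current one, hold_set_round costs one register access and hold_set_round_noex costs two. That is the typical case the message refers to.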
---
 sysdeps/aarch64/fpu/math_private.h | 30 +++++++++++++++++++++++++++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/sysdeps/aarch64/fpu/math_private.h b/sysdeps/aarch64/fpu/math_private.h
index 023c9d0..b13c030 100644
--- a/sysdeps/aarch64/fpu/math_private.h
+++ b/sysdeps/aarch64/fpu/math_private.h
@@ -228,12 +228,9 @@ static __always_inline void
 libc_feholdsetround_aarch64_ctx (struct rm_ctx *ctx, int r)
 {
   fpu_control_t fpcr;
-  fpu_fpsr_t fpsr;
   int round;
 
   _FPU_GETCW (fpcr);
-  _FPU_GETFPSR (fpsr);
-  ctx->env.__fpsr = fpsr;
 
   /* Check whether rounding modes are different. */
   round = (fpcr ^ r) & _FPU_FPCR_RM_MASK;
@@ -264,6 +261,33 @@ libc_feresetround_aarch64_ctx (struct rm_ctx *ctx)
 #define libc_feresetroundl_ctx libc_feresetround_aarch64_ctx
 
 static __always_inline void
+libc_feholdsetround_noex_aarch64_ctx (struct rm_ctx *ctx, int r)
+{
+  fpu_control_t fpcr;
+  fpu_fpsr_t fpsr;
+  int round;
+
+  _FPU_GETCW (fpcr);
+  _FPU_GETFPSR (fpsr);
+  ctx->env.__fpsr = fpsr;
+
+  /* Check whether rounding modes are different. */
+  round = (fpcr ^ r) & _FPU_FPCR_RM_MASK;
+  ctx->updated_status = round != 0;
+
+  /* Set the rounding mode if changed. */
+  if (__glibc_unlikely (round != 0))
+    {
+      ctx->env.__fpcr = fpcr;
+      _FPU_SETCW (fpcr ^ round);
+    }
+}
+
+#define libc_feholdsetround_noex_ctx libc_feholdsetround_noex_aarch64_ctx
+#define libc_feholdsetround_noexf_ctx libc_feholdsetround_noex_aarch64_ctx
+#define libc_feholdsetround_noexl_ctx libc_feholdsetround_noex_aarch64_ctx
+
+static __always_inline void
 libc_feresetround_noex_aarch64_ctx (struct rm_ctx *ctx)
 {
   /* Restore the rounding mode if updated. */
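
For readers unfamiliar with the XOR idiom in the hunk above, the following standalone sketch works through the arithmetic on plain integers. The helper names rm_diff and rm_apply are hypothetical and exist only for illustration. RM_MASK is assumed to equal _FPU_FPCR_RM_MASK, i.e. the FPCR rounding-mode field in bits 23:22.

```c
#include <stdint.h>

/* Assumed value of _FPU_FPCR_RM_MASK: FPCR bits 23:22.  */
#define RM_MASK 0x00c00000u

/* Bits of the rounding-mode field that differ between the current
   FPCR value and the requested mode R; zero means nothing to do.  */
static uint32_t
rm_diff (uint32_t fpcr, uint32_t r)
{
  return (fpcr ^ r) & RM_MASK;
}

/* FPCR with its rounding-mode field replaced by R's.  Flipping exactly
   the differing bits leaves every other control bit untouched.  */
static uint32_t
rm_apply (uint32_t fpcr, uint32_t r)
{
  return fpcr ^ rm_diff (fpcr, r);
}
```

For example, with fpcr = 0x01400000 (an unrelated control bit set, rounding field 0x00400000) and r = 0x00c00000, rm_diff gives 0x00800000 and rm_apply gives 0x01c00000: the rounding field now matches r and bit 24 is preserved. When the modes already match, rm_diff is zero, which is the condition the patch uses to skip the FPCR write.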