From patchwork Wed Apr 6 16:37:07 2016
X-Patchwork-Submitter: "Lu, Hongjiu"
X-Patchwork-Id: 11653
Date: Wed, 6 Apr 2016 09:37:07 -0700
From: "H.J. Lu"
To: GNU C Library <libc-alpha@sourceware.org>
Subject: [committed, PATCH] X86-64: Prepare memset-vec-unaligned-erms.S
Message-ID: <20160406163707.GA17472@intel.com>

Prepare memset-vec-unaligned-erms.S so that the SSE2 version can be
used as the default memset.

Tested on x86-64.  Checked in.

H.J.
---
	* sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
	(MEMSET_CHK_SYMBOL): New.  Define if not defined.
	(__bzero): Check VEC_SIZE == 16 instead of USE_MULTIARCH.
	Disabled for now.
	Replace MEMSET_SYMBOL with MEMSET_CHK_SYMBOL on __memset_chk
	symbols.  Properly check USE_MULTIARCH on __memset symbols.
---
 .../x86_64/multiarch/memset-vec-unaligned-erms.S | 32 +++++++++++++---------
 1 file changed, 19 insertions(+), 13 deletions(-)
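A note on the __memset_chk entries renamed in this patch: they carry the
usual two-instruction _FORTIFY_SOURCE check.  In the SysV ABI the length n
arrives in %rdx and the known size of the destination object in %rcx (the
fourth argument); if the object is smaller than n, control transfers to
__chk_fail.  Below is a minimal standalone sketch of that contract, using
a hypothetical label instead of the ENTRY_CHK/MEMSET_CHK_SYMBOL machinery
of the real file:

	.text
	.globl	memset_chk_sketch
	.type	memset_chk_sketch, @function
memset_chk_sketch:
	/* %rdi = dst, %esi = byte, %rdx = n, %rcx = dst object size.  */
	cmpq	%rdx, %rcx	/* Destination object smaller than n?  */
	jb	__chk_fail@PLT	/* Yes: fortify failure, does not return.  */
	jmp	memset@PLT	/* No: tail-call the unchecked memset.  */
	.size	memset_chk_sketch, .-memset_chk_sketch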
diff --git a/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S b/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
index fe0f745..578a5ae 100644
--- a/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
+++ b/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
@@ -28,6 +28,10 @@
 
 #include <sysdep.h>
 
+#ifndef MEMSET_CHK_SYMBOL
+# define MEMSET_CHK_SYMBOL(p,s)	MEMSET_SYMBOL(p, s)
+#endif
+
 #ifndef VZEROUPPER
 # if VEC_SIZE > 16
 #  define VZEROUPPER	vzeroupper
@@ -66,8 +70,8 @@
 # error SECTION is not defined!
 #endif
 
-#if !defined USE_MULTIARCH && IS_IN (libc)
 	.section SECTION(.text),"ax",@progbits
+#if VEC_SIZE == 16 && IS_IN (libc) && 0
 ENTRY (__bzero)
 	movq	%rdi, %rax /* Set return value.  */
 	movq	%rsi, %rdx /* Set n.  */
@@ -78,10 +82,10 @@ weak_alias (__bzero, bzero)
 #endif
 
 #if defined SHARED && IS_IN (libc)
-ENTRY_CHK (MEMSET_SYMBOL (__memset_chk, unaligned))
+ENTRY_CHK (MEMSET_CHK_SYMBOL (__memset_chk, unaligned))
 	cmpq	%rdx, %rcx
 	jb	HIDDEN_JUMPTARGET (__chk_fail)
-END_CHK (MEMSET_SYMBOL (__memset_chk, unaligned))
+END_CHK (MEMSET_CHK_SYMBOL (__memset_chk, unaligned))
 #endif
 
 ENTRY (MEMSET_SYMBOL (__memset, unaligned))
@@ -97,15 +101,16 @@ L(entry_from_bzero):
 	VMOVU	%VEC(0), (%rdi)
 	VZEROUPPER
 	ret
+#if defined USE_MULTIARCH && IS_IN (libc)
 END (MEMSET_SYMBOL (__memset, unaligned))
 
-#if VEC_SIZE == 16
+# if VEC_SIZE == 16
 /* Only used to measure performance of REP STOSB.  */
 ENTRY (__memset_erms)
-#else
+# else
 /* Provide a symbol to debugger.  */
 ENTRY (MEMSET_SYMBOL (__memset, erms))
-#endif
+# endif
 L(stosb):
 	movq	%rdx, %rcx
 	movzbl	%sil, %eax
@@ -113,18 +118,18 @@ L(stosb):
 	rep stosb
 	movq	%rdx, %rax
 	ret
-#if VEC_SIZE == 16
+# if VEC_SIZE == 16
 END (__memset_erms)
-#else
+# else
 END (MEMSET_SYMBOL (__memset, erms))
-#endif
+# endif
 
-#if defined SHARED && IS_IN (libc)
-ENTRY_CHK (MEMSET_SYMBOL (__memset_chk, unaligned_erms))
+# if defined SHARED && IS_IN (libc)
+ENTRY_CHK (MEMSET_CHK_SYMBOL (__memset_chk, unaligned_erms))
 	cmpq	%rdx, %rcx
 	jb	HIDDEN_JUMPTARGET (__chk_fail)
-END_CHK (MEMSET_SYMBOL (__memset_chk, unaligned_erms))
-#endif
+END_CHK (MEMSET_CHK_SYMBOL (__memset_chk, unaligned_erms))
+# endif
 
 ENTRY (MEMSET_SYMBOL (__memset, unaligned_erms))
 	VDUP_TO_VEC0_AND_SET_RETURN (%esi, %rdi)
@@ -144,6 +149,7 @@ L(stosb_more_2x_vec):
 	/* Force 32-bit displacement to avoid long nop between
 	   instructions.  */
 	ja.d32	L(stosb)
+#endif
 	.p2align 4
 L(more_2x_vec):
 	cmpq	$(VEC_SIZE * 4), %rdx
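For completeness, the L(stosb) sequence that this patch moves under the
USE_MULTIARCH guard relies on REP STOSB, which stores AL into (%rdi) RCX
times while advancing %rdi; that is why the code parks the original
destination in %rdx so it can be returned.  A self-contained sketch of
the same sequence, under a hypothetical symbol name rather than the
ENTRY/MEMSET_SYMBOL macros of the real file:

	.text
	.globl	stosb_memset
	.type	stosb_memset, @function
stosb_memset:
	/* SysV ABI: %rdi = dst, %esi = fill byte, %rdx = n.  */
	movq	%rdx, %rcx	/* REP count = n.  */
	movzbl	%sil, %eax	/* AL = fill byte.  */
	movq	%rdi, %rdx	/* Save dst; STOSB advances %rdi.  */
	rep stosb		/* Store AL at (%rdi), %rcx times.  */
	movq	%rdx, %rax	/* Return the original dst.  */
	ret
	.size	stosb_memset, .-stosb_memset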