From patchwork Wed Jun 7 20:39:50 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Erich Elsen X-Patchwork-Id: 20831 Received: (qmail 27149 invoked by alias); 7 Jun 2017 20:39:53 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 27124 invoked by uid 89); 7 Jun 2017 20:39:51 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-25.1 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, KAM_STOCKGEN, RCVD_IN_DNSWL_NONE, SPF_PASS, T_RP_MATCHES_RCVD autolearn=ham version=3.3.2 spammy=organize X-HELO: mail-oi0-f47.google.com X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=yiP09srWposH9jcqooId0v0kmMyumOt56+gQa67erUk=; b=aW1Zlav6CU5x3hkmsuZ4Y9uRGRC6ZKNJgNvX0vS28Rmt0zSe118szTV2C03DFQXKct fVuZy738fADbetl3ULnFtkjxQ/PHkvZidlhH/uc7miFIHMh7A+PS78J4U5EnK2P4LYTc JCqlxbi+OrenZMniUQM05dhbv8YVfC2Sf1Hk8/9Qrm9CtdMrs7RXxWVHYdASoXeNZ5EL Rse4+OrrsMYBkqCDcKR1rew6j/T88BTx6ptvu1gZJuMkDl1GA+8xQMM7e4Uu+FrCArpV tNnJxUSdk/2OfQouuCeSXTELREHkrrRIVBL1XJNFcYMWmlrroupR18IzdLfpnjkluzKp g2OQ== X-Gm-Message-State: AODbwcB7NHZ9AgvTTFW0RduOKwkNOHR5YtdNkJPHbkSP/2wXDTVaJgAS 3A24h0ajovFelFlljVNbCa2kJ6M0i3od X-Received: by 10.202.204.147 with SMTP id c141mr9395828oig.211.1496867991631; Wed, 07 Jun 2017 13:39:51 -0700 (PDT) MIME-Version: 1.0 In-Reply-To: References: <4a16e1e8-9baf-7b75-41b0-e25a127c649a@linaro.org> <264ee0ca-ee55-297b-ac16-2761c77e0bfc@linaro.org> <10c41e3f-d9ea-184d-4580-1beac97fb2dd@linaro.org> <05ff0033-370c-1970-5600-63638f17a496@linaro.org> From: Erich Elsen Date: Wed, 7 Jun 2017 13:39:50 -0700 Message-ID: Subject: Re: RFC: Rewrite x86-64 IFUNC selector in C To: Adhemerval Zanella Cc: "H.J. Lu" , Siddhesh Poyarekar , "Carlos O'Donell" , GNU C Library Here's a patch that also moves memset.S to memset.c. On Tue, May 30, 2017 at 1:32 PM, Adhemerval Zanella wrote: > > > On 30/05/2017 17:13, H.J. Lu wrote: >> On Tue, May 30, 2017 at 12:50 PM, Adhemerval Zanella >> wrote: >>> >>> >>> On 30/05/2017 13:24, H.J. Lu wrote: >>>> On Mon, May 29, 2017 at 1:34 PM, Adhemerval Zanella >>>> wrote: >>>>> >>>>> I think we can simplify it further and use the already existent ifunc macros on >>>>> libc-symbols.h. Also, for memmove I think we can organize the code better (at >>>>> least for ifunc) and build a extra object with a more meaningful name. I used >>>>> your logic for the ifunc selection and extended for memmove as well. >>>>> >>>> Here is the combined patch. >>> >>> >>>> diff --git a/sysdeps/x86_64/multiarch/Makefile b/sysdeps/x86_64/multiarch/Makefile >>>> index 3736f54..8a6ff00 100644 >>>> --- a/sysdeps/x86_64/multiarch/Makefile >>>> +++ b/sysdeps/x86_64/multiarch/Makefile >>>> @@ -19,6 +19,7 @@ sysdep_routines += strncat-c stpncpy-c strncpy-c strcmp-ssse3 \ >>>> strchr-sse2-no-bsf memcmp-ssse3 strstr-sse2-unaligned \ >>>> strcspn-c strpbrk-c strspn-c varshift \ >>>> memset-avx512-no-vzeroupper \ >>>> + memmove-sse2-unaligned-erms \ >>> >>> I still think this is a misleading name file since it build not only memmove but >>> also memcpy and mempcpy implementations. I would prefer to rename to a more >>> general name as in my suggestion. >> >> I'd like to keep it consistent with memmove-avx-unaligned-erms.S and >> memmove-avx512-unaligned-erms.S. As for making memcpy an alias >> of memmove, and embedding mempcpy, __memcpy_chk, >> __mempcpy_chk, __memmove_chk is an implementation detail. >> > > Alright, at least I think it should add some comments that the object > is building multiple symbols (it wasn't clear when I started to work > on the patch). From 6655fc03ff8306b6c752113ee826de818639c953 Mon Sep 17 00:00:00 2001 From: Erich Elsen Date: Wed, 7 Jun 2017 13:37:11 -0700 Subject: [PATCH 1/1] move memset.S -> memset.c --- sysdeps/x86_64/multiarch/Makefile | 1 + sysdeps/x86_64/multiarch/ifunc-memmove.h | 8 ++--- sysdeps/x86_64/multiarch/memcpy.c | 2 +- sysdeps/x86_64/multiarch/memmove.c | 2 +- sysdeps/x86_64/multiarch/mempcpy.c | 2 +- .../{memset.S => memset-sse2-unaligned-erms.S} | 37 +--------------------- sysdeps/x86_64/multiarch/memset.c | 34 ++++++++++++++++++++ 7 files changed, 43 insertions(+), 43 deletions(-) rename sysdeps/x86_64/multiarch/{memset.S => memset-sse2-unaligned-erms.S} (64%) create mode 100644 sysdeps/x86_64/multiarch/memset.c diff --git a/sysdeps/x86_64/multiarch/Makefile b/sysdeps/x86_64/multiarch/Makefile index 1fcde18c72..9d2583ad6f 100644 --- a/sysdeps/x86_64/multiarch/Makefile +++ b/sysdeps/x86_64/multiarch/Makefile @@ -23,6 +23,7 @@ sysdep_routines += strncat-c stpncpy-c strncpy-c strcmp-ssse3 \ memmove-sse2-unaligned-erms \ memmove-avx-unaligned-erms \ memmove-avx512-unaligned-erms \ + memset-sse2-unaligned-erms \ memset-avx2-unaligned-erms \ memset-avx512-unaligned-erms CFLAGS-varshift.c += -msse4 diff --git a/sysdeps/x86_64/multiarch/ifunc-memmove.h b/sysdeps/x86_64/multiarch/ifunc-memmove.h index 3034e33920..22c370a3c5 100644 --- a/sysdeps/x86_64/multiarch/ifunc-memmove.h +++ b/sysdeps/x86_64/multiarch/ifunc-memmove.h @@ -37,7 +37,7 @@ extern __typeof (REDIRECT_NAME) OPTIMIZE (avx512_no_vzeroupper) attribute_hidden; static inline void * -IFUNC_SELECTOR (void) +IFUNC_SELECTOR (bool has_avx_impl, bool has_ssse3_impl) { const struct cpu_features* cpu_features = __get_cpu_features (); @@ -56,7 +56,7 @@ IFUNC_SELECTOR (void) return OPTIMIZE (avx512_unaligned); } - if (CPU_FEATURES_ARCH_P (cpu_features, AVX_Fast_Unaligned_Load)) + if (has_avx_impl && CPU_FEATURES_ARCH_P (cpu_features, AVX_Fast_Unaligned_Load)) { if (CPU_FEATURES_CPU_P (cpu_features, ERMS)) return OPTIMIZE (avx_unaligned_erms); @@ -64,8 +64,8 @@ IFUNC_SELECTOR (void) return OPTIMIZE (avx_unaligned); } - if (!CPU_FEATURES_CPU_P (cpu_features, SSSE3) - || CPU_FEATURES_ARCH_P (cpu_features, Fast_Unaligned_Copy)) + if (!has_ssse3_impl || (!CPU_FEATURES_CPU_P (cpu_features, SSSE3) + || CPU_FEATURES_ARCH_P (cpu_features, Fast_Unaligned_Copy))) { if (CPU_FEATURES_CPU_P (cpu_features, ERMS)) return OPTIMIZE (sse2_unaligned_erms); diff --git a/sysdeps/x86_64/multiarch/memcpy.c b/sysdeps/x86_64/multiarch/memcpy.c index 730168ac96..b610adcaaa 100644 --- a/sysdeps/x86_64/multiarch/memcpy.c +++ b/sysdeps/x86_64/multiarch/memcpy.c @@ -30,7 +30,7 @@ extern __typeof (__redirect_memcpy) __new_memcpy; -libc_ifunc (__new_memcpy, IFUNC_SELECTOR ()); +libc_ifunc (__new_memcpy, IFUNC_SELECTOR (true, true)); # include versioned_symbol (libc, __new_memcpy, memcpy, GLIBC_2_14); diff --git a/sysdeps/x86_64/multiarch/memmove.c b/sysdeps/x86_64/multiarch/memmove.c index e2fa151d3c..74a71be185 100644 --- a/sysdeps/x86_64/multiarch/memmove.c +++ b/sysdeps/x86_64/multiarch/memmove.c @@ -29,6 +29,6 @@ extern __typeof (__redirect_memmove) __libc_memmove; -libc_ifunc (__libc_memmove, IFUNC_SELECTOR ()); +libc_ifunc (__libc_memmove, IFUNC_SELECTOR (true, true)); strong_alias (__libc_memmove, memmove); #endif diff --git a/sysdeps/x86_64/multiarch/mempcpy.c b/sysdeps/x86_64/multiarch/mempcpy.c index ac3fec5613..cbc722ac20 100644 --- a/sysdeps/x86_64/multiarch/mempcpy.c +++ b/sysdeps/x86_64/multiarch/mempcpy.c @@ -30,6 +30,6 @@ # define SYMBOL_NAME mempcpy # include "ifunc-memmove.h" -libc_ifunc_redirected (__redirect_mempcpy, __mempcpy, IFUNC_SELECTOR ()); +libc_ifunc_redirected (__redirect_mempcpy, __mempcpy, IFUNC_SELECTOR (true, true)); weak_alias (__mempcpy, mempcpy) #endif diff --git a/sysdeps/x86_64/multiarch/memset.S b/sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S similarity index 64% rename from sysdeps/x86_64/multiarch/memset.S rename to sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S index 11f27378b0..c69cd74895 100644 --- a/sysdeps/x86_64/multiarch/memset.S +++ b/sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S @@ -1,4 +1,4 @@ -/* Multiple versions of memset +/* memset with SSE2. All versions must be listed in ifunc-impl-list.c. Copyright (C) 2014-2017 Free Software Foundation, Inc. This file is part of the GNU C Library. @@ -21,41 +21,6 @@ #include #include -/* Define multiple versions only for the definition in lib. */ -#if IS_IN (libc) -ENTRY(memset) - .type memset, @gnu_indirect_function - LOAD_RTLD_GLOBAL_RO_RDX - lea __memset_erms(%rip), %RAX_LP - HAS_ARCH_FEATURE (Prefer_ERMS) - jnz 2f - lea __memset_sse2_unaligned_erms(%rip), %RAX_LP - HAS_CPU_FEATURE (ERMS) - jnz 1f - lea __memset_sse2_unaligned(%rip), %RAX_LP -1: - HAS_ARCH_FEATURE (AVX2_Usable) - jz 2f - lea __memset_avx2_unaligned_erms(%rip), %RAX_LP - HAS_CPU_FEATURE (ERMS) - jnz L(AVX512F) - lea __memset_avx2_unaligned(%rip), %RAX_LP -L(AVX512F): - HAS_ARCH_FEATURE (Prefer_No_AVX512) - jnz 2f - HAS_ARCH_FEATURE (AVX512F_Usable) - jz 2f - lea __memset_avx512_no_vzeroupper(%rip), %RAX_LP - HAS_ARCH_FEATURE (Prefer_No_VZEROUPPER) - jnz 2f - lea __memset_avx512_unaligned_erms(%rip), %RAX_LP - HAS_CPU_FEATURE (ERMS) - jnz 2f - lea __memset_avx512_unaligned(%rip), %RAX_LP -2: ret -END(memset) -#endif - #if IS_IN (libc) # define MEMSET_SYMBOL(p,s) p##_sse2_##s # define WMEMSET_SYMBOL(p,s) p##_sse2_##s diff --git a/sysdeps/x86_64/multiarch/memset.c b/sysdeps/x86_64/multiarch/memset.c new file mode 100644 index 0000000000..5a04192837 --- /dev/null +++ b/sysdeps/x86_64/multiarch/memset.c @@ -0,0 +1,34 @@ +/* Multiple versions of memset. + All versions must be listed in ifunc-impl-list.c. + Copyright (C) 2017 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +/* Define multiple versions only for the definition in lib and for + DSO. In static binaries we need memset before the initialization + happened. */ +#if IS_IN (libc) +# define memset __redirect_memset +# include +# undef memset + +# define SYMBOL_NAME memset +# include "ifunc-memmove.h" + +extern __typeof (__redirect_memset) memset; + +libc_ifunc (memset, IFUNC_SELECTOR (false, false)); +#endif -- 2.13.0.506.g27d5fe0cd-goog