From patchwork Wed Jul 13 23:09:45 2022
X-Patchwork-Submitter: Noah Goldstein
X-Patchwork-Id: 56052
From: Noah Goldstein
To: libc-alpha@sourceware.org
Subject: [PATCH v1 3/3] x86: Add support to build st{p|r}{n}{cpy|cat} with explicit ISA level
Date: Wed, 13 Jul 2022 16:09:45 -0700
Message-Id: <20220713230945.1866193-3-goldstein.w.n@gmail.com>
In-Reply-To: <20220713230945.1866193-1-goldstein.w.n@gmail.com>
References: <20220713230945.1866193-1-goldstein.w.n@gmail.com>
List-Id: Libc-alpha mailing list

This commit also follows the same pattern of earlier commits:

1. Refactor files so that all implementations are in the multiarch
   directory.
    - i.e. move strcpy's SSE2 implementation from strcpy.S to
      multiarch/strcpy-sse2.S
    - The non-multiarch file now only includes one of the
      implementations in the multiarch directory, based on the
      compiled ISA level. This is only used for non-multiarch builds;
      otherwise we go through the ifunc selector.

2. Add ISA level build guards to the different implementations.
    - i.e. strcpy-avx2.S, which is ISA level 3, will only build if the
      compiled ISA level is <= 3. Otherwise there is no reason to
      include it, as we will always use one of the ISA level 4
      implementations (strcpy-evex.S).

3. Add new multiarch/rtld-.S files that just include the
   non-multiarch .S file, which will in turn select the best
   implementation based on the compiled ISA level.
    - i.e. multiarch/rtld-stpcpy.S

4. Refactor the ifunc selector and ifunc implementation list to use
   the ISA level aware wrapper macros, which allow functions below the
   compiled ISA level (with a guaranteed replacement) to be skipped.

Tested with and without multiarch on x86_64 for ISA levels:
{generic, x86-64-v2, x86-64-v3, x86-64-v4}

And m32 with and without multiarch.
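As a rough illustration of points 2 and 4, the sketch below models how a
compiled ISA level can cause lower-level implementations to be skipped.
It is a simplified stand-in, not glibc code: DEMO_MIN_ISA_LEVEL,
DEMO_ISA_SHOULD_BUILD, DEMO_IMPL_ADD, struct demo_impl and
demo_list_impls are hypothetical names, and the guard condition
(compiled level <= implementation level) is taken from the commit
message rather than from the real ISA_SHOULD_BUILD definition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the ISA level the library is compiled for
   (1 = baseline x86-64 ... 4 = x86-64-v4).  Override with
   -DDEMO_MIN_ISA_LEVEL=N.  */
#ifndef DEMO_MIN_ISA_LEVEL
# define DEMO_MIN_ISA_LEVEL 3
#endif

/* Simplified guard (assumption): an implementation is worth building
   only if the compiled ISA level is <= the level it targets.  */
#define DEMO_ISA_SHOULD_BUILD(level) (DEMO_MIN_ISA_LEVEL <= (level))

struct demo_impl
{
  const char *name;
  int isa_level;
};

/* Append an entry only when its level passes the guard.  The condition
   is a compile-time constant, so skipped entries cost nothing.  */
#define DEMO_IMPL_ADD(out, n, level, impl_name)                   \
  do                                                              \
    {                                                             \
      if (DEMO_ISA_SHOULD_BUILD (level))                          \
        (out)[(n)++] = (struct demo_impl) { impl_name, level };   \
    }                                                             \
  while (0)

/* Fill OUT (room for at least 4 entries) with the implementations that
   survive the guard, highest ISA level first.  Returns the count.  */
static size_t
demo_list_impls (struct demo_impl *out)
{
  size_t n = 0;
  assert (out != NULL);
  DEMO_IMPL_ADD (out, n, 4, "strcpy_evex");
  DEMO_IMPL_ADD (out, n, 3, "strcpy_avx2");
  DEMO_IMPL_ADD (out, n, 2, "strcpy_sse2_unaligned");
  DEMO_IMPL_ADD (out, n, 1, "strcpy_sse2");
  return n;
}
```

With the default DEMO_MIN_ISA_LEVEL of 3, only the level 4 and level 3
entries are listed; the two SSE2 variants are skipped because a level 3
build always has a better replacement available.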
---
 sysdeps/x86_64/multiarch/ifunc-impl-list.c    | 183 +++++++++++-------
 sysdeps/x86_64/multiarch/ifunc-strcpy.h       |  27 +--
 sysdeps/x86_64/multiarch/ifunc-strncpy.h      |  22 ++-
 sysdeps/x86_64/multiarch/stpcpy-avx2.S        |   6 +-
 sysdeps/x86_64/multiarch/stpcpy-evex.S        |   6 +-
 .../x86_64/multiarch/stpcpy-sse2-unaligned.S  |   6 +-
 sysdeps/x86_64/multiarch/stpcpy-sse2.S        |  25 +--
 sysdeps/x86_64/multiarch/stpncpy-avx2.S       |   6 +-
 sysdeps/x86_64/multiarch/stpncpy-evex.S       |   6 +-
 .../x86_64/multiarch/stpncpy-sse2-unaligned.S |   6 +-
 sysdeps/x86_64/multiarch/strcat-avx2.S        |   5 +-
 sysdeps/x86_64/multiarch/strcat-evex.S        |   5 +-
 .../x86_64/multiarch/strcat-sse2-unaligned.S  |   7 +-
 sysdeps/x86_64/multiarch/strcat-sse2.S        |  18 +-
 sysdeps/x86_64/multiarch/strcpy-avx2.S        |   5 +-
 sysdeps/x86_64/multiarch/strcpy-evex.S        |   5 +-
 .../x86_64/multiarch/strcpy-sse2-unaligned.S  |   7 +-
 sysdeps/x86_64/multiarch/strcpy-sse2.S        |  23 ++-
 sysdeps/x86_64/multiarch/strncat-avx2.S       |   6 +-
 sysdeps/x86_64/multiarch/strncat-evex.S       |   6 +-
 .../x86_64/multiarch/strncat-sse2-unaligned.S |   6 +-
 sysdeps/x86_64/multiarch/strncpy-avx2.S       |   6 +-
 sysdeps/x86_64/multiarch/strncpy-evex.S       |   6 +-
 .../x86_64/multiarch/strncpy-sse2-unaligned.S |   6 +-
 sysdeps/x86_64/stpcpy.S                       |  26 ++-
 sysdeps/x86_64/stpncpy.S                      |  28 +++
 sysdeps/x86_64/strcat.S                       |  15 +-
 sysdeps/x86_64/strcpy.S                       |  12 +-
 sysdeps/x86_64/strncat.S                      |  28 +++
 sysdeps/x86_64/strncpy.S                      |  27 +++
 30 files changed, 382 insertions(+), 158 deletions(-)
 create mode 100644 sysdeps/x86_64/stpncpy.S
 create mode 100644 sysdeps/x86_64/strncat.S
 create mode 100644 sysdeps/x86_64/strncpy.S

diff --git a/sysdeps/x86_64/multiarch/ifunc-impl-list.c b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
index 9318e98cc8..a71444eccb 100644
--- a/sysdeps/x86_64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
@@ -403,33 +403,46 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,

   /* Support sysdeps/x86_64/multiarch/stpncpy.c.  */
   IFUNC_IMPL (i, name, stpncpy,
-              IFUNC_IMPL_ADD (array, i, stpncpy, CPU_FEATURE_USABLE (AVX2),
-                              __stpncpy_avx2)
-              IFUNC_IMPL_ADD (array, i, stpncpy,
-                              (CPU_FEATURE_USABLE (AVX2)
-                               && CPU_FEATURE_USABLE (RTM)),
-                              __stpncpy_avx2_rtm)
-              IFUNC_IMPL_ADD (array, i, stpncpy,
-                              (CPU_FEATURE_USABLE (AVX512VL)
-                               && CPU_FEATURE_USABLE (AVX512BW)),
-                              __stpncpy_evex)
-              IFUNC_IMPL_ADD (array, i, stpncpy, 1,
-                              __stpncpy_sse2_unaligned))
+              X86_IFUNC_IMPL_ADD_V4 (array, i, stpncpy,
+                                     (CPU_FEATURE_USABLE (AVX512VL)
+                                      && CPU_FEATURE_USABLE (AVX512BW)),
+                                     __stpncpy_evex)
+              X86_IFUNC_IMPL_ADD_V3 (array, i, stpncpy,
+                                     CPU_FEATURE_USABLE (AVX2),
+                                     __stpncpy_avx2)
+              X86_IFUNC_IMPL_ADD_V3 (array, i, stpncpy,
+                                     (CPU_FEATURE_USABLE (AVX2)
+                                      && CPU_FEATURE_USABLE (RTM)),
+                                     __stpncpy_avx2_rtm)
+              /* ISA V2 wrapper for sse2_unaligned implementation because
+                 the sse2_unaligned implementation is also used at ISA
+                 level 2.  */
+              X86_IFUNC_IMPL_ADD_V2 (array, i, stpncpy,
+                                     1,
+                                     __stpncpy_sse2_unaligned))

   /* Support sysdeps/x86_64/multiarch/stpcpy.c.  */
   IFUNC_IMPL (i, name, stpcpy,
-              IFUNC_IMPL_ADD (array, i, stpcpy, CPU_FEATURE_USABLE (AVX2),
-                              __stpcpy_avx2)
-              IFUNC_IMPL_ADD (array, i, stpcpy,
-                              (CPU_FEATURE_USABLE (AVX2)
-                               && CPU_FEATURE_USABLE (RTM)),
-                              __stpcpy_avx2_rtm)
-              IFUNC_IMPL_ADD (array, i, stpcpy,
-                              (CPU_FEATURE_USABLE (AVX512VL)
-                               && CPU_FEATURE_USABLE (AVX512BW)),
-                              __stpcpy_evex)
-              IFUNC_IMPL_ADD (array, i, stpcpy, 1, __stpcpy_sse2_unaligned)
-              IFUNC_IMPL_ADD (array, i, stpcpy, 1, __stpcpy_sse2))
+              X86_IFUNC_IMPL_ADD_V4 (array, i, stpcpy,
+                                     (CPU_FEATURE_USABLE (AVX512VL)
+                                      && CPU_FEATURE_USABLE (AVX512BW)),
+                                     __stpcpy_evex)
+              X86_IFUNC_IMPL_ADD_V3 (array, i, stpcpy,
+                                     CPU_FEATURE_USABLE (AVX2),
+                                     __stpcpy_avx2)
+              X86_IFUNC_IMPL_ADD_V3 (array, i, stpcpy,
+                                     (CPU_FEATURE_USABLE (AVX2)
+                                      && CPU_FEATURE_USABLE (RTM)),
+                                     __stpcpy_avx2_rtm)
+              /* ISA V2 wrapper for sse2_unaligned implementation because
+                 the sse2_unaligned implementation is also used at ISA
+                 level 2.  */
+              X86_IFUNC_IMPL_ADD_V2 (array, i, stpcpy,
+                                     1,
+                                     __stpcpy_sse2_unaligned)
+              X86_IFUNC_IMPL_ADD_V1 (array, i, stpcpy,
+                                     1,
+                                     __stpcpy_sse2))

   /* Support sysdeps/x86_64/multiarch/strcasecmp_l.c.  */
   IFUNC_IMPL (i, name, strcasecmp,
@@ -477,18 +490,26 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,

   /* Support sysdeps/x86_64/multiarch/strcat.c.  */
   IFUNC_IMPL (i, name, strcat,
-              IFUNC_IMPL_ADD (array, i, strcat, CPU_FEATURE_USABLE (AVX2),
-                              __strcat_avx2)
-              IFUNC_IMPL_ADD (array, i, strcat,
-                              (CPU_FEATURE_USABLE (AVX2)
-                               && CPU_FEATURE_USABLE (RTM)),
-                              __strcat_avx2_rtm)
-              IFUNC_IMPL_ADD (array, i, strcat,
-                              (CPU_FEATURE_USABLE (AVX512VL)
-                               && CPU_FEATURE_USABLE (AVX512BW)),
-                              __strcat_evex)
-              IFUNC_IMPL_ADD (array, i, strcat, 1, __strcat_sse2_unaligned)
-              IFUNC_IMPL_ADD (array, i, strcat, 1, __strcat_sse2))
+              X86_IFUNC_IMPL_ADD_V4 (array, i, strcat,
+                                     (CPU_FEATURE_USABLE (AVX512VL)
+                                      && CPU_FEATURE_USABLE (AVX512BW)),
+                                     __strcat_evex)
+              X86_IFUNC_IMPL_ADD_V3 (array, i, strcat,
+                                     CPU_FEATURE_USABLE (AVX2),
+                                     __strcat_avx2)
+              X86_IFUNC_IMPL_ADD_V3 (array, i, strcat,
+                                     (CPU_FEATURE_USABLE (AVX2)
+                                      && CPU_FEATURE_USABLE (RTM)),
+                                     __strcat_avx2_rtm)
+              /* ISA V2 wrapper for sse2_unaligned implementation because
+                 the sse2_unaligned implementation is also used at ISA
+                 level 2.  */
+              X86_IFUNC_IMPL_ADD_V2 (array, i, strcat,
+                                     1,
+                                     __strcat_sse2_unaligned)
+              X86_IFUNC_IMPL_ADD_V1 (array, i, strcat,
+                                     1,
+                                     __strcat_sse2))

   /* Support sysdeps/x86_64/multiarch/strchr.c.  */
   IFUNC_IMPL (i, name, strchr,
@@ -584,18 +605,26 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,

   /* Support sysdeps/x86_64/multiarch/strcpy.c.  */
   IFUNC_IMPL (i, name, strcpy,
-              IFUNC_IMPL_ADD (array, i, strcpy, CPU_FEATURE_USABLE (AVX2),
-                              __strcpy_avx2)
-              IFUNC_IMPL_ADD (array, i, strcpy,
-                              (CPU_FEATURE_USABLE (AVX2)
-                               && CPU_FEATURE_USABLE (RTM)),
-                              __strcpy_avx2_rtm)
-              IFUNC_IMPL_ADD (array, i, strcpy,
-                              (CPU_FEATURE_USABLE (AVX512VL)
-                               && CPU_FEATURE_USABLE (AVX512BW)),
-                              __strcpy_evex)
-              IFUNC_IMPL_ADD (array, i, strcpy, 1, __strcpy_sse2_unaligned)
-              IFUNC_IMPL_ADD (array, i, strcpy, 1, __strcpy_sse2))
+              X86_IFUNC_IMPL_ADD_V4 (array, i, strcpy,
+                                     (CPU_FEATURE_USABLE (AVX512VL)
+                                      && CPU_FEATURE_USABLE (AVX512BW)),
+                                     __strcpy_evex)
+              X86_IFUNC_IMPL_ADD_V3 (array, i, strcpy,
+                                     CPU_FEATURE_USABLE (AVX2),
+                                     __strcpy_avx2)
+              X86_IFUNC_IMPL_ADD_V3 (array, i, strcpy,
+                                     (CPU_FEATURE_USABLE (AVX2)
+                                      && CPU_FEATURE_USABLE (RTM)),
+                                     __strcpy_avx2_rtm)
+              /* ISA V2 wrapper for sse2_unaligned implementation because
+                 the sse2_unaligned implementation is also used at ISA
+                 level 2.  */
+              X86_IFUNC_IMPL_ADD_V2 (array, i, strcpy,
+                                     1,
+                                     __strcpy_sse2_unaligned)
+              X86_IFUNC_IMPL_ADD_V1 (array, i, strcpy,
+                                     1,
+                                     __strcpy_sse2))

   /* Support sysdeps/x86_64/multiarch/strcspn.c.  */
   IFUNC_IMPL (i, name, strcspn,
@@ -651,33 +680,43 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,

   /* Support sysdeps/x86_64/multiarch/strncat.c.  */
   IFUNC_IMPL (i, name, strncat,
-              IFUNC_IMPL_ADD (array, i, strncat, CPU_FEATURE_USABLE (AVX2),
-                              __strncat_avx2)
-              IFUNC_IMPL_ADD (array, i, strncat,
-                              (CPU_FEATURE_USABLE (AVX2)
-                               && CPU_FEATURE_USABLE (RTM)),
-                              __strncat_avx2_rtm)
-              IFUNC_IMPL_ADD (array, i, strncat,
-                              (CPU_FEATURE_USABLE (AVX512VL)
-                               && CPU_FEATURE_USABLE (AVX512BW)),
-                              __strncat_evex)
-              IFUNC_IMPL_ADD (array, i, strncat, 1,
-                              __strncat_sse2_unaligned))
+              X86_IFUNC_IMPL_ADD_V4 (array, i, strncat,
+                                     (CPU_FEATURE_USABLE (AVX512VL)
+                                      && CPU_FEATURE_USABLE (AVX512BW)),
+                                     __strncat_evex)
+              X86_IFUNC_IMPL_ADD_V3 (array, i, strncat,
+                                     CPU_FEATURE_USABLE (AVX2),
+                                     __strncat_avx2)
+              X86_IFUNC_IMPL_ADD_V3 (array, i, strncat,
+                                     (CPU_FEATURE_USABLE (AVX2)
+                                      && CPU_FEATURE_USABLE (RTM)),
+                                     __strncat_avx2_rtm)
+              /* ISA V2 wrapper for sse2_unaligned implementation because
+                 the sse2_unaligned implementation is also used at ISA
+                 level 2.  */
+              X86_IFUNC_IMPL_ADD_V2 (array, i, strncat,
+                                     1,
+                                     __strncat_sse2_unaligned))

   /* Support sysdeps/x86_64/multiarch/strncpy.c.  */
   IFUNC_IMPL (i, name, strncpy,
-              IFUNC_IMPL_ADD (array, i, strncpy, CPU_FEATURE_USABLE (AVX2),
-                              __strncpy_avx2)
-              IFUNC_IMPL_ADD (array, i, strncpy,
-                              (CPU_FEATURE_USABLE (AVX2)
-                               && CPU_FEATURE_USABLE (RTM)),
-                              __strncpy_avx2_rtm)
-              IFUNC_IMPL_ADD (array, i, strncpy,
-                              (CPU_FEATURE_USABLE (AVX512VL)
-                               && CPU_FEATURE_USABLE (AVX512BW)),
-                              __strncpy_evex)
-              IFUNC_IMPL_ADD (array, i, strncpy, 1,
-                              __strncpy_sse2_unaligned))
+              X86_IFUNC_IMPL_ADD_V4 (array, i, strncpy,
+                                     (CPU_FEATURE_USABLE (AVX512VL)
+                                      && CPU_FEATURE_USABLE (AVX512BW)),
+                                     __strncpy_evex)
+              X86_IFUNC_IMPL_ADD_V3 (array, i, strncpy,
+                                     CPU_FEATURE_USABLE (AVX2),
+                                     __strncpy_avx2)
+              X86_IFUNC_IMPL_ADD_V3 (array, i, strncpy,
+                                     (CPU_FEATURE_USABLE (AVX2)
+                                      && CPU_FEATURE_USABLE (RTM)),
+                                     __strncpy_avx2_rtm)
+              /* ISA V2 wrapper for sse2_unaligned implementation because
+                 the sse2_unaligned implementation is also used at ISA
+                 level 2.  */
+              X86_IFUNC_IMPL_ADD_V2 (array, i, strncpy,
+                                     1,
+                                     __strncpy_sse2_unaligned))

   /* Support sysdeps/x86_64/multiarch/strpbrk.c.  */
   IFUNC_IMPL (i, name, strpbrk,
diff --git a/sysdeps/x86_64/multiarch/ifunc-strcpy.h b/sysdeps/x86_64/multiarch/ifunc-strcpy.h
index a15afa44e9..e083f71df3 100644
--- a/sysdeps/x86_64/multiarch/ifunc-strcpy.h
+++ b/sysdeps/x86_64/multiarch/ifunc-strcpy.h
@@ -20,33 +20,38 @@

 #include

-extern __typeof (REDIRECT_NAME) OPTIMIZE (sse2) attribute_hidden;
-extern __typeof (REDIRECT_NAME) OPTIMIZE (sse2_unaligned)
-  attribute_hidden;
+extern __typeof (REDIRECT_NAME) OPTIMIZE (evex) attribute_hidden;
+
 extern __typeof (REDIRECT_NAME) OPTIMIZE (avx2) attribute_hidden;
 extern __typeof (REDIRECT_NAME) OPTIMIZE (avx2_rtm) attribute_hidden;
-extern __typeof (REDIRECT_NAME) OPTIMIZE (evex) attribute_hidden;
+
+extern __typeof (REDIRECT_NAME)
+  OPTIMIZE (sse2_unaligned) attribute_hidden;
+
+extern __typeof (REDIRECT_NAME) OPTIMIZE (sse2) attribute_hidden;

 static inline void *
 IFUNC_SELECTOR (void)
 {
-  const struct cpu_features* cpu_features = __get_cpu_features ();
+  const struct cpu_features *cpu_features = __get_cpu_features ();

-  if (CPU_FEATURE_USABLE_P (cpu_features, AVX2)
-      && CPU_FEATURES_ARCH_P (cpu_features, AVX_Fast_Unaligned_Load))
+  if (X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, AVX2)
+      && X86_ISA_CPU_FEATURES_ARCH_P (cpu_features,
+                                      AVX_Fast_Unaligned_Load, ))
     {
-      if (CPU_FEATURE_USABLE_P (cpu_features, AVX512VL)
-          && CPU_FEATURE_USABLE_P (cpu_features, AVX512BW))
+      if (X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, AVX512VL)
+          && X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, AVX512BW))
         return OPTIMIZE (evex);

       if (CPU_FEATURE_USABLE_P (cpu_features, RTM))
         return OPTIMIZE (avx2_rtm);

-      if (!CPU_FEATURES_ARCH_P (cpu_features, Prefer_No_VZEROUPPER))
+      if (X86_ISA_CPU_FEATURES_ARCH_P (cpu_features,
+                                       Prefer_No_VZEROUPPER, !))
         return OPTIMIZE (avx2);
     }

-  if (CPU_FEATURES_ARCH_P (cpu_features, Fast_Unaligned_Load))
+  if (X86_ISA_CPU_FEATURES_ARCH_P (cpu_features, Fast_Unaligned_Load, ))
     return OPTIMIZE (sse2_unaligned);

   return OPTIMIZE (sse2);
diff --git a/sysdeps/x86_64/multiarch/ifunc-strncpy.h b/sysdeps/x86_64/multiarch/ifunc-strncpy.h
index 323225af4d..5c1c46b885 100644
--- a/sysdeps/x86_64/multiarch/ifunc-strncpy.h
+++ b/sysdeps/x86_64/multiarch/ifunc-strncpy.h
@@ -19,28 +19,32 @@

 #include

-extern __typeof (REDIRECT_NAME) OPTIMIZE (sse2_unaligned)
-  attribute_hidden;
+extern __typeof (REDIRECT_NAME) OPTIMIZE (evex) attribute_hidden;
+
 extern __typeof (REDIRECT_NAME) OPTIMIZE (avx2) attribute_hidden;
 extern __typeof (REDIRECT_NAME) OPTIMIZE (avx2_rtm) attribute_hidden;
-extern __typeof (REDIRECT_NAME) OPTIMIZE (evex) attribute_hidden;
+
+extern __typeof (REDIRECT_NAME)
+  OPTIMIZE (sse2_unaligned) attribute_hidden;

 static inline void *
 IFUNC_SELECTOR (void)
 {
-  const struct cpu_features* cpu_features = __get_cpu_features ();
+  const struct cpu_features *cpu_features = __get_cpu_features ();

-  if (CPU_FEATURE_USABLE_P (cpu_features, AVX2)
-      && CPU_FEATURES_ARCH_P (cpu_features, AVX_Fast_Unaligned_Load))
+  if (X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, AVX2)
+      && X86_ISA_CPU_FEATURES_ARCH_P (cpu_features,
+                                      AVX_Fast_Unaligned_Load, ))
     {
-      if (CPU_FEATURE_USABLE_P (cpu_features, AVX512VL)
-          && CPU_FEATURE_USABLE_P (cpu_features, AVX512BW))
+      if (X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, AVX512VL)
+          && X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, AVX512BW))
         return OPTIMIZE (evex);

       if (CPU_FEATURE_USABLE_P (cpu_features, RTM))
         return OPTIMIZE (avx2_rtm);

-      if (!CPU_FEATURES_ARCH_P (cpu_features, Prefer_No_VZEROUPPER))
+      if (X86_ISA_CPU_FEATURES_ARCH_P (cpu_features,
+                                       Prefer_No_VZEROUPPER, !))
         return OPTIMIZE (avx2);
     }

diff --git a/sysdeps/x86_64/multiarch/stpcpy-avx2.S b/sysdeps/x86_64/multiarch/stpcpy-avx2.S
index f0bd3029fe..277f9e73d2 100644
--- a/sysdeps/x86_64/multiarch/stpcpy-avx2.S
+++ b/sysdeps/x86_64/multiarch/stpcpy-avx2.S
@@ -1,3 +1,7 @@
+#ifndef STPCPY
+# define STPCPY __stpcpy_avx2
+#endif
+
 #define USE_AS_STPCPY
-#define STRCPY __stpcpy_avx2
+#define STRCPY STPCPY
 #include "strcpy-avx2.S"
diff --git a/sysdeps/x86_64/multiarch/stpcpy-evex.S b/sysdeps/x86_64/multiarch/stpcpy-evex.S
index 7c6f26cd98..4f1c015424 100644
--- a/sysdeps/x86_64/multiarch/stpcpy-evex.S
+++ b/sysdeps/x86_64/multiarch/stpcpy-evex.S
@@ -1,3 +1,7 @@
+#ifndef STPCPY
+# define STPCPY __stpcpy_evex
+#endif
+
 #define USE_AS_STPCPY
-#define STRCPY __stpcpy_evex
+#define STRCPY STPCPY
 #include "strcpy-evex.S"
diff --git a/sysdeps/x86_64/multiarch/stpcpy-sse2-unaligned.S b/sysdeps/x86_64/multiarch/stpcpy-sse2-unaligned.S
index 34231f8b46..4c77e5b51c 100644
--- a/sysdeps/x86_64/multiarch/stpcpy-sse2-unaligned.S
+++ b/sysdeps/x86_64/multiarch/stpcpy-sse2-unaligned.S
@@ -1,3 +1,7 @@
+#ifndef STPCPY
+# define STPCPY __stpcpy_sse2_unaligned
+#endif
+
 #define USE_AS_STPCPY
-#define STRCPY __stpcpy_sse2_unaligned
+#define STRCPY STPCPY
 #include "strcpy-sse2-unaligned.S"
diff --git a/sysdeps/x86_64/multiarch/stpcpy-sse2.S b/sysdeps/x86_64/multiarch/stpcpy-sse2.S
index ea9f973af3..fcd67cada2 100644
--- a/sysdeps/x86_64/multiarch/stpcpy-sse2.S
+++ b/sysdeps/x86_64/multiarch/stpcpy-sse2.S
@@ -1,26 +1,7 @@
-/* stpcpy optimized with SSE2.
-   Copyright (C) 2017-2022 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   .  */
-
-#if IS_IN (libc)
-# ifndef STRCPY
-#  define STRCPY __stpcpy_sse2
-# endif
+#ifndef STPCPY
+# define STPCPY __stpcpy_sse2
 #endif

 #define USE_AS_STPCPY
+#define STRCPY STPCPY
 #include "strcpy-sse2.S"
diff --git a/sysdeps/x86_64/multiarch/stpncpy-avx2.S b/sysdeps/x86_64/multiarch/stpncpy-avx2.S
index 032b0407d0..b2f8c19143 100644
--- a/sysdeps/x86_64/multiarch/stpncpy-avx2.S
+++ b/sysdeps/x86_64/multiarch/stpncpy-avx2.S
@@ -1,4 +1,8 @@
+#ifndef STPNCPY
+# define STPNCPY __stpncpy_avx2
+#endif
+
 #define USE_AS_STPCPY
 #define USE_AS_STRNCPY
-#define STRCPY __stpncpy_avx2
+#define STRCPY STPNCPY
 #include "strcpy-avx2.S"
diff --git a/sysdeps/x86_64/multiarch/stpncpy-evex.S b/sysdeps/x86_64/multiarch/stpncpy-evex.S
index 1570014d1c..99ea76a372 100644
--- a/sysdeps/x86_64/multiarch/stpncpy-evex.S
+++ b/sysdeps/x86_64/multiarch/stpncpy-evex.S
@@ -1,4 +1,8 @@
+#ifndef STPNCPY
+# define STPNCPY __stpncpy_evex
+#endif
+
 #define USE_AS_STPCPY
 #define USE_AS_STRNCPY
-#define STRCPY __stpncpy_evex
+#define STRCPY STPNCPY
 #include "strcpy-evex.S"
diff --git a/sysdeps/x86_64/multiarch/stpncpy-sse2-unaligned.S b/sysdeps/x86_64/multiarch/stpncpy-sse2-unaligned.S
index 658520f78f..c83706016f 100644
--- a/sysdeps/x86_64/multiarch/stpncpy-sse2-unaligned.S
+++ b/sysdeps/x86_64/multiarch/stpncpy-sse2-unaligned.S
@@ -1,4 +1,8 @@
+#ifndef STPNCPY
+# define STPNCPY __stpncpy_sse2_unaligned
+#endif
+
 #define USE_AS_STPCPY
 #define USE_AS_STRNCPY
-#define STRCPY __stpncpy_sse2_unaligned
+#define STRCPY STPNCPY
 #include "strcpy-sse2-unaligned.S"
diff --git a/sysdeps/x86_64/multiarch/strcat-avx2.S b/sysdeps/x86_64/multiarch/strcat-avx2.S
index 51d42f89e8..d9b7fb2a43 100644
--- a/sysdeps/x86_64/multiarch/strcat-avx2.S
+++ b/sysdeps/x86_64/multiarch/strcat-avx2.S
@@ -16,7 +16,10 @@
    License along with the GNU C Library; if not, see
    .  */

-#if IS_IN (libc)
+#include
+
+#if ISA_SHOULD_BUILD (3)
+

 # include

diff --git a/sysdeps/x86_64/multiarch/strcat-evex.S b/sysdeps/x86_64/multiarch/strcat-evex.S
index 110505cb13..0e2df947e9 100644
--- a/sysdeps/x86_64/multiarch/strcat-evex.S
+++ b/sysdeps/x86_64/multiarch/strcat-evex.S
@@ -16,7 +16,10 @@
    License along with the GNU C Library; if not, see
    .  */

-#if IS_IN (libc)
+#include
+
+#if ISA_SHOULD_BUILD (4)
+

 # include

diff --git a/sysdeps/x86_64/multiarch/strcat-sse2-unaligned.S b/sysdeps/x86_64/multiarch/strcat-sse2-unaligned.S
index a0e051d08a..9d2ca1d504 100644
--- a/sysdeps/x86_64/multiarch/strcat-sse2-unaligned.S
+++ b/sysdeps/x86_64/multiarch/strcat-sse2-unaligned.S
@@ -16,7 +16,12 @@
    License along with the GNU C Library; if not, see
    .  */

-#if IS_IN (libc)
+#include
+
+/* MINIMUM_X86_ISA_LEVEL <= 2 because there is no V2 implementation
+   so we need this to build for ISA V2 builds. */
+#if ISA_SHOULD_BUILD (2)
+

 # include

diff --git a/sysdeps/x86_64/multiarch/strcat-sse2.S b/sysdeps/x86_64/multiarch/strcat-sse2.S
index 244c4a6d74..d1d0a3366a 100644
--- a/sysdeps/x86_64/multiarch/strcat-sse2.S
+++ b/sysdeps/x86_64/multiarch/strcat-sse2.S
@@ -1,5 +1,6 @@
-/* strcat optimized with SSE2.
-   Copyright (C) 2017-2022 Free Software Foundation, Inc.
+/* strcat(dest, src) -- Append SRC on the end of DEST.
+   Optimized for x86-64.
+   Copyright (C) 2002-2022 Free Software Foundation, Inc.
    This file is part of the GNU C Library.

    The GNU C Library is free software; you can redistribute it and/or
@@ -16,13 +17,17 @@
    License along with the GNU C Library; if not, see
    .  */

-#if IS_IN (libc)
+#include
+
+#if ISA_SHOULD_BUILD (1)
+
+# include
+
 # ifndef STRCAT
-#  define STRCAT __strcat_sse2
+#  define STRCAT __strcat_sse2
 # endif
-#endif

-#include
+/* Will be removed when new strcpy implementation gets merged.  */

         .text
 ENTRY (STRCAT)
@@ -256,3 +261,4 @@ ENTRY (STRCAT)
         movq %rdi, %rax /* Source is return value.  */
         retq
 END (STRCAT)
+#endif
diff --git a/sysdeps/x86_64/multiarch/strcpy-avx2.S b/sysdeps/x86_64/multiarch/strcpy-avx2.S
index 064eafcbee..c725834929 100644
--- a/sysdeps/x86_64/multiarch/strcpy-avx2.S
+++ b/sysdeps/x86_64/multiarch/strcpy-avx2.S
@@ -16,7 +16,10 @@
    License along with the GNU C Library; if not, see
    .  */

-#if IS_IN (libc)
+#include
+
+#if ISA_SHOULD_BUILD (3)
+

 # ifndef USE_AS_STRCAT
 #  include
diff --git a/sysdeps/x86_64/multiarch/strcpy-evex.S b/sysdeps/x86_64/multiarch/strcpy-evex.S
index 32229e05d8..82e45ac675 100644
--- a/sysdeps/x86_64/multiarch/strcpy-evex.S
+++ b/sysdeps/x86_64/multiarch/strcpy-evex.S
@@ -16,7 +16,10 @@
    License along with the GNU C Library; if not, see
    .  */

-#if IS_IN (libc)
+#include
+
+#if ISA_SHOULD_BUILD (4)
+

 # ifndef USE_AS_STRCAT
 #  include
diff --git a/sysdeps/x86_64/multiarch/strcpy-sse2-unaligned.S b/sysdeps/x86_64/multiarch/strcpy-sse2-unaligned.S
index d5fbd9570a..a889cd96be 100644
--- a/sysdeps/x86_64/multiarch/strcpy-sse2-unaligned.S
+++ b/sysdeps/x86_64/multiarch/strcpy-sse2-unaligned.S
@@ -16,7 +16,12 @@
    License along with the GNU C Library; if not, see
    .  */

-#if IS_IN (libc)
+#include
+
+/* MINIMUM_X86_ISA_LEVEL <= 2 because there is no V2 implementation
+   so we need this to build for ISA V2 builds. */
+#if ISA_SHOULD_BUILD (2)
+

 # ifndef USE_AS_STRCAT
 #  include
diff --git a/sysdeps/x86_64/multiarch/strcpy-sse2.S b/sysdeps/x86_64/multiarch/strcpy-sse2.S
index 8b5db8b13d..e29b411314 100644
--- a/sysdeps/x86_64/multiarch/strcpy-sse2.S
+++ b/sysdeps/x86_64/multiarch/strcpy-sse2.S
@@ -1,5 +1,5 @@
-/* strcpy optimized with SSE2.
-   Copyright (C) 2017-2022 Free Software Foundation, Inc.
+/* strcpy/stpcpy implementation for x86-64.
+   Copyright (C) 2002-2022 Free Software Foundation, Inc.
    This file is part of the GNU C Library.

    The GNU C Library is free software; you can redistribute it and/or
@@ -16,13 +16,15 @@
    License along with the GNU C Library; if not, see
    .  */

-#if IS_IN (libc)
-# ifndef STRCPY
+#include
+
+#if ISA_SHOULD_BUILD (1)
+
+# include
+
+# ifndef STPCPY
 #  define STRCPY __strcpy_sse2
 # endif
-#endif
-
-#include

         .text
 ENTRY (STRCPY)
@@ -144,10 +146,11 @@ ENTRY (STRCPY)
         jmp 3b /* and look at next two bytes in %rax.  */

 4:
-#ifdef USE_AS_STPCPY
+# ifdef USE_AS_STPCPY
         movq %rdx, %rax /* Destination is return value.  */
-#else
+# else
         movq %rdi, %rax /* Source is return value.  */
-#endif
+# endif
         retq
 END (STRCPY)
+#endif
diff --git a/sysdeps/x86_64/multiarch/strncat-avx2.S b/sysdeps/x86_64/multiarch/strncat-avx2.S
index bfefa659bb..52ecbca943 100644
--- a/sysdeps/x86_64/multiarch/strncat-avx2.S
+++ b/sysdeps/x86_64/multiarch/strncat-avx2.S
@@ -1,3 +1,7 @@
+#ifndef STRNCAT
+# define STRNCAT __strncat_avx2
+#endif
+
 #define USE_AS_STRNCAT
-#define STRCAT __strncat_avx2
+#define STRCAT STRNCAT
 #include "strcat-avx2.S"
diff --git a/sysdeps/x86_64/multiarch/strncat-evex.S b/sysdeps/x86_64/multiarch/strncat-evex.S
index 8884f02371..203a19bf21 100644
--- a/sysdeps/x86_64/multiarch/strncat-evex.S
+++ b/sysdeps/x86_64/multiarch/strncat-evex.S
@@ -1,3 +1,7 @@
+#ifndef STRNCAT
+# define STRNCAT __strncat_evex
+#endif
+
 #define USE_AS_STRNCAT
-#define STRCAT __strncat_evex
+#define STRCAT STRNCAT
 #include "strcat-evex.S"
diff --git a/sysdeps/x86_64/multiarch/strncat-sse2-unaligned.S b/sysdeps/x86_64/multiarch/strncat-sse2-unaligned.S
index 133e1d20b0..5982e22677 100644
--- a/sysdeps/x86_64/multiarch/strncat-sse2-unaligned.S
+++ b/sysdeps/x86_64/multiarch/strncat-sse2-unaligned.S
@@ -1,3 +1,7 @@
+#ifndef STRNCAT
+# define STRNCAT __strncat_sse2_unaligned
+#endif
+
 #define USE_AS_STRNCAT
-#define STRCAT __strncat_sse2_unaligned
+#define STRCAT STRNCAT
 #include "strcat-sse2-unaligned.S"
diff --git a/sysdeps/x86_64/multiarch/strncpy-avx2.S b/sysdeps/x86_64/multiarch/strncpy-avx2.S
index 9ef8c87627..ce634e94fa 100644
--- a/sysdeps/x86_64/multiarch/strncpy-avx2.S
+++ b/sysdeps/x86_64/multiarch/strncpy-avx2.S
@@ -1,3 +1,7 @@
+#ifndef STRNCPY
+# define STRNCPY __strncpy_avx2
+#endif
+
 #define USE_AS_STRNCPY
-#define STRCPY __strncpy_avx2
+#define STRCPY STRNCPY
 #include "strcpy-avx2.S"
diff --git a/sysdeps/x86_64/multiarch/strncpy-evex.S b/sysdeps/x86_64/multiarch/strncpy-evex.S
index 40e391f0da..1b3426d511 100644
--- a/sysdeps/x86_64/multiarch/strncpy-evex.S
+++ b/sysdeps/x86_64/multiarch/strncpy-evex.S
@@ -1,3 +1,7 @@
+#ifndef STRNCPY
+# define STRNCPY __strncpy_evex
+#endif
+
 #define USE_AS_STRNCPY
-#define STRCPY __strncpy_evex
+#define STRCPY STRNCPY
 #include "strcpy-evex.S"
diff --git a/sysdeps/x86_64/multiarch/strncpy-sse2-unaligned.S b/sysdeps/x86_64/multiarch/strncpy-sse2-unaligned.S
index fcc23a754a..e3eb15d93a 100644
--- a/sysdeps/x86_64/multiarch/strncpy-sse2-unaligned.S
+++ b/sysdeps/x86_64/multiarch/strncpy-sse2-unaligned.S
@@ -1,3 +1,7 @@
+#ifndef STRNCPY
+# define STRNCPY __strncpy_sse2_unaligned
+#endif
+
 #define USE_AS_STRNCPY
-#define STRCPY __strncpy_sse2_unaligned
+#define STRCPY STRNCPY
 #include "strcpy-sse2-unaligned.S"
diff --git a/sysdeps/x86_64/stpcpy.S b/sysdeps/x86_64/stpcpy.S
index b097c203dd..c7d8d959a7 100644
--- a/sysdeps/x86_64/stpcpy.S
+++ b/sysdeps/x86_64/stpcpy.S
@@ -1,6 +1,28 @@
-#define STRCPY __stpcpy
+/* stpcpy dispatch for RTLD and non-multiarch build
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.

-#include "multiarch/stpcpy-sse2.S"
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#define STPCPY __stpcpy
+
+#define DEFAULT_IMPL_V1 "multiarch/stpcpy-sse2-unaligned.S"
+#define DEFAULT_IMPL_V3 "multiarch/stpcpy-avx2.S"
+#define DEFAULT_IMPL_V4 "multiarch/stpcpy-evex.S"
+
+#include "isa-default-impl.h"

 weak_alias (__stpcpy, stpcpy)
 libc_hidden_def (__stpcpy)
diff --git a/sysdeps/x86_64/stpncpy.S b/sysdeps/x86_64/stpncpy.S
new file mode 100644
index 0000000000..5c2d6efa3d
--- /dev/null
+++ b/sysdeps/x86_64/stpncpy.S
@@ -0,0 +1,28 @@
+/* stpncpy dispatch for RTLD and non-multiarch build
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#define STPNCPY __stpncpy
+
+#define DEFAULT_IMPL_V1 "multiarch/stpncpy-sse2-unaligned.S"
+#define DEFAULT_IMPL_V3 "multiarch/stpncpy-avx2.S"
+#define DEFAULT_IMPL_V4 "multiarch/stpncpy-evex.S"
+
+#include "isa-default-impl.h"
+
+weak_alias (__stpncpy, stpncpy)
+libc_hidden_def (__stpncpy)
diff --git a/sysdeps/x86_64/strcat.S b/sysdeps/x86_64/strcat.S
index fc3e8a9bcf..9bca7ecedc 100644
--- a/sysdeps/x86_64/strcat.S
+++ b/sysdeps/x86_64/strcat.S
@@ -1,6 +1,5 @@
-/* strcat(dest, src) -- Append SRC on the end of DEST.
-   Optimized for x86-64.
-   Copyright (C) 2002-2022 Free Software Foundation, Inc.
+/* strcat dispatch for RTLD and non-multiarch build
+   Copyright (C) 2022 Free Software Foundation, Inc.
    This file is part of the GNU C Library.

    The GNU C Library is free software; you can redistribute it and/or
@@ -17,6 +16,12 @@
    License along with the GNU C Library; if not, see
    .  */

-#define STRCAT strcat
-#include "multiarch/strcat-sse2.S"
+#define STRCAT strcat
+
+#define DEFAULT_IMPL_V1 "multiarch/strcat-sse2-unaligned.S"
+#define DEFAULT_IMPL_V3 "multiarch/strcat-avx2.S"
+#define DEFAULT_IMPL_V4 "multiarch/strcat-evex.S"
+
+#include "isa-default-impl.h"
+
 libc_hidden_builtin_def (strcat)
diff --git a/sysdeps/x86_64/strcpy.S b/sysdeps/x86_64/strcpy.S
index 05f19e6e94..7c04cc2102 100644
--- a/sysdeps/x86_64/strcpy.S
+++ b/sysdeps/x86_64/strcpy.S
@@ -1,5 +1,5 @@
-/* strcpy/stpcpy implementation for x86-64.
-   Copyright (C) 2002-2022 Free Software Foundation, Inc.
+/* strcpy dispatch for RTLD and non-multiarch build
+   Copyright (C) 2022 Free Software Foundation, Inc.
    This file is part of the GNU C Library.

    The GNU C Library is free software; you can redistribute it and/or
@@ -17,5 +17,11 @@
    .  */

 #define STRCPY strcpy
-#include "multiarch/strcpy-sse2.S"
+
+#define DEFAULT_IMPL_V1 "multiarch/strcpy-sse2-unaligned.S"
+#define DEFAULT_IMPL_V3 "multiarch/strcpy-avx2.S"
+#define DEFAULT_IMPL_V4 "multiarch/strcpy-evex.S"
+
+#include "isa-default-impl.h"
+
 libc_hidden_builtin_def (strcpy)
diff --git a/sysdeps/x86_64/strncat.S b/sysdeps/x86_64/strncat.S
new file mode 100644
index 0000000000..3ba2603a28
--- /dev/null
+++ b/sysdeps/x86_64/strncat.S
@@ -0,0 +1,28 @@
+/* strncat dispatch for RTLD and non-multiarch build
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#define STRNCAT strncat
+
+#define DEFAULT_IMPL_V1 "multiarch/strncat-sse2-unaligned.S"
+#define DEFAULT_IMPL_V3 "multiarch/strncat-avx2.S"
+#define DEFAULT_IMPL_V4 "multiarch/strncat-evex.S"
+
+#include "isa-default-impl.h"
+
+strong_alias (strncat, __strncat)
+libc_hidden_def (__strncat)
diff --git a/sysdeps/x86_64/strncpy.S b/sysdeps/x86_64/strncpy.S
new file mode 100644
index 0000000000..04c904e60d
--- /dev/null
+++ b/sysdeps/x86_64/strncpy.S
@@ -0,0 +1,27 @@
+/* strncpy dispatch for RTLD and non-multiarch build
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#define STRNCPY strncpy
+
+#define DEFAULT_IMPL_V1 "multiarch/strncpy-sse2-unaligned.S"
+#define DEFAULT_IMPL_V3 "multiarch/strncpy-avx2.S"
+#define DEFAULT_IMPL_V4 "multiarch/strncpy-evex.S"
+
+#include "isa-default-impl.h"
+
+libc_hidden_builtin_def (strncpy)
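
The non-multiarch files in this patch define DEFAULT_IMPL_V1,
DEFAULT_IMPL_V3 and DEFAULT_IMPL_V4 (but no V2) and then include
isa-default-impl.h, whose contents are not part of this patch. One
plausible reading is that the header picks the highest-level default
defined at or below the compiled ISA level. The sketch below models that
rule in plain C; demo_pick_default_impl, DEMO_MAX_ISA_LEVEL and the table
layout are hypothetical, and the fallback rule itself is an assumption,
not a description of the real header.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_ISA_LEVEL 4

/* defaults[k] names the default implementation file for ISA level
   k + 1, or is NULL when no default is defined for that level.
   Returns the entry for the highest defined level that does not exceed
   COMPILED_LEVEL, or NULL if none is defined (assumed fallback rule).  */
static const char *
demo_pick_default_impl (const char *const defaults[DEMO_MAX_ISA_LEVEL],
                        int compiled_level)
{
  assert (compiled_level >= 1 && compiled_level <= DEMO_MAX_ISA_LEVEL);
  for (int level = compiled_level; level >= 1; level--)
    if (defaults[level - 1] != NULL)
      return defaults[level - 1];
  return NULL;
}

/* Table mirroring the strcpy.S defaults in the patch: V1, V3 and V4 are
   set, V2 is absent.  */
static const char *const demo_strcpy_defaults[DEMO_MAX_ISA_LEVEL] = {
  "multiarch/strcpy-sse2-unaligned.S",
  NULL,
  "multiarch/strcpy-avx2.S",
  "multiarch/strcpy-evex.S",
};
```

Under this model an x86-64-v2 build falls back to the V1 default
(sse2-unaligned), which is consistent with the patch comment that the
sse2_unaligned implementation is also used at ISA level 2.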