From patchwork Tue Aug 8 06:15:42 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: dengjianbo X-Patchwork-Id: 73792 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 7A1D33856DD4 for ; Tue, 8 Aug 2023 06:16:13 +0000 (GMT) X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail.loongson.cn (mail.loongson.cn [114.242.206.163]) by sourceware.org (Postfix) with ESMTP id 0350B3858D33 for ; Tue, 8 Aug 2023 06:15:49 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 0350B3858D33 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=loongson.cn Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=loongson.cn Received: from loongson.cn (unknown [10.2.5.5]) by gateway (Coremail) with SMTP id _____8Dxl+iR3dFkL5wSAA--.5391S3; Tue, 08 Aug 2023 14:15:46 +0800 (CST) Received: from 5.5.5 (unknown [10.2.5.5]) by localhost.localdomain (Coremail) with SMTP id AQAAf8DxfSOR3dFkIoBOAA--.19620S2; Tue, 08 Aug 2023 14:15:45 +0800 (CST) From: dengjianbo To: libc-alpha@sourceware.org Cc: adhemerval.zanella@linaro.org, xry111@xry111.site, caiyinyu@loongson.cn, xuchenghua@loongson.cn, huangpei@loongson.cn, dengjianbo Subject: [PATCH v4 1/3] LoongArch: Redefine macro LEAF/ENTRY. 
Date: Tue, 8 Aug 2023 14:15:42 +0800 Message-Id: <20230808061544.1268981-1-dengjianbo@loongson.cn> X-Mailer: git-send-email 2.31.1 MIME-Version: 1.0 X-CM-TRANSID: AQAAf8DxfSOR3dFkIoBOAA--.19620S2 X-CM-SenderInfo: pghqwyxldqu0o6or00hjvr0hdfq/ X-Spam-Status: No, score=-12.0 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org Sender: "Libc-alpha" Both of the following uses of the LEAF/ENTRY macros are valid: 1. 
LEAF(fcn) -- the align value of fcn is .align 3(default value) 2. LEAF(fcn, 6) -- the align value of fcn is .align 6 --- sysdeps/loongarch/sys/asm.h | 36 ++++++++++++++++++++++++++---------- 1 file changed, 26 insertions(+), 10 deletions(-) diff --git a/sysdeps/loongarch/sys/asm.h b/sysdeps/loongarch/sys/asm.h index d1a279b8fb..c5eb8afa09 100644 --- a/sysdeps/loongarch/sys/asm.h +++ b/sysdeps/loongarch/sys/asm.h @@ -39,16 +39,32 @@ #define FREG_L fld.d #define FREG_S fst.d -/* Declare leaf routine. */ -#define LEAF(symbol) \ - .text; \ - .globl symbol; \ - .align 3; \ - cfi_startproc; \ - .type symbol, @function; \ - symbol: - -#define ENTRY(symbol) LEAF (symbol) +/* Declare leaf routine. + The usage of macro LEAF/ENTRY is as follows: + 1. LEAF(fcn) -- the align value of fcn is .align 3 (default value) + 2. LEAF(fcn, 6) -- the align value of fcn is .align 6 +*/ +#define LEAF_IMPL(symbol, aln, ...) \ + .text; \ + .globl symbol; \ + .align aln; \ + .type symbol, @function; \ +symbol: \ + cfi_startproc; + + +#define LEAF(...) LEAF_IMPL(__VA_ARGS__, 3) +#define ENTRY(...) LEAF(__VA_ARGS__) + +#define LEAF_NO_ALIGN(symbol) \ + .text; \ + .globl symbol; \ + .type symbol, @function; \ +symbol: \ + cfi_startproc; + +#define ENTRY_NO_ALIGN(symbol) LEAF_NO_ALIGN(symbol) + /* Mark end of function. 
*/ #undef END From patchwork Tue Aug 8 06:15:43 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: dengjianbo X-Patchwork-Id: 73791 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id D2FCA3856DC8 for ; Tue, 8 Aug 2023 06:16:12 +0000 (GMT) X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail.loongson.cn (mail.loongson.cn [114.242.206.163]) by sourceware.org (Postfix) with ESMTP id 0385C3858C41 for ; Tue, 8 Aug 2023 06:15:55 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 0385C3858C41 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=loongson.cn Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=loongson.cn Received: from loongson.cn (unknown [10.2.5.5]) by gateway (Coremail) with SMTP id _____8BxbOqU3dFkNpwSAA--.14071S3; Tue, 08 Aug 2023 14:15:48 +0800 (CST) Received: from 5.5.5 (unknown [10.2.5.5]) by localhost.localdomain (Coremail) with SMTP id AQAAf8DxfSOR3dFkIoBOAA--.19620S3; Tue, 08 Aug 2023 14:15:46 +0800 (CST) From: dengjianbo To: libc-alpha@sourceware.org Cc: adhemerval.zanella@linaro.org, xry111@xry111.site, caiyinyu@loongson.cn, xuchenghua@loongson.cn, huangpei@loongson.cn, dengjianbo Subject: [PATCH v4 2/3] LoongArch: Add minimum binutils required version Date: Tue, 8 Aug 2023 14:15:43 +0800 Message-Id: <20230808061544.1268981-2-dengjianbo@loongson.cn> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230808061544.1268981-1-dengjianbo@loongson.cn> References: <20230808061544.1268981-1-dengjianbo@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: AQAAf8DxfSOR3dFkIoBOAA--.19620S3 X-CM-SenderInfo: pghqwyxldqu0o6or00hjvr0hdfq/ 
So that LoongArch glibc can add code using LASX/LSX vector instructions, raise the required minimum binutils version to 2.41, which supports these instructions. HAVE_LOONGARCH_VEC_ASM is removed accordingly. 
--- NEWS | 3 ++- config.h.in | 5 ----- sysdeps/loongarch/configure | 5 ++--- sysdeps/loongarch/configure.ac | 4 ++-- sysdeps/loongarch/dl-machine.h | 4 ++-- sysdeps/loongarch/dl-trampoline.S | 2 +- 6 files changed, 9 insertions(+), 14 deletions(-) diff --git a/NEWS b/NEWS index 22875d5fa4..d8deaff4ce 100644 --- a/NEWS +++ b/NEWS @@ -17,7 +17,8 @@ Deprecated and removed features, and other changes affecting compatibility: Changes to build and runtime requirements: - [Add changes to build and runtime requirements here] +* Building on LoongArch requires at a minimum binutils 2.41 for vector + instructions. Security related changes: diff --git a/config.h.in b/config.h.in index 0dedc124f7..44a34072a4 100644 --- a/config.h.in +++ b/config.h.in @@ -141,11 +141,6 @@ /* LOONGARCH floating-point ABI for ld.so. */ #undef LOONGARCH_ABI_FRLEN -/* Assembler support LoongArch LASX/LSX vector instructions. - This macro becomes obsolete when glibc increased the minimum - required version of GNU 'binutils' to 2.41 or later. */ -#define HAVE_LOONGARCH_VEC_ASM 0 - /* Linux specific: minimum supported kernel version. */ #undef __LINUX_KERNEL_VERSION diff --git a/sysdeps/loongarch/configure b/sysdeps/loongarch/configure index 5843c7cf90..395ddc92ca 100644 --- a/sysdeps/loongarch/configure +++ b/sysdeps/loongarch/configure @@ -128,8 +128,7 @@ rm -f conftest* fi { printf "%s\n" "$as_me:${as_lineno-$LINENO}: result: $libc_cv_loongarch_vec_asm" >&5 printf "%s\n" "$libc_cv_loongarch_vec_asm" >&6; } -if test $libc_cv_loongarch_vec_asm = yes; then - printf "%s\n" "#define HAVE_LOONGARCH_VEC_ASM 1" >>confdefs.h - +if test $libc_cv_loongarch_vec_asm = no; then + as_fn_error $? 
"binutils version is too old, use 2.41 or newer version" "$LINENO" 5 fi diff --git a/sysdeps/loongarch/configure.ac b/sysdeps/loongarch/configure.ac index ba89d8346d..989287c6d2 100644 --- a/sysdeps/loongarch/configure.ac +++ b/sysdeps/loongarch/configure.ac @@ -74,6 +74,6 @@ else libc_cv_loongarch_vec_asm=no fi rm -f conftest*]) -if test $libc_cv_loongarch_vec_asm = yes; then - AC_DEFINE(HAVE_LOONGARCH_VEC_ASM) +if test $libc_cv_loongarch_vec_asm = no; then + AC_MSG_ERROR([binutils version is too old, use 2.41 or newer version]) fi diff --git a/sysdeps/loongarch/dl-machine.h b/sysdeps/loongarch/dl-machine.h index 51ce9af84b..066bb233ac 100644 --- a/sysdeps/loongarch/dl-machine.h +++ b/sysdeps/loongarch/dl-machine.h @@ -270,7 +270,7 @@ elf_machine_runtime_setup (struct link_map *l, struct r_scope_elem *scope[], /* If using PLTs, fill in the first two entries of .got.plt. */ if (l->l_info[DT_JMPREL]) { -#if HAVE_LOONGARCH_VEC_ASM && !defined __loongarch_soft_float +#if !defined __loongarch_soft_float extern void _dl_runtime_resolve_lasx (void) attribute_hidden; extern void _dl_runtime_resolve_lsx (void) attribute_hidden; #endif @@ -300,7 +300,7 @@ elf_machine_runtime_setup (struct link_map *l, struct r_scope_elem *scope[], /* This function will get called to fix up the GOT entry indicated by the offset on the stack, and then jump to the resolved address. 
*/ -#if HAVE_LOONGARCH_VEC_ASM && !defined __loongarch_soft_float +#if !defined __loongarch_soft_float if (SUPPORT_LASX) gotplt[0] = (ElfW(Addr)) &_dl_runtime_resolve_lasx; else if (SUPPORT_LSX) diff --git a/sysdeps/loongarch/dl-trampoline.S b/sysdeps/loongarch/dl-trampoline.S index f6ba5e443c..8fd9146978 100644 --- a/sysdeps/loongarch/dl-trampoline.S +++ b/sysdeps/loongarch/dl-trampoline.S @@ -19,7 +19,7 @@ #include #include -#if HAVE_LOONGARCH_VEC_ASM && !defined __loongarch_soft_float +#if !defined __loongarch_soft_float #define USE_LASX #define _dl_runtime_resolve _dl_runtime_resolve_lasx #include "dl-trampoline.h" From patchwork Tue Aug 8 06:15:44 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: dengjianbo X-Patchwork-Id: 73793 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 484B83857350 for ; Tue, 8 Aug 2023 06:16:36 +0000 (GMT) X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail.loongson.cn (mail.loongson.cn [114.242.206.163]) by sourceware.org (Postfix) with ESMTP id D7856385840B for ; Tue, 8 Aug 2023 06:15:57 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org D7856385840B Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=loongson.cn Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=loongson.cn Received: from loongson.cn (unknown [10.2.5.5]) by gateway (Coremail) with SMTP id _____8Cxh+iY3dFkPJwSAA--.5513S3; Tue, 08 Aug 2023 14:15:52 +0800 (CST) Received: from 5.5.5 (unknown [10.2.5.5]) by localhost.localdomain (Coremail) with SMTP id AQAAf8DxfSOR3dFkIoBOAA--.19620S4; Tue, 08 Aug 2023 14:15:51 +0800 (CST) From: dengjianbo To: libc-alpha@sourceware.org Cc: adhemerval.zanella@linaro.org, xry111@xry111.site, caiyinyu@loongson.cn, 
xuchenghua@loongson.cn, huangpei@loongson.cn, dengjianbo Subject: [PATCH v4 3/3] LoongArch: Add ifunc support and add different versions of strlen Date: Tue, 8 Aug 2023 14:15:44 +0800 Message-Id: <20230808061544.1268981-3-dengjianbo@loongson.cn> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230808061544.1268981-1-dengjianbo@loongson.cn> References: <20230808061544.1268981-1-dengjianbo@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: AQAAf8DxfSOR3dFkIoBOAA--.19620S4 X-CM-SenderInfo: pghqwyxldqu0o6or00hjvr0hdfq/ X-Spam-Status: No, score=-10.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, KAM_SHORT, KAM_STOCKGEN, SCC_5_SHORT_WORD_LINES, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 
strlen-lasx is implemented with LASX SIMD instructions (256-bit). strlen-lsx is implemented with LSX SIMD instructions (128-bit). strlen-aligned is implemented with basic LoongArch instructions and never performs unaligned memory accesses. --- sysdeps/loongarch/lp64/multiarch/Makefile | 7 ++ .../lp64/multiarch/ifunc-impl-list.c | 41 +++++++ .../loongarch/lp64/multiarch/ifunc-strlen.h | 40 +++++++ .../loongarch/lp64/multiarch/strlen-aligned.S | 100 ++++++++++++++++++ .../loongarch/lp64/multiarch/strlen-lasx.S | 63 +++++++++++ sysdeps/loongarch/lp64/multiarch/strlen-lsx.S | 71 +++++++++++++ sysdeps/loongarch/lp64/multiarch/strlen.c | 37 +++++++ sysdeps/loongarch/sys/regdef.h | 57 ++++++++++ .../unix/sysv/linux/loongarch/cpu-features.h | 2 + 9 files changed, 418 insertions(+) create mode 100644 sysdeps/loongarch/lp64/multiarch/Makefile create mode 100644 sysdeps/loongarch/lp64/multiarch/ifunc-impl-list.c create mode 100644 sysdeps/loongarch/lp64/multiarch/ifunc-strlen.h create mode 100644 sysdeps/loongarch/lp64/multiarch/strlen-aligned.S create mode 100644 sysdeps/loongarch/lp64/multiarch/strlen-lasx.S create mode 100644 sysdeps/loongarch/lp64/multiarch/strlen-lsx.S create mode 100644 sysdeps/loongarch/lp64/multiarch/strlen.c diff --git a/sysdeps/loongarch/lp64/multiarch/Makefile b/sysdeps/loongarch/lp64/multiarch/Makefile new file mode 100644 index 0000000000..76c506c966 --- /dev/null +++ b/sysdeps/loongarch/lp64/multiarch/Makefile @@ -0,0 +1,7 @@ +ifeq ($(subdir),string) +sysdep_routines += \ + strlen-aligned \ + strlen-lsx \ + strlen-lasx \ +# sysdep_routines +endif diff --git a/sysdeps/loongarch/lp64/multiarch/ifunc-impl-list.c b/sysdeps/loongarch/lp64/multiarch/ifunc-impl-list.c new file mode 100644 index 0000000000..1a2a576fcd --- /dev/null +++ 
b/sysdeps/loongarch/lp64/multiarch/ifunc-impl-list.c @@ -0,0 +1,41 @@ +/* Enumerate available IFUNC implementations of a function LoongArch64 version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include +#include +#include +#include +#include + +size_t +__libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array, + size_t max) +{ + + size_t i = max; + + IFUNC_IMPL (i, name, strlen, +#if !defined __loongarch_soft_float + IFUNC_IMPL_ADD (array, i, strlen, SUPPORT_LASX, __strlen_lasx) + IFUNC_IMPL_ADD (array, i, strlen, SUPPORT_LSX, __strlen_lsx) +#endif + IFUNC_IMPL_ADD (array, i, strlen, 1, __strlen_aligned) + ) + return i; +} diff --git a/sysdeps/loongarch/lp64/multiarch/ifunc-strlen.h b/sysdeps/loongarch/lp64/multiarch/ifunc-strlen.h new file mode 100644 index 0000000000..6258bb76c3 --- /dev/null +++ b/sysdeps/loongarch/lp64/multiarch/ifunc-strlen.h @@ -0,0 +1,40 @@ +/* Common definition for strlen ifunc selections. + All versions must be listed in ifunc-impl-list.c. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. 
+ + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + +#if !defined __loongarch_soft_float +extern __typeof (REDIRECT_NAME) OPTIMIZE (lasx) attribute_hidden; +extern __typeof (REDIRECT_NAME) OPTIMIZE (lsx) attribute_hidden; +#endif +extern __typeof (REDIRECT_NAME) OPTIMIZE (aligned) attribute_hidden; + +static inline void * +IFUNC_SELECTOR (void) +{ +#if !defined __loongarch_soft_float + if (SUPPORT_LASX) + return OPTIMIZE (lasx); + else if (SUPPORT_LSX) + return OPTIMIZE (lsx); + else +#endif + return OPTIMIZE (aligned); +} diff --git a/sysdeps/loongarch/lp64/multiarch/strlen-aligned.S b/sysdeps/loongarch/lp64/multiarch/strlen-aligned.S new file mode 100644 index 0000000000..e9e1d2fc04 --- /dev/null +++ b/sysdeps/loongarch/lp64/multiarch/strlen-aligned.S @@ -0,0 +1,100 @@ +/* Optimized strlen implementation using basic Loongarch instructions. + Copyright (C) 2023 Free Software Foundation, Inc. + + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. 
+ + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library. If not, see + . */ + +#include +#include +#include + +#if IS_IN (libc) +# define STRLEN __strlen_aligned +#else +# define STRLEN strlen +#endif + +LEAF(STRLEN, 6) + move a1, a0 + bstrins.d a0, zero, 2, 0 + lu12i.w a2, 0x01010 + li.w t0, -1 + + ld.d t2, a0, 0 + andi t1, a1, 0x7 + ori a2, a2, 0x101 + slli.d t1, t1, 3 + + bstrins.d a2, a2, 63, 32 + sll.d t1, t0, t1 + slli.d t3, a2, 7 + nor a3, zero, t3 + + orn t2, t2, t1 + sub.d t0, t2, a2 + nor t1, t2, a3 + and t0, t0, t1 + + + bnez t0, L(count_pos) + addi.d a0, a0, 8 +L(loop_16_7bit): + ld.d t2, a0, 0 + sub.d t1, t2, a2 + + and t0, t1, t3 + bnez t0, L(more_check) + ld.d t2, a0, 8 + sub.d t1, t2, a2 + + and t0, t1, t3 + addi.d a0, a0, 16 + beqz t0, L(loop_16_7bit) + addi.d a0, a0, -8 + +L(more_check): + nor t0, t2, a3 + and t0, t1, t0 + bnez t0, L(count_pos) + addi.d a0, a0, 8 + + +L(loop_16_8bit): + ld.d t2, a0, 0 + sub.d t1, t2, a2 + nor t0, t2, a3 + and t0, t0, t1 + + bnez t0, L(count_pos) + ld.d t2, a0, 8 + addi.d a0, a0, 16 + sub.d t1, t2, a2 + + nor t0, t2, a3 + and t0, t0, t1 + beqz t0, L(loop_16_8bit) + addi.d a0, a0, -8 + +L(count_pos): + ctz.d t1, t0 + sub.d a0, a0, a1 + srli.d t1, t1, 3 + add.d a0, a0, t1 + + jr ra +END(STRLEN) + +libc_hidden_builtin_def (STRLEN) diff --git a/sysdeps/loongarch/lp64/multiarch/strlen-lasx.S b/sysdeps/loongarch/lp64/multiarch/strlen-lasx.S new file mode 100644 index 0000000000..258c47cea0 --- /dev/null +++ b/sysdeps/loongarch/lp64/multiarch/strlen-lasx.S @@ -0,0 +1,63 @@ +/* Optimized strlen implementation using loongarch LASX SIMD instructions. + Copyright (C) 2023 Free Software Foundation, Inc. 
+ + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library. If not, see + . */ + +#include +#include +#include + +#if IS_IN (libc) && !defined __loongarch_soft_float + +# define STRLEN __strlen_lasx + +LEAF(STRLEN, 6) + move a1, a0 + bstrins.d a0, zero, 4, 0 + li.d t1, -1 + xvld xr0, a0, 0 + + xvmsknz.b xr0, xr0 + xvpickve.w xr1, xr0, 4 + vilvl.h vr0, vr1, vr0 + movfr2gr.s t0, fa0 # sign extend + + sra.w t0, t0, a1 + beq t0, t1, L(loop) + cto.w a0, t0 + jr ra + +L(loop): + xvld xr0, a0, 32 + addi.d a0, a0, 32 + xvsetanyeqz.b fcc0, xr0 + bceqz fcc0, L(loop) + + + xvmsknz.b xr0, xr0 + sub.d a0, a0, a1 + xvpickve.w xr1, xr0, 4 + vilvl.h vr0, vr1, vr0 + + movfr2gr.s t0, fa0 + cto.w t0, t0 + add.d a0, a0, t0 + jr ra +END(STRLEN) + +libc_hidden_builtin_def (STRLEN) +#endif diff --git a/sysdeps/loongarch/lp64/multiarch/strlen-lsx.S b/sysdeps/loongarch/lp64/multiarch/strlen-lsx.S new file mode 100644 index 0000000000..b194355e7b --- /dev/null +++ b/sysdeps/loongarch/lp64/multiarch/strlen-lsx.S @@ -0,0 +1,71 @@ +/* Optimized strlen implementation using Loongarch LSX SIMD instructions. + Copyright (C) 2023 Free Software Foundation, Inc. + + This file is part of the GNU C Library. 
+ + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library. If not, see + . */ + +#include +#include +#include + +#if IS_IN (libc) && !defined __loongarch_soft_float + +# define STRLEN __strlen_lsx + +LEAF(STRLEN, 6) + move a1, a0 + bstrins.d a0, zero, 4, 0 + vld vr0, a0, 0 + vld vr1, a0, 16 + + li.d t1, -1 + vmsknz.b vr0, vr0 + vmsknz.b vr1, vr1 + vilvl.h vr0, vr1, vr0 + + movfr2gr.s t0, fa0 + sra.w t0, t0, a1 + beq t0, t1, L(loop) + cto.w a0, t0 + + jr ra + nop + nop + nop + + +L(loop): + vld vr0, a0, 32 + vld vr1, a0, 48 + addi.d a0, a0, 32 + vmin.bu vr2, vr0, vr1 + + vsetanyeqz.b fcc0, vr2 + bceqz fcc0, L(loop) + vmsknz.b vr0, vr0 + vmsknz.b vr1, vr1 + + vilvl.h vr0, vr1, vr0 + sub.d a0, a0, a1 + movfr2gr.s t0, fa0 + cto.w t0, t0 + + add.d a0, a0, t0 + jr ra +END(STRLEN) + +libc_hidden_builtin_def (STRLEN) +#endif diff --git a/sysdeps/loongarch/lp64/multiarch/strlen.c b/sysdeps/loongarch/lp64/multiarch/strlen.c new file mode 100644 index 0000000000..381c2daa86 --- /dev/null +++ b/sysdeps/loongarch/lp64/multiarch/strlen.c @@ -0,0 +1,37 @@ +/* Multiple versions of strlen. + All versions must be listed in ifunc-impl-list.c. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. 
+ + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +/* Define multiple versions only for the definition in libc. */ + +#if IS_IN (libc) +# define strlen __redirect_strlen +# include +# undef strlen + +# define SYMBOL_NAME strlen +# include "ifunc-strlen.h" + +libc_ifunc_redirected (__redirect_strlen, strlen, IFUNC_SELECTOR ()); + +# ifdef SHARED +__hidden_ver1 (strlen, __GI_strlen, __redirect_strlen) + __attribute__ ((visibility ("hidden"))) __attribute_copy__ (strlen); +# endif + +#endif diff --git a/sysdeps/loongarch/sys/regdef.h b/sysdeps/loongarch/sys/regdef.h index 5100f36d24..524d2e3277 100644 --- a/sysdeps/loongarch/sys/regdef.h +++ b/sysdeps/loongarch/sys/regdef.h @@ -89,6 +89,14 @@ #define fs5 $f29 #define fs6 $f30 #define fs7 $f31 +#define fcc0 $fcc0 +#define fcc1 $fcc1 +#define fcc2 $fcc2 +#define fcc3 $fcc3 +#define fcc4 $fcc4 +#define fcc5 $fcc5 +#define fcc6 $fcc6 +#define fcc7 $fcc7 #define vr0 $vr0 #define vr1 $vr1 @@ -98,6 +106,30 @@ #define vr5 $vr5 #define vr6 $vr6 #define vr7 $vr7 +#define vr8 $vr8 +#define vr9 $vr9 +#define vr10 $vr10 +#define vr11 $vr11 +#define vr12 $vr12 +#define vr13 $vr13 +#define vr14 $vr14 +#define vr15 $vr15 +#define vr16 $vr16 +#define vr17 $vr17 +#define vr18 $vr18 +#define vr19 $vr19 +#define vr20 $vr20 +#define vr21 $vr21 +#define vr22 $vr22 +#define vr23 $vr23 +#define vr24 $vr24 +#define vr25 $vr25 +#define vr26 $vr26 +#define 
vr27 $vr27 +#define vr28 $vr28 +#define vr29 $vr29 +#define vr30 $vr30 +#define vr31 $vr31 #define xr0 $xr0 #define xr1 $xr1 @@ -107,5 +139,30 @@ #define xr5 $xr5 #define xr6 $xr6 #define xr7 $xr7 +#define xr7 $xr7 +#define xr8 $xr8 +#define xr9 $xr9 +#define xr10 $xr10 +#define xr11 $xr11 +#define xr12 $xr12 +#define xr13 $xr13 +#define xr14 $xr14 +#define xr15 $xr15 +#define xr16 $xr16 +#define xr17 $xr17 +#define xr18 $xr18 +#define xr19 $xr19 +#define xr20 $xr20 +#define xr21 $xr21 +#define xr22 $xr22 +#define xr23 $xr23 +#define xr24 $xr24 +#define xr25 $xr25 +#define xr26 $xr26 +#define xr27 $xr27 +#define xr28 $xr28 +#define xr29 $xr29 +#define xr30 $xr30 +#define xr31 $xr31 #endif /* _SYS_REGDEF_H */ diff --git a/sysdeps/unix/sysv/linux/loongarch/cpu-features.h b/sysdeps/unix/sysv/linux/loongarch/cpu-features.h index e371e13b15..d1a280a5ee 100644 --- a/sysdeps/unix/sysv/linux/loongarch/cpu-features.h +++ b/sysdeps/unix/sysv/linux/loongarch/cpu-features.h @@ -25,5 +25,7 @@ #define SUPPORT_LSX (GLRO (dl_hwcap) & HWCAP_LOONGARCH_LSX) #define SUPPORT_LASX (GLRO (dl_hwcap) & HWCAP_LOONGARCH_LASX) +#define INIT_ARCH() + #endif /* _CPU_FEATURES_LOONGARCH64_H */
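Note on patch 1/3 (LEAF/ENTRY). The optional alignment argument works by appending the default after the caller's arguments: an explicit alignment lands in the aln slot and pushes the default into the ignored variadic tail. The following is a minimal C sketch of that preprocessor idiom, not the glibc macro itself. All DEMO_* names are hypothetical, the macro expands to a string rather than assembler directives so the result can be inspected, and it appends an extra dummy argument (the patch passes only 3) so the variadic tail is never empty, which keeps the sketch valid in strict C99.

```c
#include <assert.h>
#include <string.h>

#define DEMO_STR_(x) #x
#define DEMO_STR(x) DEMO_STR_ (x)

/* Hypothetical stand-in for LEAF_IMPL: expands to a string instead of
   assembler directives so the selected alignment can be inspected.  */
#define DEMO_LEAF_IMPL(symbol, aln, ...) \
  DEMO_STR (symbol) " .align " DEMO_STR (aln)

/* The default (3) is appended after the caller's arguments.  The extra 0
   is a dummy that keeps the variadic tail non-empty.  */
#define DEMO_LEAF(...) DEMO_LEAF_IMPL (__VA_ARGS__, 3, 0)

static const char *demo_default_align (void) { return DEMO_LEAF (fcn); }
static const char *demo_explicit_align (void) { return DEMO_LEAF (fcn, 6); }

/* Self-check covering both call forms described in the commit message.  */
void demo_leaf_check (void)
{
  assert (strcmp (demo_default_align (), "fcn .align 3") == 0);
  assert (strcmp (demo_explicit_align (), "fcn .align 6") == 0);
}
```

With one argument the default fills the aln parameter; with two, the caller's value does and the default is discarded.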
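Note on strlen-aligned.S in patch 3/3. Reading the assembly, the routine appears to use the word-at-a-time zero-byte test (w - 0x01...01) & ~w & 0x80...80, after forcing the bytes that precede the string in the first aligned word to 0xff (the orn step). The C sketch below models that idea under stated assumptions: it is not the glibc code, all demo_* names are hypothetical, words are assembled byte by byte in little-endian order so the result does not depend on the host, and the buffer is assumed to be padded to a multiple of 8 bytes so no load runs past it. It also omits the 7-bit fast loop the assembly uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_ONES  UINT64_C (0x0101010101010101)
#define DEMO_HIGHS UINT64_C (0x8080808080808080)

/* Assemble 8 bytes into a word with byte 0 in the low bits.  */
static uint64_t demo_load_le64 (const unsigned char *p)
{
  uint64_t w = 0;
  for (int i = 0; i < 8; i++)
    w |= (uint64_t) p[i] << (8 * i);
  return w;
}

/* Index (0..7) of the first zero byte in W, or -1 if there is none.  */
static int demo_first_zero_byte (uint64_t w)
{
  uint64_t m = (w - DEMO_ONES) & ~w & DEMO_HIGHS;
  if (m == 0)
    return -1;
  int bit = 0;
  while (((m >> bit) & 1) == 0)   /* portable count-trailing-zeros */
    bit++;
  return bit / 8;
}

/* Length of the string starting at BASE + OFF.  BASE is treated as the
   8-byte-aligned origin and must be padded to a multiple of 8 bytes.  */
static size_t demo_strlen (const unsigned char *base, size_t off)
{
  size_t word = off & ~(size_t) 7;
  unsigned shift = (unsigned) (off & 7) * 8;
  uint64_t w = demo_load_le64 (base + word);

  /* Bytes before the string start must not look like a terminator.  */
  w |= ~(~UINT64_C (0) << shift);

  for (;;)
    {
      int idx = demo_first_zero_byte (w);
      if (idx >= 0)
        return word + (size_t) idx - off;
      word += 8;
      w = demo_load_le64 (base + word);
    }
}

/* Example: a 12-character string placed at offset 3 after zero bytes.  */
size_t demo_strlen_offset3 (void)
{
  unsigned char buf[32] = { 0 };
  memcpy (buf + 3, "hello, world", 12);
  return demo_strlen (buf, 3);
}
```

The mask can flag bytes above the first zero byte because of borrow propagation, but its lowest set bit always marks the first zero byte, which is the only one a strlen needs.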
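Note on ifunc-strlen.h in patch 3/3. The selector prefers LASX, then LSX, and falls back to the aligned version; soft-float builds always get the aligned version. The sketch below restates that ordering in plain C. The DEMO_HWCAP_* bit values are hypothetical placeholders (glibc tests GLRO (dl_hwcap) against HWCAP_LOONGARCH_LSX and HWCAP_LOONGARCH_LASX), the function returns a name instead of a function pointer, and the compile-time __loongarch_soft_float check is modelled as a runtime flag.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical capability bits, not the real HWCAP_LOONGARCH_* values.  */
#define DEMO_HWCAP_LSX  (1UL << 0)
#define DEMO_HWCAP_LASX (1UL << 1)

/* Return the name of the strlen variant the selector would pick.  */
static const char *demo_select_strlen (unsigned long hwcap, int soft_float)
{
  if (!soft_float)
    {
      if (hwcap & DEMO_HWCAP_LASX)
        return "__strlen_lasx";      /* widest vectors first */
      if (hwcap & DEMO_HWCAP_LSX)
        return "__strlen_lsx";
    }
  return "__strlen_aligned";         /* always available */
}

/* Self-check of the priority order.  */
void demo_select_check (void)
{
  unsigned long both = DEMO_HWCAP_LSX | DEMO_HWCAP_LASX;
  assert (strcmp (demo_select_strlen (both, 0), "__strlen_lasx") == 0);
  assert (strcmp (demo_select_strlen (DEMO_HWCAP_LSX, 0), "__strlen_lsx") == 0);
  assert (strcmp (demo_select_strlen (0, 0), "__strlen_aligned") == 0);
  assert (strcmp (demo_select_strlen (both, 1), "__strlen_aligned") == 0);
}
```

The same order appears in ifunc-impl-list.c, where the unconditional entry (condition 1) is listed last.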