From: dengjianbo
To: libc-alpha@sourceware.org
Cc: adhemerval.zanella@linaro.org, xry111@xry111.site, caiyinyu@loongson.cn, xuchenghua@loongson.cn, huangpei@loongson.cn, dengjianbo
Subject: [PATCH 0/2] Add ifunc support and different versions of strlen
Date: Tue, 1 Aug 2023 15:09:00 +0800
Message-Id: <20230801070902.1385953-1-dengjianbo@loongson.cn>

Tested on a LoongArch 3A5000 machine with gcc 13.0.1, Linux kernel
6.1.0-rc7+, and binutils 2.40.50. No new FAIL items were introduced.

make check:
  4646 PASS
  20 UNSUPPORTED
  12 XFAIL
  6 XPASS

The make bench results are available at:
https://github.com/jiadengx/glibc_test/blob/main/strlen/bench-strlen.out

Compared with the current generic version, strlen_lasx (256-bit SIMD)
shows a 20%-600% performance improvement, strlen_lsx (128-bit SIMD)
shows a 20%-280% performance improvement, and strlen_aligned shows a
5%-60% performance improvement when the length exceeds 32 bytes.
Detailed results are shown as histograms at:
https://github.com/jiadengx/glibc_test/tree/main/strlen

dengjianbo (2):
  LoongArch: Redefine macro LEAF/ENTRY.
  Loongarch: Add ifunc support and add different versions of strlen

 sysdeps/loongarch/lp64/multiarch/Makefile     |   3 +
 .../lp64/multiarch/ifunc-impl-list.c          |  39 +++++++
 .../loongarch/lp64/multiarch/ifunc-strlen.h   |  36 +++++++
 .../loongarch/lp64/multiarch/strlen-aligned.S | 101 ++++++++++++++++++
 .../loongarch/lp64/multiarch/strlen-lasx.S    |  65 +++++++++++
 sysdeps/loongarch/lp64/multiarch/strlen-lsx.S |  73 +++++++++++++
 sysdeps/loongarch/lp64/multiarch/strlen.c     |  37 +++++++
 sysdeps/loongarch/sys/asm.h                   |  36 +++++--
 sysdeps/loongarch/sys/regdef.h                |  57 ++++++++++
 .../unix/sysv/linux/loongarch/cpu-features.h  |   2 +
 10 files changed, 439 insertions(+), 10 deletions(-)
 create mode 100644 sysdeps/loongarch/lp64/multiarch/Makefile
 create mode 100644 sysdeps/loongarch/lp64/multiarch/ifunc-impl-list.c
 create mode 100644 sysdeps/loongarch/lp64/multiarch/ifunc-strlen.h
 create mode 100644 sysdeps/loongarch/lp64/multiarch/strlen-aligned.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/strlen-lasx.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/strlen-lsx.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/strlen.c
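
For readers unfamiliar with ifunc-style dispatch, the selection among the
three strlen variants listed above might look roughly like the sketch
below. This is not the code from the series. The feature bits
(HYP_FEATURE_LSX, HYP_FEATURE_LASX), the hyp_* function names, and the
preference order (LASX, then LSX, then the aligned version) are all
assumptions made for illustration, and the three variants are portable
stand-ins rather than the assembly implementations.

```c
#include <stddef.h>

/* Hypothetical feature bits.  The real series takes CPU features from
   glibc's cpu-features.h, which is not reproduced here.  */
enum
{
  HYP_FEATURE_LSX = 1u << 0,   /* 128-bit SIMD available (assumed flag).  */
  HYP_FEATURE_LASX = 1u << 1   /* 256-bit SIMD available (assumed flag).  */
};

typedef size_t (*hyp_strlen_fn) (const char *);

/* Portable byte-counting loop shared by the stand-in variants.  */
static size_t
hyp_count_bytes (const char *s)
{
  size_t n = 0;
  while (s[n] != '\0')
    n++;
  return n;
}

/* Stand-ins for the aligned, LSX, and LASX variants.  They all compute
   the same result; only the selection logic is of interest here.  */
size_t hyp_strlen_aligned (const char *s) { return hyp_count_bytes (s); }
size_t hyp_strlen_lsx (const char *s) { return hyp_count_bytes (s); }
size_t hyp_strlen_lasx (const char *s) { return hyp_count_bytes (s); }

/* Pick the widest SIMD variant the feature mask allows, falling back to
   the aligned version.  The ordering is an assumption of this sketch.  */
hyp_strlen_fn
hyp_select_strlen (unsigned int features)
{
  if (features & HYP_FEATURE_LASX)
    return hyp_strlen_lasx;
  if (features & HYP_FEATURE_LSX)
    return hyp_strlen_lsx;
  return hyp_strlen_aligned;
}
```

A caller would run hyp_select_strlen once with the detected feature mask
and then call through the returned pointer. With a real ifunc the dynamic
linker performs that step at symbol resolution time, so callers of strlen
need no changes.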