From patchwork Fri Sep 2 08:39:07 2022
X-Patchwork-Submitter: dengjianbo
X-Patchwork-Id: 55111
From: dengjianbo
To: adhemerval.zanella@linaro.org, libc-alpha@sourceware.org, i.swmail@xen0n.name
Subject: [PATCH 0/1] LoongArch: Add optimized functions.
Date: Fri, 2 Sep 2022 16:39:07 +0800
Message-Id: <20220902083908.2560918-1-dengjianbo@loongson.cn>
Cc: dengjianbo, xuchenghua@loongson.cn, joseph_myers@mentor.com, caiyinyu@loongson.cn

Tested on LoongArch machine 3A5000: gcc 12.1.0,
Linux kernel 5.19.0-rc5, binutils 2.38.50. All test cases pass except the
ifunc-related tests; no new failures were introduced.

configure:
    ../configure --prefix=/usr CFLAGS="-O2"

Benchmarks were run with make bench and compared against the improved
generic version. Test results can be found at the following link:
https://github.com/jiadengx/glibc/tree/main/make_bench

strchr
About 30% faster than the improved generic version; see strchr_xls.png
and graph-strchr for details.
https://github.com/jiadengx/glibc/blob/main/make_bench/graph-strchr
https://github.com/jiadengx/glibc/blob/main/make_bench/strchr_xls.png

strchrnul
About 30% faster than the improved generic version; see
strchrnul_xls.png and graph-strchrnul for details.
https://github.com/jiadengx/glibc/blob/main/make_bench/graph-strchrnul
https://github.com/jiadengx/glibc/blob/main/make_bench/strchrnul_xls.png

strcmp
About 10% - 60% faster than the improved generic version; see
strcmp_xls.png and graph-strcmp for details.
https://github.com/jiadengx/glibc/blob/main/make_bench/graph-strcmp
https://github.com/jiadengx/glibc/blob/main/make_bench/strcmp_xls.png

strncmp
About 0% - 80% faster than the generic version; see strncmp_xls.png and
graph-strncmp for details.
https://github.com/jiadengx/glibc/blob/main/make_bench/graph-strncmp
https://github.com/jiadengx/glibc/blob/main/make_bench/strncmp_xls.png

memmove
About 5% - 60% faster than the improved generic version; see
memmove_xls.png and graph-memmove for details.
https://github.com/jiadengx/glibc/blob/main/make_bench/graph-memmove
https://github.com/jiadengx/glibc/blob/main/make_bench/memmove_xls.png

dengjianbo (1):
  LoongArch: Add optimized functions.

 sysdeps/loongarch/lp64/memmove.S   | 491 +++++++++++++++++++++++++++++
 sysdeps/loongarch/lp64/strchr.S    | 145 +++++++++
 sysdeps/loongarch/lp64/strchrnul.S | 160 ++++++++++
 sysdeps/loongarch/lp64/strcmp.S    | 210 ++++++++++++
 sysdeps/loongarch/lp64/strncmp.S   | 281 +++++++++++++++++
 5 files changed, 1287 insertions(+)
 create mode 100644 sysdeps/loongarch/lp64/memmove.S
 create mode 100644 sysdeps/loongarch/lp64/strchr.S
 create mode 100644 sysdeps/loongarch/lp64/strchrnul.S
 create mode 100644 sysdeps/loongarch/lp64/strcmp.S
 create mode 100644 sysdeps/loongarch/lp64/strncmp.S