From patchwork Fri Sep 16 07:16:41 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: dengjianbo
X-Patchwork-Id: 55123
From: dengjianbo
To: adhemerval.zanella@linaro.org, libc-alpha@sourceware.org, i.swmail@xen0n.name
Subject: [PATCH 0/1] LoongArch: Add optimized functions.
Date: Fri, 16 Sep 2022 15:16:41 +0800
Message-Id: <20220916071642.2822131-1-dengjianbo@loongson.cn>
X-Mailer: git-send-email 2.31.1
Cc: caiyinyu@loongson.cn, xuchenghua@loongson.cn, huangpei@loongson.cn, joseph_myers@mentor.com, dengjianbo

Tested on LoongArch machine 3A5000: gcc 12.1.0, Linux kernel 5.19.0
rc5, binutils 2.38.50. All test cases pass except the ifunc-related
tests; no new failures were introduced.

configure: ../configure --prefix=/usr CFLAGS="-O2"

make bench: compared against the improved generic string version posted at
https://sourceware.org/pipermail/libc-alpha/2022-September/141833.html
Test results can be found at the following link:
https://github.com/jiadengx/glibc/tree/main/make_bench_v2

strchr
About 30% faster than the improved generic version. Details are in
strchr_xls.png and graph-strchr.
https://github.com/jiadengx/glibc/blob/main/make_bench_v2/graph-strchr
https://github.com/jiadengx/glibc/blob/main/make_bench_v2/strchr_xls.png

strchrnul
About 30% faster than the improved generic version. Details are in
strchrnul_xls.png and graph-strchrnul.
https://github.com/jiadengx/glibc/blob/main/make_bench_v2/graph-strchrnul
https://github.com/jiadengx/glibc/blob/main/make_bench_v2/strchrnul_xls.png

strcmp
About 10% - 60% faster than the improved generic version. Details are in
strcmp_xls.png and graph-strcmp.
https://github.com/jiadengx/glibc/blob/main/make_bench_v2/graph-strcmp
https://github.com/jiadengx/glibc/blob/main/make_bench_v2/strcmp_xls.png

strncmp
About 0% - 80% faster than the generic version. Details are in
strncmp_xls.png and graph-strncmp.
https://github.com/jiadengx/glibc/blob/main/make_bench_v2/graph-strncmp
https://github.com/jiadengx/glibc/blob/main/make_bench_v2/strncmp_xls.png

memmove
About 5% - 60% faster than the improved generic version. Details are in
memmove_xls.png and graph-memmove.
https://github.com/jiadengx/glibc/blob/main/make_bench_v2/graph-memmove
https://github.com/jiadengx/glibc/blob/main/make_bench_v2/memmove_xls.png

dengjianbo (1):
  LoongArch: Add optimized functions

 sysdeps/loongarch/lp64/memmove.S   | 485 +++++++++++++++++++++++++++++
 sysdeps/loongarch/lp64/strchr.S    | 124 ++++++++
 sysdeps/loongarch/lp64/strchrnul.S | 133 ++++++++
 sysdeps/loongarch/lp64/strcmp.S    | 207 ++++++++++++
 sysdeps/loongarch/lp64/strncmp.S   | 278 +++++++++++++++++
 5 files changed, 1227 insertions(+)
 create mode 100644 sysdeps/loongarch/lp64/memmove.S
 create mode 100644 sysdeps/loongarch/lp64/strchr.S
 create mode 100644 sysdeps/loongarch/lp64/strchrnul.S
 create mode 100644 sysdeps/loongarch/lp64/strcmp.S
 create mode 100644 sysdeps/loongarch/lp64/strncmp.S
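
For reviewers who want a quick semantic spot check alongside the glibc
test suite, the following is a minimal sketch and not part of this patch.
It exercises the documented behaviour of the five functions touched here
(strchr, strchrnul, strcmp, strncmp, memmove) through the standard C
interfaces. The function name count_string_semantic_failures and the test
strings are hypothetical, chosen only for illustration. It assumes a
glibc target, since strchrnul is a GNU extension.

```c
#define _GNU_SOURCE /* strchrnul is a GNU extension declared in <string.h> */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the number of failed checks (0 = all pass). */
int count_string_semantic_failures(void)
{
    int failures = 0;
    const char *s = "loongarch"; /* 9 characters, 'a' is at index 5 */

    /* strchr: first match, NULL when absent, and the terminator is findable. */
    failures += (strchr(s, 'a') != s + 5);
    failures += (strchr(s, 'o') != s + 1);
    failures += (strchr(s, 'z') != NULL);
    failures += (strchr(s, '\0') != s + 9);

    /* strchrnul: like strchr, but returns the terminator instead of NULL. */
    failures += (strchrnul(s, 'g') != s + 4);
    failures += (strchrnul(s, 'z') != s + 9);

    /* strcmp: sign of the result; bytes compare as unsigned char. */
    failures += (strcmp("abc", "abc") != 0);
    failures += !(strcmp("abc", "abd") < 0);
    failures += !(strcmp("abc", "ab") > 0);
    failures += !(strcmp("\x80", "\x01") > 0);

    /* strncmp: only the first n bytes count; n == 0 compares equal. */
    failures += (strncmp("abcdef", "abcxyz", 3) != 0);
    failures += !(strncmp("abcdef", "abcxyz", 4) < 0);
    failures += (strncmp("a", "b", 0) != 0);

    /* memmove: overlapping copies in both directions, returns dest. */
    char fwd[] = "0123456789";
    failures += (memmove(fwd + 2, fwd, 5) != fwd + 2);
    failures += (strcmp(fwd, "0101234789") != 0);

    char bwd[] = "0123456789";
    failures += (memmove(bwd, bwd + 3, 5) != bwd);
    failures += (strcmp(bwd, "3456756789") != 0);

    return failures;
}
```

To run it, call count_string_semantic_failures() from main and use the
result as the exit status. Building with gcc's -fno-builtin is advisable,
because the compiler may otherwise fold calls on constant strings at
compile time, so they would never reach the libc implementation.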