From patchwork Fri Apr 21 07:54:01 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hau Hsu X-Patchwork-Id: 68107 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 7AFC2385773F for ; Fri, 21 Apr 2023 07:54:47 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 7AFC2385773F DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1682063687; bh=M+8RsA44kDpp09yFbLkF/XoJuNTBva6C3apOpqubi+I=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=ROjf+RizFMPpKPLu07EquzgkOvx5bma0gMXxZKw2JSYKoL/7IDFNtFc2Wf2OmOAUj TpNChfHXGovq2HKjUVEsHlkGflNRqtV5eo+GwC3wd/ifaRA5xpppnQDx/3JUdC015T TwJ7TF42hVXsCPSVOBkhbUIlAFmBtCMeldm8pjXU= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-pf1-x436.google.com (mail-pf1-x436.google.com [IPv6:2607:f8b0:4864:20::436]) by sourceware.org (Postfix) with ESMTPS id AE4763858D37 for ; Fri, 21 Apr 2023 07:54:24 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org AE4763858D37 Received: by mail-pf1-x436.google.com with SMTP id d2e1a72fcca58-63b5c4c769aso2647503b3a.3 for ; Fri, 21 Apr 2023 00:54:24 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1682063663; x=1684655663; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=M+8RsA44kDpp09yFbLkF/XoJuNTBva6C3apOpqubi+I=; b=UfdjkqohOELkqGnDgOO2PZ3vVSAE0UR0eeHsKSRCpZF4kGK9V3AtTllAggaBtZOvX5 UA+YpN7VZy+MmjECaMbXFmCGLnJn1OJF+GpWHut0Sz5rswhCgsOym0zs9lrWQh3QL7G7 ZWKtvfjL/zuDvmvZsdIjJLXrmvoxjeUNzoNscmuhNminDnJxwFReduCUreGS0DiBwD4F 
FZADjK5GBPJoOtcIzsHRVdDyehEUUMI9mzZGdCiPWcJCP7UAiMmWM8E1tuCjHI+wmxgH 5wrLHDG8fSHV3rJSd6nnBOf/JGopgxKbXakgSEuZOxOSFP5PCj1kISsKLBwN6p5/V4o9 xQNQ== X-Gm-Message-State: AAQBX9frv34sC1xQDcg+QWR8bzRmaIJ9TxdPscuiLhPgaijZmGLni3V6 eCvErQaBubYC5umJWpkjp9ekEZZDJvGTlc23rPE8t2UKjMhGtfBsqtcQvGrNTaGNvGAN40bcC2S xbjBTHd87HipV55tjxLpKmTomZRuCJEHnFtlamDmODouh7yjFOm5A25KroxLjhF6bF8qWPjiqC6 CECX07 X-Google-Smtp-Source: AKy350YnLCdWsHenZ1z6uWOR9eaynwch5igaa5Iw8kiN53IS21UK2ecp3G5z7M8xIJyJzgYymQaQ7w== X-Received: by 2002:a05:6a00:1393:b0:63f:15cc:9c1a with SMTP id t19-20020a056a00139300b0063f15cc9c1amr2883144pfg.1.1682063663210; Fri, 21 Apr 2023 00:54:23 -0700 (PDT) Received: from localhost.localdomain (1-169-217-217.dynamic-ip.hinet.net. [1.169.217.217]) by smtp.gmail.com with ESMTPSA id fa23-20020a056a002d1700b006259e883ee9sm650992pfb.189.2023.04.21.00.54.21 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Fri, 21 Apr 2023 00:54:22 -0700 (PDT) To: libc-alpha@sourceware.org, hongrong.hsu@sifive.com, jerry.shih@sifive.com, nick.knight@sifive.com, kito.cheng@sifive.com Cc: greentime.hu@sifive.com, alice.chan@sifive.com, andrew@sifive.com, vincent.chen@sifive.com, hau.hsu@sifive.com Subject: [PATCH v2 1/5] riscv: Enabling vectorized mem*/str* functions in build time Date: Fri, 21 Apr 2023 15:54:01 +0800 Message-Id: <20230421075405.14892-2-hau.hsu@sifive.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230421075405.14892-1-hau.hsu@sifive.com> References: <20230421075405.14892-1-hau.hsu@sifive.com> MIME-Version: 1.0 X-Spam-Status: No, score=-11.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Hau Hsu via Libc-alpha From: Hau Hsu Reply-To: Hau Hsu Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org Sender: "Libc-alpha" From: Vincent Chen Let the build select the vectorized mem*/str* functions when it detects that the compiler supports the RISC-V V extension and enables it in this build. We agree that these vectorized mem*/str* functions should be selected by IFUNC. Therefore, this patch is intended as a **temporary solution** to enable reviewers to evaluate the effectiveness of these vectorized mem*/str* functions. --- scripts/build-many-glibcs.py | 10 ++++++++++ sysdeps/riscv/preconfigure | 19 +++++++++++++++++++ sysdeps/riscv/preconfigure.ac | 18 ++++++++++++++++++ sysdeps/riscv/rv32/rvv/Implies | 2 ++ sysdeps/riscv/rv64/rvv/Implies | 2 ++ 5 files changed, 51 insertions(+) create mode 100644 sysdeps/riscv/rv32/rvv/Implies create mode 100644 sysdeps/riscv/rv64/rvv/Implies diff --git a/scripts/build-many-glibcs.py b/scripts/build-many-glibcs.py index 82f8d97281..2fbb91a028 100755 --- a/scripts/build-many-glibcs.py +++ b/scripts/build-many-glibcs.py @@ -381,6 +381,11 @@ class Context(object): variant='rv32imafdc-ilp32d', gcc_cfg=['--with-arch=rv32imafdc', '--with-abi=ilp32d', '--disable-multilib']) + self.add_config(arch='riscv32', + os_name='linux-gnu', + variant='rv32imafdcv-ilp32d', + gcc_cfg=['--with-arch=rv32imafdcv', '--with-abi=ilp32d', + '--disable-multilib']) self.add_config(arch='riscv64', os_name='linux-gnu', variant='rv64imac-lp64', @@ -396,6 +401,11 @@ class Context(object): variant='rv64imafdc-lp64d', gcc_cfg=['--with-arch=rv64imafdc', '--with-abi=lp64d', '--disable-multilib']) + self.add_config(arch='riscv64', + os_name='linux-gnu', + variant='rv64imafdcv-lp64d', + gcc_cfg=['--with-arch=rv64imafdcv', '--with-abi=lp64d', + '--disable-multilib']) self.add_config(arch='s390x', os_name='linux-gnu', glibcs=[{}, diff --git
a/sysdeps/riscv/preconfigure b/sysdeps/riscv/preconfigure index 4dedf4b0bb..5ddc195b46 100644 --- a/sysdeps/riscv/preconfigure +++ b/sysdeps/riscv/preconfigure @@ -7,6 +7,7 @@ riscv*) flen=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | sed -n 's/^#define __riscv_flen \(.*\)/\1/p'` float_abi=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | sed -n 's/^#define __riscv_float_abi_\([^ ]*\) .*/\1/p'` atomic=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | grep '#define __riscv_atomic' | cut -d' ' -f2` + vector=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | grep '#define __riscv_vector' | cut -d' ' -f2` case "$xlen" in 64 | 32) @@ -32,6 +33,24 @@ riscv*) ;; esac + case "$vector" in + __riscv_vector) + case "$flen" in + 64) + float_machine=rvv + ;; + *) + # V 1.0 spec requires both F and D extensions, but this may be an older version. Degrade to scalar only. + ;; + esac + ;; + *) + ;; + esac + + { $as_echo "$as_me:${as_lineno-$LINENO}: vector $vector flen $flen float_machine $float_machine" >&5 +$as_echo "$as_me: vector $vector flen $flen float_machine $float_machine" >&6;} + case "$float_abi" in soft) abi_flen=0 diff --git a/sysdeps/riscv/preconfigure.ac b/sysdeps/riscv/preconfigure.ac index a5c30e0dbf..b6b8bb46e4 100644 --- a/sysdeps/riscv/preconfigure.ac +++ b/sysdeps/riscv/preconfigure.ac @@ -7,6 +7,7 @@ riscv*) flen=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | sed -n 's/^#define __riscv_flen \(.*\)/\1/p'` float_abi=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | sed -n 's/^#define __riscv_float_abi_\([^ ]*\) .*/\1/p'` atomic=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | grep '#define __riscv_atomic' | cut -d' ' -f2` + vector=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | grep '#define __riscv_vector' | cut -d' ' -f2` case "$xlen" in 64 | 32) @@ -32,6 +33,23 @@ riscv*) ;; esac + case "$vector" in + __riscv_vector) + case "$flen" in + 64) + float_machine=rvv + ;; + *) + # V 1.0 spec requires both F and D extensions, but this may be an older version. Degrade to scalar only. 
+ ;; + esac + ;; + *) + ;; + esac + + AC_MSG_NOTICE([vector $vector flen $flen float_machine $float_machine]) + case "$float_abi" in soft) abi_flen=0 diff --git a/sysdeps/riscv/rv32/rvv/Implies b/sysdeps/riscv/rv32/rvv/Implies new file mode 100644 index 0000000000..25ce1df222 --- /dev/null +++ b/sysdeps/riscv/rv32/rvv/Implies @@ -0,0 +1,2 @@ +riscv/rv32/rvd +riscv/rvv diff --git a/sysdeps/riscv/rv64/rvv/Implies b/sysdeps/riscv/rv64/rvv/Implies new file mode 100644 index 0000000000..9993bb30e3 --- /dev/null +++ b/sysdeps/riscv/rv64/rvv/Implies @@ -0,0 +1,2 @@ +riscv/rv64/rvd +riscv/rvv From patchwork Fri Apr 21 07:54:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hau Hsu X-Patchwork-Id: 68108 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id B77173857010 for ; Fri, 21 Apr 2023 07:54:54 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org B77173857010 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1682063694; bh=lhy2BiSZGNA2EQECGdCZMLWXTZvOGUaVN5KTofGgPfA=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=l4dJSn7pQb89rYD7d4gqVqeUJThyFe5Trw6zJL3ZQuD7seivxPQH7zSkQXN+O3uf5 FKwrzrlrGx6DN8lagVZ3CVojKw7pRvp3O9dhnqyIwSlA1AoSM084wm+ev/6VFAg3Sq Wxlv2BcnSkIsc9TglWV7hwADjPR9fP4/027rDM6Y= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-pf1-x430.google.com (mail-pf1-x430.google.com [IPv6:2607:f8b0:4864:20::430]) by sourceware.org (Postfix) with ESMTPS id DAA63385840A for ; Fri, 21 Apr 2023 07:54:26 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org DAA63385840A Received: by mail-pf1-x430.google.com with SMTP id d2e1a72fcca58-63d4595d60fso12662532b3a.0 for ; 
Fri, 21 Apr 2023 00:54:26 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1682063665; x=1684655665; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=lhy2BiSZGNA2EQECGdCZMLWXTZvOGUaVN5KTofGgPfA=; b=YLyvA607VPEdQxc0ZGyST+DhGnu+zd5rcIH8w3bYtsRB8bKzD3VFdL+ujxjYqfBsFy nqPrh6ryAEDef+qCJuruCucF8dBs/21jmMJmFqmKokbMmEObIGBI+Dzrp12bBvVJpwJg 3KE7+TNnsmM0D+tMAlirm+E6iZJTcr8XXowCb3xsJjHHCCyuBnhm3antAWylyq2f0aNB +AWpZkjHvdkPNIs+AMXxmtI8rvBdkwl3v8oW+UlKSLDapqEVU63/vdzIcM2XBBrnsAxZ jsNx9KpRrjbHd/qcVTjvOivd9LlCkxjzPVNkH4N5YF0qtg9EF7M7Yc4GCefULXhfqBR7 Ay8g== X-Gm-Message-State: AAQBX9ciRaNSKSLg8SiFaj6Ggd5F31yUxXSFoDir3v8jzWN7x0W8O42q MW4RsfJZXLrkcoWZcS4OM6OQG7Kd4/Cw+IP0AQkpx9J0Q7Yb9NrdyA1HRFJ55A04Kop4VOgXaDE 8CL5UcWGm9ki4q6u85mn/ym9d9PusKt7FhprhjXxXX1MXOg2Dq2eFcrLK7LU0RDPzH4RgnddUFF hjjnfM X-Google-Smtp-Source: AKy350ZZXOXczNQ3/Li3arpxmwGMst8mUn1meyRVNjKsN30pfAEDyI195yJ7YsecyFcDQ4qV9KMzWw== X-Received: by 2002:aa7:8d55:0:b0:63e:b018:d7cb with SMTP id s21-20020aa78d55000000b0063eb018d7cbmr4884587pfe.6.1682063665361; Fri, 21 Apr 2023 00:54:25 -0700 (PDT) Received: from localhost.localdomain (1-169-217-217.dynamic-ip.hinet.net. 
[1.169.217.217]) by smtp.gmail.com with ESMTPSA id fa23-20020a056a002d1700b006259e883ee9sm650992pfb.189.2023.04.21.00.54.23 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Fri, 21 Apr 2023 00:54:25 -0700 (PDT) To: libc-alpha@sourceware.org, hongrong.hsu@sifive.com, jerry.shih@sifive.com, nick.knight@sifive.com, kito.cheng@sifive.com Cc: greentime.hu@sifive.com, alice.chan@sifive.com, andrew@sifive.com, vincent.chen@sifive.com, hau.hsu@sifive.com Subject: [PATCH v2 2/5] riscv: vectorized mem* functions Date: Fri, 21 Apr 2023 15:54:02 +0800 Message-Id: <20230421075405.14892-3-hau.hsu@sifive.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230421075405.14892-1-hau.hsu@sifive.com> References: <20230421075405.14892-1-hau.hsu@sifive.com> MIME-Version: 1.0 X-Spam-Status: No, score=-11.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Hau Hsu via Libc-alpha From: Hau Hsu Reply-To: Hau Hsu Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org Sender: "Libc-alpha" From: Jerry Shih This patch proposes implementations of memchr, memcmp, memcpy, memmove, and memset that leverage the RISC-V V extension (RVV), version 1.0. These routines assume VLEN is at least 32 bits, as is required by all currently defined vector extensions, and they support arbitrarily large VLEN. All implementations work for both RV32 and RV64 platforms, and make no assumptions about page size.
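For reviewers less familiar with RVV, the strip-mined loop shared by the memcpy, memmove, and memset routines in this patch can be sketched in scalar C. This is only an illustrative model and not part of the patch: `model_vsetvli`, `model_rvv_memcpy`, and the `vlmax` parameter are hypothetical names, and the model assumes `vsetvli` returns min(AVL, VLMAX), which is one of the results the V specification permits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for `vsetvli vl, avl, e8, m8, ta, ma`.
   Assumption: the hardware picks vl = min(avl, vlmax); real
   implementations may choose other legal values.  */
size_t model_vsetvli(size_t avl, size_t vlmax)
{
  return avl < vlmax ? avl : vlmax;
}

/* Scalar model of the strip-mined copy loop: each iteration asks for
   up to `n` bytes, handles `vl` of them, and advances the pointers.
   `vlmax` models the number of bytes in one vector register group.  */
void *model_rvv_memcpy(void *dst, const void *src, size_t n, size_t vlmax)
{
  unsigned char *d = dst;
  const unsigned char *s = src;

  assert(vlmax > 0);
  do
    {
      size_t vl = model_vsetvli(n, vlmax);
      for (size_t i = 0; i < vl; i++)  /* models vle8.v followed by vse8.v */
        d[i] = s[i];
      n -= vl;
      s += vl;
      d += vl;
    }
  while (n != 0);                      /* models `bnez iNum, L(loop)` */

  return dst;
}
```

As in the assembly, a call with n == 0 runs a single iteration with vl == 0 and copies nothing, so no separate zero-length check is needed.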
--- sysdeps/riscv/rvv/memchr.S | 63 +++++++++++++++++++++++++++++++ sysdeps/riscv/rvv/memcmp.S | 75 +++++++++++++++++++++++++++++++++++++ sysdeps/riscv/rvv/memcpy.S | 51 +++++++++++++++++++++++++ sysdeps/riscv/rvv/memmove.S | 72 +++++++++++++++++++++++++++++++++++ sysdeps/riscv/rvv/memset.S | 51 +++++++++++++++++++++++++ 5 files changed, 312 insertions(+) create mode 100644 sysdeps/riscv/rvv/memchr.S create mode 100644 sysdeps/riscv/rvv/memcmp.S create mode 100644 sysdeps/riscv/rvv/memcpy.S create mode 100644 sysdeps/riscv/rvv/memmove.S create mode 100644 sysdeps/riscv/rvv/memset.S diff --git a/sysdeps/riscv/rvv/memchr.S b/sysdeps/riscv/rvv/memchr.S new file mode 100644 index 0000000000..6981a9f8b0 --- /dev/null +++ b/sysdeps/riscv/rvv/memchr.S @@ -0,0 +1,63 @@ +/* RVV versions memchr. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Jerry Shih . + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + +#define iResult a0 + +#define pSrc a0 +#define iValue a1 +#define iNum a2 + +#define iVL a3 +#define iTemp a4 + +#define ELEM_LMUL_SETTING m8 +#define vData v0 +#define vMask v8 + +ENTRY(memchr) + +L(loop): + vsetvli zero, iNum, e8, ELEM_LMUL_SETTING, ta, ma + + vle8ff.v vData, (pSrc) + /* Find the iValue inside the loaded data. 
*/ + vmseq.vx vMask, vData, iValue + vfirst.m iTemp, vMask + + /* Skip the loop if we find the matched value. */ + bgez iTemp, L(found) + + csrr iVL, vl + sub iNum, iNum, iVL + add pSrc, pSrc, iVL + + bnez iNum, L(loop) + + li iResult, 0 + ret + +L(found): + add iResult, pSrc, iTemp + ret + +END(memchr) +libc_hidden_builtin_def (memchr) diff --git a/sysdeps/riscv/rvv/memcmp.S b/sysdeps/riscv/rvv/memcmp.S new file mode 100644 index 0000000000..b156ec524c --- /dev/null +++ b/sysdeps/riscv/rvv/memcmp.S @@ -0,0 +1,75 @@ +/* RVV versions memcmp. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Jerry Shih . + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + +#define iResult a0 + +#define pSrc1 a0 +#define pSrc2 a1 +#define iNum a2 + +#define iVL a3 +#define iTemp a4 +#define iTemp1 a5 +#define iTemp2 a6 + +#define ELEM_LMUL_SETTING m8 +#define vData1 v0 +#define vData2 v8 +#define vMask v16 + +ENTRY(memcmp) + +L(loop): + vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma + + vle8.v vData1, (pSrc1) + vle8.v vData2, (pSrc2) + + vmsne.vv vMask, vData1, vData2 + sub iNum, iNum, iVL + vfirst.m iTemp, vMask + + /* Skip the loop if we find the different value between pSrc1 and pSrc2. 
*/ + bgez iTemp, L(found) + + add pSrc1, pSrc1, iVL + add pSrc2, pSrc2, iVL + + bnez iNum, L(loop) + + li iResult, 0 + ret + +L(found): + add pSrc1, pSrc1, iTemp + add pSrc2, pSrc2, iTemp + lbu iTemp1, 0(pSrc1) + lbu iTemp2, 0(pSrc2) + sub iResult, iTemp1, iTemp2 + ret + +END(memcmp) +libc_hidden_builtin_def (memcmp) +weak_alias (memcmp,bcmp) +strong_alias (memcmp, __memcmpeq) +libc_hidden_def (__memcmpeq) + diff --git a/sysdeps/riscv/rvv/memcpy.S b/sysdeps/riscv/rvv/memcpy.S new file mode 100644 index 0000000000..de790fbe51 --- /dev/null +++ b/sysdeps/riscv/rvv/memcpy.S @@ -0,0 +1,51 @@ +/* RVV versions memcpy. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Jerry Shih . + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . 
*/ + +#include +#include + +#define pDst a0 +#define pSrc a1 +#define iNum a2 + +#define iVL a3 +#define pDstPtr a4 + +#define ELEM_LMUL_SETTING m8 +#define vData v0 + +ENTRY(memcpy) + + mv pDstPtr, pDst + +L(loop): + vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma + + vle8.v vData, (pSrc) + sub iNum, iNum, iVL + add pSrc, pSrc, iVL + vse8.v vData, (pDstPtr) + add pDstPtr, pDstPtr, iVL + + bnez iNum, L(loop) + + ret + +END(memcpy) +libc_hidden_builtin_def (memcpy) diff --git a/sysdeps/riscv/rvv/memmove.S b/sysdeps/riscv/rvv/memmove.S new file mode 100644 index 0000000000..ed12744064 --- /dev/null +++ b/sysdeps/riscv/rvv/memmove.S @@ -0,0 +1,72 @@ +/* RVV versions memmove. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Jerry Shih . + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + +#define pDst a0 +#define pSrc a1 +#define iNum a2 + +#define iVL a3 +#define pDstPtr a4 +#define pSrcBackwardPtr a5 +#define pDstBackwardPtr a6 + +#define ELEM_LMUL_SETTING m8 +#define vData v0 + +ENTRY(memmove) + + mv pDstPtr, pDst + + /* If pSrc is at or after pDst, every source byte is loaded before it is + overwritten, even when the regions overlap, so we can use the faster `forward-copy`.
*/ + bgeu pSrc, pDst, L(forward_copy_loop) + add pSrcBackwardPtr, pSrc, iNum + add pDstBackwardPtr, pDst, iNum + /* If pDst is inside the source data range, we need to use `backward_copy_loop` to + handle the overlap. */ + bltu pDst, pSrcBackwardPtr, L(backward_copy_loop) + +L(forward_copy_loop): + vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma + + vle8.v vData, (pSrc) + sub iNum, iNum, iVL + add pSrc, pSrc, iVL + vse8.v vData, (pDstPtr) + add pDstPtr, pDstPtr, iVL + + bnez iNum, L(forward_copy_loop) + ret + +L(backward_copy_loop): + vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma + + sub pSrcBackwardPtr, pSrcBackwardPtr, iVL + vle8.v vData, (pSrcBackwardPtr) + sub iNum, iNum, iVL + sub pDstBackwardPtr, pDstBackwardPtr, iVL + vse8.v vData, (pDstBackwardPtr) + bnez iNum, L(backward_copy_loop) + ret + +END(memmove) +libc_hidden_builtin_def (memmove) diff --git a/sysdeps/riscv/rvv/memset.S b/sysdeps/riscv/rvv/memset.S new file mode 100644 index 0000000000..3a6c3d0afd --- /dev/null +++ b/sysdeps/riscv/rvv/memset.S @@ -0,0 +1,51 @@ +/* RVV versions memset. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Jerry Shih . + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + .
*/ + +#include +#include + +#define pDst a0 +#define iValue a1 +#define iNum a2 + +#define iVL a3 +#define iTemp a4 +#define pDstPtr a5 + +#define ELEM_LMUL_SETTING m8 +#define vData v0 + +ENTRY(memset) + + mv pDstPtr, pDst + + vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma + vmv.v.x vData, iValue + +L(loop): + vse8.v vData, (pDstPtr) + sub iNum, iNum, iVL + add pDstPtr, pDstPtr, iVL + vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma + bnez iNum, L(loop) + + ret + +END(memset) +libc_hidden_builtin_def (memset) From patchwork Fri Apr 21 07:54:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hau Hsu X-Patchwork-Id: 68109 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 598223853821 for ; Fri, 21 Apr 2023 07:55:29 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 598223853821 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1682063729; bh=Hm/7cUa1/bLGlUij6cYL0nWvl7GDXs3H6f8oOwjui9Q=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=aT2o1jMALRhYH+s8Kfk6OHSfHfil/Q3j6J2rbcVAhpRQ414QOQ4tNnaU/XUzwtK4e EC9vnjewWvee6DNdMqTf54QU03dADLVQh+BLnF0UngNDh7qhuAiOrWez6eUTHXo225 dw2j+NEsu/tmJcnN1c6vVJDsyysmYViZD4jgJEXg= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-pf1-x429.google.com (mail-pf1-x429.google.com [IPv6:2607:f8b0:4864:20::429]) by sourceware.org (Postfix) with ESMTPS id 26F8F385700A for ; Fri, 21 Apr 2023 07:54:29 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 26F8F385700A Received: by mail-pf1-x429.google.com with SMTP id d2e1a72fcca58-63b7096e2e4so1746255b3a.2 for ; Fri, 21 Apr 2023 00:54:29 -0700 (PDT) X-Google-DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1682063668; x=1684655668; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Hm/7cUa1/bLGlUij6cYL0nWvl7GDXs3H6f8oOwjui9Q=; b=ly12LTa80TZ0lgKM9gKeRv0uvX5f/k2tq3GfEbbRn3DsJvT8PMH8UWQIBwwRBP7BIx 2iO/eHWPRlFrGZpI2Hj8cd2PrckzeACY1P3PAhLK7Dp4naoEkJm5jPP7W/rxd+1afU1j lHoVLHcYiSxc3Gd8buL7l5XP9XRx+uo+78f1zTtHQ9vHdO/7LWyjR8mJRWlBDb5uU7J+ CHuH/TSjrwNfxG0Y5/OQlpA6yBDGjFCF8sCkX2pqEucCLCPt4ygKwpBqPgbM9GZd38JU zHNgcNILe85sju7S29MmLAco7cPav2l0oSa1I4Gzy1xyTLwVV4QxjbEIhVPW4Cs70UPG pJQQ== X-Gm-Message-State: AAQBX9cJcD65+M1EMPhRHNxXtghNl7pQLftMExiBm13NP/CME/5oosxI 7LXK/SJ8qN/nV9tSVZOIvkWRWcwxbEdROz9zIbd2nd7mzSgrUrSZzURbUVOGNDL5jBM5njy6/jx m37DeP8Bd2Mi0A8A6XjxGiOnMBtjEQk2AS6OEpQSz0OzMfdnwaHrUr3DIO4m5eHfTMBdvZgZTh9 FzpSO9 X-Google-Smtp-Source: AKy350ZnNaIsZg11bAMXu2ZgnJU4flHMOomBv3DQcviWLaMNIcnK/ayX9xBVcfpqhHClNw/cmc3i8g== X-Received: by 2002:a05:6a00:15c2:b0:63f:158a:6e7b with SMTP id o2-20020a056a0015c200b0063f158a6e7bmr3060075pfu.6.1682063667526; Fri, 21 Apr 2023 00:54:27 -0700 (PDT) Received: from localhost.localdomain (1-169-217-217.dynamic-ip.hinet.net. 
[1.169.217.217]) by smtp.gmail.com with ESMTPSA id fa23-20020a056a002d1700b006259e883ee9sm650992pfb.189.2023.04.21.00.54.25 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Fri, 21 Apr 2023 00:54:27 -0700 (PDT) To: libc-alpha@sourceware.org, hongrong.hsu@sifive.com, jerry.shih@sifive.com, nick.knight@sifive.com, kito.cheng@sifive.com Cc: greentime.hu@sifive.com, alice.chan@sifive.com, andrew@sifive.com, vincent.chen@sifive.com, hau.hsu@sifive.com Subject: [PATCH v2 3/5] riscv: vectorized str* functions Date: Fri, 21 Apr 2023 15:54:03 +0800 Message-Id: <20230421075405.14892-4-hau.hsu@sifive.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230421075405.14892-1-hau.hsu@sifive.com> References: <20230421075405.14892-1-hau.hsu@sifive.com> MIME-Version: 1.0 X-Spam-Status: No, score=-12.1 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Hau Hsu via Libc-alpha From: Hau Hsu Reply-To: Hau Hsu Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org Sender: "Libc-alpha" From: Jerry Shih This patch proposes implementations of strcat, strcmp, strcpy, strlen, strncat, strncmp and strncpy that leverage the RISC-V V extension (RVV), version 1.0. These routines assume VLEN is at least 32 bits, as is required by all currently defined vector extensions, and they support arbitrarily large VLEN. All implementations work for both RV32 and RV64 platforms, and make no assumptions about page size.
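The str* routines in this patch scan for the terminating NUL with fault-only-first loads (`vle8ff.v`), then `vmseq` and `vfirst.m`. As a rough illustration for reviewers, the strlen loop can be modelled in scalar C as below. This sketch is not part of the patch: `model_first_nul`, `model_rvv_strlen`, and `vlmax` are hypothetical names, and the model assumes every load returns a full `vlmax` bytes, whereas a real `vle8ff.v` may shorten `vl` (which is why the assembly re-reads `vl` with `csrr`).

```c
#include <assert.h>
#include <stddef.h>

/* Models vle8ff.v + vmseq + vfirst.m on one chunk: look at up to
   `vlmax` bytes starting at `p` and return the index of the first NUL,
   or -1 if the chunk holds none.  The scan stops at the NUL, so it
   never touches bytes beyond the end of the string.  */
long model_first_nul(const char *p, size_t vlmax)
{
  for (size_t i = 0; i < vlmax; i++)
    if (p[i] == '\0')
      return (long) i;
  return -1;
}

/* Scalar model of the strlen loop: advance one chunk at a time until a
   chunk contains the NUL, then add the NUL's offset inside that chunk.
   Assumption: each chunk is exactly `vlmax` bytes.  */
size_t model_rvv_strlen(const char *str, size_t vlmax)
{
  const char *p = str;
  long pos;

  assert(vlmax > 0);
  while ((pos = model_first_nul(p, vlmax)) < 0)  /* `bltz ..., L(loop)` */
    p += vlmax;

  return (size_t) (p - str) + (size_t) pos;
}
```

The same scan-then-fix-up pattern appears in strcat and strcpy, which additionally use `vmsif.m` to build a store mask that covers the bytes up to and including the NUL.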
--- sysdeps/riscv/rvv/strcat.S | 72 ++++++++++++++++++++++++++++ sysdeps/riscv/rvv/strcmp.S | 93 +++++++++++++++++++++++++++++++++++++ sysdeps/riscv/rvv/strcpy.S | 56 ++++++++++++++++++++++ sysdeps/riscv/rvv/strlen.S | 54 +++++++++++++++++++++ sysdeps/riscv/rvv/strncat.S | 83 +++++++++++++++++++++++++++++++++ sysdeps/riscv/rvv/strncmp.S | 85 +++++++++++++++++++++++++++++++++ sysdeps/riscv/rvv/strncpy.S | 86 ++++++++++++++++++++++++++++++++++ 7 files changed, 529 insertions(+) create mode 100644 sysdeps/riscv/rvv/strcat.S create mode 100644 sysdeps/riscv/rvv/strcmp.S create mode 100644 sysdeps/riscv/rvv/strcpy.S create mode 100644 sysdeps/riscv/rvv/strlen.S create mode 100644 sysdeps/riscv/rvv/strncat.S create mode 100644 sysdeps/riscv/rvv/strncmp.S create mode 100644 sysdeps/riscv/rvv/strncpy.S diff --git a/sysdeps/riscv/rvv/strcat.S b/sysdeps/riscv/rvv/strcat.S new file mode 100644 index 0000000000..8a7779fd3c --- /dev/null +++ b/sysdeps/riscv/rvv/strcat.S @@ -0,0 +1,72 @@ +/* RVV versions strcat. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Jerry Shih . + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . 
*/ + +#include +#include + +#define pDst a0 +#define pSrc a1 +#define pDstPtr a2 + +#define iVL a3 +#define iCurrentVL a4 +#define iActiveElemPos a5 + +#define ELEM_LMUL_SETTING m1 +#define vMask1 v0 +#define vMask2 v1 +#define vStr1 v8 +#define vStr2 v16 + +ENTRY(strcat) + + mv pDstPtr, pDst + + /* Perform `strlen(dst)`. */ +L(strlen_loop): + vsetvli iVL, zero, e8, ELEM_LMUL_SETTING, ta, ma + + vle8ff.v vStr1, (pDstPtr) + vmseq.vx vMask1, vStr1, zero + csrr iCurrentVL, vl + vfirst.m iActiveElemPos, vMask1 + add pDstPtr, pDstPtr, iCurrentVL + bltz iActiveElemPos, L(strlen_loop) + + sub pDstPtr, pDstPtr, iCurrentVL + add pDstPtr, pDstPtr, iActiveElemPos + + /* Perform `strcpy(dst, src)`. */ +L(strcpy_loop): + vsetvli iVL, zero, e8, ELEM_LMUL_SETTING, ta, ma + + vle8ff.v vStr1, (pSrc) + vmseq.vx vMask2, vStr1, zero + csrr iCurrentVL, vl + vfirst.m iActiveElemPos, vMask2 + vmsif.m vMask1, vMask2 + add pSrc, pSrc, iCurrentVL + vse8.v vStr1, (pDstPtr), vMask1.t + add pDstPtr, pDstPtr, iCurrentVL + bltz iActiveElemPos, L(strcpy_loop) + + ret + +END(strcat) +libc_hidden_builtin_def (strcat) diff --git a/sysdeps/riscv/rvv/strcmp.S b/sysdeps/riscv/rvv/strcmp.S new file mode 100644 index 0000000000..c5f525bbe9 --- /dev/null +++ b/sysdeps/riscv/rvv/strcmp.S @@ -0,0 +1,93 @@ +// Copyright (c) 2023 SiFive, Inc. -- Proprietary and Confidential All Rights +// Reserved. +// +// NOTICE: All information contained herein is, and remains the property of +// SiFive, Inc. The intellectual and technical concepts contained herein are +// proprietary to SiFive, Inc. and may be covered by U.S. and Foreign Patents, +// patents in process, and are protected by trade secret or copyright law. +// +// This work may not be copied, modified, re-published, uploaded, executed, or +// distributed in any way, in any medium, whether in whole or in part, without +// prior written permission from SiFive, Inc. 
+// +// The copyright notice above does not evidence any actual or intended +// publication or disclosure of this source code, which includes information +// that is confidential and/or proprietary, and is a trade secret, of SiFive, +// Inc. +//===----------------------------------------------------------------------===// + +// Contributed by: Jerry Shih + +// Prototype: +// int strcmp(const char *lhs, const char *rhs) + +#include +#include + +#define iResult a0 + +#define pStr1 a0 +#define pStr2 a1 + +#define iVL a2 +#define iTemp1 a3 +#define iTemp2 a4 + +#define vStr1 v0 +#define vStr2 v8 +#define vMask1 v16 +#define vMask2 v17 + +ENTRY(strcmp) + // lmul=1 + +L(Loop): + vsetvli iVL, zero, e8, m1, ta, ma + vle8ff.v vStr1, (pStr1) + // check if vStr1[i] == 0 + vmseq.vx vMask1, vStr1, zero + + vle8ff.v vStr2, (pStr2) + // check if vStr1[i] != vStr2[i] + vmsne.vv vMask2, vStr1, vStr2 + + // find the index x for vStr1[x]==0 + vfirst.m iTemp1, vMask1 + // find the index x for vStr1[x]!=vStr2[x] + vfirst.m iTemp2, vMask2 + + bgez iTemp1, L(check1) + bgez iTemp2, L(check2) + + // get the current vl updated by vle8ff. + csrr iVL, vl + add pStr1, pStr1, iVL + add pStr2, pStr2, iVL + j L(Loop) + + // iTemp1>=0 +L(check1): + bltz iTemp2, 1f + blt iTemp2, iTemp1, L(check2) +1: + // iTemp2<0 + // iTemp2>=0 && iTemp1=0 +L(check2): + add pStr1, pStr1, iTemp2 + add pStr2, pStr2, iTemp2 + lbu iTemp1, 0(pStr1) + lbu iTemp2, 0(pStr2) + sub iResult, iTemp1, iTemp2 + ret + +END(strcmp) +libc_hidden_builtin_def (strcmp) diff --git a/sysdeps/riscv/rvv/strcpy.S b/sysdeps/riscv/rvv/strcpy.S new file mode 100644 index 0000000000..8fb754ee23 --- /dev/null +++ b/sysdeps/riscv/rvv/strcpy.S @@ -0,0 +1,56 @@ +/* RVV versions strcpy. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Jerry Shih . 
+ + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + +#define pDst a0 +#define pSrc a1 +#define pDstPtr a2 + +#define iVL a3 +#define iCurrentVL a4 +#define iActiveElemPos a5 + +#define ELEM_LMUL_SETTING m1 +#define vMask1 v0 +#define vMask2 v1 +#define vStr1 v8 +#define vStr2 v16 + +ENTRY(strcpy) + + mv pDstPtr, pDst + +L(strcpy_loop): + vsetvli iVL, zero, e8, ELEM_LMUL_SETTING, ta, ma + vle8ff.v vStr1, (pSrc) + vmseq.vx vMask2, vStr1, zero + csrr iCurrentVL, vl + vfirst.m iActiveElemPos, vMask2 + vmsif.m vMask1, vMask2 + add pSrc, pSrc, iCurrentVL + vse8.v vStr1, (pDstPtr), vMask1.t + add pDstPtr, pDstPtr, iCurrentVL + bltz iActiveElemPos, L(strcpy_loop) + + ret + +END(strcpy) +libc_hidden_builtin_def (strcpy) diff --git a/sysdeps/riscv/rvv/strlen.S b/sysdeps/riscv/rvv/strlen.S new file mode 100644 index 0000000000..eb456b094b --- /dev/null +++ b/sysdeps/riscv/rvv/strlen.S @@ -0,0 +1,54 @@ +/* RVV versions strlen. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Jerry Shih . + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. 
+ + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + +#define iResult a0 +#define pStr a0 +#define pCopyStr a1 +#define iVL a2 +#define iCurrentVL a2 +#define iEndOffset a3 + +#define ELEM_LMUL_SETTING m2 +#define vStr v0 +#define vMaskEnd v2 + +ENTRY(strlen) + + mv pCopyStr, pStr +L(loop): + vsetvli iVL, zero, e8, ELEM_LMUL_SETTING, ta, ma + vle8ff.v vStr, (pCopyStr) + csrr iCurrentVL, vl + vmseq.vi vMaskEnd, vStr, 0 + vfirst.m iEndOffset, vMaskEnd + add pCopyStr, pCopyStr, iCurrentVL + bltz iEndOffset, L(loop) + + add pStr, pStr, iCurrentVL + add pCopyStr, pCopyStr, iEndOffset + sub iResult, pCopyStr, iResult + + ret + +END(strlen) + +libc_hidden_builtin_def (strlen) diff --git a/sysdeps/riscv/rvv/strncat.S b/sysdeps/riscv/rvv/strncat.S new file mode 100644 index 0000000000..7847c4f008 --- /dev/null +++ b/sysdeps/riscv/rvv/strncat.S @@ -0,0 +1,83 @@ +/* RVV versions strncat. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Jerry Shih . + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. 
+ + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + +#define pDst a0 +#define pSrc a1 +#define iLength a2 +#define pDstPtr a3 + +#define iVL a4 +#define iCurrentVL a5 +#define iActiveElemPos a6 + +#define ELEM_LMUL_SETTING m1 +#define vMask1 v0 +#define vMask2 v1 +#define vStr1 v8 +#define vStr2 v16 + +ENTRY(strncat) + + mv pDstPtr, pDst + + /* the strlen of dst. */ +L(strlen_loop): + vsetvli iVL, zero, e8, ELEM_LMUL_SETTING, ta, ma + + vle8ff.v vStr1, (pDstPtr) + /* find the '\0'. */ + vmseq.vx vMask1, vStr1, zero + csrr iCurrentVL, vl + vfirst.m iActiveElemPos, vMask1 + add pDstPtr, pDstPtr, iCurrentVL + bltz iActiveElemPos, L(strlen_loop) + + sub pDstPtr, pDstPtr, iCurrentVL + add pDstPtr, pDstPtr, iActiveElemPos + + /* copy pSrc to pDstPtr. */ +L(strcpy_loop): + vsetvli zero, iLength, e8, ELEM_LMUL_SETTING, ta, ma + + vle8ff.v vStr1, (pSrc) + vmseq.vx vMask2, vStr1, zero + csrr iCurrentVL, vl + vfirst.m iActiveElemPos, vMask2 + vmsif.m vMask1, vMask2 + add pSrc, pSrc, iCurrentVL + sub iLength, iLength, iCurrentVL + vse8.v vStr1, (pDstPtr), vMask1.t + add pDstPtr, pDstPtr, iCurrentVL + beqz iLength, L(fill_zero) + bltz iActiveElemPos, L(strcpy_loop) + + ret + +L(fill_zero): + bgez iActiveElemPos, L(fill_zero_end) + sb zero, (pDstPtr) + +L(fill_zero_end): + ret + +END(strncat) +libc_hidden_builtin_def (strncat) diff --git a/sysdeps/riscv/rvv/strncmp.S b/sysdeps/riscv/rvv/strncmp.S new file mode 100644 index 0000000000..168dbb07ce --- /dev/null +++ b/sysdeps/riscv/rvv/strncmp.S @@ -0,0 +1,85 @@ +/* RVV versions strncmp. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Jerry Shih . 
+ + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + +#define iResult a0 + +#define pStr1 a0 +#define pStr2 a1 +#define iLength a2 + +#define iVL a3 +#define iTemp1 a4 +#define iTemp2 a5 + +#define ELEM_LMUL_SETTING m1 +#define vStr1 v0 +#define vStr2 v4 +#define vMask1 v8 +#define vMask2 v9 + +ENTRY(strncmp) + + beqz iLength, L(zero_length) + +L(loop): + vsetvli zero, iLength, e8, ELEM_LMUL_SETTING, ta, ma + + vle8ff.v vStr1, (pStr1) + /* vStr1[i] == 0. */ + vmseq.vx vMask1, vStr1, zero + + vle8ff.v vStr2, (pStr2) + /* vStr1[i] != vStr2[i]. */ + vmsne.vv vMask2, vStr1, vStr2 + + csrr iVL, vl + + /* r = mask1 | mask2 + We could use vfirst.m to get the first zero char or the + first different char between str1 and str2. 
*/ + vmor.mm vMask1, vMask1, vMask2 + + sub iLength, iLength, iVL + + vfirst.m iTemp1, vMask1 + + bgez iTemp1, L(end_loop) + + add pStr1, pStr1, iVL + add pStr2, pStr2, iVL + bnez iLength, L(loop) +L(end_loop): + + add pStr1, pStr1, iTemp1 + add pStr2, pStr2, iTemp1 + lbu iTemp1, 0(pStr1) + lbu iTemp2, 0(pStr2) + + sub iResult, iTemp1, iTemp2 + ret + +L(zero_length): + li iResult, 0 + ret + +END(strncmp) +libc_hidden_builtin_def (strncmp) diff --git a/sysdeps/riscv/rvv/strncpy.S b/sysdeps/riscv/rvv/strncpy.S new file mode 100644 index 0000000000..e8d9450448 --- /dev/null +++ b/sysdeps/riscv/rvv/strncpy.S @@ -0,0 +1,86 @@ +/* RVV versions strncpy. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Jerry Shih . + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + +#define pDst a0 +#define pSrc a1 +#define iLength a2 +#define pDstPtr a3 + +#define iVL a4 +#define iCurrentVL a5 +#define iActiveElemPos a6 +#define iTemp a7 + +#define ELEM_LMUL_SETTING m1 +#define vMask1 v0 +#define vMask2 v1 +#define ZERO_FILL_ELEM_LMUL_SETTING m8 +#define vStr1 v8 +#define vStr2 v16 + +ENTRY(strncpy) + + mv pDstPtr, pDst + + /* Copy pSrc to pDstPtr. 
*/ +L(strcpy_loop): + vsetvli zero, iLength, e8, ELEM_LMUL_SETTING, ta, ma + vle8ff.v vStr1, (pSrc) + vmseq.vx vMask2, vStr1, zero + csrr iCurrentVL, vl + vfirst.m iActiveElemPos, vMask2 + vmsif.m vMask1, vMask2 + add pSrc, pSrc, iCurrentVL + sub iLength, iLength, iCurrentVL + vse8.v vStr1, (pDstPtr), vMask1.t + add pDstPtr, pDstPtr, iCurrentVL + bgez iActiveElemPos, L(fill_zero) + bnez iLength, L(strcpy_loop) + ret + + /* Zero-fill the tail. */ +L(fill_zero): + /* The `\0` itself has already been copied to dst. `vfirst.m` + returns the index of the `\0`, so subtract an extra 1 + to get the correct remaining iLength for zero filling. */ + sub iTemp, iCurrentVL, iActiveElemPos + addi iTemp, iTemp, -1 + add iLength, iLength, iTemp + /* Return early for the `strlen(src) + 1 == count` case. */ + bnez iLength, 1f + ret +1: + sub pDstPtr, pDstPtr, iTemp + vsetvli zero, iLength, e8, ZERO_FILL_ELEM_LMUL_SETTING, ta, ma + vmv.v.x vStr2, zero + +L(fill_zero_loop): + vsetvli iVL, iLength, e8, ZERO_FILL_ELEM_LMUL_SETTING, ta, ma + vse8.v vStr2, (pDstPtr) + sub iLength, iLength, iVL + add pDstPtr, pDstPtr, iVL + bnez iLength, L(fill_zero_loop) + + ret + +END(strncpy) +libc_hidden_builtin_def (strncpy) From patchwork Fri Apr 21 07:54:04 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hau Hsu X-Patchwork-Id: 68111 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 698C4385383B for ; Fri, 21 Apr 2023 07:55:37 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 698C4385383B DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1682063737; bh=qKkRybMvAZHb34vEduXFG/+Jq5aZhVo6NLnTe937+Xc=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: 
From:Reply-To:From; b=u0a0OMnJAW3TaCn7+/y6OOADyow9zdrioCgR9Wq5lhbrOf+7mTicSvvdxqeOFTe89 aNRG/uaKz91Ek1MFkcgBgmD85OOZnSbEU6GLVQZwY71Jg8wTSaZ2Dk4hRrvgCKS2A2 QOf5faKzZJyZp6r/AZYe+VTX3X0bZo65MhMcSvmQ= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-pf1-x42d.google.com (mail-pf1-x42d.google.com [IPv6:2607:f8b0:4864:20::42d]) by sourceware.org (Postfix) with ESMTPS id 148D03857027 for ; Fri, 21 Apr 2023 07:54:31 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 148D03857027 Received: by mail-pf1-x42d.google.com with SMTP id d2e1a72fcca58-63b51fd2972so1644996b3a.3 for ; Fri, 21 Apr 2023 00:54:31 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1682063670; x=1684655670; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=qKkRybMvAZHb34vEduXFG/+Jq5aZhVo6NLnTe937+Xc=; b=XoTQOzqHC0tPx9iEo1/WZDgwXtT98tz9WqkENX2AT42qMiFJrJiR8ulOapAU3yA8zi wi2vvxOcTZVjmFYB6BueMwYRBMzNCCdX6M8OZgfeWLgTlbhSpjLXWFaVuPKSlVOPEdxK Q78P/SMF9XkhEO5G5eBYDd/gurGU9/y3ibSk81Y5jp62mxEotPCI8r6wwgS/4Iun174S x+qDwNHzCZm8eNfoxPB2T2ojz/bYcIrgkoCTn30/iwM9sbOH1F1e0YxeHr6HxfNI89mt 982a3YHTCU4z4+ozI4xejanD4k5enTfIDwKgYsFQX2eLz6G7dw3RiBbuSXV2TGeVmr0J 4LYg== X-Gm-Message-State: AAQBX9eNsAdbiKQCNFeu3DQWchkFWGr/FxfPpZTwYdGyco40Iv8SpD4X 4EmO9vnLP9u0AKWejhdip8PUl+8Z67dIQoRZbU1raxOWxjNd0M5szYTWN6eea3zi4UuHD9YrYcW aGgm5Od45iUnOHo0rLVPs3DZc7cNJ99h8TlkWmm0dsOcz25ZqziMhKe5jtAktVjguLjLKn660Qh 5QO0kr X-Google-Smtp-Source: AKy350YLKRbwZQJG4wf2TedVp35sokdAHwQP0rGQdgMY62xbXQFKXGZwvzrWEItTOz0B6Ek1uRXBQg== X-Received: by 2002:a05:6a20:6a28:b0:ef:cd5b:a5c7 with SMTP id p40-20020a056a206a2800b000efcd5ba5c7mr6054019pzk.56.1682063669676; Fri, 21 Apr 2023 00:54:29 -0700 (PDT) Received: from localhost.localdomain (1-169-217-217.dynamic-ip.hinet.net. 
[1.169.217.217]) by smtp.gmail.com with ESMTPSA id fa23-20020a056a002d1700b006259e883ee9sm650992pfb.189.2023.04.21.00.54.27 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Fri, 21 Apr 2023 00:54:29 -0700 (PDT) To: libc-alpha@sourceware.org, hongrong.hsu@sifive.com, jerry.shih@sifive.com, nick.knight@sifive.com, kito.cheng@sifive.com Cc: greentime.hu@sifive.com, alice.chan@sifive.com, andrew@sifive.com, vincent.chen@sifive.com, hau.hsu@sifive.com Subject: [PATCH v2 4/5] riscv: vectorized strchr and strnlen functions Date: Fri, 21 Apr 2023 15:54:04 +0800 Message-Id: <20230421075405.14892-5-hau.hsu@sifive.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230421075405.14892-1-hau.hsu@sifive.com> References: <20230421075405.14892-1-hau.hsu@sifive.com> MIME-Version: 1.0 X-Spam-Status: No, score=-12.3 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Hau Hsu via Libc-alpha From: Hau Hsu Reply-To: Hau Hsu Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org Sender: "Libc-alpha" From: Nick Knight This patch proposes implementations of strchr and strnlen that leverage the RISC-V V extension (RVV), version 1.0. These routines assume VLEN is at least 32 bits, as is required by all currently defined vector extensions, and they support arbitrarily large VLEN. All implementations work for both RV32 and RV64 platforms, and make no assumptions about page size. 
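For review purposes, the scalar behaviour that the strnlen loop is expected to match can be modelled in C. This is only a sketch under stated assumptions: `ref_strnlen` and its `chunk` parameter are hypothetical names, with `chunk` standing in for the `vl` value left behind by `vsetvli` and `vle8ff.v`; neither is part of this patch.

```c
#include <stddef.h>

/* Hypothetical scalar model of the strnlen loop (not part of the patch).
   It scans at most MAXLEN bytes in chunks of at most CHUNK bytes, where
   CHUNK stands in for the vector length vl.  */
size_t
ref_strnlen (const char *s, size_t maxlen, size_t chunk)
{
  size_t done = 0;

  /* A real vl is never zero when the requested length is nonzero.  */
  if (chunk == 0)
    chunk = 1;

  while (done < maxlen)
    {
      size_t remaining = maxlen - done;
      size_t n = remaining < chunk ? remaining : chunk;

      /* Models vmseq.vi + vfirst.m: find the first \0 in this chunk.  */
      for (size_t i = 0; i < n; i++)
        if (s[done + i] == '\0')
          return done + i;

      done += n;
    }

  /* No terminator within MAXLEN bytes.  */
  return maxlen;
}
```

The result should not depend on `chunk`, which mirrors the claim that the routines support any VLEN.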
--- sysdeps/riscv/rvv/strchr.S | 53 +++++++++++++++++++++++++++++++++++ sysdeps/riscv/rvv/strnlen.S | 56 +++++++++++++++++++++++++++++++++++++ 2 files changed, 109 insertions(+) create mode 100644 sysdeps/riscv/rvv/strchr.S create mode 100644 sysdeps/riscv/rvv/strnlen.S diff --git a/sysdeps/riscv/rvv/strchr.S b/sysdeps/riscv/rvv/strchr.S new file mode 100644 index 0000000000..4a660200c3 --- /dev/null +++ b/sysdeps/riscv/rvv/strchr.S @@ -0,0 +1,53 @@ +/* RISC-V multiarch strchr, V-extension version. + Copyright (C) 2022 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Nick Knight . + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + + + + +ENTRY(strchr) +0: + vsetvli t0, zero, e8, m8, ta, ma + vle8ff.v v0, (a0) + vmseq.vi v8, v0, 0 + vmseq.vx v9, v0, a1 + vfirst.m a2, v8 /* first occurrence of \0 */ + vfirst.m a3, v9 /* first occurrence of ch */ + addi a4, a3, 1 + seqz a4, a4 + sltu a5, a2, a3 + or a4, a4, a5 + beqz a4, 1f /* Found ch, not preceded by \0? */ + li a6, -1 + csrr a5, vl + add a0, a0, a5 + beq a2, a6, 0b /* Didn't find \0? 
*/ + li a0, 0 + ret +1: + add a0, a0, a3 + ret + +END(strchr) +weak_alias (strchr, index) +libc_hidden_builtin_def (strchr) + + diff --git a/sysdeps/riscv/rvv/strnlen.S b/sysdeps/riscv/rvv/strnlen.S new file mode 100644 index 0000000000..c1ce12baa5 --- /dev/null +++ b/sysdeps/riscv/rvv/strnlen.S @@ -0,0 +1,56 @@ +/* RVV versions strnlen. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by: Nick Knight + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . 
*/ + +#include +#include + +#define pStr a0 +#define pCopyStr a2 +#define iRetValue a0 +#define iMaxlen a1 +#define iCurrentVL a3 +#define iEndOffset a4 + +#define ELEM_LMUL_SETTING m1 +#define vStr v0 +#define vMaskEnd v8 + +ENTRY(__strnlen) + + mv pCopyStr, pStr + mv iRetValue, iMaxlen +L(strnlen_loop): + beqz iMaxlen, L(end_strnlen_loop) + vsetvli zero, iMaxlen, e8, ELEM_LMUL_SETTING, ta, ma + vle8ff.v vStr, (pCopyStr) + vmseq.vi vMaskEnd, vStr, 0 + vfirst.m iEndOffset, vMaskEnd /* first occurrence of \0 */ + csrr iCurrentVL, vl + add pCopyStr, pCopyStr, iCurrentVL + sub iMaxlen, iMaxlen, iCurrentVL + bltz iEndOffset, L(strnlen_loop) + add iMaxlen, iMaxlen, iCurrentVL + sub iRetValue, iRetValue, iMaxlen + add iRetValue, iRetValue, iEndOffset +L(end_strnlen_loop): + ret +END(__strnlen) +weak_alias (__strnlen, strnlen) +libc_hidden_builtin_def (strnlen) +libc_hidden_builtin_def (__strnlen) From patchwork Fri Apr 21 07:54:05 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hau Hsu X-Patchwork-Id: 68110 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 1A2DF385381F for ; Fri, 21 Apr 2023 07:55:30 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 1A2DF385381F DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1682063730; bh=iXYYco1kbXOODD2dFTtsWq6yKtRa6g72VD/nuRu/hoo=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=lPdskiA3u9gZ4OTBl8yy7niDL6T4KpR+CFYQUDpWNYEqq49PTOtnUpMUKtcFOIsYx szfMF6wQ97tMkfbsoDIR9nR5X1+mf5m7NnhQGj0Hbln4iOSHl3GKjRJDQM8beaQc1V sG6pqxlOA1plU7dhJLTqHBnNBypZ7DUbzONX8nFY= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-pf1-x432.google.com 
(mail-pf1-x432.google.com [IPv6:2607:f8b0:4864:20::432]) by sourceware.org (Postfix) with ESMTPS id 3EB893858434 for ; Fri, 21 Apr 2023 07:54:33 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 3EB893858434 Received: by mail-pf1-x432.google.com with SMTP id d2e1a72fcca58-63b35789313so1489430b3a.3 for ; Fri, 21 Apr 2023 00:54:33 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1682063672; x=1684655672; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=iXYYco1kbXOODD2dFTtsWq6yKtRa6g72VD/nuRu/hoo=; b=Y/wAJSw6rSSkWxikeYKtxiDPx2XbZIuJMkR1OjVGlpQMMKlXTrJoIIh1xWHLM1G4f6 Rq3p/zoM/pSBJLxIshTllX5QSuJz23RLHzIWV2W+tltqmFVs4C1iuycdRL0neGKIvDQo fD8JHItbXsEaZOyeBaygFdX6mNcthNSMOwo0lxMtD5C/27PNGGYFjIpqPIpLgzFTxTdw 0nHmISDytxq6DNOxzqC1wRXNR1YCH4NPLDRfuviktXmzpRJaRuq71TxI1OLkcNvSEdbq O2We0TVXe55ug8Qs3y8JF3TP1tzIy+KpMC2qfarqLOHcbCrgUjMSCoh1v/iFFautH+Jj akFA== X-Gm-Message-State: AAQBX9e2mnjV092BvR235q+y7pOj+5RLStVWYUXsh1PcpRaxXX6L+3Gx ClnrcvIRZ7JXNA4XHWvLHQTrnitin9j0+IWuBC6gKCFvj2QsQw/7t6pcjB0sny174ZYwzFqT9zs N7C5rljyOR5cq3331K2vaOXgFh6GMjjwNCI3IX+Zp0K5MVg5a3a/ErargPP4tzyCrgkv9KW6g/2 mGAGvW X-Google-Smtp-Source: AKy350Z2ZuJyZrCiHc16kn7tYxku6EaucCv7nYwdpZghs7qn9P6taOFQJojfRVkekUC7Ce8iRXSazg== X-Received: by 2002:a05:6a20:c1a6:b0:e5:58e6:be37 with SMTP id bg38-20020a056a20c1a600b000e558e6be37mr5344940pzb.61.1682063671874; Fri, 21 Apr 2023 00:54:31 -0700 (PDT) Received: from localhost.localdomain (1-169-217-217.dynamic-ip.hinet.net. 
[1.169.217.217]) by smtp.gmail.com with ESMTPSA id fa23-20020a056a002d1700b006259e883ee9sm650992pfb.189.2023.04.21.00.54.29 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Fri, 21 Apr 2023 00:54:31 -0700 (PDT) To: libc-alpha@sourceware.org, hongrong.hsu@sifive.com, jerry.shih@sifive.com, nick.knight@sifive.com, kito.cheng@sifive.com Cc: greentime.hu@sifive.com, alice.chan@sifive.com, andrew@sifive.com, vincent.chen@sifive.com, hau.hsu@sifive.com, Yun Hsiang Subject: [PATCH v2 5/5] riscv: add vectorized __memcmpeq Date: Fri, 21 Apr 2023 15:54:05 +0800 Message-Id: <20230421075405.14892-6-hau.hsu@sifive.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230421075405.14892-1-hau.hsu@sifive.com> References: <20230421075405.14892-1-hau.hsu@sifive.com> MIME-Version: 1.0 X-Spam-Status: No, score=-12.4 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Hau Hsu via Libc-alpha From: Hau Hsu Reply-To: Hau Hsu Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org Sender: "Libc-alpha" From: Yun Hsiang This patch proposes an implementation of __memcmpeq that leverages the RISC-V V extension (RVV), version 1.0. The routine assumes VLEN is at least 32 bits, as is required by all currently defined vector extensions, and it supports arbitrarily large VLEN. The implementation works for both RV32 and RV64 platforms, and makes no assumptions about page size. 
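As background for review: __memcmpeq only has to report whether the two buffers are equal, so any nonzero value is acceptable on a mismatch, which is why the routine can return the current vector length from L(found). A minimal C sketch of that contract follows. `ref_memcmpeq` and its `chunk` parameter are hypothetical names (with `chunk` standing in for `vl`) and are not part of this patch.

```c
#include <stddef.h>

/* Hypothetical scalar model of the __memcmpeq contract (not part of the
   patch): return 0 if the first N bytes are equal, nonzero otherwise.
   CHUNK stands in for the vector length vl.  */
int
ref_memcmpeq (const void *a, const void *b, size_t n, size_t chunk)
{
  const unsigned char *p = a;
  const unsigned char *q = b;

  /* A real vl is never zero when the requested length is nonzero.  */
  if (chunk == 0)
    chunk = 1;

  while (n != 0)
    {
      size_t vl = n < chunk ? n : chunk;

      /* Models vmsne.vv + vfirst.m: look for any differing byte.  */
      for (size_t i = 0; i < vl; i++)
        if (p[i] != q[i])
          return 1;   /* The assembly returns vl here, also nonzero.  */

      p += vl;
      q += vl;
      n -= vl;
    }

  return 0;
}
```

Unlike memcmp, the sign of the result carries no ordering information, so callers may only test it against zero.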
--- sysdeps/riscv/rvv/memcmp.S | 4 --- sysdeps/riscv/rvv/memcmpeq.S | 69 ++++++++++++++++++++++++++++++++++++ 2 files changed, 69 insertions(+), 4 deletions(-) create mode 100644 sysdeps/riscv/rvv/memcmpeq.S diff --git a/sysdeps/riscv/rvv/memcmp.S b/sysdeps/riscv/rvv/memcmp.S index b156ec524c..74d8361293 100644 --- a/sysdeps/riscv/rvv/memcmp.S +++ b/sysdeps/riscv/rvv/memcmp.S @@ -69,7 +69,3 @@ L(found): END(memcmp) libc_hidden_builtin_def (memcmp) -weak_alias (memcmp,bcmp) -strong_alias (memcmp, __memcmpeq) -libc_hidden_def (__memcmpeq) - diff --git a/sysdeps/riscv/rvv/memcmpeq.S b/sysdeps/riscv/rvv/memcmpeq.S new file mode 100644 index 0000000000..302bca6992 --- /dev/null +++ b/sysdeps/riscv/rvv/memcmpeq.S @@ -0,0 +1,69 @@ +/* RVV version of __memcmpeq. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Jerry Shih , + Yun Hsiang . + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . 
*/ + +#include +#include + + +#define iResult a0 + +#define pSrc1 a0 +#define pSrc2 a1 +#define iNum a2 + +#define iVL a3 +#define iTemp a4 + +#define ELEM_LMUL_SETTING m1 +#define vData1 v0 +#define vData2 v8 +#define vMask v16 + +ENTRY(__memcmpeq) + +L(loop): + vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma + + vle8.v vData1, (pSrc1) + vle8.v vData2, (pSrc2) + + vmsne.vv vMask, vData1, vData2 + sub iNum, iNum, iVL + vfirst.m iTemp, vMask + + // Leave the loop as soon as pSrc1 and pSrc2 differ in this chunk. + bgez iTemp, L(found) + + add pSrc1, pSrc1, iVL + add pSrc2, pSrc2, iVL + + bnez iNum, L(loop) + + li iResult, 0 + ret + +L(found): + mv iResult, iVL + ret + +END(__memcmpeq) + +weak_alias (__memcmpeq, bcmp) +libc_hidden_def (__memcmpeq)