From patchwork Wed Mar 1 15:32:44 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Vincent Chen X-Patchwork-Id: 65845 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 16A2C3850841 for ; Wed, 1 Mar 2023 15:34:22 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 16A2C3850841 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1677684862; bh=PZYj+ZrC9/22JDzdeScW5amNIsJBo+2bc/ramKvra2M=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=f/c3JS+MMQS7t5g+ZvJgE9dBOR/nFmQkv/NeqRyfhcrDLkmjJBYdmgqJKP3W4jfQ8 WgQiYzTbjZul6q7tJqbklBuoWg4mC9MIWGt/3XpC3zaECJPmJNFTtLzIBz9E/HW44S j3tC6wemjscOSg5EAWD4tvJtSRCnbbbXi7Z7Y1jM= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-pl1-x62f.google.com (mail-pl1-x62f.google.com [IPv6:2607:f8b0:4864:20::62f]) by sourceware.org (Postfix) with ESMTPS id 6AC673858408 for ; Wed, 1 Mar 2023 15:33:07 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 6AC673858408 Received: by mail-pl1-x62f.google.com with SMTP id i10so14364284plr.9 for ; Wed, 01 Mar 2023 07:33:07 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1677684786; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=PZYj+ZrC9/22JDzdeScW5amNIsJBo+2bc/ramKvra2M=; b=zMJGF/C4jcJ1rEPEfn6Id7RFXrprRck+mvhsvOxuOyatRjLZWQtzU3y7oAaeiRq8N4 c+xRJCsg0Kn9pnYUQ1lnKuqxvXi/nyFSxtXeUlvnp0qWbw56j9AtUpdptS0UId0rFYMS REC/AlivlcJ4wpZaKRhDdIeKAvB8IZaH+6j+NYRhvFToaj707k5d+7W7yPlOiqQvHF/7 b7tQYtf1NERKsnRRsibNnRf1qNQEs5KN9UJ2Qfqm600uj3NCK7nC8XFN+DryMChIuavp 3BuSumzUhfrMHZNHtMFKgCD648ULx6k44aZQapbBAjcpbrSQF0u0UEt2yb9I9g2GjFgr sHzQ== X-Gm-Message-State: AO0yUKVmb5KkyNv1DcTsbD3oqbttVuyoV+E4c6ibzOoXCQZckw3UUKyv kX9VQNfwO1U0BtM/pv0ftt0P6b8CLIrPGrDF3XhXLbVU4jxb+SNbmrVrR+Wixl+8LunKcnikG9s 5IjDQuAo9n5Dz43b42zcSPoMp3/mekscg77JjcObPbTVCPpophtggLocTcvmT5/Fqvy5g2lkkNV 2YkUxNWg== X-Google-Smtp-Source: AK7set84v3rVJb8DFIdDGji1yZDSdt7cTLY0SAm+ND5o0YOxO9LGj9uzwWS3iBSxEcByJR7G3uFZ0Q== X-Received: by 2002:a17:90b:4a88:b0:22b:f0d4:9e1e with SMTP id lp8-20020a17090b4a8800b0022bf0d49e1emr8176184pjb.8.1677684785976; Wed, 01 Mar 2023 07:33:05 -0800 (PST) Received: from localhost.localdomain (111-251-213-204.dynamic-ip.hinet.net. [111.251.213.204]) by smtp.gmail.com with ESMTPSA id a6-20020a17090a740600b002345ef591dasm8058025pjg.31.2023.03.01.07.33.03 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 01 Mar 2023 07:33:05 -0800 (PST) To: libc-alpha@sourceware.org, palmer@dabbelt.com, darius@bluespec.com, andrew@sifive.com, dj@redhat.com Cc: jerry.shih@sifive.com, nick.knight@sifive.com, hongrong.hsu@sifive.com, hau.hsu@sifive.com, kito.cheng@sifive.com, vincent.chen@sifive.com, greentime.hu@sifive.com Subject: [PATCH 1/4] riscv: Enabling vectorized mem*/str* functions in build time Date: Wed, 1 Mar 2023 23:32:44 +0800 Message-Id: <20230301153247.1499566-2-vincent.chen@sifive.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230301153247.1499566-1-vincent.chen@sifive.com> References: <20230301153247.1499566-1-vincent.chen@sifive.com> MIME-Version: 1.0 X-Spam-Status: No, score=-11.4 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Vincent Chen via Libc-alpha From: Vincent Chen Reply-To: Vincent Chen Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org Sender: "Libc-alpha" Let the build selects the vectorized mem*/str* functions when it detects the compiler supports RISC-V V extension and enables it in this build. We agree that the these vectorized mem*/str* functions should be selected by IFUNC. Therefore, this patch is intended as a **temporary solution** to enable reviewers to evaluate the effectiveness of these vectorized mem*/str* functions. --- scripts/build-many-glibcs.py | 10 ++++++++++ sysdeps/riscv/preconfigure | 19 +++++++++++++++++++ sysdeps/riscv/preconfigure.ac | 18 ++++++++++++++++++ sysdeps/riscv/rv32/rvv/Implies | 2 ++ sysdeps/riscv/rv64/rvv/Implies | 2 ++ 5 files changed, 51 insertions(+) create mode 100644 sysdeps/riscv/rv32/rvv/Implies create mode 100644 sysdeps/riscv/rv64/rvv/Implies diff --git a/scripts/build-many-glibcs.py b/scripts/build-many-glibcs.py index bd212dbc82..8ac86c50f6 100755 --- a/scripts/build-many-glibcs.py +++ b/scripts/build-many-glibcs.py @@ -381,6 +381,11 @@ class Context(object): variant='rv32imafdc-ilp32d', gcc_cfg=['--with-arch=rv32imafdc', '--with-abi=ilp32d', '--disable-multilib']) + self.add_config(arch='riscv32', + os_name='linux-gnu', + variant='rv32imafdcv-ilp32d', + gcc_cfg=['--with-arch=rv32imafdcv', '--with-abi=ilp32d', + '--disable-multilib']) self.add_config(arch='riscv64', os_name='linux-gnu', variant='rv64imac-lp64', @@ -396,6 +401,11 @@ class Context(object): variant='rv64imafdc-lp64d', gcc_cfg=['--with-arch=rv64imafdc', '--with-abi=lp64d', '--disable-multilib']) + self.add_config(arch='riscv64', + os_name='linux-gnu', + variant='rv64imafdcv-lp64d', + gcc_cfg=['--with-arch=rv64imafdcv', '--with-abi=lp64d', + '--disable-multilib']) self.add_config(arch='s390x', os_name='linux-gnu', glibcs=[{}, diff --git a/sysdeps/riscv/preconfigure b/sysdeps/riscv/preconfigure index 4dedf4b0bb..5ddc195b46 100644 --- a/sysdeps/riscv/preconfigure +++ b/sysdeps/riscv/preconfigure @@ -7,6 +7,7 @@ riscv*) flen=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | sed -n 's/^#define __riscv_flen \(.*\)/\1/p'` float_abi=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | sed -n 's/^#define __riscv_float_abi_\([^ ]*\) .*/\1/p'` atomic=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | grep '#define __riscv_atomic' | cut -d' ' -f2` + vector=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | grep '#define __riscv_vector' | cut -d' ' -f2` case "$xlen" in 64 | 32) @@ -32,6 +33,24 @@ riscv*) ;; esac + case "$vector" in + __riscv_vector) + case "$flen" in + 64) + float_machine=rvv + ;; + *) + # V 1.0 spec requires both F and D extensions, but this may be an older version. Degrade to scalar only. + ;; + esac + ;; + *) + ;; + esac + + { $as_echo "$as_me:${as_lineno-$LINENO}: vector $vector flen $flen float_machine $float_machine" >&5 +$as_echo "$as_me: vector $vector flen $flen float_machine $float_machine" >&6;} + case "$float_abi" in soft) abi_flen=0 diff --git a/sysdeps/riscv/preconfigure.ac b/sysdeps/riscv/preconfigure.ac index a5c30e0dbf..b6b8bb46e4 100644 --- a/sysdeps/riscv/preconfigure.ac +++ b/sysdeps/riscv/preconfigure.ac @@ -7,6 +7,7 @@ riscv*) flen=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | sed -n 's/^#define __riscv_flen \(.*\)/\1/p'` float_abi=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | sed -n 's/^#define __riscv_float_abi_\([^ ]*\) .*/\1/p'` atomic=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | grep '#define __riscv_atomic' | cut -d' ' -f2` + vector=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | grep '#define __riscv_vector' | cut -d' ' -f2` case "$xlen" in 64 | 32) @@ -32,6 +33,23 @@ riscv*) ;; esac + case "$vector" in + __riscv_vector) + case "$flen" in + 64) + float_machine=rvv + ;; + *) + # V 1.0 spec requires both F and D extensions, but this may be an older version. Degrade to scalar only. + ;; + esac + ;; + *) + ;; + esac + + AC_MSG_NOTICE([vector $vector flen $flen float_machine $float_machine]) + case "$float_abi" in soft) abi_flen=0 diff --git a/sysdeps/riscv/rv32/rvv/Implies b/sysdeps/riscv/rv32/rvv/Implies new file mode 100644 index 0000000000..25ce1df222 --- /dev/null +++ b/sysdeps/riscv/rv32/rvv/Implies @@ -0,0 +1,2 @@ +riscv/rv32/rvd +riscv/rvv diff --git a/sysdeps/riscv/rv64/rvv/Implies b/sysdeps/riscv/rv64/rvv/Implies new file mode 100644 index 0000000000..9993bb30e3 --- /dev/null +++ b/sysdeps/riscv/rv64/rvv/Implies @@ -0,0 +1,2 @@ +riscv/rv64/rvd +riscv/rvv From patchwork Wed Mar 1 15:32:45 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Vincent Chen X-Patchwork-Id: 65844 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id DAD743857B93 for ; Wed, 1 Mar 2023 15:33:44 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org DAD743857B93 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1677684824; bh=M0lbd3CkKMT0Y1CGq1o++YHvUAQxVoN5G+ynoNqtd9s=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=HgyXSCNprsN7GmRxFBQFf5SQ905jiWFOLCx8igcKl5l80vPevmxdW86fdQZY0tkg4 mtkpazf4U4mCZJk0kdbViBKHe44CnbudiF67uJAsNJsCqNNnVUy6J2YZ3gLXl80UgH +16R4AmzC6fzq++vanp2a8RD9UZFK4R5LDayeMpY= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-pj1-x1034.google.com (mail-pj1-x1034.google.com [IPv6:2607:f8b0:4864:20::1034]) by sourceware.org (Postfix) with ESMTPS id C78DC385841C for ; Wed, 1 Mar 2023 15:33:10 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org C78DC385841C Received: by mail-pj1-x1034.google.com with SMTP id me6-20020a17090b17c600b0023816b0c7ceso9755366pjb.2 for ; Wed, 01 Mar 2023 07:33:10 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1677684789; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=M0lbd3CkKMT0Y1CGq1o++YHvUAQxVoN5G+ynoNqtd9s=; b=D4HDTZawLVBfYVhuHPILa+kfj/Xw+kYbTBy50f50wvqtS9ZTEsq8mXULzoD+4WvO3V IjDrDD93v+0xyfbGO+fOG/vVRjLJ7/ZvluHS8dAnsVtcSyPsIsrf47S/T1+UgCdc0nfv /4nsMVTy3rj0ke0FSzrSYjKuIcO0nlpTl09waYnKYNxOE1lfbCBUV8KUqzFcs82TrdE/ gP3d69vpRNdHO+vdRnQ4AhXRaUs/oRQ6uT9MZkNKdNtHjXh6X+Pc3I2hyhz9bzTYtiI2 kuimryzQ2jgyPB39WruQ0K+lFK25/x+5aJM44LpWGQ8CobaezZMCSHGdDK2JkTQgFRso R8jA== X-Gm-Message-State: AO0yUKVLsyVcwxY0wwWiBSsTRztnrBzcjG36MrPNngq1OEmSqeFb194D uPtXS5/osgHU+h+HCINpAydiXu9WUGZcCRlpmANwKWteLd8/XHOyUQsZBVt4UzGFEmCzltHEtEC ih2w2iN/sn5pxqrG9WZE/ivUK9kTSY9adyS6Bhipm5DCMkdATcvUzhyNSkzDt6FOJN/AeVKpHCI UWkD0/gA== X-Google-Smtp-Source: AK7set+x7LVr6oi5FBILtlIaN2+pm5hYTZ0ubJcLD9cD9xABZqJyZAaFuX3+Hjf/Z0xqxh3wo4iKKQ== X-Received: by 2002:a17:90b:4d8b:b0:234:f4a:8985 with SMTP id oj11-20020a17090b4d8b00b002340f4a8985mr8099391pjb.15.1677684789154; Wed, 01 Mar 2023 07:33:09 -0800 (PST) Received: from localhost.localdomain (111-251-213-204.dynamic-ip.hinet.net. [111.251.213.204]) by smtp.gmail.com with ESMTPSA id a6-20020a17090a740600b002345ef591dasm8058025pjg.31.2023.03.01.07.33.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 01 Mar 2023 07:33:08 -0800 (PST) To: libc-alpha@sourceware.org, palmer@dabbelt.com, darius@bluespec.com, andrew@sifive.com, dj@redhat.com Cc: jerry.shih@sifive.com, nick.knight@sifive.com, hongrong.hsu@sifive.com, hau.hsu@sifive.com, kito.cheng@sifive.com, vincent.chen@sifive.com, greentime.hu@sifive.com Subject: [PATCH 2/4] riscv: vectorized mem* functions Date: Wed, 1 Mar 2023 23:32:45 +0800 Message-Id: <20230301153247.1499566-3-vincent.chen@sifive.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230301153247.1499566-1-vincent.chen@sifive.com> References: <20230301153247.1499566-1-vincent.chen@sifive.com> MIME-Version: 1.0 X-Spam-Status: No, score=-11.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Vincent Chen via Libc-alpha From: Vincent Chen Reply-To: Vincent Chen Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org Sender: "Libc-alpha" From: Jerry Shih This patch proposes implementations of memchr, memcmp, memcpy, memmove, and memset that leverage the RISC-V V extension (RVV), version 1.0. These routines assumes VLEN is at least 32 bits, as is required by all currently defined vector extensions, and they support arbitrarily large VLEN. All implementations work for both RV32 and RV64 platforms, and make no assumptions about page size. --- sysdeps/riscv/rvv/memchr.S | 63 +++++++++++++++++++++++++++++++ sysdeps/riscv/rvv/memcmp.S | 75 +++++++++++++++++++++++++++++++++++++ sysdeps/riscv/rvv/memcpy.S | 51 +++++++++++++++++++++++++ sysdeps/riscv/rvv/memmove.S | 72 +++++++++++++++++++++++++++++++++++ sysdeps/riscv/rvv/memset.S | 51 +++++++++++++++++++++++++ 5 files changed, 312 insertions(+) create mode 100644 sysdeps/riscv/rvv/memchr.S create mode 100644 sysdeps/riscv/rvv/memcmp.S create mode 100644 sysdeps/riscv/rvv/memcpy.S create mode 100644 sysdeps/riscv/rvv/memmove.S create mode 100644 sysdeps/riscv/rvv/memset.S diff --git a/sysdeps/riscv/rvv/memchr.S b/sysdeps/riscv/rvv/memchr.S new file mode 100644 index 0000000000..6981a9f8b0 --- /dev/null +++ b/sysdeps/riscv/rvv/memchr.S @@ -0,0 +1,63 @@ +/* RVV versions memchr. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Jerry Shih . + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + +#define iResult a0 + +#define pSrc a0 +#define iValue a1 +#define iNum a2 + +#define iVL a3 +#define iTemp a4 + +#define ELEM_LMUL_SETTING m8 +#define vData v0 +#define vMask v8 + +ENTRY(memchr) + +L(loop): + vsetvli zero, iNum, e8, ELEM_LMUL_SETTING, ta, ma + + vle8ff.v vData, (pSrc) + /* Find the iValue inside the loaded data. */ + vmseq.vx vMask, vData, iValue + vfirst.m iTemp, vMask + + /* Skip the loop if we find the matched value. */ + bgez iTemp, L(found) + + csrr iVL, vl + sub iNum, iNum, iVL + add pSrc, pSrc, iVL + + bnez iNum, L(loop) + + li iResult, 0 + ret + +L(found): + add iResult, pSrc, iTemp + ret + +END(memchr) +libc_hidden_builtin_def (memchr) diff --git a/sysdeps/riscv/rvv/memcmp.S b/sysdeps/riscv/rvv/memcmp.S new file mode 100644 index 0000000000..b156ec524c --- /dev/null +++ b/sysdeps/riscv/rvv/memcmp.S @@ -0,0 +1,75 @@ +/* RVV versions memcmp. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Jerry Shih . + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + +#define iResult a0 + +#define pSrc1 a0 +#define pSrc2 a1 +#define iNum a2 + +#define iVL a3 +#define iTemp a4 +#define iTemp1 a5 +#define iTemp2 a6 + +#define ELEM_LMUL_SETTING m8 +#define vData1 v0 +#define vData2 v8 +#define vMask v16 + +ENTRY(memcmp) + +L(loop): + vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma + + vle8.v vData1, (pSrc1) + vle8.v vData2, (pSrc2) + + vmsne.vv vMask, vData1, vData2 + sub iNum, iNum, iVL + vfirst.m iTemp, vMask + + /* Skip the loop if we find the different value between pSrc1 and pSrc2. */ + bgez iTemp, L(found) + + add pSrc1, pSrc1, iVL + add pSrc2, pSrc2, iVL + + bnez iNum, L(loop) + + li iResult, 0 + ret + +L(found): + add pSrc1, pSrc1, iTemp + add pSrc2, pSrc2, iTemp + lbu iTemp1, 0(pSrc1) + lbu iTemp2, 0(pSrc2) + sub iResult, iTemp1, iTemp2 + ret + +END(memcmp) +libc_hidden_builtin_def (memcmp) +weak_alias (memcmp,bcmp) +strong_alias (memcmp, __memcmpeq) +libc_hidden_def (__memcmpeq) + diff --git a/sysdeps/riscv/rvv/memcpy.S b/sysdeps/riscv/rvv/memcpy.S new file mode 100644 index 0000000000..de790fbe51 --- /dev/null +++ b/sysdeps/riscv/rvv/memcpy.S @@ -0,0 +1,51 @@ +/* RVV versions memcpy. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Jerry Shih . + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + +#define pDst a0 +#define pSrc a1 +#define iNum a2 + +#define iVL a3 +#define pDstPtr a4 + +#define ELEM_LMUL_SETTING m8 +#define vData v0 + +ENTRY(memcpy) + + mv pDstPtr, pDst + +L(loop): + vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma + + vle8.v vData, (pSrc) + sub iNum, iNum, iVL + add pSrc, pSrc, iVL + vse8.v vData, (pDstPtr) + add pDstPtr, pDstPtr, iVL + + bnez iNum, L(loop) + + ret + +END(memcpy) +libc_hidden_builtin_def (memcpy) diff --git a/sysdeps/riscv/rvv/memmove.S b/sysdeps/riscv/rvv/memmove.S new file mode 100644 index 0000000000..ed12744064 --- /dev/null +++ b/sysdeps/riscv/rvv/memmove.S @@ -0,0 +1,72 @@ +/* RVV versions memmove. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Jerry Shih . + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + +#define pDst a0 +#define pSrc a1 +#define iNum a2 + +#define iVL a3 +#define pDstPtr a4 +#define pSrcBackwardPtr a5 +#define pDstBackwardPtr a6 + +#define ELEM_LMUL_SETTING m8 +#define vData v0 + +ENTRY(memmove) + + mv pDstPtr, pDst + + /* If pSrc is equal or after pDst, all data in pSrc will be loaded before + overwrited for the overlapping case. We could use faster `forward-copy`. */ + bgeu pSrc, pDst, L(forward_copy_loop) + add pSrcBackwardPtr, pSrc, iNum + add pDstBackwardPtr, pDst, iNum + /* If pDst inside source data range, we need to use `backward_copy_loop` to + handle the overlapping issue. */ + bltu pDst, pSrcBackwardPtr, L(backward_copy_loop) + +L(forward_copy_loop): + vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma + + vle8.v vData, (pSrc) + sub iNum, iNum, iVL + add pSrc, pSrc, iVL + vse8.v vData, (pDstPtr) + add pDstPtr, pDstPtr, iVL + + bnez iNum, L(forward_copy_loop) + ret + +L(backward_copy_loop): + vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma + + sub pSrcBackwardPtr, pSrcBackwardPtr, iVL + vle8.v vData, (pSrcBackwardPtr) + sub iNum, iNum, iVL + sub pDstBackwardPtr, pDstBackwardPtr, iVL + vse8.v vData, (pDstBackwardPtr) + bnez iNum, L(backward_copy_loop) + ret + +END(memmove) +libc_hidden_builtin_def (memmove) diff --git a/sysdeps/riscv/rvv/memset.S b/sysdeps/riscv/rvv/memset.S new file mode 100644 index 0000000000..3a6c3d0afd --- /dev/null +++ b/sysdeps/riscv/rvv/memset.S @@ -0,0 +1,51 @@ +/* RVV versions memset. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Jerry Shih . + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + +#define pDst a0 +#define iValue a1 +#define iNum a2 + +#define iVL a3 +#define iTemp a4 +#define pDstPtr a5 + +#define ELEM_LMUL_SETTING m8 +#define vData v0 + +ENTRY(memset) + + mv pDstPtr, pDst + + vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma + vmv.v.x vData, iValue + +L(loop): + vse8.v vData, (pDstPtr) + sub iNum, iNum, iVL + add pDstPtr, pDstPtr, iVL + vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma + bnez iNum, L(loop) + + ret + +END(memset) +libc_hidden_builtin_def (memset) From patchwork Wed Mar 1 15:32:46 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Vincent Chen X-Patchwork-Id: 65846 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id C1E90385B525 for ; Wed, 1 Mar 2023 15:34:23 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org C1E90385B525 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1677684863; bh=sij9GeLc7QY4/1s8qdaw/cbUOOs/XMrrYqWAcFRekIU=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=BKPOLwbLE8pBzvnoGUu/Pbxsuhtpy3+XIA28qLlmtmf44UuDxl5i3TQ/ZsRvkl5bF jxyy3rwLtfEl7Yj0ayNFej5MNkEohcsBLQ2AkJelwzyDl7nHlsKs/byrp+YhY7tt4E BnhJ5yHFVlPOsrsy06G7Oy8esFr09Rimg9jxI17I= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-pj1-x1034.google.com (mail-pj1-x1034.google.com [IPv6:2607:f8b0:4864:20::1034]) by sourceware.org (Postfix) with ESMTPS id 61F8B3858430 for ; Wed, 1 Mar 2023 15:33:16 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 61F8B3858430 Received: by mail-pj1-x1034.google.com with SMTP id l1so13741875pjt.2 for ; Wed, 01 Mar 2023 07:33:16 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1677684795; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=sij9GeLc7QY4/1s8qdaw/cbUOOs/XMrrYqWAcFRekIU=; b=O2OgIw7rHQePW6nHZ69WmRyHq1DhWLpls14WTnfX8AlOIQxN+/iMZbqebxfddX/mEI fGzpIYz4Gz+lH+eELzgFO0GW8tiMA4t8Pm3cvZIdyv2MlIt3W0UIDWnkKsma7dYtnvXo jK72j9GZKIAmQ34vGdjM++0MKIol/38ufzMtQIHpvCxgXqccoW2qIF9llze+3eKS9Mv0 jmjDxAnDbo7Zz7kQNZeo8cJHLa+uDWjAhFG7KTHlV4tBSEXqNOs2BFmzsiKzPqfyt4Kx 5s5hZYynh7x57R0er8ljsFk/ykua0WC818utzODLqfRo69XkmFwaQwGQCNnyKY/Cs6F/ l2MQ== X-Gm-Message-State: AO0yUKU8/99SEb9LoZ4r+Oa3CkYQLdvBL2Yok55j3qjtPol8D6WaeQBC c2mvXVdvtyM1zDOIRg4YJnWfWlOtxxq/0QRL9wWieGIaAxBwQ7Pima1SWa1NriqFXXSu1MjKIaD 6y6/bB/84udOY9Qzyw3RAeEu/tX+aEuLZ7vsuOrJ22pcpRHA7J6TeZhwux6wINv+ZPFSbM6KSbe aKnauE4Q== X-Google-Smtp-Source: AK7set/1OA+nafsQR6qq+7y1MoO6LV4VYWS21Ed6hpjfXtolhTn35xz2Wu61/O6rIeKSr1PIFw5DIw== X-Received: by 2002:a17:90b:1e4f:b0:237:161d:f5ac with SMTP id pi15-20020a17090b1e4f00b00237161df5acmr8169712pjb.36.1677684792355; Wed, 01 Mar 2023 07:33:12 -0800 (PST) Received: from localhost.localdomain (111-251-213-204.dynamic-ip.hinet.net. [111.251.213.204]) by smtp.gmail.com with ESMTPSA id a6-20020a17090a740600b002345ef591dasm8058025pjg.31.2023.03.01.07.33.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 01 Mar 2023 07:33:12 -0800 (PST) To: libc-alpha@sourceware.org, palmer@dabbelt.com, darius@bluespec.com, andrew@sifive.com, dj@redhat.com Cc: jerry.shih@sifive.com, nick.knight@sifive.com, hongrong.hsu@sifive.com, hau.hsu@sifive.com, kito.cheng@sifive.com, vincent.chen@sifive.com, greentime.hu@sifive.com Subject: [PATCH 3/4] riscv: vectorized str* functions Date: Wed, 1 Mar 2023 23:32:46 +0800 Message-Id: <20230301153247.1499566-4-vincent.chen@sifive.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230301153247.1499566-1-vincent.chen@sifive.com> References: <20230301153247.1499566-1-vincent.chen@sifive.com> MIME-Version: 1.0 X-Spam-Status: No, score=-11.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Vincent Chen via Libc-alpha From: Vincent Chen Reply-To: Vincent Chen Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org Sender: "Libc-alpha" From: Jerry Shih This patch proposes implementations of strcat, strcmp, strcpy, strlen, strncat, strncmp and strncpy that leverage the RISC-V V extension (RVV), version 1.0. These routines assumes VLEN is at least 32 bits, as is required by all currently defined vector extensions, and they support arbitrarily large VLEN. All implementations work for both RV32 and RV64 platforms, and make no assumptions about page size. --- sysdeps/riscv/rvv/strcat.S | 72 +++++++++++++++++++ sysdeps/riscv/rvv/strcmp.S | 135 ++++++++++++++++++++++++++++++++++++ sysdeps/riscv/rvv/strcpy.S | 56 +++++++++++++++ sysdeps/riscv/rvv/strlen.S | 54 +++++++++++++++ sysdeps/riscv/rvv/strncat.S | 83 ++++++++++++++++++++++ sysdeps/riscv/rvv/strncmp.S | 85 +++++++++++++++++++++++ sysdeps/riscv/rvv/strncpy.S | 86 +++++++++++++++++++++++ 7 files changed, 571 insertions(+) create mode 100644 sysdeps/riscv/rvv/strcat.S create mode 100644 sysdeps/riscv/rvv/strcmp.S create mode 100644 sysdeps/riscv/rvv/strcpy.S create mode 100644 sysdeps/riscv/rvv/strlen.S create mode 100644 sysdeps/riscv/rvv/strncat.S create mode 100644 sysdeps/riscv/rvv/strncmp.S create mode 100644 sysdeps/riscv/rvv/strncpy.S diff --git a/sysdeps/riscv/rvv/strcat.S b/sysdeps/riscv/rvv/strcat.S new file mode 100644 index 0000000000..8a7779fd3c --- /dev/null +++ b/sysdeps/riscv/rvv/strcat.S @@ -0,0 +1,72 @@ +/* RVV versions strcat. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Jerry Shih . + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + +#define pDst a0 +#define pSrc a1 +#define pDstPtr a2 + +#define iVL a3 +#define iCurrentVL a4 +#define iActiveElemPos a5 + +#define ELEM_LMUL_SETTING m1 +#define vMask1 v0 +#define vMask2 v1 +#define vStr1 v8 +#define vStr2 v16 + +ENTRY(strcat) + + mv pDstPtr, pDst + + /* Perform `strlen(dst)`. */ +L(strlen_loop): + vsetvli iVL, zero, e8, ELEM_LMUL_SETTING, ta, ma + + vle8ff.v vStr1, (pDstPtr) + vmseq.vx vMask1, vStr1, zero + csrr iCurrentVL, vl + vfirst.m iActiveElemPos, vMask1 + add pDstPtr, pDstPtr, iCurrentVL + bltz iActiveElemPos, L(strlen_loop) + + sub pDstPtr, pDstPtr, iCurrentVL + add pDstPtr, pDstPtr, iActiveElemPos + + /* Perform `strcpy(dst, src)`. */ +L(strcpy_loop): + vsetvli iVL, zero, e8, ELEM_LMUL_SETTING, ta, ma + + vle8ff.v vStr1, (pSrc) + vmseq.vx vMask2, vStr1, zero + csrr iCurrentVL, vl + vfirst.m iActiveElemPos, vMask2 + vmsif.m vMask1, vMask2 + add pSrc, pSrc, iCurrentVL + vse8.v vStr1, (pDstPtr), vMask1.t + add pDstPtr, pDstPtr, iCurrentVL + bltz iActiveElemPos, L(strcpy_loop) + + ret + +END(strcat) +libc_hidden_builtin_def (strcat) diff --git a/sysdeps/riscv/rvv/strcmp.S b/sysdeps/riscv/rvv/strcmp.S new file mode 100644 index 0000000000..ff5bb252f3 --- /dev/null +++ b/sysdeps/riscv/rvv/strcmp.S @@ -0,0 +1,135 @@ +/* RVV versions strcmp. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Jerry Shih . + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + +#define iResult a0 + +#define pStr1 a0 +#define pStr2 a1 + +#define iVL a2 +#define iTemp1 a3 +#define iTemp2 a4 +#define iLMUL1 a5 +#define iLMUL2 a6 +#define iLMUL4 a7 + +#define iLMUL t0 + +#define vStr1 v0 +#define vStr2 v8 +#define vMask1 v16 +#define vMask2 v17 + +ENTRY(strcmp) + + /* Increase the lmul using the following sequences: + 1/2, 1/2, 1, 2, 4, 4, 4, ... . */ + + /* lmul=1/2. */ + vsetvli iVL, zero, e8, mf2, ta, ma + + vle8ff.v vStr1, (pStr1) + /* check if vStr1[i] == 0. */ + vmseq.vx vMask1, vStr1, zero + + vle8ff.v vStr2, (pStr2) + /* check if vStr1[i] != vStr2[i]. */ + vmsne.vv vMask2, vStr1, vStr2 + + /* find the index x for vStr1[x]==0. */ + vfirst.m iTemp1, vMask1 + /* find the index x for vStr1[x]!=vStr2[x]. */ + vfirst.m iTemp2, vMask2 + + bgez iTemp1, L(check1) + bgez iTemp2, L(check2) + + /* get the current vl updated by vle8ff. */ + csrr iVL, vl + add pStr1, pStr1, iVL + add pStr2, pStr2, iVL + + vsetvli iVL, zero, e8, mf2, ta, ma + addi iLMUL1, zero, 1 + addi iLMUL, zero, 1 + j L(loop) +L(m1): + vsetvli iVL, zero, e8, m1, ta, ma + addi iLMUL2, zero, 2 + addi iLMUL, zero, 2 + j L(loop) +L(m2): + vsetvli iVL, zero, e8, m2, ta, ma + addi iLMUL4, zero, 4 + addi iLMUL, zero, 4 + j L(loop) +L(m4): + vsetvli iVL, zero, e8, m4, ta, ma + +L(loop): + vle8ff.v vStr1, (pStr1) + vmseq.vx vMask1, vStr1, zero + + vle8ff.v vStr2, (pStr2) + vmsne.vv vMask2, vStr1, vStr2 + + vfirst.m iTemp1, vMask1 + vfirst.m iTemp2, vMask2 + + bgez iTemp1, L(check1) + bgez iTemp2, L(check2) + + csrr iVL, vl + add pStr1, pStr1, iVL + add pStr2, pStr2, iVL + + beq iLMUL, iLMUL1, L(m1) + beq iLMUL, iLMUL2, L(m2) + beq iLMUL, iLMUL4, L(m4) + j L(loop) + + /* iTemp1>=0. */ +L(check1): + bltz iTemp2, 1f + blt iTemp2, iTemp1, L(check2) +1: + /* iTemp2<0 + iTemp2>=0 && iTemp1=0. */ +L(check2): + add pStr1, pStr1, iTemp2 + add pStr2, pStr2, iTemp2 + lbu iTemp1, 0(pStr1) + lbu iTemp2, 0(pStr2) + sub iResult, iTemp1, iTemp2 + ret + +END(strcmp) +libc_hidden_builtin_def (strcmp) diff --git a/sysdeps/riscv/rvv/strcpy.S b/sysdeps/riscv/rvv/strcpy.S new file mode 100644 index 0000000000..8fb754ee23 --- /dev/null +++ b/sysdeps/riscv/rvv/strcpy.S @@ -0,0 +1,56 @@ +/* RVV versions strcpy. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Jerry Shih . + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + +#define pDst a0 +#define pSrc a1 +#define pDstPtr a2 + +#define iVL a3 +#define iCurrentVL a4 +#define iActiveElemPos a5 + +#define ELEM_LMUL_SETTING m1 +#define vMask1 v0 +#define vMask2 v1 +#define vStr1 v8 +#define vStr2 v16 + +ENTRY(strcpy) + + mv pDstPtr, pDst + +L(strcpy_loop): + vsetvli iVL, zero, e8, ELEM_LMUL_SETTING, ta, ma + vle8ff.v vStr1, (pSrc) + vmseq.vx vMask2, vStr1, zero + csrr iCurrentVL, vl + vfirst.m iActiveElemPos, vMask2 + vmsif.m vMask1, vMask2 + add pSrc, pSrc, iCurrentVL + vse8.v vStr1, (pDstPtr), vMask1.t + add pDstPtr, pDstPtr, iCurrentVL + bltz iActiveElemPos, L(strcpy_loop) + + ret + +END(strcpy) +libc_hidden_builtin_def (strcpy) diff --git a/sysdeps/riscv/rvv/strlen.S b/sysdeps/riscv/rvv/strlen.S new file mode 100644 index 0000000000..eb456b094b --- /dev/null +++ b/sysdeps/riscv/rvv/strlen.S @@ -0,0 +1,54 @@ +/* RVV versions strlen. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Jerry Shih . + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + +#define iResult a0 +#define pStr a0 +#define pCopyStr a1 +#define iVL a2 +#define iCurrentVL a2 +#define iEndOffset a3 + +#define ELEM_LMUL_SETTING m2 +#define vStr v0 +#define vMaskEnd v2 + +ENTRY(strlen) + + mv pCopyStr, pStr +L(loop): + vsetvli iVL, zero, e8, ELEM_LMUL_SETTING, ta, ma + vle8ff.v vStr, (pCopyStr) + csrr iCurrentVL, vl + vmseq.vi vMaskEnd, vStr, 0 + vfirst.m iEndOffset, vMaskEnd + add pCopyStr, pCopyStr, iCurrentVL + bltz iEndOffset, L(loop) + + add pStr, pStr, iCurrentVL + add pCopyStr, pCopyStr, iEndOffset + sub iResult, pCopyStr, iResult + + ret + +END(strlen) + +libc_hidden_builtin_def (strlen) diff --git a/sysdeps/riscv/rvv/strncat.S b/sysdeps/riscv/rvv/strncat.S new file mode 100644 index 0000000000..7847c4f008 --- /dev/null +++ b/sysdeps/riscv/rvv/strncat.S @@ -0,0 +1,83 @@ +/* RVV versions strncat. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Jerry Shih . + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + +#define pDst a0 +#define pSrc a1 +#define iLength a2 +#define pDstPtr a3 + +#define iVL a4 +#define iCurrentVL a5 +#define iActiveElemPos a6 + +#define ELEM_LMUL_SETTING m1 +#define vMask1 v0 +#define vMask2 v1 +#define vStr1 v8 +#define vStr2 v16 + +ENTRY(strncat) + + mv pDstPtr, pDst + + /* the strlen of dst. */ +L(strlen_loop): + vsetvli iVL, zero, e8, ELEM_LMUL_SETTING, ta, ma + + vle8ff.v vStr1, (pDstPtr) + /* find the '\0'. */ + vmseq.vx vMask1, vStr1, zero + csrr iCurrentVL, vl + vfirst.m iActiveElemPos, vMask1 + add pDstPtr, pDstPtr, iCurrentVL + bltz iActiveElemPos, L(strlen_loop) + + sub pDstPtr, pDstPtr, iCurrentVL + add pDstPtr, pDstPtr, iActiveElemPos + + /* copy pSrc to pDstPtr. */ +L(strcpy_loop): + vsetvli zero, iLength, e8, ELEM_LMUL_SETTING, ta, ma + + vle8ff.v vStr1, (pSrc) + vmseq.vx vMask2, vStr1, zero + csrr iCurrentVL, vl + vfirst.m iActiveElemPos, vMask2 + vmsif.m vMask1, vMask2 + add pSrc, pSrc, iCurrentVL + sub iLength, iLength, iCurrentVL + vse8.v vStr1, (pDstPtr), vMask1.t + add pDstPtr, pDstPtr, iCurrentVL + beqz iLength, L(fill_zero) + bltz iActiveElemPos, L(strcpy_loop) + + ret + +L(fill_zero): + bgez iActiveElemPos, L(fill_zero_end) + sb zero, (pDstPtr) + +L(fill_zero_end): + ret + +END(strncat) +libc_hidden_builtin_def (strncat) diff --git a/sysdeps/riscv/rvv/strncmp.S b/sysdeps/riscv/rvv/strncmp.S new file mode 100644 index 0000000000..168dbb07ce --- /dev/null +++ b/sysdeps/riscv/rvv/strncmp.S @@ -0,0 +1,85 @@ +/* RVV versions strncmp. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Jerry Shih . + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + +#define iResult a0 + +#define pStr1 a0 +#define pStr2 a1 +#define iLength a2 + +#define iVL a3 +#define iTemp1 a4 +#define iTemp2 a5 + +#define ELEM_LMUL_SETTING m1 +#define vStr1 v0 +#define vStr2 v4 +#define vMask1 v8 +#define vMask2 v9 + +ENTRY(strncmp) + + beqz iLength, L(zero_length) + +L(loop): + vsetvli zero, iLength, e8, ELEM_LMUL_SETTING, ta, ma + + vle8ff.v vStr1, (pStr1) + /* vStr1[i] == 0. */ + vmseq.vx vMask1, vStr1, zero + + vle8ff.v vStr2, (pStr2) + /* vStr1[i] != vStr2[i]. */ + vmsne.vv vMask2, vStr1, vStr2 + + csrr iVL, vl + + /* r = mask1 | mask2 + We could use vfirst.m to get the first zero char or the + first different char between str1 and str2. */ + vmor.mm vMask1, vMask1, vMask2 + + sub iLength, iLength, iVL + + vfirst.m iTemp1, vMask1 + + bgez iTemp1, L(end_loop) + + add pStr1, pStr1, iVL + add pStr2, pStr2, iVL + bnez iLength, L(loop) +L(end_loop): + + add pStr1, pStr1, iTemp1 + add pStr2, pStr2, iTemp1 + lbu iTemp1, 0(pStr1) + lbu iTemp2, 0(pStr2) + + sub iResult, iTemp1, iTemp2 + ret + +L(zero_length): + li iResult, 0 + ret + +END(strncmp) +libc_hidden_builtin_def (strncmp) diff --git a/sysdeps/riscv/rvv/strncpy.S b/sysdeps/riscv/rvv/strncpy.S new file mode 100644 index 0000000000..e8d9450448 --- /dev/null +++ b/sysdeps/riscv/rvv/strncpy.S @@ -0,0 +1,86 @@ +/* RVV versions strncpy. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Jerry Shih . + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + +#define pDst a0 +#define pSrc a1 +#define iLength a2 +#define pDstPtr a3 + +#define iVL a4 +#define iCurrentVL a5 +#define iActiveElemPos a6 +#define iTemp a7 + +#define ELEM_LMUL_SETTING m1 +#define vMask1 v0 +#define vMask2 v1 +#define ZERO_FILL_ELEM_LMUL_SETTING m8 +#define vStr1 v8 +#define vStr2 v16 + +ENTRY(strncpy) + + mv pDstPtr, pDst + + /* Copy pSrc to pDstPtr. */ +L(strcpy_loop): + vsetvli zero, iLength, e8, ELEM_LMUL_SETTING, ta, ma + vle8ff.v vStr1, (pSrc) + vmseq.vx vMask2, vStr1, zero + csrr iCurrentVL, vl + vfirst.m iActiveElemPos, vMask2 + vmsif.m vMask1, vMask2 + add pSrc, pSrc, iCurrentVL + sub iLength, iLength, iCurrentVL + vse8.v vStr1, (pDstPtr), vMask1.t + add pDstPtr, pDstPtr, iCurrentVL + bgez iActiveElemPos, L(fill_zero) + bnez iLength, L(strcpy_loop) + ret + + /* Fill the tail zero. */ +L(fill_zero): + /* We already copy the `\0` to dst. But we use `vfirst.m` to + get the `index` of `\0` position. We need to adjust `-1` + to get the correct remaining iLength for zero filling. */ + sub iTemp, iCurrentVL, iActiveElemPos + addi iTemp, iTemp, -1 + add iLength, iLength, iTemp + /* Have an earily return for `strlen(src) + 1 == count` case. */ + bnez iLength, 1f + ret +1: + sub pDstPtr, pDstPtr, iTemp + vsetvli zero, iLength, e8, ZERO_FILL_ELEM_LMUL_SETTING, ta, ma + vmv.v.x vStr2, zero + +L(fill_zero_loop): + vsetvli iVL, iLength, e8, ZERO_FILL_ELEM_LMUL_SETTING, ta, ma + vse8.v vStr2, (pDstPtr) + sub iLength, iLength, iVL + add pDstPtr, pDstPtr, iVL + bnez iLength, L(fill_zero_loop) + + ret + +END(strncpy) +libc_hidden_builtin_def (strncpy) From patchwork Wed Mar 1 15:32:47 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Vincent Chen X-Patchwork-Id: 65847 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id B70E7385B514 for ; Wed, 1 Mar 2023 15:35:07 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org B70E7385B514 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1677684907; bh=tAuRdMkWjWLLMsdukHidxDyLHM6ALUTbHiqDGqJUvEU=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=NDs8LKs8baJKRfSwefWfxAB4EdldEvX6RV9EH9YDc4Pz8GZNMwQ1nCqzVZ/65vLge OdUbF4vKKUECvhGrJ3XvgF5N5ILJAtPkl8BjfJ/M+S2rbxcwyni1gABmLzECs18/xF 7AfJ7YfGI3hz+nH3utpjtxOI/IqReP/mPt5Rn6t0= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-pl1-x62a.google.com (mail-pl1-x62a.google.com [IPv6:2607:f8b0:4864:20::62a]) by sourceware.org (Postfix) with ESMTPS id 73DD33858002 for ; Wed, 1 Mar 2023 15:33:19 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 73DD33858002 Received: by mail-pl1-x62a.google.com with SMTP id ky4so14407569plb.3 for ; Wed, 01 Mar 2023 07:33:19 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1677684798; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=tAuRdMkWjWLLMsdukHidxDyLHM6ALUTbHiqDGqJUvEU=; b=aoR5Z9QIiJHJqEkD4259FpOBuGleK91ITTLbAA/VXyTJFBwSoBRtdumEqvenqBTwWR F1hrM3G7cK2hSfjCT/tNmbQVADMYWli1LAM1N3mu7zsY3/oa/kczhwM6Ne3xtyUhEEfW he/hsf4PEqU1gGpqbP/eTI2rPee16cAh113SSaNCp8ihjtlzFvpHIfZJktHgSJnZ5M7E s0bFd5JN6nG1RZDrGScW9+cf6GhcJfhAbh8UU6zZeS/UmhKzVSclRrsApJ6v7sHjJsWJ xS+icFlppxm7V4VBsnRhUcZUZx0YLpScki8Riy9TpWb4OA/hKnpBdqXLgkQYTzkbI5CM UESg== X-Gm-Message-State: AO0yUKU9ZebfGNHN1JqIw5ildKxnkhRfJwyQjPUTJcGWHBhdsXEudLL1 38bx1Rx0cbJutEfexqcHN6K4T3s0AiTCy8VfYuBtc+30T0hcSRWQw4Uz+jovfWx6OKh/AY3fVmU iKrv8uwsbqnRsG0+IZymwOQ44P4A9e1+kj+3RbzcaG5zlJQ2dCmZkHMwmYJzYqtNpf6qa66DowN 2PWWFJ9Q== X-Google-Smtp-Source: AK7set9VW6QouE3Qkz4yuFG931nOEhRLSBqy+YEKq2+9LGMGcU6F9Tq77EFvGsKfaWhhiABndSY7UQ== X-Received: by 2002:a17:90a:6acf:b0:237:1b6a:dbce with SMTP id b15-20020a17090a6acf00b002371b6adbcemr7965411pjm.2.1677684798056; Wed, 01 Mar 2023 07:33:18 -0800 (PST) Received: from localhost.localdomain (111-251-213-204.dynamic-ip.hinet.net. [111.251.213.204]) by smtp.gmail.com with ESMTPSA id a6-20020a17090a740600b002345ef591dasm8058025pjg.31.2023.03.01.07.33.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 01 Mar 2023 07:33:17 -0800 (PST) To: libc-alpha@sourceware.org, palmer@dabbelt.com, darius@bluespec.com, andrew@sifive.com, dj@redhat.com Cc: jerry.shih@sifive.com, nick.knight@sifive.com, hongrong.hsu@sifive.com, hau.hsu@sifive.com, kito.cheng@sifive.com, vincent.chen@sifive.com, greentime.hu@sifive.com Subject: [PATCH 4/4] riscv: vectorized strchr and strnlen functions Date: Wed, 1 Mar 2023 23:32:47 +0800 Message-Id: <20230301153247.1499566-5-vincent.chen@sifive.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230301153247.1499566-1-vincent.chen@sifive.com> References: <20230301153247.1499566-1-vincent.chen@sifive.com> MIME-Version: 1.0 X-Spam-Status: No, score=-11.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Vincent Chen via Libc-alpha From: Vincent Chen Reply-To: Vincent Chen Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org Sender: "Libc-alpha" From: Nick Knight This patch proposes implementations of strcat, strcmp, strcpy, strlen, strncat, strncmp and strncpy that leverage the RISC-V V extension (RVV), version 1.0. These routines assumes VLEN is at least 32 bits, as is required by all currently defined vector extensions, and they support arbitrarily large VLEN. All implementations work for both RV32 and RV64 platforms, and make no assumptions about page size. --- sysdeps/riscv/rvv/strchr.S | 53 +++++++++++++++++++++++++++++++++++ sysdeps/riscv/rvv/strnlen.S | 56 +++++++++++++++++++++++++++++++++++++ 2 files changed, 109 insertions(+) create mode 100644 sysdeps/riscv/rvv/strchr.S create mode 100644 sysdeps/riscv/rvv/strnlen.S diff --git a/sysdeps/riscv/rvv/strchr.S b/sysdeps/riscv/rvv/strchr.S new file mode 100644 index 0000000000..4a660200c3 --- /dev/null +++ b/sysdeps/riscv/rvv/strchr.S @@ -0,0 +1,53 @@ +/* RISC-V multiarch strchr, V-extension version. + Copyright (C) 2022 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Nick Knight . + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + + + + +ENTRY(strchr) +0: + vsetvli t0, zero, e8, m8, ta, ma + vle8ff.v v0, (a0) + vmseq.vi v8, v0, 0 + vmseq.vx v9, v0, a1 + vfirst.m a2, v8 /* first occurrence of \0 */ + vfirst.m a3, v9 /* first occurrence of ch */ + addi a4, a3, 1 + seqz a4, a4 + sltu a5, a2, a3 + or a4, a4, a5 + beqz a4, 1f /* Found ch, not preceded by \0? */ + li a6, -1 + csrr a5, vl + add a0, a0, a5 + beq a2, a6, 0b /* Didn't find \0? */ + li a0, 0 + ret +1: + add a0, a0, a3 + ret + +END(strchr) +weak_alias (strchr, index) +libc_hidden_builtin_def (strchr) + + diff --git a/sysdeps/riscv/rvv/strnlen.S b/sysdeps/riscv/rvv/strnlen.S new file mode 100644 index 0000000000..c1ce12baa5 --- /dev/null +++ b/sysdeps/riscv/rvv/strnlen.S @@ -0,0 +1,56 @@ +/* RVV versions strnlen. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by: Nick Knight + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + +#define pStr a0 +#define pCopyStr a2 +#define iRetValue a0 +#define iMaxlen a1 +#define iCurrentVL a3 +#define iEndOffset a4 + +#define ELEM_LMUL_SETTING m1 +#define vStr v0 +#define vMaskEnd v8 + +ENTRY(__strnlen) + + mv pCopyStr, pStr + mv iRetValue, iMaxlen +L(strnlen_loop): + beqz iMaxlen, L(end_strnlen_loop) + vsetvli zero, iMaxlen, e8, ELEM_LMUL_SETTING, ta, ma + vle8ff.v vStr, (pCopyStr) + vmseq.vi vMaskEnd, vStr, 0 + vfirst.m iEndOffset, vMaskEnd /* first occurence of \0 */ + csrr iCurrentVL, vl + add pCopyStr, pCopyStr, iCurrentVL + sub iMaxlen, iMaxlen, iCurrentVL + bltz iEndOffset, L(strnlen_loop) + add iMaxlen, iMaxlen, iCurrentVL + sub iRetValue, iRetValue, iMaxlen + add iRetValue, iRetValue, iEndOffset +L(end_strnlen_loop): + ret +END(__strnlen) +weak_alias (__strnlen, strnlen) +libc_hidden_builtin_def (strnlen) +libc_hidden_builtin_def (__strnlen)