From patchwork Thu May 4 07:48:47 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hau Hsu X-Patchwork-Id: 68733 X-Patchwork-Delegate: palmer@dabbelt.com Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 41C5C385B501 for ; Thu, 4 May 2023 07:49:44 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 41C5C385B501 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1683186584; bh=ul/uiOcXkLDavZkvtIfnUwBRi3/02lXQ4agKc6C1hzA=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=LEfHZfVZwLRzI4l+UzNM6MlpVwOYPoh1FBQOTLvNhHChjck0Tu+hQwtonvYe/XSPw vEsD6vTrSiXjppp3lSzPrRI3dwUqcLbRKq3zIMqtw6xAaL3H/YClqRKbk0nWIUdnuR uAjHoEJO+ABPgLquhWTmb1S7QfyXG3ERTQPiPdPs= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-pf1-x432.google.com (mail-pf1-x432.google.com [IPv6:2607:f8b0:4864:20::432]) by sourceware.org (Postfix) with ESMTPS id DFC3D3858C33 for ; Thu, 4 May 2023 07:49:19 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org DFC3D3858C33 Received: by mail-pf1-x432.google.com with SMTP id d2e1a72fcca58-6434e40394eso163092b3a.1 for ; Thu, 04 May 2023 00:49:19 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1683186559; x=1685778559; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=ul/uiOcXkLDavZkvtIfnUwBRi3/02lXQ4agKc6C1hzA=; b=bR2i3xFyj1VMtX79yvfqNEjNY1FjtpdiLdfTPSVqSgaZjC/NpTHwpbdFwlDkAG4k3Y fu0r5QNAHmi9bZdet7Z2m1mn2GMVhOgLQDe2vtlkelLSIuEtWYTgIUc9T9fF4HE/m8UP 
LNBbb2Of3Zudqln7d8BUo8gIVX8kW6GtFUHhDBm8cDT730/a/RumoYAyzQ8IE3kqg7Da Qlcci7bqC3NPt2MtzHlBqjjKdyL943jMnwQFjXAVJLEwknegEYsMDAYAP8oP1jDYPHC1 3CxFDkIfQKT1JLZkskaOo2ruOLCxZ8uXQTyq1k4tFO3pXo66aVdEhYjxnYgjoyGcwvOX KXGg== X-Gm-Message-State: AC+VfDydNgzml6wIbrlbH2LrlpV+68fLtVC8k0Unfp8YsYDeqXDW44FP t4zR8xaXz+hge+kbu9wEVfSGMtXyE1zH2bPYO7a45kZMxHi70wwdZW6nHgHalpGJ9mrkoCqZhw4 hEyThGMkBhIj9phyFnzHwyhcYB7mprUqleZ/UkdGSDfQyiHOd8VYcQemsvQKIDb6dkONOUYa0jg 6DtA== X-Google-Smtp-Source: ACHHUZ63XG8NqwVvZ+hopcapZEfUsgtJW/YUOo23Pz4ARDwLqLLB3Faj/IwTSYUzjyhsBPckQWhVPw== X-Received: by 2002:a17:902:dac5:b0:1a9:54c4:760f with SMTP id q5-20020a170902dac500b001a954c4760fmr3542809plx.54.1683186558687; Thu, 04 May 2023 00:49:18 -0700 (PDT) Received: from localhost.localdomain (36-238-22-214.dynamic-ip.hinet.net. [36.238.22.214]) by smtp.gmail.com with ESMTPSA id y18-20020a17090322d200b001ab06958770sm4875294plg.161.2023.05.04.00.49.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 May 2023 00:49:18 -0700 (PDT) To: libc-alpha@sourceware.org Cc: hau.hsu@sifive.com, kito.cheng@sifive.com, nick.knight@sifive.com, jerry.shih@sifive.com, vincent.chen@sifive.com, hongrong.hsu@sifive.com Subject: [PATCH v3 1/5] riscv: Enabling vectorized mem*/str* functions in build time Date: Thu, 4 May 2023 15:48:47 +0800 Message-Id: <20230504074851.38763-2-hau.hsu@sifive.com> X-Mailer: git-send-email 2.40.0 In-Reply-To: <20230504074851.38763-1-hau.hsu@sifive.com> References: <20230504074851.38763-1-hau.hsu@sifive.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_BARRACUDACENTRAL, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing 
list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Hau Hsu via Libc-alpha From: Hau Hsu Reply-To: Hau Hsu Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org Sender: "Libc-alpha" From: Vincent Chen Let the build select the vectorized mem*/str* functions when it detects that the compiler supports the RISC-V V extension and has it enabled for this build. We agree that these vectorized mem*/str* functions should be selected by IFUNC. Therefore, this patch is intended as a **temporary solution** to enable reviewers to evaluate the effectiveness of these vectorized mem*/str* functions. --- scripts/build-many-glibcs.py | 10 ++++++++++ sysdeps/riscv/preconfigure | 19 +++++++++++++++++++ sysdeps/riscv/preconfigure.ac | 18 ++++++++++++++++++ sysdeps/riscv/rv32/rvv/Implies | 2 ++ sysdeps/riscv/rv64/rvv/Implies | 2 ++ 5 files changed, 51 insertions(+) create mode 100644 sysdeps/riscv/rv32/rvv/Implies create mode 100644 sysdeps/riscv/rv64/rvv/Implies diff --git a/scripts/build-many-glibcs.py b/scripts/build-many-glibcs.py index 95726c4a29..98688d6665 100755 --- a/scripts/build-many-glibcs.py +++ b/scripts/build-many-glibcs.py @@ -381,6 +381,11 @@ class Context(object): variant='rv32imafdc-ilp32d', gcc_cfg=['--with-arch=rv32imafdc', '--with-abi=ilp32d', '--disable-multilib']) + self.add_config(arch='riscv32', + os_name='linux-gnu', + variant='rv32imafdcv-ilp32d', + gcc_cfg=['--with-arch=rv32imafdcv', '--with-abi=ilp32d', + '--disable-multilib']) self.add_config(arch='riscv64', os_name='linux-gnu', variant='rv64imac-lp64', @@ -396,6 +401,11 @@ class Context(object): variant='rv64imafdc-lp64d', gcc_cfg=['--with-arch=rv64imafdc', '--with-abi=lp64d', '--disable-multilib']) + self.add_config(arch='riscv64', + os_name='linux-gnu', + variant='rv64imafdcv-lp64d', + gcc_cfg=['--with-arch=rv64imafdcv', '--with-abi=lp64d', + '--disable-multilib']) self.add_config(arch='s390x', os_name='linux-gnu', glibcs=[{}, diff --git 
a/sysdeps/riscv/preconfigure b/sysdeps/riscv/preconfigure index 4dedf4b0bb..5ddc195b46 100644 --- a/sysdeps/riscv/preconfigure +++ b/sysdeps/riscv/preconfigure @@ -7,6 +7,7 @@ riscv*) flen=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | sed -n 's/^#define __riscv_flen \(.*\)/\1/p'` float_abi=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | sed -n 's/^#define __riscv_float_abi_\([^ ]*\) .*/\1/p'` atomic=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | grep '#define __riscv_atomic' | cut -d' ' -f2` + vector=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | grep '#define __riscv_vector' | cut -d' ' -f2` case "$xlen" in 64 | 32) @@ -32,6 +33,24 @@ riscv*) ;; esac + case "$vector" in + __riscv_vector) + case "$flen" in + 64) + float_machine=rvv + ;; + *) + # V 1.0 spec requires both F and D extensions, but this may be an older version. Degrade to scalar only. + ;; + esac + ;; + *) + ;; + esac + + { $as_echo "$as_me:${as_lineno-$LINENO}: vector $vector flen $flen float_machine $float_machine" >&5 +$as_echo "$as_me: vector $vector flen $flen float_machine $float_machine" >&6;} + case "$float_abi" in soft) abi_flen=0 diff --git a/sysdeps/riscv/preconfigure.ac b/sysdeps/riscv/preconfigure.ac index a5c30e0dbf..b6b8bb46e4 100644 --- a/sysdeps/riscv/preconfigure.ac +++ b/sysdeps/riscv/preconfigure.ac @@ -7,6 +7,7 @@ riscv*) flen=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | sed -n 's/^#define __riscv_flen \(.*\)/\1/p'` float_abi=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | sed -n 's/^#define __riscv_float_abi_\([^ ]*\) .*/\1/p'` atomic=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | grep '#define __riscv_atomic' | cut -d' ' -f2` + vector=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | grep '#define __riscv_vector' | cut -d' ' -f2` case "$xlen" in 64 | 32) @@ -32,6 +33,23 @@ riscv*) ;; esac + case "$vector" in + __riscv_vector) + case "$flen" in + 64) + float_machine=rvv + ;; + *) + # V 1.0 spec requires both F and D extensions, but this may be an older version. Degrade to scalar only. 
+ ;; + esac + ;; + *) + ;; + esac + + AC_MSG_NOTICE([vector $vector flen $flen float_machine $float_machine]) + case "$float_abi" in soft) abi_flen=0 diff --git a/sysdeps/riscv/rv32/rvv/Implies b/sysdeps/riscv/rv32/rvv/Implies new file mode 100644 index 0000000000..25ce1df222 --- /dev/null +++ b/sysdeps/riscv/rv32/rvv/Implies @@ -0,0 +1,2 @@ +riscv/rv32/rvd +riscv/rvv diff --git a/sysdeps/riscv/rv64/rvv/Implies b/sysdeps/riscv/rv64/rvv/Implies new file mode 100644 index 0000000000..9993bb30e3 --- /dev/null +++ b/sysdeps/riscv/rv64/rvv/Implies @@ -0,0 +1,2 @@ +riscv/rv64/rvd +riscv/rvv
From patchwork Thu May 4 07:48:48 2023 X-Patchwork-Submitter: Hau Hsu X-Patchwork-Id: 68734 Subject: [PATCH v3 2/5] riscv: vectorized mem* functions Date: Thu, 4 May 2023 15:48:48 +0800 Message-Id: <20230504074851.38763-3-hau.hsu@sifive.com> In-Reply-To: <20230504074851.38763-1-hau.hsu@sifive.com> References: <20230504074851.38763-1-hau.hsu@sifive.com> From: Jerry Shih This patch proposes implementations of memchr, memcmp, memcpy, memmove, and memset that leverage the RISC-V V extension (RVV), version 1.0. These routines assume VLEN is at least 32 bits, as is required by all currently defined vector extensions, and they support arbitrarily large VLEN. All implementations work for both RV32 and RV64 platforms, and make no assumptions about page size. 
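As a reading aid only (this is not part of the patch), the strip-mining pattern these mem* loops rely on to stay independent of VLEN can be modeled in Python. This is a minimal sketch under stated assumptions: `vsetvli` and `rvv_memcpy_model` are hypothetical helper names, `vlen_bits` and `lmul` are illustrative parameters, and the model grants min(AVL, VLMAX) elements per pass, which is the simplest behavior the V specification permits.

```python
def vsetvli(avl, vlmax):
    """Hypothetical model of vsetvli: grant at most vlmax elements per pass."""
    return min(avl, vlmax)


def rvv_memcpy_model(dst, src, num, vlen_bits=128, lmul=8):
    """Copy num bytes from src to dst in chunks of at most VLMAX bytes.

    With e8 elements, VLMAX = (VLEN / 8) * LMUL bytes per register group.
    The loop never hard-codes a chunk size, so any VLEN >= 32 works.
    """
    vlmax = (vlen_bits // 8) * lmul
    done = 0
    while num != 0:
        vl = vsetvli(num, vlmax)             # vsetvli ivl, num, e8, m8, ta, ma
        dst[done:done + vl] = src[done:done + vl]  # vle8.v + vse8.v
        done += vl                           # add src/dst_ptr, ivl
        num -= vl                            # sub num, num, ivl
    return dst


src = bytes(range(200))
for vlen in (32, 128, 4096):
    out = rvv_memcpy_model(bytearray(200), src, 200, vlen_bits=vlen)
    assert bytes(out) == src
```

The same shape (set vl from the remaining count, process vl bytes, advance, repeat) applies to the memset, memcmp, and memchr loops; only the per-chunk operation differs.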
--- sysdeps/riscv/rvv/memchr.S | 62 +++++++++++++++++++++++++++++++ sysdeps/riscv/rvv/memcmp.S | 74 +++++++++++++++++++++++++++++++++++++ sysdeps/riscv/rvv/memcpy.S | 50 +++++++++++++++++++++++++ sysdeps/riscv/rvv/memmove.S | 71 +++++++++++++++++++++++++++++++++++ sysdeps/riscv/rvv/memset.S | 49 ++++++++++++++++++++++++ 5 files changed, 306 insertions(+) create mode 100644 sysdeps/riscv/rvv/memchr.S create mode 100644 sysdeps/riscv/rvv/memcmp.S create mode 100644 sysdeps/riscv/rvv/memcpy.S create mode 100644 sysdeps/riscv/rvv/memmove.S create mode 100644 sysdeps/riscv/rvv/memset.S diff --git a/sysdeps/riscv/rvv/memchr.S b/sysdeps/riscv/rvv/memchr.S new file mode 100644 index 0000000000..a8273e9a55 --- /dev/null +++ b/sysdeps/riscv/rvv/memchr.S @@ -0,0 +1,62 @@ +/* RVV versions memchr. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + +#define result a0 + +#define src a0 +#define value a1 +#define num a2 + +#define ivl a3 +#define temp a4 + +#define ELEM_LMUL_SETTING m8 +#define vdata v0 +#define vmask v8 + +ENTRY(memchr) + +L(loop): + vsetvli zero, num, e8, ELEM_LMUL_SETTING, ta, ma + + vle8ff.v vdata, (src) + /* Find the value inside the loaded data. 
*/ + vmseq.vx vmask, vdata, value + vfirst.m temp, vmask + + /* Skip the loop if we find the matched value. */ + bgez temp, L(found) + + csrr ivl, vl + sub num, num, ivl + add src, src, ivl + + bnez num, L(loop) + + li result, 0 + ret + +L(found): + add result, src, temp + ret + +END(memchr) +libc_hidden_builtin_def (memchr) diff --git a/sysdeps/riscv/rvv/memcmp.S b/sysdeps/riscv/rvv/memcmp.S new file mode 100644 index 0000000000..fbf81acc2f --- /dev/null +++ b/sysdeps/riscv/rvv/memcmp.S @@ -0,0 +1,74 @@ +/* RVV versions memcmp. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + +#define result a0 + +#define src1 a0 +#define src2 a1 +#define num a2 + +#define ivl a3 +#define temp a4 +#define temp1 a5 +#define temp2 a6 + +#define ELEM_LMUL_SETTING m8 +#define vdata1 v0 +#define vdata2 v8 +#define vmask v16 + +ENTRY(memcmp) + +L(loop): + vsetvli ivl, num, e8, ELEM_LMUL_SETTING, ta, ma + + vle8.v vdata1, (src1) + vle8.v vdata2, (src2) + + vmsne.vv vmask, vdata1, vdata2 + sub num, num, ivl + vfirst.m temp, vmask + + /* Skip the loop if we find the different value between src1 and src2. 
*/ + bgez temp, L(found) + + add src1, src1, ivl + add src2, src2, ivl + + bnez num, L(loop) + + li result, 0 + ret + +L(found): + add src1, src1, temp + add src2, src2, temp + lbu temp1, 0(src1) + lbu temp2, 0(src2) + sub result, temp1, temp2 + ret + +END(memcmp) +libc_hidden_builtin_def (memcmp) +weak_alias (memcmp,bcmp) +strong_alias (memcmp, __memcmpeq) +libc_hidden_def (__memcmpeq) + diff --git a/sysdeps/riscv/rvv/memcpy.S b/sysdeps/riscv/rvv/memcpy.S new file mode 100644 index 0000000000..982c128370 --- /dev/null +++ b/sysdeps/riscv/rvv/memcpy.S @@ -0,0 +1,50 @@ +/* RVV versions memcpy. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . 
 */ + +#include +#include + +#define dst a0 +#define src a1 +#define num a2 + +#define ivl a3 +#define dst_ptr a4 + +#define ELEM_LMUL_SETTING m8 +#define vdata v0 + +ENTRY(memcpy) + + mv dst_ptr, dst + +L(loop): + vsetvli ivl, num, e8, ELEM_LMUL_SETTING, ta, ma + + vle8.v vdata, (src) + sub num, num, ivl + add src, src, ivl + vse8.v vdata, (dst_ptr) + add dst_ptr, dst_ptr, ivl + + bnez num, L(loop) + + ret + +END(memcpy) +libc_hidden_builtin_def (memcpy) diff --git a/sysdeps/riscv/rvv/memmove.S b/sysdeps/riscv/rvv/memmove.S new file mode 100644 index 0000000000..492c0b65f7 --- /dev/null +++ b/sysdeps/riscv/rvv/memmove.S @@ -0,0 +1,71 @@ +/* RVV versions memmove. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + +#define dst a0 +#define src a1 +#define num a2 + +#define ivl a3 +#define dst_ptr a4 +#define src_backward_ptr a5 +#define dst_backward_ptr a6 + +#define ELEM_LMUL_SETTING m8 +#define vdata v0 + +ENTRY(memmove) + + mv dst_ptr, dst + + /* If src is at or after dst, every byte of src is loaded before it is + overwritten in the overlapping case, so we can use the faster `forward-copy`. 
 */ + bgeu src, dst, L(forward_copy_loop) + add src_backward_ptr, src, num + add dst_backward_ptr, dst, num + /* If dst lies inside the source data range, we need to use `backward_copy_loop` to + handle the overlap. */ + bltu dst, src_backward_ptr, L(backward_copy_loop) + +L(forward_copy_loop): + vsetvli ivl, num, e8, ELEM_LMUL_SETTING, ta, ma + + vle8.v vdata, (src) + sub num, num, ivl + add src, src, ivl + vse8.v vdata, (dst_ptr) + add dst_ptr, dst_ptr, ivl + + bnez num, L(forward_copy_loop) + ret + +L(backward_copy_loop): + vsetvli ivl, num, e8, ELEM_LMUL_SETTING, ta, ma + + sub src_backward_ptr, src_backward_ptr, ivl + vle8.v vdata, (src_backward_ptr) + sub num, num, ivl + sub dst_backward_ptr, dst_backward_ptr, ivl + vse8.v vdata, (dst_backward_ptr) + bnez num, L(backward_copy_loop) + ret + +END(memmove) +libc_hidden_builtin_def (memmove) diff --git a/sysdeps/riscv/rvv/memset.S b/sysdeps/riscv/rvv/memset.S new file mode 100644 index 0000000000..ac3f88e492 --- /dev/null +++ b/sysdeps/riscv/rvv/memset.S @@ -0,0 +1,49 @@ +/* RVV versions memset. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . 
 */ + +#include +#include + +#define dst a0 +#define value a1 +#define num a2 + +#define ivl a3 +#define dst_ptr a5 + +#define ELEM_LMUL_SETTING m8 +#define vdata v0 + +ENTRY(memset) + + mv dst_ptr, dst + + vsetvli ivl, num, e8, ELEM_LMUL_SETTING, ta, ma + vmv.v.x vdata, value + +L(loop): + vse8.v vdata, (dst_ptr) + sub num, num, ivl + add dst_ptr, dst_ptr, ivl + vsetvli ivl, num, e8, ELEM_LMUL_SETTING, ta, ma + bnez num, L(loop) + + ret + +END(memset) +libc_hidden_builtin_def (memset)
From patchwork Thu May 4 07:48:49 2023 X-Patchwork-Submitter: Hau Hsu X-Patchwork-Id: 68737 Subject: [PATCH v3 3/5] riscv: vectorized str* functions Date: Thu, 4 May 2023 15:48:49 +0800 Message-Id: <20230504074851.38763-4-hau.hsu@sifive.com> In-Reply-To: <20230504074851.38763-1-hau.hsu@sifive.com> References: <20230504074851.38763-1-hau.hsu@sifive.com> From: Jerry Shih This patch proposes implementations of strcat, strcmp, strcpy, strlen, strncat, strncmp and strncpy that leverage the RISC-V V extension (RVV), version 1.0. These routines assume VLEN is at least 32 bits, as is required by all currently defined vector extensions, and they support arbitrarily large VLEN. All implementations work for both RV32 and RV64 platforms, and make no assumptions about page size. 
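As a reading aid only (this is not part of the patch), the reason the str* routines need no page-size assumption can be modeled in Python: they use fault-only-first loads (`vle8ff.v`), which fault only if element 0 is unreadable and otherwise shrink vl to the readable prefix. This is a minimal sketch under stated assumptions: `vle8ff` and `rvv_strlen_model` are hypothetical helper names, `memory` is a bytes object standing in for the address space, and `mapped_end` is an illustrative stand-in for the first unmapped address.

```python
def vle8ff(memory, addr, vlmax, mapped_end):
    """Hypothetical model of vle8ff.v: only element 0 may fault.

    Returns the readable prefix of the requested chunk; its length
    plays the role of the vl value read back with `csrr cur_vl, vl`.
    """
    if addr >= mapped_end:
        raise MemoryError("fault on element 0")
    vl = min(vlmax, mapped_end - addr)
    return memory[addr:addr + vl]


def rvv_strlen_model(memory, start, vlen_bits=128, lmul=2, mapped_end=None):
    """Scan for the NUL terminator one trimmed chunk at a time."""
    if mapped_end is None:
        mapped_end = len(memory)
    vlmax = (vlen_bits // 8) * lmul
    cur = start
    while True:
        chunk = vle8ff(memory, cur, vlmax, mapped_end)
        pos = chunk.find(0)      # vmseq + vfirst.m: -1 when no NUL is present
        if pos >= 0:
            return cur + pos - start
        cur += len(chunk)        # advance by the vl actually loaded


mem = b"xx" + b"hello" + b"\0"
for vlen in (32, 128, 1024):
    assert rvv_strlen_model(mem, 2, vlen_bits=vlen) == 5
```

In the last case the requested chunk is far larger than the mapped region, yet the scan still terminates at the NUL without reading past `mapped_end`, which is the property the loops in this patch depend on.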
--- sysdeps/riscv/rvv/strcat.S | 71 ++++++++++++++++++++++++++++++ sysdeps/riscv/rvv/strcmp.S | 88 +++++++++++++++++++++++++++++++++++++ sysdeps/riscv/rvv/strcpy.S | 55 +++++++++++++++++++++++ sysdeps/riscv/rvv/strlen.S | 53 ++++++++++++++++++++++ sysdeps/riscv/rvv/strncat.S | 82 ++++++++++++++++++++++++++++++++++ sysdeps/riscv/rvv/strncmp.S | 84 +++++++++++++++++++++++++++++++++++ sysdeps/riscv/rvv/strncpy.S | 85 +++++++++++++++++++++++++++++++++++ 7 files changed, 518 insertions(+) create mode 100644 sysdeps/riscv/rvv/strcat.S create mode 100644 sysdeps/riscv/rvv/strcmp.S create mode 100644 sysdeps/riscv/rvv/strcpy.S create mode 100644 sysdeps/riscv/rvv/strlen.S create mode 100644 sysdeps/riscv/rvv/strncat.S create mode 100644 sysdeps/riscv/rvv/strncmp.S create mode 100644 sysdeps/riscv/rvv/strncpy.S diff --git a/sysdeps/riscv/rvv/strcat.S b/sysdeps/riscv/rvv/strcat.S new file mode 100644 index 0000000000..fb5858fa82 --- /dev/null +++ b/sysdeps/riscv/rvv/strcat.S @@ -0,0 +1,71 @@ +/* RVV versions strcat. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . 
*/ + +#include +#include + +#define dst a0 +#define src a1 +#define dst_ptr a2 + +#define ivl a3 +#define cur_vl a4 +#define active_elem_pos a5 + +#define ELEM_LMUL_SETTING m1 +#define vmask1 v0 +#define vmask2 v1 +#define vstr1 v8 +#define vstr2 v16 + +ENTRY(strcat) + + mv dst_ptr, dst + + /* Perform `strlen(dst)`. */ +L(strlen_loop): + vsetvli ivl, zero, e8, ELEM_LMUL_SETTING, ta, ma + + vle8ff.v vstr1, (dst_ptr) + vmseq.vx vmask1, vstr1, zero + csrr cur_vl, vl + vfirst.m active_elem_pos, vmask1 + add dst_ptr, dst_ptr, cur_vl + bltz active_elem_pos, L(strlen_loop) + + sub dst_ptr, dst_ptr, cur_vl + add dst_ptr, dst_ptr, active_elem_pos + + /* Perform `strcpy(dst, src)`. */ +L(strcpy_loop): + vsetvli ivl, zero, e8, ELEM_LMUL_SETTING, ta, ma + + vle8ff.v vstr1, (src) + vmseq.vx vmask2, vstr1, zero + csrr cur_vl, vl + vfirst.m active_elem_pos, vmask2 + vmsif.m vmask1, vmask2 + add src, src, cur_vl + vse8.v vstr1, (dst_ptr), vmask1.t + add dst_ptr, dst_ptr, cur_vl + bltz active_elem_pos, L(strcpy_loop) + + ret + +END(strcat) +libc_hidden_builtin_def (strcat) diff --git a/sysdeps/riscv/rvv/strcmp.S b/sysdeps/riscv/rvv/strcmp.S new file mode 100644 index 0000000000..2e60d76dc8 --- /dev/null +++ b/sysdeps/riscv/rvv/strcmp.S @@ -0,0 +1,88 @@ +/* RVV versions strcmp. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. 
+ + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + +#define result a0 + +#define str1 a0 +#define str2 a1 + +#define ivl a2 +#define temp1 a3 +#define temp2 a4 + +#define vstr1 v0 +#define vstr2 v8 +#define vmask1 v16 +#define vmask2 v17 + +ENTRY(strcmp) + /* lmul=1 */ + +L(Loop): + vsetvli ivl, zero, e8, m1, ta, ma + vle8ff.v vstr1, (str1) + /* check if vstr1[i] == 0 */ + vmseq.vx vmask1, vstr1, zero + + vle8ff.v vstr2, (str2) + /* check if vstr1[i] != vstr2[i] */ + vmsne.vv vmask2, vstr1, vstr2 + + /* find the index x for vstr1[x]==0 */ + vfirst.m temp1, vmask1 + /* find the index x for vstr1[x]!=vstr2[x] */ + vfirst.m temp2, vmask2 + + bgez temp1, L(check1) + bgez temp2, L(check2) + + /* get the current vl updated by vle8ff. */ + csrr ivl, vl + add str1, str1, ivl + add str2, str2, ivl + j L(Loop) + + /* temp1>=0 */ +L(check1): + bltz temp2, 1f + blt temp2, temp1, L(check2) +1: + /* temp2<0 */ + /* temp2>=0 && temp1<=temp2 */ + add str1, str1, temp1 + add str2, str2, temp1 + lbu temp1, 0(str1) + lbu temp2, 0(str2) + sub result, temp1, temp2 + ret + + /* temp1<0 */ + /* temp2>=0 */ +L(check2): + add str1, str1, temp2 + add str2, str2, temp2 + lbu temp1, 0(str1) + lbu temp2, 0(str2) + sub result, temp1, temp2 + ret + +END(strcmp) +libc_hidden_builtin_def (strcmp) diff --git a/sysdeps/riscv/rvv/strcpy.S b/sysdeps/riscv/rvv/strcpy.S new file mode 100644 index 0000000000..1ad433f5f3 --- /dev/null +++ b/sysdeps/riscv/rvv/strcpy.S @@ -0,0 +1,55 @@ +/* RVV versions strcpy. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + +#define dst a0 +#define src a1 +#define dst_ptr a2 + +#define ivl a3 +#define cur_vl a4 +#define active_elem_pos a5 + +#define ELEM_LMUL_SETTING m1 +#define vmask1 v0 +#define vmask2 v1 +#define vstr1 v8 +#define vstr2 v16 + +ENTRY(strcpy) + + mv dst_ptr, dst + +L(strcpy_loop): + vsetvli ivl, zero, e8, ELEM_LMUL_SETTING, ta, ma + vle8ff.v vstr1, (src) + vmseq.vx vmask2, vstr1, zero + csrr cur_vl, vl + vfirst.m active_elem_pos, vmask2 + vmsif.m vmask1, vmask2 + add src, src, cur_vl + vse8.v vstr1, (dst_ptr), vmask1.t + add dst_ptr, dst_ptr, cur_vl + bltz active_elem_pos, L(strcpy_loop) + + ret + +END(strcpy) +libc_hidden_builtin_def (strcpy) diff --git a/sysdeps/riscv/rvv/strlen.S b/sysdeps/riscv/rvv/strlen.S new file mode 100644 index 0000000000..cf3698f52a --- /dev/null +++ b/sysdeps/riscv/rvv/strlen.S @@ -0,0 +1,53 @@ +/* RVV versions strlen. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . 
*/ + +#include +#include + +#define result a0 +#define str a0 +#define copy_str a1 +#define ivl a2 +#define cur_vl a2 +#define end_offset a3 + +#define ELEM_LMUL_SETTING m2 +#define vstr v0 +#define vmask_end v2 + +ENTRY(strlen) + + mv copy_str, str +L(loop): + vsetvli ivl, zero, e8, ELEM_LMUL_SETTING, ta, ma + vle8ff.v vstr, (copy_str) + csrr cur_vl, vl + vmseq.vi vmask_end, vstr, 0 + vfirst.m end_offset, vmask_end + add copy_str, copy_str, cur_vl + bltz end_offset, L(loop) + + add str, str, cur_vl + add copy_str, copy_str, end_offset + sub result, copy_str, result + + ret + +END(strlen) + +libc_hidden_builtin_def (strlen) diff --git a/sysdeps/riscv/rvv/strncat.S b/sysdeps/riscv/rvv/strncat.S new file mode 100644 index 0000000000..d30a6533a3 --- /dev/null +++ b/sysdeps/riscv/rvv/strncat.S @@ -0,0 +1,82 @@ +/* RVV versions strncat. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + +#define dst a0 +#define src a1 +#define length a2 +#define dst_ptr a3 + +#define ivl a4 +#define cur_vl a5 +#define activate_elem_pos a6 + +#define ELEM_LMUL_SETTING m1 +#define vmask1 v0 +#define vmask2 v1 +#define vstr1 v8 +#define vstr2 v16 + +ENTRY(strncat) + + mv dst_ptr, dst + + /* the strlen of dst. 
*/ +L(strlen_loop): + vsetvli ivl, zero, e8, ELEM_LMUL_SETTING, ta, ma + + vle8ff.v vstr1, (dst_ptr) + /* find the '\0'. */ + vmseq.vx vmask1, vstr1, zero + csrr cur_vl, vl + vfirst.m activate_elem_pos, vmask1 + add dst_ptr, dst_ptr, cur_vl + bltz activate_elem_pos, L(strlen_loop) + + sub dst_ptr, dst_ptr, cur_vl + add dst_ptr, dst_ptr, activate_elem_pos + + /* copy src to dst_ptr. */ +L(strcpy_loop): + vsetvli zero, length, e8, ELEM_LMUL_SETTING, ta, ma + + vle8ff.v vstr1, (src) + vmseq.vx vmask2, vstr1, zero + csrr cur_vl, vl + vfirst.m activate_elem_pos, vmask2 + vmsif.m vmask1, vmask2 + add src, src, cur_vl + sub length, length, cur_vl + vse8.v vstr1, (dst_ptr), vmask1.t + add dst_ptr, dst_ptr, cur_vl + beqz length, L(fill_zero) + bltz activate_elem_pos, L(strcpy_loop) + + ret + +L(fill_zero): + bgez activate_elem_pos, L(fill_zero_end) + sb zero, (dst_ptr) + +L(fill_zero_end): + ret + +END(strncat) +libc_hidden_builtin_def (strncat) diff --git a/sysdeps/riscv/rvv/strncmp.S b/sysdeps/riscv/rvv/strncmp.S new file mode 100644 index 0000000000..2b6ab1f233 --- /dev/null +++ b/sysdeps/riscv/rvv/strncmp.S @@ -0,0 +1,84 @@ +/* RVV versions strncmp. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . 
*/ + +#include +#include + +#define result a0 + +#define str1 a0 +#define str2 a1 +#define length a2 + +#define ivl a3 +#define temp1 a4 +#define temp2 a5 + +#define ELEM_LMUL_SETTING m1 +#define vstr1 v0 +#define vstr2 v4 +#define vmask1 v8 +#define vmask2 v9 + +ENTRY(strncmp) + + beqz length, L(zero_length) + +L(loop): + vsetvli zero, length, e8, ELEM_LMUL_SETTING, ta, ma + + vle8ff.v vstr1, (str1) + /* vstr1[i] == 0. */ + vmseq.vx vmask1, vstr1, zero + + vle8ff.v vstr2, (str2) + /* vstr1[i] != vstr2[i]. */ + vmsne.vv vmask2, vstr1, vstr2 + + csrr ivl, vl + + /* r = mask1 | mask2 + We could use vfirst.m to get the first zero char or the + first different char between str1 and str2. */ + vmor.mm vmask1, vmask1, vmask2 + + sub length, length, ivl + + vfirst.m temp1, vmask1 + + bgez temp1, L(end_loop) + + add str1, str1, ivl + add str2, str2, ivl + bnez length, L(loop) +L(end_loop): + + add str1, str1, temp1 + add str2, str2, temp1 + lbu temp1, 0(str1) + lbu temp2, 0(str2) + + sub result, temp1, temp2 + ret + +L(zero_length): + li result, 0 + ret + +END(strncmp) +libc_hidden_builtin_def (strncmp) diff --git a/sysdeps/riscv/rvv/strncpy.S b/sysdeps/riscv/rvv/strncpy.S new file mode 100644 index 0000000000..53fb8cdec7 --- /dev/null +++ b/sysdeps/riscv/rvv/strncpy.S @@ -0,0 +1,85 @@ +/* RVV versions strncpy. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. 
+ + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + +#define dst a0 +#define src a1 +#define length a2 +#define dst_ptr a3 + +#define ivl a4 +#define cur_vl a5 +#define active_elem_pos a6 +#define temp a7 + +#define ELEM_LMUL_SETTING m1 +#define vmask1 v0 +#define vmask2 v1 +#define ZERO_FILL_ELEM_LMUL_SETTING m8 +#define vstr1 v8 +#define vstr2 v16 + +ENTRY(strncpy) + + mv dst_ptr, dst + + /* Copy src to dst_ptr. */ +L(strcpy_loop): + vsetvli zero, length, e8, ELEM_LMUL_SETTING, ta, ma + vle8ff.v vstr1, (src) + vmseq.vx vmask2, vstr1, zero + csrr cur_vl, vl + vfirst.m active_elem_pos, vmask2 + vmsif.m vmask1, vmask2 + add src, src, cur_vl + sub length, length, cur_vl + vse8.v vstr1, (dst_ptr), vmask1.t + add dst_ptr, dst_ptr, cur_vl + bgez active_elem_pos, L(fill_zero) + bnez length, L(strcpy_loop) + ret + + /* Zero-fill the tail. */ +L(fill_zero): + /* The `\0` itself has already been copied to dst. `vfirst.m` + returns its index, not a count, so adjust by `-1` to get + the correct remaining length for zero filling. */ + sub temp, cur_vl, active_elem_pos + addi temp, temp, -1 + add length, length, temp + /* Return early in the `strlen(src) + 1 == count` case.
*/ + bnez length, 1f + ret +1: + sub dst_ptr, dst_ptr, temp + vsetvli zero, length, e8, ZERO_FILL_ELEM_LMUL_SETTING, ta, ma + vmv.v.x vstr2, zero + +L(fill_zero_loop): + vsetvli ivl, length, e8, ZERO_FILL_ELEM_LMUL_SETTING, ta, ma + vse8.v vstr2, (dst_ptr) + sub length, length, ivl + add dst_ptr, dst_ptr, ivl + bnez length, L(fill_zero_loop) + + ret + +END(strncpy) +libc_hidden_builtin_def (strncpy) From patchwork Thu May 4 07:48:50 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hau Hsu X-Patchwork-Id: 68736 X-Patchwork-Delegate: palmer@dabbelt.com Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 2A684385293A for ; Thu, 4 May 2023 07:50:31 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 2A684385293A DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1683186631; bh=jse2fHFH9fc3qdt9mCb8n03g+nxcsHPRJrpueR7/mlg=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=Ul77m88suQB3PR++MqF41jckqmu0KFTbDFGJlcPH1i58TQxZWJj0D0UT//4m4yD4B snjU8JVqT+yu9mQLOIzJWVX+BHsjJk6zw0/StVZBZDSF5RqzfmQVEuceFWRCTR51BS jQHnM/s6vhkTmWxYmXXNLL9g/Iwycvey5Mqb1Edg= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-pf1-x429.google.com (mail-pf1-x429.google.com [IPv6:2607:f8b0:4864:20::429]) by sourceware.org (Postfix) with ESMTPS id 6B8F63857005 for ; Thu, 4 May 2023 07:49:25 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 6B8F63857005 Received: by mail-pf1-x429.google.com with SMTP id d2e1a72fcca58-64359d9c531so165847b3a.3 for ; Thu, 04 May 2023 00:49:25 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1683186564; x=1685778564; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=jse2fHFH9fc3qdt9mCb8n03g+nxcsHPRJrpueR7/mlg=; b=AJw/4guB7xlrEBKXZnPaae+yAxWRiJ7np5482OoIp84ASMm7MUushH9QtHa8OwmL2d fzLWgcyJhETcSqK2KhYGo77chpXkFhl84gaX7MCMFQcEgh8yWfLFCcbpUGAFj5tuCsAp S4SqJjjUpMTk9Q8QnpDOdKYMwQPWyRzj+oTVA3Bz3ZOq0peOH+FGu0yAHqDwxwXwlHzX wS/UhmVZ30VsxhdqJfJgktKhBdFicjJaoCrz+DOdXnLFPq8Oh6G4CwmoNNtLAcRhQJo0 +FnpHQA/ZaxPfN+oYF36N5AnbDhUFH+UaYpcFlL16r7bqhHAtq7rjhQqdqoNEMQjTnRV r3zA== X-Gm-Message-State: AC+VfDwN/YknRvs/7yWuwTuIBQb3RQ3x8+vejtH4akfAB2hvG5OGJj2m unwaElT7n1Gn/hoJPskARuOrk/l3w90OClWyUsOQJmooSEmpT1fdpdCjabK08mBz0P7uNNZ61tF 7TvxjL7uestpJYeeJNavQUv4IoEXlUCE3KXyZAUkqJKmnMvG1Q7q6XEo25GevHNCkNZMAHrKIvz pEKg== X-Google-Smtp-Source: ACHHUZ7//r5I0CMGLWh6RCZWxu8ajq8yqQJqZXwiYuSQYluHZ0NMnfu2zRqkolDqQH7FngYgyLZ6Mg== X-Received: by 2002:a17:902:ea95:b0:1a6:6fe3:df91 with SMTP id x21-20020a170902ea9500b001a66fe3df91mr2744470plb.50.1683186563982; Thu, 04 May 2023 00:49:23 -0700 (PDT) Received: from localhost.localdomain (36-238-22-214.dynamic-ip.hinet.net. 
[36.238.22.214]) by smtp.gmail.com with ESMTPSA id y18-20020a17090322d200b001ab06958770sm4875294plg.161.2023.05.04.00.49.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 May 2023 00:49:23 -0700 (PDT) To: libc-alpha@sourceware.org Cc: hau.hsu@sifive.com, kito.cheng@sifive.com, nick.knight@sifive.com, jerry.shih@sifive.com, vincent.chen@sifive.com, hongrong.hsu@sifive.com Subject: [PATCH v3 4/5] riscv: vectorized strchr and strnlen functions Date: Thu, 4 May 2023 15:48:50 +0800 Message-Id: <20230504074851.38763-5-hau.hsu@sifive.com> X-Mailer: git-send-email 2.40.0 In-Reply-To: <20230504074851.38763-1-hau.hsu@sifive.com> References: <20230504074851.38763-1-hau.hsu@sifive.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_BARRACUDACENTRAL, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Hau Hsu via Libc-alpha From: Hau Hsu Reply-To: Hau Hsu Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org Sender: "Libc-alpha" From: Nick Knight This patch proposes implementations of strchr and strnlen that leverage the RISC-V V extension (RVV), version 1.0. These routines assume VLEN is at least 32 bits, as is required by all currently defined vector extensions, and they support arbitrarily large VLEN. All implementations work for both RV32 and RV64 platforms, and make no assumptions about page size.
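For reference while reviewing, the control flow of strnlen.S can be modelled in scalar C roughly as below. This is only a sketch: `model_strnlen` and `CHUNK` are names made up for illustration, and `CHUNK` stands in for whatever vl the hardware returns from vle8ff.v, which is not a fixed constant in the real routine.

```c
#include <stddef.h>

/* Hypothetical stand-in for the vl returned by one vle8ff.v.  Real
   hardware chooses this value; 16 is used only for illustration.  */
#define CHUNK 16

/* Scalar model of the loop in strnlen.S: look at no more than CHUNK
   bytes per iteration, never more than the remaining max_len, and
   stop at the first '\0'.  */
size_t
model_strnlen (const char *str, size_t max_len)
{
  const char *p = str;
  size_t remaining = max_len;

  while (remaining != 0)
    {
      /* vsetvli zero, max_len: clamp the chunk to what is left.  */
      size_t vl = remaining < CHUNK ? remaining : CHUNK;

      /* vmseq.vi + vfirst.m: index of the first '\0', or -1.  */
      long end_offset = -1;
      for (size_t i = 0; i < vl; i++)
        if (p[i] == '\0')
          {
            end_offset = (long) i;
            break;
          }

      if (end_offset >= 0)
        return (size_t) (p - str) + (size_t) end_offset;

      p += vl;
      remaining -= vl;
    }

  /* No '\0' within max_len bytes.  */
  return max_len;
}
```

The model stops reading at the first '\0'. In the assembly, vle8ff.v plays that role by truncating vl instead of trapping when an element after the first would fault, which is why no page size assumption is needed.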
--- sysdeps/riscv/rvv/strchr.S | 62 +++++++++++++++++++++++++++++++++++++ sysdeps/riscv/rvv/strnlen.S | 55 ++++++++++++++++++++++++++++++++ 2 files changed, 117 insertions(+) create mode 100644 sysdeps/riscv/rvv/strchr.S create mode 100644 sysdeps/riscv/rvv/strnlen.S diff --git a/sysdeps/riscv/rvv/strchr.S b/sysdeps/riscv/rvv/strchr.S new file mode 100644 index 0000000000..053923d3d7 --- /dev/null +++ b/sysdeps/riscv/rvv/strchr.S @@ -0,0 +1,62 @@ +/* RVV versions strchr. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + + +#include +#include + +#define str a0 +#define ch a1 +#define end_offset a2 +#define ch_offset a3 +#define temp1 a4 +#define temp2 a5 +#define cur_vl a6 +#define ivl t0 + +#define ELEM_LMUL_SETTING m1 +#define vstr v0 +#define vmask_end v8 +#define vmask_ch v9 + +ENTRY(strchr) + +L(strchr_loop): + vsetvli ivl, zero, e8, ELEM_LMUL_SETTING, ta, ma + vle8ff.v vstr, (str) + vmseq.vi vmask_end, vstr, 0 + vmseq.vx vmask_ch, vstr, ch + vfirst.m end_offset, vmask_end /* first occurrence of \0 */ + vfirst.m ch_offset, vmask_ch /* first occurrence of ch */ + sltz temp1, ch_offset + sltu temp2, end_offset, ch_offset + or temp1, temp1, temp2 + beqz temp1, L(found_ch) /* Found ch, not preceded by \0? 
*/ + csrr cur_vl, vl + add str, str, cur_vl + bltz end_offset, L(strchr_loop) /* Didn't find \0? */ + li str, 0 + ret +L(found_ch): + add str, str, ch_offset + ret + +END(strchr) +weak_alias (strchr, index) +libc_hidden_builtin_def (strchr) + diff --git a/sysdeps/riscv/rvv/strnlen.S b/sysdeps/riscv/rvv/strnlen.S new file mode 100644 index 0000000000..b902ae0fd4 --- /dev/null +++ b/sysdeps/riscv/rvv/strnlen.S @@ -0,0 +1,55 @@ +/* RVV versions strnlen. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . 
*/ + +#include +#include + +#define str a0 +#define copy_str a2 +#define ret_value a0 +#define max_len a1 +#define cur_vl a3 +#define end_offset a4 + +#define ELEM_LMUL_SETTING m1 +#define vstr v0 +#define vmask_end v8 + +ENTRY(__strnlen) + + mv copy_str, str + mv ret_value, max_len +L(strnlen_loop): + beqz max_len, L(end_strnlen_loop) + vsetvli zero, max_len, e8, ELEM_LMUL_SETTING, ta, ma + vle8ff.v vstr, (copy_str) + vmseq.vi vmask_end, vstr, 0 + vfirst.m end_offset, vmask_end /* first occurrence of \0 */ + csrr cur_vl, vl + add copy_str, copy_str, cur_vl + sub max_len, max_len, cur_vl + bltz end_offset, L(strnlen_loop) + add max_len, max_len, cur_vl + sub ret_value, ret_value, max_len + add ret_value, ret_value, end_offset +L(end_strnlen_loop): + ret +END(__strnlen) +weak_alias (__strnlen, strnlen) +libc_hidden_builtin_def (strnlen) +libc_hidden_builtin_def (__strnlen) From patchwork Thu May 4 07:48:51 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hau Hsu X-Patchwork-Id: 68735 X-Patchwork-Delegate: palmer@dabbelt.com Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 3D0653854146 for ; Thu, 4 May 2023 07:50:01 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 3D0653854146 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1683186601; bh=TH84YE7rqdm9XGv6T+O94pI96mPlo4WeYYypkankhqA=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=FU8SddE97MfDhbM+WicH47AYJO945jAUC0vVbzeHk4G2RfKZc6u9VzTskVFaoRXcS vsaqEwFiYtpT02qjoGNZsyp3fQjJkYVzohHewxtJFMrcdweoYuYEyuapRPPonLBbbp jeJHG2c9hIFjN7Z1aRsaKJ1gKx0pGoKBaD9N33+k= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from
mail-pl1-x630.google.com (mail-pl1-x630.google.com [IPv6:2607:f8b0:4864:20::630]) by sourceware.org (Postfix) with ESMTPS id EE9AC3856DC8 for ; Thu, 4 May 2023 07:49:26 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org EE9AC3856DC8 Received: by mail-pl1-x630.google.com with SMTP id d9443c01a7336-1aafa03f541so1185235ad.0 for ; Thu, 04 May 2023 00:49:26 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1683186566; x=1685778566; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=TH84YE7rqdm9XGv6T+O94pI96mPlo4WeYYypkankhqA=; b=HeFGBc6ibR4unD56o3xQLiGr5yLP/6MJBJqyF2RPpQiZA58hSuAazCwEC4tx403Qcf HEYUZn8ca2mFSK9RlLeRlVvct+Ml2z79/dttT7+5utAPc1M28QgoO6TUeY32VAMAX/E4 oavWPOF8PPrtdDjPgNO682aMBYQCZfVkUxJ4Hsbg241fB9tztae9HpUyNTKG27IZwKXN q+cHZ0hQdQ30vfKh1EQBOpTB3g0plDEbUIX/0vp1iOr4RPJIhwY1LJzYjxp5UFkUb6pn vpzRs3g2EuEI6rWvPTg3bVtF3mDiB2Qa2b24VCZYAJVQ1KGtbtyQgBQwjC9b1WG2vXrs RedA== X-Gm-Message-State: AC+VfDx7DwRlf5IdwSRl/DguMYjewyl+AuqH0n15JyLTdxVXuSeutDJ4 Zuq8Qksefggx7JApczWbrcyJHHPA6NEsEwp3kVkHSyRzC9y7UfCLns981Aj9nayYihCVIu2Ctzk abgxFMo8L0DziMYBPy2mqVt//Mxu6/YTKwi+wIPj3emyEUMrVylYqaFROl3Rzw3B8eYme13n4tC /a5g== X-Google-Smtp-Source: ACHHUZ6GrQcdIDjYFZPgyl3lRe78bgpVLPdocmMP8V9fbRZ+zOsU38t8i79OiWKESJx45reTVgmcxg== X-Received: by 2002:a17:902:7294:b0:1a9:21bc:65f8 with SMTP id d20-20020a170902729400b001a921bc65f8mr2808469pll.11.1683186565784; Thu, 04 May 2023 00:49:25 -0700 (PDT) Received: from localhost.localdomain (36-238-22-214.dynamic-ip.hinet.net. 
[36.238.22.214]) by smtp.gmail.com with ESMTPSA id y18-20020a17090322d200b001ab06958770sm4875294plg.161.2023.05.04.00.49.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 May 2023 00:49:25 -0700 (PDT) To: libc-alpha@sourceware.org Cc: hau.hsu@sifive.com, kito.cheng@sifive.com, nick.knight@sifive.com, jerry.shih@sifive.com, vincent.chen@sifive.com, hongrong.hsu@sifive.com, Yun Hsiang Subject: [PATCH v3 5/5] riscv: vectorized __memcmpeq function Date: Thu, 4 May 2023 15:48:51 +0800 Message-Id: <20230504074851.38763-6-hau.hsu@sifive.com> X-Mailer: git-send-email 2.40.0 In-Reply-To: <20230504074851.38763-1-hau.hsu@sifive.com> References: <20230504074851.38763-1-hau.hsu@sifive.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_BARRACUDACENTRAL, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Hau Hsu via Libc-alpha From: Hau Hsu Reply-To: Hau Hsu Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org Sender: "Libc-alpha" From: Yun Hsiang This patch proposes an implementation of __memcmpeq that leverages the RISC-V V extension (RVV), version 1.0. The routine assumes VLEN is at least 32 bits, as is required by all currently defined vector extensions, and it supports arbitrarily large VLEN. It works on both RV32 and RV64 platforms and makes no assumptions about page size.
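As a reading aid, the loop in memcmpeq.S behaves roughly like the scalar C below. It is a sketch under assumptions: `model_memcmpeq` and `CHUNK` are illustrative names, `CHUNK` stands in for the vl chosen by vsetvli, and only the zero versus non-zero distinction of the result is meant to match the assembly.

```c
#include <stddef.h>

/* Hypothetical stand-in for the vl chosen by vsetvli; the real value
   depends on the hardware.  */
#define CHUNK 16

/* Scalar model of the loop in memcmpeq.S.  Returns 0 when the two
   buffers are equal and some non-zero value otherwise; unlike memcmp,
   the sign and magnitude of the result carry no meaning.  */
int
model_memcmpeq (const void *s1, const void *s2, size_t num)
{
  const unsigned char *p1 = s1;
  const unsigned char *p2 = s2;

  while (num != 0)
    {
      /* vsetvli ivl, num: process at most CHUNK of the remaining bytes.  */
      size_t vl = num < CHUNK ? num : CHUNK;

      /* vmsne.vv + vfirst.m: does any byte in this chunk differ?  */
      for (size_t i = 0; i < vl; i++)
        if (p1[i] != p2[i])
          return (int) vl;	/* mirrors `mv result, ivl`: any non-zero value */

      p1 += vl;
      p2 += vl;
      num -= vl;
    }

  return 0;
}
```

Returning the chunk length on a mismatch is enough here because callers of __memcmpeq may only test the result against zero.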
--- sysdeps/riscv/rvv/memcmp.S | 4 --- sysdeps/riscv/rvv/memcmpeq.S | 67 ++++++++++++++++++++++++++++++++++++ 2 files changed, 67 insertions(+), 4 deletions(-) create mode 100644 sysdeps/riscv/rvv/memcmpeq.S diff --git a/sysdeps/riscv/rvv/memcmp.S b/sysdeps/riscv/rvv/memcmp.S index fbf81acc2f..eeec2cae6a 100644 --- a/sysdeps/riscv/rvv/memcmp.S +++ b/sysdeps/riscv/rvv/memcmp.S @@ -68,7 +68,3 @@ L(found): END(memcmp) libc_hidden_builtin_def (memcmp) -weak_alias (memcmp,bcmp) -strong_alias (memcmp, __memcmpeq) -libc_hidden_def (__memcmpeq) - diff --git a/sysdeps/riscv/rvv/memcmpeq.S b/sysdeps/riscv/rvv/memcmpeq.S new file mode 100644 index 0000000000..5820af69d7 --- /dev/null +++ b/sysdeps/riscv/rvv/memcmpeq.S @@ -0,0 +1,67 @@ +/* RVV versions __memcmpeq. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + .
*/ + +#include +#include + + +#define result a0 + +#define src1 a0 +#define src2 a1 +#define num a2 + +#define ivl a3 +#define temp a4 + +#define ELEM_LMUL_SETTING m1 +#define vdata1 v0 +#define vdata2 v8 +#define vmask v16 + +ENTRY(__memcmpeq) + +L(loop): + vsetvli ivl, num, e8, ELEM_LMUL_SETTING, ta, ma + + vle8.v vdata1, (src1) + vle8.v vdata2, (src2) + + vmsne.vv vmask, vdata1, vdata2 + sub num, num, ivl + vfirst.m temp, vmask + + /* Leave the loop as soon as src1 and src2 differ. */ + bgez temp, L(found) + + add src1, src1, ivl + add src2, src2, ivl + + bnez num, L(loop) + + li result, 0 + ret + +L(found): + mv result, ivl + ret + +END(__memcmpeq) + +weak_alias (__memcmpeq, bcmp) +libc_hidden_def (__memcmpeq)