From patchwork Thu May 4 07:48:48 2023
X-Patchwork-Submitter: Hau Hsu
X-Patchwork-Id: 68734
X-Patchwork-Delegate: palmer@dabbelt.com
To: libc-alpha@sourceware.org
Cc: hau.hsu@sifive.com, kito.cheng@sifive.com, nick.knight@sifive.com,
 jerry.shih@sifive.com, vincent.chen@sifive.com, hongrong.hsu@sifive.com
Subject: [PATCH v3 2/5] riscv: vectorized mem* functions
Date: Thu, 4 May 2023 15:48:48 +0800
Message-Id: <20230504074851.38763-3-hau.hsu@sifive.com>
In-Reply-To: <20230504074851.38763-1-hau.hsu@sifive.com>
References: <20230504074851.38763-1-hau.hsu@sifive.com>
List-Id: Libc-alpha mailing list
X-Patchwork-Original-From: Hau Hsu via Libc-alpha
From: Hau Hsu
Reply-To: Hau Hsu

From: Jerry Shih

This patch proposes implementations of memchr, memcmp, memcpy, memmove,
and memset that leverage the RISC-V V extension (RVV), version 1.0.
These routines assume VLEN is at least 32 bits, as is required by all
currently defined vector extensions, and they support arbitrarily large
VLEN. All implementations work on both RV32 and RV64 platforms and make
no assumptions about page size.
---
 sysdeps/riscv/rvv/memchr.S  | 62 +++++++++++++++++++++++++++++++
 sysdeps/riscv/rvv/memcmp.S  | 74 +++++++++++++++++++++++++++++++++++++
 sysdeps/riscv/rvv/memcpy.S  | 50 +++++++++++++++++++++++++
 sysdeps/riscv/rvv/memmove.S | 71 +++++++++++++++++++++++++++++++++++
 sysdeps/riscv/rvv/memset.S  | 49 ++++++++++++++++++++++++
 5 files changed, 306 insertions(+)
 create mode 100644 sysdeps/riscv/rvv/memchr.S
 create mode 100644 sysdeps/riscv/rvv/memcmp.S
 create mode 100644 sysdeps/riscv/rvv/memcpy.S
 create mode 100644 sysdeps/riscv/rvv/memmove.S
 create mode 100644 sysdeps/riscv/rvv/memset.S

diff --git a/sysdeps/riscv/rvv/memchr.S b/sysdeps/riscv/rvv/memchr.S
new file mode 100644
index 0000000000..a8273e9a55
--- /dev/null
+++ b/sysdeps/riscv/rvv/memchr.S
@@ -0,0 +1,62 @@
+/* RVV versions memchr. RISC-V version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#include
+#include
+
+#define result a0
+
+#define src a0
+#define value a1
+#define num a2
+
+#define ivl a3
+#define temp a4
+
+#define ELEM_LMUL_SETTING m8
+#define vdata v0
+#define vmask v8
+
+ENTRY(memchr)
+
+L(loop):
+    vsetvli zero, num, e8, ELEM_LMUL_SETTING, ta, ma
+
+    vle8ff.v vdata, (src)
+    /* Find the value inside the loaded data. */
+    vmseq.vx vmask, vdata, value
+    vfirst.m temp, vmask
+
+    /* Skip the loop if we find the matched value. */
+    bgez temp, L(found)
+
+    csrr ivl, vl
+    sub num, num, ivl
+    add src, src, ivl
+
+    bnez num, L(loop)
+
+    li result, 0
+    ret
+
+L(found):
+    add result, src, temp
+    ret
+
+END(memchr)
+libc_hidden_builtin_def (memchr)
diff --git a/sysdeps/riscv/rvv/memcmp.S b/sysdeps/riscv/rvv/memcmp.S
new file mode 100644
index 0000000000..fbf81acc2f
--- /dev/null
+++ b/sysdeps/riscv/rvv/memcmp.S
@@ -0,0 +1,74 @@
+/* RVV versions memcmp. RISC-V version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#include
+#include
+
+#define result a0
+
+#define src1 a0
+#define src2 a1
+#define num a2
+
+#define ivl a3
+#define temp a4
+#define temp1 a5
+#define temp2 a6
+
+#define ELEM_LMUL_SETTING m8
+#define vdata1 v0
+#define vdata2 v8
+#define vmask v16
+
+ENTRY(memcmp)
+
+L(loop):
+    vsetvli ivl, num, e8, ELEM_LMUL_SETTING, ta, ma
+
+    vle8.v vdata1, (src1)
+    vle8.v vdata2, (src2)
+
+    vmsne.vv vmask, vdata1, vdata2
+    sub num, num, ivl
+    vfirst.m temp, vmask
+
+    /* Skip the loop if we find the different value between src1 and src2. */
+    bgez temp, L(found)
+
+    add src1, src1, ivl
+    add src2, src2, ivl
+
+    bnez num, L(loop)
+
+    li result, 0
+    ret
+
+L(found):
+    add src1, src1, temp
+    add src2, src2, temp
+    lbu temp1, 0(src1)
+    lbu temp2, 0(src2)
+    sub result, temp1, temp2
+    ret
+
+END(memcmp)
+libc_hidden_builtin_def (memcmp)
+weak_alias (memcmp,bcmp)
+strong_alias (memcmp, __memcmpeq)
+libc_hidden_def (__memcmpeq)
+
diff --git a/sysdeps/riscv/rvv/memcpy.S b/sysdeps/riscv/rvv/memcpy.S
new file mode 100644
index 0000000000..982c128370
--- /dev/null
+++ b/sysdeps/riscv/rvv/memcpy.S
@@ -0,0 +1,50 @@
+/* RVV versions memcpy. RISC-V version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#include
+#include
+
+#define dst a0
+#define src a1
+#define num a2
+
+#define ivl a3
+#define dst_ptr a4
+
+#define ELEM_LMUL_SETTING m8
+#define vdata v0
+
+ENTRY(memcpy)
+
+    mv dst_ptr, dst
+
+L(loop):
+    vsetvli ivl, num, e8, ELEM_LMUL_SETTING, ta, ma
+
+    vle8.v vdata, (src)
+    sub num, num, ivl
+    add src, src, ivl
+    vse8.v vdata, (dst_ptr)
+    add dst_ptr, dst_ptr, ivl
+
+    bnez num, L(loop)
+
+    ret
+
+END(memcpy)
+libc_hidden_builtin_def (memcpy)
diff --git a/sysdeps/riscv/rvv/memmove.S b/sysdeps/riscv/rvv/memmove.S
new file mode 100644
index 0000000000..492c0b65f7
--- /dev/null
+++ b/sysdeps/riscv/rvv/memmove.S
@@ -0,0 +1,71 @@
+/* RVV versions memmove. RISC-V version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#include
+#include
+
+#define dst a0
+#define src a1
+#define num a2
+
+#define ivl a3
+#define dst_ptr a4
+#define src_backward_ptr a5
+#define dst_backward_ptr a6
+
+#define ELEM_LMUL_SETTING m8
+#define vdata v0
+
+ENTRY(memmove)
+
+    mv dst_ptr, dst
+
+    /* If src is equal to or after dst, all data in src will be loaded before
+       it is overwritten in the overlapping case. We can use the faster `forward-copy`. */
+    bgeu src, dst, L(forward_copy_loop)
+    add src_backward_ptr, src, num
+    add dst_backward_ptr, dst, num
+    /* If dst is inside the source data range, we need to use `backward_copy_loop`
+       to handle the overlap. */
+    bltu dst, src_backward_ptr, L(backward_copy_loop)
+
+L(forward_copy_loop):
+    vsetvli ivl, num, e8, ELEM_LMUL_SETTING, ta, ma
+
+    vle8.v vdata, (src)
+    sub num, num, ivl
+    add src, src, ivl
+    vse8.v vdata, (dst_ptr)
+    add dst_ptr, dst_ptr, ivl
+
+    bnez num, L(forward_copy_loop)
+    ret
+
+L(backward_copy_loop):
+    vsetvli ivl, num, e8, ELEM_LMUL_SETTING, ta, ma
+
+    sub src_backward_ptr, src_backward_ptr, ivl
+    vle8.v vdata, (src_backward_ptr)
+    sub num, num, ivl
+    sub dst_backward_ptr, dst_backward_ptr, ivl
+    vse8.v vdata, (dst_backward_ptr)
+    bnez num, L(backward_copy_loop)
+    ret
+
+END(memmove)
+libc_hidden_builtin_def (memmove)
diff --git a/sysdeps/riscv/rvv/memset.S b/sysdeps/riscv/rvv/memset.S
new file mode 100644
index 0000000000..ac3f88e492
--- /dev/null
+++ b/sysdeps/riscv/rvv/memset.S
@@ -0,0 +1,49 @@
+/* RVV versions memset. RISC-V version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#include
+#include
+
+#define dst a0
+#define value a1
+#define num a2
+
+#define ivl a3
+#define dst_ptr a5
+
+#define ELEM_LMUL_SETTING m8
+#define vdata v0
+
+ENTRY(memset)
+
+    mv dst_ptr, dst
+
+    vsetvli ivl, num, e8, ELEM_LMUL_SETTING, ta, ma
+    vmv.v.x vdata, value
+
+L(loop):
+    vse8.v vdata, (dst_ptr)
+    sub num, num, ivl
+    add dst_ptr, dst_ptr, ivl
+    vsetvli ivl, num, e8, ELEM_LMUL_SETTING, ta, ma
+    bnez num, L(loop)
+
+    ret
+
+END(memset)
+libc_hidden_builtin_def (memset)
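
Note for reviewers (not part of the patch): the overlap handling in
memmove.S can be checked without RVV hardware by modelling the direction
choice in plain C. The sketch below is only an illustration under stated
assumptions. The names model_vsetvli, model_copy_chunk, model_memmove and
MODEL_VLMAX are hypothetical, and the fixed 16-byte chunk merely stands in
for whatever vl the hardware would return from vsetvli.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for VLMAX at e8/m8; real hardware picks its own.  */
#define MODEL_VLMAX 16

/* Model of "vsetvli ivl, num, ...": grant at most MODEL_VLMAX bytes.  */
static size_t
model_vsetvli (size_t avl)
{
  return avl < MODEL_VLMAX ? avl : MODEL_VLMAX;
}

/* Model of vle8.v followed by vse8.v: the whole chunk is loaded into a
   temporary before any byte of it is stored.  */
static void
model_copy_chunk (unsigned char *d, const unsigned char *s, size_t vl)
{
  unsigned char vreg[MODEL_VLMAX];
  memcpy (vreg, s, vl);
  memcpy (d, vreg, vl);
}

void *
model_memmove (void *dst, const void *src, size_t num)
{
  unsigned char *d = dst;
  const unsigned char *s = src;

  /* Same test as the bgeu/bltu pair in the patch: copy backward only
     when src < dst < src + num.  */
  if ((uintptr_t) s < (uintptr_t) d && (uintptr_t) d < (uintptr_t) s + num)
    {
      while (num != 0)
        {
          size_t vl = model_vsetvli (num);
          num -= vl;	/* Both backward pointers step down by vl.  */
          model_copy_chunk (d + num, s + num, vl);
        }
      return dst;
    }

  /* Forward copy, used for src >= dst and for non-overlapping buffers.  */
  while (num != 0)
    {
      size_t vl = model_vsetvli (num);
      model_copy_chunk (d, s, vl);
      s += vl;
      d += vl;
      num -= vl;
    }
  return dst;
}
```

Comparing model_memmove against the C library's memmove on overlapping
buffers, with dst both above and below src, exercises both branches. The
model says nothing about performance or about the fault-only-first load
used in memchr.S.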