| Message ID | 20260305061947.31797-1-pincheng.plct@isrc.iscas.ac.cn |
|---|---|
| Headers | From: Pincheng Wang <pincheng.plct@isrc.iscas.ac.cn>; To: newlib@sourceware.org; Cc: pincheng.plct@isrc.iscas.ac.cn, kito.cheng@gmail.com; Subject: [PATCH v3 0/1] riscv: add vectorized memset, memcpy and memmove; Date: Thu, 5 Mar 2026 14:19:46 +0800 |
| Series | riscv: add vectorized memset, memcpy and memmove |
Message
Pincheng Wang
March 5, 2026, 6:19 a.m. UTC
Hi all,

This v3 patch adds RISC-V Vector (RVV) optimized implementations for
memset, memcpy and memmove.

Changes since v2:
- Changed the conditional compilation order, so the vector path will not
  be chosen when optimizing for size.

Changes since v1:
- Switched the conditional compilation macro from __riscv_v to
  __riscv_vector.
- Replaced '.option arch,+v' with '.option arch,+zve32x'.
- Removed an unnecessary unconditional jump instruction in
  memmove-asm.S.
- In memcpy and memmove, when __riscv_misaligned_fast is not defined,
  the destination address is now aligned to SZREG to improve performance
  on systems with slow misaligned accesses.

These implementations use the RVV extension with e8 element size and m8
LMUL, and are conditionally compiled only when __riscv_vector is
defined, ensuring compatibility with non-vector RISC-V systems.

Benchmark results on Spacemit X60 (Muse-pi) and Canaan K230 show
significant improvements.

memcpy: Up to 4.84x on Muse-pi and 4.66x on K230.
memset: Up to 4.31x on Muse-pi and 3.14x on K230.
memmove: Up to 2.87x on Muse-pi and 1.48x on K230.

Comments and suggestions are greatly appreciated. Thank you for your
time and review!

Best regards,
Pincheng Wang

Pincheng Wang (1):
  riscv: add vectorized memset, memcpy and memmove

 newlib/libc/machine/riscv/memcpy-asm.S  | 52 ++++++++++++++
 newlib/libc/machine/riscv/memcpy.c      |  2 +-
 newlib/libc/machine/riscv/memmove-asm.S | 93 +++++++++++++++++++++++++
 newlib/libc/machine/riscv/memmove.c     |  2 +-
 newlib/libc/machine/riscv/memset.S      | 18 +++++
 5 files changed, 165 insertions(+), 2 deletions(-)
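For readers without the patch at hand, the following is a rough, portable C model of the ideas the cover letter describes. It is not the submitted assembly. It sketches three things: a selection order in which a size-optimized build wins over the vector path, a destination-alignment prologue, and a strip-mined copy loop in the style of `vsetvli ..., e8, m8`. Every `model_*` and `MODEL_*` name is hypothetical. The 128-bit VLEN, the use of `sizeof(long)` as a stand-in for SZREG, and the assumption that the size check is spelled `PREFER_SIZE_OVER_SPEED` / `__OPTIMIZE_SIZE__` are guesses made for illustration. The exact way the patch aligns the destination is not stated in the cover letter, so the byte-wise prologue here is only one plausible shape.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed selection order: a size-optimized build is checked first, so the
   vector path is only reachable when not optimizing for size. */
#if defined(PREFER_SIZE_OVER_SPEED) || defined(__OPTIMIZE_SIZE__)
#define MODEL_VARIANT "size-optimized"
#elif defined(__riscv_vector)
#define MODEL_VARIANT "vector"
#else
#define MODEL_VARIANT "scalar"
#endif

#define MODEL_VLEN_BITS 128u      /* hypothetical vector register width */
#define MODEL_SZREG sizeof(long)  /* stand-in for SZREG (4 on RV32, 8 on RV64) */

/* Reports which branch of the selection above this build took. */
const char *model_variant(void) { return MODEL_VARIANT; }

/* Models `vsetvli vl, avl, e8, m8`: VLMAX = LMUL * VLEN / SEW = 8 * VLEN / 8.
   Real hardware may return other legal values when VLMAX < avl < 2 * VLMAX;
   taking the minimum is one permitted behavior. */
size_t model_vsetvl_e8m8(size_t avl)
{
    const size_t vlmax = 8u * MODEL_VLEN_BITS / 8u;
    return avl < vlmax ? avl : vlmax;
}

/* Stands in for one vle8.v / vse8.v pair (or a scalar byte loop). */
static void copy_bytes(unsigned char *d, const unsigned char *s, size_t n)
{
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
}

void *model_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    assert(n == 0 || (d != NULL && s != NULL));

    /* Prologue: copy just enough bytes to bring dst to an SZREG boundary. */
    size_t head = (size_t)(0 - (uintptr_t)d) & (MODEL_SZREG - 1);
    if (head > n)
        head = n;
    copy_bytes(d, s, head);
    d += head;
    s += head;
    n -= head;

    /* Strip-mined main loop: each pass handles at most VLMAX bytes. */
    while (n > 0) {
        size_t vl = model_vsetvl_e8m8(n);
        copy_bytes(d, s, vl);
        d += vl;
        s += vl;
        n -= vl;
    }
    return dst;
}
```

With e8 and m8 the element count per iteration equals VLEN in bits, which is why the model handles 128 bytes per pass under the assumed VLEN. On a real RVV target the prologue would only be needed when `__riscv_misaligned_fast` is not defined, as the cover letter notes.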
Comments
Kito, ping?
Hi all,

Gentle ping on this patch series. Please let me know if you need any
clarifications, reworks, or further testing from my end. :)

Moreover, I’m also ready to upstream more RVV-enabled string functions
for newlib once this series lands, so I’d welcome any early feedback on
the overall direction.

Best regards,
Pincheng Wang
ack, it seems ok, let me merge it tomorrow after local testing passes

Pincheng Wang <pincheng.plct@isrc.iscas.ac.cn> wrote on Thursday, April 16, 2026 at 9:18 PM:
> Gentle ping on this patch series. Please let me know if you need any
> clarifications, reworks, or further testing from my end. :)
> [...]
pushed, thanks :)

Kito Cheng <kito.cheng@gmail.com> wrote on Thursday, April 16, 2026 at 10:26 PM:
> ack, it seems ok, let me merge tomorrow after local testing pass
> [...]