| Message ID | 20260529012933.12831-1-pincheng.plct@isrc.iscas.ac.cn |
|---|---|
| State | New |
| Headers |
From: Pincheng Wang <pincheng.plct@isrc.iscas.ac.cn>
To: newlib@sourceware.org
Cc: Pincheng Wang <pincheng.plct@isrc.iscas.ac.cn>
Subject: [PATCH] riscv: avoid vrgather in RVV memcmp mismatch path
Date: Fri, 29 May 2026 09:29:33 +0800
Message-Id: <20260529012933.12831-1-pincheng.plct@isrc.iscas.ac.cn>
X-Mailer: git-send-email 2.39.5
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-Id: Newlib mailing list <newlib.sourceware.org>
List-Archive: <https://sourceware.org/pipermail/newlib/> |
| Series |
riscv: avoid vrgather in RVV memcmp mismatch path
|
|
Commit Message
Pincheng Wang
May 29, 2026, 1:29 a.m. UTC
vfirst.m already returns the byte offset of the first mismatch in the
current vector chunk. Use that offset to reload the two differing bytes
with lbu instead of extracting them with vrgather.vx and vmv.x.s.
The vector gather path can be more expensive on some implementations and
also increases vector register pressure. This keeps the mismatch path
shorter while preserving the memcmp result.
Signed-off-by: Pincheng Wang <pincheng.plct@isrc.iscas.ac.cn>
---
newlib/libc/machine/riscv/memcmp-asm.S | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
Comments
Hi all,

Gentle ping. :)

BR,
Pincheng Wang

On 2026/5/29 9:29, Pincheng Wang wrote:
> vfirst.m already returns the byte offset of the first mismatch in the
> current vector chunk. Use that offset to reload the two differing bytes
> with lbu instead of extracting them with vrgather.vx and vmv.x.s.
>
> The vector gather path can be more expensive on some implementations and
> also increases vector register pressure. This keeps the mismatch path
> shorter while preserving the memcmp result.
>
> Signed-off-by: Pincheng Wang <pincheng.plct@isrc.iscas.ac.cn>
> ---
> newlib/libc/machine/riscv/memcmp-asm.S | 10 ++++------
> 1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/newlib/libc/machine/riscv/memcmp-asm.S b/newlib/libc/machine/riscv/memcmp-asm.S
> index b05df9521..1cf1680e2 100644
> --- a/newlib/libc/machine/riscv/memcmp-asm.S
> +++ b/newlib/libc/machine/riscv/memcmp-asm.S
> @@ -28,12 +28,10 @@ memcmp:
>      li a0, 0
>      ret
>  .Lfound:
> -    vrgather.vx v16, v0, a4
> -    vrgather.vx v24, v8, a4
> -    vmv.x.s a0, v16
> -    vmv.x.s a4, v24
> -    andi a0, a0, 0xff
> -    andi a4, a4, 0xff
> +    add a0, a0, a4
> +    add a1, a1, a4
> +    lbu a0, 0(a0)
> +    lbu a4, 0(a1)
>      sub a0, a0, a4
>      ret
>  .size memcmp, .-memcmp
Ack, the patch looks good to me. I plan to put it in my test queue and then push it. :)

On Thu, Jun 11, 2026 at 10:01 PM, Pincheng Wang <pincheng.plct@isrc.iscas.ac.cn> wrote:
> Hi all,
>
> Gentle ping. :)
>
> BR,
> Pincheng Wang
>
> On 2026/5/29 9:29, Pincheng Wang wrote:
> > vfirst.m already returns the byte offset of the first mismatch in the
> > current vector chunk. Use that offset to reload the two differing bytes
> > with lbu instead of extracting them with vrgather.vx and vmv.x.s.
> >
> > The vector gather path can be more expensive on some implementations and
> > also increases vector register pressure. This keeps the mismatch path
> > shorter while preserving the memcmp result.
> >
> > Signed-off-by: Pincheng Wang <pincheng.plct@isrc.iscas.ac.cn>
> > ---
> > newlib/libc/machine/riscv/memcmp-asm.S | 10 ++++------
> > 1 file changed, 4 insertions(+), 6 deletions(-)
> >
> > diff --git a/newlib/libc/machine/riscv/memcmp-asm.S b/newlib/libc/machine/riscv/memcmp-asm.S
> > index b05df9521..1cf1680e2 100644
> > --- a/newlib/libc/machine/riscv/memcmp-asm.S
> > +++ b/newlib/libc/machine/riscv/memcmp-asm.S
> > @@ -28,12 +28,10 @@ memcmp:
> >      li a0, 0
> >      ret
> >  .Lfound:
> > -    vrgather.vx v16, v0, a4
> > -    vrgather.vx v24, v8, a4
> > -    vmv.x.s a0, v16
> > -    vmv.x.s a4, v24
> > -    andi a0, a0, 0xff
> > -    andi a4, a4, 0xff
> > +    add a0, a0, a4
> > +    add a1, a1, a4
> > +    lbu a0, 0(a0)
> > +    lbu a4, 0(a1)
> >      sub a0, a0, a4
> >      ret
> >  .size memcmp, .-memcmp
diff --git a/newlib/libc/machine/riscv/memcmp-asm.S b/newlib/libc/machine/riscv/memcmp-asm.S
index b05df9521..1cf1680e2 100644
--- a/newlib/libc/machine/riscv/memcmp-asm.S
+++ b/newlib/libc/machine/riscv/memcmp-asm.S
@@ -28,12 +28,10 @@ memcmp:
     li a0, 0
     ret
 .Lfound:
-    vrgather.vx v16, v0, a4
-    vrgather.vx v24, v8, a4
-    vmv.x.s a0, v16
-    vmv.x.s a4, v24
-    andi a0, a0, 0xff
-    andi a4, a4, 0xff
+    add a0, a0, a4
+    add a1, a1, a4
+    lbu a0, 0(a0)
+    lbu a4, 0(a1)
     sub a0, a0, a4
     ret
 .size memcmp, .-memcmp
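
For readers less familiar with RVV, the idea behind the new mismatch path can be modelled in plain C. This is only a rough sketch and not the newlib code: `CHUNK`, `first_mismatch`, and `memcmp_model` are hypothetical names, the fixed chunk size stands in for the vector length, and the scalar loop stands in for the vector compare followed by vfirst.m. It also assumes that the two pointers still address the start of the current chunk when the mismatch is found, which is what the add/lbu sequence relies on.

```c
#include <stddef.h>

/* Hypothetical chunk size standing in for the vector length in bytes. */
#define CHUNK 16

/* Scalar stand-in for the vector compare plus vfirst.m: returns the index
 * of the first differing byte in the chunk, or -1 if all n bytes match. */
static long first_mismatch(const unsigned char *a, const unsigned char *b,
                           size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (a[i] != b[i])
            return (long)i;
    return -1;
}

/* Model of a chunked memcmp whose mismatch path reloads the two bytes
 * from memory instead of extracting them from vector registers. */
static int memcmp_model(const void *s1, const void *s2, size_t n)
{
    const unsigned char *a = s1;
    const unsigned char *b = s2;

    while (n > 0) {
        size_t vl = n < CHUNK ? n : CHUNK;
        long idx = first_mismatch(a, b, vl);

        if (idx >= 0) {
            /* Mismatch path: offset both chunk base pointers by idx and
             * load one unsigned byte from each (the add + lbu pairs),
             * then subtract. Unsigned loads make the old 0xff masking
             * unnecessary. */
            return (int)a[idx] - (int)b[idx];
        }

        /* No mismatch in this chunk: advance to the next one. */
        a += vl;
        b += vl;
        n -= vl;
    }
    return 0;
}
```

Calling `memcmp_model("abc", "abd", 3)` yields a negative value, as the standard memcmp contract requires. The sign of the result depends only on the two reloaded bytes, which is why the reload can replace the gather-based extraction without changing the result.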