From patchwork Wed Mar 1 15:32:43 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Vincent Chen
X-Patchwork-Id: 55541
To: libc-alpha@sourceware.org, palmer@dabbelt.com, darius@bluespec.com, andrew@sifive.com, dj@redhat.com
Cc: jerry.shih@sifive.com, nick.knight@sifive.com, hongrong.hsu@sifive.com, hau.hsu@sifive.com, kito.cheng@sifive.com, vincent.chen@sifive.com, greentime.hu@sifive.com
Subject: [PATCH 0/4] riscv: Vectorized mem*/str* function
Date: Wed, 1 Mar 2023 23:32:43 +0800
Message-Id: <20230301153247.1499566-1-vincent.chen@sifive.com>
X-Mailer: git-send-email 2.25.1
List-Id: Libc-alpha mailing list
X-Patchwork-Original-From: Vincent Chen via Libc-alpha
From: Vincent Chen
Reply-To: Vincent Chen
Errors-To:
libc-alpha-bounces+patchwork=sourceware.org@sourceware.org
Sender: "Libc-alpha"

This patch series proposes implementations of memchr, memcmp, memcpy, memmove, memset, strcat, strchr, strcmp, strcpy, strlen, strncat, strncmp, strncpy and strnlen that leverage the RISC-V V extension (RVV), version 1.0 (https://github.com/riscv/riscv-v-spec/releases/tag/v1.0). These routines are from https://github.com/sifive/sifive-libc, which we agree to contribute to the Free Software Foundation.

With regard to IFUNC, some details concerning `hwcap` are still under discussion in the community. For the purposes of reviewing this patch series, we have temporarily opted to select the RVV implementations at compile time. Once the `hwcap` mechanism is ready, we’ll rebase on it.

These routines assume VLEN is at least 32 bits, as is required by all currently defined vector extensions, and they support arbitrarily large VLEN. All implementations work on both RV32 and RV64 platforms and make no assumptions about page size.

The `mem*` (known-length) routines use LMUL=8 to minimize dynamic code size, while the `str*` (unknown-length) routines use LMUL=1 instead. A larger LMUL would still reduce dynamic code size for the latter routines, but it would also increase the cost of the remainder/tail loop: more data is loaded and more comparisons are performed past the `\0`. This overhead is particularly pronounced for short strings.

Measured performance improvements of the vectorized ("rvv") implementations vs. the existing Glibc ("scalar") implementations are as follows:

memchr:  85% time savings (i.e., if scalar is 100ms, then rvv is 15ms)
memcmp:  55%
memcpy:  88%
memmove: 80%
memset:  88%
strcmp:  85%
strlen:  70%
strcat:  53%
strchr:  85%
strcpy:  70%
strncmp: 90%
strncat: 50%
strncpy: 60%
strnlen: 80%

The above data were collected on a SiFive X280 (FPGA simulation), across a wide range of problem sizes.
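
To make the LMUL trade-off described above concrete, here is a minimal C model of it. This is a sketch under stated assumptions, not code from the patches: the helper names (`vlmax_bytes`, `mem_iterations`, `str_overshoot`) are hypothetical, the element width is fixed at 8 bits, VLEN=128 is only an illustrative value, and the string routine is assumed to process one full, group-aligned vector register group per iteration. The one relation taken from the RVV 1.0 specification is VLMAX = VLEN * LMUL / SEW.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes covered by one vector register group when SEW = 8 bits:
   VLMAX = VLEN * LMUL / SEW (RVV 1.0). */
static size_t vlmax_bytes(size_t vlen_bits, size_t lmul)
{
    /* Only integer LMUL values are modelled here. */
    assert(lmul == 1 || lmul == 2 || lmul == 4 || lmul == 8);
    return vlen_bits * lmul / 8;
}

/* Known-length (mem*) case: number of strip-mining loop iterations
   needed to cover n bytes. Fewer iterations means fewer dynamically
   executed instructions. */
static size_t mem_iterations(size_t n, size_t vlen_bits, size_t lmul)
{
    size_t vlmax = vlmax_bytes(vlen_bits, lmul);
    return (n + vlmax - 1) / vlmax;
}

/* Unknown-length (str*) case, simplified model: bytes loaded and
   compared beyond the terminating '\0' for a string of len characters,
   assuming every iteration handles a whole register group starting at
   a multiple of VLMAX. */
static size_t str_overshoot(size_t len, size_t vlen_bits, size_t lmul)
{
    size_t vlmax = vlmax_bytes(vlen_bits, lmul);
    size_t needed = len + 1; /* characters plus the terminator */
    size_t touched = ((needed + vlmax - 1) / vlmax) * vlmax;
    return touched - needed;
}
```

Under these assumptions, with VLEN=128 a 1000-byte `mem*` operation takes 8 iterations at LMUL=8 versus 63 at LMUL=1, while a 5-character string costs 10 bytes of work past the `\0` at LMUL=1 versus 122 at LMUL=8. Real hardware behaviour will differ in detail; the model only shows why the two families of routines prefer different LMUL settings.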

Jerry Shih (2):
  riscv: vectorized mem* functions
  riscv: vectorized str* functions

Nick Knight (1):
  riscv: vectorized strchr and strnlen functions

Vincent Chen (1):
  riscv: Enabling vectorized mem*/str* functions in build time

 scripts/build-many-glibcs.py   |  10 +++
 sysdeps/riscv/preconfigure     |  19 +++++
 sysdeps/riscv/preconfigure.ac  |  18 +++++
 sysdeps/riscv/rv32/rvv/Implies |   2 +
 sysdeps/riscv/rv64/rvv/Implies |   2 +
 sysdeps/riscv/rvv/memchr.S     |  63 +++++++++++++++
 sysdeps/riscv/rvv/memcmp.S     |  75 ++++++++++++++++++
 sysdeps/riscv/rvv/memcpy.S     |  51 +++++++++++++
 sysdeps/riscv/rvv/memmove.S    |  72 ++++++++++++++++++
 sysdeps/riscv/rvv/memset.S     |  51 +++++++++++++
 sysdeps/riscv/rvv/strcat.S     |  72 ++++++++++++++++++
 sysdeps/riscv/rvv/strchr.S     |  53 +++++++++++++
 sysdeps/riscv/rvv/strcmp.S     | 135 +++++++++++++++++++++++++++++++++
 sysdeps/riscv/rvv/strcpy.S     |  56 ++++++++++++++
 sysdeps/riscv/rvv/strlen.S     |  54 +++++++++++++
 sysdeps/riscv/rvv/strncat.S    |  83 ++++++++++++++++++++
 sysdeps/riscv/rvv/strncmp.S    |  85 +++++++++++++++++++++
 sysdeps/riscv/rvv/strncpy.S    |  86 +++++++++++++++++++++
 sysdeps/riscv/rvv/strnlen.S    |  56 ++++++++++++++
 19 files changed, 1043 insertions(+)
 create mode 100644 sysdeps/riscv/rv32/rvv/Implies
 create mode 100644 sysdeps/riscv/rv64/rvv/Implies
 create mode 100644 sysdeps/riscv/rvv/memchr.S
 create mode 100644 sysdeps/riscv/rvv/memcmp.S
 create mode 100644 sysdeps/riscv/rvv/memcpy.S
 create mode 100644 sysdeps/riscv/rvv/memmove.S
 create mode 100644 sysdeps/riscv/rvv/memset.S
 create mode 100644 sysdeps/riscv/rvv/strcat.S
 create mode 100644 sysdeps/riscv/rvv/strchr.S
 create mode 100644 sysdeps/riscv/rvv/strcmp.S
 create mode 100644 sysdeps/riscv/rvv/strcpy.S
 create mode 100644 sysdeps/riscv/rvv/strlen.S
 create mode 100644 sysdeps/riscv/rvv/strncat.S
 create mode 100644 sysdeps/riscv/rvv/strncmp.S
 create mode 100644 sysdeps/riscv/rvv/strncpy.S
 create mode 100644 sysdeps/riscv/rvv/strnlen.S