From patchwork Wed Aug 2 15:59:03 2023
X-Patchwork-Submitter: Evan Green
X-Patchwork-Id: 73491
From: Evan Green
To: libc-alpha@sourceware.org
Cc: slewis@rivosinc.com, Florian Weimer, palmer@rivosinc.com, vineetg@rivosinc.com, Evan Green
Subject: [PATCH v6 5/5] riscv: Add and use alignment-ignorant memcpy
Date: Wed, 2 Aug 2023 08:59:03 -0700
Message-Id: <20230802155903.2552780-6-evan@rivosinc.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230802155903.2552780-1-evan@rivosinc.com>
References: <20230802155903.2552780-1-evan@rivosinc.com>

For CPU implementations that can perform unaligned accesses with little
or no performance penalty, create a memcpy implementation that does not
bother aligning buffers. It copies using a block of integer registers,
then a single integer register, and finally falls back to a bytewise
copy for the remainder.

Signed-off-by: Evan Green
Reviewed-by: Palmer Dabbelt
---
Changes in v6:
 - Fix a couple regressions in the assembly from v5 :/
 - Use passed hwprobe pointer in memcpy ifunc selector.

Changes in v5:
 - Do unaligned word access for final trailing bytes (Richard)

Changes in v4:
 - Fixed comment style (Florian)

Changes in v3:
 - Word align dest for large memcpy()s.
 - Add tags
 - Remove spurious blank line from sysdeps/riscv/memcpy.c

Changes in v2:
 - Used _MASK instead of _FAST value itself.
---
 sysdeps/riscv/memcopy.h                        |  26 ++++
 sysdeps/riscv/memcpy.c                         |  66 +++++++++
 sysdeps/riscv/memcpy_noalignment.S             | 138 ++++++++++++++++++
 sysdeps/unix/sysv/linux/riscv/Makefile         |   4 +
 .../unix/sysv/linux/riscv/memcpy-generic.c     |  24 +++
 5 files changed, 258 insertions(+)
 create mode 100644 sysdeps/riscv/memcopy.h
 create mode 100644 sysdeps/riscv/memcpy.c
 create mode 100644 sysdeps/riscv/memcpy_noalignment.S
 create mode 100644 sysdeps/unix/sysv/linux/riscv/memcpy-generic.c

diff --git a/sysdeps/riscv/memcopy.h b/sysdeps/riscv/memcopy.h
new file mode 100644
index 0000000000..2b685c8aa0
--- /dev/null
+++ b/sysdeps/riscv/memcopy.h
@@ -0,0 +1,26 @@
+/* memcopy.h -- definitions for memory copy functions.  RISC-V version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include
+
+/* Redefine the generic memcpy implementation to __memcpy_generic, so
+   the memcpy ifunc can select between generic and special versions.
+   In rtld, don't bother with all the ifunciness.  */
+#if IS_IN (libc)
+#define MEMCPY __memcpy_generic
+#endif
diff --git a/sysdeps/riscv/memcpy.c b/sysdeps/riscv/memcpy.c
new file mode 100644
index 0000000000..ecadd96433
--- /dev/null
+++ b/sysdeps/riscv/memcpy.c
@@ -0,0 +1,66 @@
+/* Multiple versions of memcpy.
+   All versions must be listed in ifunc-impl-list.c.
+   Copyright (C) 2017-2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#if IS_IN (libc)
+/* Redefine memcpy so that the compiler won't complain about the type
+   mismatch with the IFUNC selector in strong_alias, below.  */
+# undef memcpy
+# define memcpy __redirect_memcpy
+# include
+# include
+# include
+# include
+# include
+
+# define INIT_ARCH()
+
+extern __typeof (__redirect_memcpy) __libc_memcpy;
+
+extern __typeof (__redirect_memcpy) __memcpy_generic attribute_hidden;
+extern __typeof (__redirect_memcpy) __memcpy_noalignment attribute_hidden;
+
+static inline __typeof (__redirect_memcpy) *
+select_memcpy_ifunc (uint64_t dl_hwcap, __riscv_hwprobe_t hwprobe_func)
+{
+  INIT_ARCH ();
+
+  struct riscv_hwprobe pair;
+
+  pair.key = RISCV_HWPROBE_KEY_CPUPERF_0;
+  if (!hwprobe_func || hwprobe_func(&pair, 1, 0, NULL, 0) != 0)
+    return __memcpy_generic;
+
+  if ((pair.key > 0) &&
+      (pair.value & RISCV_HWPROBE_MISALIGNED_MASK) ==
+      RISCV_HWPROBE_MISALIGNED_FAST)
+    return __memcpy_noalignment;
+
+  return __memcpy_generic;
+}
+
+riscv_libc_ifunc (__libc_memcpy, select_memcpy_ifunc);
+
+# undef memcpy
+strong_alias (__libc_memcpy, memcpy);
+# ifdef SHARED
+__hidden_ver1 (memcpy, __GI_memcpy, __redirect_memcpy)
+  __attribute__ ((visibility ("hidden"))) __attribute_copy__ (memcpy);
+# endif
+
+#endif
diff --git a/sysdeps/riscv/memcpy_noalignment.S b/sysdeps/riscv/memcpy_noalignment.S
new file mode 100644
index 0000000000..f3bf8e5867
--- /dev/null
+++ b/sysdeps/riscv/memcpy_noalignment.S
@@ -0,0 +1,138 @@
+/* memcpy for RISC-V, ignoring buffer alignment
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include
+#include
+
+/* void *memcpy(void *, const void *, size_t) */
+ENTRY (__memcpy_noalignment)
+        move t6, a0     /* Preserve return value */
+
+        /* Bail if 0 */
+        beqz a2, 7f
+
+        /* Jump to byte copy if size < SZREG */
+        li a4, SZREG
+        bltu a2, a4, 5f
+
+        /* Round down to the nearest "page" size */
+        andi a4, a2, ~((16*SZREG)-1)
+        beqz a4, 2f
+        add a3, a1, a4
+
+        /* Copy the first word to get dest word aligned */
+        andi a5, t6, SZREG-1
+        beqz a5, 1f
+        REG_L a6, (a1)
+        REG_S a6, (t6)
+
+        /* Align dst up to a word, move src and size as well.  */
+        addi t6, t6, SZREG-1
+        andi t6, t6, ~(SZREG-1)
+        sub a5, t6, a0
+        add a1, a1, a5
+        sub a2, a2, a5
+
+        /* Recompute page count */
+        andi a4, a2, ~((16*SZREG)-1)
+        beqz a4, 2f
+
+1:
+        /* Copy "pages" (chunks of 16 registers) */
+        REG_L a4, 0(a1)
+        REG_L a5, SZREG(a1)
+        REG_L a6, 2*SZREG(a1)
+        REG_L a7, 3*SZREG(a1)
+        REG_L t0, 4*SZREG(a1)
+        REG_L t1, 5*SZREG(a1)
+        REG_L t2, 6*SZREG(a1)
+        REG_L t3, 7*SZREG(a1)
+        REG_L t4, 8*SZREG(a1)
+        REG_L t5, 9*SZREG(a1)
+        REG_S a4, 0(t6)
+        REG_S a5, SZREG(t6)
+        REG_S a6, 2*SZREG(t6)
+        REG_S a7, 3*SZREG(t6)
+        REG_S t0, 4*SZREG(t6)
+        REG_S t1, 5*SZREG(t6)
+        REG_S t2, 6*SZREG(t6)
+        REG_S t3, 7*SZREG(t6)
+        REG_S t4, 8*SZREG(t6)
+        REG_S t5, 9*SZREG(t6)
+        REG_L a4, 10*SZREG(a1)
+        REG_L a5, 11*SZREG(a1)
+        REG_L a6, 12*SZREG(a1)
+        REG_L a7, 13*SZREG(a1)
+        REG_L t0, 14*SZREG(a1)
+        REG_L t1, 15*SZREG(a1)
+        addi a1, a1, 16*SZREG
+        REG_S a4, 10*SZREG(t6)
+        REG_S a5, 11*SZREG(t6)
+        REG_S a6, 12*SZREG(t6)
+        REG_S a7, 13*SZREG(t6)
+        REG_S t0, 14*SZREG(t6)
+        REG_S t1, 15*SZREG(t6)
+        addi t6, t6, 16*SZREG
+        bltu a1, a3, 1b
+        andi a2, a2, (16*SZREG)-1       /* Update count */
+
+2:
+        /* Remainder is smaller than a page, compute native word count */
+        beqz a2, 7f
+        andi a5, a2, ~(SZREG-1)
+        andi a2, a2, (SZREG-1)
+        add a3, a1, a5
+        /* Jump directly to last word if no words. */
+        beqz a5, 4f
+
+3:
+        /* Use single native register copy */
+        REG_L a4, 0(a1)
+        addi a1, a1, SZREG
+        REG_S a4, 0(t6)
+        addi t6, t6, SZREG
+        bltu a1, a3, 3b
+
+        /* Jump directly out if no more bytes */
+        beqz a2, 7f
+
+4:
+        /* Copy the last word unaligned */
+        add a3, a1, a2
+        add a4, t6, a2
+        REG_L a5, -SZREG(a3)
+        REG_S a5, -SZREG(a4)
+        ret
+
+5:
+        /* Copy bytes when the total copy is < SZREG.  */
+#include
+
+extern __typeof (memcpy) __memcpy_generic;
+hidden_proto(__memcpy_generic)
+
+#include
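
As a side note on the selection logic in select_memcpy_ifunc above, here is a
minimal standalone sketch of the same probe. It is not part of the patch: it
assumes a RISC-V Linux kernel (6.4 or newer) that provides <asm/hwprobe.h> and
the riscv_hwprobe system call, and it issues the probe directly with
syscall(2) instead of going through the hwprobe_func pointer that glibc's
ifunc machinery passes to the selector. The helper name
misaligned_access_is_fast is invented for the example.

/* Sketch: ask the kernel whether misaligned scalar accesses are "fast"
   on this machine, the same question the ifunc selector asks.  */
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <asm/hwprobe.h>

static int
misaligned_access_is_fast (void)
{
  struct riscv_hwprobe pair = { .key = RISCV_HWPROBE_KEY_CPUPERF_0 };

  /* One key/value pair, empty CPU set (cpu_count = 0, cpus = NULL)
     meaning "all CPUs", no flags.  */
  if (syscall (__NR_riscv_hwprobe, &pair, 1, 0, NULL, 0) != 0)
    return 0;                   /* Probe failed: take the safe default.  */

  /* The kernel resets key to -1 if it does not recognize the key.  */
  if (pair.key < 0)
    return 0;

  return (pair.value & RISCV_HWPROBE_MISALIGNED_MASK)
         == RISCV_HWPROBE_MISALIGNED_FAST;
}

int
main (void)
{
  puts (misaligned_access_is_fast ()
        ? "would select __memcpy_noalignment"
        : "would select __memcpy_generic");
  return 0;
}

Compiled natively on a RISC-V Linux box, it prints which implementation the
ifunc would pick on that machine.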
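For readers who prefer C to assembly, the following is a rough C model of the
structure of __memcpy_noalignment only, not code from the patch: 16-word
blocks, then single words, then one final (possibly overlapping) unaligned
word, with a plain byte loop for copies shorter than a word. Word-sized
memcpy calls stand in for the REG_L/REG_S pairs, the destination
word-alignment step the assembly performs for large copies is omitted, and
memcpy_noalignment_model is a made-up name.

#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WORD sizeof (uintptr_t)

static void *
memcpy_noalignment_model (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;

  if (n < WORD)
    {
      /* Byte-at-a-time copy for tiny sizes (labels 5/6 in the assembly).  */
      while (n--)
        *d++ = *s++;
      return dst;
    }

  /* "Pages" of 16 words at a time (the assembly keeps these in 16
     integer registers).  */
  while (n >= 16 * WORD)
    {
      memcpy (d, s, 16 * WORD);     /* stands in for 16 REG_L/REG_S pairs */
      d += 16 * WORD;
      s += 16 * WORD;
      n -= 16 * WORD;
    }

  /* Single native words.  */
  while (n >= WORD)
    {
      memcpy (d, s, WORD);          /* one REG_L/REG_S pair */
      d += WORD;
      s += WORD;
      n -= WORD;
    }

  /* Final word, copied unaligned so it may overlap bytes already
     written; safe because the total copy is at least one word and
     memcpy's buffers do not overlap.  */
  if (n > 0)
    memcpy (d + n - WORD, s + n - WORD, WORD);

  return dst;
}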