From patchwork Sat Apr 15 11:23:35 2023
X-Patchwork-Submitter: Xi Ruoyao
X-Patchwork-Id: 55686
From: Xi Ruoyao
To: libc-alpha@sourceware.org
Cc: caiyinyu, Wang Xuerui, Adhemerval Zanella Netto, Xi Ruoyao
Subject: [PATCH 0/5] LoongArch: Multiarch string and memory copy routines for unaligned access
Date: Sat, 15 Apr 2023 19:23:35 +0800
Message-Id: <20230415112340.38431-1-xry111@xry111.site>

LoongArch CPUs may or may not support unaligned memory access in
hardware.  Among the LoongArch CPUs launched so far, those branded as
Loongson-3 (for desktops and servers) support it, while those branded
as Loongson-2 (for embedded and industrial applications) do not.  On
Linux, the kernel reports this support through a HWCAP bit, so we can
use ifunc to provide multiarch versions of stpcpy and memcpy that take
advantage of unaligned access on the CPUs that support it.

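To illustrate the selection logic described above, here is a minimal
sketch in plain C.  It is not the code from this series: select_copy,
copy_generic, copy_ual, runtime_hwcap and copy_dispatch are hypothetical
names, a cached function pointer stands in for a real ifunc resolver,
and the fallback value of HWCAP_LOONGARCH_UAL (bit 2) is an assumption
taken from the Linux uapi header for LoongArch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/auxv.h>   /* getauxval, AT_HWCAP (glibc on Linux) */

/* Assumption: fallback value for builds whose headers lack the macro.  */
#ifndef HWCAP_LOONGARCH_UAL
# define HWCAP_LOONGARCH_UAL (1UL << 2)
#endif

typedef void *(*copy_fn) (void *, const void *, size_t);

/* Hypothetical baseline: byte-by-byte copy, safe on any alignment.  */
static void *
copy_generic (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;
  for (size_t i = 0; i < n; i++)
    d[i] = s[i];
  return dst;
}

/* Hypothetical "ual" variant: moves one word at a time regardless of
   alignment.  The fixed-size memcpy calls are the portable idiom for an
   unaligned load/store; compilers typically lower them to single
   instructions where unaligned access is cheap.  */
static void *
copy_ual (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;
  while (n >= sizeof (unsigned long))
    {
      unsigned long w;
      memcpy (&w, s, sizeof w);
      memcpy (d, &w, sizeof w);
      s += sizeof w;
      d += sizeof w;
      n -= sizeof w;
    }
  while (n-- > 0)   /* copy the remaining tail bytes */
    *d++ = *s++;
  return dst;
}

/* Pick an implementation from a HWCAP word.  Taking the word as a
   parameter keeps the decision testable on any machine.  */
static copy_fn
select_copy (unsigned long hwcap)
{
  return (hwcap & HWCAP_LOONGARCH_UAL) ? copy_ual : copy_generic;
}

/* The bit only has this meaning on LoongArch; elsewhere report 0 so
   the generic path is chosen.  */
static unsigned long
runtime_hwcap (void)
{
#if defined (__loongarch__)
  return getauxval (AT_HWCAP);
#else
  return 0;
#endif
}

/* Resolve once on first use, then reuse the cached pointer.  A real
   ifunc resolver would instead run at relocation time.  */
static void *
copy_dispatch (void *dst, const void *src, size_t n)
{
  static copy_fn impl;
  if (impl == NULL)
    impl = select_copy (runtime_hwcap ());
  assert (impl != NULL);
  return impl (dst, src, n);
}
```

Callers would use copy_dispatch exactly like memcpy.  Both variants must
produce identical results; only the access pattern differs, which is why
the choice can be deferred to run time.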
On a Loongson-3A5000HV CPU running at 2.5GHz, "make bench" shows that
these changes improve performance:

- https://www.linuxfromscratch.org/~xry111/loongarch-ual-bench/bench-stpcpy-summary.txt
- https://www.linuxfromscratch.org/~xry111/loongarch-ual-bench/bench-memcpy-summary.txt

Xi Ruoyao (5):
  LoongArch: Add bits/hwcap.h for Linux
  LoongArch: Add LOONGARCH_HAVE_UAL macro
  string: stpcpy.c: Only alias __stpcpy to stpcpy if STPCPY undefined
  LoongArch: Multiarch stpcpy for unaligned access
  LoongArch: Multiarch memcpy for unaligned access

 string/stpcpy.c                               |  3 ++
 sysdeps/loongarch/loongarch-features.h        | 26 ++++++++++
 sysdeps/loongarch/multiarch/Makefile          |  6 +++
 sysdeps/loongarch/multiarch/memcpy-generic.c  | 27 ++++++++++
 sysdeps/loongarch/multiarch/memcpy-ual.c      | 50 +++++++++++++++++++
 sysdeps/loongarch/multiarch/memcpy.c          | 39 +++++++++++++++
 sysdeps/loongarch/multiarch/stpcpy-generic.c  | 25 ++++++++++
 sysdeps/loongarch/multiarch/stpcpy-ual.c      | 43 ++++++++++++++++
 sysdeps/loongarch/multiarch/stpcpy.c          | 37 ++++++++++++++
 .../loongarch/multiarch/wordcopy-ual-inline.c | 31 ++++++++++++
 .../unix/sysv/linux/loongarch/bits/hwcap.h    | 37 ++++++++++++++
 .../sysv/linux/loongarch/loongarch-features.h | 30 +++++++++++
 sysdeps/unix/sysv/linux/loongarch/sysdep.h    |  1 +
 13 files changed, 355 insertions(+)
 create mode 100644 sysdeps/loongarch/loongarch-features.h
 create mode 100644 sysdeps/loongarch/multiarch/Makefile
 create mode 100644 sysdeps/loongarch/multiarch/memcpy-generic.c
 create mode 100644 sysdeps/loongarch/multiarch/memcpy-ual.c
 create mode 100644 sysdeps/loongarch/multiarch/memcpy.c
 create mode 100644 sysdeps/loongarch/multiarch/stpcpy-generic.c
 create mode 100644 sysdeps/loongarch/multiarch/stpcpy-ual.c
 create mode 100644 sysdeps/loongarch/multiarch/stpcpy.c
 create mode 100644 sysdeps/loongarch/multiarch/wordcopy-ual-inline.c
 create mode 100644 sysdeps/unix/sysv/linux/loongarch/bits/hwcap.h
 create mode 100644 sysdeps/unix/sysv/linux/loongarch/loongarch-features.h