From patchwork Mon May 27 11:18:45 2024
X-Patchwork-Submitter: Christoph Müllner
X-Patchwork-Id: 57021
From: Christoph Müllner
To: libc-alpha@sourceware.org, Adhemerval Zanella, Palmer Dabbelt,
 Darius Rad, Andrew Waterman, Philipp Tomsich, Evan Green, DJ Delorie,
 Vineet Gupta, Kito Cheng, Jeff Law
Cc: Christoph Müllner
Subject: [PATCH v2 00/15] RISC-V: Add Zbb-optimized string routines as ifuncs
Date: Mon, 27 May 2024 13:18:45 +0200
Message-ID: <20240527111900.1060546-1-christoph.muellner@vrull.eu>
X-Mailer: git-send-email 2.45.1

Glibc recently got hwprobe() support for RISC-V, which allows querying
available extensions at runtime. On top of that, an optimized memcpy()
routine (for fast unaligned accesses) has been merged, which is built
by recompiling the generic C code with a different compiler flag.
An ifunc resolver then uses hwprobe() to detect which routine should
be run.

This patchset follows this idea: it recompiles the following functions
for Zbb (via function attributes) and enables the existing Zbb/orc.b
optimization in riscv/string-fza.h:
memchr, memrchr, strchrnul, strcmp, strlen, strncmp.
The resulting optimized routines are then selected by the resolver
function if the Zbb extension is present at runtime.
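
As background on the orc.b optimization mentioned above, here is a
minimal portable sketch of what the instruction computes and why it
helps string routines. It is an illustration only, not the code in
riscv/string-fza.h: the names orc_b_emulated, load_le64 and
first_zero_byte_index are hypothetical, and the loop merely emulates
what a single orc.b instruction does on a 64-bit register.

```c
#include <stddef.h>
#include <stdint.h>

/* Portable emulation of Zbb's orc.b on a 64-bit word: every non-zero
   input byte becomes 0xff, every zero byte stays 0x00.  On real Zbb
   hardware this is one instruction; the loop is for illustration.  */
static inline uint64_t
orc_b_emulated (uint64_t x)
{
  uint64_t result = 0;
  for (unsigned int i = 0; i < 8; i++)
    {
      uint64_t byte_mask = (uint64_t) 0xff << (8 * i);
      if ((x & byte_mask) != 0)
        result |= byte_mask;
    }
  return result;
}

/* Hypothetical helper: assemble 8 bytes into a word so that the byte
   at the lowest address ends up in the least significant position,
   independent of host endianness.  */
static inline uint64_t
load_le64 (const unsigned char *p)
{
  uint64_t word = 0;
  for (unsigned int i = 0; i < 8; i++)
    word |= (uint64_t) p[i] << (8 * i);
  return word;
}

/* Return the index (0..7) of the first zero byte in WORD, counting
   from the least significant byte, or 8 if there is none.  After
   inverting the orc.b result, exactly the zero bytes are 0xff, so
   counting trailing zero bits locates the first one.  */
static inline unsigned int
first_zero_byte_index (uint64_t word)
{
  uint64_t zero_bytes = ~orc_b_emulated (word);
  if (zero_bytes == 0)
    return 8;
  return (unsigned int) __builtin_ctzll (zero_bytes) / 8;
}
```

For the 8 bytes "abc\0defg", first_zero_byte_index (load_le64 (...))
yields 3, which is how a word-at-a-time strlen can find the terminator
without a byte loop. __builtin_ctzll is a GCC/Clang builtin.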

To use target function attributes, a few issues had to be resolved:
- The functions above got a mechanism to be compiled with function
  attributes (patches 2-7). Only the routines required for the
  purpose of this patchset have been touched.
- Inlined functions must also get the same function attributes
  (first patch).
- A mechanism was added to explicitly enable the orc.b optimization
  for string functions (patch 8), loosely inspired by
  USE_FFS_BUILTIN.

One of the design questions is whether Zbb represents a broad enough
optimization target. Tests with Zb* extensions showed that no further
code improvements can be achieved with them. Also, most other
extensions likely won't affect the generated code for string routines
(ignoring vector instructions, which are a different topic).
Therefore, Zbb seemed like a sufficient target.

This series was tested by writing a simple test program to invoke the
libc routines (e.g. strcmp) and using a modified QEMU that reports the
emulation of orc.b on stderr. With that, QEMU can be used to test
whether the optimized routines are executed
(-cpu "rv64,zbb=[false,true]").
Further, this series was tested with SPEC CPU 2017 intrate with Zbb
enabled. The function attribute detection mechanism was tested with
GCC 13 and GCC 14.
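
For illustration, a test program along those lines could look roughly
like the sketch below. This is not the program used for the testing
described above; run_string_checks and the function-pointer names are
hypothetical. It only covers the ISO C routines of the series
(memrchr and strchrnul are GNU extensions and would additionally need
_GNU_SOURCE).

```c
#include <stddef.h>
#include <string.h>

/* Calling through volatile function pointers keeps the compiler from
   expanding the calls as builtins, so the libc symbols (and with them
   the ifunc-selected implementations) are actually invoked.  */
static void *(*volatile memchr_p) (const void *, int, size_t) = memchr;
static int (*volatile strcmp_p) (const char *, const char *) = strcmp;
static size_t (*volatile strlen_p) (const char *) = strlen;
static int (*volatile strncmp_p) (const char *, const char *, size_t)
  = strncmp;

/* Hypothetical helper: exercise the routines on small inputs and
   return the number of failed checks (0 means everything passed).  */
int
run_string_checks (void)
{
  static const char text[] = "RISC-V Zbb string routines";
  int failures = 0;

  failures += strlen_p (text) != sizeof text - 1;
  /* 'Z' is the eighth character, i.e. at offset 7.  */
  failures += memchr_p (text, 'Z', sizeof text) != text + 7;
  failures += memchr_p (text, '!', sizeof text) != NULL;
  failures += strcmp_p (text, text) != 0;
  failures += strcmp_p ("abc", "abd") >= 0;
  failures += strncmp_p ("abcX", "abcY", 3) != 0;
  failures += strncmp_p ("abcX", "abcY", 4) >= 0;
  return failures;
}
```

Calling run_string_checks () from main and running the binary once
with zbb=true and once with zbb=false should give identical results.
Which implementation actually ran is only visible with
instrumentation such as the modified QEMU mentioned above.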

Changes in v2:
- Drop "Use .insn directive form for orc.b"
- Introduce use of target function attribute (and all dependencies)
- Introduce detection of target function attribute support
- Make orc.b optimization explicit
- Small cleanups

Christoph Müllner (15):
  cdefs: Add mechanism to add attributes to __always_inline functions
  string/memchr: Add mechanism to set function attributes
  string/memrchr: Add mechanism to set function attributes
  string/strchrnul: Add mechanism to set function attributes
  string/strcmp: Add mechanism to set function attributes
  string/strlen: Add mechanism to set function attributes
  string/strncmp: Add mechanism to set function attributes
  RISC-V: string-fz[a,i].h: Make orc.b optimization explicit
  RISC-V: Add compiler test for Zbb function attribute support
  RISC-V: Add Zbb optimized memchr as ifunc
  RISC-V: Add Zbb optimized memrchr as ifunc
  RISC-V: Add Zbb optimized strchrnul as ifunc
  RISC-V: Add Zbb optimized strcmp as ifunc
  RISC-V: Add Zbb optimized strlen as ifunc
  RISC-V: Add Zbb optimized strncmp as ifunc

 config.h.in                                   |  3 +
 misc/sys/cdefs.h                              |  8 ++-
 string/memchr.c                               |  5 ++
 string/memrchr.c                              |  5 ++
 string/strchrnul.c                            |  5 ++
 string/strcmp.c                               |  8 +++
 string/strlen.c                               |  5 ++
 string/strncmp.c                              |  8 +++
 sysdeps/riscv/configure                       | 27 ++++++++
 sysdeps/riscv/configure.ac                    | 18 +++++
 sysdeps/riscv/multiarch/memchr-generic.c      | 24 +++++++
 sysdeps/riscv/multiarch/memchr-zbb.c          | 23 +++++++
 sysdeps/riscv/multiarch/memrchr-generic.c     | 24 +++++++
 sysdeps/riscv/multiarch/memrchr-zbb.c         | 23 +++++++
 sysdeps/riscv/multiarch/strchrnul-generic.c   | 24 +++++++
 sysdeps/riscv/multiarch/strchrnul-zbb.c       | 23 +++++++
 sysdeps/riscv/multiarch/strcmp-generic.c      | 24 +++++++
 sysdeps/riscv/multiarch/strcmp-zbb.c          | 23 +++++++
 sysdeps/riscv/multiarch/strlen-generic.c      | 24 +++++++
 sysdeps/riscv/multiarch/strlen-zbb.c          | 23 +++++++
 sysdeps/riscv/multiarch/strncmp-generic.c     | 26 +++++++
 sysdeps/riscv/multiarch/strncmp-zbb.c         | 25 +++++++
 sysdeps/riscv/string-fza.h                    | 22 +++++-
 sysdeps/riscv/string-fzi.h                    | 20 +++++-
 .../unix/sysv/linux/riscv/multiarch/Makefile  | 23 +++++++
 .../linux/riscv/multiarch/ifunc-impl-list.c   | 67 +++++++++++++++++--
 .../unix/sysv/linux/riscv/multiarch/memchr.c  | 60 +++++++++++++++++
 .../unix/sysv/linux/riscv/multiarch/memrchr.c | 63 +++++++++++++++++
 .../sysv/linux/riscv/multiarch/strchrnul.c    | 63 +++++++++++++++++
 .../unix/sysv/linux/riscv/multiarch/strcmp.c  | 59 ++++++++++++++++
 .../unix/sysv/linux/riscv/multiarch/strlen.c  | 59 ++++++++++++++++
 .../unix/sysv/linux/riscv/multiarch/strncmp.c | 59 ++++++++++++++++
 32 files changed, 863 insertions(+), 10 deletions(-)
 create mode 100644 sysdeps/riscv/multiarch/memchr-generic.c
 create mode 100644 sysdeps/riscv/multiarch/memchr-zbb.c
 create mode 100644 sysdeps/riscv/multiarch/memrchr-generic.c
 create mode 100644 sysdeps/riscv/multiarch/memrchr-zbb.c
 create mode 100644 sysdeps/riscv/multiarch/strchrnul-generic.c
 create mode 100644 sysdeps/riscv/multiarch/strchrnul-zbb.c
 create mode 100644 sysdeps/riscv/multiarch/strcmp-generic.c
 create mode 100644 sysdeps/riscv/multiarch/strcmp-zbb.c
 create mode 100644 sysdeps/riscv/multiarch/strlen-generic.c
 create mode 100644 sysdeps/riscv/multiarch/strlen-zbb.c
 create mode 100644 sysdeps/riscv/multiarch/strncmp-generic.c
 create mode 100644 sysdeps/riscv/multiarch/strncmp-zbb.c
 create mode 100644 sysdeps/unix/sysv/linux/riscv/multiarch/memchr.c
 create mode 100644 sysdeps/unix/sysv/linux/riscv/multiarch/memrchr.c
 create mode 100644 sysdeps/unix/sysv/linux/riscv/multiarch/strchrnul.c
 create mode 100644 sysdeps/unix/sysv/linux/riscv/multiarch/strcmp.c
 create mode 100644 sysdeps/unix/sysv/linux/riscv/multiarch/strlen.c
 create mode 100644 sysdeps/unix/sysv/linux/riscv/multiarch/strncmp.c