From patchwork Sun Nov 13 23:05:20 2022
X-Patchwork-Submitter: Christoph Müllner
X-Patchwork-Id: 60566
From: Christoph Muellner
To: gcc-patches@gcc.gnu.org, Kito Cheng, Jim Wilson, Palmer Dabbelt,
 Andrew Waterman, Philipp Tomsich, Jeff Law, Vineet Gupta
Cc: Christoph Müllner
Subject: [PATCH 6/7] riscv: Add support for strlen inline expansion
Date: Mon, 14 Nov 2022 00:05:20 +0100
Message-Id: <20221113230521.712693-7-christoph.muellner@vrull.eu>
In-Reply-To: <20221113230521.712693-1-christoph.muellner@vrull.eu>
References: <20221113230521.712693-1-christoph.muellner@vrull.eu>

From: Christoph Müllner

This patch implements the expansion of the strlen builtin for aligned
strings with Zbb instructions (if available), using the following
sequence:

      li      a3,-1
      addi    a4,a0,8
.L2:  ld      a5,0(a0)
      addi    a0,a0,8
      orc.b   a5,a5
      beq     a5,a3,.L2
      not     a5,a5
      ctz     a5,a5
      srli    a5,a5,0x3
      add     a0,a0,a5
      sub     a0,a0,a4

This allows calls to strlen() to be inlined, with optimized code for
determining the length of a string.

gcc/ChangeLog:

        * config/riscv/riscv-protos.h (riscv_expand_strlen): New
        prototype.
        * config/riscv/riscv-string.cc (riscv_emit_unlikely_jump): New
        function.
        (GEN_EMIT_HELPER2): New helper macro.
        (GEN_EMIT_HELPER3): New helper macro.
        (do_load_from_addr): New helper function.
        (riscv_expand_strlen_zbb): New function.
        (riscv_expand_strlen): New function.
        * config/riscv/riscv.md (strlen): Invoke expansion
        functions for strlen.

Signed-off-by: Christoph Müllner
---
 gcc/config/riscv/riscv-protos.h               |   1 +
 gcc/config/riscv/riscv-string.cc              | 149 ++++++++++++++++++
 gcc/config/riscv/riscv.md                     |  28 ++++
 .../gcc.target/riscv/zbb-strlen-unaligned.c   |  13 ++
 gcc/testsuite/gcc.target/riscv/zbb-strlen.c   |  18 +++
 5 files changed, 209 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strlen-unaligned.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strlen.c

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 344515dbaf4..18187e3bd78 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -96,6 +96,7 @@ rtl_opt_pass * make_pass_shorten_memrefs (gcc::context *ctxt);
 
 /* Routines implemented in riscv-string.c.  */
 extern bool riscv_expand_block_move (rtx, rtx, rtx);
+extern bool riscv_expand_strlen (rtx[]);
 
 /* Information about one CPU we know about.  */
 struct riscv_cpu_info {
diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc
index 1137df475be..bf96522b608 100644
--- a/gcc/config/riscv/riscv-string.cc
+++ b/gcc/config/riscv/riscv-string.cc
@@ -38,6 +38,81 @@
 #include "predict.h"
 #include "optabs.h"
 
+/* Emit unlikely jump instruction.  */
+
+static rtx_insn *
+riscv_emit_unlikely_jump (rtx insn)
+{
+  rtx_insn *jump = emit_jump_insn (insn);
+  add_reg_br_prob_note (jump, profile_probability::very_unlikely ());
+  return jump;
+}
+
+/* Emit proper instruction depending on type of dest.  */
+
+#define GEN_EMIT_HELPER2(name)                                  \
+static rtx_insn *                                               \
+do_## name ## 2(rtx dest, rtx src)                              \
+{                                                               \
+  rtx_insn *insn;                                               \
+  if (GET_MODE (dest) == DImode)                                \
+    insn = emit_insn (gen_ ## name ## di2 (dest, src));         \
+  else                                                          \
+    insn = emit_insn (gen_ ## name ## si2 (dest, src));         \
+  return insn;                                                  \
+}
+
+/* Emit proper instruction depending on type of dest.  */
+
+#define GEN_EMIT_HELPER3(name)                                  \
+static rtx_insn *                                               \
+do_## name ## 3(rtx dest, rtx src1, rtx src2)                   \
+{                                                               \
+  rtx_insn *insn;                                               \
+  if (GET_MODE (dest) == DImode)                                \
+    insn = emit_insn (gen_ ## name ## di3 (dest, src1, src2));  \
+  else                                                          \
+    insn = emit_insn (gen_ ## name ## si3 (dest, src1, src2));  \
+  return insn;                                                  \
+}
+
+GEN_EMIT_HELPER3(add) /* do_add3  */
+GEN_EMIT_HELPER3(sub) /* do_sub3  */
+GEN_EMIT_HELPER3(lshr) /* do_lshr3  */
+GEN_EMIT_HELPER2(orcb) /* do_orcb2  */
+GEN_EMIT_HELPER2(one_cmpl) /* do_one_cmpl2  */
+GEN_EMIT_HELPER2(clz) /* do_clz2  */
+GEN_EMIT_HELPER2(ctz) /* do_ctz2  */
+GEN_EMIT_HELPER2(zero_extendqi) /* do_zero_extendqi2  */
+
+/* Helper function to load a byte or a Pmode register.
+
+   MODE is the mode to use for the load (QImode or Pmode).
+   DEST is the destination register for the data.
+   ADDR_REG is the register that holds the address.
+   ADDR is the address expression to load from.
+
+   This function returns an rtx containing the register,
+   where the ADDR is stored.  */
+
+static rtx
+do_load_from_addr (machine_mode mode, rtx dest, rtx addr_reg, rtx addr)
+{
+  rtx mem = gen_rtx_MEM (mode, addr_reg);
+  MEM_COPY_ATTRIBUTES (mem, addr);
+  set_mem_size (mem, GET_MODE_SIZE (mode));
+
+  if (mode == QImode)
+    do_zero_extendqi2 (dest, mem);
+  else if (mode == Pmode)
+    emit_move_insn (dest, mem);
+  else
+    gcc_unreachable ();
+
+  return addr_reg;
+}
+
+
 /* Emit straight-line code to move LENGTH bytes from SRC to DEST.
    Assume that the areas do not overlap.  */
 
@@ -192,3 +267,77 @@ riscv_expand_block_move (rtx dest, rtx src, rtx length)
     }
   return false;
 }
+
+/* If the provided string is aligned, then read XLEN bytes
+   in a loop and use orc.b to find NUL-bytes.  */
+
+static bool
+riscv_expand_strlen_zbb (rtx result, rtx src, rtx align)
+{
+  rtx m1, addr, addr_plus_regsz, word, zeros;
+  rtx loop_label, cond;
+
+  gcc_assert (TARGET_ZBB);
+
+  /* The alignment needs to be known and big enough.  */
+  if (!CONST_INT_P (align) || UINTVAL (align) < GET_MODE_SIZE (Pmode))
+    return false;
+
+  m1 = gen_reg_rtx (Pmode);
+  addr = copy_addr_to_reg (XEXP (src, 0));
+  addr_plus_regsz = gen_reg_rtx (Pmode);
+  word = gen_reg_rtx (Pmode);
+  zeros = gen_reg_rtx (Pmode);
+
+  emit_insn (gen_rtx_SET (m1, constm1_rtx));
+  do_add3 (addr_plus_regsz, addr, GEN_INT (UNITS_PER_WORD));
+
+  loop_label = gen_label_rtx ();
+  emit_label (loop_label);
+
+  /* Load a word and use orc.b to find a zero-byte.  */
+  do_load_from_addr (Pmode, word, addr, src);
+  do_add3 (addr, addr, GEN_INT (UNITS_PER_WORD));
+  do_orcb2 (word, word);
+  cond = gen_rtx_EQ (VOIDmode, word, m1);
+  riscv_emit_unlikely_jump (gen_cbranch4 (Pmode, cond,
+                                          word, m1, loop_label));
+
+  /* Calculate the return value by counting zero-bits.  */
+  do_one_cmpl2 (word, word);
+  if (TARGET_BIG_ENDIAN)
+    do_clz2 (zeros, word);
+  else
+    do_ctz2 (zeros, word);
+
+  do_lshr3 (zeros, zeros, GEN_INT (exact_log2 (BITS_PER_UNIT)));
+  do_add3 (addr, addr, zeros);
+  do_sub3 (result, addr, addr_plus_regsz);
+
+  return true;
+}
+
+/* Expand a strlen operation and return true if successful.
+   Return false if we should let the compiler generate normal
+   code, probably a strlen call.
+
+   OPERANDS[0] is the target (result).
+   OPERANDS[1] is the source.
+   OPERANDS[2] is the search byte (must be 0)
+   OPERANDS[3] is the alignment in bytes.  */
+
+bool
+riscv_expand_strlen (rtx operands[])
+{
+  rtx result = operands[0];
+  rtx src = operands[1];
+  rtx search_char = operands[2];
+  rtx align = operands[3];
+
+  gcc_assert (search_char == const0_rtx);
+
+  if (TARGET_ZBB)
+    return riscv_expand_strlen_zbb (result, src, align);
+
+  return false;
+}
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 43b97f1181e..f05c764c3d4 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -65,6 +65,9 @@ (define_c_enum "unspec" [
 
   ;; OR-COMBINE
   UNSPEC_ORC_B
+
+  ;; ZBB STRLEN
+  UNSPEC_STRLEN
 ])
 
 (define_c_enum "unspecv" [
@@ -3007,6 +3010,31 @@ (define_expand "cpymemsi"
     FAIL;
 })
 
+;; Search character in string (generalization of strlen).
+;; Argument 0 is the resulting offset
+;; Argument 1 is the string
+;; Argument 2 is the search character
+;; Argument 3 is the alignment
+
+(define_expand "strlen"
+  [(set (match_operand:X 0 "register_operand")
+        (unspec:X [(match_operand:BLK 1 "general_operand")
+                   (match_operand:SI 2 "const_int_operand")
+                   (match_operand:SI 3 "const_int_operand")]
+                  UNSPEC_STRLEN))]
+  ""
+{
+  rtx search_char = operands[2];
+
+  if (optimize_insn_for_size_p () || search_char != const0_rtx)
+    FAIL;
+
+  if (riscv_expand_strlen (operands))
+    DONE;
+  else
+    FAIL;
+})
+
 (include "bitmanip.md")
 (include "sync.md")
 (include "peephole.md")
diff --git a/gcc/testsuite/gcc.target/riscv/zbb-strlen-unaligned.c b/gcc/testsuite/gcc.target/riscv/zbb-strlen-unaligned.c
new file mode 100644
index 00000000000..39da70a5021
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zbb-strlen-unaligned.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=rv64gc_zbb -mabi=lp64" } */
+/* { dg-skip-if "" { *-*-* } { "-Os" } } */
+
+typedef long unsigned int size_t;
+
+size_t
+my_str_len (const char *s)
+{
+  return __builtin_strlen (s);
+}
+
+/* { dg-final { scan-assembler-not "orc.b\t" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/zbb-strlen.c b/gcc/testsuite/gcc.target/riscv/zbb-strlen.c
new file mode 100644
index 00000000000..d01b7fc552d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zbb-strlen.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=rv64gc_zbb -mabi=lp64" } */
+/* { dg-skip-if "" { *-*-* } { "-Os" } } */
+
+typedef long unsigned int size_t;
+
+size_t
+my_str_len (const char *s)
+{
+  s = __builtin_assume_aligned (s, 4096);
+  return __builtin_strlen (s);
+}
+
+/* { dg-final { scan-assembler "orc.b\t" } } */
+/* { dg-final { scan-assembler-not "jalr" } } */
+/* { dg-final { scan-assembler-not "call" } } */
+/* { dg-final { scan-assembler-not "jr" } } */
+/* { dg-final { scan-assembler-not "tail" } } */
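
Illustration (not part of the patch): for readers who want to check the
arithmetic of the emitted sequence without a Zbb target, the following is
a rough portable C model of the little-endian path (the ctz variant). All
names here (orc_b, ctz64, load_le64, strlen_orcb_model) are hypothetical
helpers written for this note and do not exist in GCC. The sketch assumes
the string lives in a buffer that is 8-byte aligned and padded to a
multiple of 8 bytes, mirroring the alignment requirement of the expansion.

```c
#include <stddef.h>
#include <stdint.h>

/* Software model of Zbb orc.b: every non-zero byte becomes 0xff,
   every zero byte stays 0x00.  */
static uint64_t orc_b(uint64_t x)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; i++)
        if ((x >> (8 * i)) & 0xff)
            r |= (uint64_t)0xff << (8 * i);
    return r;
}

/* Count trailing zero bits; the caller guarantees x != 0.  */
static unsigned ctz64(uint64_t x)
{
    unsigned n = 0;
    while (!(x & 1)) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Assemble 8 bytes in little-endian order, independent of the host.  */
static uint64_t load_le64(const char *p)
{
    uint64_t w = 0;
    for (int i = 0; i < 8; i++)
        w |= (uint64_t)(unsigned char)p[i] << (8 * i);
    return w;
}

/* Model of the emitted loop.  S must point into a buffer that is
   readable in whole 8-byte words up to and including the NUL.  */
static size_t strlen_orcb_model(const char *s)
{
    const char *p = s;                /* a0 */
    const char *s_plus_8 = s + 8;     /* a4 = a0 + 8 */
    uint64_t word;

    do {
        word = load_le64(p);          /* ld    a5,0(a0) */
        p += 8;                       /* addi  a0,a0,8  */
        word = orc_b(word);           /* orc.b a5,a5    */
    } while (word == UINT64_MAX);     /* beq   a5,a3,.L2 */

    word = ~word;                     /* not   a5,a5 */
    unsigned zeros = ctz64(word);     /* ctz   a5,a5 */
    zeros >>= 3;                      /* srli  a5,a5,0x3 */
    return (size_t)((p + zeros) - s_plus_8);  /* add, sub */
}
```

The pointer has already been advanced past the word containing the NUL
when the loop exits, which is why the result is taken relative to s + 8
rather than s. A big-endian model would count leading zeros instead, as
the TARGET_BIG_ENDIAN branch in riscv_expand_strlen_zbb does.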