From patchwork Sun Nov 13 23:05:15 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: =?utf-8?q?Christoph_M=C3=BCllner?= X-Patchwork-Id: 60563 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 78D453899427 for ; Sun, 13 Nov 2022 23:05:59 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-ej1-x634.google.com (mail-ej1-x634.google.com [IPv6:2a00:1450:4864:20::634]) by sourceware.org (Postfix) with ESMTPS id 360913857365 for ; Sun, 13 Nov 2022 23:05:27 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 360913857365 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=vrull.eu Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=vrull.eu Received: by mail-ej1-x634.google.com with SMTP id 13so24506685ejn.3 for ; Sun, 13 Nov 2022 15:05:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vrull.eu; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=kDf6CpvIiHTXCKB+bGNFUiN2AZZOL3Lt7/xgFFvIVw0=; b=AiVnnOnyLSOGjL+tkaZIXrr9DNy2Pc7TOlV8P5hmG5SpSMIAep5+sCvMSSIF03e6Ur JdBsd51ttc+z/Wocfux0Z8qdtV74RINtiBiNhYgawDfaosdTNmmXbHBuaB2V0mufE5AR 6l5c60nRH78+MwlE7tNkRQne1VtG7YK6nZ2OKpW6XqRu/k/qpM+SLHpDbbUztgq2AUnd 5/j2jhf6mLcINX80xkCd9sKZ6a99iBR5M/BA2aQp65mUs3AzfAald6x3XpwuvtpF0+JU s2Dc922mCdkuqP6bZGe4TwAuRCxN5A2ccot4GIh+fq5Wo8uYML3a2P6UJHx0x6Y8uwql 5zsw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=kDf6CpvIiHTXCKB+bGNFUiN2AZZOL3Lt7/xgFFvIVw0=; 
b=ljWdJSwi6orWywv08mPxw0i59bgNMo3ZuKMSs58NHXFUixcUvAVSR/BStP5SbcyjtW neNwoW2cWQv18WhU2q+YH6KBCwTl44rvMC8uC5epAapXtvW8X6+lDRH9oQ8aIARiYsyn TyWUa+U3v6np8LQZ8LT8Kd+KpjTv6P9oA0fAalekF3A/wS2P84GPTDWh8Y3FphxRKjjC S/IY746MhtGPM3tUSSt1eOY1Lt3lnbDviMNDVz5gmo1E6G5e7dCEoC+ObgGMcDG4+l0G sXq+zEreYcMqN7hVWLviXv23sjPWxX8l66quP3esmO5XhZCJQ5yL3nGtCxgtj5AsFfpO Xb7Q== X-Gm-Message-State: ANoB5pmerw7JC0o4sxvVbz+ANTB1lg6q7UTjcsABhwebhjMqnvkqLus6 fzYzdDr4XPjSWWZHBll9zeUzpf/g6kamjWYR X-Google-Smtp-Source: AA0mqf74RgiScC6Kmq9hxRUSYj/MfyzoN5lNko7THo3SA7KYksHwydWmNhHRuBmsUcOTtv6clhWoKw== X-Received: by 2002:a17:906:f196:b0:78d:6a9b:216c with SMTP id gs22-20020a170906f19600b0078d6a9b216cmr8542359ejb.602.1668380725766; Sun, 13 Nov 2022 15:05:25 -0800 (PST) Received: from beast.fritz.box (62-178-148-172.cable.dynamic.surfer.at. [62.178.148.172]) by smtp.gmail.com with ESMTPSA id ku3-20020a170907788300b007ae21bbdd3fsm2361281ejc.162.2022.11.13.15.05.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Nov 2022 15:05:25 -0800 (PST) From: Christoph Muellner To: gcc-patches@gcc.gnu.org, Kito Cheng , Jim Wilson , Palmer Dabbelt , Andrew Waterman , Philipp Tomsich , Jeff Law , Vineet Gupta Subject: [PATCH 1/7] riscv: bitmanip: add orc.b as an unspec Date: Mon, 14 Nov 2022 00:05:15 +0100 Message-Id: <20221113230521.712693-2-christoph.muellner@vrull.eu> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20221113230521.712693-1-christoph.muellner@vrull.eu> References: <20221113230521.712693-1-christoph.muellner@vrull.eu> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_MANYTO, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" From: Philipp Tomsich As a basis for optimized string functions (e.g., the by-pieces implementations), we need orc.b available. This adds orc.b as an unspec, so we can expand to it. gcc/ChangeLog: * config/riscv/bitmanip.md (orcb2): Add orc.b as an unspec. * config/riscv/riscv.md: Add UNSPEC_ORC_B. Signed-off-by: Philipp Tomsich --- gcc/config/riscv/bitmanip.md | 8 ++++++++ gcc/config/riscv/riscv.md | 3 +++ 2 files changed, 11 insertions(+) diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md index b44fb9517e7..3dbe6002974 100644 --- a/gcc/config/riscv/bitmanip.md +++ b/gcc/config/riscv/bitmanip.md @@ -242,6 +242,14 @@ (define_insn "rotlsi3_sext" "rolw\t%0,%1,%2" [(set_attr "type" "bitmanip")]) +;; orc.b (or-combine) is added as an unspec for the benefit of the support +;; for optimized string functions (such as strcmp). 
+(define_insn "orcb2" + [(set (match_operand:X 0 "register_operand" "=r") + (unspec:X [(match_operand:X 1 "register_operand" "r")] UNSPEC_ORC_B))] + "TARGET_ZBB" + "orc.b\t%0,%1") + (define_insn "bswap2" [(set (match_operand:X 0 "register_operand" "=r") (bswap:X (match_operand:X 1 "register_operand" "r")))] diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md index 798f7370a08..532289dd178 100644 --- a/gcc/config/riscv/riscv.md +++ b/gcc/config/riscv/riscv.md @@ -62,6 +62,9 @@ (define_c_enum "unspec" [ ;; Stack tie UNSPEC_TIE + + ;; OR-COMBINE + UNSPEC_ORC_B ]) (define_c_enum "unspecv" [ From patchwork Sun Nov 13 23:05:16 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Christoph_M=C3=BCllner?= X-Patchwork-Id: 60564 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 20368388A026 for ; Sun, 13 Nov 2022 23:06:33 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-ed1-x52c.google.com (mail-ed1-x52c.google.com [IPv6:2a00:1450:4864:20::52c]) by sourceware.org (Postfix) with ESMTPS id 48C7D3856975 for ; Sun, 13 Nov 2022 23:05:28 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 48C7D3856975 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=vrull.eu Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=vrull.eu Received: by mail-ed1-x52c.google.com with SMTP id a13so14983467edj.0 for ; Sun, 13 Nov 2022 15:05:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vrull.eu; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=bjhtEefrkufDLOq14J9obOBH5W8nvfgMIKxgd2KsfsU=; 
b=A+UIsEmcnYucC0tlylj9bZY/V2kKKvMsRsLrbIfTRq4stEhD14X50svmReLgSY/AAn 7T0YV0NNwiAwp798P7WaBvld9OUvxnJpeO032Yf35cN8S+bwEzAr13GskUYbcH1DY4eI a+dzW2LOfu7evj5fw5S71T/x4VKqIA7z1A8FAOAXNx918LW352kEOUCVeXZqVYlIiuPh K7vS1V17jXGiQ1+ethCYMY4adZj5D4jUcR1O9thVx9w1ba3FXSGYrZ1T6yy6hvRmY+RP ijWXT++okLY02D34gGeGGHjpCXZ/kcYKJlnMkbyeEv1iTEgK1H/1OMiW2VWpnSMKI6CQ Omgg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=bjhtEefrkufDLOq14J9obOBH5W8nvfgMIKxgd2KsfsU=; b=Xpl/f7i1gVXWDEsturAwHRNBUb8TbH2zDJh2cQ2QaJj3/kZjM8xDbqH4RCZAKZnauB 8Uq/2jP7eQnVB/MI71buTOyQ34KUDbguW9g9h/4tNhSPvjEuibMCXbwGnuDPFXZdUuaG H3Wt2+3qPJg8LSQQQnB++gNfOgt2lBwXbuX1XmsNLDWTy1ipGqpAxrPlPggYem5lJ6pc AdLuqXG/hwUYqJ2h16l4P1lFRdGL93vq9GY4BEGwCbKERnmXdICnnMjIFXhItyGjiSeN U8+I+lb2X8rKLh8kS/vyQTM/tlVgC1HLKOtmDjcXKGXIbI1oyLa6USChNBQMeUbDBktR wAbw== X-Gm-Message-State: ANoB5pnBfSvQmx5p9xvZyBRX2oRuXkljOMksxGKEXRhpgXhoT1vKpuax rxUNC50jCb3e9HRuQik9GOsNqkEKG/EY5Two X-Google-Smtp-Source: AA0mqf7w7aKnnfWoRLonfygZh5Iov9xvPX0YK+hWSnilNp4nECk7YtBdpynLX3kJ9QYGs/+mwtyt2w== X-Received: by 2002:aa7:d38b:0:b0:467:71de:fe10 with SMTP id x11-20020aa7d38b000000b0046771defe10mr7854943edq.63.1668380726814; Sun, 13 Nov 2022 15:05:26 -0800 (PST) Received: from beast.fritz.box (62-178-148-172.cable.dynamic.surfer.at. 
[62.178.148.172]) by smtp.gmail.com with ESMTPSA id ku3-20020a170907788300b007ae21bbdd3fsm2361281ejc.162.2022.11.13.15.05.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Nov 2022 15:05:26 -0800 (PST) From: Christoph Muellner To: gcc-patches@gcc.gnu.org, Kito Cheng , Jim Wilson , Palmer Dabbelt , Andrew Waterman , Philipp Tomsich , Jeff Law , Vineet Gupta Cc: =?utf-8?q?Christoph_M=C3=BCllner?= Subject: [PATCH 2/7] riscv: bitmanip/zbb: Add prefix/postfix and enable visiblity Date: Mon, 14 Nov 2022 00:05:16 +0100 Message-Id: <20221113230521.712693-3-christoph.muellner@vrull.eu> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20221113230521.712693-1-christoph.muellner@vrull.eu> References: <20221113230521.712693-1-christoph.muellner@vrull.eu> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_MANYTO, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" From: Christoph Müllner INSNs are usually postfixed by a number representing the argument count. Given the instructions will be used in a later commit, let's make them visible, but add a "riscv_" prefix to avoid conflicts with standard INSNs. gcc/ChangeLog: * config/riscv/bitmanip.md (*_not): Rename INSN. (riscv__not3): Rename INSN. (*xor_not): Rename INSN. (xor_not3): Rename INSN. 
Signed-off-by: Christoph Müllner --- gcc/config/riscv/bitmanip.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md index 3dbe6002974..d6d94e5cdf8 100644 --- a/gcc/config/riscv/bitmanip.md +++ b/gcc/config/riscv/bitmanip.md @@ -119,7 +119,7 @@ (define_insn "*slliuw" ;; ZBB extension. -(define_insn "*_not" +(define_insn "riscv__not3" [(set (match_operand:X 0 "register_operand" "=r") (bitmanip_bitwise:X (not:X (match_operand:X 1 "register_operand" "r")) (match_operand:X 2 "register_operand" "r")))] @@ -128,7 +128,7 @@ (define_insn "*_not" [(set_attr "type" "bitmanip") (set_attr "mode" "")]) -(define_insn "*xor_not" +(define_insn "riscv_xor_not3" [(set (match_operand:X 0 "register_operand" "=r") (not:X (xor:X (match_operand:X 1 "register_operand" "r") (match_operand:X 2 "register_operand" "r"))))] From patchwork Sun Nov 13 23:05:18 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Christoph_M=C3=BCllner?= X-Patchwork-Id: 60565 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 5404C3953807 for ; Sun, 13 Nov 2022 23:06:34 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-ej1-x62e.google.com (mail-ej1-x62e.google.com [IPv6:2a00:1450:4864:20::62e]) by sourceware.org (Postfix) with ESMTPS id DAB34386183F for ; Sun, 13 Nov 2022 23:05:30 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org DAB34386183F Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=vrull.eu Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=vrull.eu Received: by mail-ej1-x62e.google.com with SMTP id 13so24506972ejn.3 for ; Sun, 13 Nov 2022 15:05:30 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=vrull.eu; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=MVII97EUAex2exa/LG51nEDdc2kfukU6SsqgvvYX2IM=; b=ihoyISVYj/nnKD4GPpaYCLYTxvHvbW97K/Nw1Qv7vOyNYmeO7vsxfe2K5n4C51MK1H FtBmj1pwUAG0yzVAB0HycRrtBVE+D+h+T0dCAMcyaW//zqB+DEeeSuQHPb4QZBNeVXWJ WEWJMW7cLIorUBWWlEbWakELmo5jSeJqaL8jQUUJcATWwePRQIKvdTh3QzNA2oOEiqmQ FTe3ocAReB07TdDKXeq4tyF8Mhpry5kZHZUcUcxjo3fjWF/zY+7q1LeFzD9nZEKh39hk OqgNhUzy+sMR3v5ekzxzDFPblXSsxL7LCgLPVujTuItAu/5mjdHAToDLqEKVuKidiIsp tFyg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=MVII97EUAex2exa/LG51nEDdc2kfukU6SsqgvvYX2IM=; b=oG0QPV/IrV7p5DE7tahqDVJo6Ncozup4QC7cOD1fIJxEeYPfd6KediOuz3gfPKaQXx 6Lw+bvrxcFJbc111DWoM9PQcSNUJMYZoHvwG/J2C6xffFH2pr8kNUCfnDuWU3CuRc6Lw 8iLzkqAUKHMybLG1LCXB4qV2oyAwg7v6mEiDyJT2hYBnfZ4oVTCbBonra+YHjf3tHcDM t3tobrG3tcNWm3zmH5ktDjcGCauiyXdNTOCXYpVd9wDwzJb0Q3QhlDxYafG8+kD5tkcQ Is1VG84yK7KRNyfMiGxiD027IQK5T6DDWRjSJrK8jJ8NfDmuA8Dpt0UxfxEUws2wu936 uh2Q== X-Gm-Message-State: ANoB5pkFrXH1uAr0nXwTt2t5vp7ZlHwO0z375HrWoh8xQtM2oyRmL50m 7JXalPtkkfxblvND6+tdKYpEX9K9qG+hV/Xm X-Google-Smtp-Source: AA0mqf5FiOxNtKf9PMo9bVaBZNAKrHmG4WymBIbSRoyNbzHsqc5KBY306JKBRYdG3JFYABIa1/kmqg== X-Received: by 2002:a17:907:8b0a:b0:78d:99f2:a94e with SMTP id sz10-20020a1709078b0a00b0078d99f2a94emr8674086ejc.232.1668380729097; Sun, 13 Nov 2022 15:05:29 -0800 (PST) Received: from beast.fritz.box (62-178-148-172.cable.dynamic.surfer.at. 
[62.178.148.172]) by smtp.gmail.com with ESMTPSA id ku3-20020a170907788300b007ae21bbdd3fsm2361281ejc.162.2022.11.13.15.05.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Nov 2022 15:05:28 -0800 (PST) From: Christoph Muellner To: gcc-patches@gcc.gnu.org, Kito Cheng , Jim Wilson , Palmer Dabbelt , Andrew Waterman , Philipp Tomsich , Jeff Law , Vineet Gupta Cc: =?utf-8?q?Christoph_M=C3=BCllner?= Subject: [PATCH 4/7] riscv: Move riscv_block_move_loop to separate file Date: Mon, 14 Nov 2022 00:05:18 +0100 Message-Id: <20221113230521.712693-5-christoph.muellner@vrull.eu> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20221113230521.712693-1-christoph.muellner@vrull.eu> References: <20221113230521.712693-1-christoph.muellner@vrull.eu> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_MANYTO, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches"

From: Christoph Müllner

Let's avoid accumulating too much functionality in a single file, as this makes the code harder to maintain and extend. To prepare for adding more functionality similar to riscv_block_move_loop, let's move it and the related block-move functions to a separate file. This patch makes no functional changes. It does modify a single line of the existing code that check_GNU_style.py complained about (a missing space before an opening parenthesis).

gcc/ChangeLog:

	* config.gcc: Add new object riscv-string.o.
	* config/riscv/riscv-protos.h (riscv_expand_block_move): Remove duplicated prototype and move to new section for riscv-string.cc.
	* config/riscv/riscv.cc (riscv_block_move_straight): Remove function.
	(riscv_adjust_block_mem): Likewise.
	(riscv_block_move_loop): Likewise.
	(riscv_expand_block_move): Likewise.
	* config/riscv/riscv.md (cpymemsi): Move to new section for riscv-string.cc.
	* config/riscv/t-riscv: Add compile rule for riscv-string.o.
	* config/riscv/riscv-string.cc: New file.

Signed-off-by: Christoph Müllner
--- gcc/config.gcc | 3 +- gcc/config/riscv/riscv-protos.h | 5 +- gcc/config/riscv/riscv-string.cc | 194 +++++++++++++++++++++++++++++++ gcc/config/riscv/riscv.cc | 155 ------------------------ gcc/config/riscv/riscv.md | 28 ++--- gcc/config/riscv/t-riscv | 4 + 6 files changed, 218 insertions(+), 171 deletions(-) create mode 100644 gcc/config/riscv/riscv-string.cc diff --git a/gcc/config.gcc b/gcc/config.gcc index b5eda046033..fc9e582e713 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -518,7 +518,8 @@ pru-*-*) ;; riscv*) cpu_type=riscv - extra_objs="riscv-builtins.o riscv-c.o riscv-sr.o riscv-shorten-memrefs.o riscv-selftests.o riscv-v.o" + extra_objs="riscv-builtins.o riscv-c.o riscv-sr.o riscv-shorten-memrefs.o riscv-selftests.o" + extra_objs="${extra_objs} riscv-string.o riscv-v.o" extra_objs="${extra_objs} riscv-vector-builtins.o riscv-vector-builtins-shapes.o riscv-vector-builtins-bases.o" d_target_objs="riscv-d.o" extra_headers="riscv_vector.h" diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index 5a718bb62b4..344515dbaf4 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -62,7 +62,6 @@ extern void riscv_expand_conditional_move (rtx, rtx, rtx, rtx_code, rtx, rtx); #endif extern rtx riscv_legitimize_call_address (rtx); extern void riscv_set_return_address (rtx, rtx); -extern bool riscv_expand_block_move (rtx, rtx, rtx); 
extern rtx riscv_return_addr (int, rtx); extern poly_int64 riscv_initial_elimination_offset (int, int); extern void riscv_expand_prologue (void); @@ -70,7 +69,6 @@ extern void riscv_expand_epilogue (int); extern bool riscv_epilogue_uses (unsigned int); extern bool riscv_can_use_return_insn (void); extern rtx riscv_function_value (const_tree, const_tree, enum machine_mode); -extern bool riscv_expand_block_move (rtx, rtx, rtx); extern bool riscv_store_data_bypass_p (rtx_insn *, rtx_insn *); extern rtx riscv_gen_gpr_save_insn (struct riscv_frame_info *); extern bool riscv_gpr_save_operation_p (rtx); @@ -96,6 +94,9 @@ extern bool riscv_hard_regno_rename_ok (unsigned, unsigned); rtl_opt_pass * make_pass_shorten_memrefs (gcc::context *ctxt); +/* Routines implemented in riscv-string.c. */ +extern bool riscv_expand_block_move (rtx, rtx, rtx); + /* Information about one CPU we know about. */ struct riscv_cpu_info { /* This CPU's canonical name. */ diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc new file mode 100644 index 00000000000..6882f0be269 --- /dev/null +++ b/gcc/config/riscv/riscv-string.cc @@ -0,0 +1,194 @@ +/* Subroutines used to expand string and block move, clear, + compare and other operations for RISC-V. + Copyright (C) 2011-2022 Free Software Foundation, Inc. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published + by the Free Software Foundation; either version 3, or (at your + option) any later version. + + GCC is distributed in the hope that it will be useful, but WITHOUT + ANY WARRANTY; without even the implied warranty of MERCHANTABILITY + or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public + License for more details. + + You should have received a copy of the GNU General Public License + along with GCC; see the file COPYING3. If not see + . 
*/ + +#define IN_TARGET_CODE 1 + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "backend.h" +#include "rtl.h" +#include "tree.h" +#include "memmodel.h" +#include "tm_p.h" +#include "ira.h" +#include "print-tree.h" +#include "varasm.h" +#include "explow.h" +#include "expr.h" +#include "output.h" +#include "target.h" +#include "predict.h" +#include "optabs.h" + +/* Emit straight-line code to move LENGTH bytes from SRC to DEST. + Assume that the areas do not overlap. */ + +static void +riscv_block_move_straight (rtx dest, rtx src, unsigned HOST_WIDE_INT length) +{ + unsigned HOST_WIDE_INT offset, delta; + unsigned HOST_WIDE_INT bits; + int i; + enum machine_mode mode; + rtx *regs; + + bits = MAX (BITS_PER_UNIT, + MIN (BITS_PER_WORD, MIN (MEM_ALIGN (src), MEM_ALIGN (dest)))); + + mode = mode_for_size (bits, MODE_INT, 0).require (); + delta = bits / BITS_PER_UNIT; + + /* Allocate a buffer for the temporary registers. */ + regs = XALLOCAVEC (rtx, length / delta); + + /* Load as many BITS-sized chunks as possible. Use a normal load if + the source has enough alignment, otherwise use left/right pairs. */ + for (offset = 0, i = 0; offset + delta <= length; offset += delta, i++) + { + regs[i] = gen_reg_rtx (mode); + riscv_emit_move (regs[i], adjust_address (src, mode, offset)); + } + + /* Copy the chunks to the destination. */ + for (offset = 0, i = 0; offset + delta <= length; offset += delta, i++) + riscv_emit_move (adjust_address (dest, mode, offset), regs[i]); + + /* Mop up any left-over bytes. */ + if (offset < length) + { + src = adjust_address (src, BLKmode, offset); + dest = adjust_address (dest, BLKmode, offset); + move_by_pieces (dest, src, length - offset, + MIN (MEM_ALIGN (src), MEM_ALIGN (dest)), RETURN_BEGIN); + } +} + +/* Helper function for doing a loop-based block operation on memory + reference MEM. Each iteration of the loop will operate on LENGTH + bytes of MEM. 
+ + Create a new base register for use within the loop and point it to + the start of MEM. Create a new memory reference that uses this + register. Store them in *LOOP_REG and *LOOP_MEM respectively. */ + +static void +riscv_adjust_block_mem (rtx mem, unsigned HOST_WIDE_INT length, + rtx *loop_reg, rtx *loop_mem) +{ + *loop_reg = copy_addr_to_reg (XEXP (mem, 0)); + + /* Although the new mem does not refer to a known location, + it does keep up to LENGTH bytes of alignment. */ + *loop_mem = change_address (mem, BLKmode, *loop_reg); + set_mem_align (*loop_mem, MIN (MEM_ALIGN (mem), length * BITS_PER_UNIT)); +} + +/* Move LENGTH bytes from SRC to DEST using a loop that moves BYTES_PER_ITER + bytes at a time. LENGTH must be at least BYTES_PER_ITER. Assume that + the memory regions do not overlap. */ + +static void +riscv_block_move_loop (rtx dest, rtx src, unsigned HOST_WIDE_INT length, + unsigned HOST_WIDE_INT bytes_per_iter) +{ + rtx label, src_reg, dest_reg, final_src, test; + unsigned HOST_WIDE_INT leftover; + + leftover = length % bytes_per_iter; + length -= leftover; + + /* Create registers and memory references for use within the loop. */ + riscv_adjust_block_mem (src, bytes_per_iter, &src_reg, &src); + riscv_adjust_block_mem (dest, bytes_per_iter, &dest_reg, &dest); + + /* Calculate the value that SRC_REG should have after the last iteration + of the loop. */ + final_src = expand_simple_binop (Pmode, PLUS, src_reg, GEN_INT (length), + 0, 0, OPTAB_WIDEN); + + /* Emit the start of the loop. */ + label = gen_label_rtx (); + emit_label (label); + + /* Emit the loop body. */ + riscv_block_move_straight (dest, src, bytes_per_iter); + + /* Move on to the next block. */ + riscv_emit_move (src_reg, plus_constant (Pmode, src_reg, bytes_per_iter)); + riscv_emit_move (dest_reg, plus_constant (Pmode, dest_reg, bytes_per_iter)); + + /* Emit the loop condition. 
*/ + test = gen_rtx_NE (VOIDmode, src_reg, final_src); + emit_jump_insn (gen_cbranch4 (Pmode, test, src_reg, final_src, label)); + + /* Mop up any left-over bytes. */ + if (leftover) + riscv_block_move_straight (dest, src, leftover); + else + emit_insn (gen_nop ()); +} + +/* Expand a cpymemsi instruction, which copies LENGTH bytes from + memory reference SRC to memory reference DEST. */ + +bool +riscv_expand_block_move (rtx dest, rtx src, rtx length) +{ + if (CONST_INT_P (length)) + { + unsigned HOST_WIDE_INT hwi_length = UINTVAL (length); + unsigned HOST_WIDE_INT factor, align; + + align = MIN (MIN (MEM_ALIGN (src), MEM_ALIGN (dest)), BITS_PER_WORD); + factor = BITS_PER_WORD / align; + + if (optimize_function_for_size_p (cfun) + && hwi_length * factor * UNITS_PER_WORD > MOVE_RATIO (false)) + return false; + + if (hwi_length <= (RISCV_MAX_MOVE_BYTES_STRAIGHT / factor)) + { + riscv_block_move_straight (dest, src, INTVAL (length)); + return true; + } + else if (optimize && align >= BITS_PER_WORD) + { + unsigned min_iter_words + = RISCV_MAX_MOVE_BYTES_PER_LOOP_ITER / UNITS_PER_WORD; + unsigned iter_words = min_iter_words; + unsigned HOST_WIDE_INT bytes = hwi_length; + unsigned HOST_WIDE_INT words = bytes / UNITS_PER_WORD; + + /* Lengthen the loop body if it shortens the tail. */ + for (unsigned i = min_iter_words; i < min_iter_words * 2 - 1; i++) + { + unsigned cur_cost = iter_words + words % iter_words; + unsigned new_cost = i + words % i; + if (new_cost <= cur_cost) + iter_words = i; + } + + riscv_block_move_loop (dest, src, bytes, iter_words * UNITS_PER_WORD); + return true; + } + } + return false; +} diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index 7357cf51cdf..fab40c6f8dc 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -3949,161 +3949,6 @@ riscv_legitimize_call_address (rtx addr) return addr; } -/* Emit straight-line code to move LENGTH bytes from SRC to DEST. - Assume that the areas do not overlap. 
*/ - -static void -riscv_block_move_straight (rtx dest, rtx src, unsigned HOST_WIDE_INT length) -{ - unsigned HOST_WIDE_INT offset, delta; - unsigned HOST_WIDE_INT bits; - int i; - enum machine_mode mode; - rtx *regs; - - bits = MAX (BITS_PER_UNIT, - MIN (BITS_PER_WORD, MIN (MEM_ALIGN (src), MEM_ALIGN (dest)))); - - mode = mode_for_size (bits, MODE_INT, 0).require (); - delta = bits / BITS_PER_UNIT; - - /* Allocate a buffer for the temporary registers. */ - regs = XALLOCAVEC (rtx, length / delta); - - /* Load as many BITS-sized chunks as possible. Use a normal load if - the source has enough alignment, otherwise use left/right pairs. */ - for (offset = 0, i = 0; offset + delta <= length; offset += delta, i++) - { - regs[i] = gen_reg_rtx (mode); - riscv_emit_move (regs[i], adjust_address (src, mode, offset)); - } - - /* Copy the chunks to the destination. */ - for (offset = 0, i = 0; offset + delta <= length; offset += delta, i++) - riscv_emit_move (adjust_address (dest, mode, offset), regs[i]); - - /* Mop up any left-over bytes. */ - if (offset < length) - { - src = adjust_address (src, BLKmode, offset); - dest = adjust_address (dest, BLKmode, offset); - move_by_pieces (dest, src, length - offset, - MIN (MEM_ALIGN (src), MEM_ALIGN (dest)), RETURN_BEGIN); - } -} - -/* Helper function for doing a loop-based block operation on memory - reference MEM. Each iteration of the loop will operate on LENGTH - bytes of MEM. - - Create a new base register for use within the loop and point it to - the start of MEM. Create a new memory reference that uses this - register. Store them in *LOOP_REG and *LOOP_MEM respectively. */ - -static void -riscv_adjust_block_mem (rtx mem, unsigned HOST_WIDE_INT length, - rtx *loop_reg, rtx *loop_mem) -{ - *loop_reg = copy_addr_to_reg (XEXP (mem, 0)); - - /* Although the new mem does not refer to a known location, - it does keep up to LENGTH bytes of alignment. 
*/ - *loop_mem = change_address (mem, BLKmode, *loop_reg); - set_mem_align (*loop_mem, MIN (MEM_ALIGN (mem), length * BITS_PER_UNIT)); -} - -/* Move LENGTH bytes from SRC to DEST using a loop that moves BYTES_PER_ITER - bytes at a time. LENGTH must be at least BYTES_PER_ITER. Assume that - the memory regions do not overlap. */ - -static void -riscv_block_move_loop (rtx dest, rtx src, unsigned HOST_WIDE_INT length, - unsigned HOST_WIDE_INT bytes_per_iter) -{ - rtx label, src_reg, dest_reg, final_src, test; - unsigned HOST_WIDE_INT leftover; - - leftover = length % bytes_per_iter; - length -= leftover; - - /* Create registers and memory references for use within the loop. */ - riscv_adjust_block_mem (src, bytes_per_iter, &src_reg, &src); - riscv_adjust_block_mem (dest, bytes_per_iter, &dest_reg, &dest); - - /* Calculate the value that SRC_REG should have after the last iteration - of the loop. */ - final_src = expand_simple_binop (Pmode, PLUS, src_reg, GEN_INT (length), - 0, 0, OPTAB_WIDEN); - - /* Emit the start of the loop. */ - label = gen_label_rtx (); - emit_label (label); - - /* Emit the loop body. */ - riscv_block_move_straight (dest, src, bytes_per_iter); - - /* Move on to the next block. */ - riscv_emit_move (src_reg, plus_constant (Pmode, src_reg, bytes_per_iter)); - riscv_emit_move (dest_reg, plus_constant (Pmode, dest_reg, bytes_per_iter)); - - /* Emit the loop condition. */ - test = gen_rtx_NE (VOIDmode, src_reg, final_src); - emit_jump_insn (gen_cbranch4 (Pmode, test, src_reg, final_src, label)); - - /* Mop up any left-over bytes. */ - if (leftover) - riscv_block_move_straight (dest, src, leftover); - else - emit_insn(gen_nop ()); -} - -/* Expand a cpymemsi instruction, which copies LENGTH bytes from - memory reference SRC to memory reference DEST. 
*/ - -bool -riscv_expand_block_move (rtx dest, rtx src, rtx length) -{ - if (CONST_INT_P (length)) - { - unsigned HOST_WIDE_INT hwi_length = UINTVAL (length); - unsigned HOST_WIDE_INT factor, align; - - align = MIN (MIN (MEM_ALIGN (src), MEM_ALIGN (dest)), BITS_PER_WORD); - factor = BITS_PER_WORD / align; - - if (optimize_function_for_size_p (cfun) - && hwi_length * factor * UNITS_PER_WORD > MOVE_RATIO (false)) - return false; - - if (hwi_length <= (RISCV_MAX_MOVE_BYTES_STRAIGHT / factor)) - { - riscv_block_move_straight (dest, src, INTVAL (length)); - return true; - } - else if (optimize && align >= BITS_PER_WORD) - { - unsigned min_iter_words - = RISCV_MAX_MOVE_BYTES_PER_LOOP_ITER / UNITS_PER_WORD; - unsigned iter_words = min_iter_words; - unsigned HOST_WIDE_INT bytes = hwi_length; - unsigned HOST_WIDE_INT words = bytes / UNITS_PER_WORD; - - /* Lengthen the loop body if it shortens the tail. */ - for (unsigned i = min_iter_words; i < min_iter_words * 2 - 1; i++) - { - unsigned cur_cost = iter_words + words % iter_words; - unsigned new_cost = i + words % i; - if (new_cost <= cur_cost) - iter_words = i; - } - - riscv_block_move_loop (dest, src, bytes, iter_words * UNITS_PER_WORD); - return true; - } - } - return false; -} - /* Print symbolic operand OP, which is part of a HIGH or LO_SUM in context CONTEXT. HI_RELOC indicates a high-part reloc. */ diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md index 532289dd178..43b97f1181e 100644 --- a/gcc/config/riscv/riscv.md +++ b/gcc/config/riscv/riscv.md @@ -1872,19 +1872,6 @@ (define_split DONE; }) -(define_expand "cpymemsi" - [(parallel [(set (match_operand:BLK 0 "general_operand") - (match_operand:BLK 1 "general_operand")) - (use (match_operand:SI 2 "")) - (use (match_operand:SI 3 "const_int_operand"))])] - "" -{ - if (riscv_expand_block_move (operands[0], operands[1], operands[2])) - DONE; - else - FAIL; -}) - ;; Expand in-line code to clear the instruction cache between operand[0] and ;; operand[1]. 
(define_expand "clear_cache" @@ -3005,6 +2992,21 @@ (define_insn "riscv_prefetchi_" "prefetch.i\t%a0" ) +;; Expansions from riscv-string.c + +(define_expand "cpymemsi" + [(parallel [(set (match_operand:BLK 0 "general_operand") + (match_operand:BLK 1 "general_operand")) + (use (match_operand:SI 2 "")) + (use (match_operand:SI 3 "const_int_operand"))])] + "" +{ + if (riscv_expand_block_move (operands[0], operands[1], operands[2])) + DONE; + else + FAIL; +}) + (include "bitmanip.md") (include "sync.md") (include "peephole.md") diff --git a/gcc/config/riscv/t-riscv b/gcc/config/riscv/t-riscv index 7997db3d898..5cb58a74a53 100644 --- a/gcc/config/riscv/t-riscv +++ b/gcc/config/riscv/t-riscv @@ -63,6 +63,10 @@ riscv-selftests.o: $(srcdir)/config/riscv/riscv-selftests.cc $(COMPILE) $< $(POSTCOMPILE) +riscv-string.o: $(srcdir)/config/riscv/riscv-string.cc + $(COMPILE) $< + $(POSTCOMPILE) + riscv-v.o: $(srcdir)/config/riscv/riscv-v.cc $(COMPILE) $< $(POSTCOMPILE) From patchwork Sun Nov 13 23:05:19 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Christoph_M=C3=BCllner?= X-Patchwork-Id: 60561 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id BDB7F388A413 for ; Sun, 13 Nov 2022 23:05:55 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-ed1-x533.google.com (mail-ed1-x533.google.com [IPv6:2a00:1450:4864:20::533]) by sourceware.org (Postfix) with ESMTPS id B0167384F034 for ; Sun, 13 Nov 2022 23:05:31 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org B0167384F034 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=vrull.eu Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=vrull.eu Received: by mail-ed1-x533.google.com with SMTP id 
s12so14906761edd.5 for ; Sun, 13 Nov 2022 15:05:31 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vrull.eu; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=1ZutoYwdyLokk8NkpUbBx6XK3mlzqoA78Ap3nQT675Q=; b=sTO97jEIpdQiU8aOAkFhuFh2aJi0xjSaISwXGHWF2H4WAbSDah87QW74ivqNJzxdJJ XtI4saOL0FFnoBNZzbhlo3XSbrTzQl2nTSmdIroZLkjSIFkauy766j2Pw6S3aXn0P6c0 6Mh0kUI8WB46oWeXBvEMFlt750QVsfsgvsoTBQch6nzPyVJmJrI5lLd+lmVqsN9aP1pE fC2OWyaAmeKqLMqlJV+ViUliltuX9vrkoI9s6rrlJij+OZB1bD/0O9R0aY9nUiiifkkr Kj/Rlmt+xGVPggpeP7gXmtOXKI0vpWzfYpYt1E5aWiHv9zyxEfW6P5LzeFXdRdL2Xu9e I/CQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=1ZutoYwdyLokk8NkpUbBx6XK3mlzqoA78Ap3nQT675Q=; b=cA/VagKLeOwTqpsWsN7DNBRL2iOScwfDaEbflVGxYj3XOcAcen3LRHRdKeuUaU4rn4 cIfrWwoDrfFTPw1hBkoXLYd1A9uYtJceLhlE8KHyB07ACaDKrDAnvFGcqYNHjmugrKWu 7SIHjDLXzKtdtLny6mZxWkrt5fV/0dOg7gqMWESSMvHsNHhjQh99hxaFKs8qyL64f2vo 4B7wcFQ9o12QE6pMnxJbZblemxKl+4S2Ps1RTOw2lBLSdymXAv2++bAR461c0DefM0Cd e2me5t6gVhgD/84gry9MK39P0JZBhJkfqJZxxl8Z2eRRNVhIPTg2xThRiuEtCX5uHBgS COwg== X-Gm-Message-State: ANoB5pl84N1xb+1FuNv5tw2EZhnfW5I1WezluG4aWdpo293tfFHs3xha BJG7UiIbLnjQJ1KkiUQBTGFhUt20DMFLNqm+ X-Google-Smtp-Source: AA0mqf4vR6ZeQ0JG0b+qBMz+T5UgwM9EnAECkgrNibrQF1RE9qyCTPobaa+XHmk9j7mzUp5/2jiKaQ== X-Received: by 2002:a05:6402:528f:b0:464:4a3f:510b with SMTP id en15-20020a056402528f00b004644a3f510bmr9635313edb.222.1668380730190; Sun, 13 Nov 2022 15:05:30 -0800 (PST) Received: from beast.fritz.box (62-178-148-172.cable.dynamic.surfer.at. 
[62.178.148.172]) by smtp.gmail.com with ESMTPSA id ku3-20020a170907788300b007ae21bbdd3fsm2361281ejc.162.2022.11.13.15.05.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Nov 2022 15:05:29 -0800 (PST) From: Christoph Muellner To: gcc-patches@gcc.gnu.org, Kito Cheng , Jim Wilson , Palmer Dabbelt , Andrew Waterman , Philipp Tomsich , Jeff Law , Vineet Gupta Cc: =?utf-8?q?Christoph_M=C3=BCllner?= Subject: [PATCH 5/7] riscv: Use by-pieces to do overlapping accesses in block_move_straight Date: Mon, 14 Nov 2022 00:05:19 +0100 Message-Id: <20221113230521.712693-6-christoph.muellner@vrull.eu> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20221113230521.712693-1-christoph.muellner@vrull.eu> References: <20221113230521.712693-1-christoph.muellner@vrull.eu> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_MANYTO, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches"
From: Christoph Müllner

The current implementation of riscv_block_move_straight() emits a couple
of load-store pairs with maximum width (e.g. 8-byte for RV64).
The remainder is handed over to move_by_pieces(), which emits code based
on target settings like slow_unaligned_access and overlap_op_by_pieces.

move_by_pieces() will emit overlapping memory accesses with maximum
width only if the given length exceeds the size of one access
(e.g. 15 bytes for 8-byte accesses).
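
As a rough illustration (not part of the patch), the size of the
remainder that reaches move_by_pieces() can be modelled with the small
C sketch below.  The helper remainder_after_chunks() and its KEEP
parameter are hypothetical names introduced only for this example:
KEEP == 1 models a loop bound of "offset + delta <= length", and
KEEP == 2 models a bound of "offset + 2 * delta <= length".

```c
/* Hypothetical helper, not a GCC function: return the number of bytes
   left over after copying DELTA-byte chunks for as long as
   OFFSET + KEEP * DELTA <= LENGTH holds.  This mirrors only the loop
   condition of the chunk-copy loops; no code is emitted.  */
unsigned long
remainder_after_chunks (unsigned long length, unsigned long delta,
			unsigned long keep)
{
  unsigned long offset = 0;

  /* Each iteration stands for one full-width load/store pair.  */
  while (offset + keep * delta <= length)
    offset += delta;

  /* Whatever is left would be handed to the by-pieces code.  */
  return length - offset;
}
```

Under these assumptions, remainder_after_chunks (27, 8, 1) yields 3,
which is smaller than one 8-byte access, while
remainder_after_chunks (27, 8, 2) yields 11, which exceeds one access
and so leaves room for an overlapping access.
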

This patch changes riscv_block_move_straight() to preserve a remainder
within the interval [delta..2*delta) instead of [0..delta), so that
overlapping memory accesses may be emitted (if their requirements are
met).

gcc/ChangeLog:

	* config/riscv/riscv-string.cc (riscv_block_move_straight): Adjust
	range for emitted load/store pairs.

Signed-off-by: Christoph Müllner
---
 gcc/config/riscv/riscv-string.cc              |  8 ++++----
 .../gcc.target/riscv/memcpy-overlapping.c     | 19 ++++++++-----------
 2 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc
index 6882f0be269..1137df475be 100644
--- a/gcc/config/riscv/riscv-string.cc
+++ b/gcc/config/riscv/riscv-string.cc
@@ -57,18 +57,18 @@ riscv_block_move_straight (rtx dest, rtx src, unsigned HOST_WIDE_INT length)
   delta = bits / BITS_PER_UNIT;
 
   /* Allocate a buffer for the temporary registers.  */
-  regs = XALLOCAVEC (rtx, length / delta);
+  regs = XALLOCAVEC (rtx, length / delta - 1);
 
   /* Load as many BITS-sized chunks as possible.  Use a normal load if
      the source has enough alignment, otherwise use left/right pairs.  */
-  for (offset = 0, i = 0; offset + delta <= length; offset += delta, i++)
+  for (offset = 0, i = 0; offset + 2 * delta <= length; offset += delta, i++)
     {
       regs[i] = gen_reg_rtx (mode);
       riscv_emit_move (regs[i], adjust_address (src, mode, offset));
     }
 
   /* Copy the chunks to the destination.  */
-  for (offset = 0, i = 0; offset + delta <= length; offset += delta, i++)
+  for (offset = 0, i = 0; offset + 2 * delta <= length; offset += delta, i++)
     riscv_emit_move (adjust_address (dest, mode, offset), regs[i]);
 
   /* Mop up any left-over bytes.
*/ @@ -166,7 +166,7 @@ riscv_expand_block_move (rtx dest, rtx src, rtx length) if (hwi_length <= (RISCV_MAX_MOVE_BYTES_STRAIGHT / factor)) { - riscv_block_move_straight (dest, src, INTVAL (length)); + riscv_block_move_straight (dest, src, hwi_length); return true; } else if (optimize && align >= BITS_PER_WORD) diff --git a/gcc/testsuite/gcc.target/riscv/memcpy-overlapping.c b/gcc/testsuite/gcc.target/riscv/memcpy-overlapping.c index ffb7248bfd1..ef95bfb879b 100644 --- a/gcc/testsuite/gcc.target/riscv/memcpy-overlapping.c +++ b/gcc/testsuite/gcc.target/riscv/memcpy-overlapping.c @@ -25,26 +25,23 @@ COPY_N(15) /* Emits 2x {ld,sd} and 1x {lw,sw}. */ COPY_N(19) -/* Emits 3x ld and 3x sd. */ +/* Emits 3x {ld,sd}. */ COPY_N(23) /* The by-pieces infrastructure handles up to 24 bytes. So the code below is emitted via cpymemsi/block_move_straight. */ -/* Emits 3x {ld,sd} and 1x {lhu,lbu,sh,sb}. */ +/* Emits 3x {ld,sd} and 1x {lw,sw}. */ COPY_N(27) -/* Emits 3x {ld,sd} and 1x {lw,lbu,sw,sb}. */ +/* Emits 4x {ld,sd}. */ COPY_N(29) -/* Emits 3x {ld,sd} and 2x {lw,sw}. */ +/* Emits 4x {ld,sd}. 
*/ COPY_N(31) -/* { dg-final { scan-assembler-times "ld\t" 21 } } */ -/* { dg-final { scan-assembler-times "sd\t" 21 } } */ +/* { dg-final { scan-assembler-times "ld\t" 23 } } */ +/* { dg-final { scan-assembler-times "sd\t" 23 } } */ -/* { dg-final { scan-assembler-times "lw\t" 5 } } */ -/* { dg-final { scan-assembler-times "sw\t" 5 } } */ - -/* { dg-final { scan-assembler-times "lbu\t" 2 } } */ -/* { dg-final { scan-assembler-times "sb\t" 2 } } */ +/* { dg-final { scan-assembler-times "lw\t" 3 } } */ +/* { dg-final { scan-assembler-times "sw\t" 3 } } */ From patchwork Sun Nov 13 23:05:20 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Christoph_M=C3=BCllner?= X-Patchwork-Id: 60566 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 7697A3953835 for ; Sun, 13 Nov 2022 23:06:35 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-ej1-x634.google.com (mail-ej1-x634.google.com [IPv6:2a00:1450:4864:20::634]) by sourceware.org (Postfix) with ESMTPS id D956238515D7 for ; Sun, 13 Nov 2022 23:05:32 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org D956238515D7 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=vrull.eu Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=vrull.eu Received: by mail-ej1-x634.google.com with SMTP id k2so24476296ejr.2 for ; Sun, 13 Nov 2022 15:05:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vrull.eu; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=766OZe45cUEucim0xIwOv0WPrtSuI0J2OaUnLSmIVBo=; b=ih7F7Xt+idSp3dcxHHoNtE/ipsrON32Zay+E1aup7PggsvUcP2ESidq1nkFHFrxcwa 
mKVj8xirM+aJdEObcQZzlMuFYtq7UnVljcuJ3aIbrGTrU0WWDaYHE4CnPv+ujl7f/yVf K0b20u8mfW6r+OK/mpyyVJdXtmbgwnCY9/jCaxFQCiyKVXhJiwsPszKEgXYfwd0tAO0f yzp+sx37haQM/wNd15UFcH0Ie0X4X7mDWQdbNklfiZg1eRMn5BHXiL9BYWBYIZNVJs29 OVq32jjVhKaGXvYNjwKz23G32bEzKzT6MEZbMzBHzpvCYnSZuDTUPWsRLaoNWsd4LVIh 2r9w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=766OZe45cUEucim0xIwOv0WPrtSuI0J2OaUnLSmIVBo=; b=AAdnuBGpjE93UXl9CZshSoJTN+UQIQVY56dbBxYD0XMNW6p+i/LPm+kwUW6Qw14+bZ NZ2y1mo01O844wGgNeZjpyGMIxGLdtd8Z+in1hXUk6gk4+nchEuJ4Q+RyEC9E8RyU5Lg bEwYtIaMlWBmFES2LjQQAP9SiQoA/wERFZGcieHZiHFybUD3eOw0Ca+WIwaUImsAE9js kztc6H6O33RnEeRcn5mLh0gUKW7ppVjShPHsTVZq89jIBwXTHkIwXBCRoSNmLQVT7rll 5W93ljGDRUr3wJZdpSWup6kI5z3vhURl26aKeEakE1/3fgzE2mXeY6wHXaQ/lOpDPClm kLZQ== X-Gm-Message-State: ANoB5pnOteyQ1XbU98UGDs6yqhJ/DteSe7E+V1C71h61NO12IOT6TsQZ K1NR821RJaQUu9N8Kua2BAQChq2ZX78OBkCS X-Google-Smtp-Source: AA0mqf4XAyCGmz991AkfpISFe2XSbZaNTDGw06Gbst3ajo+Yz6U+BUyqNY6VkgKRUKSohc6ji29+QQ== X-Received: by 2002:a17:906:2856:b0:7a9:a59c:4be with SMTP id s22-20020a170906285600b007a9a59c04bemr8554809ejc.556.1668380731290; Sun, 13 Nov 2022 15:05:31 -0800 (PST) Received: from beast.fritz.box (62-178-148-172.cable.dynamic.surfer.at. 
[62.178.148.172]) by smtp.gmail.com with ESMTPSA id ku3-20020a170907788300b007ae21bbdd3fsm2361281ejc.162.2022.11.13.15.05.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Nov 2022 15:05:30 -0800 (PST) From: Christoph Muellner To: gcc-patches@gcc.gnu.org, Kito Cheng , Jim Wilson , Palmer Dabbelt , Andrew Waterman , Philipp Tomsich , Jeff Law , Vineet Gupta Cc: =?utf-8?q?Christoph_M=C3=BCllner?= Subject: [PATCH 6/7] riscv: Add support for strlen inline expansion Date: Mon, 14 Nov 2022 00:05:20 +0100 Message-Id: <20221113230521.712693-7-christoph.muellner@vrull.eu> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20221113230521.712693-1-christoph.muellner@vrull.eu> References: <20221113230521.712693-1-christoph.muellner@vrull.eu> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_MANYTO, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches"
From: Christoph Müllner

This patch implements the expansion of the strlen builtin
using Zbb instructions (if available) for aligned strings
using the following sequence:

      li      a3,-1
      addi    a4,a0,8
.L2:  ld      a5,0(a0)
      addi    a0,a0,8
      orc.b   a5,a5
      beq     a5,a3,6 <.L2>
      not     a5,a5
      ctz     a5,a5
      srli    a5,a5,0x3
      add     a0,a0,a5
      sub     a0,a0,a4

This allows calls to strlen() to be inlined with optimized code for
determining the length of a string.

gcc/ChangeLog:

	* config/riscv/riscv-protos.h (riscv_expand_strlen): New
	prototype.
	* config/riscv/riscv-string.cc (riscv_emit_unlikely_jump): New function.
(GEN_EMIT_HELPER2): New helper macro. (GEN_EMIT_HELPER3): New helper macro. (do_load_from_addr): New helper function. (riscv_expand_strlen_zbb): New function. (riscv_expand_strlen): New function. * config/riscv/riscv.md (strlen): Invoke expansion functions for strlen. Signed-off-by: Christoph Müllner --- gcc/config/riscv/riscv-protos.h | 1 + gcc/config/riscv/riscv-string.cc | 149 ++++++++++++++++++ gcc/config/riscv/riscv.md | 28 ++++ .../gcc.target/riscv/zbb-strlen-unaligned.c | 13 ++ gcc/testsuite/gcc.target/riscv/zbb-strlen.c | 18 +++ 5 files changed, 209 insertions(+) create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strlen-unaligned.c create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strlen.c diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index 344515dbaf4..18187e3bd78 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -96,6 +96,7 @@ rtl_opt_pass * make_pass_shorten_memrefs (gcc::context *ctxt); /* Routines implemented in riscv-string.c. */ extern bool riscv_expand_block_move (rtx, rtx, rtx); +extern bool riscv_expand_strlen (rtx[]); /* Information about one CPU we know about. */ struct riscv_cpu_info { diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc index 1137df475be..bf96522b608 100644 --- a/gcc/config/riscv/riscv-string.cc +++ b/gcc/config/riscv/riscv-string.cc @@ -38,6 +38,81 @@ #include "predict.h" #include "optabs.h" +/* Emit unlikely jump instruction. */ + +static rtx_insn * +riscv_emit_unlikely_jump (rtx insn) +{ + rtx_insn *jump = emit_jump_insn (insn); + add_reg_br_prob_note (jump, profile_probability::very_unlikely ()); + return jump; +} + +/* Emit proper instruction depending on type of dest. 
*/ + +#define GEN_EMIT_HELPER2(name) \ +static rtx_insn * \ +do_## name ## 2(rtx dest, rtx src) \ +{ \ + rtx_insn *insn; \ + if (GET_MODE (dest) == DImode) \ + insn = emit_insn (gen_ ## name ## di2 (dest, src)); \ + else \ + insn = emit_insn (gen_ ## name ## si2 (dest, src)); \ + return insn; \ +} + +/* Emit proper instruction depending on type of dest. */ + +#define GEN_EMIT_HELPER3(name) \ +static rtx_insn * \ +do_## name ## 3(rtx dest, rtx src1, rtx src2) \ +{ \ + rtx_insn *insn; \ + if (GET_MODE (dest) == DImode) \ + insn = emit_insn (gen_ ## name ## di3 (dest, src1, src2)); \ + else \ + insn = emit_insn (gen_ ## name ## si3 (dest, src1, src2)); \ + return insn; \ +} + +GEN_EMIT_HELPER3(add) /* do_add3 */ +GEN_EMIT_HELPER3(sub) /* do_sub3 */ +GEN_EMIT_HELPER3(lshr) /* do_lshr3 */ +GEN_EMIT_HELPER2(orcb) /* do_orcb2 */ +GEN_EMIT_HELPER2(one_cmpl) /* do_one_cmpl2 */ +GEN_EMIT_HELPER2(clz) /* do_clz2 */ +GEN_EMIT_HELPER2(ctz) /* do_ctz2 */ +GEN_EMIT_HELPER2(zero_extendqi) /* do_zero_extendqi2 */ + +/* Helper function to load a byte or a Pmode register. + + MODE is the mode to use for the load (QImode or Pmode). + DEST is the destination register for the data. + ADDR_REG is the register that holds the address. + ADDR is the address expression to load from. + + This function returns an rtx containing the register, + where the ADDR is stored. */ + +static rtx +do_load_from_addr (machine_mode mode, rtx dest, rtx addr_reg, rtx addr) +{ + rtx mem = gen_rtx_MEM (mode, addr_reg); + MEM_COPY_ATTRIBUTES (mem, addr); + set_mem_size (mem, GET_MODE_SIZE (mode)); + + if (mode == QImode) + do_zero_extendqi2 (dest, mem); + else if (mode == Pmode) + emit_move_insn (dest, mem); + else + gcc_unreachable (); + + return addr_reg; +} + + /* Emit straight-line code to move LENGTH bytes from SRC to DEST. Assume that the areas do not overlap. 
*/ @@ -192,3 +267,77 @@ riscv_expand_block_move (rtx dest, rtx src, rtx length) } return false; } + +/* If the provided string is aligned, then read XLEN bytes + in a loop and use orc.b to find NUL-bytes. */ + +static bool +riscv_expand_strlen_zbb (rtx result, rtx src, rtx align) +{ + rtx m1, addr, addr_plus_regsz, word, zeros; + rtx loop_label, cond; + + gcc_assert (TARGET_ZBB); + + /* The alignment needs to be known and big enough. */ + if (!CONST_INT_P (align) || UINTVAL (align) < GET_MODE_SIZE (Pmode)) + return false; + + m1 = gen_reg_rtx (Pmode); + addr = copy_addr_to_reg (XEXP (src, 0)); + addr_plus_regsz = gen_reg_rtx (Pmode); + word = gen_reg_rtx (Pmode); + zeros = gen_reg_rtx (Pmode); + + emit_insn (gen_rtx_SET (m1, constm1_rtx)); + do_add3 (addr_plus_regsz, addr, GEN_INT (UNITS_PER_WORD)); + + loop_label = gen_label_rtx (); + emit_label (loop_label); + + /* Load a word and use orc.b to find a zero-byte. */ + do_load_from_addr (Pmode, word, addr, src); + do_add3 (addr, addr, GEN_INT (UNITS_PER_WORD)); + do_orcb2 (word, word); + cond = gen_rtx_EQ (VOIDmode, word, m1); + riscv_emit_unlikely_jump (gen_cbranch4 (Pmode, cond, + word, m1, loop_label)); + + /* Calculate the return value by counting zero-bits. */ + do_one_cmpl2 (word, word); + if (TARGET_BIG_ENDIAN) + do_clz2 (zeros, word); + else + do_ctz2 (zeros, word); + + do_lshr3 (zeros, zeros, GEN_INT (exact_log2 (BITS_PER_UNIT))); + do_add3 (addr, addr, zeros); + do_sub3 (result, addr, addr_plus_regsz); + + return true; +} + +/* Expand a strlen operation and return true if successful. + Return false if we should let the compiler generate normal + code, probably a strlen call. + + OPERANDS[0] is the target (result). + OPERANDS[1] is the source. + OPERANDS[2] is the search byte (must be 0) + OPERANDS[3] is the alignment in bytes. 
*/ + +bool +riscv_expand_strlen (rtx operands[]) +{ + rtx result = operands[0]; + rtx src = operands[1]; + rtx search_char = operands[2]; + rtx align = operands[3]; + + gcc_assert (search_char == const0_rtx); + + if (TARGET_ZBB) + return riscv_expand_strlen_zbb (result, src, align); + + return false; +} diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md index 43b97f1181e..f05c764c3d4 100644 --- a/gcc/config/riscv/riscv.md +++ b/gcc/config/riscv/riscv.md @@ -65,6 +65,9 @@ (define_c_enum "unspec" [ ;; OR-COMBINE UNSPEC_ORC_B + + ;; ZBB STRLEN + UNSPEC_STRLEN ]) (define_c_enum "unspecv" [ @@ -3007,6 +3010,31 @@ (define_expand "cpymemsi" FAIL; }) +;; Search character in string (generalization of strlen). +;; Argument 0 is the resulting offset +;; Argument 1 is the string +;; Argument 2 is the search character +;; Argument 3 is the alignment + +(define_expand "strlen" + [(set (match_operand:X 0 "register_operand") + (unspec:X [(match_operand:BLK 1 "general_operand") + (match_operand:SI 2 "const_int_operand") + (match_operand:SI 3 "const_int_operand")] + UNSPEC_STRLEN))] + "" +{ + rtx search_char = operands[2]; + + if (optimize_insn_for_size_p () || search_char != const0_rtx) + FAIL; + + if (riscv_expand_strlen (operands)) + DONE; + else + FAIL; +}) + (include "bitmanip.md") (include "sync.md") (include "peephole.md") diff --git a/gcc/testsuite/gcc.target/riscv/zbb-strlen-unaligned.c b/gcc/testsuite/gcc.target/riscv/zbb-strlen-unaligned.c new file mode 100644 index 00000000000..39da70a5021 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/zbb-strlen-unaligned.c @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -march=rv64gc_zbb -mabi=lp64" } */ +/* { dg-skip-if "" { *-*-* } { "-Os" } } */ + +typedef long unsigned int size_t; + +size_t +my_str_len (const char *s) +{ + return __builtin_strlen (s); +} + +/* { dg-final { scan-assembler-not "orc.b\t" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/zbb-strlen.c 
b/gcc/testsuite/gcc.target/riscv/zbb-strlen.c new file mode 100644 index 00000000000..d01b7fc552d --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/zbb-strlen.c @@ -0,0 +1,18 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -march=rv64gc_zbb -mabi=lp64" } */ +/* { dg-skip-if "" { *-*-* } { "-Os" } } */ + +typedef long unsigned int size_t; + +size_t +my_str_len (const char *s) +{ + s = __builtin_assume_aligned (s, 4096); + return __builtin_strlen (s); +} + +/* { dg-final { scan-assembler "orc.b\t" } } */ +/* { dg-final { scan-assembler-not "jalr" } } */ +/* { dg-final { scan-assembler-not "call" } } */ +/* { dg-final { scan-assembler-not "jr" } } */ +/* { dg-final { scan-assembler-not "tail" } } */ From patchwork Sun Nov 13 23:05:21 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Christoph_M=C3=BCllner?= X-Patchwork-Id: 60567 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id E2F013889E01 for ; Sun, 13 Nov 2022 23:07:10 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-ej1-x634.google.com (mail-ej1-x634.google.com [IPv6:2a00:1450:4864:20::634]) by sourceware.org (Postfix) with ESMTPS id 2796E384F00C for ; Sun, 13 Nov 2022 23:05:34 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 2796E384F00C Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=vrull.eu Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=vrull.eu Received: by mail-ej1-x634.google.com with SMTP id t25so24392438ejb.8 for ; Sun, 13 Nov 2022 15:05:34 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vrull.eu; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date 
:message-id:reply-to; bh=NJaE0g4ZEsMhS6Pl/08H9GkVtKqTVvsLy0QFIabwUmw=; b=qnJq2onvKgXWprzqROnoVmAZDD4ep3e3sxATS8kkakPaIdk57ITM9tPUuKlRr5oQD/ 6PcYYEsxMz7zSOIzqenDjRUj5coUIceHgkAg3j8nHPp3sdJ7kO4XvjijLiWfO04b/3T5 HqZaCrbyjbyl6GhvhCbYambApdLv06t4mN0MFl0k9dcsRQUDgdw4XXHfeGroP3inXogm VZbPt89faAU5kUgVw9l0cVVvMYoFp2KogsYpHH43YTQowUShjp1aaklvZlkURgecRSFk I4s2K5aKgVAPZyTPwo/ImmuV7A+AN7MQ1higYvqwuA/+0o1/XXnldVgUnFzRf7WrnVuB ZR+A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=NJaE0g4ZEsMhS6Pl/08H9GkVtKqTVvsLy0QFIabwUmw=; b=GFmBj+bxu350rCEEx+QpFj+7qDISkuofWUScO1AdvcS6PplvzmUaFCV4Lxc93gPOQk DSpYixunCbUxjXaHEaV+Jsuaeze5kz7zM4d9ArRpOKWzKE3AbEEiRGrENPBpQxJ2VDHI YBaXrFr1YEQ4wjoDAglNy3uL8EW5xcyYcRG4EnivEtNwBYpua+646rgGqAtUSdDgS59K +eqZ+6LQZtW+5jbb2hJG5oaXG4KkPDVp8n79oQr9TC2CZSrUIgRza16mIhsYDtg1Khwf DNkgpbbFRo5q8e0ekhmqJLMnm/Ev1tXhGrd4nmIv4XYQ4pB/8uCg8+M2LJ0GSyk9DblF W5Vg== X-Gm-Message-State: ANoB5pkWPI1glJWp+3ORdES7z2Oujf0waUdFf3csk6XaOFQMkJAAZj2L USJEOr7IiPtDDTqZvaWhf6dcpkCu6F9Lb3I7 X-Google-Smtp-Source: AA0mqf4/PnOnksZ2DHDNmxVhXZJ2HK+PkRcjAQCteONJUeY4P8aM1bSC+c5xBJIuCozQUKJl4yBpPQ== X-Received: by 2002:a17:906:d9b:b0:7ae:acea:fca6 with SMTP id m27-20020a1709060d9b00b007aeaceafca6mr8775876eji.150.1668380732379; Sun, 13 Nov 2022 15:05:32 -0800 (PST) Received: from beast.fritz.box (62-178-148-172.cable.dynamic.surfer.at. 
[62.178.148.172]) by smtp.gmail.com with ESMTPSA id ku3-20020a170907788300b007ae21bbdd3fsm2361281ejc.162.2022.11.13.15.05.31 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Nov 2022 15:05:31 -0800 (PST) From: Christoph Muellner To: gcc-patches@gcc.gnu.org, Kito Cheng , Jim Wilson , Palmer Dabbelt , Andrew Waterman , Philipp Tomsich , Jeff Law , Vineet Gupta Cc: =?utf-8?q?Christoph_M=C3=BCllner?= Subject: [PATCH 7/7] riscv: Add support for str(n)cmp inline expansion Date: Mon, 14 Nov 2022 00:05:21 +0100 Message-Id: <20221113230521.712693-8-christoph.muellner@vrull.eu> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20221113230521.712693-1-christoph.muellner@vrull.eu> References: <20221113230521.712693-1-christoph.muellner@vrull.eu> MIME-Version: 1.0 X-Spam-Status: No, score=-11.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_MANYTO, KAM_SHORT, LIKELY_SPAM_BODY, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches"
From: Christoph Müllner

This patch implements expansions for the cmpstrsi and the cmpstrnsi
builtins using Zbb instructions (if available).
This allows calls to strcmp() and strncmp() to be inlined.

The expansion emits a peeled comparison sequence (i.e. a peeled
comparison loop) which compares XLEN bits per step if possible.

The length of the emitted sequence can be controlled by setting the
maximum number of compared bytes (-mstring-compare-inline-limit).

gcc/ChangeLog:

	* config/riscv/riscv-protos.h (riscv_expand_strn_compare): New prototype.
* config/riscv/riscv-string.cc (GEN_EMIT_HELPER3): New helper macros. (GEN_EMIT_HELPER2): New helper macros. (expand_strncmp_zbb_sequence): New function. (riscv_emit_str_compare_zbb): New function. (riscv_expand_strn_compare): New function. * config/riscv/riscv.md (cmpstrnsi): Invoke expansion functions for strn_compare. (cmpstrsi): Invoke expansion functions for strn_compare. * config/riscv/riscv.opt: Add new parameter '-mstring-compare-inline-limit'. Signed-off-by: Christoph Müllner --- gcc/config/riscv/riscv-protos.h | 1 + gcc/config/riscv/riscv-string.cc | 344 ++++++++++++++++++ gcc/config/riscv/riscv.md | 46 +++ gcc/config/riscv/riscv.opt | 5 + .../gcc.target/riscv/zbb-strcmp-unaligned.c | 36 ++ gcc/testsuite/gcc.target/riscv/zbb-strcmp.c | 55 +++ 6 files changed, 487 insertions(+) create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strcmp-unaligned.c create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strcmp.c diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index 18187e3bd78..7f334be333c 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -97,6 +97,7 @@ rtl_opt_pass * make_pass_shorten_memrefs (gcc::context *ctxt); /* Routines implemented in riscv-string.c. */ extern bool riscv_expand_block_move (rtx, rtx, rtx); extern bool riscv_expand_strlen (rtx[]); +extern bool riscv_expand_strn_compare (rtx[], int); /* Information about one CPU we know about. 
*/ struct riscv_cpu_info { diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc index bf96522b608..f157e04ac0c 100644 --- a/gcc/config/riscv/riscv-string.cc +++ b/gcc/config/riscv/riscv-string.cc @@ -84,6 +84,11 @@ GEN_EMIT_HELPER2(one_cmpl) /* do_one_cmpl2 */ GEN_EMIT_HELPER2(clz) /* do_clz2 */ GEN_EMIT_HELPER2(ctz) /* do_ctz2 */ GEN_EMIT_HELPER2(zero_extendqi) /* do_zero_extendqi2 */ +GEN_EMIT_HELPER3(xor) /* do_xor3 */ +GEN_EMIT_HELPER3(ashl) /* do_ashl3 */ +GEN_EMIT_HELPER2(bswap) /* do_bswap2 */ +GEN_EMIT_HELPER3(riscv_ior_not) /* do_riscv_ior_not3 */ +GEN_EMIT_HELPER3(riscv_and_not) /* do_riscv_and_not3 */ /* Helper function to load a byte or a Pmode register. @@ -268,6 +273,345 @@ riscv_expand_block_move (rtx dest, rtx src, rtx length) return false; } +/* Generate the sequence of compares for strcmp/strncmp using zbb instructions. + BYTES_TO_COMPARE is the number of bytes to be compared. + BASE_ALIGN is the smaller of the alignment of the two strings. + ORIG_SRC1 is the unmodified rtx for the first string. + ORIG_SRC2 is the unmodified rtx for the second string. + DATA1 is the register for loading the first string. + DATA2 is the register for loading the second string. + HAS_NUL is the register holding non-NUL bytes for NUL-bytes in the string. + TARGET is the rtx for the result register (SImode) + EQUALITY_COMPARE_REST if set, then we hand over to libc if string matches. + END_LABEL is the location before the calculation of the result value. + FINAL_LABEL is the location after the calculation of the result value. 
*/ + +static void +expand_strncmp_zbb_sequence (unsigned HOST_WIDE_INT bytes_to_compare, + rtx src1, rtx src2, rtx data1, rtx data2, + rtx target, rtx orc, bool equality_compare_rest, + rtx end_label, rtx final_label) +{ + const unsigned HOST_WIDE_INT p_mode_size = GET_MODE_SIZE (Pmode); + rtx src1_addr = force_reg (Pmode, XEXP (src1, 0)); + rtx src2_addr = force_reg (Pmode, XEXP (src2, 0)); + unsigned HOST_WIDE_INT offset = 0; + + rtx m1 = gen_reg_rtx (Pmode); + emit_insn (gen_rtx_SET (m1, constm1_rtx)); + + /* Generate a compare sequence. */ + while (bytes_to_compare > 0) + { + machine_mode load_mode = QImode; + unsigned HOST_WIDE_INT load_mode_size = 1; + if (bytes_to_compare > 1) + { + load_mode = Pmode; + load_mode_size = p_mode_size; + } + unsigned HOST_WIDE_INT cmp_bytes = 0; + + if (bytes_to_compare >= load_mode_size) + cmp_bytes = load_mode_size; + else + cmp_bytes = bytes_to_compare; + + unsigned HOST_WIDE_INT remain = bytes_to_compare - cmp_bytes; + + /* load_mode_size...bytes we will read + cmp_bytes...bytes we will compare (might be less than load_mode_size) + bytes_to_compare...bytes we will compare (incl. cmp_bytes) + remain...bytes left to compare (excl. cmp_bytes) */ + + rtx addr1 = gen_rtx_PLUS (Pmode, src1_addr, GEN_INT (offset)); + rtx addr2 = gen_rtx_PLUS (Pmode, src2_addr, GEN_INT (offset)); + + do_load_from_addr (load_mode, data1, addr1, src1); + do_load_from_addr (load_mode, data2, addr2, src2); + + if (load_mode_size == 1) + { + /* Special case for comparing just single (last) byte. */ + gcc_assert (remain == 0); + + if (!equality_compare_rest) + { + /* Calculate difference and jump to final_label. */ + rtx result = gen_reg_rtx (Pmode); + do_sub3 (result, data1, data2); + emit_insn (gen_movsi (target, gen_lowpart (SImode, result))); + emit_jump_insn (gen_jump (final_label)); + } + else + { + /* Compare both bytes and jump to final_label if not equal. 
*/ + rtx result = gen_reg_rtx (Pmode); + do_sub3 (result, data1, data2); + emit_insn (gen_movsi (target, gen_lowpart (SImode, result))); + /* Check if str1[i] is NULL. */ + rtx cond1 = gen_rtx_EQ (VOIDmode, data1, const0_rtx); + riscv_emit_unlikely_jump (gen_cbranch4 (Pmode, cond1, + data1, const0_rtx, final_label)); + /* Check if str1[i] == str2[i]. */ + rtx cond2 = gen_rtx_NE (VOIDmode, data1, data2); + riscv_emit_unlikely_jump (gen_cbranch4 (Pmode, cond2, + data1, data2, final_label)); + /* Processing will fall through to libc calls. */ + } + } + else + { + /* Eliminate irrelevant data (behind the N-th character). */ + if (bytes_to_compare < p_mode_size) + { + gcc_assert (remain == 0); + /* Set a NUL-byte after the relevant data (behind the string). */ + unsigned long im = 0xffUL; + rtx imask = gen_rtx_CONST_INT (Pmode, im); + rtx m_reg = gen_reg_rtx (Pmode); + emit_insn (gen_rtx_SET (m_reg, imask)); + do_ashl3 (m_reg, m_reg, GEN_INT (cmp_bytes * BITS_PER_UNIT)); + do_riscv_and_not3 (data1, m_reg, data1); + do_riscv_and_not3 (data2, m_reg, data2); + do_orcb2 (orc, data1); + emit_jump_insn (gen_jump (end_label)); + } + else + { + /* Check if data1 contains a NUL character. */ + do_orcb2 (orc, data1); + rtx cond1 = gen_rtx_NE (VOIDmode, orc, m1); + riscv_emit_unlikely_jump (gen_cbranch4 (Pmode, cond1, orc, m1, + end_label)); + + /* Break out if u1 != u2 */ + rtx cond2 = gen_rtx_NE (VOIDmode, data1, data2); + riscv_emit_unlikely_jump (gen_cbranch4 (Pmode, cond2, data1, + data2, end_label)); + + /* Fast-exit for complete and equal strings. */ + if (remain == 0 && !equality_compare_rest) + { + /* All compared and everything was equal. */ + emit_insn (gen_rtx_SET (target, gen_rtx_CONST_INT (SImode, 0))); + emit_jump_insn (gen_jump (final_label)); + } + } + } + + offset += cmp_bytes; + bytes_to_compare -= cmp_bytes; + } + /* Processing will fall through to libc calls. */ +} + +/* Emit a string comparison sequence using Zbb instruction. 
+ + OPERANDS[0] is the target (result). + OPERANDS[1] is the first source. + OPERANDS[2] is the second source. + If NO_LENGTH is zero, then: + OPERANDS[3] is the length. + OPERANDS[4] is the alignment in bytes. + If NO_LENGTH is nonzero, then: + OPERANDS[3] is the alignment in bytes. + BYTES_TO_COMPARE is the maximum number of bytes to compare. + EQUALITY_COMPARE_REST defines if str(n)cmp should be called on equality. + */ + +static bool +riscv_emit_str_compare_zbb (rtx operands[], int no_length, + unsigned HOST_WIDE_INT bytes_to_compare, + bool equality_compare_rest) +{ + const unsigned HOST_WIDE_INT p_mode_size = GET_MODE_SIZE (Pmode); + rtx target = operands[0]; + rtx src1 = operands[1]; + rtx src2 = operands[2]; + rtx bytes_rtx = NULL; + rtx align_rtx = operands[3]; + + if (!no_length) + { + bytes_rtx = operands[3]; + align_rtx = operands[4]; + } + + gcc_assert (TARGET_ZBB); + + /* Enable only if we can access at least one XLEN-register. */ + if (bytes_to_compare < p_mode_size) + return false; + + /* Limit to 12-bits (maximum load-offset). */ + if (bytes_to_compare > IMM_REACH) + return false; + + /* We don't support big endian. */ + if (BYTES_BIG_ENDIAN) + return false; + + /* We need to know the alignment. */ + if (!CONST_INT_P (align_rtx)) + return false; + + unsigned HOST_WIDE_INT base_align = UINTVAL (align_rtx); + unsigned HOST_WIDE_INT required_align = p_mode_size; + if (base_align < required_align) + return false; + + rtx data1 = gen_reg_rtx (Pmode); + rtx data2 = gen_reg_rtx (Pmode); + rtx orc = gen_reg_rtx (Pmode); + rtx end_label = gen_label_rtx (); + rtx final_label = gen_label_rtx (); + + /* Generate a sequence of zbb instructions to compare out + to the length specified. */ + expand_strncmp_zbb_sequence (bytes_to_compare, src1, src2, data1, data2, + target, orc, equality_compare_rest, + end_label, final_label); + + if (equality_compare_rest) + { + /* Update pointers past what has been compared already. 
*/ + rtx src1_addr = force_reg (Pmode, XEXP (src1, 0)); + rtx src2_addr = force_reg (Pmode, XEXP (src2, 0)); + unsigned HOST_WIDE_INT offset = bytes_to_compare; + rtx src1 = force_reg (Pmode, + gen_rtx_PLUS (Pmode, src1_addr, GEN_INT (offset))); + rtx src2 = force_reg (Pmode, + gen_rtx_PLUS (Pmode, src2_addr, GEN_INT (offset))); + + /* Construct call to strcmp/strncmp to compare the rest of the string. */ + if (no_length) + { + tree fun = builtin_decl_explicit (BUILT_IN_STRCMP); + emit_library_call_value (XEXP (DECL_RTL (fun), 0), + target, LCT_NORMAL, GET_MODE (target), + src1, Pmode, src2, Pmode); + } + else + { + unsigned HOST_WIDE_INT bytes = UINTVAL (bytes_rtx); + unsigned HOST_WIDE_INT delta = bytes - bytes_to_compare; + gcc_assert (delta > 0); + rtx len_rtx = gen_reg_rtx (Pmode); + emit_move_insn (len_rtx, gen_int_mode (delta, Pmode)); + tree fun = builtin_decl_explicit (BUILT_IN_STRNCMP); + emit_library_call_value (XEXP (DECL_RTL (fun), 0), + target, LCT_NORMAL, GET_MODE (target), + src1, Pmode, src2, Pmode, len_rtx, Pmode); + } + + emit_jump_insn (gen_jump (final_label)); + } + + emit_barrier (); /* No fall-through. */ + + emit_label (end_label); + + /* Convert non-equal bytes into non-NUL bytes. */ + rtx diff = gen_reg_rtx (Pmode); + do_xor3 (diff, data1, data2); + do_orcb2 (diff, diff); + + /* Convert non-equal or NUL-bytes into non-NUL bytes. */ + rtx syndrome = gen_reg_rtx (Pmode); + do_riscv_ior_not3 (syndrome, orc, diff); + + /* Count the number of equal bits from the beginning of the word. */ + rtx shift = gen_reg_rtx (Pmode); + do_ctz2 (shift, syndrome); + + do_bswap2 (data1, data1); + do_bswap2 (data2, data2); + + /* The most-significant-non-zero bit of the syndrome marks either the + first bit that is different, or the top bit of the first zero byte. + Shifting left now will bring the critical information into the + top bits. 
*/ + do_ashl3 (data1, data1, gen_lowpart (QImode, shift)); + do_ashl3 (data2, data2, gen_lowpart (QImode, shift)); + + /* But we need to zero-extend (char is unsigned) the value and then + perform a signed 32-bit subtraction. */ + unsigned int shiftr = p_mode_size * BITS_PER_UNIT - 8; + do_lshr3 (data1, data1, GEN_INT (shiftr)); + do_lshr3 (data2, data2, GEN_INT (shiftr)); + + rtx result = gen_reg_rtx (Pmode); + do_sub3 (result, data1, data2); + emit_insn (gen_movsi (target, gen_lowpart (SImode, result))); + + /* And we are done. */ + emit_label (final_label); + return true; +} + +/* Expand a string compare operation with length, and return + true if successful. Return false if we should let the + compiler generate normal code, probably a strncmp call. + If NO_LENGTH is set, there is no upper bound of the strings. + + OPERANDS[0] is the target (result). + OPERANDS[1] is the first source. + OPERANDS[2] is the second source. + If NO_LENGTH is zero, then: + OPERANDS[3] is the length. + OPERANDS[4] is the alignment in bytes. + If NO_LENGTH is nonzero, then: + OPERANDS[3] is the alignment in bytes. */ + +bool +riscv_expand_strn_compare (rtx operands[], int no_length) +{ + rtx bytes_rtx = NULL; + const unsigned HOST_WIDE_INT compare_max = riscv_string_compare_inline_limit; + unsigned HOST_WIDE_INT compare_length; /* How much to compare inline. */ + bool equality_compare_rest = false; /* Call libc to compare remainder. */ + + if (riscv_string_compare_inline_limit == 0) + return false; + + /* Decide how many bytes to compare inline and what to do if there is + no difference detected at the end of the compared bytes. + We might call libc to continue the comparison. */ + if (no_length) + { + compare_length = compare_max; + equality_compare_rest = true; + } + else + { + /* If we have a length, it must be constant. 
*/ + bytes_rtx = operands[3]; + if (!CONST_INT_P (bytes_rtx)) + return false; + + unsigned HOST_WIDE_INT bytes = UINTVAL (bytes_rtx); + if (bytes <= compare_max) + { + compare_length = bytes; + equality_compare_rest = false; + } + else + { + compare_length = compare_max; + equality_compare_rest = true; + } + } + + if (TARGET_ZBB) + { + return riscv_emit_str_compare_zbb (operands, no_length, compare_length, + equality_compare_rest); + } + + return false; +} + /* If the provided string is aligned, then read XLEN bytes in a loop and use orc.b to find NUL-bytes. */ diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md index f05c764c3d4..dce33a4b638 100644 --- a/gcc/config/riscv/riscv.md +++ b/gcc/config/riscv/riscv.md @@ -3010,6 +3010,52 @@ (define_expand "cpymemsi" FAIL; }) +;; String compare N insn. +;; Argument 0 is the target (result) +;; Argument 1 is the source1 +;; Argument 2 is the source2 +;; Argument 3 is the length +;; Argument 4 is the alignment + +(define_expand "cmpstrnsi" + [(parallel [(set (match_operand:SI 0) + (compare:SI (match_operand:BLK 1) + (match_operand:BLK 2))) + (use (match_operand:SI 3)) + (use (match_operand:SI 4))])] + "" +{ + if (optimize_insn_for_size_p ()) + FAIL; + + if (riscv_expand_strn_compare (operands, 0)) + DONE; + else + FAIL; +}) + +;; String compare insn. +;; Argument 0 is the target (result) +;; Argument 1 is the source1 +;; Argument 2 is the source2 +;; Argument 3 is the alignment + +(define_expand "cmpstrsi" + [(parallel [(set (match_operand:SI 0) + (compare:SI (match_operand:BLK 1) + (match_operand:BLK 2))) + (use (match_operand:SI 3))])] + "" +{ + if (optimize_insn_for_size_p ()) + FAIL; + + if (riscv_expand_strn_compare (operands, 1)) + DONE; + else + FAIL; +}) + ;; Search character in string (generalization of strlen).
;; Argument 0 is the resulting offset ;; Argument 1 is the string diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt index 7c3ca48d1cc..fdf768ae9a7 100644 --- a/gcc/config/riscv/riscv.opt +++ b/gcc/config/riscv/riscv.opt @@ -249,3 +249,8 @@ Enum(isa_spec_class) String(20191213) Value(ISA_SPEC_CLASS_20191213) misa-spec= Target RejectNegative Joined Enum(isa_spec_class) Var(riscv_isa_spec) Init(TARGET_DEFAULT_ISA_SPEC) Set the version of RISC-V ISA spec. + +mstring-compare-inline-limit= +Target Var(riscv_string_compare_inline_limit) Init(64) RejectNegative Joined UInteger Save +Max number of bytes to compare. + diff --git a/gcc/testsuite/gcc.target/riscv/zbb-strcmp-unaligned.c b/gcc/testsuite/gcc.target/riscv/zbb-strcmp-unaligned.c new file mode 100644 index 00000000000..2126c849e0a --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/zbb-strcmp-unaligned.c @@ -0,0 +1,36 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gc_zbb -mabi=lp64 -mstring-compare-inline-limit=64" } */ + +typedef long unsigned int size_t; + +int +my_str_cmp (const char *s1, const char *s2) +{ + return __builtin_strcmp (s1, s2); +} + +int +my_str_cmp_const (const char *s1) +{ + return __builtin_strcmp (s1, "foo"); +} + +int +my_strn_cmp (const char *s1, const char *s2, size_t n) +{ + return __builtin_strncmp (s1, s2, n); +} + +int +my_strn_cmp_const (const char *s1, size_t n) +{ + return __builtin_strncmp (s1, "foo", n); +} + +int +my_strn_cmp_bounded (const char *s1, const char *s2) +{ + return __builtin_strncmp (s1, s2, 42); +} + +/* { dg-final { scan-assembler-not "orc.b\t" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/zbb-strcmp.c b/gcc/testsuite/gcc.target/riscv/zbb-strcmp.c new file mode 100644 index 00000000000..3465e7ffee3 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/zbb-strcmp.c @@ -0,0 +1,55 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gc_zbb -mabi=lp64 -mstring-compare-inline-limit=64" } */ +/* { dg-skip-if "" { *-*-* } { "-O0" "-Os" 
"-Oz" "-Og" } } */ + +typedef long unsigned int size_t; + +/* Emits 8+1 orc.b instructions. */ + +int +my_str_cmp (const char *s1, const char *s2) +{ + s1 = __builtin_assume_aligned (s1, 4096); + s2 = __builtin_assume_aligned (s2, 4096); + return __builtin_strcmp (s1, s2); +} + +/* 8+1 because the backend does not know the size of "foo". */ + +int +my_str_cmp_const (const char *s1) +{ + s1 = __builtin_assume_aligned (s1, 4096); + return __builtin_strcmp (s1, "foo"); +} + +/* Emits 6+1 orc.b instructions. */ + +int +my_strn_cmp (const char *s1, const char *s2) +{ + s1 = __builtin_assume_aligned (s1, 4096); + s2 = __builtin_assume_aligned (s2, 4096); + return __builtin_strncmp (s1, s2, 42); +} + +/* Note expanded because the backend does not know the size of "foo". */ + +int +my_strn_cmp_const (const char *s1, size_t n) +{ + s1 = __builtin_assume_aligned (s1, 4096); + return __builtin_strncmp (s1, "foo", n); +} + +/* Emits 6+1 orc.b instructions. */ + +int +my_strn_cmp_bounded (const char *s1, const char *s2) +{ + s1 = __builtin_assume_aligned (s1, 4096); + s2 = __builtin_assume_aligned (s2, 4096); + return __builtin_strncmp (s1, s2, 42); +} + +/* { dg-final { scan-assembler-times "orc.b\t" 32 } } */