From patchwork Tue Oct 29 03:28:56 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Pinski
X-Patchwork-Id: 99742
X-Original-To: gcc-patches@gcc.gnu.org
From: Andrew Pinski
CC: Andrew Pinski
Subject: [PATCH] ifcombine: For short circuit case, allow 2 defining statements [PR85605]
Date: Mon, 28 Oct 2024 20:28:56 -0700
Message-ID: <20241029032856.856349-1-quic_apinski@quicinc.com>
X-Mailer: git-send-email 2.34.1
List-Id: Gcc-patches mailing list

r0-126134-g5d2a9da9a7f7c1 added support for short-circuiting and
combining the ifs using either AND or OR. But it only allowed the inner
condition basic block to contain the conditional and nothing else.
This patch allows up to 2 defining statements in that block, as long as
they are just nop conversions that define either the lhs or the rhs of
the conditional.
This should allow ccmp to be used on aarch64 and x86_64 (APX) slightly
more often than before.

Bootstrapped and tested on x86_64-linux-gnu.

	PR tree-optimization/85605

gcc/ChangeLog:

	* tree-ssa-ifcombine.cc (can_combine_bbs_with_short_circuit): New
	function.
	(ifcombine_ifandif): Use can_combine_bbs_with_short_circuit
	instead of checking if iterator is one before the last statement.

gcc/testsuite/ChangeLog:

	* g++.dg/tree-ssa/ifcombine-ccmp-1.C: New test.
	* gcc.dg/tree-ssa/ssa-ifcombine-ccmp-7.c: New test.
	* gcc.dg/tree-ssa/ssa-ifcombine-ccmp-8.c: New test.

Signed-off-by: Andrew Pinski
---
 .../g++.dg/tree-ssa/ifcombine-ccmp-1.C        | 27 +++++++++++++
 .../gcc.dg/tree-ssa/ssa-ifcombine-ccmp-7.c    | 18 +++++++++
 .../gcc.dg/tree-ssa/ssa-ifcombine-ccmp-8.c    | 19 +++++++++
 gcc/tree-ssa-ifcombine.cc                     | 39 ++++++++++++++++++-
 4 files changed, 101 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/tree-ssa/ifcombine-ccmp-1.C
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-ifcombine-ccmp-7.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-ifcombine-ccmp-8.c

diff --git a/gcc/testsuite/g++.dg/tree-ssa/ifcombine-ccmp-1.C b/gcc/testsuite/g++.dg/tree-ssa/ifcombine-ccmp-1.C
new file mode 100644
index 00000000000..282cec8c628
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/ifcombine-ccmp-1.C
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -g -fdump-tree-optimized --param logical-op-non-short-circuit=1" } */
+
+/* PR tree-optimization/85605 */
+#include
+
+template<class T, class T2>
+inline bool cmp(T a, T2 b) {
+  return a<0 ? true : T2(a) < b;
+}
+
+template<class T, class T2>
+inline bool cmp2(T a, T2 b) {
+  return (a<0) | (T2(a) < b);
+}
+
+bool f(int a, int b) {
+    return cmp(int64_t(a), unsigned(b));
+}
+
+bool f2(int a, int b) {
+    return cmp2(int64_t(a), unsigned(b));
+}
+
+
+/* Both of these functions should be optimized to the same, and have an | in them. */
+/* { dg-final { scan-tree-dump-times " \\\| " 2 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-ifcombine-ccmp-7.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-ifcombine-ccmp-7.c
new file mode 100644
index 00000000000..1bdbb9358b4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-ifcombine-ccmp-7.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -g -fdump-tree-optimized --param logical-op-non-short-circuit=1" } */
+
+/* PR tree-optimization/85605 */
+/* Like ssa-ifcombine-ccmp-1.c but with conversion from unsigned to signed in the
+   inner bb which should be able to move too. */
+
+int t (int a, unsigned b)
+{
+  if (a > 0)
+    {
+      signed t = b;
+      if (t > 0)
+        return 0;
+    }
+  return 1;
+}
+/* { dg-final { scan-tree-dump "\&" "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-ifcombine-ccmp-8.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-ifcombine-ccmp-8.c
new file mode 100644
index 00000000000..8d74b4932c5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-ifcombine-ccmp-8.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -g -fdump-tree-optimized --param logical-op-non-short-circuit=1" } */
+
+/* PR tree-optimization/85605 */
+/* Like ssa-ifcombine-ccmp-2.c but with conversion from unsigned to signed in the
+   inner bb which should be able to move too. */
+
+int t (int a, unsigned b)
+{
+  if (a > 0)
+    goto L1;
+  signed t = b;
+  if (t > 0)
+    goto L1;
+  return 0;
+L1:
+  return 1;
+}
+/* { dg-final { scan-tree-dump "\|" "optimized" } } */
diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc
index 39702929fc0..3acecda31cc 100644
--- a/gcc/tree-ssa-ifcombine.cc
+++ b/gcc/tree-ssa-ifcombine.cc
@@ -400,6 +400,38 @@ update_profile_after_ifcombine (basic_block inner_cond_bb,
   outer2->probability = profile_probability::never ();
 }
 
+/* Returns true if inner_cond_bb contains just the condition or 1/2 statements
+   that define lhs or rhs with a nop conversion. */
+
+static bool
+can_combine_bbs_with_short_circuit (basic_block inner_cond_bb, tree lhs, tree rhs)
+{
+  gimple_stmt_iterator gsi;
+  gsi = gsi_start_nondebug_after_labels_bb (inner_cond_bb);
+  /* If only the condition, this should be allowed. */
+  if (gsi_one_before_end_p (gsi))
+    return true;
+  /* Can have up to 2 statements defining each of lhs/rhs. */
+  for (int i = 0; i < 2; i++)
+    {
+      gimple *stmt = gsi_stmt (gsi);
+      if (!gimple_assign_cast_p (stmt))
+        return false;
+      if (!tree_nop_conversion_p (TREE_TYPE (gimple_assign_lhs (stmt)),
+                                  TREE_TYPE (gimple_assign_rhs1 (stmt))))
+        return false;
+      /* The defining statement needs to match either the lhs or rhs of
+         the condition. */
+      if (!operand_equal_p (lhs, gimple_assign_lhs (stmt))
+          && !operand_equal_p (rhs, gimple_assign_lhs (stmt)))
+        return false;
+      gsi_next_nondebug (&gsi);
+      if (gsi_one_before_end_p (gsi))
+        return true;
+    }
+  return false;
+}
+
 /* If-convert on a and pattern with a common else block.  The inner
    if is specified by its INNER_COND_BB, the outer by OUTER_COND_BB.
    inner_inv, outer_inv and result_inv indicate whether the conditions
@@ -610,8 +642,11 @@ ifcombine_ifandif (basic_block inner_cond_bb, bool inner_inv,
         = param_logical_op_non_short_circuit;
       if (!logical_op_non_short_circuit || sanitize_coverage_p ())
         return false;
-      /* Only do this optimization if the inner bb contains only the conditional. */
-      if (!gsi_one_before_end_p (gsi_start_nondebug_after_labels_bb (inner_cond_bb)))
+      /* Only do this optimization if the inner bb contains only the conditional
+         or there is one or 2 statements which are nop conversion for the comparison. */
+      if (!can_combine_bbs_with_short_circuit (inner_cond_bb,
+                                               gimple_cond_lhs (inner_cond),
+                                               gimple_cond_rhs (inner_cond)))
         return false;
       t1 = fold_build2_loc (gimple_location (inner_cond),
                             inner_cond_code,
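
As a reading aid for the commit message, here is a rough source-level sketch of what the combination amounts to for the ssa-ifcombine-ccmp-7.c test. It is not the GIMPLE that ifcombine produces, and the names t_nested, t_combined and check_equivalence are hypothetical, used only for illustration. The point is that a nop conversion has no side effects and cannot trap, so evaluating it before the outer condition is tested and joining the two comparisons with a non-short-circuit & should not change the result.

```c
#include <assert.h>
#include <limits.h>

/* Same shape as the function in ssa-ifcombine-ccmp-7.c: the inner block
   holds one conversion statement followed by the inner condition.  */
int t_nested (int a, unsigned b)
{
  if (a > 0)
    {
      signed t = b;
      if (t > 0)
        return 0;
    }
  return 1;
}

/* Hand-written approximation of the combined form (an assumption about
   the shape, not compiler output): the conversion is evaluated
   unconditionally and both comparisons are joined with &.  */
int t_combined (int a, unsigned b)
{
  signed t = b;
  return !((a > 0) & (t > 0));
}

/* Compare the two forms on a few boundary values.  */
void check_equivalence (void)
{
  const int as[] = { INT_MIN, -1, 0, 1, INT_MAX };
  const unsigned bs[] = { 0u, 1u, (unsigned) INT_MAX,
                          (unsigned) INT_MAX + 1u, UINT_MAX };
  for (unsigned i = 0; i < sizeof as / sizeof as[0]; i++)
    for (unsigned j = 0; j < sizeof bs / sizeof bs[0]; j++)
      assert (t_nested (as[i], bs[j]) == t_combined (as[i], bs[j]));
}
```

Calling check_equivalence () from a small driver exercises both forms on the same inputs. The unsigned to signed conversion of out-of-range values is implementation-defined in C, but both functions perform the same conversion, so the comparison still holds.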