| Message ID | 20260106214612.3756648-1-daniel.barboza@oss.qualcomm.com |
|---|---|
| State | New |
| Headers |
Return-Path: <gcc-patches-bounces~patchwork=sourceware.org@gcc.gnu.org> X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from vm01.sourceware.org (localhost [127.0.0.1]) by sourceware.org (Postfix) with ESMTP id B3DD04BA2E1F for <patchwork@sourceware.org>; Tue, 6 Jan 2026 21:47:14 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org B3DD04BA2E1F Authentication-Results: sourceware.org; dkim=pass (2048-bit key, unprotected) header.d=qualcomm.com header.i=@qualcomm.com header.a=rsa-sha256 header.s=qcppdkim1 header.b=fJ9hwJl7; dkim=pass (2048-bit key, unprotected) header.d=oss.qualcomm.com header.i=@oss.qualcomm.com header.a=rsa-sha256 header.s=google header.b=Iyd/6vsG X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mx0b-0031df01.pphosted.com (mx0b-0031df01.pphosted.com [205.220.180.131]) by sourceware.org (Postfix) with ESMTPS id 329374BA2E24 for <gcc-patches@gcc.gnu.org>; Tue, 6 Jan 2026 21:46:20 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 329374BA2E24 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=oss.qualcomm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=oss.qualcomm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 329374BA2E24 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=205.220.180.131 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1767735980; cv=none; b=P7xmkgrb9j2NdN4n72ZXvpkMwQAng2o5p/v9DnFdttqgn89gYkaBz9XOMDgb0avUbJIpMoWngeztvT8Wmvjw5KaG4to5Vk76T4ZO0e1s+9HmmTuEPLErmriEgk95kQi2lcUtAGDQW0uv2c+jo57te4jdwrAljJhiJCm+Deho9Ns= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1767735980; c=relaxed/simple; bh=7rhs64aflCe0lSsEnvMXFNPMncVPZ+m1fqlOHFipMI0=; h=DKIM-Signature:DKIM-Signature:From:To:Subject:Date:Message-ID: MIME-Version; b=IVCBwNB2BAvsUCmosrw5870uwWBVgBLAi/8SwFxhbL64RQDsXq0Zhm9IrDHrp1sWv++tME4dYruM/EG692s0niGZsE9PwHiZMWGJ4VRaFkj7tfMztNcdL8AaKsXw+aRybOZrapKKeTv7VGpTW3LlTiyBOvogOhBV4Ghurodl3yQ= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 329374BA2E24 Received: from pps.filterd (m0279873.ppops.net [127.0.0.1]) by mx0a-0031df01.pphosted.com (8.18.1.11/8.18.1.11) with ESMTP id 606GqNLO3889010 for <gcc-patches@gcc.gnu.org>; Tue, 6 Jan 2026 21:46:19 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=qualcomm.com; h= cc:content-transfer-encoding:date:from:message-id:mime-version :subject:to; s=qcppdkim1; bh=caD2oR8Z8Vmkx6MSxDmbosOAZezcpupBNL+ yUWqLkM4=; b=fJ9hwJl7qlMhoTaI11zNgHvSMltvhOg0okLZHkX4X7FXO6App0F UrV8lTl7SsfF8mPhQ+bHrB6OC8bX2nc3TkRCCNFzULL5FGazxfOnR6HdZoy8WS3C oecvQy2uIah5eN7PYfgiBCHfg0VMDNsaZo8qgNoBWnqu2IxHat2tyUrLwPJQiwlQ yd9ND3na2odbiH8en9dIwjosEBlIUjHDNlyrlsUEFPQPW2Z87Ovyeey/dXytS3Vz nRpo7cfZr2AZbXyXWpNc14zX0Uw1G8YX384q7q5k8blC6yoQyiCTHLJOC92Tlgr8 W+O2hj+2TxPr4rwPTF0n1qJMdtaJZpXPwTw== Received: from mail-dl1-f70.google.com (mail-dl1-f70.google.com [74.125.82.70]) by mx0a-0031df01.pphosted.com (PPS) with ESMTPS id 4bh6a0gtya-1 (version=TLSv1.3 cipher=TLS_AES_128_GCM_SHA256 bits=128 verify=NOT) for <gcc-patches@gcc.gnu.org>; Tue, 06 Jan 2026 21:46:19 +0000 (GMT) Received: by mail-dl1-f70.google.com with SMTP id a92af1059eb24-11dd10b03c6so1073114c88.0 for <gcc-patches@gcc.gnu.org>; Tue, 06 Jan 2026 13:46:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oss.qualcomm.com; s=google; t=1767735978; x=1768340778; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:from:to:cc:subject:date:message-id:reply-to; bh=caD2oR8Z8Vmkx6MSxDmbosOAZezcpupBNL+yUWqLkM4=; b=Iyd/6vsGKSGMsMo6+wVuEJxmv9iqrv7ppS07R9pL/5jrTeZP3nn1IJVbOdgSkRU2Pa PMKgTOJUx4YREKKNxtOQJtbennPqc8Vmd19TT0RKePH5QSpTAUGFn6AHUj+o6ZImSGnR PzdDGTwyKGA4wxPQgAbn1iiukrdS8sAQ04TrLJL9qRiCTgh/V6zL9064ky+JIpIRLViT pb5QWiKMnYhqHqlqaKL4KqP3rA6dJafz9EYYt/4tnHc/jrqMwexHjORU5OL/31ODU5mw Fpn41Cm245Bp9ZrJBuElJeonCZWnc6+YW/7JJwOJZ+g7bB4IWQEzdJzDSYi9WQUAkYch 2ulg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1767735978; x=1768340778; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:x-gm-gg:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=caD2oR8Z8Vmkx6MSxDmbosOAZezcpupBNL+yUWqLkM4=; b=S37Vt0nqeL+TX8FKTc9uPSSQWulxeXofsOLABG+llG+gYonXZ/EjtKJMjJVRcj1GJV xXKAwzBpG9WygdmznRxyMpBErAEliHAJ4WtwH5sa1nxKvSMMLAMDoOzKNo7VDANn5BsW 6A52lSUPbFw2iyHlvEH0Q+8Jiye62FfjXnUnRLWBknrh7X54+e6FkH6FEk7LBQvlwsQF lUeR1itEIYiWCXqsEBSJT7eA6aAewn2ib4k3Wm8NMn1DZ2iBCD5O4IOf29+fZsoDDgeK e/GXhNzZOu33JjdPBBIMdc398ZOvGwOF7LqJ2l9fTFCaDrv8kgW9QUyDEX3MEIKe5NF1 sNPw== X-Gm-Message-State: AOJu0YxPYC6hF2xtCw9hG+FaGcDaCydj5umGloHwtGw24aOaVAy0/b15 hu5c33le170uzprrIYZfXEdlzlybvarELd0z1sJiGqP28komWtqv1EAAAFfm6BuSEUno0rSCHbe IaTEXC1Z1GqM8DqOEJ41YG2L/l2Ru2HjVzYqI+uyEWS6mAoGdMNho7qUlacsm9UoFC30s X-Gm-Gg: AY/fxX6avUnliroWlfDF58F+efJpEgDeMOpaw0ELXcuvWo5j5mR5RLaOtPdH61DM08x PtIbua8BKiOGTAL22uR4XENbD2qM9cYlk5TXaCBEvaYZIeqC/EXShSGe37xGJ9RN7kMYcsgYxjU gjhpt/D5w+AOSIc1KZcwtrfNdztUDCKeJe5u6f3m8XY/SzxoPY6q6ht4mgM6lOKVbUdU9vY1C3C tMS9AtZg0vHp2n71TnHU3cOk/5v9xxqER3Er1W5PMXZV59JRyYUedMq7FmiAkCvC4SVqyBa1cH4 qsM0PU8hHYHUaHL+c9b6aV/hQt7x0fD50LiR4OTR3HCfd0e+FqTvnTkLcRwzqTLHVA5Mruwk7dV 3Ee0vB2D7rvcqf1dXB5zwc+t7RT/Ex3V0jZTHlr34NJFctw89HFivKJZ4MDYWrjNWFWDHbEWKlh fIx3kLfwVrU2b87lN+ X-Received: by 2002:a05:7022:6723:b0:11b:9386:8267 with SMTP id a92af1059eb24-121f8b9bef5mr269244c88.44.1767735977924; Tue, 06 Jan 2026 13:46:17 -0800 (PST) X-Google-Smtp-Source: AGHT+IH2xnU9L70GDbylNxe2kHZvNaXZrTlleJIf2acuFYpbXeniozV5pVXNyvthWmkDrbzabtF9eg== X-Received: by 2002:a05:7022:6723:b0:11b:9386:8267 with SMTP id a92af1059eb24-121f8b9bef5mr269215c88.44.1767735977341; Tue, 06 Jan 2026 13:46:17 -0800 (PST) Received: from grind.dc1.ventanamicro.com (200-162-225-127.static-corp.ajato.com.br. [200.162.225.127]) by smtp.gmail.com with ESMTPSA id a92af1059eb24-121f248c16fsm7570349c88.10.2026.01.06.13.46.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 06 Jan 2026 13:46:16 -0800 (PST) From: Daniel Barboza <daniel.barboza@oss.qualcomm.com> To: gcc-patches@gcc.gnu.org Cc: jeffrey.law@oss.qualcomm.com, andrew.pinski@oss.qualcomm.com, Daniel Barboza <daniel.barboza@oss.qualcomm.com> Subject: [PATCH] match.pd: simplify "shift + reg EQ|NE reg" Date: Tue, 6 Jan 2026 18:46:12 -0300 Message-ID: <20260106214612.3756648-1-daniel.barboza@oss.qualcomm.com> X-Mailer: git-send-email 2.51.1 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Proofpoint-ORIG-GUID: TzEVIsddrJw_fEnp638_QDDc8LSnClKu X-Proofpoint-GUID: TzEVIsddrJw_fEnp638_QDDc8LSnClKu X-Proofpoint-Spam-Details-Enc: AW1haW4tMjYwMTA2MDE4OCBTYWx0ZWRfX3VIyh2kTHuHb 5h8Wkhpwmmf+wI8y1ds9xGVqfDppBKiZgxATZX8O26obJjZrvEjTDHW5q9yEQWOayq6S3wHSwCz DIt/JXWv3OmczXAqLNr+wA8fVCmb8FBp+GZRLcuFqDL0LBP9XTJfIRLu83cAHLAL2ecSaDDDswJ SI5rTH8NmP9Kr3ZFvW8LAzBC+hVXt+tHIxogHF9VyFSq3PDi9AmmIsxJZYOPBJkW9vtBnFGm8F3 a/tXoJifyUPOIZc2CtUuhBRSHXtyB+n+o4pnA1Ctp6BsBRBFNkMcQiwEn/gmB08e1UioaZE5T65 u2LIi6ap47KLH32eeg498WtQKS/HB4DWtx6Ix//gr4PbL9f8hV9Y2oj0n/qPkBjvLDUZzL/jz9I yYCuKcNqZuV3H3kVP0S0B2zEKCl+k4T5lkQg22bHIDOYGtsKwwaFmuBnEf1bk42A45Pzba0/vAa oqltrSkqKi8nVI/Mymg== X-Authority-Analysis: v=2.4 cv=MtdfKmae c=1 sm=1 tr=0 ts=695d82ab cx=c_pps a=SvEPeNj+VMjHSW//kvnxuw==:117 a=ewOVoc8TSmC0cCmMeNMfEg==:17 a=vUbySO9Y5rIA:10 a=s4-Qcg_JpJYA:10 a=VkNPw1HP01LnGYTKEx00:22 a=EUspDBNiAAAA:8 a=sZIj4lYkIgEEet_wa14A:9 a=Kq8ClHjjuc5pcCNDwlU0:22 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1121,Hydra:6.1.9,FMLib:17.12.100.49 definitions=2026-01-06_02,2026-01-06_01,2025-10-01_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 malwarescore=0 suspectscore=0 phishscore=0 adultscore=0 bulkscore=0 priorityscore=1501 clxscore=1015 impostorscore=0 spamscore=0 lowpriorityscore=0 classifier=typeunknown authscore=0 authtc= authcc= route=outbound adjust=0 reason=mlx scancount=1 engine=8.22.0-2512120000 definitions=main-2601060188 X-Spam-Status: No, score=-9.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_BARRACUDACENTRAL, RCVD_IN_DNSWL_BLOCKED, RCVD_IN_VALIDITY_RPBL_BLOCKED, RCVD_IN_VALIDITY_SAFE_BLOCKED, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org> List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe> List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/> List-Post: <mailto:gcc-patches@gcc.gnu.org> List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help> List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe> Errors-To: gcc-patches-bounces~patchwork=sourceware.org@gcc.gnu.org |
| Series |
match.pd: simplify "shift + reg EQ|NE reg"
|
|
Commit Message
Daniel Henrique Barboza
Jan. 6, 2026, 9:46 p.m. UTC
Add a transformation for a nested lshift/rshift inside a plus that compares for
equality with the same operand of the plus. In other words:
((a OP b) + c EQ|NE c) ? x : y
becomes, for OP = (lshift, rshift) and in a scenario without overflows:
a !=/== 0 ? x : y
This optimizes the following code, seen in perlbench:
long
frob(long x, long y, long z)
{
long ret = (y << 2) + z;
long cond = ret != z;
if (cond == 0)
ret = 0;
return ret;
}
gcc/ChangeLog:
* match.pd (`((a OP b) + c eq|ne c) ? x : y`): New pattern.
gcc/testsuite/ChangeLog:
* gcc.dg/torture/ifcond-plus-shift-czero.c: New test.
Signed-off-by: Daniel Barboza <daniel.barboza@oss.qualcomm.com>
---
gcc/match.pd | 17 +++++++++++++
.../gcc.dg/torture/ifcond-plus-shift-czero.c | 25 +++++++++++++++++++
2 files changed, 42 insertions(+)
create mode 100644 gcc/testsuite/gcc.dg/torture/ifcond-plus-shift-czero.c
Comments
On Tue, Jan 6, 2026 at 1:46 PM Daniel Barboza <daniel.barboza@oss.qualcomm.com> wrote: > > Add a transformation for a nested lshift/rshift inside a plus that compares for > equality with the same operand of the plus. In other words: > > ((a OP b) + c EQ|NE c) ? x : y > > becomes, for OP = (lshift, rshift) and in a scenario without overflows: > > a !=/== 0 ? x : y I think we want the transformation even if it is used outside of a cond_expr. Also the above is valid even for types that wrap; just not valid for types that trap on overflow. As we already have a pattern for `a + C == C`: ``` /* For equality, this is also true with wrapping overflow. */ (for op (eq ne) (simplify (op:c (nop_convert?@3 (plus:c@2 @0 (convert1? @1))) (convert2? @1)) (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0)) && (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)) || TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))) && (CONSTANT_CLASS_P (@0) || (single_use (@2) && single_use (@3))) && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@2)) && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@1))) (op @0 { build_zero_cst (TREE_TYPE (@0)); }))) ``` The problem is the above does not work as there is an single_use check on the plus. The single use check was there since the original match pattern was added. I am not sure if we should add a special case where in the above pattern @0 is a shift. Though changing that will have to wait for GCC 17 I think. Jeff/Richard, What are your thoughts on this? I know Richard had thoughts on some of the single_use in the match patterns before but I have not tracked them that well. Thanks, Andrew > > This optimizes the following code, seen in perlbench: > > long > frob(long x, long y, long z) > { > long ret = (y << 2) + z; > long cond = ret != z; > if (cond == 0) > ret = 0; > return ret; > } > > gcc/ChangeLog: > > * match.pd (`((a OP b) + c eq|ne c) ? x : y`): New pattern. > > gcc/testsuite/ChangeLog: > > * gcc.dg/torture/ifcond-plus-shift-czero.c: New test. > > Signed-off-by: Daniel Barboza <daniel.barboza@oss.qualcomm.com> > --- > gcc/match.pd | 17 +++++++++++++ > .../gcc.dg/torture/ifcond-plus-shift-czero.c | 25 +++++++++++++++++++ > 2 files changed, 42 insertions(+) > create mode 100644 gcc/testsuite/gcc.dg/torture/ifcond-plus-shift-czero.c > > diff --git a/gcc/match.pd b/gcc/match.pd > index ccdc1129e23..fdc18b1ef41 100644 > --- a/gcc/match.pd > +++ b/gcc/match.pd > @@ -2824,6 +2824,23 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > (le (minus (convert:etype @0) { lo; }) { hi; }) > (gt (minus (convert:etype @0) { lo; }) { hi; }))))))))) > > +/* Reduces ((A OP B) + C EQ|NE C) ? X : Y to > + A EQ|NE 0 ? x : y, for OP = (lshift, rshift). */ > +(for cmp (eq ne) > + (for op (lshift rshift) > + (simplify > + (cond (cmp:c (plus:c (op @1 @2) @0) @0) @3 @4) > + (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@1)) > + && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@1))) > + (with > + { > + fold_overflow_warning ("assuming signed overflow does not occur " > + "when changing (X << imm) + C1 cmp C1 to " > + "X << imm cmp 0 and X << imm cmp 0 " > + "to X cmp 0", > + WARN_STRICT_OVERFLOW_COMPARISON); } > + (cond (cmp @1 { build_zero_cst (type); }) @3 @4)))))) > + > /* X + Z < Y + Z is the same as X < Y when there is no overflow. */ > (for op (lt le ge gt) > (simplify > diff --git a/gcc/testsuite/gcc.dg/torture/ifcond-plus-shift-czero.c b/gcc/testsuite/gcc.dg/torture/ifcond-plus-shift-czero.c > new file mode 100644 > index 00000000000..769412202c3 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/torture/ifcond-plus-shift-czero.c > @@ -0,0 +1,25 @@ > +/* { dg-do compile } */ > +/* { dg-additional-options "-fdump-tree-phiopt2" } */ > +/* { dg-skip-if "" { *-*-* } { "-O0" "-fno-fat-lto-objects" } { "" } } */ > + > +long > +f1 (long x, long y, long z) > +{ > + long ret = (y << 2) + z; > + long cond = ret != z; > + if (cond == 0) > + ret = 0; > + return ret; > +} > + > +long > +f2 (long x, long y, long z) > +{ > + long ret = (y >> 2) + z; > + long cond = ret != z; > + if (cond == 0) > + ret = 0; > + return ret; > +} > + > +/* { dg-final { scan-tree-dump-times "== 0" 2 "phiopt2" } } */ > -- > 2.43.0 >
On 1/6/2026 3:13 PM, Andrew Pinski wrote: > On Tue, Jan 6, 2026 at 1:46 PM Daniel Barboza > <daniel.barboza@oss.qualcomm.com> wrote: >> Add a transformation for a nested lshift/rshift inside a plus that compares for >> equality with the same operand of the plus. In other words: >> >> ((a OP b) + c EQ|NE c) ? x : y >> >> becomes, for OP = (lshift, rshift) and in a scenario without overflows: >> >> a !=/== 0 ? x : y > I think we want the transformation even if it is used outside of a cond_expr. > Also the above is valid even for types that wrap; just not valid for > types that trap on overflow. > > As we already have a pattern for `a + C == C`: > ``` > /* For equality, this is also true with wrapping overflow. */ > (for op (eq ne) > (simplify > (op:c (nop_convert?@3 (plus:c@2 @0 (convert1? @1))) (convert2? @1)) > (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0)) > && (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)) > || TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))) > && (CONSTANT_CLASS_P (@0) || (single_use (@2) && single_use (@3))) > && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@2)) > && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@1))) > (op @0 { build_zero_cst (TREE_TYPE (@0)); }))) > ``` > > The problem is the above does not work as there is an single_use check > on the plus. The single use check was there since the original match > pattern was added. > I am not sure if we should add a special case where in the above > pattern @0 is a shift. > Though changing that will have to wait for GCC 17 I think. > > Jeff/Richard, > What are your thoughts on this? I know Richard had thoughts on some > of the single_use in the match patterns before but I have not tracked > them that well. I had no idea we had that match.pd pattern; clearly what Daniel is doing is closely related. The single_use in those patterns comes up here: https://gcc.gnu.org/pipermail/gcc-patches/2017-October/484606.html Looks like single use is in there because of interactions with another pattern. jeff
On 1/7/2026 11:44 AM, Jeffrey Law wrote: > > > On 1/6/2026 3:13 PM, Andrew Pinski wrote: >> On Tue, Jan 6, 2026 at 1:46 PM Daniel Barboza >> <daniel.barboza@oss.qualcomm.com> wrote: >>> Add a transformation for a nested lshift/rshift inside a plus that >>> compares for >>> equality with the same operand of the plus. In other words: >>> >>> ((a OP b) + c EQ|NE c) ? x : y >>> >>> becomes, for OP = (lshift, rshift) and in a scenario without overflows: >>> >>> a !=/== 0 ? x : y >> I think we want the transformation even if it is used outside of a >> cond_expr. >> Also the above is valid even for types that wrap; just not valid for >> types that trap on overflow. >> >> As we already have a pattern for `a + C == C`: >> ``` >> /* For equality, this is also true with wrapping overflow. */ >> (for op (eq ne) >> (simplify >> (op:c (nop_convert?@3 (plus:c@2 @0 (convert1? @1))) (convert2? @1)) >> (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0)) >> && (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)) >> || TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))) >> && (CONSTANT_CLASS_P (@0) || (single_use (@2) && single_use >> (@3))) >> && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@2)) >> && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@1))) >> (op @0 { build_zero_cst (TREE_TYPE (@0)); }))) >> ``` >> >> The problem is the above does not work as there is an single_use check >> on the plus. The single use check was there since the original match >> pattern was added. >> I am not sure if we should add a special case where in the above >> pattern @0 is a shift. >> Though changing that will have to wait for GCC 17 I think. I tried variation of that pattern, taking into account the lshift and removing the single_use and constant_class_p conditionals: /* Do the same but with lshift|rshift as @0. */ (for op (eq ne) (simplify (op:c (nop_convert?@3 (plus:c@2 (lshift @0 @4) (convert1? @1))) (convert2? @1)) (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0)) && (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)) || TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))) && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@2)) && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@1))) (op @0 { build_zero_cst (TREE_TYPE (@0)); })))) And I got the following 'optimized' dump: long int frob (long int x, long int y, long int z) { long int ret; long int _1; ;; basic block 2, loop depth 0, count 1073741824 (estimated locally, freq 1.0000), maybe hot ;; prev block 0, next block 3, flags: (NEW, REACHABLE, VISITED) ;; pred: ENTRY [always] count:1073741824 (estimated locally, freq 1.0000) (FALLTHRU,EXECUTABLE) if (y_3(D) == 0) goto <bb 4>; [50.00%] else goto <bb 3>; [50.00%] ;; succ: 4 [50.0% (guessed)] count:536870912 (estimated locally, freq 0.5000) (TRUE_VALUE,EXECUTABLE) ;; 3 [50.0% (guessed)] count:536870912 (estimated locally, freq 0.5000) (FALSE_VALUE,EXECUTABLE) ;; basic block 3, loop depth 0, count 536870912 (estimated locally, freq 0.5000), maybe hot ;; prev block 2, next block 4, flags: (NEW, REACHABLE, VISITED) ;; pred: 2 [50.0% (guessed)] count:536870912 (estimated locally, freq 0.5000) (FALSE_VALUE,EXECUTABLE) _1 = y_3(D) << 2; ret_5 = _1 + z_4(D); ;; succ: 4 [always] count:536870912 (estimated locally, freq 0.5000) (FALLTHRU,EXECUTABLE) ;; basic block 4, loop depth 0, count 1073741824 (estimated locally, freq 1.0000), maybe hot ;; prev block 3, next block 1, flags: (NEW, REACHABLE, VISITED) ;; pred: 3 [always] count:536870912 (estimated locally, freq 0.5000) (FALLTHRU,EXECUTABLE) ;; 2 [50.0% (guessed)] count:536870912 (estimated locally, freq 0.5000) (TRUE_VALUE,EXECUTABLE) # ret_2 = PHI <ret_5(3), y_3(D)(2)> # VUSE <.MEM_6(D)> return ret_2; ;; succ: EXIT [always] count:1073741824 (estimated locally, freq 1.0000) (EXECUTABLE) test.c:8:10 } With the pattern I sent this is the 'optimized' dump I get: long int frob (long int x, long int y, long int z) { long int ret; long int _1; _Bool _7; long int _8; ;; basic block 2, loop depth 0, count 1073741824 (estimated locally, freq 1.0000), maybe hot ;; prev block 0, next block 1, flags: (NEW, REACHABLE, VISITED) ;; pred: ENTRY [always] count:1073741824 (estimated locally, freq 1.0000) (FALLTHRU,EXECUTABLE) _1 = y_3(D) << 2; ret_5 = _1 + z_4(D); _7 = y_3(D) == 0; _8 = _7 ? 0 : ret_5; # VUSE <.MEM_6(D)> return _8; ;; succ: EXIT [always] count:1073741824 (estimated locally, freq 1.0000) (EXECUTABLE) test.c:8:10 I'm not sure why we're having this difference given that the patterns are quite similar. By the way my initial idea for upstreaming was to split the pattern in two. One like this: X << N EQ|NE 0 -> X EQ|NE 0 And another one like the original I sent but returning the full lshift comparison instead of doing the reduction, i.e.: (a OP b) + c EQ|NE c -> (a OP b) EQ|NE 0 But I got a lot of trouble with errors like: Analyzing compilation unit Performing interprocedural optimizations <*free_lang_data> {heap 1028k} <visibility> {heap 1028k} <build_ssa_passes> {heap 1028k} <targetclone> {heap 1440k} <opt_lo cal_passes> {heap 1440k}during GIMPLE pass: ccp dump file: test.c.038t.ccp1 test.c: In function ‘frob’: test.c:9:1: internal compiler error: in decompose, at wide-int.h:1049 9 | } | ^ 0x2ecc34f internal_error(char const*, ...) ../../gcc/diagnostic-global-context.cc:787 0xf0790f fancy_abort(char const*, int, char const*) ../../gcc/diagnostics/context.cc:1805 0xa5b276 wi::int_traits<generic_wide_int<wide_int_ref_storage<false, false> > >::decompose(long*, unsigned int, generic_wide _int<wide_int_ref_storage<false, false> > const&) ../../gcc/wide-int.h:1049 0xa5d419 wi::int_traits<generic_wide_int<wide_int_storage> >::decompose(long*, unsigned int, generic_wide_int<wide_int_stora ge> const&) ../../gcc/tree.h:3906 0xa5d419 wide_int_ref_storage<true, false>::wide_int_ref_storage<generic_wide_int<wide_int_storage> >(generic_wide_int<wide_ int_storage> const&, unsigned int) ../../gcc/wide-int.h:1099 0xa5d419 generic_wide_int<wide_int_ref_storage<true, false> >::generic_wide_int<generic_wide_int<wide_int_storage> >(generic _wide_int<wide_int_storage> const&, unsigned int) ../../gcc/wide-int.h:855 My guess is that I was messing up the relation between the 'type' precision and the precision used in build_zero_cst(). I gave up and sent the full pattern, but I wonder if splitting it would be better. Thanks, Daniel >> >> Jeff/Richard, >> What are your thoughts on this? I know Richard had thoughts on some >> of the single_use in the match patterns before but I have not tracked >> them that well. > > I had no idea we had that match.pd pattern; clearly what Daniel is doing > is closely related. > > The single_use in those patterns comes up here: > https://gcc.gnu.org/pipermail/gcc-patches/2017-October/484606.html > > Looks like single use is in there because of interactions with another > pattern. > > jeff > > > > >
On Wed, Jan 7, 2026 at 1:47 PM Daniel Henrique Barboza <daniel.barboza@oss.qualcomm.com> wrote: > > > > On 1/7/2026 11:44 AM, Jeffrey Law wrote: > > > > > > On 1/6/2026 3:13 PM, Andrew Pinski wrote: > >> On Tue, Jan 6, 2026 at 1:46 PM Daniel Barboza > >> <daniel.barboza@oss.qualcomm.com> wrote: > >>> Add a transformation for a nested lshift/rshift inside a plus that > >>> compares for > >>> equality with the same operand of the plus. In other words: > >>> > >>> ((a OP b) + c EQ|NE c) ? x : y > >>> > >>> becomes, for OP = (lshift, rshift) and in a scenario without overflows: > >>> > >>> a !=/== 0 ? x : y > >> I think we want the transformation even if it is used outside of a > >> cond_expr. > >> Also the above is valid even for types that wrap; just not valid for > >> types that trap on overflow. > >> > >> As we already have a pattern for `a + C == C`: > >> ``` > >> /* For equality, this is also true with wrapping overflow. */ > >> (for op (eq ne) > >> (simplify > >> (op:c (nop_convert?@3 (plus:c@2 @0 (convert1? @1))) (convert2? @1)) > >> (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0)) > >> && (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)) > >> || TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))) > >> && (CONSTANT_CLASS_P (@0) || (single_use (@2) && single_use > >> (@3))) > >> && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@2)) > >> && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@1))) > >> (op @0 { build_zero_cst (TREE_TYPE (@0)); }))) > >> ``` > >> > >> The problem is the above does not work as there is an single_use check > >> on the plus. The single use check was there since the original match > >> pattern was added. > >> I am not sure if we should add a special case where in the above > >> pattern @0 is a shift. > >> Though changing that will have to wait for GCC 17 I think. > > > I tried variation of that pattern, taking into account the lshift and > removing the single_use and constant_class_p conditionals: > > /* Do the same but with lshift|rshift as @0. */ > (for op (eq ne) > (simplify > (op:c (nop_convert?@3 (plus:c@2 (lshift @0 @4) (convert1? @1))) > (convert2? @1)) > (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0)) > && (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)) > || TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))) > && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@2)) > && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@1))) > (op @0 { build_zero_cst (TREE_TYPE (@0)); })))) > > > And I got the following 'optimized' dump: > > > long int frob (long int x, long int y, long int z) > { > long int ret; > long int _1; > > ;; basic block 2, loop depth 0, count 1073741824 (estimated locally, > freq 1.0000), maybe hot > ;; prev block 0, next block 3, flags: (NEW, REACHABLE, VISITED) > ;; pred: ENTRY [always] count:1073741824 (estimated locally, > freq 1.0000) (FALLTHRU,EXECUTABLE) > if (y_3(D) == 0) > goto <bb 4>; [50.00%] > else > goto <bb 3>; [50.00%] > ;; succ: 4 [50.0% (guessed)] count:536870912 (estimated > locally, freq 0.5000) (TRUE_VALUE,EXECUTABLE) > ;; 3 [50.0% (guessed)] count:536870912 (estimated > locally, freq 0.5000) (FALSE_VALUE,EXECUTABLE) > > ;; basic block 3, loop depth 0, count 536870912 (estimated locally, > freq 0.5000), maybe hot > ;; prev block 2, next block 4, flags: (NEW, REACHABLE, VISITED) > ;; pred: 2 [50.0% (guessed)] count:536870912 (estimated > locally, freq 0.5000) (FALSE_VALUE,EXECUTABLE) > _1 = y_3(D) << 2; > ret_5 = _1 + z_4(D); > ;; succ: 4 [always] count:536870912 (estimated locally, freq > 0.5000) (FALLTHRU,EXECUTABLE) > > ;; basic block 4, loop depth 0, count 1073741824 (estimated locally, > freq 1.0000), maybe hot > ;; prev block 3, next block 1, flags: (NEW, REACHABLE, VISITED) > ;; pred: 3 [always] count:536870912 (estimated locally, freq > 0.5000) (FALLTHRU,EXECUTABLE) > ;; 2 [50.0% (guessed)] count:536870912 (estimated > locally, freq 0.5000) (TRUE_VALUE,EXECUTABLE) > # ret_2 = PHI <ret_5(3), y_3(D)(2)> > # VUSE <.MEM_6(D)> > return ret_2; > ;; succ: EXIT [always] count:1073741824 (estimated locally, > freq 1.0000) (EXECUTABLE) test.c:8:10 > > } Yes afterwards, the sink happens. And pushed ` y_3(D) << 2`/`_1 + z_4(D)` into the branch. And then uncprop changes `0(2)` into `y_3(D)(2)` in the PHI (sometimes that is better; sometimes that is worse; for riscv with zicond it is usually worse because ifcvt does not always convert back into 0). I wonder if keeping `(lshift @0 @4)` as being compared to 0 if that would be better, it would at least keep one expression out of the branch. > > > With the pattern I sent this is the 'optimized' dump I get: > > > long int frob (long int x, long int y, long int z) > { > long int ret; > long int _1; > _Bool _7; > long int _8; > > ;; basic block 2, loop depth 0, count 1073741824 (estimated locally, > freq 1.0000), maybe hot > ;; prev block 0, next block 1, flags: (NEW, REACHABLE, VISITED) > ;; pred: ENTRY [always] count:1073741824 (estimated locally, > freq 1.0000) (FALLTHRU,EXECUTABLE) > _1 = y_3(D) << 2; > ret_5 = _1 + z_4(D); > _7 = y_3(D) == 0; > _8 = _7 ? 0 : ret_5; > # VUSE <.MEM_6(D)> > return _8; > ;; succ: EXIT [always] count:1073741824 (estimated locally, > freq 1.0000) (EXECUTABLE) test.c:8:10 > > > I'm not sure why we're having this difference given that the patterns > are quite similar. Because when you have a cond_expr in the output of a match pattern, we don't recreate an if/else branches. In the pattern without the cond_expr, the sink pass sinks the other expressions. While here we have an explicit COND_EXPR rather than control flow. The big question is, can ifcvt on the rtl level recover the first way into the second way; I think currently it does not but I have not fully looked; so we might to need help it on the gimple level (for GCC 17 though). > > By the way my initial idea for upstreaming was to split the pattern in > two. One like this: > > X << N EQ|NE 0 -> X EQ|NE 0 > > And another one like the original I sent but returning the full lshift > comparison instead of doing the reduction, i.e.: > > (a OP b) + c EQ|NE c -> (a OP b) EQ|NE 0 > > > But I got a lot of trouble with errors like: > > Analyzing compilation unit > Performing interprocedural optimizations > <*free_lang_data> {heap 1028k} <visibility> {heap 1028k} > <build_ssa_passes> {heap 1028k} <targetclone> {heap 1440k} <opt_lo > cal_passes> {heap 1440k}during GIMPLE pass: ccp > dump file: test.c.038t.ccp1 > test.c: In function ‘frob’: > test.c:9:1: internal compiler error: in decompose, at wide-int.h:1049 > 9 | } > | ^ > 0x2ecc34f internal_error(char const*, ...) > ../../gcc/diagnostic-global-context.cc:787 > 0xf0790f fancy_abort(char const*, int, char const*) > ../../gcc/diagnostics/context.cc:1805 > 0xa5b276 wi::int_traits<generic_wide_int<wide_int_ref_storage<false, > false> > >::decompose(long*, unsigned int, generic_wide > _int<wide_int_ref_storage<false, false> > const&) > ../../gcc/wide-int.h:1049 > 0xa5d419 wi::int_traits<generic_wide_int<wide_int_storage> > >::decompose(long*, unsigned int, generic_wide_int<wide_int_stora > ge> const&) > ../../gcc/tree.h:3906 > 0xa5d419 wide_int_ref_storage<true, > false>::wide_int_ref_storage<generic_wide_int<wide_int_storage> > >(generic_wide_int<wide_ > int_storage> const&, unsigned int) > ../../gcc/wide-int.h:1099 > 0xa5d419 generic_wide_int<wide_int_ref_storage<true, false> > >::generic_wide_int<generic_wide_int<wide_int_storage> >(generic > _wide_int<wide_int_storage> const&, unsigned int) > ../../gcc/wide-int.h:855 > > My guess is that I was messing up the relation between the 'type' > precision and the precision used in build_zero_cst(). I gave up and > sent the full pattern, but I wonder if splitting it would be better. This usually does mean a type mismatch has happened or you are comparing two `wi::to_wide` which come from different precision INTEGER_CSTs. Thanks, Andrew > > > Thanks, > > Daniel > > > > > >> > >> Jeff/Richard, > >> What are your thoughts on this? I know Richard had thoughts on some > >> of the single_use in the match patterns before but I have not tracked > >> them that well. > > > > I had no idea we had that match.pd pattern; clearly what Daniel is doing > > is closely related. > > > > The single_use in those patterns comes up here: > > https://gcc.gnu.org/pipermail/gcc-patches/2017-October/484606.html > > > > Looks like single use is in there because of interactions with another > > pattern. > > > > jeff > > > > > > > > > > >
On 1/7/2026 3:06 PM, Andrew Pinski wrote: > Yes afterwards, the sink happens. And pushed ` y_3(D) << 2`/`_1 + > z_4(D)` into the branch. > And then uncprop changes `0(2)` into `y_3(D)(2)` in the PHI (sometimes > that is better; sometimes that is worse; for riscv with zicond it is > usually worse because ifcvt does not always convert back into 0). In general uncprop should be profitable as it reduces the number of constant initializations created by conditional equivalences. Essentially it replaces a constant zero with an object known to have the value zero on the appropriate path. RTL cse/gcse/cprop all know how to handle conditional equivalences and should push the 0 value back in. It's also the cas that the zicond code knows how to handle cases where both arms are pseudos, but one of the arms is also used in the test and as a result we know its going to have the value zero in the context we care about. > I wonder if keeping `(lshift @0 @4)` as being compared to 0 if that > would be better, it would at least keep one expression out of the > branch. Dunno. I'd probably need to see before/after RTL and assembly code. > > >> >> With the pattern I sent this is the 'optimized' dump I get: >> >> >> long int frob (long int x, long int y, long int z) >> { >> long int ret; >> long int _1; >> _Bool _7; >> long int _8; >> >> ;; basic block 2, loop depth 0, count 1073741824 (estimated locally, >> freq 1.0000), maybe hot >> ;; prev block 0, next block 1, flags: (NEW, REACHABLE, VISITED) >> ;; pred: ENTRY [always] count:1073741824 (estimated locally, >> freq 1.0000) (FALLTHRU,EXECUTABLE) >> _1 = y_3(D) << 2; >> ret_5 = _1 + z_4(D); >> _7 = y_3(D) == 0; >> _8 = _7 ? 0 : ret_5; >> # VUSE <.MEM_6(D)> >> return _8; >> ;; succ: EXIT [always] count:1073741824 (estimated locally, >> freq 1.0000) (EXECUTABLE) test.c:8:10 >> >> >> I'm not sure why we're having this difference given that the patterns >> are quite similar. > Because when you have a cond_expr in the output of a match pattern, we > don't recreate an if/else branches. > In the pattern without the cond_expr, the sink pass sinks the other > expressions. While here we have an explicit COND_EXPR rather than > control flow. > The big question is, can ifcvt on the rtl level recover the first way > into the second way; I think currently it does not but I have not > fully looked; so we might to need help it on the gimple level (for GCC > 17 though). I would generally expect the form above to eventually collapse down to something ifcvt can convert using zicond. Jeff
On 1/7/2026 7:06 PM, Andrew Pinski wrote: > On Wed, Jan 7, 2026 at 1:47 PM Daniel Henrique Barboza > <daniel.barboza@oss.qualcomm.com> wrote: >> >> >> >> On 1/7/2026 11:44 AM, Jeffrey Law wrote: >>> >>> >>> On 1/6/2026 3:13 PM, Andrew Pinski wrote: >>>> On Tue, Jan 6, 2026 at 1:46 PM Daniel Barboza >>>> <daniel.barboza@oss.qualcomm.com> wrote: >>>>> Add a transformation for a nested lshift/rshift inside a plus that >>>>> compares for >>>>> equality with the same operand of the plus. In other words: >>>>> >>>>> ((a OP b) + c EQ|NE c) ? x : y >>>>> >>>>> becomes, for OP = (lshift, rshift) and in a scenario without overflows: >>>>> >>>>> a !=/== 0 ? x : y >>>> I think we want the transformation even if it is used outside of a >>>> cond_expr. >>>> Also the above is valid even for types that wrap; just not valid for >>>> types that trap on overflow. >>>> >>>> As we already have a pattern for `a + C == C`: >>>> ``` >>>> /* For equality, this is also true with wrapping overflow. */ >>>> (for op (eq ne) >>>> (simplify >>>> (op:c (nop_convert?@3 (plus:c@2 @0 (convert1? @1))) (convert2? @1)) >>>> (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0)) >>>> && (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)) >>>> || TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))) >>>> && (CONSTANT_CLASS_P (@0) || (single_use (@2) && single_use >>>> (@3))) >>>> && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@2)) >>>> && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@1))) >>>> (op @0 { build_zero_cst (TREE_TYPE (@0)); }))) >>>> ``` >>>> >>>> The problem is the above does not work as there is an single_use check >>>> on the plus. The single use check was there since the original match >>>> pattern was added. >>>> I am not sure if we should add a special case where in the above >>>> pattern @0 is a shift. >>>> Though changing that will have to wait for GCC 17 I think. >> >> >> I tried variation of that pattern, taking into account the lshift and >> removing the single_use and constant_class_p conditionals: >> >> /* Do the same but with lshift|rshift as @0. */ >> (for op (eq ne) >> (simplify >> (op:c (nop_convert?@3 (plus:c@2 (lshift @0 @4) (convert1? @1))) >> (convert2? @1)) >> (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0)) >> && (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)) >> || TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))) >> && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@2)) >> && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@1))) >> (op @0 { build_zero_cst (TREE_TYPE (@0)); })))) >> >> >> And I got the following 'optimized' dump: >> >> >> long int frob (long int x, long int y, long int z) >> { >> long int ret; >> long int _1; >> >> ;; basic block 2, loop depth 0, count 1073741824 (estimated locally, >> freq 1.0000), maybe hot >> ;; prev block 0, next block 3, flags: (NEW, REACHABLE, VISITED) >> ;; pred: ENTRY [always] count:1073741824 (estimated locally, >> freq 1.0000) (FALLTHRU,EXECUTABLE) >> if (y_3(D) == 0) >> goto <bb 4>; [50.00%] >> else >> goto <bb 3>; [50.00%] >> ;; succ: 4 [50.0% (guessed)] count:536870912 (estimated >> locally, freq 0.5000) (TRUE_VALUE,EXECUTABLE) >> ;; 3 [50.0% (guessed)] count:536870912 (estimated >> locally, freq 0.5000) (FALSE_VALUE,EXECUTABLE) >> >> ;; basic block 3, loop depth 0, count 536870912 (estimated locally, >> freq 0.5000), maybe hot >> ;; prev block 2, next block 4, flags: (NEW, REACHABLE, VISITED) >> ;; pred: 2 [50.0% (guessed)] count:536870912 (estimated >> locally, freq 0.5000) (FALSE_VALUE,EXECUTABLE) >> _1 = y_3(D) << 2; >> ret_5 = _1 + z_4(D); >> ;; succ: 4 [always] count:536870912 (estimated locally, freq >> 0.5000) (FALLTHRU,EXECUTABLE) >> >> ;; basic block 4, loop depth 0, count 1073741824 (estimated locally, >> freq 1.0000), maybe hot >> ;; prev block 3, next block 1, flags: (NEW, REACHABLE, VISITED) >> ;; pred: 3 [always] count:536870912 (estimated locally, freq >> 0.5000) (FALLTHRU,EXECUTABLE) >> ;; 2 [50.0% (guessed)] count:536870912 (estimated >> locally, freq 0.5000) (TRUE_VALUE,EXECUTABLE) >> # ret_2 = PHI <ret_5(3), y_3(D)(2)> >> # VUSE <.MEM_6(D)> >> return ret_2; >> ;; succ: EXIT [always] count:1073741824 (estimated locally, >> freq 1.0000) (EXECUTABLE) test.c:8:10 >> >> } > > Yes afterwards, the sink happens. And pushed ` y_3(D) << 2`/`_1 + > z_4(D)` into the branch. > And then uncprop changes `0(2)` into `y_3(D)(2)` in the PHI (sometimes > that is better; sometimes that is worse; for riscv with zicond it is > usually worse because ifcvt does not always convert back into 0). > I wonder if keeping `(lshift @0 @4)` as being compared to 0 if that > would be better, it would at least keep one expression out of the > branch. I did some tests with the RISC-V target to see the final .s for each (-march=rv64gcv_zba_zbb_zbc_zbs_zfa_zicond). For the pattern I sent: +(for cmp (eq ne) + (for op (lshift rshift) + (simplify + (cond (cmp:c (plus:c (op @1 @2) @0) @0) @3 @4) + (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@1)) + && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@1))) + (with + { + fold_overflow_warning ("assuming signed overflow does not occur " + "when changing (X << imm) + C1 cmp C1 to " + "X << imm cmp 0 and X << imm cmp 0 " + "to X cmp 0", + WARN_STRICT_OVERFLOW_COMPARISON); } + (cond (cmp @1 { build_zero_cst (type); }) @3 @4)))))) frob: .LFB0: .cfi_startproc sh2add a2,a1,a2 czero.eqz a0,a2,a1 ret .cfi_endproc .LFE0: Using a variation of the existing pattern Andrew mentioned, returning a lshift czero compare: +/* Do the same but with lshift|rshift as @0. */ +(for cmp (eq ne) + (for op (lshift rshift) + (simplify + (cmp:c (nop_convert?@3 (plus:c@2 (op @0 @4) (convert1? @1))) (convert2? @1)) + (with { tree type0 = TREE_TYPE (@0); } + (if (ANY_INTEGRAL_TYPE_P (type0) + && (TYPE_OVERFLOW_UNDEFINED (type0) + || TYPE_OVERFLOW_WRAPS (type0)) + && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@2)) + && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@1))) + (cmp (convert:type0 (op @0 @4)) { build_zero_cst (type0); })))))) frob: .LFB0: .cfi_startproc slli a1,a1,2 czero.eqz a0,a2,a1 add a0,a1,a0 ret .cfi_endproc .LFE0: Same thing, but returning a @0 czero compare instead: +(for cmp (eq ne) + (for op (lshift rshift) + (simplify + (cmp:c (nop_convert?@3 (plus:c@2 (op @0 @4) (convert1? @1))) (convert2? @1)) + (with { tree type0 = TREE_TYPE (@0); } + (if (ANY_INTEGRAL_TYPE_P (type0) + && (TYPE_OVERFLOW_UNDEFINED (type0) + || TYPE_OVERFLOW_WRAPS (type0)) + && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@2)) + && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@1))) + (cmp @0 { build_zero_cst (type0); })))))) frob: .LFB0: .cfi_startproc sh2add a2,a1,a2 czero.eqz a0,a2,a1 ret .cfi_endproc .LFE0: So it seems to me that yes, ifcvt can convert from the existing pattern to the new pattern I sent if we return a @0 czero compare. I prefer adding a variation of an existing pattern instead of a brand new one so I would go with this approach. But it only works without the single_use() checks. Given that this pattern is rather specific, as long as we're not regressing existing tests, maybe it's ok? Thanks, Daniel > > >> >> >> With the pattern I sent this is the 'optimized' dump I get: >> >> >> long int frob (long int x, long int y, long int z) >> { >> long int ret; >> long int _1; >> _Bool _7; >> long int _8; >> >> ;; basic block 2, loop depth 0, count 1073741824 (estimated locally, >> freq 1.0000), maybe hot >> ;; prev block 0, next block 1, flags: (NEW, REACHABLE, VISITED) >> ;; pred: ENTRY [always] count:1073741824 (estimated locally, >> freq 1.0000) (FALLTHRU,EXECUTABLE) >> _1 = y_3(D) << 2; >> ret_5 = _1 + z_4(D); >> _7 = y_3(D) == 0; >> _8 = _7 ? 0 : ret_5; >> # VUSE <.MEM_6(D)> >> return _8; >> ;; succ: EXIT [always] count:1073741824 (estimated locally, >> freq 1.0000) (EXECUTABLE) test.c:8:10 >> >> >> I'm not sure why we're having this difference given that the patterns >> are quite similar. > > Because when you have a cond_expr in the output of a match pattern, we > don't recreate an if/else branches. > In the pattern without the cond_expr, the sink pass sinks the other > expressions. While here we have an explicit COND_EXPR rather than > control flow. > The big question is, can ifcvt on the rtl level recover the first way > into the second way; I think currently it does not but I have not > fully looked; so we might to need help it on the gimple level (for GCC > 17 though). > >> >> By the way my initial idea for upstreaming was to split the pattern in >> two. One like this: >> >> X << N EQ|NE 0 -> X EQ|NE 0 >> >> And another one like the original I sent but returning the full lshift >> comparison instead of doing the reduction, i.e.: >> >> (a OP b) + c EQ|NE c -> (a OP b) EQ|NE 0 >> >> >> But I got a lot of trouble with errors like: >> >> Analyzing compilation unit >> Performing interprocedural optimizations >> <*free_lang_data> {heap 1028k} <visibility> {heap 1028k} >> <build_ssa_passes> {heap 1028k} <targetclone> {heap 1440k} <opt_lo >> cal_passes> {heap 1440k}during GIMPLE pass: ccp >> dump file: test.c.038t.ccp1 >> test.c: In function ‘frob’: >> test.c:9:1: internal compiler error: in decompose, at wide-int.h:1049 >> 9 | } >> | ^ >> 0x2ecc34f internal_error(char const*, ...) >> ../../gcc/diagnostic-global-context.cc:787 >> 0xf0790f fancy_abort(char const*, int, char const*) >> ../../gcc/diagnostics/context.cc:1805 >> 0xa5b276 wi::int_traits<generic_wide_int<wide_int_ref_storage<false, >> false> > >::decompose(long*, unsigned int, generic_wide >> _int<wide_int_ref_storage<false, false> > const&) >> ../../gcc/wide-int.h:1049 >> 0xa5d419 wi::int_traits<generic_wide_int<wide_int_storage> >> >::decompose(long*, unsigned int, generic_wide_int<wide_int_stora >> ge> const&) >> ../../gcc/tree.h:3906 >> 0xa5d419 wide_int_ref_storage<true, >> false>::wide_int_ref_storage<generic_wide_int<wide_int_storage> >> >(generic_wide_int<wide_ >> int_storage> const&, unsigned int) >> ../../gcc/wide-int.h:1099 >> 0xa5d419 generic_wide_int<wide_int_ref_storage<true, false> >> >::generic_wide_int<generic_wide_int<wide_int_storage> >(generic >> _wide_int<wide_int_storage> const&, unsigned int) >> ../../gcc/wide-int.h:855 >> >> My guess is that I was messing up the relation between the 'type' >> precision and the precision used in build_zero_cst(). I gave up and >> sent the full pattern, but I wonder if splitting it would be better. > > This usually does mean a type mismatch has happened or you are > comparing two `wi::to_wide` which come from different precision > INTEGER_CSTs. > > Thanks, > Andrew > >> >> >> Thanks, >> >> Daniel >> >> >> >> >>>> >>>> Jeff/Richard, >>>> What are your thoughts on this? I know Richard had thoughts on some >>>> of the single_use in the match patterns before but I have not tracked >>>> them that well. >>> >>> I had no idea we had that match.pd pattern; clearly what Daniel is doing >>> is closely related. >>> >>> The single_use in those patterns comes up here: >>> https://gcc.gnu.org/pipermail/gcc-patches/2017-October/484606.html >>> >>> Looks like single use is in there because of interactions with another >>> pattern. >>> >>> jeff >>> >>> >>> >>> >>> >>
On Thu, Jan 8, 2026 at 10:26 PM Daniel Henrique Barboza <daniel.barboza@oss.qualcomm.com> wrote: > > > > On 1/7/2026 7:06 PM, Andrew Pinski wrote: > > On Wed, Jan 7, 2026 at 1:47 PM Daniel Henrique Barboza > > <daniel.barboza@oss.qualcomm.com> wrote: > >> > >> > >> > >> On 1/7/2026 11:44 AM, Jeffrey Law wrote: > >>> > >>> > >>> On 1/6/2026 3:13 PM, Andrew Pinski wrote: > >>>> On Tue, Jan 6, 2026 at 1:46 PM Daniel Barboza > >>>> <daniel.barboza@oss.qualcomm.com> wrote: > >>>>> Add a transformation for a nested lshift/rshift inside a plus that > >>>>> compares for > >>>>> equality with the same operand of the plus. In other words: > >>>>> > >>>>> ((a OP b) + c EQ|NE c) ? x : y > >>>>> > >>>>> becomes, for OP = (lshift, rshift) and in a scenario without overflows: > >>>>> > >>>>> a !=/== 0 ? x : y > >>>> I think we want the transformation even if it is used outside of a > >>>> cond_expr. > >>>> Also the above is valid even for types that wrap; just not valid for > >>>> types that trap on overflow. > >>>> > >>>> As we already have a pattern for `a + C == C`: > >>>> ``` > >>>> /* For equality, this is also true with wrapping overflow. */ > >>>> (for op (eq ne) > >>>> (simplify > >>>> (op:c (nop_convert?@3 (plus:c@2 @0 (convert1? @1))) (convert2? @1)) > >>>> (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0)) > >>>> && (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)) > >>>> || TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))) > >>>> && (CONSTANT_CLASS_P (@0) || (single_use (@2) && single_use > >>>> (@3))) > >>>> && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@2)) > >>>> && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@1))) > >>>> (op @0 { build_zero_cst (TREE_TYPE (@0)); }))) > >>>> ``` > >>>> > >>>> The problem is the above does not work as there is an single_use check > >>>> on the plus. The single use check was there since the original match > >>>> pattern was added. > >>>> I am not sure if we should add a special case where in the above > >>>> pattern @0 is a shift. > >>>> Though changing that will have to wait for GCC 17 I think. > >> > >> > >> I tried variation of that pattern, taking into account the lshift and > >> removing the single_use and constant_class_p conditionals: > >> > >> /* Do the same but with lshift|rshift as @0. */ > >> (for op (eq ne) > >> (simplify > >> (op:c (nop_convert?@3 (plus:c@2 (lshift @0 @4) (convert1? @1))) > >> (convert2? @1)) > >> (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0)) > >> && (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)) > >> || TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))) > >> && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@2)) > >> && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@1))) > >> (op @0 { build_zero_cst (TREE_TYPE (@0)); })))) > >> > >> > >> And I got the following 'optimized' dump: > >> > >> > >> long int frob (long int x, long int y, long int z) > >> { > >> long int ret; > >> long int _1; > >> > >> ;; basic block 2, loop depth 0, count 1073741824 (estimated locally, > >> freq 1.0000), maybe hot > >> ;; prev block 0, next block 3, flags: (NEW, REACHABLE, VISITED) > >> ;; pred: ENTRY [always] count:1073741824 (estimated locally, > >> freq 1.0000) (FALLTHRU,EXECUTABLE) > >> if (y_3(D) == 0) > >> goto <bb 4>; [50.00%] > >> else > >> goto <bb 3>; [50.00%] > >> ;; succ: 4 [50.0% (guessed)] count:536870912 (estimated > >> locally, freq 0.5000) (TRUE_VALUE,EXECUTABLE) > >> ;; 3 [50.0% (guessed)] count:536870912 (estimated > >> locally, freq 0.5000) (FALSE_VALUE,EXECUTABLE) > >> > >> ;; basic block 3, loop depth 0, count 536870912 (estimated locally, > >> freq 0.5000), maybe hot > >> ;; prev block 2, next block 4, flags: (NEW, REACHABLE, VISITED) > >> ;; pred: 2 [50.0% (guessed)] count:536870912 (estimated > >> locally, freq 0.5000) (FALSE_VALUE,EXECUTABLE) > >> _1 = y_3(D) << 2; > >> ret_5 = _1 + z_4(D); > >> ;; succ: 4 [always] count:536870912 (estimated locally, freq > >> 0.5000) (FALLTHRU,EXECUTABLE) > >> > >> ;; basic block 4, loop depth 0, count 1073741824 (estimated locally, > >> freq 1.0000), maybe hot > >> ;; prev block 3, next block 1, flags: (NEW, REACHABLE, VISITED) > >> ;; pred: 3 [always] count:536870912 (estimated locally, freq > >> 0.5000) (FALLTHRU,EXECUTABLE) > >> ;; 2 [50.0% (guessed)] count:536870912 (estimated > >> locally, freq 0.5000) (TRUE_VALUE,EXECUTABLE) > >> # ret_2 = PHI <ret_5(3), y_3(D)(2)> > >> # VUSE <.MEM_6(D)> > >> return ret_2; > >> ;; succ: EXIT [always] count:1073741824 (estimated locally, > >> freq 1.0000) (EXECUTABLE) test.c:8:10 > >> > >> } > > > > Yes afterwards, the sink happens. And pushed ` y_3(D) << 2`/`_1 + > > z_4(D)` into the branch. > > And then uncprop changes `0(2)` into `y_3(D)(2)` in the PHI (sometimes > > that is better; sometimes that is worse; for riscv with zicond it is > > usually worse because ifcvt does not always convert back into 0). > > I wonder if keeping `(lshift @0 @4)` as being compared to 0 if that > > would be better, it would at least keep one expression out of the > > branch. > > > I did some tests with the RISC-V target to see the final .s for each > (-march=rv64gcv_zba_zbb_zbc_zbs_zfa_zicond). For the pattern I sent: > > +(for cmp (eq ne) > + (for op (lshift rshift) > + (simplify > + (cond (cmp:c (plus:c (op @1 @2) @0) @0) @3 @4) > + (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@1)) > + && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@1))) > + (with > + { > + fold_overflow_warning ("assuming signed overflow does not occur " > + "when changing (X << imm) + C1 cmp C1 to " > + "X << imm cmp 0 and X << imm cmp 0 " > + "to X cmp 0", > + WARN_STRICT_OVERFLOW_COMPARISON); } Please don't any new fold_overflow_warning things, those are borderly useless (IMO all of them should be wiped). No comments on the rest, I didn't review the discussion. Richard. > + (cond (cmp @1 { build_zero_cst (type); }) @3 @4)))))) > > frob: > .LFB0: > .cfi_startproc > sh2add a2,a1,a2 > czero.eqz a0,a2,a1 > ret > .cfi_endproc > .LFE0: > > > Using a variation of the existing pattern Andrew mentioned, returning a > lshift czero compare: > > > +/* Do the same but with lshift|rshift as @0. */ > +(for cmp (eq ne) > + (for op (lshift rshift) > + (simplify > + (cmp:c (nop_convert?@3 (plus:c@2 (op @0 @4) (convert1? @1))) > (convert2? @1)) > + (with { tree type0 = TREE_TYPE (@0); } > + (if (ANY_INTEGRAL_TYPE_P (type0) > + && (TYPE_OVERFLOW_UNDEFINED (type0) > + || TYPE_OVERFLOW_WRAPS (type0)) > + && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@2)) > + && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@1))) > + (cmp (convert:type0 (op @0 @4)) { build_zero_cst (type0); })))))) > > > frob: > .LFB0: > .cfi_startproc > slli a1,a1,2 > czero.eqz a0,a2,a1 > add a0,a1,a0 > ret > .cfi_endproc > .LFE0: > > > Same thing, but returning a @0 czero compare instead: > > +(for cmp (eq ne) > + (for op (lshift rshift) > + (simplify > + (cmp:c (nop_convert?@3 (plus:c@2 (op @0 @4) (convert1? @1))) > (convert2? @1)) > + (with { tree type0 = TREE_TYPE (@0); } > + (if (ANY_INTEGRAL_TYPE_P (type0) > + && (TYPE_OVERFLOW_UNDEFINED (type0) > + || TYPE_OVERFLOW_WRAPS (type0)) > + && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@2)) > + && tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@1))) > + (cmp @0 { build_zero_cst (type0); })))))) > > > frob: > .LFB0: > .cfi_startproc > sh2add a2,a1,a2 > czero.eqz a0,a2,a1 > ret > .cfi_endproc > .LFE0: > > > So it seems to me that yes, ifcvt can convert from the existing pattern > to the new pattern I sent if we return a @0 czero compare. > > > I prefer adding a variation of an existing pattern instead of a brand > new one so I would go with this approach. But it only works without the > single_use() checks. Given that this pattern is rather specific, as long > as we're not regressing existing tests, maybe it's ok? > > > Thanks, > > Daniel > > > > > > > > > >> > >> > >> With the pattern I sent this is the 'optimized' dump I get: > >> > >> > >> long int frob (long int x, long int y, long int z) > >> { > >> long int ret; > >> long int _1; > >> _Bool _7; > >> long int _8; > >> > >> ;; basic block 2, loop depth 0, count 1073741824 (estimated locally, > >> freq 1.0000), maybe hot > >> ;; prev block 0, next block 1, flags: (NEW, REACHABLE, VISITED) > >> ;; pred: ENTRY [always] count:1073741824 (estimated locally, > >> freq 1.0000) (FALLTHRU,EXECUTABLE) > >> _1 = y_3(D) << 2; > >> ret_5 = _1 + z_4(D); > >> _7 = y_3(D) == 0; > >> _8 = _7 ? 0 : ret_5; > >> # VUSE <.MEM_6(D)> > >> return _8; > >> ;; succ: EXIT [always] count:1073741824 (estimated locally, > >> freq 1.0000) (EXECUTABLE) test.c:8:10 > >> > >> > >> I'm not sure why we're having this difference given that the patterns > >> are quite similar. > > > > Because when you have a cond_expr in the output of a match pattern, we > > don't recreate an if/else branches. > > In the pattern without the cond_expr, the sink pass sinks the other > > expressions. While here we have an explicit COND_EXPR rather than > > control flow. > > The big question is, can ifcvt on the rtl level recover the first way > > into the second way; I think currently it does not but I have not > > fully looked; so we might to need help it on the gimple level (for GCC > > 17 though). > > > >> > >> By the way my initial idea for upstreaming was to split the pattern in > >> two. One like this: > >> > >> X << N EQ|NE 0 -> X EQ|NE 0 > >> > >> And another one like the original I sent but returning the full lshift > >> comparison instead of doing the reduction, i.e.: > >> > >> (a OP b) + c EQ|NE c -> (a OP b) EQ|NE 0 > >> > >> > >> But I got a lot of trouble with errors like: > >> > >> Analyzing compilation unit > >> Performing interprocedural optimizations > >> <*free_lang_data> {heap 1028k} <visibility> {heap 1028k} > >> <build_ssa_passes> {heap 1028k} <targetclone> {heap 1440k} <opt_lo > >> cal_passes> {heap 1440k}during GIMPLE pass: ccp > >> dump file: test.c.038t.ccp1 > >> test.c: In function ‘frob’: > >> test.c:9:1: internal compiler error: in decompose, at wide-int.h:1049 > >> 9 | } > >> | ^ > >> 0x2ecc34f internal_error(char const*, ...) > >> ../../gcc/diagnostic-global-context.cc:787 > >> 0xf0790f fancy_abort(char const*, int, char const*) > >> ../../gcc/diagnostics/context.cc:1805 > >> 0xa5b276 wi::int_traits<generic_wide_int<wide_int_ref_storage<false, > >> false> > >::decompose(long*, unsigned int, generic_wide > >> _int<wide_int_ref_storage<false, false> > const&) > >> ../../gcc/wide-int.h:1049 > >> 0xa5d419 wi::int_traits<generic_wide_int<wide_int_storage> > >> >::decompose(long*, unsigned int, generic_wide_int<wide_int_stora > >> ge> const&) > >> ../../gcc/tree.h:3906 > >> 0xa5d419 wide_int_ref_storage<true, > >> false>::wide_int_ref_storage<generic_wide_int<wide_int_storage> > >> >(generic_wide_int<wide_ > >> int_storage> const&, unsigned int) > >> ../../gcc/wide-int.h:1099 > >> 0xa5d419 generic_wide_int<wide_int_ref_storage<true, false> > >> >::generic_wide_int<generic_wide_int<wide_int_storage> >(generic > >> _wide_int<wide_int_storage> const&, unsigned int) > >> ../../gcc/wide-int.h:855 > >> > >> My guess is that I was messing up the relation between the 'type' > >> precision and the precision used in build_zero_cst(). I gave up and > >> sent the full pattern, but I wonder if splitting it would be better. > > > > This usually does mean a type mismatch has happened or you are > > comparing two `wi::to_wide` which come from different precision > > INTEGER_CSTs. > > > > Thanks, > > Andrew > > > >> > >> > >> Thanks, > >> > >> Daniel > >> > >> > >> > >> > >>>> > >>>> Jeff/Richard, > >>>> What are your thoughts on this? I know Richard had thoughts on some > >>>> of the single_use in the match patterns before but I have not tracked > >>>> them that well. > >>> > >>> I had no idea we had that match.pd pattern; clearly what Daniel is doing > >>> is closely related. > >>> > >>> The single_use in those patterns comes up here: > >>> https://gcc.gnu.org/pipermail/gcc-patches/2017-October/484606.html > >>> > >>> Looks like single use is in there because of interactions with another > >>> pattern. > >>> > >>> jeff > >>> > >>> > >>> > >>> > >>> > >> >
On 1/9/2026 12:24 AM, Richard Biener wrote: > Please don't any new fold_overflow_warning things, those are borderly > useless (IMO all of them should be wiped). > > No comments on the rest, I didn't review the discussion. Blame me for that. I saw a note somewhere in our docs that made me believe it was a requirement for these kinds of cases. It's certainly simpler if we don't have to do that! jeff
On 1/9/2026 10:44 AM, Jeffrey Law wrote: > > > On 1/9/2026 12:24 AM, Richard Biener wrote: >> Please don't any new fold_overflow_warning things, those are borderly >> useless (IMO all of them should be wiped). >> >> No comments on the rest, I didn't review the discussion. > Blame me for that. I saw a note somewhere in our docs that made me > believe it was a requirement for these kinds of cases. It's certainly > simpler if we don't have to do that! The note is on the docs of the macro in gcc/tree.h: /* True if overflow is undefined for the given integral or pointer type. We may optimize on the assumption that values in the type never overflow. IMPORTANT NOTE: Any optimization based on TYPE_OVERFLOW_UNDEFINED must issue a warning based on warn_strict_overflow. In some cases it will be appropriate to issue the warning immediately, and in other cases it will be appropriate to simply set a flag and let the caller decide whether a warning is appropriate or not. */ #define TYPE_OVERFLOW_UNDEFINED(TYPE) \ Daniel > > jeff
On Fri, Jan 9, 2026 at 3:45 PM Daniel Henrique Barboza <daniel.barboza@oss.qualcomm.com> wrote: > > > > On 1/9/2026 10:44 AM, Jeffrey Law wrote: > > > > > > On 1/9/2026 12:24 AM, Richard Biener wrote: > >> Please don't any new fold_overflow_warning things, those are borderly > >> useless (IMO all of them should be wiped). > >> > >> No comments on the rest, I didn't review the discussion. > > Blame me for that. I saw a note somewhere in our docs that made me > > believe it was a requirement for these kinds of cases. It's certainly > > simpler if we don't have to do that! > > The note is on the docs of the macro in gcc/tree.h: > > /* True if overflow is undefined for the given integral or pointer type. > We may optimize on the assumption that values in the type never > overflow. > > IMPORTANT NOTE: Any optimization based on TYPE_OVERFLOW_UNDEFINED > must issue a warning based on warn_strict_overflow. In some cases > it will be appropriate to issue the warning immediately, and in > other cases it will be appropriate to simply set a flag and let the > caller decide whether a warning is appropriate or not. */ > #define TYPE_OVERFLOW_UNDEFINED(TYPE) \ Ah, we should remove that note. -Wstrict-overflow proved useless IMO, it's way too noisy as it diagnoses when the compiler relies on overflow not happening, not diagnosing when it possibly happens. That's not a very useful diagnostic to have - it does not point to a possible problem in the code (we could as well diagnose _all_ signed arithmetic operations for the same argument that we might eventually rely on overflow not happening). The documentation about -Wstrict-overflow actually says this. UBSAN is a _much_ more useful tool given it diagnoses signed overflow that actually happens! Richard. > > Daniel > > > > > > jeff >
diff --git a/gcc/match.pd b/gcc/match.pd index ccdc1129e23..fdc18b1ef41 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -2824,6 +2824,23 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (le (minus (convert:etype @0) { lo; }) { hi; }) (gt (minus (convert:etype @0) { lo; }) { hi; }))))))))) +/* Reduces ((A OP B) + C EQ|NE C) ? X : Y to + A EQ|NE 0 ? x : y, for OP = (lshift, rshift). */ +(for cmp (eq ne) + (for op (lshift rshift) + (simplify + (cond (cmp:c (plus:c (op @1 @2) @0) @0) @3 @4) + (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@1)) + && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@1))) + (with + { + fold_overflow_warning ("assuming signed overflow does not occur " + "when changing (X << imm) + C1 cmp C1 to " + "X << imm cmp 0 and X << imm cmp 0 " + "to X cmp 0", + WARN_STRICT_OVERFLOW_COMPARISON); } + (cond (cmp @1 { build_zero_cst (type); }) @3 @4)))))) + /* X + Z < Y + Z is the same as X < Y when there is no overflow. */ (for op (lt le ge gt) (simplify diff --git a/gcc/testsuite/gcc.dg/torture/ifcond-plus-shift-czero.c b/gcc/testsuite/gcc.dg/torture/ifcond-plus-shift-czero.c new file mode 100644 index 00000000000..769412202c3 --- /dev/null +++ b/gcc/testsuite/gcc.dg/torture/ifcond-plus-shift-czero.c @@ -0,0 +1,25 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-fdump-tree-phiopt2" } */ +/* { dg-skip-if "" { *-*-* } { "-O0" "-fno-fat-lto-objects" } { "" } } */ + +long +f1 (long x, long y, long z) +{ + long ret = (y << 2) + z; + long cond = ret != z; + if (cond == 0) + ret = 0; + return ret; +} + +long +f2 (long x, long y, long z) +{ + long ret = (y >> 2) + z; + long cond = ret != z; + if (cond == 0) + ret = 0; + return ret; +} + +/* { dg-final { scan-tree-dump-times "== 0" 2 "phiopt2" } } */