Message ID | 20211125081834.GX2646553@tucnak |
---|---|
State | Committed |
Series | match.pd: Fix up the recent bitmask_inv_cst_vector_p simplification [PR103417] |
Commit Message
Jakub Jelinek
Nov. 25, 2021, 8:18 a.m. UTC
Hi!

The following testcase is miscompiled since the r12-5489-g0888d6bbe97e10
changes.
The simplification triggers on
  (x & 4294967040U) >= 0U
and turns it into:
  x <= 255U
which is incorrect; it should fold to 1, because unsigned >= 0U is always
true, and normally the
  /* Non-equality compare simplifications from fold_binary  */
      (if (wi::to_wide (cst) == min)
       (if (cmp == GE_EXPR)
	{ constant_boolean_node (true, type); })
simplification folds that, but this simplification was done earlier.

The simplification correctly doesn't include lt, which shouldn't be handled
for the same reason; we'll fold it to 0 elsewhere.

But, IMNSHO, while it isn't incorrect to handle le and gt there, it is
unnecessary, because (x & cst) <= 0U and (x & cst) > 0U should never appear;
again in
  /* Non-equality compare simplifications from fold_binary  */
we have a simplification for it:
      (if (cmp == LE_EXPR)
       (eq @2 @1))
      (if (cmp == GT_EXPR)
       (ne @2 @1))))
This is done for
  (cmp (convert?@2 @0) uniform_integer_cst_p@1)
and so should be done for both integers and vectors.
As the bitmask_inv_cst_vector_p simplification only handles eq and ne for
signed types, I think it can be simplified to just the following patch.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

I wonder if (for cst (VECTOR_CST INTEGER_CST) is good for the best size of
*-match.c; wouldn't it be better to accept just CONSTANT_CLASS_P@1 and then
have bitmask_inv_cst_vector_p return NULL_TREE if it isn't INTEGER_CST or
VECTOR_CST?

Also, without/with this patch I see on i686-linux (can be reproduced with
RUNTESTFLAGS="--target_board=unix/-m32/-mno-sse dg.exp='bic-bitmask* signbit-2*'"
too):
FAIL: gcc.dg/bic-bitmask-10.c scan-tree-dump dce7 "<=\\\\s*.+{ 255,.+}"
FAIL: gcc.dg/bic-bitmask-11.c scan-tree-dump dce7 ">\\\\s*.+{ 255,.+}"
FAIL: gcc.dg/bic-bitmask-12.c scan-tree-dump dce7 "<=\\\\s*.+{ 255,.+}"
FAIL: gcc.dg/bic-bitmask-2.c scan-tree-dump-times dce7 "<=\\\\s*.+{ 255,.+}" 1
FAIL: gcc.dg/bic-bitmask-23.c (test for excess errors)
FAIL: gcc.dg/bic-bitmask-23.c scan-tree-dump dce7 "<=\\\\s*.+{ 255, 15, 1, 65535 }"
FAIL: gcc.dg/bic-bitmask-3.c scan-tree-dump-times dce7 "<=\\\\s*.+{ 255,.+}" 1
FAIL: gcc.dg/bic-bitmask-4.c scan-tree-dump-times dce7 "=\\\\s*.+{ 1,.+}" 1
FAIL: gcc.dg/bic-bitmask-5.c scan-tree-dump-times dce7 ">\\\\s*.+{ 255,.+}" 1
FAIL: gcc.dg/bic-bitmask-6.c scan-tree-dump-times dce7 "<=\\\\s*.+{ 255,.+}" 1
FAIL: gcc.dg/bic-bitmask-8.c scan-tree-dump-times dce7 ">\\\\s*.+{ 1,.+}" 1
FAIL: gcc.dg/bic-bitmask-9.c scan-tree-dump dce7 "&\\\\s*.+{ 4294967290,.+}"
FAIL: gcc.dg/signbit-2.c scan-tree-dump optimized "\\\\s+>\\\\s+{ 0(, 0)+ }"
Those tests use the vect_int effective target, but AFAIK that can be used
only in *.dg/vect/ because it relies on vect.exp enabling options to support
vectorization on the particular target (e.g. for i686-linux that is -msse2).
I think there isn't another way to get the DEFAULT_VECTCFLAGS into
dg-options other than having the test driven by vect.exp.

And, finally, I've noticed incorrect formatting in the new
bitmask_inv_cst_vector_p routine:
  do {
    if (idx > 0)
      cst = vector_cst_elt (t, idx);
    ...
    builder.quick_push (newcst);
  } while (++idx < nelts);
It should be
  do
    {
      if (idx > 0)
	cst = vector_cst_elt (t, idx);
      ...
      builder.quick_push (newcst);
    }
  while (++idx < nelts);

2021-11-25  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/103417
	* match.pd ((X & Y) CMP 0): Only handle eq and ne.  Commonalize
	common tests.

	* gcc.c-torture/execute/pr103417.c: New test.
Jakub
Comments
Hi Jakub,

> -----Original Message-----
> From: Jakub Jelinek <jakub@redhat.com>
> Sent: Thursday, November 25, 2021 8:19 AM
> To: Richard Biener <rguenther@suse.de>
> Cc: Tamar Christina <Tamar.Christina@arm.com>; gcc-patches@gcc.gnu.org
> Subject: [PATCH] match.pd: Fix up the recent bitmask_inv_cst_vector_p
> simplification [PR103417]
>
> The following testcase is miscompiled since the r12-5489-g0888d6bbe97e10
> changes.
> The simplification triggers on
>   (x & 4294967040U) >= 0U
> and turns it into:
>   x <= 255U
> which is incorrect, it should fold to 1 because unsigned >= 0U is always
> true [...]
> The simplification correctly doesn't include lt which has the same reason
> why it shouldn't be handled, we'll fold it to 0 elsewhere.

Yes, this was a bug; sorry, I'm not sure why I didn't catch it...

> But, IMNSHO while it isn't incorrect to handle le and gt there, it is
> unnecessary.  Because (x & cst) <= 0U and (x & cst) > 0U should never
> appear [...]
> As the bitmask_inv_cst_vector_p simplification only handles eq and ne for
> signed types, I think it can be simplified to just following patch.

As I mentioned on the PR, I don't think LE and GT should be removed; the
patch is attempting to simplify the bitmask used, because most vector ISAs
can create the simpler mask much more easily than the complex mask.

I.e. 0xFFFFFF00 is harder to create than 0xFF.  So while for scalar code it
doesn't matter as much, it does for vector code.

Regards,
Tamar

> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> [...]
>
> 2021-11-25  Jakub Jelinek  <jakub@redhat.com>
>
> 	PR tree-optimization/103417
> 	* match.pd ((X & Y) CMP 0): Only handle eq and ne.  Commonalize
> 	common tests.
>
> 	* gcc.c-torture/execute/pr103417.c: New test.
>
> --- gcc/match.pd.jj	2021-11-24 11:46:03.191918052 +0100
> +++ gcc/match.pd	2021-11-24 22:33:43.852575772 +0100
> @@ -5214,20 +5214,16 @@ (define_operator_list SYNC_FETCH_AND_AND
>  /* Transform comparisons of the form (X & Y) CMP 0 to X CMP2 Z
>     where ~Y + 1 == pow2 and Z = ~Y.  */
>  (for cst (VECTOR_CST INTEGER_CST)
> - (for cmp (le eq ne ge gt)
> -      icmp (le le gt le gt)
> -  (simplify
> -   (cmp (bit_and:c@2 @0 cst@1) integer_zerop)
> -    (with { tree csts = bitmask_inv_cst_vector_p (@1); }
> -     (switch
> -      (if (csts && TYPE_UNSIGNED (TREE_TYPE (@1))
> -	   && (VECTOR_TYPE_P (TREE_TYPE (@1)) || single_use (@2)))
> -       (icmp @0 { csts; }))
> -      (if (csts && !TYPE_UNSIGNED (TREE_TYPE (@1))
> -	   && (cmp == EQ_EXPR || cmp == NE_EXPR)
> -	   && (VECTOR_TYPE_P (TREE_TYPE (@1)) || single_use (@2)))
> + (for cmp (eq ne)
> +      icmp (le gt)
> +  (simplify
> +   (cmp (bit_and:c@2 @0 cst@1) integer_zerop)
> +    (with { tree csts = bitmask_inv_cst_vector_p (@1); }
> +     (if (csts && (VECTOR_TYPE_P (TREE_TYPE (@1)) || single_use (@2)))
> +      (if (TYPE_UNSIGNED (TREE_TYPE (@1)))
> +       (icmp @0 { csts; })
>        (with { tree utype = unsigned_type_for (TREE_TYPE (@1)); }
> -       (icmp (convert:utype @0) { csts; }))))))))
> +       (icmp (convert:utype @0) { csts; }))))))))
>
>  /* -A CMP -B -> B CMP A.  */
>  (for cmp (tcc_comparison)
> --- gcc/testsuite/gcc.c-torture/execute/pr103417.c.jj	2021-11-24 22:36:00.732626424 +0100
> +++ gcc/testsuite/gcc.c-torture/execute/pr103417.c	2021-11-24 22:35:43.964865218 +0100
> @@ -0,0 +1,11 @@
> +/* PR tree-optimization/103417 */
> +
> +struct { int a : 8; int b : 24; } c = { 0, 1 };
> +
> +int
> +main ()
> +{
> +  if (c.b && !c.b)
> +    __builtin_abort ();
> +  return 0;
> +}
>
> 	Jakub
On Thu, 25 Nov 2021, Tamar Christina wrote:

> Hi Jakub,
>
> > The simplification correctly doesn't include lt which has the same
> > reason why it shouldn't be handled, we'll fold it to 0 elsewhere.
>
> Yes this was a bug, sorry I'm not sure why I didn't catch it...
>
> > But, IMNSHO while it isn't incorrect to handle le and gt there, it is
> > unnecessary.  Because (x & cst) <= 0U and (x & cst) > 0U should never
> > appear [...]
> > As the bitmask_inv_cst_vector_p simplification only handles eq and ne
> > for signed types, I think it can be simplified to just following patch.

Note that would mean the transform should be ordered _after_ the above, even
if we retain it for vector le/gt.

> As I mentioned on the PR I don't think LE and GT should be removed, the
> patch is attempting to simplify the bitmask used because most vector ISAs
> can create the simpler mask much easier than the complex mask.
>
> I.e. 0xFFFFFF00 is harder to create than 0xFF.  So while for scalar it
> doesn't matter as much, it does for vector code.
>
> > I wonder if (for cst (VECTOR_CST INTEGER_CST) is good for the best size
> > of *-match.c, wouldn't accepting just CONSTANT_CLASS_P@1 and then just
> > say in bitmask_inv_cst_vector_p return NULL_TREE if it isn't
> > INTEGER_CST or VECTOR_CST?

In the end that should be recoverable by genmatch.  I do have some ideas to
improve it for size in this area, maybe during stage4.  Originally genmatch
was trying to optimize for matching speed, but with now honoring the
ordering of patterns that very much became secondary (note that re-ordering
patterns in match.pd can also improve *-match.c size greatly!  Maybe some
script can try to brute-force the "optimal" order - but note some pattern
order matters ;))
On Thu, Nov 25, 2021 at 08:23:50AM +0000, Tamar Christina wrote:
> As I mentioned on the PR I don't think LE and GT should be removed, the
> patch is attempting to simplify the bitmask used because most vector ISAs
> can create the simpler mask much easier than the complex mask.
>
> I.e. 0xFFFFFF00 is harder to create than 0xFF.  So while for scalar it
> doesn't matter as much, it does for vector code.

What I'm trying to explain is that you should never see those le or gt cases
with TYPE_UNSIGNED (especially when the simplification is moved after those
/* Non-equality compare simplifications from fold_binary */ I've mentioned),
because if you try:

typedef unsigned V __attribute__((vector_size (4)));

unsigned f1 (unsigned x) { unsigned z = 0; return x > z; }
unsigned f2 (unsigned x) { unsigned z = 0; return x <= z; }
V f3 (V x) { V z = (V) {}; return x > z; }
V f4 (V x) { V z = (V) {}; return x <= z; }

you'll see that this is simplified already at ccp1 when the constants
propagate, using the rules I mentioned, into x != 0U, x == 0U, x != (V) {}
and x == (V) {}.

The important rule of match.pd is composability: the simplifications should
rely on other simplifications and not repeat all their decisions, because
that makes the *match.c larger and more expensive (and a source of extra
possible bugs).

	Jakub
> -----Original Message-----
> From: Jakub Jelinek <jakub@redhat.com>
> Sent: Thursday, November 25, 2021 8:39 AM
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: Richard Biener <rguenther@suse.de>; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] match.pd: Fix up the recent bitmask_inv_cst_vector_p
> simplification [PR103417]
>
> What I'm trying to explain is that you should never see those le or gt
> cases with TYPE_UNSIGNED (especially when the simplification is moved
> after those /* Non-equality compare simplifications from fold_binary */
> I've mentioned) [...] you'll see that this is simplified already at ccp1
> when the constants propagate, using the rules I mentioned, into x != 0U,
> x == 0U, x != (V) {} and x == (V) {}.

Ah I see, sorry, I didn't see that rule before; you're right that if this is
ordered after it then they can be dropped.

Thanks,
Tamar

> The important rule of match.pd is composability, the simplifications
> should rely on other simplifications and not repeating all their decisions
> because that makes the *match.c larger and more expensive (and a source of
> extra possible bugs).
>
> 	Jakub
> -----Original Message-----
> From: Jakub Jelinek <jakub@redhat.com>
> Sent: Thursday, November 25, 2021 8:19 AM
> To: Richard Biener <rguenther@suse.de>
> Cc: Tamar Christina <Tamar.Christina@arm.com>; gcc-patches@gcc.gnu.org
> Subject: [PATCH] match.pd: Fix up the recent bitmask_inv_cst_vector_p
> simplification [PR103417]
>
> Also, without/with this patch I see on i686-linux (can be reproduced with
> RUNTESTFLAGS="--target_board=unix/-m32/-mno-sse dg.exp='bic-bitmask*
> signbit-2*'" too):
> [...]
> Those tests use vect_int effective target, but AFAIK that can be used only
> in *.dg/vect/ because it relies on vect.exp enabling options to support
> vectorization on the particular target (e.g. for i686-linux that -msse2).
> I think there isn't other way to get the DEFAULT_VECTCFLAGS into
> dg-options other than having the test driven by vect.exp.

Yeah, I now see that vect_int is different from some of the other effective
target checks like the SVE one.  I'll move the tests exercising the vector
code to vect and leave the scalars where they are.

Thanks,
Tamar
On Thu, 25 Nov 2021, Tamar Christina wrote:

> > What I'm trying to explain is that you should never see those le or gt
> > cases with TYPE_UNSIGNED (especially when the simplification is moved
> > after those /* Non-equality compare simplifications from fold_binary */
> > I've mentioned) [...]
>
> Ah I see, sorry I didn't see that rule before, you're right that if this
> is ordered after it then they can be dropped.

So the patch is OK, possibly with re-ordering the matches.

Thanks,
Richard.
On Thu, Nov 25, 2021 at 10:17:52AM +0100, Richard Biener wrote:
> > Ah I see, sorry I didn't see that rule before, you're right that if this is
> > ordered after it then they can be dropped.
>
> So the patch is OK, possibly with re-ordering the matches.

I've committed the patch as is because it has been tested that way and I'd
like to avoid dups of that PR flowing in.  Even when not reordered, the new
earlier match.pd simplification will not trigger for the lt le gt ge cases
anymore and the later old simplifications will trigger and I'd expect after
that latter simplification the earlier should trigger again because the IL
changed, no?

Tamar, can you handle the reordering together with the testsuite changes
(and perhaps formatting fixes in the tree.c routine)?

	Jakub
On Thu, 25 Nov 2021, Jakub Jelinek wrote:

> On Thu, Nov 25, 2021 at 10:17:52AM +0100, Richard Biener wrote:
> > > Ah I see, sorry I didn't see that rule before, you're right that if this is
> > > ordered after it then they can be dropped.
> >
> > So the patch is OK, possibly with re-ordering the matches.
>
> I've committed the patch as is because it has been tested that way and I'd
> like to avoid dups of that PR flowing in.  Even when not reordered, the new
> earlier match.pd simplification will not trigger for the lt le gt ge cases
> anymore and the later old simplifications will trigger and I'd expect after
> that latter simplification the earlier should trigger again because the IL
> changed, no?

Yes, the result always is re-folded.

> Tamar, can you handle the reordering together with the testsuite changes
> (and perhaps formatting fixes in the tree.c routine)?
> -----Original Message-----
> From: Jakub Jelinek <jakub@redhat.com>
> Sent: Thursday, November 25, 2021 9:53 AM
> To: Richard Biener <rguenther@suse.de>
> Cc: Tamar Christina <Tamar.Christina@arm.com>; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] match.pd: Fix up the recent bitmask_inv_cst_vector_p
> simplification [PR103417]
>
> On Thu, Nov 25, 2021 at 10:17:52AM +0100, Richard Biener wrote:
> > > Ah I see, sorry I didn't see that rule before, you're right that if
> > > this is ordered after it then they can be dropped.
> >
> > So the patch is OK, possibly with re-ordering the matches.
>
> I've committed the patch as is because it has been tested that way and I'd
> like to avoid dups of that PR flowing in.  Even when not reordered, the new
> earlier match.pd simplification will not trigger for the lt le gt ge cases anymore
> and the later old simplifications will trigger and I'd expect after that latter
> simplification the earlier should trigger again because the IL changed, no?
> Tamar, can you handle the reordering together with the testsuite changes
> (and perhaps formatting fixes in the tree.c routine)?

Yes I will, I'll send a patch tomorrow morning. Thanks!

Regards,
Tamar

>
> 	Jakub
--- gcc/match.pd.jj	2021-11-24 11:46:03.191918052 +0100
+++ gcc/match.pd	2021-11-24 22:33:43.852575772 +0100
@@ -5214,20 +5214,16 @@ (define_operator_list SYNC_FETCH_AND_AND
 /* Transform comparisons of the form (X & Y) CMP 0 to X CMP2 Z
    where ~Y + 1 == pow2 and Z = ~Y.  */
 (for cst (VECTOR_CST INTEGER_CST)
- (for cmp (le eq ne ge gt)
-      icmp (le le gt le gt)
-  (simplify
-   (cmp (bit_and:c@2 @0 cst@1) integer_zerop)
-    (with { tree csts = bitmask_inv_cst_vector_p (@1); }
-     (switch
-      (if (csts && TYPE_UNSIGNED (TREE_TYPE (@1))
-	   && (VECTOR_TYPE_P (TREE_TYPE (@1)) || single_use (@2)))
-       (icmp @0 { csts; }))
-      (if (csts && !TYPE_UNSIGNED (TREE_TYPE (@1))
-	   && (cmp == EQ_EXPR || cmp == NE_EXPR)
-	   && (VECTOR_TYPE_P (TREE_TYPE (@1)) || single_use (@2)))
+ (for cmp (eq ne)
+      icmp (le gt)
+  (simplify
+   (cmp (bit_and:c@2 @0 cst@1) integer_zerop)
+    (with { tree csts = bitmask_inv_cst_vector_p (@1); }
+     (if (csts && (VECTOR_TYPE_P (TREE_TYPE (@1)) || single_use (@2)))
+      (if (TYPE_UNSIGNED (TREE_TYPE (@1)))
+       (icmp @0 { csts; })
       (with { tree utype = unsigned_type_for (TREE_TYPE (@1)); }
-       (icmp (convert:utype @0) { csts; }))))))))
+       (icmp (convert:utype @0) { csts; }))))))))

 /* -A CMP -B -> B CMP A.  */
 (for cmp (tcc_comparison)
--- gcc/testsuite/gcc.c-torture/execute/pr103417.c.jj	2021-11-24 22:36:00.732626424 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr103417.c	2021-11-24 22:35:43.964865218 +0100
@@ -0,0 +1,11 @@
+/* PR tree-optimization/103417 */
+
+struct { int a : 8; int b : 24; } c = { 0, 1 };
+
+int
+main ()
+{
+  if (c.b && !c.b)
+    __builtin_abort ();
+  return 0;
+}