Message ID: 20211125075452.GW2646553@tucnak
State: Committed
Series: bswap: Improve perform_symbolic_merge [PR103376]
Commit Message
Jakub Jelinek
Nov. 25, 2021, 7:54 a.m. UTC
On Wed, Nov 24, 2021 at 09:45:16AM +0100, Richard Biener wrote:
> > Thinking more about it, perhaps we could do more for BIT_XOR_EXPR.
> > We could allow masked1 == masked2 case for it, but would need to
> > do something different than the
> > n->n = n1->n | n2->n;
> > we do on all the bytes together.
> > In particular, for masked1 == masked2 if masked1 != 0 (well, for 0
> > both variants are the same) and masked1 != 0xff we would need to
> > clear corresponding n->n byte instead of setting it to the input
> > as x ^ x = 0 (but if we don't know what x and y are, the result is
> > also don't know).  Now, for plus it is much harder, because not only
> > for non-zero operands we don't know what the result is, but it can
> > modify upper bytes as well.  So perhaps only if current's byte
> > masked1 && masked2 set the resulting byte to 0xff (unknown) iff
> > the byte above it is 0 and 0, and set that resulting byte to 0xff too.
> > Also, even for | we could instead of return NULL just set the resulting
> > byte to 0xff if it is different, perhaps it will be masked off later on.
> > Ok to handle that incrementally?
>
> Not sure if it is worth the trouble - the XOR handling sounds
> straight forward at least.  But sure, the merging routine could
> simply be conservatively correct here.

This patch implements that (except that for + it just punts whenever
both operand bytes aren't 0 like before).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-11-25  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/103376
	* gimple-ssa-store-merging.c (perform_symbolic_merge): For
	BIT_IOR_EXPR, if masked1 && masked2 && masked1 != masked2, don't
	punt, but set the corresponding result byte to MARKER_BYTE_UNKNOWN.
	For BIT_XOR_EXPR similarly and if masked1 == masked2 and the
	byte isn't MARKER_BYTE_UNKNOWN, set the corresponding result byte
	to 0.

	Jakub
Comments
On Thu, 25 Nov 2021, Jakub Jelinek wrote:

> On Wed, Nov 24, 2021 at 09:45:16AM +0100, Richard Biener wrote:
> > > Thinking more about it, perhaps we could do more for BIT_XOR_EXPR.
> > > We could allow masked1 == masked2 case for it, but would need to
> > > do something different than the
> > > n->n = n1->n | n2->n;
> > > we do on all the bytes together.
> > > In particular, for masked1 == masked2 if masked1 != 0 (well, for 0
> > > both variants are the same) and masked1 != 0xff we would need to
> > > clear corresponding n->n byte instead of setting it to the input
> > > as x ^ x = 0 (but if we don't know what x and y are, the result is
> > > also don't know).  Now, for plus it is much harder, because not only
> > > for non-zero operands we don't know what the result is, but it can
> > > modify upper bytes as well.  So perhaps only if current's byte
> > > masked1 && masked2 set the resulting byte to 0xff (unknown) iff
> > > the byte above it is 0 and 0, and set that resulting byte to 0xff too.
> > > Also, even for | we could instead of return NULL just set the resulting
> > > byte to 0xff if it is different, perhaps it will be masked off later on.
> > > Ok to handle that incrementally?
> >
> > Not sure if it is worth the trouble - the XOR handling sounds
> > straight forward at least.  But sure, the merging routine could
> > simply be conservatively correct here.
>
> This patch implements that (except that for + it just punts whenever
> both operand bytes aren't 0 like before).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK if you can add a testcase that exercises this "feature".

Thanks,
Richard.

> 2021-11-25  Jakub Jelinek  <jakub@redhat.com>
>
> 	PR tree-optimization/103376
> 	* gimple-ssa-store-merging.c (perform_symbolic_merge): For
> 	BIT_IOR_EXPR, if masked1 && masked2 && masked1 != masked2, don't
> 	punt, but set the corresponding result byte to MARKER_BYTE_UNKNOWN.
> 	For BIT_XOR_EXPR similarly and if masked1 == masked2 and the
> 	byte isn't MARKER_BYTE_UNKNOWN, set the corresponding result byte
> 	to 0.
>
> --- gcc/gimple-ssa-store-merging.c.jj	2021-11-24 09:54:37.684365460 +0100
> +++ gcc/gimple-ssa-store-merging.c	2021-11-24 11:18:54.422226266 +0100
> @@ -556,6 +556,7 @@ perform_symbolic_merge (gimple *source_s
>    n->bytepos = n_start->bytepos;
>    n->type = n_start->type;
>    size = TYPE_PRECISION (n->type) / BITS_PER_UNIT;
> +  uint64_t res_n = n1->n | n2->n;
>  
>    for (i = 0, mask = MARKER_MASK; i < size; i++, mask <<= BITS_PER_MARKER)
>      {
> @@ -563,12 +564,33 @@ perform_symbolic_merge (gimple *source_s
>  
>        masked1 = n1->n & mask;
>        masked2 = n2->n & mask;
> -      /* For BIT_XOR_EXPR or PLUS_EXPR, at least one of masked1 and masked2
> -	 has to be 0, for BIT_IOR_EXPR x | x is still x.  */
> -      if (masked1 && masked2 && (code != BIT_IOR_EXPR || masked1 != masked2))
> -	return NULL;
> +      /* If at least one byte is 0, all of 0 | x == 0 ^ x == 0 + x == x.  */
> +      if (masked1 && masked2)
> +	{
> +	  /* + can carry into upper bits, just punt.  */
> +	  if (code == PLUS_EXPR)
> +	    return NULL;
> +	  /* x | x is still x.  */
> +	  if (code == BIT_IOR_EXPR && masked1 == masked2)
> +	    continue;
> +	  if (code == BIT_XOR_EXPR)
> +	    {
> +	      /* x ^ x is 0, but MARKER_BYTE_UNKNOWN stands for
> +		 unknown values and unknown ^ unknown is unknown.  */
> +	      if (masked1 == masked2
> +		  && masked1 != ((uint64_t) MARKER_BYTE_UNKNOWN
> +				 << i * BITS_PER_MARKER))
> +		{
> +		  res_n &= ~mask;
> +		  continue;
> +		}
> +	    }
> +	  /* Otherwise set the byte to unknown, it might still be
> +	     later masked off.  */
> +	  res_n |= mask;
> +	}
>      }
> -  n->n = n1->n | n2->n;
> +  n->n = res_n;
>    n->n_ops = n1->n_ops + n2->n_ops;
>  
>    return source_stmt;
>
> 	Jakub
On Thu, Nov 25, 2021 at 09:21:37AM +0100, Richard Biener wrote:
> OK if you can add a testcase that exercises this "feature".
Sure, that is easy.
Here is what I've committed. f2 tests the x | x = x handling in it,
f3 tests x | y = unknown instead of punting, f4 tests x ^ x = 0
and f5 tests x ^ y = unknown. Without the patch only f2 is optimized
to __builtin_bswap32, with the patch all of them.
2021-11-25  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/103376
	* gimple-ssa-store-merging.c (perform_symbolic_merge): For
	BIT_IOR_EXPR, if masked1 && masked2 && masked1 != masked2, don't
	punt, but set the corresponding result byte to MARKER_BYTE_UNKNOWN.
	For BIT_XOR_EXPR similarly and if masked1 == masked2 and the
	byte isn't MARKER_BYTE_UNKNOWN, set the corresponding result byte
	to 0.

	* gcc.dg/optimize-bswapsi-7.c: New test.
--- gcc/gimple-ssa-store-merging.c.jj	2021-11-24 09:54:37.684365460 +0100
+++ gcc/gimple-ssa-store-merging.c	2021-11-24 11:18:54.422226266 +0100
@@ -556,6 +556,7 @@ perform_symbolic_merge (gimple *source_s
   n->bytepos = n_start->bytepos;
   n->type = n_start->type;
   size = TYPE_PRECISION (n->type) / BITS_PER_UNIT;
+  uint64_t res_n = n1->n | n2->n;
 
   for (i = 0, mask = MARKER_MASK; i < size; i++, mask <<= BITS_PER_MARKER)
     {
@@ -563,12 +564,33 @@ perform_symbolic_merge (gimple *source_s
 
       masked1 = n1->n & mask;
       masked2 = n2->n & mask;
-      /* For BIT_XOR_EXPR or PLUS_EXPR, at least one of masked1 and masked2
-	 has to be 0, for BIT_IOR_EXPR x | x is still x.  */
-      if (masked1 && masked2 && (code != BIT_IOR_EXPR || masked1 != masked2))
-	return NULL;
+      /* If at least one byte is 0, all of 0 | x == 0 ^ x == 0 + x == x.  */
+      if (masked1 && masked2)
+	{
+	  /* + can carry into upper bits, just punt.  */
+	  if (code == PLUS_EXPR)
+	    return NULL;
+	  /* x | x is still x.  */
+	  if (code == BIT_IOR_EXPR && masked1 == masked2)
+	    continue;
+	  if (code == BIT_XOR_EXPR)
+	    {
+	      /* x ^ x is 0, but MARKER_BYTE_UNKNOWN stands for
+		 unknown values and unknown ^ unknown is unknown.  */
+	      if (masked1 == masked2
+		  && masked1 != ((uint64_t) MARKER_BYTE_UNKNOWN
+				 << i * BITS_PER_MARKER))
+		{
+		  res_n &= ~mask;
+		  continue;
+		}
+	    }
+	  /* Otherwise set the byte to unknown, it might still be
+	     later masked off.  */
+	  res_n |= mask;
+	}
     }
-  n->n = n1->n | n2->n;
+  n->n = res_n;
   n->n_ops = n1->n_ops + n2->n_ops;
 
   return source_stmt;
--- gcc/testsuite/gcc.dg/optimize-bswapsi-7.c.jj	2021-11-25 10:36:03.847529686 +0100
+++ gcc/testsuite/gcc.dg/optimize-bswapsi-7.c	2021-11-25 10:35:46.522778192 +0100
@@ -0,0 +1,37 @@
+/* PR tree-optimization/103376 */
+/* { dg-do compile } */
+/* { dg-require-effective-target bswap } */
+/* { dg-options "-O2 -fno-tree-vectorize -fdump-tree-optimized" } */
+/* { dg-additional-options "-march=z900" { target s390-*-* } } */
+
+static unsigned int
+f1 (unsigned int x)
+{
+  return (x << 24) | (x >> 8);
+}
+
+unsigned int
+f2 (unsigned *p)
+{
+  return ((f1 (p[0]) | (p[0] >> 8)) & 0xff000000U) | (p[0] >> 24) | ((p[0] & 0xff00U) << 8) | ((p[0] & 0xff0000U) >> 8);
+}
+
+unsigned int
+f3 (unsigned *p)
+{
+  return ((f1 (p[0]) | (p[0] & 0x00ff00ffU)) & 0xff00ff00U) | (f1 (f1 (f1 (p[0]))) & 0x00ff00ffU);
+}
+
+unsigned int
+f4 (unsigned *p)
+{
+  return (f1 (p[0]) ^ (p[0] >> 8)) ^ (p[0] >> 24) ^ ((p[0] & 0xff00U) << 8) ^ ((p[0] & 0xff0000U) >> 8);
+}
+
+unsigned int
+f5 (unsigned *p)
+{
+  return (((f1 (p[0]) | (p[0] >> 16)) ^ (p[0] >> 8)) & 0xffff0000U) ^ (p[0] >> 24) ^ ((p[0] & 0xff00U) << 8) ^ ((p[0] & 0xff0000U) >> 8);
+}
+
+/* { dg-final { scan-tree-dump-times "= __builtin_bswap32 \\\(" 4 "optimized" } } */
Jakub