From patchwork Wed Sep 29 16:21:08 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 45565
Date: Wed, 29 Sep 2021 17:21:08 +0100
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 5/7]middle-end Convert bitclear + cmp #0 into cm
Message-ID: <20210929162106.GA5336@arm.com>
User-Agent: Mutt/1.9.4 (2018-02-28)
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Tamar Christina via Gcc-patches
From: Tamar Christina
Reply-To: Tamar Christina
Cc: nd@arm.com, rguenther@suse.de

Hi All,

This optimizes the case where a mask Y that satisfies ~Y + 1 == pow2 is used
to clear some bits and the result is then compared against 0.  The sequence is
rewritten into a single comparison, without the masking, against a different
immediate.

We can do this for all unsigned comparisons; for signed operands we can do it
for EQ and NE only, after converting the operand to the corresponding unsigned
type:

  (x & (~255)) == 0 becomes x <= 255

which leaves it to the target to deal with the comparison optimally.

This transformation has to be done in the mid-end because RTL does not carry
the signedness of the comparison operands, and because any immediate the target
needs should be floated out of the loop.  RTL loop-invariant hoisting is done
before split1, so doing the rewrite in a target split would be too late for the
constant to be hoisted.

i.e.
void fun1(int32_t *x, int n)
{
  for (int i = 0; i < (n & -16); i++)
    x[i] = (x[i]&(~255)) == 0;
}

now generates:

.L3:
        ldr     q0, [x0]
        cmhs    v0.4s, v2.4s, v0.4s
        and     v0.16b, v1.16b, v0.16b
        str     q0, [x0], 16
        cmp     x0, x1
        bne     .L3

and floats the immediate out of the loop.

instead of:

.L3:
        ldr     q0, [x0]
        bic     v0.4s, #255
        cmeq    v0.4s, v0.4s, #0
        and     v0.16b, v1.16b, v0.16b
        str     q0, [x0], 16
        cmp     x0, x1
        bne     .L3

Bootstrapped and regtested on aarch64-none-linux-gnu and x86_64-pc-linux-gnu
with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

        * match.pd: New bitmask compare pattern.

gcc/testsuite/ChangeLog:

        * gcc.dg/bic-bitmask-10.c: New test.
        * gcc.dg/bic-bitmask-11.c: New test.
        * gcc.dg/bic-bitmask-12.c: New test.
        * gcc.dg/bic-bitmask-2.c: New test.
        * gcc.dg/bic-bitmask-3.c: New test.
        * gcc.dg/bic-bitmask-4.c: New test.
        * gcc.dg/bic-bitmask-5.c: New test.
        * gcc.dg/bic-bitmask-6.c: New test.
        * gcc.dg/bic-bitmask-7.c: New test.
        * gcc.dg/bic-bitmask-8.c: New test.
        * gcc.dg/bic-bitmask-9.c: New test.
        * gcc.dg/bic-bitmask.h: New test.
        * gcc.target/aarch64/bic-bitmask-1.c: New test.

--- inline copy of patch -- 
diff --git a/gcc/match.pd b/gcc/match.pd
index 0fcfd0ea62c043dc217d0d560ce5b7e569b70e7d..df9212cb27d172856b9d43b0875262f96e8993c4 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -4288,6 +4288,56 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
     (if (ic == ncmp)
      (ncmp @0 @1))))))
 
+/* Transform comparisons of the form (X & Y) CMP 0 to X CMP2 Z
+   where ~Y + 1 == pow2 and Z = ~Y.  */
+(for cmp (simple_comparison)
+ (simplify
+  (cmp (bit_and:c @0 VECTOR_CST@1) integer_zerop)
+   (if (VECTOR_INTEGER_TYPE_P (TREE_TYPE (@1))
+        && uniform_vector_p (@1))
+    (with { tree elt = vector_cst_elt (@1, 0); }
+     (switch
+      (if (TYPE_UNSIGNED (TREE_TYPE (@1)) && tree_fits_uhwi_p (elt))
+        (with { unsigned HOST_WIDE_INT diff = tree_to_uhwi (elt);
+                tree tdiff = wide_int_to_tree (TREE_TYPE (elt), (~diff) + 1);
+                tree newval = wide_int_to_tree (TREE_TYPE (elt), ~diff);
+                tree newmask = build_uniform_cst (TREE_TYPE (@1), newval); }
+         (if (integer_pow2p (tdiff))
+          (switch
+           /* ((mask & x) < 0) -> 0.  */
+           (if (cmp == LT_EXPR)
+            { build_zero_cst (TREE_TYPE (@1)); })
+           /* ((mask & x) <= 0) -> x <= ~mask.  */
+           (if (cmp == LE_EXPR)
+            (le @0 { newmask; }))
+           /* ((mask & x) == 0) -> x <= ~mask.  */
+           (if (cmp == EQ_EXPR)
+            (le @0 { newmask; }))
+           /* ((mask & x) != 0) -> x > ~mask.  */
+           (if (cmp == NE_EXPR)
+            (gt @0 { newmask; }))
+           /* ((mask & x) >= 0) -> true.  */
+           (if (cmp == GE_EXPR)
+            { constant_boolean_node (true, type); })
+           /* ((mask & x) > 0) -> x > ~mask.  */
+           (if (cmp == GT_EXPR)
+            (gt @0 { newmask; }))))))
+      (if (!TYPE_UNSIGNED (TREE_TYPE (@1)) && tree_fits_shwi_p (elt))
+        (with { unsigned HOST_WIDE_INT diff = tree_to_shwi (elt);
+                tree ustype = unsigned_type_for (TREE_TYPE (elt));
+                tree uvtype = unsigned_type_for (TREE_TYPE (@1));
+                tree tdiff = wide_int_to_tree (ustype, (~diff) + 1);
+                tree udiff = wide_int_to_tree (ustype, ~diff);
+                tree cst = build_uniform_cst (uvtype, udiff); }
+         (if (integer_pow2p (tdiff))
+          (switch
+           /* ((mask & x) == 0) -> (unsigned) x <= ~mask.  */
+           (if (cmp == EQ_EXPR)
+            (le (convert:uvtype @0) { cst; }))
+           /* ((mask & x) != 0) -> (unsigned) x > ~mask.  */
+           (if (cmp == NE_EXPR)
+            (gt (convert:uvtype @0) { cst; })))))))))))
+
 /* Transform comparisons of the form X - Y CMP 0 to X CMP Y.
    ??? The transformation is valid for the other operators if overflow
    is undefined for the type, but performing it here badly interacts
diff --git a/gcc/testsuite/gcc.dg/bic-bitmask-10.c b/gcc/testsuite/gcc.dg/bic-bitmask-10.c
new file mode 100644
index 0000000000000000000000000000000000000000..76a22a2313137a2a75dd711c2c15c2d3a34e15aa
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/bic-bitmask-10.c
@@ -0,0 +1,26 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -save-temps -fdump-tree-dce" } */
+
+#include <stdint.h>
+
+__attribute__((noinline, noipa))
+void fun1(int32_t *x, int n)
+{
+  for (int i = 0; i < (n & -16); i++)
+    x[i] = (x[i]&(~255)) == 0;
+}
+
+__attribute__((noinline, noipa, optimize("O1")))
+void fun2(int32_t *x, int n)
+{
+  for (int i = 0; i < (n & -16); i++)
+    x[i] = (x[i]&(~255)) == 0;
+}
+
+#define TYPE int32_t
+#include "bic-bitmask.h"
+
+/* { dg-final { scan-tree-dump {<=\s*.+\{ 255,.+\}} dce7 } } */
+/* { dg-final { scan-tree-dump-not {&\s*.+\{ 4294967290,.+\}} dce7 } } */
+/* { dg-final { scan-tree-dump-not {\s+bic\s+} dce7 { target { aarch64*-*-* } } } } */
+
diff --git a/gcc/testsuite/gcc.dg/bic-bitmask-11.c b/gcc/testsuite/gcc.dg/bic-bitmask-11.c
new file mode 100644
index 0000000000000000000000000000000000000000..32553d7ba2f823f7a21237451990d0a216d2f912
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/bic-bitmask-11.c
@@ -0,0 +1,25 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -save-temps -fdump-tree-dce" } */
+
+#include <stdint.h>
+
+__attribute__((noinline, noipa))
+void fun1(uint32_t *x, int n)
+{
+  for (int i = 0; i < (n & -16); i++)
+    x[i] = (x[i]&(~255)) != 0;
+}
+
+__attribute__((noinline, noipa, optimize("O1")))
+void fun2(uint32_t *x, int n)
+{
+  for (int i = 0; i < (n & -16); i++)
+    x[i] = (x[i]&(~255)) != 0;
+}
+
+#include "bic-bitmask.h"
+
+/* { dg-final { scan-tree-dump {>\s*.+\{ 255,.+\}} dce7 } } */
+/* { dg-final { scan-tree-dump-not {&\s*.+\{ 4294967290,.+\}} dce7 } } */
+/* { dg-final { scan-tree-dump-not {\s+bic\s+} dce7 { target { aarch64*-*-* } } } } */
+
diff --git a/gcc/testsuite/gcc.dg/bic-bitmask-12.c b/gcc/testsuite/gcc.dg/bic-bitmask-12.c
new file mode 100644
index 0000000000000000000000000000000000000000..e10cbf7fabe2dbf7ce436cdf37b0f8b207c58408
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/bic-bitmask-12.c
@@ -0,0 +1,17 @@
+/* { dg-do assemble } */
+/* { dg-options "-O3 -fdump-tree-dce" } */
+
+#include <stdint.h>
+
+typedef unsigned int v4si __attribute__ ((vector_size (16)));
+
+__attribute__((noinline, noipa))
+void fun(v4si *x, int n)
+{
+  for (int i = 0; i < (n & -16); i++)
+    x[i] = (x[i]&(~255)) == 0;
+}
+
+/* { dg-final { scan-tree-dump {<=\s*.+\{ 255,.+\}} dce7 } } */
+/* { dg-final { scan-tree-dump-not {&\s*.+\{ 4294967290,.+\}} dce7 } } */
+/* { dg-final { scan-tree-dump-not {\s+bic\s+} dce7 { target { aarch64*-*-* } } } } */
diff --git a/gcc/testsuite/gcc.dg/bic-bitmask-2.c b/gcc/testsuite/gcc.dg/bic-bitmask-2.c
new file mode 100644
index 0000000000000000000000000000000000000000..da30fad89f6c8239baa4395b3ffaec0be577e13f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/bic-bitmask-2.c
@@ -0,0 +1,25 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -save-temps -fdump-tree-dce" } */
+
+#include <stdint.h>
+
+__attribute__((noinline, noipa))
+void fun1(uint32_t *x, int n)
+{
+  for (int i = 0; i < (n & -16); i++)
+    x[i] = (x[i]&(~255)) == 0;
+}
+
+__attribute__((noinline, noipa, optimize("O1")))
+void fun2(uint32_t *x, int n)
+{
+  for (int i = 0; i < (n & -16); i++)
+    x[i] = (x[i]&(~255)) == 0;
+}
+
+#include "bic-bitmask.h"
+
+/* { dg-final { scan-tree-dump-times {<=\s*.+\{ 255,.+\}} 1 dce7 } } */
+/* { dg-final { scan-tree-dump-not {&\s*.+\{ 4294967040,.+\}} dce7 } } */
+/* { dg-final { scan-tree-dump-not {\s+bic\s+} dce7 { target { aarch64*-*-* } } } } */
+
diff --git a/gcc/testsuite/gcc.dg/bic-bitmask-3.c b/gcc/testsuite/gcc.dg/bic-bitmask-3.c
new file mode 100644
index 0000000000000000000000000000000000000000..da30fad89f6c8239baa4395b3ffaec0be577e13f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/bic-bitmask-3.c
@@ -0,0 +1,25 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -save-temps -fdump-tree-dce" } */
+
+#include <stdint.h>
+
+__attribute__((noinline, noipa))
+void fun1(uint32_t *x, int n)
+{
+  for (int i = 0; i < (n & -16); i++)
+    x[i] = (x[i]&(~255)) == 0;
+}
+
+__attribute__((noinline, noipa, optimize("O1")))
+void fun2(uint32_t *x, int n)
+{
+  for (int i = 0; i < (n & -16); i++)
+    x[i] = (x[i]&(~255)) == 0;
+}
+
+#include "bic-bitmask.h"
+
+/* { dg-final { scan-tree-dump-times {<=\s*.+\{ 255,.+\}} 1 dce7 } } */
+/* { dg-final { scan-tree-dump-not {&\s*.+\{ 4294967040,.+\}} dce7 } } */
+/* { dg-final { scan-tree-dump-not {\s+bic\s+} dce7 { target { aarch64*-*-* } } } } */
+
diff --git a/gcc/testsuite/gcc.dg/bic-bitmask-4.c b/gcc/testsuite/gcc.dg/bic-bitmask-4.c
new file mode 100644
index 0000000000000000000000000000000000000000..1bcf23ccf1447d6c8c999ed1eb25ba0a450028e1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/bic-bitmask-4.c
@@ -0,0 +1,25 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -save-temps -fdump-tree-dce" } */
+
+#include <stdint.h>
+
+__attribute__((noinline, noipa))
+void fun1(uint32_t *x, int n)
+{
+  for (int i = 0; i < (n & -16); i++)
+    x[i] = (x[i]&(~255)) >= 0;
+}
+
+__attribute__((noinline, noipa, optimize("O1")))
+void fun2(uint32_t *x, int n)
+{
+  for (int i = 0; i < (n & -16); i++)
+    x[i] = (x[i]&(~255)) >= 0;
+}
+
+#include "bic-bitmask.h"
+
+/* { dg-final { scan-tree-dump-times {=\s*.+\{ 1,.+\}} 1 dce7 } } */
+/* { dg-final { scan-tree-dump-not {&\s*.+\{ 4294967040,.+\}} dce7 } } */
+/* { dg-final { scan-tree-dump-not {\s+bic\s+} dce7 { target { aarch64*-*-* } } } } */
+
diff --git a/gcc/testsuite/gcc.dg/bic-bitmask-5.c b/gcc/testsuite/gcc.dg/bic-bitmask-5.c
new file mode 100644
index 0000000000000000000000000000000000000000..6e5a2fca9992efbc01f8dbbc6f95936e86643028
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/bic-bitmask-5.c
@@ -0,0 +1,25 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -save-temps -fdump-tree-dce" } */
+
+#include <stdint.h>
+
+__attribute__((noinline, noipa))
+void fun1(uint32_t *x, int n)
+{
+  for (int i = 0; i < (n & -16); i++)
+    x[i] = (x[i]&(~255)) > 0;
+}
+
+__attribute__((noinline, noipa, optimize("O1")))
+void fun2(uint32_t *x, int n)
+{
+  for (int i = 0; i < (n & -16); i++)
+    x[i] = (x[i]&(~255)) > 0;
+}
+
+#include "bic-bitmask.h"
+
+/* { dg-final { scan-tree-dump-times {>\s*.+\{ 255,.+\}} 1 dce7 } } */
+/* { dg-final { scan-tree-dump-not {&\s*.+\{ 4294967040,.+\}} dce7 } } */
+/* { dg-final { scan-tree-dump-not {\s+bic\s+} dce7 { target { aarch64*-*-* } } } } */
+
diff --git a/gcc/testsuite/gcc.dg/bic-bitmask-6.c b/gcc/testsuite/gcc.dg/bic-bitmask-6.c
new file mode 100644
index 0000000000000000000000000000000000000000..018e7a4348c9fc461106c3d9d01291325d3406c2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/bic-bitmask-6.c
@@ -0,0 +1,25 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -save-temps -fdump-tree-dce" } */
+
+#include <stdint.h>
+
+__attribute__((noinline, noipa))
+void fun1(uint32_t *x, int n)
+{
+  for (int i = 0; i < (n & -16); i++)
+    x[i] = (x[i]&(~255)) <= 0;
+}
+
+__attribute__((noinline, noipa, optimize("O1")))
+void fun2(uint32_t *x, int n)
+{
+  for (int i = 0; i < (n & -16); i++)
+    x[i] = (x[i]&(~255)) <= 0;
+}
+
+#include "bic-bitmask.h"
+
+/* { dg-final { scan-tree-dump-times {<=\s*.+\{ 255,.+\}} 1 dce7 } } */
+/* { dg-final { scan-tree-dump-not {&\s*.+\{ 4294967040,.+\}} dce7 } } */
+/* { dg-final { scan-tree-dump-not {\s+bic\s+} dce7 { target { aarch64*-*-* } } } } */
+
diff --git a/gcc/testsuite/gcc.dg/bic-bitmask-7.c b/gcc/testsuite/gcc.dg/bic-bitmask-7.c
new file mode 100644
index 0000000000000000000000000000000000000000..798678fb7555052c93abc4ca34f617d640f73bb4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/bic-bitmask-7.c
@@ -0,0 +1,24 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -save-temps -fdump-tree-dce" } */
+
+#include <stdint.h>
+
+__attribute__((noinline, noipa))
+void fun1(uint32_t *x, int n)
+{
+  for (int i = 0; i < (n & -16); i++)
+    x[i] = (x[i]&(~1)) < 0;
+}
+
+__attribute__((noinline, noipa, optimize("O1")))
+void fun2(uint32_t *x, int n)
+{
+  for (int i = 0; i < (n & -16); i++)
+    x[i] = (x[i]&(~1)) < 0;
+}
+
+#include "bic-bitmask.h"
+
+/* { dg-final { scan-tree-dump-times {__builtin_memset} 1 dce7 } } */
+/* { dg-final { scan-tree-dump-not {\s+bic\s+} dce7 { target { aarch64*-*-* } } } } */
+
diff --git a/gcc/testsuite/gcc.dg/bic-bitmask-8.c b/gcc/testsuite/gcc.dg/bic-bitmask-8.c
new file mode 100644
index 0000000000000000000000000000000000000000..1dabe834ed57dfa0be48c1dc3dbb226092c79a1a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/bic-bitmask-8.c
@@ -0,0 +1,25 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -save-temps -fdump-tree-dce" } */
+
+#include <stdint.h>
+
+__attribute__((noinline, noipa))
+void fun1(uint32_t *x, int n)
+{
+  for (int i = 0; i < (n & -16); i++)
+    x[i] = (x[i]&(~1)) != 0;
+}
+
+__attribute__((noinline, noipa, optimize("O1")))
+void fun2(uint32_t *x, int n)
+{
+  for (int i = 0; i < (n & -16); i++)
+    x[i] = (x[i]&(~1)) != 0;
+}
+
+#include "bic-bitmask.h"
+
+/* { dg-final { scan-tree-dump-times {>\s*.+\{ 1,.+\}} 1 dce7 } } */
+/* { dg-final { scan-tree-dump-not {&\s*.+\{ 4294967294,.+\}} dce7 } } */
+/* { dg-final { scan-tree-dump-not {\s+bic\s+} dce7 { target { aarch64*-*-* } } } } */
+
diff --git a/gcc/testsuite/gcc.dg/bic-bitmask-9.c b/gcc/testsuite/gcc.dg/bic-bitmask-9.c
new file mode 100644
index 0000000000000000000000000000000000000000..9c1f8ee0adfc45d1b9fc212138ea26bb6b693e49
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/bic-bitmask-9.c
@@ -0,0 +1,25 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -save-temps -fdump-tree-dce" } */
+
+#include <stdint.h>
+
+__attribute__((noinline, noipa))
+void fun1(uint32_t *x, int n)
+{
+  for (int i = 0; i < (n & -16); i++)
+    x[i] = (x[i]&(~5)) == 0;
+}
+
+__attribute__((noinline, noipa, optimize("O1")))
+void fun2(uint32_t *x, int n)
+{
+  for (int i = 0; i < (n & -16); i++)
+    x[i] = (x[i]&(~5)) == 0;
+}
+
+#include "bic-bitmask.h"
+
+/* { dg-final { scan-tree-dump-not {<=\s*.+\{ 4294967289,.+\}} dce7 } } */
+/* { dg-final { scan-tree-dump {&\s*.+\{ 4294967290,.+\}} dce7 } } */
+/* { dg-final { scan-tree-dump-not {\s+bic\s+} dce7 { target { aarch64*-*-* } } } } */
+
diff --git a/gcc/testsuite/gcc.dg/bic-bitmask.h b/gcc/testsuite/gcc.dg/bic-bitmask.h
new file mode 100644
index 0000000000000000000000000000000000000000..2b94065c025e0cbf71a21ac9b9d6314e24b0c2d9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/bic-bitmask.h
@@ -0,0 +1,43 @@
+#include <stdio.h>
+
+#ifndef N
+#define N 50
+#endif
+
+#ifndef TYPE
+#define TYPE uint32_t
+#endif
+
+#ifndef DEBUG
+#define DEBUG 0
+#endif
+
+#define BASE ((TYPE) -1 < 0 ? -126 : 4)
+
+int main ()
+{
+  TYPE a[N];
+  TYPE b[N];
+
+  for (int i = 0; i < N; ++i)
+    {
+      a[i] = BASE + i * 13;
+      b[i] = BASE + i * 13;
+      if (DEBUG)
+        printf ("%d: 0x%x\n", i, a[i]);
+    }
+
+  fun1 (a, N);
+  fun2 (b, N);
+
+  for (int i = 0; i < N; ++i)
+    {
+      if (DEBUG)
+        printf ("%d = 0x%x == 0x%x\n", i, a[i], b[i]);
+
+      if (a[i] != b[i])
+        __builtin_abort ();
+    }
+  return 0;
+}
+
diff --git a/gcc/testsuite/gcc.target/aarch64/bic-bitmask-1.c b/gcc/testsuite/gcc.target/aarch64/bic-bitmask-1.c
new file mode 100644
index 0000000000000000000000000000000000000000..568c1ffc8bc4148efaeeba7a45a75ecbd3a7a3dd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/bic-bitmask-1.c
@@ -0,0 +1,13 @@
+/* { dg-do assemble } */
+/* { dg-options "-O2 -save-temps" } */
+
+#include <arm_neon.h>
+
+uint32x4_t foo (int32x4_t a)
+{
+  int32x4_t cst = vdupq_n_s32 (255);
+  int32x4_t zero = vdupq_n_s32 (0);
+  return vceqq_s32 (vbicq_s32 (a, cst), zero);
+}
+
+/* { dg-final { scan-assembler-not {\tbic\t} { xfail { aarch64*-*-* } } } } */
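
As a sanity check on the identity the pattern relies on, (x & Y) == 0 being equivalent to an unsigned x <= ~Y whenever ~Y + 1 is a power of two, the following is a minimal standalone sketch. It is not part of the patch. The helper names masked_eq_zero, rewritten and count_mismatches are made up for illustration, and 16-bit values are assumed only so that the check can be exhaustive.

```c
#include <assert.h>
#include <stdint.h>

/* Original form: clear the low bits with mask Y, then compare with 0.  */
int masked_eq_zero (uint16_t x, uint16_t y)
{
  return (x & y) == 0;
}

/* Rewritten form: a single unsigned compare against Z = ~Y.  */
int rewritten (uint16_t x, uint16_t y)
{
  uint16_t z = (uint16_t) ~y;
  return x <= z;
}

/* Compare both forms for every 16-bit x and every mask Y = ~(2^k - 1).
   Returns the number of inputs on which they disagree.  */
unsigned count_mismatches (void)
{
  unsigned mismatches = 0;
  for (unsigned k = 0; k < 16; k++)
    {
      uint16_t y = (uint16_t) ~((1u << k) - 1);
      /* Precondition from the cover text: ~Y + 1 is a power of two.  */
      assert ((uint16_t) (~y + 1) == (uint16_t) (1u << k));
      for (uint32_t x = 0; x <= UINT16_MAX; x++)
        if (masked_eq_zero ((uint16_t) x, y) != rewritten ((uint16_t) x, y))
          mismatches++;
    }
  return mismatches;
}
```

Calling count_mismatches () from a small driver is expected to return 0. The signed EQ/NE case follows from the same check, since the pattern first converts x to the unsigned type and the bit patterns are unchanged.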