From patchwork Thu Mar 9 19:36:21 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 66184
Date: Thu, 9 Mar 2023 19:36:21 +0000
From: Tamar Christina
To: gcc-patches@gcc.gnu.org
Cc: nd@arm.com, rguenther@suse.de, jlaw@ventanamicro.com, richard.sandiford@arm.com
Subject: [PATCH]middle-end: don't form FMAs when multiplication is not single use. [PR108583]
List-Id: Gcc-patches mailing list

Hi All,

The testcase

typedef unsigned int vec __attribute__((vector_size(32))); vec
f3 (vec a, vec b, vec c)
{
  vec d = a * b;
  return d + ((c + d) >> 1);
}

shows a case where we don't want to form an FMA because the MUL is not single
use.  To form an FMA here we would have to redo the MUL as well, since we would
no longer have its result to share.  As such, forming an FMA here would be a
de-optimization.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	PR target/108583
	* tree-ssa-math-opts.cc (convert_mult_to_fma): Inhibit FMA in case not
	single use.

gcc/testsuite/ChangeLog:

	PR target/108583
	* gcc.dg/mla_1.c: New test.

Co-Authored-By: Richard Sandiford

--- inline copy of patch --

diff --git a/gcc/testsuite/gcc.dg/mla_1.c b/gcc/testsuite/gcc.dg/mla_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..a92ecf248116d89b1bc4207a907ea5ed95728a28
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/mla_1.c
@@ -0,0 +1,40 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-options "-O2 -msve-vector-bits=256 -march=armv8.2-a+sve -fdump-tree-optimized" } */
+
+unsigned int
+f1 (unsigned int a, unsigned int b, unsigned int c) {
+  unsigned int d = a * b;
+  return d + ((c + d) >> 1);
+}
+
+unsigned int
+g1 (unsigned int a, unsigned int b, unsigned int c) {
+  return a * b + c;
+}
+
+__Uint32x4_t
+f2 (__Uint32x4_t a, __Uint32x4_t b, __Uint32x4_t c) {
+  __Uint32x4_t d = a * b;
+  return d + ((c + d) >> 1);
+}
+
+__Uint32x4_t
+g2 (__Uint32x4_t a, __Uint32x4_t b, __Uint32x4_t c) {
+  return a * b + c;
+}
+
+typedef unsigned int vec __attribute__((vector_size(32))); vec
+f3 (vec a, vec b, vec c)
+{
+  vec d = a * b;
+  return d + ((c + d) >> 1);
+}
+
+vec
+g3 (vec a, vec b, vec c)
+{
+  return a * b + c;
+}
+
+/* { dg-final { scan-tree-dump-times {\.FMA } 1 "optimized" { target aarch64*-*-* } } } */
diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc
index 5ab5b944a573ad24ce8427aff24fc5215bf05dac..26ed91d58fa4709a67c903ad446d267a3113c172 100644
--- a/gcc/tree-ssa-math-opts.cc
+++ b/gcc/tree-ssa-math-opts.cc
@@ -3346,6 +3346,20 @@ convert_mult_to_fma (gimple *mul_stmt, tree op1, tree op2,
 				    param_avoid_fma_max_bits));
   bool defer = check_defer;
   bool seen_negate_p = false;
+
+  /* There is no numerical difference between fused and unfused integer FMAs,
+     and the assumption below that FMA is as cheap as addition is unlikely
+     to be true, especially if the multiplication occurs multiple times on
+     the same chain.  E.g., for something like:
+
+	 (((a * b) + c) >> 1) + (a * b)
+
+     we do not want to duplicate the a * b into two additions, not least
+     because the result is not a natural FMA chain.  */
+  if (ANY_INTEGRAL_TYPE_P (type)
+      && !has_single_use (mul_result))
+    return false;
+
   /* Make sure that the multiplication statement becomes dead after
      the transformation, thus that all uses are transformed to FMAs.
      This means we assume that an FMA operation has the same cost
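
For readers following along, here is a minimal standalone sketch of the point the new comment makes. It is not part of the patch. The helper names (shared_mul, mla, duplicated_mul, check_forms_agree) are hypothetical, and mla is only a scalar model of an integer multiply-add, not a GCC internal function. It shows that the shared-multiply form and the form with one multiply-add per use produce the same unsigned result, so the second form only repeats the a * b work.

```c
#include <assert.h>

/* Shape of f1 in the testcase: the product d is computed once and
   then used twice.  */
static unsigned int
shared_mul (unsigned int a, unsigned int b, unsigned int c)
{
  unsigned int d = a * b;
  return d + ((c + d) >> 1);
}

/* Hypothetical scalar model of an integer multiply-add: one multiply
   followed by one add.  */
static unsigned int
mla (unsigned int a, unsigned int b, unsigned int c)
{
  return a * b + c;
}

/* What turning every use of d into a multiply-add amounts to:
   a * b is evaluated twice, once per use.  */
static unsigned int
duplicated_mul (unsigned int a, unsigned int b, unsigned int c)
{
  unsigned int t = mla (a, b, c) >> 1;
  return mla (a, b, t);
}

/* Unsigned arithmetic wraps, so both forms agree for all inputs;
   this helper asserts that for one triple.  */
static void
check_forms_agree (unsigned int a, unsigned int b, unsigned int c)
{
  assert (shared_mul (a, b, c) == duplicated_mul (a, b, c));
}
```

Calling check_forms_agree (3, 5, 7) passes, and both forms return 26 for that input. Since the results are identical, the only difference between the two forms is how many multiplies are issued, which is why the check in the patch is restricted to integral types with a multi-use product.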