From patchwork Fri Oct 29 11:06:33 2021
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 46783
Date: Fri, 29 Oct 2021 12:06:33 +0100
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 1/2]middle-end Update the complex numbers auto-vec detection to the new format of the SLP tree.
From: Tamar Christina
Cc: nd@arm.com, rguenther@suse.de

Hi All,

The layout of the SLP tree has changed in GCC 12 which broke the detection of
complex FMA and FMS.
This patch updates the detection to the new tree shape and by necessity merges
the complex MUL and FMA detection into one.

This does not yet address the wrong code-gen PR which I will fix in a different
patch as that needs backporting.

Regtested on aarch64-none-linux-gnu and x86_64-pc-linux-gnu with no regressions.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

        PR tree-optimization/102977
        * tree-vect-slp-patterns.c (vect_match_call_p): Remove.
        (vect_detect_pair_op): Add crosslane check.
        (vect_match_call_complex_mla): Remove.
        (class complex_mul_pattern): Update comment.
        (complex_mul_pattern::matches): Update detection.
        (class complex_fma_pattern): Remove.
        (complex_fma_pattern::matches): Remove.
        (complex_fma_pattern::recognize): Remove.
        (complex_fma_pattern::build): Remove.
        (class complex_fms_pattern): Update comment.
        (complex_fms_pattern::matches): Remove.
        (complex_operations_pattern::recognize): Remove complex_fma_pattern.

--- inline copy of patch --
diff --git a/gcc/tree-vect-slp-patterns.c b/gcc/tree-vect-slp-patterns.c
index b8d09b7832e29689ede832d555e1b6af2c24ce1e..99dea82aba91a333500bb5ff35bf30b6416c09ca 100644
--- a/gcc/tree-vect-slp-patterns.c
+++ b/gcc/tree-vect-slp-patterns.c
@@ -306,24 +306,6 @@ vect_match_expression_p (slp_tree node, tree_code code)
   return true;
 }
 
-/* Checks to see if the expression represented by NODE is a call to the internal
-   function FN.  */
-
-static inline bool
-vect_match_call_p (slp_tree node, internal_fn fn)
-{
-  if (!node
-      || !SLP_TREE_REPRESENTATIVE (node))
-    return false;
-
-  gimple* expr = STMT_VINFO_STMT (SLP_TREE_REPRESENTATIVE (node));
-  if (!expr
-      || !gimple_call_internal_p (expr, fn))
-    return false;
-
-  return true;
-}
-
 /* Check if the given lane permute in PERMUTES matches an alternating sequence
    of {even odd even odd ...}.
    This to account for unrolled loops.  Further mode there resulting permute must be linear.  */
@@ -389,6 +371,16 @@ vect_detect_pair_op (slp_tree node1, slp_tree node2, lane_permutation_t &lanes,
 
   if (result != CMPLX_NONE && ops != NULL)
     {
+      if (two_operands)
+        {
+          auto l0node = SLP_TREE_CHILDREN (node1);
+          auto l1node = SLP_TREE_CHILDREN (node2);
+
+          /* Check if the tree is connected as we expect it.  */
+          if (!((l0node[0] == l1node[0] && l0node[1] == l1node[1])
+              || (l0node[0] == l1node[1] && l0node[1] == l1node[0])))
+            return CMPLX_NONE;
+        }
       ops->safe_push (node1);
       ops->safe_push (node2);
     }
@@ -717,27 +709,6 @@ complex_add_pattern::recognize (slp_tree_to_load_perm_map_t *perm_cache,
  * complex_mul_pattern
  ******************************************************************************/
 
-/* Helper function of that looks for a match in the CHILDth child of NODE.  The
-   child used is stored in RES.
-
-   If the match is successful then ARGS will contain the operands matched
-   and the complex_operation_t type is returned.  If match is not successful
-   then CMPLX_NONE is returned and ARGS is left unmodified.  */
-
-static inline complex_operation_t
-vect_match_call_complex_mla (slp_tree node, unsigned child,
-                             vec<slp_tree> *args = NULL, slp_tree *res = NULL)
-{
-  gcc_assert (child < SLP_TREE_CHILDREN (node).length ());
-
-  slp_tree data = SLP_TREE_CHILDREN (node)[child];
-
-  if (res)
-    *res = data;
-
-  return vect_detect_pair_op (data, false, args);
-}
-
 /* Check to see if either of the trees in ARGS are a NEGATE_EXPR.  If the first
    child (args[0]) is a NEGATE_EXPR then NEG_FIRST_P is set to TRUE.
 
@@ -945,9 +916,10 @@ class complex_mul_pattern : public complex_pattern
 
 };
 
-/* Pattern matcher for trying to match complex multiply pattern in SLP tree
-   If the operation matches then IFN is set to the operation it matched
-   and the arguments to the two replacement statements are put in m_ops.
+/* Pattern matcher for trying to match complex multiply and complex multiply
+   and accumulate pattern in SLP tree.  If the operation matches then IFN
+   is set to the operation it matched and the arguments to the two
+   replacement statements are put in m_ops.
 
    If no match is found then IFN is set to IFN_LAST and m_ops is unchanged.
 
@@ -972,19 +944,43 @@ complex_mul_pattern::matches (complex_operation_t op,
   if (op != MINUS_PLUS)
     return IFN_LAST;
 
-  slp_tree root = *node;
-  /* First two nodes must be a multiply.  */
-  auto_vec<slp_tree> muls;
-  if (vect_match_call_complex_mla (root, 0) != MULT_MULT
-      || vect_match_call_complex_mla (root, 1, &muls) != MULT_MULT)
+  auto childs = *ops;
+  auto l0node = SLP_TREE_CHILDREN (childs[0]);
+  auto l1node = SLP_TREE_CHILDREN (childs[1]);
+
+  bool mul0 = vect_match_expression_p (l0node[0], MULT_EXPR);
+  bool mul1 = vect_match_expression_p (l0node[1], MULT_EXPR);
+  if (!mul0 && !mul1)
     return IFN_LAST;
 
   /* Now operand2+4 may lead to another expression.  */
   auto_vec<slp_tree> left_op, right_op;
-  left_op.safe_splice (SLP_TREE_CHILDREN (muls[0]));
-  right_op.safe_splice (SLP_TREE_CHILDREN (muls[1]));
+  slp_tree add0 = NULL;
+
+  /* Check if we may be a multiply add.  */
+  if (!mul0
+      && vect_match_expression_p (l0node[0], PLUS_EXPR))
+    {
+      auto vals = SLP_TREE_CHILDREN (l0node[0]);
+      /* Check if it's a multiply, otherwise no idea what this is.  */
+      if (!vect_match_expression_p (vals[1], MULT_EXPR))
+        return IFN_LAST;
+
+      /* Check if the ADD is linear, otherwise it's not valid complex FMA.  */
+      if (linear_loads_p (perm_cache, vals[0]) != PERM_EVENODD)
+        return IFN_LAST;
 
-  if (linear_loads_p (perm_cache, left_op[1]) == PERM_ODDEVEN)
+      left_op.safe_splice (SLP_TREE_CHILDREN (vals[1]));
+      add0 = vals[0];
+    }
+  else
+    left_op.safe_splice (SLP_TREE_CHILDREN (l0node[0]));
+
+  right_op.safe_splice (SLP_TREE_CHILDREN (l0node[1]));
+
+  if (left_op.length () != 2
+      || right_op.length () != 2
+      || linear_loads_p (perm_cache, left_op[1]) == PERM_ODDEVEN)
     return IFN_LAST;
 
   bool neg_first = false;
@@ -998,23 +994,32 @@ complex_mul_pattern::matches (complex_operation_t op,
       if (!vect_validate_multiplication (perm_cache, left_op, PERM_EVENEVEN)
           || vect_normalize_conj_loc (left_op))
         return IFN_LAST;
-      ifn = IFN_COMPLEX_MUL;
+      if (!mul0)
+        ifn = IFN_COMPLEX_FMA;
+      else
+        ifn = IFN_COMPLEX_MUL;
     }
-  else if (is_neg)
+  else
     {
       if (!vect_validate_multiplication (perm_cache, left_op, right_op,
                                          neg_first, &conj_first_operand,
                                          false))
         return IFN_LAST;
 
-      ifn = IFN_COMPLEX_MUL_CONJ;
+      if(!mul0)
+        ifn = IFN_COMPLEX_FMA_CONJ;
+      else
+        ifn = IFN_COMPLEX_MUL_CONJ;
     }
 
   if (!vect_pattern_validate_optab (ifn, *node))
     return IFN_LAST;
 
   ops->truncate (0);
-  ops->create (3);
+  ops->create (add0 ? 4 : 3);
+
+  if (add0)
+    ops->quick_push (add0);
 
   complex_perm_kinds_t kind = linear_loads_p (perm_cache, left_op[0]);
   if (kind == PERM_EVENODD)
@@ -1070,170 +1075,55 @@ complex_mul_pattern::build (vec_info *vinfo)
 {
   slp_tree node;
   unsigned i;
-  slp_tree newnode
-    = vect_build_combine_node (this->m_ops[0], this->m_ops[1], *this->m_node);
-  SLP_TREE_REF_COUNT (this->m_ops[2])++;
-
-  FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (*this->m_node), i, node)
-    vect_free_slp_tree (node);
-
-  /* First re-arrange the children.  */
-  SLP_TREE_CHILDREN (*this->m_node).reserve_exact (2);
-  SLP_TREE_CHILDREN (*this->m_node)[0] = this->m_ops[2];
-  SLP_TREE_CHILDREN (*this->m_node)[1] = newnode;
+  switch (this->m_ifn)
+  {
+    case IFN_COMPLEX_MUL:
+    case IFN_COMPLEX_MUL_CONJ:
+      {
+        slp_tree newnode
+          = vect_build_combine_node (this->m_ops[0], this->m_ops[1],
+                                     *this->m_node);
+        SLP_TREE_REF_COUNT (this->m_ops[2])++;
+
+        FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (*this->m_node), i, node)
+          vect_free_slp_tree (node);
+
+        /* First re-arrange the children.  */
+        SLP_TREE_CHILDREN (*this->m_node).reserve_exact (2);
+        SLP_TREE_CHILDREN (*this->m_node)[0] = this->m_ops[2];
+        SLP_TREE_CHILDREN (*this->m_node)[1] = newnode;
+        break;
+      }
+    case IFN_COMPLEX_FMA:
+    case IFN_COMPLEX_FMA_CONJ:
+      {
+        SLP_TREE_REF_COUNT (this->m_ops[0])++;
+        slp_tree newnode
+          = vect_build_combine_node (this->m_ops[1], this->m_ops[2],
+                                     *this->m_node);
+        SLP_TREE_REF_COUNT (this->m_ops[3])++;
+
+        FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (*this->m_node), i, node)
+          vect_free_slp_tree (node);
+
+        /* First re-arrange the children.  */
+        SLP_TREE_CHILDREN (*this->m_node).safe_grow (3);
+        SLP_TREE_CHILDREN (*this->m_node)[0] = this->m_ops[0];
+        SLP_TREE_CHILDREN (*this->m_node)[1] = this->m_ops[3];
+        SLP_TREE_CHILDREN (*this->m_node)[2] = newnode;
+
+        /* Tell the builder to expect an extra argument.  */
+        this->m_num_args++;
+        break;
+      }
+    default:
+      gcc_unreachable ();
+  }
 
   /* And then rewrite the node itself.  */
   complex_pattern::build (vinfo);
 }
 
-/*******************************************************************************
- * complex_fma_pattern class
- ******************************************************************************/
-
-class complex_fma_pattern : public complex_pattern
-{
-  protected:
-    complex_fma_pattern (slp_tree *node, vec<slp_tree> *m_ops, internal_fn ifn)
-      : complex_pattern (node, m_ops, ifn)
-    {
-      this->m_num_args = 3;
-    }
-
-  public:
-    void build (vec_info *);
-    static internal_fn
-    matches (complex_operation_t op, slp_tree_to_load_perm_map_t *, slp_tree *,
-             vec<slp_tree> *);
-
-    static vect_pattern*
-    recognize (slp_tree_to_load_perm_map_t *, slp_tree *);
-
-    static vect_pattern*
-    mkInstance (slp_tree *node, vec<slp_tree> *m_ops, internal_fn ifn)
-    {
-      return new complex_fma_pattern (node, m_ops, ifn);
-    }
-};
-
-/* Pattern matcher for trying to match complex multiply and accumulate
-   and multiply and subtract patterns in SLP tree.
-   If the operation matches then IFN is set to the operation it matched and
-   the arguments to the two replacement statements are put in m_ops.
-
-   If no match is found then IFN is set to IFN_LAST and m_ops is unchanged.
-
-   This function matches the patterns shaped as:
-
-   double ax = (b[i+1] * a[i]) + (b[i] * a[i]);
-   double bx = (a[i+1] * b[i]) - (a[i+1] * b[i+1]);
-
-   c[i] = c[i] - ax;
-   c[i+1] = c[i+1] + bx;
-
-   If a match occurred then TRUE is returned, else FALSE.  The match is
-   performed after COMPLEX_MUL which would have done the majority of the work.
-   This function merely matches an ADD with a COMPLEX_MUL IFN.  The initial
-   match is expected to be in OP1 and the initial match operands in args0.  */
-
-internal_fn
-complex_fma_pattern::matches (complex_operation_t op,
-                              slp_tree_to_load_perm_map_t * /* perm_cache */,
-                              slp_tree *ref_node, vec<slp_tree> *ops)
-{
-  internal_fn ifn = IFN_LAST;
-
-  /* Find the two components.  We match Complex MUL first which reduces the
-     amount of work this pattern has to do.  After that we just match the
-     head node and we're done.:
-
-     * FMA: + +.
-
-     We need to ignore the two_operands nodes that may also match.
-     For that we can check if they have any scalar statements and also
-     check that it's not a permute node as we're looking for a normal
-     PLUS_EXPR operation.  */
-  if (op != CMPLX_NONE)
-    return IFN_LAST;
-
-  /* Find the two components.  We match Complex MUL first which reduces the
-     amount of work this pattern has to do.  After that we just match the
-     head node and we're done.:
-
-   * FMA: + + on a non-two_operands node.  */
-  slp_tree vnode = *ref_node;
-  if (SLP_TREE_LANE_PERMUTATION (vnode).exists ()
-      || !SLP_TREE_CHILDREN (vnode).exists ()
-      || !vect_match_expression_p (vnode, PLUS_EXPR))
-    return IFN_LAST;
-
-  slp_tree node = SLP_TREE_CHILDREN (vnode)[1];
-
-  if (vect_match_call_p (node, IFN_COMPLEX_MUL))
-    ifn = IFN_COMPLEX_FMA;
-  else if (vect_match_call_p (node, IFN_COMPLEX_MUL_CONJ))
-    ifn = IFN_COMPLEX_FMA_CONJ;
-  else
-    return IFN_LAST;
-
-  if (!vect_pattern_validate_optab (ifn, vnode))
-    return IFN_LAST;
-
-  ops->truncate (0);
-  ops->create (3);
-
-  if (ifn == IFN_COMPLEX_FMA)
-    {
-      ops->quick_push (SLP_TREE_CHILDREN (vnode)[0]);
-      ops->quick_push (SLP_TREE_CHILDREN (node)[1]);
-      ops->quick_push (SLP_TREE_CHILDREN (node)[0]);
-    }
-  else
-    {
-      ops->quick_push (SLP_TREE_CHILDREN (vnode)[0]);
-      ops->quick_push (SLP_TREE_CHILDREN (node)[0]);
-      ops->quick_push (SLP_TREE_CHILDREN (node)[1]);
-    }
-
-  return ifn;
-}
-
-/* Attempt to recognize a complex mul pattern.  */
-
-vect_pattern*
-complex_fma_pattern::recognize (slp_tree_to_load_perm_map_t *perm_cache,
-                                slp_tree *node)
-{
-  auto_vec<slp_tree> ops;
-  complex_operation_t op
-    = vect_detect_pair_op (*node, true, &ops);
-  internal_fn ifn
-    = complex_fma_pattern::matches (op, perm_cache, node, &ops);
-  if (ifn == IFN_LAST)
-    return NULL;
-
-  return new complex_fma_pattern (node, &ops, ifn);
-}
-
-/* Perform a replacement of the detected complex mul pattern with the new
-   instruction sequences.  */
-
-void
-complex_fma_pattern::build (vec_info *vinfo)
-{
-  slp_tree node = SLP_TREE_CHILDREN (*this->m_node)[1];
-
-  SLP_TREE_CHILDREN (*this->m_node).release ();
-  SLP_TREE_CHILDREN (*this->m_node).create (3);
-  SLP_TREE_CHILDREN (*this->m_node).safe_splice (this->m_ops);
-
-  SLP_TREE_REF_COUNT (this->m_ops[1])++;
-  SLP_TREE_REF_COUNT (this->m_ops[2])++;
-
-  vect_free_slp_tree (node);
-
-  complex_pattern::build (vinfo);
-}
-
 /*******************************************************************************
  * complex_fms_pattern class
  ******************************************************************************/
@@ -1264,10 +1154,10 @@ class complex_fms_pattern : public complex_pattern
 
 };
 
-/* Pattern matcher for trying to match complex multiply and accumulate
-   and multiply and subtract patterns in SLP tree.
-   If the operation matches then IFN is set to the operation it matched and
-   the arguments to the two replacement statements are put in m_ops.
+/* Pattern matcher for trying to match complex multiply and subtract pattern
+   in SLP tree.  If the operation matches then IFN is set to the operation
+   it matched and the arguments to the two replacement statements are put in
+   m_ops.
 
    If no match is found then IFN is set to IFN_LAST and m_ops is unchanged.
 
@@ -1289,38 +1179,33 @@ complex_fms_pattern::matches (complex_operation_t op,
 {
   internal_fn ifn = IFN_LAST;
 
-  /* Find the two components.  We match Complex MUL first which reduces the
-     amount of work this pattern has to do.  After that we just match the
-     head node and we're done.:
-
-     * FMS: - +.  */
-  slp_tree child = NULL;
-
   /* We need to ignore the two_operands nodes that may also match,
      for that we can check if they have any scalar statements and also
      check that it's not a permute node as we're looking for a normal
-     PLUS_EXPR operation.  */
-  if (op != PLUS_MINUS)
+     MINUS_EXPR operation.  */
+  if (op != CMPLX_NONE)
     return IFN_LAST;
 
-  child = SLP_TREE_CHILDREN ((*ops)[1])[1];
-  if (vect_detect_pair_op (child) != MINUS_PLUS)
+  slp_tree root = *ref_node;
+  if (!vect_match_expression_p (root, MINUS_EXPR))
     return IFN_LAST;
 
-  /* First two nodes must be a multiply.  */
-  auto_vec<slp_tree> muls;
-  if (vect_match_call_complex_mla (child, 0) != MULT_MULT
-      || vect_match_call_complex_mla (child, 1, &muls) != MULT_MULT)
+  auto nodes = SLP_TREE_CHILDREN (root);
+  if (!vect_match_expression_p (nodes[1], MULT_EXPR)
+      || vect_detect_pair_op (nodes[0]) != PLUS_MINUS)
     return IFN_LAST;
 
+  auto childs = SLP_TREE_CHILDREN (nodes[0]);
+  auto l0node = SLP_TREE_CHILDREN (childs[0]);
+  auto l1node = SLP_TREE_CHILDREN (childs[1]);
+
   /* Now operand2+4 may lead to another expression.  */
   auto_vec<slp_tree> left_op, right_op;
-  left_op.safe_splice (SLP_TREE_CHILDREN (muls[0]));
-  right_op.safe_splice (SLP_TREE_CHILDREN (muls[1]));
+  left_op.safe_splice (SLP_TREE_CHILDREN (l0node[1]));
+  right_op.safe_splice (SLP_TREE_CHILDREN (nodes[1]));
 
   bool is_neg = vect_normalize_conj_loc (left_op);
 
-  child = SLP_TREE_CHILDREN ((*ops)[1])[0];
   bool conj_first_operand = false;
   if (!vect_validate_multiplication (perm_cache, right_op, left_op, false,
                                      &conj_first_operand, true))
@@ -1340,28 +1225,28 @@ complex_fms_pattern::matches (complex_operation_t op,
   complex_perm_kinds_t kind = linear_loads_p (perm_cache, right_op[0]);
   if (kind == PERM_EVENODD)
     {
-      ops->quick_push (child);
+      ops->quick_push (l0node[0]);
       ops->quick_push (right_op[0]);
       ops->quick_push (right_op[1]);
       ops->quick_push (left_op[1]);
     }
   else if (kind == PERM_TOP)
     {
-      ops->quick_push (child);
+      ops->quick_push (l0node[0]);
       ops->quick_push (right_op[1]);
       ops->quick_push (right_op[0]);
       ops->quick_push (left_op[0]);
     }
   else if (kind == PERM_EVENEVEN && !is_neg)
     {
-      ops->quick_push (child);
+      ops->quick_push (l0node[0]);
       ops->quick_push (right_op[1]);
       ops->quick_push (right_op[0]);
       ops->quick_push (left_op[0]);
     }
   else
     {
-      ops->quick_push (child);
+      ops->quick_push (l0node[0]);
       ops->quick_push (right_op[1]);
       ops->quick_push (right_op[0]);
       ops->quick_push (left_op[1]);
@@ -1473,10 +1358,6 @@ complex_operations_pattern::recognize (slp_tree_to_load_perm_map_t *perm_cache,
   if (ifn != IFN_LAST)
     return complex_mul_pattern::mkInstance (node, &ops, ifn);
 
-  ifn = complex_fma_pattern::matches (op, perm_cache, node, &ops);
-  if (ifn != IFN_LAST)
-    return complex_fma_pattern::mkInstance (node, &ops, ifn);
-
   ifn = complex_add_pattern::matches (op, perm_cache, node, &ops);
   if (ifn != IFN_LAST)
     return complex_add_pattern::mkInstance (node, &ops, ifn);
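
Note for readers on the crosslane check added to vect_detect_pair_op: the two
halves of a two_operands node are only accepted when they use the same pair of
children, in the same or in swapped order.  The sketch below shows that
condition in isolation.  It is illustration only and not part of the patch:
ToyNode, same_children_p and toy_self_check are hypothetical stand-ins, and the
real code works on slp_tree nodes through SLP_TREE_CHILDREN.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for an SLP node with exactly two children.
// The real slp_tree type in GCC carries far more state than this.
struct ToyNode
{
  std::array<const ToyNode *, 2> children;
};

// Return true when A and B refer to the same two children, either in the
// same order or swapped.  This mirrors the shape of the condition in the
// patch, not its actual types or surrounding logic.
static bool
same_children_p (const ToyNode &a, const ToyNode &b)
{
  const auto &l0 = a.children;
  const auto &l1 = b.children;
  return (l0[0] == l1[0] && l0[1] == l1[1])
         || (l0[0] == l1[1] && l0[1] == l1[0]);
}

// Small self-check using three distinct leaf nodes.
static bool
toy_self_check ()
{
  ToyNode x{}, y{}, z{};
  ToyNode plus{{&x, &y}};
  ToyNode minus_same{{&x, &y}};
  ToyNode minus_swapped{{&y, &x}};
  ToyNode unrelated{{&x, &z}};

  return same_children_p (plus, minus_same)
         && same_children_p (plus, minus_swapped)
         && !same_children_p (plus, unrelated);
}
```

In the patch a failed check makes vect_detect_pair_op return CMPLX_NONE before
any operands are pushed, so a pair of nodes that merely look like a PLUS/MINUS
pair but are not connected to the same operands is not matched.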