From patchwork Wed Dec 27 10:40:57 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Di Zhao OS
X-Patchwork-Id: 82889
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
From: Di Zhao OS
To: "gcc-patches@gcc.gnu.org"
Subject: [PATCH] aarch64: add 'AARCH64_EXTRA_TUNE_FULLY_PIPELINED_FMA'
Date: Wed, 27 Dec 2023 10:40:57 +0000
Accept-Language: en-US
Content-Language: zh-CN
X-OriginatorOrg: os.amperecomputing.com
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.30
Precedence: list
List-Id: Gcc-patches mailing list

This patch adds a new tuning option
'AARCH64_EXTRA_TUNE_FULLY_PIPELINED_FMA', to consider fully
pipelined FMAs in reassociation. Also, set this option by default
for Ampere CPUs.

Tested on aarch64-unknown-linux-gnu. Is this OK for trunk?

Thanks,
Di Zhao

gcc/ChangeLog:

	* config/aarch64/aarch64-tuning-flags.def
	(AARCH64_EXTRA_TUNING_OPTION): New tuning option
	AARCH64_EXTRA_TUNE_FULLY_PIPELINED_FMA.
	* config/aarch64/aarch64.cc
	(aarch64_override_options_internal): Set
	param_fully_pipelined_fma according to tuning option.
	* config/aarch64/tuning_models/ampere1.h: Add
	AARCH64_EXTRA_TUNE_FULLY_PIPELINED_FMA to tune_flags.
	* config/aarch64/tuning_models/ampere1a.h: Likewise.
	* config/aarch64/tuning_models/ampere1b.h: Likewise.
---
 gcc/config/aarch64/aarch64-tuning-flags.def | 2 ++
 gcc/config/aarch64/aarch64.cc               | 6 ++++++
 gcc/config/aarch64/tuning_models/ampere1.h  | 3 ++-
 gcc/config/aarch64/tuning_models/ampere1a.h | 3 ++-
 gcc/config/aarch64/tuning_models/ampere1b.h | 3 ++-
 5 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-tuning-flags.def b/gcc/config/aarch64/aarch64-tuning-flags.def
index f28a73839a6..256f17bad60 100644
--- a/gcc/config/aarch64/aarch64-tuning-flags.def
+++ b/gcc/config/aarch64/aarch64-tuning-flags.def
@@ -49,4 +49,6 @@ AARCH64_EXTRA_TUNING_OPTION ("matched_vector_throughput", MATCHED_VECTOR_THROUGH
 
 AARCH64_EXTRA_TUNING_OPTION ("avoid_cross_loop_fma", AVOID_CROSS_LOOP_FMA)
 
+AARCH64_EXTRA_TUNING_OPTION ("fully_pipelined_FMA", FULLY_PIPELINED_FMA)
+
 #undef AARCH64_EXTRA_TUNING_OPTION
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index f9850320f61..1b3b288cdf9 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -18289,6 +18289,12 @@ aarch64_override_options_internal (struct gcc_options *opts)
     SET_OPTION_IF_UNSET (opts, &global_options_set, param_avoid_fma_max_bits,
 			 512);
 
+  /* Consider fully pipelined FMA in reassociation.  */
+  if (aarch64_tune_params.extra_tuning_flags
+      & AARCH64_EXTRA_TUNE_FULLY_PIPELINED_FMA)
+    SET_OPTION_IF_UNSET (opts, &global_options_set, param_fully_pipelined_fma,
+			 1);
+
   aarch64_override_options_after_change_1 (opts);
 }
 
diff --git a/gcc/config/aarch64/tuning_models/ampere1.h b/gcc/config/aarch64/tuning_models/ampere1.h
index a144e8f94b3..d63788528a7 100644
--- a/gcc/config/aarch64/tuning_models/ampere1.h
+++ b/gcc/config/aarch64/tuning_models/ampere1.h
@@ -104,7 +104,8 @@ static const struct tune_params ampere1_tunings =
   2,	/* min_div_recip_mul_df.  */
   0,	/* max_case_values.  */
   tune_params::AUTOPREFETCHER_WEAK,	/* autoprefetcher_model.  */
-  (AARCH64_EXTRA_TUNE_AVOID_CROSS_LOOP_FMA),	/* tune_flags.  */
+  (AARCH64_EXTRA_TUNE_AVOID_CROSS_LOOP_FMA |
+   AARCH64_EXTRA_TUNE_FULLY_PIPELINED_FMA),	/* tune_flags.  */
   &ampere1_prefetch_tune,
   AARCH64_LDP_STP_POLICY_ALIGNED,   /* ldp_policy_model.  */
   AARCH64_LDP_STP_POLICY_ALIGNED    /* stp_policy_model.  */
diff --git a/gcc/config/aarch64/tuning_models/ampere1a.h b/gcc/config/aarch64/tuning_models/ampere1a.h
index f688ed08a79..63506e1d1c6 100644
--- a/gcc/config/aarch64/tuning_models/ampere1a.h
+++ b/gcc/config/aarch64/tuning_models/ampere1a.h
@@ -56,7 +56,8 @@ static const struct tune_params ampere1a_tunings =
   2,	/* min_div_recip_mul_df.  */
   0,	/* max_case_values.  */
   tune_params::AUTOPREFETCHER_WEAK,	/* autoprefetcher_model.  */
-  (AARCH64_EXTRA_TUNE_AVOID_CROSS_LOOP_FMA),	/* tune_flags.  */
+  (AARCH64_EXTRA_TUNE_AVOID_CROSS_LOOP_FMA |
+   AARCH64_EXTRA_TUNE_FULLY_PIPELINED_FMA),	/* tune_flags.  */
   &ampere1_prefetch_tune,
   AARCH64_LDP_STP_POLICY_ALIGNED,   /* ldp_policy_model.  */
   AARCH64_LDP_STP_POLICY_ALIGNED    /* stp_policy_model.  */
diff --git a/gcc/config/aarch64/tuning_models/ampere1b.h b/gcc/config/aarch64/tuning_models/ampere1b.h
index a98b6a980f7..7894e730174 100644
--- a/gcc/config/aarch64/tuning_models/ampere1b.h
+++ b/gcc/config/aarch64/tuning_models/ampere1b.h
@@ -106,7 +106,8 @@ static const struct tune_params ampere1b_tunings =
   0,	/* max_case_values.  */
   tune_params::AUTOPREFETCHER_STRONG,	/* autoprefetcher_model.  */
   (AARCH64_EXTRA_TUNE_CHEAP_SHIFT_EXTEND |
-   AARCH64_EXTRA_TUNE_AVOID_CROSS_LOOP_FMA),	/* tune_flags.  */
+   AARCH64_EXTRA_TUNE_AVOID_CROSS_LOOP_FMA |
+   AARCH64_EXTRA_TUNE_FULLY_PIPELINED_FMA),	/* tune_flags.  */
   &ampere1b_prefetch_tune,
   AARCH64_LDP_STP_POLICY_ALIGNED,   /* ldp_policy_model.  */
   AARCH64_LDP_STP_POLICY_ALIGNED    /* stp_policy_model.  */
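
Reviewer note (illustration only, not part of the patch): the change relies on two mechanisms, a .def-style option list that is expanded into bit flags, and a "set only if the user has not set it" override of a --param. The standalone C++ sketch below imitates both so the intended behaviour is easy to check in isolation. Every name in it (TUNING_OPTIONS, IDX_*, TUNE_*, resolve_fully_pipelined_fma) is hypothetical and simplified; it does not reproduce GCC's actual macros or data structures, and the fallback value of 0 for an unset parameter is an assumption.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the option list in aarch64-tuning-flags.def.
// Adding an option is a one-line change here, as in the patch.
#define TUNING_OPTIONS(X)   \
  X (AVOID_CROSS_LOOP_FMA)  \
  X (FULLY_PIPELINED_FMA)

// First expansion: give every option a consecutive bit index.
enum tuning_index
{
#define X(name) IDX_##name,
  TUNING_OPTIONS (X)
#undef X
  IDX_COUNT
};

// Second expansion: turn each index into a single-bit mask.
enum tuning_flag : std::uint32_t
{
#define X(name) TUNE_##name = 1u << IDX_##name,
  TUNING_OPTIONS (X)
#undef X
};

// Imitates the "set if unset" idea: an explicit user value always wins;
// otherwise the CPU tuning flags decide. The default of 0 is assumed.
int
resolve_fully_pipelined_fma (std::uint32_t tune_flags,
                             std::optional<int> user_value)
{
  if (user_value.has_value ())
    return *user_value;
  return (tune_flags & TUNE_FULLY_PIPELINED_FMA) ? 1 : 0;
}
```

With flags equivalent to the new Ampere settings (both bits set) and no user value, the sketch resolves the parameter to 1. With only the AVOID_CROSS_LOOP_FMA bit set it resolves to 0, and an explicit user value of 0 is kept even when the FULLY_PIPELINED_FMA bit is set.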