Message ID | VE1PR08MB559988D252EEABF494E24BAD83BC9@VE1PR08MB5599.eurprd08.prod.outlook.com |
---|---|
State | Committed |
Commit | 9c751b88def09f71e1edbf623d41776478ead6fd |
Headers | To: GCC Patches <gcc-patches@gcc.gnu.org> Subject: [PATCH] AArch64: Tune case-values-threshold Date: Mon, 18 Oct 2021 16:19:00 +0000 From: Wilco Dijkstra via Gcc-patches <gcc-patches@gcc.gnu.org> Reply-To: Wilco Dijkstra <Wilco.Dijkstra@arm.com> Cc: Richard Sandiford <Richard.Sandiford@arm.com> List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org> |
Series |
AArch64: Tune case-values-threshold
|
|
Commit Message
Wilco Dijkstra
Oct. 18, 2021, 4:19 p.m. UTC
Tune the case-values-threshold setting for modern cores. A value of 11 improves
SPECINT2017 by 0.2% and reduces codesize by 0.04%. With -Os use value 8 which
reduces codesize by 0.07%.

Passes regress, OK for commit?

ChangeLog:

2021-10-18 Wilco Dijkstra <wdijkstr@arm.com>

        * config/aarch64/aarch64.c (aarch64_case_values_threshold):
        Change to 8 with -Os, 11 otherwise.

---
Comments
Wilco Dijkstra <Wilco.Dijkstra@arm.com> writes:
> Tune the case-values-threshold setting for modern cores. A value of 11 improves
> SPECINT2017 by 0.2% and reduces codesize by 0.04%. With -Os use value 8 which
> reduces codesize by 0.07%.
>
> Passes regress, OK for commit?
>
> ChangeLog:
>
> 2021-10-18 Wilco Dijkstra <wdijkstr@arm.com>
>
>         * config/aarch64/aarch64.c (aarch64_case_values_threshold):
>         Change to 8 with -Os, 11 otherwise.
>
> ---
>
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index f5b25a7f7041645921e6ad85714efda73b993492..adc5256c5ccc1182710d87cc6a1091083d888663 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -9360,8 +9360,8 @@ aarch64_cannot_force_const_mem (machine_mode mode ATTRIBUTE_UNUSED, rtx x)
>     The expansion for a table switch is quite expensive due to the number
>     of instructions, the table lookup and hard to predict indirect jump.
>     When optimizing for speed, and -O3 enabled, use the per-core tuning if
> -   set, otherwise use tables for > 16 cases as a tradeoff between size and
> -   performance. When optimizing for size, use the default setting. */
> +   set, otherwise use tables for >= 11 cases as a tradeoff between size and
> +   performance. When optimizing for size, use 8 for smallest codesize. */

I'm just concerned that here we're using the same explanation but with
different numbers. Why are the new numbers more right than the old ones
(especially when it comes to code size, where the trade-off hasn't
really changed)?

It would be good to have more discussion of why certain numbers are
too small or too high, and why 8 is the right pivot point for -Os.

Thanks,
Richard

>
>  static unsigned int
>  aarch64_case_values_threshold (void)
> @@ -9372,7 +9372,7 @@ aarch64_case_values_threshold (void)
>        && selected_cpu->tune->max_case_values != 0)
>      return selected_cpu->tune->max_case_values;
>    else
> -    return optimize_size ? default_case_values_threshold () : 17;
> +    return optimize_size ? 8 : 11;
>  }
>
>  /* Return true if register REGNO is a valid index register.
Hi Richard,

> I'm just concerned that here we're using the same explanation but with
> different numbers. Why are the new numbers more right than the old ones
> (especially when it comes to code size, where the trade-off hasn't
> really changed)?

Like all tuning/costing parameters, these values are never fixed but change
over time due to optimizations, micro architectures and workloads.
The previous values were out of date so that's why I retuned them by
benchmarking different values and choosing the best combinations.

> It would be good to have more discussion of why certain numbers are
> too small or too high, and why 8 is the right pivot point for -Os.

You mean add more discussion in the comment? That comment is already overly
large and too specific - it would be better to reduce it. The -Os value was never
tuned, and 8 turns out to be faster and smaller than GCC's default.

Cheers,
Wilco
Wilco Dijkstra <Wilco.Dijkstra@arm.com> writes:
> Hi Richard,
>
>> I'm just concerned that here we're using the same explanation but with
>> different numbers. Why are the new numbers more right than the old ones
>> (especially when it comes to code size, where the trade-off hasn't
>> really changed)?
>
> Like all tuning/costing parameters, these values are never fixed but change
> over time due to optimizations, micro architectures and workloads.
> The previous values were out of date so that's why I retuned them by
> benchmarking different values and choosing the best combinations.
>
>> It would be good to have more discussion of why certain numbers are
>> too small or too high, and why 8 is the right pivot point for -Os.
>
> You mean add more discussion in the comment? That comment is already overly
> large and too specific - it would be better to reduce it. The -Os value was never
> tuned, and 8 turns out to be faster and smaller than GCC's default.

The problem is that you're effectively asking for these values to be
taken on faith without providing any analysis and without describing
how you arrived at the new numbers. Did you try other values too?
If so, how did they compare with the numbers that you finally chose?
At least that would give an indication of where the boundaries are.

For example, it's easier to believe that 8 is the right value for -Os if
you say that you tried 9 and 7 as well, and they were worse than 8 by X%
and Y%. This would also help anyone who wants to tweak the numbers
again in future.

BTW, which CPU did you use to do the experiments? Are the tuning
parameters for that CPU already consistent with the new generic values?

Thanks,
Richard
On 19 October 2021 15:23:58 CEST, Richard Sandiford via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>Wilco Dijkstra <Wilco.Dijkstra@arm.com> writes:
>> Hi Richard,
>>
>>> I'm just concerned that here we're using the same explanation but with
>>> different numbers. Why are the new numbers more right than the old ones
>>> (especially when it comes to code size, where the trade-off hasn't
>>> really changed)?
>>
>> Like all tuning/costing parameters, these values are never fixed but change
>> over time due to optimizations, micro architectures and workloads.
>> The previous values were out of date so that's why I retuned them by
>> benchmarking different values and choosing the best combinations.
>>
>>> It would be good to have more discussion of why certain numbers are
>>> too small or too high, and why 8 is the right pivot point for -Os.
>>
>> You mean add more discussion in the comment? That comment is already overly
>> large and too specific - it would be better to reduce it. The -Os value was never
>> tuned, and 8 turns out to be faster and smaller than GCC's default.
>
>The problem is that you're effectively asking for these values to be
>taken on faith without providing any analysis and without describing
>how you arrived at the new numbers. Did you try other values too?
>If so, how did they compare with the numbers that you finally chose?
>At least that would give an indication of where the boundaries are.

Maybe you can show csibe benchmark numbers to show the effects:
http://szeged.github.io/csibe/

thanks,

>
>For example, it's easier to believe that 8 is the right value for -Os if
>you say that you tried 9 and 7 as well, and they were worse than 8 by X%
>and Y%. This would also help anyone who wants to tweak the numbers
>again in future.
>
>BTW, which CPU did you use to do the experiments? Are the tuning
>parameters for that CPU already consistent with the new generic values?
>
>Thanks,
>Richard
Hi Richard,

> The problem is that you're effectively asking for these values to be
> taken on faith without providing any analysis and without describing
> how you arrived at the new numbers. Did you try other values too?
> If so, how did they compare with the numbers that you finally chose?
> At least that would give an indication of where the boundaries are.

Yes, I obviously tried other values, pretty much all in range 1-20. There is
generally a range of 4-5 values that are very similar in size, and then you
choose one in the middle which also looks good for performance.

> For example, it's easier to believe that 8 is the right value for -Os if
> you say that you tried 9 and 7 as well, and they were worse than 8 by X%
> and Y%. This would also help anyone who wants to tweak the numbers
> again in future.

For -Os, the size range for values 6-10 is within 0.01% so they are virtually
identical and I picked the median. Whether this will remain best in the future
is unclear since it depends on so many things, so at some point it needs
to be looked at again, just like most other tunings.

> BTW, which CPU did you use to do the experiments? Are the tuning
> parameters for that CPU already consistent with the new generic values?

This was done on Neoverse N1. Almost no CPUs use per-CPU tuning for this.

Cheers,
Wilco
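The selection method described in this reply (find the run of candidate values whose code size is nearly identical, then take the one in the middle) can be sketched in C. This is only an illustration of that description, not the script actually used for the tuning: the function name `pick_plateau_median`, its parameters, and the idea of passing measured sizes as an array are all assumptions made here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sizes[i] is the measured code size obtained with
   threshold value (first + i).  Find the smallest size, widen to the
   contiguous run of neighbours whose size is within `tolerance`
   (a fraction, e.g. 0.0001 for 0.01%) of that minimum, and return the
   threshold in the middle of that run.  */
static unsigned int
pick_plateau_median (const double *sizes, size_t n, unsigned int first,
                     double tolerance)
{
  assert (sizes != NULL && n > 0);

  /* Locate the best (smallest) measurement.  */
  size_t best = 0;
  for (size_t i = 1; i < n; i++)
    if (sizes[i] < sizes[best])
      best = i;

  /* Grow the plateau left and right while sizes stay within tolerance.  */
  double limit = sizes[best] * (1.0 + tolerance);
  size_t lo = best, hi = best;
  while (lo > 0 && sizes[lo - 1] <= limit)
    lo--;
  while (hi + 1 < n && sizes[hi + 1] <= limit)
    hi++;

  /* Middle of the plateau; rounds down when the run has even length.  */
  return first + (unsigned int) (lo + (hi - lo) / 2);
}
```

With measurements for thresholds 1 to 20, a call would look like `pick_plateau_median (sizes, 20, 1, 0.0001)`. Performance would still need to be checked separately, since the reply notes that the chosen value should also look good for speed.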
Wilco Dijkstra <Wilco.Dijkstra@arm.com> writes:
> Hi Richard,
>
>> The problem is that you're effectively asking for these values to be
>> taken on faith without providing any analysis and without describing
>> how you arrived at the new numbers. Did you try other values too?
>> If so, how did they compare with the numbers that you finally chose?
>> At least that would give an indication of where the boundaries are.
>
> Yes, I obviously tried other values, pretty much all in range 1-20. There is
> generally a range of 4-5 values that are very similar in size, and then you
> choose one in the middle which also looks good for performance.
>
>> For example, it's easier to believe that 8 is the right value for -Os if
>> you say that you tried 9 and 7 as well, and they were worse than 8 by X%
>> and Y%. This would also help anyone who wants to tweak the numbers
>> again in future.
>
> For -Os, the size range for values 6-10 is within 0.01% so they are virtually
> identical and I picked the median. Whether this will remain best in the future
> is unclear since it depends on so many things, so at some point it needs
> to be looked at again, just like most other tunings.

Thanks. These details are useful. For example, if someone finds
a compelling reason to bump the new values by +/-2 (to help with a
particular test case) then it sounds like we should accept that, since it
wouldn't conflict with your work.

So the patch is OK, thanks.

(FWIW, I tried building a linux kernel I had lying around at -Os,
which also showed an improvement of ~0.07%.)

Richard
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index f5b25a7f7041645921e6ad85714efda73b993492..adc5256c5ccc1182710d87cc6a1091083d888663 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -9360,8 +9360,8 @@ aarch64_cannot_force_const_mem (machine_mode mode ATTRIBUTE_UNUSED, rtx x)
    The expansion for a table switch is quite expensive due to the number
    of instructions, the table lookup and hard to predict indirect jump.
    When optimizing for speed, and -O3 enabled, use the per-core tuning if
-   set, otherwise use tables for > 16 cases as a tradeoff between size and
-   performance. When optimizing for size, use the default setting. */
+   set, otherwise use tables for >= 11 cases as a tradeoff between size and
+   performance. When optimizing for size, use 8 for smallest codesize. */
 
 static unsigned int
 aarch64_case_values_threshold (void)
@@ -9372,7 +9372,7 @@ aarch64_case_values_threshold (void)
       && selected_cpu->tune->max_case_values != 0)
     return selected_cpu->tune->max_case_values;
   else
-    return optimize_size ? default_case_values_threshold () : 17;
+    return optimize_size ? 8 : 11;
 }
 
 /* Return true if register REGNO is a valid index register.
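
To see the effect of the patched function outside the compiler, the decision can be modelled as a small standalone C sketch. Everything here is a stand-in: `struct tune_model`, `model_case_values_threshold` and `model_prefers_jump_table` are hypothetical names, GCC's globals are replaced by parameters, and the `optimize_level > 2` test is an assumption based on the "-O3 enabled" wording in the comment, since that line is outside the hunk shown above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-core tuning record; only the one
   field referenced by the patch is modelled.  */
struct tune_model
{
  unsigned int max_case_values; /* 0 means no per-core override.  */
};

/* Model of the patched threshold selection.  The -O3 check is assumed
   from the source comment; the 8 and 11 come from the patch itself.  */
static unsigned int
model_case_values_threshold (int optimize_level, bool optimize_size,
                             const struct tune_model *tune)
{
  assert (tune != NULL);
  if (optimize_level > 2 && tune->max_case_values != 0)
    return tune->max_case_values;
  return optimize_size ? 8 : 11;
}

/* Simplified view of how the threshold is consumed: a switch becomes a
   jump-table candidate once it has at least `threshold` case values.
   Real GCC applies further checks that are not modelled here.  */
static bool
model_prefers_jump_table (unsigned int num_cases, unsigned int threshold)
{
  return num_cases >= threshold;
}
```

Under this model, a core with no `max_case_values` override gets a threshold of 11 when optimizing for speed and 8 with -Os, so a 10-case switch would stay a compare tree at -O2 but become a table candidate at -Os.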