From patchwork Wed Sep 15 08:52:49 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kewen.Lin"
X-Patchwork-Id: 45016
X-Patchwork-Original-From: "Kewen.Lin via Gcc-patches"
List-Id: Gcc-patches mailing list
From: "Kewen.Lin"
Reply-To: "Kewen.Lin"
To: GCC Patches
Cc: Bill Schmidt, David Edelsohn, Segher Boessenkool
Subject: [PATCH] rs6000: Parameterize some const values for density test
Date: Wed, 15 Sep 2021 16:52:49 +0800

Hi,

This patch follows the discussion here[1], where Segher suggested
parameterizing those exact magic constants for density heuristics,
to make them easier to tweak if needed.  Since these heuristics are
quite internal, I have made these parameters undocumented; they are
mainly meant to be used by developers.

The change here should be "No Functional Change".  I still verified
it with SPEC2017 at option sets O2-vect and Ofast-unroll on Power8,
and the result is neutral as expected.

Bootstrapped and regress-tested on powerpc64le-linux-gnu Power9.

Is it ok for trunk?

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579121.html

BR,
Kewen
-----
gcc/ChangeLog:

        * config/rs6000/rs6000.opt (rs6000-density-pct-threshold,
        rs6000-density-size-threshold, rs6000-density-penalty,
        rs6000-density-load-pct-threshold,
        rs6000-density-load-num-threshold): New parameter.
        * config/rs6000/rs6000.c (rs6000_density_test): Adjust with
        corresponding parameters.
---
 gcc/config/rs6000/rs6000.c   | 22 +++++++---------------
 gcc/config/rs6000/rs6000.opt | 21 +++++++++++++++++++++
 2 files changed, 28 insertions(+), 15 deletions(-)

--
2.25.1

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 9bc826e3a50..4ab23b0ab33 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -5284,9 +5284,6 @@ struct rs6000_cost_data
 static void
 rs6000_density_test (rs6000_cost_data *data)
 {
-  const int DENSITY_PCT_THRESHOLD = 85;
-  const int DENSITY_SIZE_THRESHOLD = 70;
-  const int DENSITY_PENALTY = 10;
   struct loop *loop = data->loop_info;
   basic_block *bbs = get_loop_body (loop);
   int nbbs = loop->num_nodes;
@@ -5322,26 +5319,21 @@ rs6000_density_test (rs6000_cost_data *data)
   free (bbs);
   density_pct = (vec_cost * 100) / (vec_cost + not_vec_cost);
 
-  if (density_pct > DENSITY_PCT_THRESHOLD
-      && vec_cost + not_vec_cost > DENSITY_SIZE_THRESHOLD)
+  if (density_pct > rs6000_density_pct_threshold
+      && vec_cost + not_vec_cost > rs6000_density_size_threshold)
     {
-      data->cost[vect_body] = vec_cost * (100 + DENSITY_PENALTY) / 100;
+      data->cost[vect_body] = vec_cost * (100 + rs6000_density_penalty) / 100;
       if (dump_enabled_p ())
         dump_printf_loc (MSG_NOTE, vect_location,
                          "density %d%%, cost %d exceeds threshold, penalizing "
-                         "loop body cost by %d%%\n", density_pct,
-                         vec_cost + not_vec_cost, DENSITY_PENALTY);
+                         "loop body cost by %u%%\n", density_pct,
+                         vec_cost + not_vec_cost, rs6000_density_penalty);
     }
 
   /* Check whether we need to penalize the body cost to account
      for excess strided or elementwise loads.  */
   if (data->extra_ctor_cost > 0)
     {
-      /* Threshold for load stmts percentage in all vectorized stmts.  */
-      const int DENSITY_LOAD_PCT_THRESHOLD = 45;
-      /* Threshold for total number of load stmts.  */
-      const int DENSITY_LOAD_NUM_THRESHOLD = 20;
-
       gcc_assert (data->nloads <= data->nstmts);
       unsigned int load_pct = (data->nloads * 100) / data->nstmts;
 
@@ -5355,8 +5347,8 @@ rs6000_density_test (rs6000_cost_data *data)
          the loads.
          One typical case is the innermost loop of the hotspot of SPEC2017
          503.bwaves_r without loop interchange.  */
-      if (data->nloads > DENSITY_LOAD_NUM_THRESHOLD
-          && load_pct > DENSITY_LOAD_PCT_THRESHOLD)
+      if (data->nloads > (unsigned int) rs6000_density_load_num_threshold
+          && load_pct > (unsigned int) rs6000_density_load_pct_threshold)
         {
           data->cost[vect_body] += data->extra_ctor_cost;
           if (dump_enabled_p ())
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index 0538db387dc..563983f3269 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -639,3 +639,24 @@ Enable instructions that guard against return-oriented programming attacks.
 mprivileged
 Target Var(rs6000_privileged) Init(0)
 Generate code that will run in privileged state.
+
+-param=rs6000-density-pct-threshold=
+Target Undocumented Joined UInteger Var(rs6000_density_pct_threshold) Init(85) IntegerRange(0, 99) Param
+When costing for loop vectorization, we probably need to penalize the loop body cost if the existing cost model may not adequately reflect delays from unavailable vector resources. We collect the cost for vectorized statements and non-vectorized statements separately, check the proportion of vec_cost to total cost of vec_cost and non vec_cost, and penalize only if the proportion exceeds the threshold specified by this parameter. The default value is 85.
+
+-param=rs6000-density-size-threshold=
+Target Undocumented Joined UInteger Var(rs6000_density_size_threshold) Init(70) IntegerRange(0, 99) Param
+Like parameter rs6000-density-pct-threshold, we also check the total sum of vec_cost and non vec_cost, and penalize only if the sum exceeds the threshold specified by this parameter. The default value is 70.
+
+-param=rs6000-density-penalty=
+Target Undocumented Joined UInteger Var(rs6000_density_penalty) Init(10) IntegerRange(0, 1000) Param
+When both heuristics with rs6000-density-pct-threshold and rs6000-density-size-threshold are satisfied, we decide to penalize the loop body cost by the value which is specified by this parameter. The default value is 10.
+
+-param=rs6000-density-load-pct-threshold=
+Target Undocumented Joined UInteger Var(rs6000_density_load_pct_threshold) Init(45) IntegerRange(0, 99) Param
+When costing for loop vectorization, we probably need to penalize the loop body cost by accounting for excess strided or elementwise loads. We collect the numbers for general statements and load statements according to the information for statement to be vectorized, check the proportion of load statements, and penalize only if the proportion exceeds the threshold specified by this parameter. The default value is 45.
+
+-param=rs6000-density-load-num-threshold=
+Target Undocumented Joined UInteger Var(rs6000_density_load_num_threshold) Init(20) IntegerRange(0, 1000) Param
+Like parameter rs6000-density-load-pct-threshold, we also check if the total number of load statements exceeds the threshold specified by this parameter, and penalize only if it's satisfied. The default value is 20.
+
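
For anyone who wants to experiment with the thresholds outside the compiler, the two checks in rs6000_density_test can be sketched as a standalone C fragment. This is only an illustration under stated assumptions, not the code in rs6000.c: struct density_params, default_density_params, density_adjusted_cost and excess_loads_p are hypothetical names that do not exist in GCC, the defaults are copied from the Init() values in the .opt entries above, and the guards against a zero divisor are additions of the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical container for the five parameters; not a GCC type.  */
struct density_params
{
  int pct_threshold;       /* rs6000-density-pct-threshold */
  int size_threshold;      /* rs6000-density-size-threshold */
  int penalty;             /* rs6000-density-penalty */
  int load_pct_threshold;  /* rs6000-density-load-pct-threshold */
  int load_num_threshold;  /* rs6000-density-load-num-threshold */
};

/* Defaults copied from the Init() values in the patch.  */
const struct density_params default_density_params = { 85, 70, 10, 45, 20 };

/* Return the loop body cost after the density check: VEC_COST scaled
   by (100 + penalty)% when both the density percentage and the total
   cost exceed their thresholds, otherwise VEC_COST unchanged.  */
int
density_adjusted_cost (const struct density_params *p,
                       int vec_cost, int not_vec_cost)
{
  int total = vec_cost + not_vec_cost;

  /* Guard added for the sketch only; the patch has no such check.  */
  if (total <= 0)
    return vec_cost;

  int density_pct = (vec_cost * 100) / total;
  if (density_pct > p->pct_threshold && total > p->size_threshold)
    return vec_cost * (100 + p->penalty) / 100;

  return vec_cost;
}

/* Return true when both load thresholds are exceeded.  In the patch
   this test only runs when extra_ctor_cost > 0, and a true result
   adds extra_ctor_cost to the body cost.  */
bool
excess_loads_p (const struct density_params *p,
                unsigned int nloads, unsigned int nstmts)
{
  assert (nloads <= nstmts);

  /* Guard added for the sketch only.  */
  if (nstmts == 0)
    return false;

  unsigned int load_pct = (nloads * 100) / nstmts;
  return nloads > (unsigned int) p->load_num_threshold
         && load_pct > (unsigned int) p->load_pct_threshold;
}
```

With the defaults, vec_cost = 90 and not_vec_cost = 10 give a density of 90% and a total of 100, so both thresholds are exceeded and the body cost becomes 90 * 110 / 100 = 99. With the patch applied, the same experiment in the compiler would presumably use an option such as --param=rs6000-density-penalty=20, following the usual mapping from a Param entry in the .opt file to a --param option.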