From patchwork Mon May 23 01:39:07 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiufu Guo
X-Patchwork-Id: 54272
To: gcc-patches@gcc.gnu.org
Subject: [PATCH V2]rs6000: Store complicated constant into pool
References: <20220510092850.1360120-1-guojiufu@linux.ibm.com>
Date: Mon, 23 May 2022 09:39:07 +0800
In-Reply-To: <20220510092850.1360120-1-guojiufu@linux.ibm.com> (Jiufu Guo's message of "Tue, 10 May 2022 17:28:50 +0800")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Jiufu Guo via Gcc-patches
From: Jiufu Guo
Reply-To: Jiufu Guo
Cc: dje.gcc@gmail.com, segher@kernel.crashing.org, linkw@gcc.gnu.org

Hi,

Following the discussion in the previous review:
https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591509.html

This patch simply updates the rtx_cost hook to give a more accurate cost
for constants by using 'COSTS_N_INSNS' with 'num_insns_constant'.  This
can keep CSE from eliminating the constant load.  Compared with the
previous patch, this patch also costs a prefixed load ('pld') as slightly
faster.

There are two ways to set a constant into a reg.  One is to build the
constant with instructions, like lis/ori/sldi...  The other is to load it
from the constant pool with the instruction 'ld' (or 'pld' for P10).
Loading a constant may need 2 instructions (or just 'pld' on P10), and
according to testing, if building the constant needs more than 2
instructions (or more than 1 instruction on P10), it is faster to load it
from the constant pool.
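
The rule described above can be sketched as a small standalone C function. This is only an illustration under stated assumptions: `prefer_constant_pool`, `build_insns` and `has_prefixed` are hypothetical names and not GCC identifiers. In the patch itself the instruction count comes from `num_insns_constant` and the P10 case is keyed on `TARGET_PREFIXED`.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of GCC: decide whether a constant
   should be loaded from the constant pool instead of being built.
   build_insns  - number of instructions needed to build the constant
                  (the role num_insns_constant plays in the patch).
   has_prefixed - true when a single prefixed load (pld) is available,
                  as on P10.  */
static bool
prefer_constant_pool (int build_insns, bool has_prefixed)
{
  /* A pool load is modelled as 2 instructions, or 1 with pld.  */
  int load_insns = has_prefixed ? 1 : 2;

  /* Use the pool only when building is strictly more expensive.  */
  return build_insns > load_insns;
}
```

Under this model a constant that takes 2 instructions to build stays inline when no prefixed load is available, but goes to the pool when `pld` can be used.
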

This patch lowers the instruction-count threshold above which a constant
is stored in the pool, and updates the costs of constants and memory
accesses.

Bootstrap and regtest pass on ppc64le and ppc64.
Is this ok for trunk?

BR,
Jiufu

	PR target/63281

gcc/ChangeLog:

	* config/rs6000/rs6000.cc (rs6000_cannot_force_const_mem): Exclude
	rtx with code 'HIGH'.
	(rs6000_emit_move): Update threshold of const insn.
	(rs6000_rtx_costs): Update cost of constant and mem.

gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/medium_offset.c: Update.
	* gcc.target/powerpc/pr93012.c: Update.
	* gcc.target/powerpc/pr63281.c: New test.

---
 gcc/config/rs6000/rs6000.cc                   | 23 +++++++++++++++----
 .../gcc.target/powerpc/medium_offset.c        |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr63281.c    | 11 +++++++++
 gcc/testsuite/gcc.target/powerpc/pr93012.c    |  2 +-
 4 files changed, 31 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr63281.c

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index cd291f93019..90c91a8e1ea 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -9706,8 +9706,9 @@ rs6000_init_stack_protect_guard (void)
 static bool
 rs6000_cannot_force_const_mem (machine_mode mode ATTRIBUTE_UNUSED, rtx x)
 {
-  if (GET_CODE (x) == HIGH
-      && GET_CODE (XEXP (x, 0)) == UNSPEC)
+  /* Exclude CONSTANT HIGH part.  e.g.
+     (high:DI (symbol_ref:DI ("var") [flags 0xc0] )).  */
+  if (GET_CODE (x) == HIGH)
     return true;
 
   /* A TLS symbol in the TOC cannot contain a sum.  */
@@ -11139,7 +11140,7 @@ rs6000_emit_move (rtx dest, rtx source, machine_mode mode)
                && FP_REGNO_P (REGNO (operands[0])))
               || !CONST_INT_P (operands[1])
               || (num_insns_constant (operands[1], mode)
-                  > (TARGET_CMODEL != CMODEL_SMALL ? 3 : 2)))
+                  > (TARGET_PREFIXED ? 1 : 2)))
           && !toc_relative_expr_p (operands[1], false, NULL, NULL)
           && (TARGET_CMODEL == CMODEL_SMALL
               || can_create_pseudo_p ()
@@ -22101,6 +22102,14 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
 
     case CONST_DOUBLE:
     case CONST_WIDE_INT:
+      /* Set a const to reg, it may need a few insns.  */
+      if (outer_code == SET)
+        {
+          *total = COSTS_N_INSNS (num_insns_constant (x, mode));
+          return true;
+        }
+      /* FALLTHRU */
+
     case CONST:
     case HIGH:
     case SYMBOL_REF:
@@ -22110,8 +22119,12 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
     case MEM:
       /* When optimizing for size, MEM should be slightly more expensive
          than generating address, e.g., (plus (reg) (const)).
-         L1 cache latency is about two instructions.  */
-      *total = !speed ? COSTS_N_INSNS (1) + 1 : COSTS_N_INSNS (2);
+         L1 cache latency is about two instructions.
+         For prefixed load (pld), we would set it slightly faster
+         than two instructions.  */
+      *total = !speed
+                 ? COSTS_N_INSNS (1) + 1
+                 : TARGET_PREFIXED ? COSTS_N_INSNS (2) - 1 : COSTS_N_INSNS (2);
       if (rs6000_slow_unaligned_access (mode, MEM_ALIGN (x)))
         *total += COSTS_N_INSNS (100);
       return true;
diff --git a/gcc/testsuite/gcc.target/powerpc/medium_offset.c b/gcc/testsuite/gcc.target/powerpc/medium_offset.c
index f29eba08c38..4889e8fa8ec 100644
--- a/gcc/testsuite/gcc.target/powerpc/medium_offset.c
+++ b/gcc/testsuite/gcc.target/powerpc/medium_offset.c
@@ -1,7 +1,7 @@
 /* { dg-do compile { target { powerpc*-*-* } } } */
 /* { dg-require-effective-target lp64 } */
 /* { dg-options "-O" } */
-/* { dg-final { scan-assembler-not "\\+4611686018427387904" } } */
+/* { dg-final { scan-assembler-times {\msldi|pld\M} 1 } } */
 
 static int x;
 
diff --git a/gcc/testsuite/gcc.target/powerpc/pr63281.c b/gcc/testsuite/gcc.target/powerpc/pr63281.c
new file mode 100644
index 00000000000..469a8f64400
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr63281.c
@@ -0,0 +1,11 @@
+/* PR target/63281 */
+/* { dg-do compile { target lp64 } } */
+/* { dg-options "-O2 -std=c99" } */
+
+void
+foo (unsigned long long *a)
+{
+  *a = 0x020805006106003;
+}
+
+/* { dg-final { scan-assembler-times {\mp?ld\M} 1 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr93012.c b/gcc/testsuite/gcc.target/powerpc/pr93012.c
index 4f764d0576f..5afb4f79c45 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr93012.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr93012.c
@@ -10,4 +10,4 @@ unsigned long long mskh1() { return 0xffff9234ffff9234ULL; }
 unsigned long long mskl1() { return 0x2bcdffff2bcdffffULL; }
 unsigned long long mskse() { return 0xffff1234ffff1234ULL; }
 
-/* { dg-final { scan-assembler-times {\mrldimi\M} 7 } } */
+/* { dg-final { scan-assembler-times {\mrldimi|ld|pld\M} 7 } } */