From patchwork Tue Jun 14 13:23:55 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jiufu Guo X-Patchwork-Id: 55072 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id E3D34385C33E for ; Tue, 14 Jun 2022 13:24:32 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org E3D34385C33E DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1655213072; bh=yfN7xae5W6O72lDpk0/PsOpq7iBocQyoWDHrANClo9Q=; h=To:Subject:Date:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:Cc:From; b=Bd+8YNgqoHmRhhZqe5AaMkRhtAri7MXdq3DUzTGBMeNoTKSgtAXO9t0pTGYNJalke xysvaPP07Tw5Ish50EGnqNYcet6HOd2P8/lL3dugoAXcb1NWD0vmc/K5y7epbcc8Nf XYVxtQMi6L+XCo0VAFKuNtbKOzMxf3nuRvxRPiZY= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by sourceware.org (Postfix) with ESMTPS id 106873858D28; Tue, 14 Jun 2022 13:24:03 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 106873858D28 Received: from pps.filterd (m0127361.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 25EDBeIv037441; Tue, 14 Jun 2022 13:24:02 GMT Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3gpr3f5c05-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 14 Jun 2022 13:24:02 +0000 Received: from m0127361.ppops.net (m0127361.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 25EDKTI8034606; Tue, 14 Jun 2022 13:24:01 GMT Received: from ppma03ams.nl.ibm.com (62.31.33a9.ip4.static.sl-reverse.com [169.51.49.98]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3gpr3f5byn-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 14 Jun 2022 13:24:01 +0000 Received: from pps.filterd (ppma03ams.nl.ibm.com [127.0.0.1]) by ppma03ams.nl.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 25ED6XNK008610; Tue, 14 Jun 2022 13:23:59 GMT Received: from b06cxnps4075.portsmouth.uk.ibm.com (d06relay12.portsmouth.uk.ibm.com [9.149.109.197]) by ppma03ams.nl.ibm.com with ESMTP id 3gmjp9cdra-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 14 Jun 2022 13:23:59 +0000 Received: from b06wcsmtp001.portsmouth.uk.ibm.com (b06wcsmtp001.portsmouth.uk.ibm.com [9.149.105.160]) by b06cxnps4075.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 25EDNvsa17170710 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 14 Jun 2022 13:23:57 GMT Received: from b06wcsmtp001.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 56C2AA405B; Tue, 14 Jun 2022 13:23:57 +0000 (GMT) Received: from b06wcsmtp001.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 71FACA4054; Tue, 14 Jun 2022 13:23:56 +0000 (GMT) Received: from pike.rch.stglabs.ibm.com (unknown [9.5.12.127]) by b06wcsmtp001.portsmouth.uk.ibm.com (Postfix) with ESMTP; Tue, 14 Jun 2022 13:23:56 +0000 (GMT) To: gcc-patches@gcc.gnu.org Subject: [PATCH V2]rs6000: Store complicated constant into pool Date: Tue, 14 Jun 2022 21:23:55 +0800 Message-Id: <20220614132355.117887-1-guojiufu@linux.ibm.com> X-Mailer: git-send-email 2.17.1 X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: UVCPWpd9OxedpDWjLZQnlO5hU6yjBSiR X-Proofpoint-GUID: agX3XFInBlMv17_yO3uy_zkJ6FQWT-Ty X-Proofpoint-UnRewURL: 0 URL was un-rewritten MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.874,Hydra:6.0.517,FMLib:17.11.64.514 definitions=2022-06-14_04,2022-06-13_01,2022-02-23_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 impostorscore=0 malwarescore=0 adultscore=0 phishscore=0 priorityscore=1501 clxscore=1011 bulkscore=0 suspectscore=0 spamscore=0 mlxlogscore=999 mlxscore=0 lowpriorityscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2204290000 definitions=main-2206140053 X-Spam-Status: No, score=-12.3 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Jiufu Guo via Gcc-patches From: Jiufu Guo Reply-To: Jiufu Guo Cc: amodra@gmail.com, dje.gcc@gmail.com, segher@kernel.crashing.org, linkw@gcc.gnu.org Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" Hi, This patch is same with nearly: https://gcc.gnu.org/pipermail/gcc-patches/2022-May/595378.html The concept of this patch is similar with the patches which is attached in PR63281. e.g. https://gcc.gnu.org/bugzilla/attachment.cgi?id=42186 I had a test for perlbench from SPEC2017. As expected, on -O2 for P9, there is ~2% performance improvement with this patch. This patch reduces the threshold of instruction number for storing constant to pool and update cost for constant and mem accessing. And then if building the constant needs more than 2 instructions (or more than 1 instruction on P10), then prefer to load it from constant pool. Bootstrap and regtest pass on ppc64le and ppc64. Is this ok for trunk? Thanks for comments and sugguestions. BR, Jiufu PR target/63281 gcc/ChangeLog: 2022-06-14 Jiufu Guo Alan Modra * config/rs6000/rs6000.cc (rs6000_cannot_force_const_mem): Exclude rtx with code 'HIGH'. (rs6000_emit_move): Update threshold of const insn. (rs6000_rtx_costs): Update cost of constant and mem. gcc/testsuite/ChangeLog: 2022-06-14 Jiufu Guo Alan Modra * gcc.target/powerpc/medium_offset.c: Update. * gcc.target/powerpc/pr93012.c: Update. * gcc.target/powerpc/pr63281.c: New test. --- gcc/config/rs6000/rs6000.cc | 23 +++++++++++++++---- .../gcc.target/powerpc/medium_offset.c | 2 +- gcc/testsuite/gcc.target/powerpc/pr63281.c | 11 +++++++++ gcc/testsuite/gcc.target/powerpc/pr93012.c | 2 +- 4 files changed, 31 insertions(+), 7 deletions(-) create mode 100644 gcc/testsuite/gcc.target/powerpc/pr63281.c diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc index cd291f93019..90c91a8e1ea 100644 --- a/gcc/config/rs6000/rs6000.cc +++ b/gcc/config/rs6000/rs6000.cc @@ -9706,8 +9706,9 @@ rs6000_init_stack_protect_guard (void) static bool rs6000_cannot_force_const_mem (machine_mode mode ATTRIBUTE_UNUSED, rtx x) { - if (GET_CODE (x) == HIGH - && GET_CODE (XEXP (x, 0)) == UNSPEC) + /* Exclude CONSTANT HIGH part. e.g. + (high:DI (symbol_ref:DI ("var") [flags 0xc0] )). */ + if (GET_CODE (x) == HIGH) return true; /* A TLS symbol in the TOC cannot contain a sum. */ @@ -11139,7 +11140,7 @@ rs6000_emit_move (rtx dest, rtx source, machine_mode mode) && FP_REGNO_P (REGNO (operands[0]))) || !CONST_INT_P (operands[1]) || (num_insns_constant (operands[1], mode) - > (TARGET_CMODEL != CMODEL_SMALL ? 3 : 2))) + > (TARGET_PREFIXED ? 1 : 2))) && !toc_relative_expr_p (operands[1], false, NULL, NULL) && (TARGET_CMODEL == CMODEL_SMALL || can_create_pseudo_p () @@ -22101,6 +22102,14 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code, case CONST_DOUBLE: case CONST_WIDE_INT: + /* It may needs a few insns for const to SET. -1 for outer SET code. */ + if (outer_code == SET) + { + *total = COSTS_N_INSNS (num_insns_constant (x, mode)) - 1; + return true; + } + /* FALLTHRU */ + case CONST: case HIGH: case SYMBOL_REF: @@ -22110,8 +22119,12 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code, case MEM: /* When optimizing for size, MEM should be slightly more expensive than generating address, e.g., (plus (reg) (const)). - L1 cache latency is about two instructions. */ - *total = !speed ? COSTS_N_INSNS (1) + 1 : COSTS_N_INSNS (2); + L1 cache latency is about two instructions. + For prefixed load (pld), we would set it slightly faster than + than two instructions. */ + *total = !speed + ? COSTS_N_INSNS (1) + 1 + : TARGET_PREFIXED ? COSTS_N_INSNS (2) - 1 : COSTS_N_INSNS (2); if (rs6000_slow_unaligned_access (mode, MEM_ALIGN (x))) *total += COSTS_N_INSNS (100); return true; diff --git a/gcc/testsuite/gcc.target/powerpc/medium_offset.c b/gcc/testsuite/gcc.target/powerpc/medium_offset.c index f29eba08c38..4889e8fa8ec 100644 --- a/gcc/testsuite/gcc.target/powerpc/medium_offset.c +++ b/gcc/testsuite/gcc.target/powerpc/medium_offset.c @@ -1,7 +1,7 @@ /* { dg-do compile { target { powerpc*-*-* } } } */ /* { dg-require-effective-target lp64 } */ /* { dg-options "-O" } */ -/* { dg-final { scan-assembler-not "\\+4611686018427387904" } } */ +/* { dg-final { scan-assembler-times {\msldi|pld\M} 1 } } */ static int x; diff --git a/gcc/testsuite/gcc.target/powerpc/pr63281.c b/gcc/testsuite/gcc.target/powerpc/pr63281.c new file mode 100644 index 00000000000..469a8f64400 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr63281.c @@ -0,0 +1,11 @@ +/* PR target/63281 */ +/* { dg-do compile { target lp64 } } */ +/* { dg-options "-O2 -std=c99" } */ + +void +foo (unsigned long long *a) +{ + *a = 0x020805006106003; +} + +/* { dg-final { scan-assembler-times {\mp?ld\M} 1 } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/pr93012.c b/gcc/testsuite/gcc.target/powerpc/pr93012.c index 4f764d0576f..5afb4f79c45 100644 --- a/gcc/testsuite/gcc.target/powerpc/pr93012.c +++ b/gcc/testsuite/gcc.target/powerpc/pr93012.c @@ -10,4 +10,4 @@ unsigned long long mskh1() { return 0xffff9234ffff9234ULL; } unsigned long long mskl1() { return 0x2bcdffff2bcdffffULL; } unsigned long long mskse() { return 0xffff1234ffff1234ULL; } -/* { dg-final { scan-assembler-times {\mrldimi\M} 7 } } */ +/* { dg-final { scan-assembler-times {\mrldimi|ld|pld\M} 7 } } */