From patchwork Fri Feb 25 11:38:02 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Robin Dapp
X-Patchwork-Id: 51393
Message-ID: <3eed88a1-3a38-068e-fdff-87fbebabe049@linux.ibm.com>
Date: Fri, 25 Feb 2022 12:38:02 +0100
To: GCC Patches, Andreas Krebbel
Subject: [PATCH] s390: Change SET rtx_cost handling.
From: Robin Dapp

Hi,

the IF_THEN_ELSE detection currently prevents us from properly costing
register-register moves which causes the lower-subreg pass to assume
that a VR-VR move is as expensive as two GPR-GPR moves.

This patch adds handling for SETs containing REGs as well as MEMs and is
inspired by the aarch64 implementation.

Bootstrapped and regtested on z900 up to z15.

Is it OK?

Regards
 Robin

---

gcc/ChangeLog:

	* config/s390/s390.cc (s390_address_cost): Declare.
	(s390_hard_regno_nregs): Declare.
	(s390_rtx_costs): Add handling for REG and MEM in SET.

gcc/testsuite/ChangeLog:

	* gcc.target/s390/vector/vec-sum-across-no-lower-subreg-1.c: New test.
From 8c4c6f029dbf0c9db12c2877189a5ec0ce0a9c89 Mon Sep 17 00:00:00 2001
From: Robin Dapp
Date: Thu, 3 Feb 2022 12:50:04 +0100
Subject: [PATCH] s390: Change SET rtx_cost handling.

The IF_THEN_ELSE detection currently prevents us from properly costing
register-register moves which causes the lower-subreg pass to assume
that a VR-VR move is as expensive as two GPR-GPR moves.

This patch adds handling for SETs containing REGs as well as MEMs and is
inspired by the aarch64 implementation.

gcc/ChangeLog:

	* config/s390/s390.cc (s390_address_cost): Declare.
	(s390_hard_regno_nregs): Declare.
	(s390_rtx_costs): Add handling for REG and MEM in SET.

gcc/testsuite/ChangeLog:

	* gcc.target/s390/vector/vec-sum-across-no-lower-subreg-1.c: New test.
---
 gcc/config/s390/s390.cc                       | 140 +++++++++++++-----
 .../vector/vec-sum-across-no-lower-subreg-1.c |  17 +++
 2 files changed, 118 insertions(+), 39 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-sum-across-no-lower-subreg-1.c

diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
index d2af6d8813d..e647c90ab29 100644
--- a/gcc/config/s390/s390.cc
+++ b/gcc/config/s390/s390.cc
@@ -429,6 +429,14 @@ struct s390_address
    bytes on a z10 (or higher) CPU.  */
 #define PREDICT_DISTANCE (TARGET_Z10 ? 384 : 2048)
 
+static int
+s390_address_cost (rtx addr, machine_mode mode ATTRIBUTE_UNUSED,
+                   addr_space_t as ATTRIBUTE_UNUSED,
+                   bool speed ATTRIBUTE_UNUSED);
+
+static unsigned int
+s390_hard_regno_nregs (unsigned int regno, machine_mode mode);
+
 /* Masks per jump target register indicating which thunk need to be
    generated.  */
 static GTY(()) int indirect_branch_prez10thunk_mask = 0;
@@ -3619,50 +3627,104 @@ s390_rtx_costs (rtx x, machine_mode mode, int outer_code,
     case MEM:
       *total = 0;
       return true;
-
     case SET:
-      {
-        /* Without this a conditional move instruction would be
-           accounted as 3 * COSTS_N_INSNS (set, if_then_else,
-           comparison operator).  That's a bit pessimistic.  */
+      {
+        rtx dest = SET_DEST (x);
+        rtx src = SET_SRC (x);
 
-        if (!TARGET_Z196 || GET_CODE (SET_SRC (x)) != IF_THEN_ELSE)
-          return false;
+        switch (GET_CODE (src))
+          {
+          case IF_THEN_ELSE:
+            {
+              /* Without this a conditional move instruction would be
+                 accounted as 3 * COSTS_N_INSNS (set, if_then_else,
+                 comparison operator).  That's a bit pessimistic.  */
+
+              if (!TARGET_Z196)
+                return false;
+
+              rtx cond = XEXP (src, 0);
+              if (!CC_REG_P (XEXP (cond, 0)) || !CONST_INT_P (XEXP (cond, 1)))
+                return false;
+
+              /* It is going to be a load/store on condition.  Make it
+                 slightly more expensive than a normal load.  */
+              *total = COSTS_N_INSNS (1) + 2;
+
+              rtx dst = SET_DEST (src);
+              rtx then = XEXP (src, 1);
+              rtx els = XEXP (src, 2);
+
+              /* It is a real IF-THEN-ELSE.  An additional move will be
+                 needed to implement that.  */
+              if (!TARGET_Z15
+                  && reload_completed
+                  && !rtx_equal_p (dst, then)
+                  && !rtx_equal_p (dst, els))
+                *total += COSTS_N_INSNS (1) / 2;
+
+              /* A minor penalty for constants we cannot directly handle.  */
+              if ((CONST_INT_P (then) || CONST_INT_P (els))
+                  && (!TARGET_Z13 || MEM_P (dst)
+                      || (CONST_INT_P (then) && !satisfies_constraint_K (then))
+                      || (CONST_INT_P (els) && !satisfies_constraint_K (els))))
+                *total += COSTS_N_INSNS (1) / 2;
+
+              /* A store on condition can only handle register src operands.  */
+              if (MEM_P (dst) && (!REG_P (then) || !REG_P (els)))
+                *total += COSTS_N_INSNS (1) / 2;
+
+              return true;
+            }
+          default:
+            break;
+          }
 
-        rtx cond = XEXP (SET_SRC (x), 0);
+        switch (GET_CODE (dest))
+          {
+          case SUBREG:
+            if (!REG_P (SUBREG_REG (dest)))
+              *total += rtx_cost (SUBREG_REG (src), VOIDmode, SET, 0, speed);
+            /* fallthrough */
+          case REG:
+            /* If this is a VR -> VR copy, count the number of
+               registers.  */
+            if (VECTOR_MODE_P (GET_MODE (dest)) && REG_P (src))
+              {
+                int nregs = s390_hard_regno_nregs (VR0_REGNUM, GET_MODE
+                                                   (dest));
+                *total = COSTS_N_INSNS (nregs);
+              }
+            /* Same for GPRs.  */
+            else if (REG_P (src))
+              {
+                int nregs = s390_hard_regno_nregs (GPR0_REGNUM, GET_MODE
+                                                   (dest));
+                *total = COSTS_N_INSNS (nregs);
+              }
+            else
+              /* Otherwise just cost the src.  */
+              *total += rtx_cost (src, mode, SET, 1, speed);
+            return true;
+          case MEM:
+            {
+              rtx address = XEXP (dest, 0);
+              rtx tmp;
+              long tmp2;
+              if (s390_loadrelative_operand_p (address, &tmp, &tmp2))
+                *total = COSTS_N_INSNS (1);
+              else
+                *total = s390_address_cost (address, mode, 0, speed);
+              return true;
+            }
+          default:
+            /* Not handled for now, assume default costs.  */
+            *total = COSTS_N_INSNS (1);
+            return false;
+          }
 
-        if (!CC_REG_P (XEXP (cond, 0)) || !CONST_INT_P (XEXP (cond, 1)))
           return false;
-
-        /* It is going to be a load/store on condition.  Make it
-           slightly more expensive than a normal load.  */
-        *total = COSTS_N_INSNS (1) + 2;
-
-        rtx dst = SET_DEST (x);
-        rtx then = XEXP (SET_SRC (x), 1);
-        rtx els = XEXP (SET_SRC (x), 2);
-
-        /* It is a real IF-THEN-ELSE.  An additional move will be
-           needed to implement that.  */
-        if (!TARGET_Z15
-            && reload_completed
-            && !rtx_equal_p (dst, then)
-            && !rtx_equal_p (dst, els))
-          *total += COSTS_N_INSNS (1) / 2;
-
-        /* A minor penalty for constants we cannot directly handle.  */
-        if ((CONST_INT_P (then) || CONST_INT_P (els))
-            && (!TARGET_Z13 || MEM_P (dst)
-                || (CONST_INT_P (then) && !satisfies_constraint_K (then))
-                || (CONST_INT_P (els) && !satisfies_constraint_K (els))))
-          *total += COSTS_N_INSNS (1) / 2;
-
-        /* A store on condition can only handle register src operands.  */
-        if (MEM_P (dst) && (!REG_P (then) || !REG_P (els)))
-          *total += COSTS_N_INSNS (1) / 2;
-
-        return true;
-      }
+      }
     case IOR:
 
       /* nnrk, nngrk */
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-sum-across-no-lower-subreg-1.c b/gcc/testsuite/gcc.target/s390/vector/vec-sum-across-no-lower-subreg-1.c
new file mode 100644
index 00000000000..17870f397ed
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-sum-across-no-lower-subreg-1.c
@@ -0,0 +1,17 @@
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-O3 -mzarch -mzvector -march=z15 -fdump-rtl-subreg1" } */
+
+/* { dg-final { scan-rtl-dump-times "Skipping mode V2DI for copy lowering" 2 "subreg1" } } */
+
+#include
+
+#define STYPE long long
+#define VTYPE __attribute__((vector_size(16))) STYPE
+
+STYPE foo1 (VTYPE a)
+{
+  /* { dg-final { scan-assembler-not "vst\t.*" } } */
+  /* { dg-final { scan-assembler-not "lg\t.*" } } */
+  /* { dg-final { scan-assembler-not "lgr\t.*" } } */
+  return a[0] + a[1];
+}