From patchwork Fri Nov 11 08:25:32 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiufu Guo
X-Patchwork-Id: 60391
To: gcc-patches@gcc.gnu.org
Cc: segher@kernel.crashing.org, dje.gcc@gmail.com, linkw@gcc.gnu.org, guojiufu@linux.ibm.com, rguenther@suse.de, jeffreyalaw@gmail.com
Subject: [PATCH] Using sub-scalars mode to move struct block
Date: Fri, 11 Nov 2022 16:25:32 +0800
Message-Id: <20221111082532.24898-1-guojiufu@linux.ibm.com>
X-Mailer: git-send-email 2.17.1
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Jiufu Guo via Gcc-patches
From: Jiufu Guo
Reply-To: Jiufu Guo

Hi,

When a struct parameter is assigned to another variable, or a memory block
is loaded into a struct variable (especially for a return value), the
assignment is currently expanded as a "block move".  The "block move" may
use a type/mode different from the mode used to access the variable; e.g.
on ppc64le, V2DI is used to move a 16-byte block.  This mismatch then
prevents optimization passes from working across the assignment.
PR65421 reflects this issue.

Take the example code from PR65421:

  typedef struct { double a[4]; } A;
  A foo (const A *a) { return *a; }

On ppc64le, the following instructions are used for the "block move":

    7: r122:V2DI=[r121:DI]
    8: r124:V2DI=[r121:DI+r123:DI]
    9: [r112:DI]=r122:V2DI
   10: [r112:DI+0x10]=r124:V2DI

A few comments/suggestions on this issue were raised in the RFC thread:
https://gcc.gnu.org/pipermail/gcc-patches/2022-October/604646.html

I drafted a patch that updates the behavior of block_move for struct types.
This patch is deliberately simple: a few ideas from the RFC comments are not
included yet, and I would like to submit this part first.

The idea is to use scalar sub-modes for the "block move".  The sub-modes
align with the access patterns of the struct members and with how the struct
is used as a parameter or return value.  The major benefit of this change is
that it exposes more opportunities to other optimization passes
(cse/dse/xprop).

The suitable mode is target-specific and depends on the ABI, so this patch
introduces a target hook and implements it for rs6000.

In this patch, the hook simply picks a mode heuristically for every struct
block move.  It does not check whether the "block move" involves a
parameter, a return value, or some other use.

The rs6000 implementation of the hook can use DF/DI/TD/.. modes for the
struct block move.  The sub-mode is the same mode that is used when the
struct type is passed as a parameter or returned as a value.

Bootstrapped and regtested on ppc64/ppc64le.
Is this ok for trunk?

BR,
Jeff(Jiufu)

gcc/ChangeLog:

        * config/rs6000/rs6000.cc (TARGET_BLOCK_MOVE_FOR_STRUCT): Define.
        (submode_for_struct_block_move): New function.  Called from
        rs600_block_move_for_struct.
        (rs600_block_move_for_struct): New function.
        * doc/tm.texi: Regenerate.
        * doc/tm.texi.in (TARGET_BLOCK_MOVE_FOR_STRUCT): New.
        * expr.cc (store_expr): Call block_move_for_struct.
        * target.def (block_move_for_struct): New hook.
        * targhooks.cc (default_block_move_for_struct): New function.
        * targhooks.h (default_block_move_for_struct): New prototype.
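
For anyone who wants to reproduce the PR65421 case outside the compiler, the
example can be completed into a small standalone C file.  This is only a
sketch: foo_by_element and check_same_result are hypothetical helpers that
are not part of the patch, and the file says nothing about which
instructions a particular GCC emits.  It only pins down that the
element-wise copy (the DF-sized accesses this patch aims for) and the
whole-struct copy must produce the same bytes.

```c
#include <assert.h>
#include <string.h>

/* The struct from PR65421: an array of four doubles, no padding.  */
typedef struct { double a[4]; } A;

/* Returning *a by value is the assignment that gets expanded as a
   "block move".  */
A
foo (const A *a)
{
  return *a;
}

/* Hypothetical helper, not from the patch: copy one double at a time,
   i.e. the access pattern of the struct members.  */
A
foo_by_element (const A *a)
{
  A r;
  for (int i = 0; i < 4; i++)
    r.a[i] = a->a[i];
  return r;
}

/* Hypothetical helper: both forms must give identical bytes; only the
   generated code may differ.  */
void
check_same_result (const A *a)
{
  A x = foo (a);
  A y = foo_by_element (a);
  assert (memcmp (&x, &y, sizeof (A)) == 0);
}
```

Compiling both functions with -O2 -S on ppc64le, before and after the patch,
is one way to compare the code generated for the two forms.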
---
 gcc/config/rs6000/rs6000.cc | 44 +++++++++++++++++++++++++++++++++++++
 gcc/doc/tm.texi             |  6 +++++
 gcc/doc/tm.texi.in          |  2 ++
 gcc/expr.cc                 | 14 +++++++++---
 gcc/target.def              | 10 +++++++++
 gcc/targhooks.cc            |  7 ++++++
 gcc/targhooks.h             |  1 +
 7 files changed, 81 insertions(+), 3 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index a85d7630b41..e14cecba0ef 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -1758,6 +1758,9 @@ static const struct attribute_spec rs6000_attribute_table[] =
 #undef TARGET_NEED_IPA_FN_TARGET_INFO
 #define TARGET_NEED_IPA_FN_TARGET_INFO rs6000_need_ipa_fn_target_info
 
+#undef TARGET_BLOCK_MOVE_FOR_STRUCT
+#define TARGET_BLOCK_MOVE_FOR_STRUCT rs600_block_move_for_struct
+
 #undef TARGET_UPDATE_IPA_FN_TARGET_INFO
 #define TARGET_UPDATE_IPA_FN_TARGET_INFO rs6000_update_ipa_fn_target_info
 
@@ -23672,6 +23675,47 @@ rs6000_function_value (const_tree valtype,
   return gen_rtx_REG (mode, regno);
 }
 
+/* Subroutine of rs600_block_move_for_struct, to get the internal mode which
+   would be used to move the struct.  */
+static machine_mode
+submode_for_struct_block_move (tree type)
+{
+  gcc_assert (TREE_CODE (type) == RECORD_TYPE);
+
+  /* The sub mode may not be the field's type of the struct.
+     It would be fine to use the mode as if the type is used as a function
+     parameter or return value.  For example: DF for "{double a[4];}", and
+     DI for "{double a[3]; long l;}".
+     Here, using the mode as if it is function return type.  */
+  rtx val = rs6000_function_value (type, NULL, 0);
+  return (GET_CODE (val) == PARALLEL) ? GET_MODE (XEXP (XVECEXP (val, 0, 0), 0))
+                                      : word_mode;
+}
+
+/* Implement the TARGET_BLOCK_MOVE_FOR_STRUCT hook.  */
+static void
+rs600_block_move_for_struct (rtx x, rtx y, tree exp, HOST_WIDE_INT method)
+{
+  machine_mode mode = submode_for_struct_block_move (TREE_TYPE (exp));
+  int mode_size = GET_MODE_SIZE (mode);
+  int size = UINTVAL (expr_size (exp));
+  if (size < mode_size || (size % mode_size) != 0 || size > 64)
+    {
+      default_block_move_for_struct (x, y, exp, method);
+      return;
+    }
+
+  int len = size / mode_size;
+  for (int i = 0; i < len; i++)
+    {
+      rtx temp = gen_reg_rtx (mode);
+      rtx src = adjust_address (y, mode, mode_size * i);
+      rtx dest = adjust_address (x, mode, mode_size * i);
+      emit_move_insn (temp, src);
+      emit_move_insn (dest, temp);
+    }
+}
+
 /* Define how to find the value returned by a library function
    assuming the value has mode MODE.  */
 rtx
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 8572313b308..c8a1c1f30cf 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -1380,6 +1380,12 @@ retain the field's mode.
 Normally, this is not needed.
 @end deftypefn
 
+@deftypefn {Target Hook} void TARGET_BLOCK_MOVE_FOR_STRUCT (rtx @var{x}, rtx @var{y}, tree @var{exp}, HOST_WIDE_INT @var{method})
+Move from @var{y} to @var{x}, where @var{y} is as @var{exp} in structure
+type. @var{method} is the method if this function invokes block_move.
+The default definition invokes block_move.
+@end deftypefn
+
 @defmac ROUND_TYPE_ALIGN (@var{type}, @var{computed}, @var{specified})
 Define this macro as an expression for the alignment of a type (given
 by @var{type} as a tree node) if the alignment computed in the usual
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 986e8f0da09..f0f525a2008 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -1226,6 +1226,8 @@ to aligning a bit-field within the structure.
 
 @hook TARGET_MEMBER_TYPE_FORCES_BLK
 
+@hook TARGET_BLOCK_MOVE_FOR_STRUCT
+
 @defmac ROUND_TYPE_ALIGN (@var{type}, @var{computed}, @var{specified})
 Define this macro as an expression for the alignment of a type (given
 by @var{type} as a tree node) if the alignment computed in the usual
diff --git a/gcc/expr.cc b/gcc/expr.cc
index c6917fbf7bd..734dc07a76b 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -6504,9 +6504,17 @@ store_expr (tree exp, rtx target, int call_param_p,
         emit_group_store (target, temp, TREE_TYPE (exp),
                           int_size_in_bytes (TREE_TYPE (exp)));
       else if (GET_MODE (temp) == BLKmode)
-        emit_block_move (target, temp, expr_size (exp),
-                         (call_param_p
-                          ? BLOCK_OP_CALL_PARM : BLOCK_OP_NORMAL));
+        {
+          if (TREE_CODE (TREE_TYPE (exp)) == RECORD_TYPE)
+            targetm.block_move_for_struct (target, temp, exp,
+                                           call_param_p ? BLOCK_OP_CALL_PARM
+                                                        : BLOCK_OP_NORMAL);
+          else
+            emit_block_move (target, temp, expr_size (exp),
+                             (call_param_p ? BLOCK_OP_CALL_PARM
+                                           : BLOCK_OP_NORMAL));
+        }
+
       /* If we emit a nontemporal store, there is nothing else to do.  */
       else if (nontemporal && emit_storent_insn (target, temp))
         ;
diff --git a/gcc/target.def b/gcc/target.def
index 25f94c19fa7..e141f72a8a3 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -5584,6 +5584,16 @@ Normally, this is not needed.",
  bool, (const_tree field, machine_mode mode),
  default_member_type_forces_blk)
 
+
+/* Move block for structure type.  */
+DEFHOOK
+(block_move_for_struct,
+ "Move from @var{y} to @var{x}, where @var{y} is as @var{exp} in structure\n\
+type. @var{method} is the method if this function invokes block_move.\n\
+The default definition invokes block_move.",
+ void, (rtx x, rtx y, tree exp, HOST_WIDE_INT method),
+ default_block_move_for_struct)
+
 /* See tree-ssa-math-opts.cc:divmod_candidate_p for conditions
    that gate the divod transform.  */
 DEFHOOK
diff --git a/gcc/targhooks.cc b/gcc/targhooks.cc
index 12a58456b39..2b96c348419 100644
--- a/gcc/targhooks.cc
+++ b/gcc/targhooks.cc
@@ -2330,6 +2330,13 @@ default_member_type_forces_blk (const_tree, machine_mode)
   return false;
 }
 
+/* Default version of block_move_for_struct.  */
+void
+default_block_move_for_struct (rtx x, rtx y, tree exp, HOST_WIDE_INT method)
+{
+  emit_block_move (x, y, expr_size (exp), (enum block_op_methods)method);
+}
+
 /* Default version of canonicalize_comparison.  */
 
 void
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index a6a423c1abb..c284a35ee28 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -264,6 +264,7 @@ extern void default_asm_output_ident_directive (const char*);
 
 extern scalar_int_mode default_cstore_mode (enum insn_code);
 extern bool default_member_type_forces_blk (const_tree, machine_mode);
+extern void default_block_move_for_struct (rtx, rtx, tree, HOST_WIDE_INT);
 extern void default_atomic_assign_expand_fenv (tree *, tree *, tree *);
 extern tree build_va_arg_indirect_ref (tree);
 extern tree std_gimplify_va_arg_expr (tree, tree, gimple_seq *, gimple_seq *);
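
The size guard and the copy loop in rs600_block_move_for_struct can be
modelled in plain C, outside of GCC.  The sketch below is a model under
stated assumptions, not GCC code: submode_move_count and copy_in_chunks are
hypothetical names, bytes stand in for RTL, and mode_size stands in for
GET_MODE_SIZE (mode).  It only shows which sizes take the sub-mode path and
which ones fall back to the default block move.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Model of the guard in the rs6000 hook: return the number of
   load/store pairs, or 0 when the hook would fall back to the default
   block move.  Hypothetical name, not part of the patch.  */
size_t
submode_move_count (size_t size, size_t mode_size)
{
  assert (mode_size > 0);
  if (size < mode_size || size % mode_size != 0 || size > 64)
    return 0;
  return size / mode_size;
}

/* Model of the move loop: each chunk goes through its own temporary,
   like the gen_reg_rtx/emit_move_insn pair in the hook.  memcpy of the
   whole block stands in for the default block move.  */
void
copy_in_chunks (void *dst, const void *src, size_t size, size_t mode_size)
{
  size_t len = submode_move_count (size, mode_size);
  if (len == 0)
    {
      memcpy (dst, src, size);
      return;
    }

  for (size_t i = 0; i < len; i++)
    {
      unsigned char temp[64];  /* Large enough: mode_size <= size <= 64.  */
      memcpy (temp, (const unsigned char *) src + mode_size * i, mode_size);
      memcpy ((unsigned char *) dst + mode_size * i, temp, mode_size);
    }
}
```

For the 32-byte struct from PR65421 with an 8-byte mode,
submode_move_count (32, 8) is 4, matching four load/store pairs; a 20-byte
struct gives 0 and takes the default path.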