From patchwork Wed Dec 21 06:27:36 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiufu Guo
X-Patchwork-Id: 62213
To: gcc-patches@gcc.gnu.org
Cc: segher@kernel.crashing.org, dje.gcc@gmail.com, linkw@gcc.gnu.org,
 guojiufu@linux.ibm.com, rguenther@suse.de, jeffreyalaw@gmail.com
Subject: [PATCH] loading float member of parameter stored via int registers
Date: Wed, 21 Dec 2022 14:27:36 +0800
Message-Id: <20221221062736.78036-1-guojiufu@linux.ibm.com>
X-Mailer: git-send-email 2.17.1
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Jiufu Guo via Gcc-patches
From: Jiufu Guo
Reply-To: Jiufu Guo

Hi,

This patch fixes an issue with accessing a parameter of struct type that
is passed through integer registers when one of its floating-point
members is accessed.  For example:

typedef struct DF {double a[4]; long l; } DF;
double foo_df (DF arg){return arg.a[3];}

On ppc64le, trunk GCC generates "std 6,-24(1) ; lfd 1,-24(1)" for this
code, while a single "mtvsrd 1, 6" instruction would be enough.

This patch changes how a floating-point member of such a parameter is
loaded: if the member was stored via an integer register, it is loaded
in integer mode first and then converted to the floating-point mode.

I also considered another approach: converting the register to the
floating-point mode before storing it to the stack.  However, some cases
may still prefer to keep the integer register store.

Bootstrap and regtest pass on ppc64{,le}.

Please review and comment.  Is this patch acceptable for trunk?

BR,
Jeff (Jiufu)

	PR target/108073

gcc/ChangeLog:

	* config/rs6000/rs6000.cc (TARGET_LOADING_INT_CONVERT_TO_FLOAT): New
	macro definition.
	(rs6000_loading_int_convert_to_float): New hook implementation.
	* doc/tm.texi: Regenerated.
	* doc/tm.texi.in (loading_int_convert_to_float): New hook.
	* expr.cc (expand_expr_real_1): Updated to use the new hook.
	* target.def (loading_int_convert_to_float): New hook.

gcc/testsuite/ChangeLog:

	* g++.target/powerpc/pr102024.C: Update.
	* gcc.target/powerpc/pr108073.c: New test.

---
 gcc/config/rs6000/rs6000.cc                 | 70 +++++++++++++++++++++
 gcc/doc/tm.texi                             |  6 ++
 gcc/doc/tm.texi.in                          |  2 +
 gcc/expr.cc                                 | 15 +++++
 gcc/target.def                              | 11 ++++
 gcc/testsuite/g++.target/powerpc/pr102024.C |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr108073.c | 24 +++++++
 7 files changed, 129 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr108073.c

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index b3a609f3aa3..af676eea276 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -1559,6 +1559,9 @@ static const struct attribute_spec rs6000_attribute_table[] =
 #undef TARGET_INVALID_ARG_FOR_UNPROTOTYPED_FN
 #define TARGET_INVALID_ARG_FOR_UNPROTOTYPED_FN invalid_arg_for_unprototyped_fn
 
+#undef TARGET_LOADING_INT_CONVERT_TO_FLOAT
+#define TARGET_LOADING_INT_CONVERT_TO_FLOAT rs6000_loading_int_convert_to_float
+
 #undef TARGET_MD_ASM_ADJUST
 #define TARGET_MD_ASM_ADJUST rs6000_md_asm_adjust
 
@@ -24018,6 +24021,73 @@ invalid_arg_for_unprototyped_fn (const_tree typelist, const_tree funcdecl, const
           : NULL;
 }
 
+/* Implement TARGET_LOADING_INT_CONVERT_TO_FLOAT.  */
+static rtx
+rs6000_loading_int_convert_to_float (machine_mode mode, rtx source, rtx base)
+{
+  rtx src_base = XEXP (source, 0);
+  poly_uint64 offset = MEM_OFFSET (source);
+
+  if (GET_CODE (src_base) == PLUS && CONSTANT_P (XEXP (src_base, 1)))
+    {
+      offset += INTVAL (XEXP (src_base, 1));
+      src_base = XEXP (src_base, 0);
+    }
+
+  if (!rtx_equal_p (XEXP (base, 0), src_base))
+    return NULL_RTX;
+
+  rtx temp_reg = gen_reg_rtx (word_mode);
+  rtx temp_mem = copy_rtx (source);
+  PUT_MODE (temp_mem, word_mode);
+
+  /* DI->DF */
+  if (word_mode == DImode && mode == DFmode)
+    {
+      if (multiple_p (offset, GET_MODE_SIZE (word_mode)))
+        {
+          emit_move_insn (temp_reg, temp_mem);
+          rtx float_subreg = simplify_gen_subreg (mode, temp_reg, word_mode, 0);
+          rtx target_reg = gen_reg_rtx (mode);
+          emit_move_insn (target_reg, float_subreg);
+          return target_reg;
+        }
+      return NULL_RTX;
+    }
+
+  /* Sub DI#->SF */
+  if (word_mode == DImode && mode == SFmode)
+    {
+      poly_uint64 byte_off = 0;
+      if (multiple_p (offset, GET_MODE_SIZE (word_mode)))
+        byte_off = 0;
+      else if (multiple_p (offset - GET_MODE_SIZE (mode),
+                           GET_MODE_SIZE (word_mode)))
+        byte_off = GET_MODE_SIZE (mode);
+      else
+        return NULL_RTX;
+
+      temp_mem = adjust_address (temp_mem, word_mode, -byte_off);
+      emit_move_insn (temp_reg, temp_mem);
+
+      /* little endian only?  */
+      poly_uint64 high_off = subreg_highpart_offset (SImode, word_mode);
+      if (known_eq (byte_off, high_off))
+        {
+          temp_reg = expand_shift (RSHIFT_EXPR, word_mode, temp_reg,
+                                   GET_MODE_PRECISION (SImode), temp_reg, 0);
+        }
+      rtx subreg_si = gen_reg_rtx (SImode);
+      emit_move_insn (subreg_si, gen_lowpart (SImode, temp_reg));
+      rtx float_subreg = simplify_gen_subreg (mode, subreg_si, SImode, 0);
+      rtx target_reg = gen_reg_rtx (mode);
+      emit_move_insn (target_reg, float_subreg);
+      return target_reg;
+    }
+
+  return NULL_RTX;
+}
+
 /* For TARGET_SECURE_PLT 32-bit PIC code we can save PIC register
    setup by using __stack_chk_fail_local hidden function instead of
    calling __stack_chk_fail directly.  Otherwise it is better to call
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 8fe49c2ba3d..10f94af553d 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -11933,6 +11933,12 @@ or when the back end is in a partially-initialized state.
 outside of any function scope.
 @end deftypefn
 
+@deftypefn {Target Hook} rtx TARGET_LOADING_INT_CONVERT_TO_FLOAT (machine_mode @var{mode}, rtx @var{source}, rtx @var{base})
+If it is profitable for the target, load an integer in @code{word_mode}
+from @var{source}, which is based on @var{base}, and convert it to a
+floating-point value in @var{mode}; otherwise return @code{NULL_RTX}.
+@end deftypefn
+
 @defmac TARGET_OBJECT_SUFFIX
 Define this macro to be a C string representing the suffix for object files
 on your target machine.  If you do not define this macro, GCC will
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 62c49ac46de..1ca6a671d86 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -7756,6 +7756,8 @@ to by @var{ce_info}.
 
 @hook TARGET_SET_CURRENT_FUNCTION
 
+@hook TARGET_LOADING_INT_CONVERT_TO_FLOAT
+
 @defmac TARGET_OBJECT_SUFFIX
 Define this macro to be a C string representing the suffix for object files
 on your target machine.  If you do not define this macro, GCC will
diff --git a/gcc/expr.cc b/gcc/expr.cc
index d9407432ea5..466079220e7 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -11812,6 +11812,21 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
             && modifier != EXPAND_WRITE)
           op0 = flip_storage_order (mode1, op0);
 
+        /* Access a float field of a struct parameter that is passed via
+           integer registers.  */
+        if (targetm.loading_int_convert_to_float && mode == mode1
+            && GET_MODE_CLASS (mode) == MODE_FLOAT
+            && TREE_CODE (tem) == PARM_DECL && DECL_INCOMING_RTL (tem)
+            && REG_P (DECL_INCOMING_RTL (tem))
+            && GET_MODE (DECL_INCOMING_RTL (tem)) == BLKmode && MEM_P (op0)
+            && MEM_OFFSET_KNOWN_P (op0))
+          {
+            rtx res = targetm.loading_int_convert_to_float (mode, op0,
+                                                            DECL_RTL (tem));
+            if (res)
+              op0 = res;
+          }
+
         if (mode == mode1 || mode1 == BLKmode || mode1 == tmode
             || modifier == EXPAND_CONST_ADDRESS
             || modifier == EXPAND_INITIALIZER)
diff --git a/gcc/target.def b/gcc/target.def
index 082a7c62f34..837ce902489 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -4491,6 +4491,17 @@ original and the returned modes should be @code{MODE_INT}.",
  (machine_mode mode),
  default_preferred_doloop_mode)
 
+
+/* Load an integer value from memory first, then convert (bitcast) it to a
+   floating-point value, if the target is able to support such behavior.  */
+DEFHOOK
+(loading_int_convert_to_float,
+"If it is profitable for the target, load an integer in @code{word_mode}\n\
+from @var{source}, which is based on @var{base}, and convert it to a\n\
+floating-point value in @var{mode}; otherwise return @code{NULL_RTX}.",
+ rtx, (machine_mode mode, rtx source, rtx base),
+ NULL)
+
 /* Returns true for a legitimate combined insn.  */
 DEFHOOK
 (legitimate_combined_insn,
diff --git a/gcc/testsuite/g++.target/powerpc/pr102024.C b/gcc/testsuite/g++.target/powerpc/pr102024.C
index 769585052b5..c8995cae707 100644
--- a/gcc/testsuite/g++.target/powerpc/pr102024.C
+++ b/gcc/testsuite/g++.target/powerpc/pr102024.C
@@ -5,7 +5,7 @@
 // Test that a zero-width bit field in an otherwise homogeneous aggregate
 // generates a psabi warning and passes arguments in GPRs.
 
-// { dg-final { scan-assembler-times {\mstd\M} 4 } }
+// { dg-final { scan-assembler-times {\mmtvsrd\M} 4 } }
 
 struct a_thing
 {
diff --git a/gcc/testsuite/gcc.target/powerpc/pr108073.c b/gcc/testsuite/gcc.target/powerpc/pr108073.c
new file mode 100644
index 00000000000..aa02de56405
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr108073.c
@@ -0,0 +1,24 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -save-temps" } */
+
+typedef struct DF {double a[4]; long l; } DF;
+typedef struct SF {float a[4];short l; } SF;
+
+/* Each of the functions below contains one mtvsrd.  */
+/* { dg-final { scan-assembler-times {\mmtvsrd\M} 3 {target { has_arch_ppc64 && has_arch_pwr8 } } } } */
+double __attribute__ ((noipa)) foo_df (DF arg){return arg.a[3];}
+float __attribute__ ((noipa)) foo_sf (SF arg){return arg.a[2];}
+float __attribute__ ((noipa)) foo_sf1 (SF arg){return arg.a[1];}
+
+double gd = 4.0;
+float gf1 = 3.0f, gf2 = 2.0f;
+DF gdf = {{1.0,2.0,3.0,4.0}, 1L};
+SF gsf = {{1.0f,2.0f,3.0f,4.0f}, 1L};
+
+int main()
+{
+  if (!(foo_df (gdf) == gd && foo_sf (gsf) == gf1 && foo_sf1 (gsf) == gf2))
+    __builtin_abort ();
+  return 0;
+}
+
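
For illustration only (not part of the patch): the value-level effect of
the DI->DF and DI->SF cases in the rs6000 hook can be sketched in portable
C.  The helper names load_df_from_gpr and load_sf_from_gpr are
hypothetical, and the sketch assumes that a 64-bit integer register image
holds either one double or two packed floats, with index 0 in the low 32
bits as on little endian.  It does not model RTL, only the bit
reinterpretation that replaces the store/reload pair.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret a 64-bit integer register image as a
   double, without changing any bits.  */
static double
load_df_from_gpr (uint64_t gpr)
{
  double d;
  memcpy (&d, &gpr, sizeof d);
  return d;
}

/* Hypothetical helper: extract one of two floats packed into a 64-bit
   integer register image.  Assumption: index 0 is the low 32 bits and
   index 1 is the high 32 bits, so the high half needs a right shift.  */
static float
load_sf_from_gpr (uint64_t gpr, int index)
{
  uint32_t bits = (uint32_t) (index ? gpr >> 32 : gpr);
  float f;
  memcpy (&f, &bits, sizeof f);
  return f;
}
```

The shift in load_sf_from_gpr plays the role of the expand_shift call in
the hook; which half needs the shift depends on endianness, which is the
open question noted in the hook's comment.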