Message ID | 20220113171140.43735-1-krebbel@linux.ibm.com |
---|---|
State | New |
Headers |
Return-Path: <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org> X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 734BF385AC26 for <patchwork@sourceware.org>; Thu, 13 Jan 2022 17:12:14 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 734BF385AC26 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1642093934; bh=FmXch0k3ijK40n01NQk0TC+LE7ihlGbEk7h9ut6KDxU=; h=To:Subject:Date:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:From; b=bPfR/F8yskUr0HvRHoK6vzxLS0HKwvTB7yO/nKEXRLmVAzShXWE9iHfykt3yuR7Aw F9J48yscBRj2k9hAqmZLNs44OIgzgeASdHRrAUHsB9AldcGDqrsyE2IIgRqwvzWrPy wTU0f/l5cQoMVb1RT8PuBEC0K+7Pa6Nc8GJwq/N8= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by sourceware.org (Postfix) with ESMTPS id 373AC385840D for <gcc-patches@gcc.gnu.org>; Thu, 13 Jan 2022 17:11:45 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 373AC385840D Received: from pps.filterd (m0098417.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 20DGvUHn031744 for <gcc-patches@gcc.gnu.org>; Thu, 13 Jan 2022 17:11:44 GMT Received: from ppma06ams.nl.ibm.com (66.31.33a9.ip4.static.sl-reverse.com [169.51.49.102]) by mx0a-001b2d01.pphosted.com with ESMTP id 3djr2e0b3n-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT) for <gcc-patches@gcc.gnu.org>; Thu, 13 Jan 2022 17:11:44 +0000 Received: from pps.filterd (ppma06ams.nl.ibm.com [127.0.0.1]) by ppma06ams.nl.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 20DH8Oex023195 for <gcc-patches@gcc.gnu.org>; Thu, 13 Jan 2022 17:11:43 GMT Received: from b06cxnps3075.portsmouth.uk.ibm.com (d06relay10.portsmouth.uk.ibm.com [9.149.109.195]) by ppma06ams.nl.ibm.com with ESMTP id 3df1vjq18k-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT) for <gcc-patches@gcc.gnu.org>; Thu, 13 Jan 2022 17:11:42 +0000 Received: from d06av26.portsmouth.uk.ibm.com (d06av26.portsmouth.uk.ibm.com [9.149.105.62]) by b06cxnps3075.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 20DHBe3V7274962 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for <gcc-patches@gcc.gnu.org>; Thu, 13 Jan 2022 17:11:40 GMT Received: from d06av26.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id ACEAEAE053 for <gcc-patches@gcc.gnu.org>; Thu, 13 Jan 2022 17:11:40 +0000 (GMT) Received: from d06av26.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 91BEAAE056 for <gcc-patches@gcc.gnu.org>; Thu, 13 Jan 2022 17:11:40 +0000 (GMT) Received: from li-23497a81-5215-11cb-9bae-a81330ecc14b.ibm.com.com (unknown [9.145.14.100]) by d06av26.portsmouth.uk.ibm.com (Postfix) with ESMTP for <gcc-patches@gcc.gnu.org>; Thu, 13 Jan 2022 17:11:40 +0000 (GMT) To: gcc-patches@gcc.gnu.org Subject: [PATCH] cprop_hardreg: Workaround for narrow mode != lowpart targets Date: Thu, 13 Jan 2022 18:11:40 +0100 Message-Id: <20220113171140.43735-1-krebbel@linux.ibm.com> X-Mailer: git-send-email 2.33.1 X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: q5AU0sC-Qrt2ifjK6noTw2_oZ4wQ01Ia X-Proofpoint-GUID: q5AU0sC-Qrt2ifjK6noTw2_oZ4wQ01Ia Content-Transfer-Encoding: 8bit X-Proofpoint-UnRewURL: 0 URL was un-rewritten MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.816,Hydra:6.0.425,FMLib:17.11.62.513 definitions=2022-01-13_08,2022-01-13_01,2021-12-02_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 impostorscore=0 lowpriorityscore=0 phishscore=0 mlxlogscore=999 malwarescore=0 priorityscore=1501 bulkscore=0 suspectscore=0 mlxscore=0 spamscore=0 adultscore=0 clxscore=1015 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2110150000 definitions=main-2201130106 X-Spam-Status: No, score=-11.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org> List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe> List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/> List-Post: <mailto:gcc-patches@gcc.gnu.org> List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help> List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe> From: Andreas Krebbel via Gcc-patches <gcc-patches@gcc.gnu.org> Reply-To: Andreas Krebbel <krebbel@linux.ibm.com> Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org> |
Series |
cprop_hardreg: Workaround for narrow mode != lowpart targets
|
|
Commit Message
Andreas Krebbel
Jan. 13, 2022, 5:11 p.m. UTC
The cprop_hardreg pass is built around the assumption that accessing a register in a narrower mode is the same as accessing the lowpart of the register. This unfortunately is not true for vector registers on IBM Z. This caused a miscompile of LLVM with GCC 8.5. The problem could not be reproduced with upstream GCC unfortunately but we have to assume that it is latent there. The right fix would require substantial changes to the cprop pass and is certainly something we would want for our platform. But since this would not be acceptable for older GCCs I'll go with what Vladimir proposed in the RedHat BZ and introduce a hopefully temporary and undocumented target hook to disable that specific transformation in regcprop.c. Here the RedHat BZ for reference: https://bugzilla.redhat.com/show_bug.cgi?id=2028609 Bootstrapped and regression-tested on s390x. Ok? gcc/ChangeLog: * target.def (narrow_mode_refers_low_part_p): Add new target hook. * config/s390/s390.c (s390_narrow_mode_refers_low_part_p): Implement new target hook for IBM Z. (TARGET_NARROW_MODE_REFERS_LOW_PART_P): New macro. * regcprop.c (maybe_mode_change): Disable transformation depending on the new target hook. --- gcc/config/s390/s390.c | 14 ++++++++++++++ gcc/regcprop.c | 3 ++- gcc/target.def | 12 +++++++++++- 3 files changed, 27 insertions(+), 2 deletions(-)
Comments
On 1/13/22 18:11, Andreas Krebbel via Gcc-patches wrote: ... > @@ -5949,7 +5959,7 @@ register if floating point arithmetic is not being done. As long as the\n\ > floating registers are not in class @code{GENERAL_REGS}, they will not\n\ > be used unless some pattern's constraint asks for one.", > bool, (unsigned int regno, machine_mode mode), > - hook_bool_uint_mode_true) > + hook_bool_uint_true) > > DEFHOOK > (modes_tieable_p, That hunk was a copy and paste bug and does not belong to the patch. Andreas
On Thu, Jan 13, 2022 at 6:12 PM Andreas Krebbel via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > The cprop_hardreg pass is built around the assumption that accessing a > register in a narrower mode is the same as accessing the lowpart of > the register. This unfortunately is not true for vector registers on > IBM Z. This caused a miscompile of LLVM with GCC 8.5. The problem > could not be reproduced with upstream GCC unfortunately but we have to > assume that it is latent there. The right fix would require > substantial changes to the cprop pass and is certainly something we > would want for our platform. But since this would not be acceptable > for older GCCs I'll go with what Vladimir proposed in the RedHat BZ > and introduce a hopefully temporary and undocumented target hook to > disable that specific transformation in regcprop.c. > > Here the RedHat BZ for reference: > https://bugzilla.redhat.com/show_bug.cgi?id=2028609 Can the gist of this bug be put into the GCC bugzilla so the rev can refer to it? Can we have a testcase even? I'm not quite understanding the problem but is it that, say, (subreg:DI (reg:V2DI ..) 0) isn't the same as (lowpart:DI (reg:V2DI ...) 0) ? The regcprop code looks more like asking whether the larger reg is a composition of multiple other hardregs and will return the specific hardreg corresponding to the lowpart - so like if on s390 the vector registers overlap with some other regset. But then doing the actual accesses via the other regset regs doesn't actually work? Isn't the backend then lying to us (aka the mode_change_ok returns the wrong answer)? How does the stage1 fix, aka "rewrite" of cprop, look like? How can we be sure this hack isn't still present in 10 years from now? Thanks, Richard. > Bootstrapped and regression-tested on s390x. > > Ok? > > gcc/ChangeLog: > > * target.def (narrow_mode_refers_low_part_p): Add new target hook. > * config/s390/s390.c (s390_narrow_mode_refers_low_part_p): > Implement new target hook for IBM Z. > (TARGET_NARROW_MODE_REFERS_LOW_PART_P): New macro. > * regcprop.c (maybe_mode_change): Disable transformation depending > on the new target hook. > --- > gcc/config/s390/s390.c | 14 ++++++++++++++ > gcc/regcprop.c | 3 ++- > gcc/target.def | 12 +++++++++++- > 3 files changed, 27 insertions(+), 2 deletions(-) > > diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c > index 056002e4a4a..aafc6d63be6 100644 > --- a/gcc/config/s390/s390.c > +++ b/gcc/config/s390/s390.c > @@ -10488,6 +10488,18 @@ s390_hard_regno_mode_ok (unsigned int regno, machine_mode mode) > return false; > } > > +/* Implement TARGET_NARROW_MODE_REFERS_LOW_PART_P. */ > + > +static bool > +s390_narrow_mode_refers_low_part_p (unsigned int regno) > +{ > + if (reg_classes_intersect_p (VEC_REGS, REGNO_REG_CLASS (regno))) > + return false; > + > + return true; > +} > + > + > /* Implement TARGET_MODES_TIEABLE_P. */ > > static bool > @@ -17472,6 +17484,8 @@ s390_vectorize_vec_perm_const (machine_mode vmode, rtx target, rtx op0, rtx op1, > #undef TARGET_VECTORIZE_VEC_PERM_CONST > #define TARGET_VECTORIZE_VEC_PERM_CONST s390_vectorize_vec_perm_const > > +#undef TARGET_NARROW_MODE_REFERS_LOW_PART_P > +#define TARGET_NARROW_MODE_REFERS_LOW_PART_P s390_narrow_mode_refers_low_part_p > > struct gcc_target targetm = TARGET_INITIALIZER; > > diff --git a/gcc/regcprop.c b/gcc/regcprop.c > index 1a9bcf0a1ad..aaf94ad9b51 100644 > --- a/gcc/regcprop.c > +++ b/gcc/regcprop.c > @@ -426,7 +426,8 @@ maybe_mode_change (machine_mode orig_mode, machine_mode copy_mode, > > if (orig_mode == new_mode) > return gen_raw_REG (new_mode, regno); > - else if (mode_change_ok (orig_mode, new_mode, regno)) > + else if (mode_change_ok (orig_mode, new_mode, regno) > + && targetm.narrow_mode_refers_low_part_p (regno)) > { > int copy_nregs = hard_regno_nregs (copy_regno, copy_mode); > int use_nregs = hard_regno_nregs (copy_regno, new_mode); > diff --git a/gcc/target.def b/gcc/target.def > index 8fd2533e90a..598eea501ff 100644 > --- a/gcc/target.def > +++ b/gcc/target.def > @@ -5446,6 +5446,16 @@ value that the middle-end intended.", > bool, (machine_mode from, machine_mode to, reg_class_t rclass), > hook_bool_mode_mode_reg_class_t_true) > > +/* This hook is used to work around a problem in regcprop. Hardcoded > +assumptions currently prevent it from working correctly for targets > +where the low part of a multi-word register doesn't align to accessing > +the register with a narrower mode. */ > +DEFHOOK_UNDOC > +(narrow_mode_refers_low_part_p, > +"", > +bool, (unsigned int regno), > +hook_bool_unit_true) > + > /* Change pseudo allocno class calculated by IRA. */ > DEFHOOK > (ira_change_pseudo_allocno_class, > @@ -5949,7 +5959,7 @@ register if floating point arithmetic is not being done. As long as the\n\ > floating registers are not in class @code{GENERAL_REGS}, they will not\n\ > be used unless some pattern's constraint asks for one.", > bool, (unsigned int regno, machine_mode mode), > - hook_bool_uint_mode_true) > + hook_bool_uint_true) > > DEFHOOK > (modes_tieable_p, > -- > 2.33.1 >
On 1/14/22 08:37, Richard Biener wrote: ... > Can the gist of this bug be put into the GCC bugzilla so the rev can > refer to it? https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104034 > Can we have a testcase even? The testcase from Jakub is in the BZ. However, since it doesn't fail with head I didn't try to include it in my patch. > I'm not quite understanding the problem but is it that, say, > > (subreg:DI (reg:V2DI ..) 0) > > isn't the same as > > (lowpart:DI (reg:V2DI ...) 0) (reg:DI v0) does not match the lower order bits of (reg:TI v0) > ? The regcprop code looks more like asking whether the larger reg > is a composition of multiple other hardregs and will return the specific > hardreg corresponding to the lowpart - so like if on s390 the vector > registers overlap with some other regset. But then doing the actual > accesses via the other regset regs doesn't actually work? Isn't the > backend then lying to us (aka the mode_change_ok returns the > wrong answer)? can_change_mode_class should do the right thing. We return false in case somebody wants to change TI to DI for a vector register. However, the hook never gets called like this from regcprop. regcprop only asks whether it is ok to change (reg:TI r8) to (reg:DI r8) and that's indeed ok. Before cprop we have: (insn 175 176 174 3 (set (reg/v:TI 16 %f0 [orig:69 __comp ] [69]) (reg:TI 8 %r8)) -1 (nil)) (insn 155 124 156 3 (set (reg:DI 6 %r6 [ __comp ]) (reg:DI 16 %f0)) 1277 {*movdi_64} (nil)) (insn 156 155 128 3 (set (reg:DI 7 %r7 [orig:69 __comp+8 ] [69]) (unspec:DI [ (reg:V2DI 16 %f0) (const_int 1 [0x1]) ] UNSPEC_VEC_EXTRACT)) 409 {*vec_extractv2di} (expr_list:REG_DEAD (reg:V2DI 16 %f0) (nil))) So a copy of reg pair r8/r9 is kept in v0==f0. The problem comes from cprop assuming that (reg:DI f0) refers to the low part of f0 and as a consequence replaces (reg:DI 16 %f0) with (reg:DI 9 %r9) what would be the DImode lowpart of (reg:TI r8) Insn 155 and 156 are the result of applying the following splitter: ; Split a VR -> GPR TImode move into 2 vector load GR from VR element. ; For the higher order bits we do simply a DImode move while the ; second part is done via vec extract. Both will end up as vlgvg. (define_split [(set (match_operand:TI 0 "register_operand" "") (match_operand:TI 1 "register_operand" ""))] "TARGET_VX && reload_completed && GENERAL_REG_P (operands[0]) && VECTOR_REG_P (operands[1])" [(set (match_dup 2) (match_dup 4)) (set (match_dup 3) (unspec:DI [(match_dup 5) (const_int 1)] UNSPEC_VEC_EXTRACT))] { operands[2] = operand_subword (operands[0], 0, 0, TImode); operands[3] = operand_subword (operands[0], 1, 0, TImode); operands[4] = gen_rtx_REG (DImode, REGNO (operands[1])); operands[5] = gen_rtx_REG (V2DImode, REGNO (operands[1])); }) Introducing the (reg:DI 16 %f0) access to the TImode VR is something the middle end is not expected to do - because we prevent it in can_change_mode_class. However, I don't see anything wrong with doing that in the splitter. In our backend this is well-defined as being the first element in the vector register - the high part of the TImode vector register value. Unfortunately it confuses cprop :( Andreas > > How does the stage1 fix, aka "rewrite" of cprop, look like? How can we > be sure this hack isn't still present in 10 years from now? > > Thanks, > Richard. > >> Bootstrapped and regression-tested on s390x. >> >> Ok? >> >> gcc/ChangeLog: >> >> * target.def (narrow_mode_refers_low_part_p): Add new target hook. >> * config/s390/s390.c (s390_narrow_mode_refers_low_part_p): >> Implement new target hook for IBM Z. >> (TARGET_NARROW_MODE_REFERS_LOW_PART_P): New macro. >> * regcprop.c (maybe_mode_change): Disable transformation depending >> on the new target hook. >> --- >> gcc/config/s390/s390.c | 14 ++++++++++++++ >> gcc/regcprop.c | 3 ++- >> gcc/target.def | 12 +++++++++++- >> 3 files changed, 27 insertions(+), 2 deletions(-) >> >> diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c >> index 056002e4a4a..aafc6d63be6 100644 >> --- a/gcc/config/s390/s390.c >> +++ b/gcc/config/s390/s390.c >> @@ -10488,6 +10488,18 @@ s390_hard_regno_mode_ok (unsigned int regno, machine_mode mode) >> return false; >> } >> >> +/* Implement TARGET_NARROW_MODE_REFERS_LOW_PART_P. */ >> + >> +static bool >> +s390_narrow_mode_refers_low_part_p (unsigned int regno) >> +{ >> + if (reg_classes_intersect_p (VEC_REGS, REGNO_REG_CLASS (regno))) >> + return false; >> + >> + return true; >> +} >> + >> + >> /* Implement TARGET_MODES_TIEABLE_P. */ >> >> static bool >> @@ -17472,6 +17484,8 @@ s390_vectorize_vec_perm_const (machine_mode vmode, rtx target, rtx op0, rtx op1, >> #undef TARGET_VECTORIZE_VEC_PERM_CONST >> #define TARGET_VECTORIZE_VEC_PERM_CONST s390_vectorize_vec_perm_const >> >> +#undef TARGET_NARROW_MODE_REFERS_LOW_PART_P >> +#define TARGET_NARROW_MODE_REFERS_LOW_PART_P s390_narrow_mode_refers_low_part_p >> >> struct gcc_target targetm = TARGET_INITIALIZER; >> >> diff --git a/gcc/regcprop.c b/gcc/regcprop.c >> index 1a9bcf0a1ad..aaf94ad9b51 100644 >> --- a/gcc/regcprop.c >> +++ b/gcc/regcprop.c >> @@ -426,7 +426,8 @@ maybe_mode_change (machine_mode orig_mode, machine_mode copy_mode, >> >> if (orig_mode == new_mode) >> return gen_raw_REG (new_mode, regno); >> - else if (mode_change_ok (orig_mode, new_mode, regno)) >> + else if (mode_change_ok (orig_mode, new_mode, regno) >> + && targetm.narrow_mode_refers_low_part_p (regno)) >> { >> int copy_nregs = hard_regno_nregs (copy_regno, copy_mode); >> int use_nregs = hard_regno_nregs (copy_regno, new_mode); >> diff --git a/gcc/target.def b/gcc/target.def >> index 8fd2533e90a..598eea501ff 100644 >> --- a/gcc/target.def >> +++ b/gcc/target.def >> @@ -5446,6 +5446,16 @@ value that the middle-end intended.", >> bool, (machine_mode from, machine_mode to, reg_class_t rclass), >> hook_bool_mode_mode_reg_class_t_true) >> >> +/* This hook is used to work around a problem in regcprop. Hardcoded >> +assumptions currently prevent it from working correctly for targets >> +where the low part of a multi-word register doesn't align to accessing >> +the register with a narrower mode. */ >> +DEFHOOK_UNDOC >> +(narrow_mode_refers_low_part_p, >> +"", >> +bool, (unsigned int regno), >> +hook_bool_unit_true) >> + >> /* Change pseudo allocno class calculated by IRA. */ >> DEFHOOK >> (ira_change_pseudo_allocno_class, >> @@ -5949,7 +5959,7 @@ register if floating point arithmetic is not being done. As long as the\n\ >> floating registers are not in class @code{GENERAL_REGS}, they will not\n\ >> be used unless some pattern's constraint asks for one.", >> bool, (unsigned int regno, machine_mode mode), >> - hook_bool_uint_mode_true) >> + hook_bool_uint_true) >> >> DEFHOOK >> (modes_tieable_p, >> -- >> 2.33.1 >>
On Fri, Jan 14, 2022 at 11:42 AM Andreas Krebbel via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > On 1/14/22 08:37, Richard Biener wrote: > ... > > Can the gist of this bug be put into the GCC bugzilla so the rev can > > refer to it? > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104034 > > > Can we have a testcase even? > The testcase from Jakub is in the BZ. However, since it doesn't fail with head I didn't try to > include it in my patch. So I also noticed there was PR 101260 which had the exact same analysis as this patch with respect to s390 and TI mode splitting and regcprop. So I marked PR 104034 as a duplicate of PR 101260. Thanks, Andrew > > > I'm not quite understanding the problem but is it that, say, > > > > (subreg:DI (reg:V2DI ..) 0) > > > > isn't the same as > > > > (lowpart:DI (reg:V2DI ...) 0) > > (reg:DI v0) does not match the lower order bits of (reg:TI v0) > > > ? The regcprop code looks more like asking whether the larger reg > > is a composition of multiple other hardregs and will return the specific > > hardreg corresponding to the lowpart - so like if on s390 the vector > > registers overlap with some other regset. But then doing the actual > > accesses via the other regset regs doesn't actually work? Isn't the > > backend then lying to us (aka the mode_change_ok returns the > > wrong answer)? > > can_change_mode_class should do the right thing. We return false in case somebody wants to change TI > to DI for a vector register. However, the hook never gets called like this from regcprop. regcprop > only asks whether it is ok to change (reg:TI r8) to (reg:DI r8) and that's indeed ok. > > Before cprop we have: > > (insn 175 176 174 3 (set (reg/v:TI 16 %f0 [orig:69 __comp ] [69]) > (reg:TI 8 %r8)) -1 > (nil)) > > > (insn 155 124 156 3 (set (reg:DI 6 %r6 [ __comp ]) > (reg:DI 16 %f0)) 1277 {*movdi_64} > (nil)) > > > (insn 156 155 128 3 (set (reg:DI 7 %r7 [orig:69 __comp+8 ] [69]) > (unspec:DI [ > (reg:V2DI 16 %f0) > (const_int 1 [0x1]) > ] UNSPEC_VEC_EXTRACT)) 409 {*vec_extractv2di} > (expr_list:REG_DEAD (reg:V2DI 16 %f0) > (nil))) > > So a copy of reg pair r8/r9 is kept in v0==f0. The problem comes from cprop assuming that (reg:DI > f0) refers to the low part of f0 and as a consequence replaces (reg:DI 16 %f0) with (reg:DI 9 %r9) > what would be the DImode lowpart of (reg:TI r8) > > Insn 155 and 156 are the result of applying the following splitter: > > ; Split a VR -> GPR TImode move into 2 vector load GR from VR element. > ; For the higher order bits we do simply a DImode move while the > ; second part is done via vec extract. Both will end up as vlgvg. > (define_split > [(set (match_operand:TI 0 "register_operand" "") > (match_operand:TI 1 "register_operand" ""))] > "TARGET_VX && reload_completed > && GENERAL_REG_P (operands[0]) > && VECTOR_REG_P (operands[1])" > [(set (match_dup 2) (match_dup 4)) > (set (match_dup 3) (unspec:DI [(match_dup 5) (const_int 1)] > UNSPEC_VEC_EXTRACT))] > { > operands[2] = operand_subword (operands[0], 0, 0, TImode); > operands[3] = operand_subword (operands[0], 1, 0, TImode); > operands[4] = gen_rtx_REG (DImode, REGNO (operands[1])); > operands[5] = gen_rtx_REG (V2DImode, REGNO (operands[1])); > }) > > Introducing the (reg:DI 16 %f0) access to the TImode VR is something the middle end is not expected > to do - because we prevent it in can_change_mode_class. However, I don't see anything wrong with > doing that in the splitter. In our backend this is well-defined as being the first element in the > vector register - the high part of the TImode vector register value. > > Unfortunately it confuses cprop :( > > Andreas > > > > > > > How does the stage1 fix, aka "rewrite" of cprop, look like? How can we > > be sure this hack isn't still present in 10 years from now? > > > > Thanks, > > Richard. > > > >> Bootstrapped and regression-tested on s390x. > >> > >> Ok? > >> > >> gcc/ChangeLog: > >> > >> * target.def (narrow_mode_refers_low_part_p): Add new target hook. > >> * config/s390/s390.c (s390_narrow_mode_refers_low_part_p): > >> Implement new target hook for IBM Z. > >> (TARGET_NARROW_MODE_REFERS_LOW_PART_P): New macro. > >> * regcprop.c (maybe_mode_change): Disable transformation depending > >> on the new target hook. > >> --- > >> gcc/config/s390/s390.c | 14 ++++++++++++++ > >> gcc/regcprop.c | 3 ++- > >> gcc/target.def | 12 +++++++++++- > >> 3 files changed, 27 insertions(+), 2 deletions(-) > >> > >> diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c > >> index 056002e4a4a..aafc6d63be6 100644 > >> --- a/gcc/config/s390/s390.c > >> +++ b/gcc/config/s390/s390.c > >> @@ -10488,6 +10488,18 @@ s390_hard_regno_mode_ok (unsigned int regno, machine_mode mode) > >> return false; > >> } > >> > >> +/* Implement TARGET_NARROW_MODE_REFERS_LOW_PART_P. */ > >> + > >> +static bool > >> +s390_narrow_mode_refers_low_part_p (unsigned int regno) > >> +{ > >> + if (reg_classes_intersect_p (VEC_REGS, REGNO_REG_CLASS (regno))) > >> + return false; > >> + > >> + return true; > >> +} > >> + > >> + > >> /* Implement TARGET_MODES_TIEABLE_P. */ > >> > >> static bool > >> @@ -17472,6 +17484,8 @@ s390_vectorize_vec_perm_const (machine_mode vmode, rtx target, rtx op0, rtx op1, > >> #undef TARGET_VECTORIZE_VEC_PERM_CONST > >> #define TARGET_VECTORIZE_VEC_PERM_CONST s390_vectorize_vec_perm_const > >> > >> +#undef TARGET_NARROW_MODE_REFERS_LOW_PART_P > >> +#define TARGET_NARROW_MODE_REFERS_LOW_PART_P s390_narrow_mode_refers_low_part_p > >> > >> struct gcc_target targetm = TARGET_INITIALIZER; > >> > >> diff --git a/gcc/regcprop.c b/gcc/regcprop.c > >> index 1a9bcf0a1ad..aaf94ad9b51 100644 > >> --- a/gcc/regcprop.c > >> +++ b/gcc/regcprop.c > >> @@ -426,7 +426,8 @@ maybe_mode_change (machine_mode orig_mode, machine_mode copy_mode, > >> > >> if (orig_mode == new_mode) > >> return gen_raw_REG (new_mode, regno); > >> - else if (mode_change_ok (orig_mode, new_mode, regno)) > >> + else if (mode_change_ok (orig_mode, new_mode, regno) > >> + && targetm.narrow_mode_refers_low_part_p (regno)) > >> { > >> int copy_nregs = hard_regno_nregs (copy_regno, copy_mode); > >> int use_nregs = hard_regno_nregs (copy_regno, new_mode); > >> diff --git a/gcc/target.def b/gcc/target.def > >> index 8fd2533e90a..598eea501ff 100644 > >> --- a/gcc/target.def > >> +++ b/gcc/target.def > >> @@ -5446,6 +5446,16 @@ value that the middle-end intended.", > >> bool, (machine_mode from, machine_mode to, reg_class_t rclass), > >> hook_bool_mode_mode_reg_class_t_true) > >> > >> +/* This hook is used to work around a problem in regcprop. Hardcoded > >> +assumptions currently prevent it from working correctly for targets > >> +where the low part of a multi-word register doesn't align to accessing > >> +the register with a narrower mode. */ > >> +DEFHOOK_UNDOC > >> +(narrow_mode_refers_low_part_p, > >> +"", > >> +bool, (unsigned int regno), > >> +hook_bool_unit_true) > >> + > >> /* Change pseudo allocno class calculated by IRA. */ > >> DEFHOOK > >> (ira_change_pseudo_allocno_class, > >> @@ -5949,7 +5959,7 @@ register if floating point arithmetic is not being done. As long as the\n\ > >> floating registers are not in class @code{GENERAL_REGS}, they will not\n\ > >> be used unless some pattern's constraint asks for one.", > >> bool, (unsigned int regno, machine_mode mode), > >> - hook_bool_uint_mode_true) > >> + hook_bool_uint_true) > >> > >> DEFHOOK > >> (modes_tieable_p, > >> -- > >> 2.33.1 > >> >
On 1/14/22 20:41, Andreas Krebbel via Gcc-patches wrote: > On 1/14/22 08:37, Richard Biener wrote: > ... >> Can the gist of this bug be put into the GCC bugzilla so the rev can >> refer to it? > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104034 > >> Can we have a testcase even? > The testcase from Jakub is in the BZ. However, since it doesn't fail with head I didn't try to > include it in my patch. > >> I'm not quite understanding the problem but is it that, say, >> >> (subreg:DI (reg:V2DI ..) 0) >> >> isn't the same as >> >> (lowpart:DI (reg:V2DI ...) 0) > > (reg:DI v0) does not match the lower order bits of (reg:TI v0) > >> ? The regcprop code looks more like asking whether the larger reg >> is a composition of multiple other hardregs and will return the specific >> hardreg corresponding to the lowpart - so like if on s390 the vector >> registers overlap with some other regset. But then doing the actual >> accesses via the other regset regs doesn't actually work? Isn't the >> backend then lying to us (aka the mode_change_ok returns the >> wrong answer)? > > can_change_mode_class should do the right thing. We return false in case somebody wants to change TI > to DI for a vector register. However, the hook never gets called like this from regcprop. regcprop > only asks whether it is ok to change (reg:TI r8) to (reg:DI r8) and that's indeed ok. After writing this I'm wondering whether this would be a better fix: diff --git a/gcc/regcprop.c b/gcc/regcprop.c index 18132425ab2..b6a3f4e3804 100644 --- a/gcc/regcprop.c +++ b/gcc/regcprop.c @@ -402,7 +402,8 @@ maybe_mode_change (machine_mode orig_mode, machine_mode copy_mode, if (orig_mode == new_mode) return gen_raw_REG (new_mode, regno); - else if (mode_change_ok (orig_mode, new_mode, regno)) + else if (mode_change_ok (orig_mode, new_mode, regno) + && mode_change_ok (copy_mode, new_mode, copy_regno)) { int copy_nregs = hard_regno_nregs (copy_regno, copy_mode); int use_nregs = hard_regno_nregs (copy_regno, new_mode); Andreas
Andreas Krebbel via Gcc-patches <gcc-patches@gcc.gnu.org> writes: > On 1/14/22 20:41, Andreas Krebbel via Gcc-patches wrote: >> On 1/14/22 08:37, Richard Biener wrote: >> ... >>> Can the gist of this bug be put into the GCC bugzilla so the rev can >>> refer to it? >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104034 >> >>> Can we have a testcase even? >> The testcase from Jakub is in the BZ. However, since it doesn't fail with head I didn't try to >> include it in my patch. >> >>> I'm not quite understanding the problem but is it that, say, >>> >>> (subreg:DI (reg:V2DI ..) 0) >>> >>> isn't the same as >>> >>> (lowpart:DI (reg:V2DI ...) 0) >> >> (reg:DI v0) does not match the lower order bits of (reg:TI v0) >> >>> ? The regcprop code looks more like asking whether the larger reg >>> is a composition of multiple other hardregs and will return the specific >>> hardreg corresponding to the lowpart - so like if on s390 the vector >>> registers overlap with some other regset. But then doing the actual >>> accesses via the other regset regs doesn't actually work? Isn't the >>> backend then lying to us (aka the mode_change_ok returns the >>> wrong answer)? >> >> can_change_mode_class should do the right thing. We return false in case somebody wants to change TI >> to DI for a vector register. However, the hook never gets called like this from regcprop. regcprop >> only asks whether it is ok to change (reg:TI r8) to (reg:DI r8) and that's indeed ok. > > After writing this I'm wondering whether this would be a better fix: > > diff --git a/gcc/regcprop.c b/gcc/regcprop.c > index 18132425ab2..b6a3f4e3804 100644 > --- a/gcc/regcprop.c > +++ b/gcc/regcprop.c > @@ -402,7 +402,8 @@ maybe_mode_change (machine_mode orig_mode, machine_mode copy_mode, > > if (orig_mode == new_mode) > return gen_raw_REG (new_mode, regno); > - else if (mode_change_ok (orig_mode, new_mode, regno)) > + else if (mode_change_ok (orig_mode, new_mode, regno) > + && mode_change_ok (copy_mode, new_mode, copy_regno)) > { > int copy_nregs = hard_regno_nregs (copy_regno, copy_mode); > int use_nregs = hard_regno_nregs (copy_regno, new_mode); > Yeah, this looks good to me FWIW. Richard
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c index 056002e4a4a..aafc6d63be6 100644 --- a/gcc/config/s390/s390.c +++ b/gcc/config/s390/s390.c @@ -10488,6 +10488,18 @@ s390_hard_regno_mode_ok (unsigned int regno, machine_mode mode) return false; } +/* Implement TARGET_NARROW_MODE_REFERS_LOW_PART_P. */ + +static bool +s390_narrow_mode_refers_low_part_p (unsigned int regno) +{ + if (reg_classes_intersect_p (VEC_REGS, REGNO_REG_CLASS (regno))) + return false; + + return true; +} + + /* Implement TARGET_MODES_TIEABLE_P. */ static bool @@ -17472,6 +17484,8 @@ s390_vectorize_vec_perm_const (machine_mode vmode, rtx target, rtx op0, rtx op1, #undef TARGET_VECTORIZE_VEC_PERM_CONST #define TARGET_VECTORIZE_VEC_PERM_CONST s390_vectorize_vec_perm_const +#undef TARGET_NARROW_MODE_REFERS_LOW_PART_P +#define TARGET_NARROW_MODE_REFERS_LOW_PART_P s390_narrow_mode_refers_low_part_p struct gcc_target targetm = TARGET_INITIALIZER; diff --git a/gcc/regcprop.c b/gcc/regcprop.c index 1a9bcf0a1ad..aaf94ad9b51 100644 --- a/gcc/regcprop.c +++ b/gcc/regcprop.c @@ -426,7 +426,8 @@ maybe_mode_change (machine_mode orig_mode, machine_mode copy_mode, if (orig_mode == new_mode) return gen_raw_REG (new_mode, regno); - else if (mode_change_ok (orig_mode, new_mode, regno)) + else if (mode_change_ok (orig_mode, new_mode, regno) + && targetm.narrow_mode_refers_low_part_p (regno)) { int copy_nregs = hard_regno_nregs (copy_regno, copy_mode); int use_nregs = hard_regno_nregs (copy_regno, new_mode); diff --git a/gcc/target.def b/gcc/target.def index 8fd2533e90a..598eea501ff 100644 --- a/gcc/target.def +++ b/gcc/target.def @@ -5446,6 +5446,16 @@ value that the middle-end intended.", bool, (machine_mode from, machine_mode to, reg_class_t rclass), hook_bool_mode_mode_reg_class_t_true) +/* This hook is used to work around a problem in regcprop. Hardcoded +assumptions currently prevent it from working correctly for targets +where the low part of a multi-word register doesn't align to accessing +the register with a narrower mode. */ +DEFHOOK_UNDOC +(narrow_mode_refers_low_part_p, +"", +bool, (unsigned int regno), +hook_bool_unit_true) + /* Change pseudo allocno class calculated by IRA. */ DEFHOOK (ira_change_pseudo_allocno_class, @@ -5949,7 +5959,7 @@ register if floating point arithmetic is not being done. As long as the\n\ floating registers are not in class @code{GENERAL_REGS}, they will not\n\ be used unless some pattern's constraint asks for one.", bool, (unsigned int regno, machine_mode mode), - hook_bool_uint_mode_true) + hook_bool_uint_true) DEFHOOK (modes_tieable_p,