combine: Discard REG_UNUSED note in i2 when register is also referenced in i3 [PR118739]

Message ID CAFULd4az-XWtTFKOD1JzU3BhCXGjmS35pU_idZtcrt+4=LG2eQ@mail.gmail.com
State New
Headers
Series combine: Discard REG_UNUSED note in i2 when register is also referenced in i3 [PR118739] |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64 success Build passed
linaro-tcwg-bot/tcwg_simplebootstrap_build--master-arm-bootstrap success Build passed
linaro-tcwg-bot/tcwg_gcc_check--master-aarch64 success Test passed
linaro-tcwg-bot/tcwg_gcc_build--master-arm success Build passed
linaro-tcwg-bot/tcwg_simplebootstrap_build--master-aarch64-bootstrap success Build passed

Commit Message

Uros Bizjak Feb. 12, 2025, 12:14 p.m. UTC
  The combine pass is trying to combine:

Trying 16, 22, 21 -> 23:
   16: r104:QI=flags:CCNO>0
   22: {r120:QI=r104:QI^0x1;clobber flags:CC;}
      REG_UNUSED flags:CC
   21: r119:QI=flags:CCNO<=0
      REG_DEAD flags:CCNO
   23: {r110:QI=r119:QI|r120:QI;clobber flags:CC;}
      REG_DEAD r120:QI
      REG_DEAD r119:QI
      REG_UNUSED flags:CC

and creates the following two insn sequence:

modifying insn i2    22: r104:QI=flags:CCNO>0
      REG_DEAD flags:CC
deferring rescan insn with uid = 22.
modifying insn i3    23: r110:QI=flags:CCNO<=0
      REG_DEAD flags:CC
deferring rescan insn with uid = 23.

where the REG_DEAD note in i2 is not correct, because the flags
register is still referenced in i3.  In try_combine() megafunction, we
have this part:

--cut here--
    /* Distribute all the LOG_LINKS and REG_NOTES from I1, I2, and I3.  */
    if (i3notes)
      distribute_notes (i3notes, i3, i3, newi2pat ? i2 : NULL,
            elim_i2, elim_i1, elim_i0);
    if (i2notes)
      distribute_notes (i2notes, i2, i3, newi2pat ? i2 : NULL,
            elim_i2, elim_i1, elim_i0);
    if (i1notes)
      distribute_notes (i1notes, i1, i3, newi2pat ? i2 : NULL,
            elim_i2, local_elim_i1, local_elim_i0);
    if (i0notes)
      distribute_notes (i0notes, i0, i3, newi2pat ? i2 : NULL,
            elim_i2, elim_i1, local_elim_i0);
    if (midnotes)
      distribute_notes (midnotes, NULL, i3, newi2pat ? i2 : NULL,
            elim_i2, elim_i1, elim_i0);
--cut here--

where the compiler distributes REG_UNUSED note from i2:

   22: {r120:QI=r104:QI^0x1;clobber flags:CC;}
      REG_UNUSED flags:CC

via distribute_notes() using the following:

--cut here--
          /* Otherwise, if this register is now referenced in i2
         then the register used to be modified in one of the
         original insns.  If it was i3 (say, in an unused
         parallel), it's now completely gone, so the note can
         be discarded.  But if it was modified in i2, i1 or i0
         and we still reference it in i2, then we're
         referencing the previous value, and since the
         register was modified and REG_UNUSED, we know that
         the previous value is now dead.  So, if we only
         reference the register in i2, we change the note to
         REG_DEAD, to reflect the previous value.  However, if
         we're also setting or clobbering the register as
         scratch, we know (because the register was not
         referenced in i3) that it's unused, just as it was
         unused before, and we place the note in i2.  */
          if (from_insn != i3 && i2 && INSN_P (i2)
          && reg_referenced_p (XEXP (note, 0), PATTERN (i2)))
        {
          if (!reg_set_p (XEXP (note, 0), PATTERN (i2)))
            PUT_REG_NOTE_KIND (note, REG_DEAD);
          if (! (REG_P (XEXP (note, 0))
             ? find_regno_note (i2, REG_NOTE_KIND (note),
                        REGNO (XEXP (note, 0)))
             : find_reg_note (i2, REG_NOTE_KIND (note),
                      XEXP (note, 0))))
            place = i2;
        }
--cut here--

However, the flags register is not UNUSED (or DEAD), because it is
used in i3.  The proposed solution is to remove the REG_UNUSED note
from i2 when the register is also mentioned in i3.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

OK for master and eventual backports?

Uros.
  

Comments

Uros Bizjak Feb. 12, 2025, 12:21 p.m. UTC | #1
On Wed, Feb 12, 2025 at 1:14 PM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> The combine pass is trying to combine:
>
> Trying 16, 22, 21 -> 23:
>    16: r104:QI=flags:CCNO>0
>    22: {r120:QI=r104:QI^0x1;clobber flags:CC;}
>       REG_UNUSED flags:CC
>    21: r119:QI=flags:CCNO<=0
>       REG_DEAD flags:CCNO
>    23: {r110:QI=r119:QI|r120:QI;clobber flags:CC;}
>       REG_DEAD r120:QI
>       REG_DEAD r119:QI
>       REG_UNUSED flags:CC
>
> and creates the following two insn sequence:
>
> modifying insn i2    22: r104:QI=flags:CCNO>0
>       REG_DEAD flags:CC
> deferring rescan insn with uid = 22.
> modifying insn i3    23: r110:QI=flags:CCNO<=0
>       REG_DEAD flags:CC
> deferring rescan insn with uid = 23.
>
> where the REG_DEAD note in i2 is not correct, because the flags
> register is still referenced in i3.  In try_combine() megafunction, we
> have this part:
>
> --cut here--
>     /* Distribute all the LOG_LINKS and REG_NOTES from I1, I2, and I3.  */
>     if (i3notes)
>       distribute_notes (i3notes, i3, i3, newi2pat ? i2 : NULL,
>             elim_i2, elim_i1, elim_i0);
>     if (i2notes)
>       distribute_notes (i2notes, i2, i3, newi2pat ? i2 : NULL,
>             elim_i2, elim_i1, elim_i0);
>     if (i1notes)
>       distribute_notes (i1notes, i1, i3, newi2pat ? i2 : NULL,
>             elim_i2, local_elim_i1, local_elim_i0);
>     if (i0notes)
>       distribute_notes (i0notes, i0, i3, newi2pat ? i2 : NULL,
>             elim_i2, elim_i1, local_elim_i0);
>     if (midnotes)
>       distribute_notes (midnotes, NULL, i3, newi2pat ? i2 : NULL,
>             elim_i2, elim_i1, elim_i0);
> --cut here--
>
> where the compiler distributes REG_UNUSED note from i2:
>
>    22: {r120:QI=r104:QI^0x1;clobber flags:CC;}
>       REG_UNUSED flags:CC
>
> via distribute_notes() using the following:
>
> --cut here--
>           /* Otherwise, if this register is now referenced in i2
>          then the register used to be modified in one of the
>          original insns.  If it was i3 (say, in an unused
>          parallel), it's now completely gone, so the note can
>          be discarded.  But if it was modified in i2, i1 or i0
>          and we still reference it in i2, then we're
>          referencing the previous value, and since the
>          register was modified and REG_UNUSED, we know that
>          the previous value is now dead.  So, if we only
>          reference the register in i2, we change the note to
>          REG_DEAD, to reflect the previous value.  However, if
>          we're also setting or clobbering the register as
>          scratch, we know (because the register was not
>          referenced in i3) that it's unused, just as it was
>          unused before, and we place the note in i2.  */
>           if (from_insn != i3 && i2 && INSN_P (i2)
>           && reg_referenced_p (XEXP (note, 0), PATTERN (i2)))
>         {
>           if (!reg_set_p (XEXP (note, 0), PATTERN (i2)))
>             PUT_REG_NOTE_KIND (note, REG_DEAD);
>           if (! (REG_P (XEXP (note, 0))
>              ? find_regno_note (i2, REG_NOTE_KIND (note),
>                         REGNO (XEXP (note, 0)))
>              : find_reg_note (i2, REG_NOTE_KIND (note),
>                       XEXP (note, 0))))
>             place = i2;
>         }
> --cut here--
>
> However, the flags register is not UNUSED (or DEAD), because it is
> used in i3.  The proposed solution is to remove the REG_UNUSED note
> from i2 when the register is also mentioned in i3.

Oops, forgot to include ChangeLog entry:

--cut here--
    PR rtl-optimization/118739

gcc/ChangeLog:

    * combine.cc (distribute_notes) <case REG_UNUSED>: Remove
    REG_UNUSED note from i2 when the register is also mentioned in i3.

gcc/testsuite/ChangeLog:

    * gcc.target/i386/pr118739.c: New test.
--cut here--

>
> Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
>
> OK for master and eventual backports?
>
> Uros.
  
Richard Sandiford Feb. 12, 2025, 3:16 p.m. UTC | #2
Uros Bizjak <ubizjak@gmail.com> writes:
> The combine pass is trying to combine:
>
> Trying 16, 22, 21 -> 23:
>    16: r104:QI=flags:CCNO>0
>    22: {r120:QI=r104:QI^0x1;clobber flags:CC;}
>       REG_UNUSED flags:CC
>    21: r119:QI=flags:CCNO<=0
>       REG_DEAD flags:CCNO

It looks like something has already gone wrong in this sequence,
in that insn 21 is using the flags after the register has been clobbered.
If the flags result of insn 22 is useful, the insn should be setting the
flags using a parallel of two sets.

Thanks,
Richard

>    23: {r110:QI=r119:QI|r120:QI;clobber flags:CC;}
>       REG_DEAD r120:QI
>       REG_DEAD r119:QI
>       REG_UNUSED flags:CC
>
> and creates the following two insn sequence:
>
> modifying insn i2    22: r104:QI=flags:CCNO>0
>       REG_DEAD flags:CC
> deferring rescan insn with uid = 22.
> modifying insn i3    23: r110:QI=flags:CCNO<=0
>       REG_DEAD flags:CC
> deferring rescan insn with uid = 23.
>
> where the REG_DEAD note in i2 is not correct, because the flags
> register is still referenced in i3.  In try_combine() megafunction, we
> have this part:
>
> --cut here--
>     /* Distribute all the LOG_LINKS and REG_NOTES from I1, I2, and I3.  */
>     if (i3notes)
>       distribute_notes (i3notes, i3, i3, newi2pat ? i2 : NULL,
>             elim_i2, elim_i1, elim_i0);
>     if (i2notes)
>       distribute_notes (i2notes, i2, i3, newi2pat ? i2 : NULL,
>             elim_i2, elim_i1, elim_i0);
>     if (i1notes)
>       distribute_notes (i1notes, i1, i3, newi2pat ? i2 : NULL,
>             elim_i2, local_elim_i1, local_elim_i0);
>     if (i0notes)
>       distribute_notes (i0notes, i0, i3, newi2pat ? i2 : NULL,
>             elim_i2, elim_i1, local_elim_i0);
>     if (midnotes)
>       distribute_notes (midnotes, NULL, i3, newi2pat ? i2 : NULL,
>             elim_i2, elim_i1, elim_i0);
> --cut here--
>
> where the compiler distributes REG_UNUSED note from i2:
>
>    22: {r120:QI=r104:QI^0x1;clobber flags:CC;}
>       REG_UNUSED flags:CC
>
> via distribute_notes() using the following:
>
> --cut here--
>           /* Otherwise, if this register is now referenced in i2
>          then the register used to be modified in one of the
>          original insns.  If it was i3 (say, in an unused
>          parallel), it's now completely gone, so the note can
>          be discarded.  But if it was modified in i2, i1 or i0
>          and we still reference it in i2, then we're
>          referencing the previous value, and since the
>          register was modified and REG_UNUSED, we know that
>          the previous value is now dead.  So, if we only
>          reference the register in i2, we change the note to
>          REG_DEAD, to reflect the previous value.  However, if
>          we're also setting or clobbering the register as
>          scratch, we know (because the register was not
>          referenced in i3) that it's unused, just as it was
>          unused before, and we place the note in i2.  */
>           if (from_insn != i3 && i2 && INSN_P (i2)
>           && reg_referenced_p (XEXP (note, 0), PATTERN (i2)))
>         {
>           if (!reg_set_p (XEXP (note, 0), PATTERN (i2)))
>             PUT_REG_NOTE_KIND (note, REG_DEAD);
>           if (! (REG_P (XEXP (note, 0))
>              ? find_regno_note (i2, REG_NOTE_KIND (note),
>                         REGNO (XEXP (note, 0)))
>              : find_reg_note (i2, REG_NOTE_KIND (note),
>                       XEXP (note, 0))))
>             place = i2;
>         }
> --cut here--
>
> However, the flags register is not UNUSED (or DEAD), because it is
> used in i3.  The proposed solution is to remove the REG_UNUSED note
> from i2 when the register is also mentioned in i3.
>
> Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
>
> OK for master and eventual backports?
>
> Uros.
  
Uros Bizjak Feb. 12, 2025, 3:32 p.m. UTC | #3
On Wed, Feb 12, 2025 at 4:16 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Uros Bizjak <ubizjak@gmail.com> writes:
> > The combine pass is trying to combine:
> >
> > Trying 16, 22, 21 -> 23:
> >    16: r104:QI=flags:CCNO>0
> >    22: {r120:QI=r104:QI^0x1;clobber flags:CC;}
> >       REG_UNUSED flags:CC
> >    21: r119:QI=flags:CCNO<=0
> >       REG_DEAD flags:CCNO
>
> It looks like something has already gone wrong in this sequence,
> in that insn 21 is using the flags after the register has been clobbered.
> If the flags result of insn 22 is useful, the insn should be setting the
> flags using a parallel of two sets.

Please note that the insn sequence before combine pass is correct:

   16: r104:QI=flags:CCNO>0
   21: r119:QI=flags:CCNO<=0
      REG_DEAD flags:CCNO
   22: {r120:QI=r104:QI^0x1;clobber flags:CC;}
      REG_UNUSED flags:CC
   23: {r110:QI=r119:QI|r120:QI;clobber flags:CC;}
      REG_DEAD r120:QI
      REG_DEAD r119:QI
      REG_UNUSED flags:CC

then combine tries to make the combined insn with:

Trying 16, 22, 21 -> 23:

where:

Failed to match this instruction:
(parallel [
        (set (reg:QI 110 [ d_lsm_flag.20 ])
            (le:QI (reg:CCNO 17 flags)
                (const_int 0 [0])))
        (clobber (reg:CC 17 flags))
        (set (reg:QI 104 [ _36 ])
            (gt:QI (reg:CCNO 17 flags)
                (const_int 0 [0])))
    ])
Failed to match this instruction:
(parallel [
        (set (reg:QI 110 [ d_lsm_flag.20 ])
            (le:QI (reg:CCNO 17 flags)
                (const_int 0 [0])))
        (set (reg:QI 104 [ _36 ])
            (gt:QI (reg:CCNO 17 flags)
                (const_int 0 [0])))
    ])
Successfully matched this instruction:
(set (reg:QI 104 [ _36 ])
    (gt:QI (reg:CCNO 17 flags)
        (const_int 0 [0])))
Successfully matched this instruction:
(set (reg:QI 110 [ d_lsm_flag.20 ])
    (le:QI (reg:CCNO 17 flags)
        (const_int 0 [0])))
allowing combination of insns 16, 21, 22 and 23
original costs 4 + 4 + 4 + 4 = 16
replacement costs 4 + 4 = 8
deferring deletion of insn with uid = 21.
deferring deletion of insn with uid = 16.
modifying insn i2    22: r104:QI=flags:CCNO>0
      REG_DEAD flags:CC
deferring rescan insn with uid = 22.
modifying insn i3    23: r110:QI=flags:CCNO<=0
      REG_DEAD flags:CC
deferring rescan insn with uid = 23.

Combined instruction is OK, but when insn is split into two, it
propagates REG_UNUSED from insn 22 (and converting them to REG_DEAD on
the fly in the referred code) to insn i2. This is clearly wrong when
flags reg is used in insn i3. I don't see anything wrong besides this.

Uros.
  
Richard Sandiford Feb. 12, 2025, 4:13 p.m. UTC | #4
Uros Bizjak <ubizjak@gmail.com> writes:
> On Wed, Feb 12, 2025 at 4:16 PM Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>>
>> Uros Bizjak <ubizjak@gmail.com> writes:
>> > The combine pass is trying to combine:
>> >
>> > Trying 16, 22, 21 -> 23:
>> >    16: r104:QI=flags:CCNO>0
>> >    22: {r120:QI=r104:QI^0x1;clobber flags:CC;}
>> >       REG_UNUSED flags:CC
>> >    21: r119:QI=flags:CCNO<=0
>> >       REG_DEAD flags:CCNO
>>
>> It looks like something has already gone wrong in this sequence,
>> in that insn 21 is using the flags after the register has been clobbered.
>> If the flags result of insn 22 is useful, the insn should be setting the
>> flags using a parallel of two sets.
>
> Please note that the insn sequence before combine pass is correct:
>
>    16: r104:QI=flags:CCNO>0
>    21: r119:QI=flags:CCNO<=0
>       REG_DEAD flags:CCNO
>    22: {r120:QI=r104:QI^0x1;clobber flags:CC;}
>       REG_UNUSED flags:CC
>    23: {r110:QI=r119:QI|r120:QI;clobber flags:CC;}
>       REG_DEAD r120:QI
>       REG_DEAD r119:QI
>       REG_UNUSED flags:CC

Ah, ok!  Sorry for the noise.  I hadn't realised that combine sorted
the instructions into program order only after printing them:

  if (dump_file && (dump_flags & TDF_DETAILS))
    {
      if (i0)
	fprintf (dump_file, "\nTrying %d, %d, %d -> %d:\n",
		 INSN_UID (i0), INSN_UID (i1), INSN_UID (i2), INSN_UID (i3));
      else if (i1)
	fprintf (dump_file, "\nTrying %d, %d -> %d:\n",
		 INSN_UID (i1), INSN_UID (i2), INSN_UID (i3));
      else
	fprintf (dump_file, "\nTrying %d -> %d:\n",
		 INSN_UID (i2), INSN_UID (i3));

      if (i0)
	dump_insn_slim (dump_file, i0);
      if (i1)
	dump_insn_slim (dump_file, i1);
      dump_insn_slim (dump_file, i2);
      dump_insn_slim (dump_file, i3);
    }

  /* If multiple insns feed into one of I2 or I3, they can be in any
     order.  To simplify the code below, reorder them in sequence.  */
  if (i0 && DF_INSN_LUID (i0) > DF_INSN_LUID (i2))
    std::swap (i0, i2);
  if (i0 && DF_INSN_LUID (i0) > DF_INSN_LUID (i1))
    std::swap (i0, i1);
  if (i1 && DF_INSN_LUID (i1) > DF_INSN_LUID (i2))
    std::swap (i1, i2);

Feels like it would be more useful to sort them first.

So yeah, please ignore my comment above.

Thanks,
Richard
  
Richard Biener Feb. 24, 2025, 9:46 a.m. UTC | #5
On Wed, Feb 12, 2025 at 1:16 PM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> The combine pass is trying to combine:
>
> Trying 16, 22, 21 -> 23:
>    16: r104:QI=flags:CCNO>0
>    22: {r120:QI=r104:QI^0x1;clobber flags:CC;}
>       REG_UNUSED flags:CC
>    21: r119:QI=flags:CCNO<=0
>       REG_DEAD flags:CCNO
>    23: {r110:QI=r119:QI|r120:QI;clobber flags:CC;}
>       REG_DEAD r120:QI
>       REG_DEAD r119:QI
>       REG_UNUSED flags:CC
>
> and creates the following two insn sequence:
>
> modifying insn i2    22: r104:QI=flags:CCNO>0
>       REG_DEAD flags:CC
> deferring rescan insn with uid = 22.
> modifying insn i3    23: r110:QI=flags:CCNO<=0
>       REG_DEAD flags:CC
> deferring rescan insn with uid = 23.
>
> where the REG_DEAD note in i2 is not correct, because the flags
> register is still referenced in i3.  In try_combine() megafunction, we
> have this part:
>
> --cut here--
>     /* Distribute all the LOG_LINKS and REG_NOTES from I1, I2, and I3.  */
>     if (i3notes)
>       distribute_notes (i3notes, i3, i3, newi2pat ? i2 : NULL,
>             elim_i2, elim_i1, elim_i0);
>     if (i2notes)
>       distribute_notes (i2notes, i2, i3, newi2pat ? i2 : NULL,
>             elim_i2, elim_i1, elim_i0);
>     if (i1notes)
>       distribute_notes (i1notes, i1, i3, newi2pat ? i2 : NULL,
>             elim_i2, local_elim_i1, local_elim_i0);
>     if (i0notes)
>       distribute_notes (i0notes, i0, i3, newi2pat ? i2 : NULL,
>             elim_i2, elim_i1, local_elim_i0);
>     if (midnotes)
>       distribute_notes (midnotes, NULL, i3, newi2pat ? i2 : NULL,
>             elim_i2, elim_i1, elim_i0);
> --cut here--
>
> where the compiler distributes REG_UNUSED note from i2:
>
>    22: {r120:QI=r104:QI^0x1;clobber flags:CC;}
>       REG_UNUSED flags:CC
>
> via distribute_notes() using the following:
>
> --cut here--
>           /* Otherwise, if this register is now referenced in i2
>          then the register used to be modified in one of the
>          original insns.  If it was i3 (say, in an unused
>          parallel), it's now completely gone, so the note can
>          be discarded.  But if it was modified in i2, i1 or i0
>          and we still reference it in i2, then we're
>          referencing the previous value, and since the
>          register was modified and REG_UNUSED, we know that
>          the previous value is now dead.  So, if we only
>          reference the register in i2, we change the note to
>          REG_DEAD, to reflect the previous value.  However, if
>          we're also setting or clobbering the register as
>          scratch, we know (because the register was not
>          referenced in i3) that it's unused, just as it was
>          unused before, and we place the note in i2.  */
>           if (from_insn != i3 && i2 && INSN_P (i2)
>           && reg_referenced_p (XEXP (note, 0), PATTERN (i2)))
>         {
>           if (!reg_set_p (XEXP (note, 0), PATTERN (i2)))
>             PUT_REG_NOTE_KIND (note, REG_DEAD);
>           if (! (REG_P (XEXP (note, 0))
>              ? find_regno_note (i2, REG_NOTE_KIND (note),
>                         REGNO (XEXP (note, 0)))
>              : find_reg_note (i2, REG_NOTE_KIND (note),
>                       XEXP (note, 0))))
>             place = i2;
>         }
> --cut here--
>
> However, the flags register is not UNUSED (or DEAD), because it is
> used in i3.  The proposed solution is to remove the REG_UNUSED note
> from i2 when the register is also mentioned in i3.

But there is

          /* Otherwise, if this register is used by I3, then this register
             now dies here, so we must put a REG_DEAD note here unless there
             is one already.  */
          else if (reg_referenced_p (XEXP (note, 0), PATTERN (i3))
                   && ! (REG_P (XEXP (note, 0))
                         ? find_regno_note (i3, REG_DEAD,
                                            REGNO (XEXP (note, 0)))
                         : find_reg_note (i3, REG_DEAD, XEXP (note, 0))))
            {
              PUT_REG_NOTE_KIND (note, REG_DEAD);
              place = i3;
            }

which of course isn't taken as there is a NOTE in i3 already.  Still the
note is supposed to be moved there (just not emitted, it's already there.
So a better fix is to refactor the above to

          /* Otherwise, if this register is used by I3, then this register
             now dies here, so we must put a REG_DEAD note here unless there
             is one already.  */
          else if (reg_referenced_p (XEXP (note, 0), PATTERN (i3))
             {
                 if (! (REG_P (XEXP (note, 0))
                         ? find_regno_note (i3, REG_DEAD,
                                            REGNO (XEXP (note, 0)))
                         : find_reg_note (i3, REG_DEAD, XEXP (note, 0))))
               {
                 PUT_REG_NOTE_KIND (note, REG_DEAD);
                 place = i3;
               }
             }

?  At least the else { case seems to assume the reg isn't refernced in i3.
The comment wording might also need an update of course.

Richard.

> Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
>
> OK for master and eventual backports?
>
> Uros.
  
Uros Bizjak Feb. 26, 2025, 7:01 a.m. UTC | #6
On Mon, Feb 24, 2025 at 10:46 AM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Wed, Feb 12, 2025 at 1:16 PM Uros Bizjak <ubizjak@gmail.com> wrote:
> >
> > The combine pass is trying to combine:
> >
> > Trying 16, 22, 21 -> 23:
> >    16: r104:QI=flags:CCNO>0
> >    22: {r120:QI=r104:QI^0x1;clobber flags:CC;}
> >       REG_UNUSED flags:CC
> >    21: r119:QI=flags:CCNO<=0
> >       REG_DEAD flags:CCNO
> >    23: {r110:QI=r119:QI|r120:QI;clobber flags:CC;}
> >       REG_DEAD r120:QI
> >       REG_DEAD r119:QI
> >       REG_UNUSED flags:CC
> >
> > and creates the following two insn sequence:
> >
> > modifying insn i2    22: r104:QI=flags:CCNO>0
> >       REG_DEAD flags:CC
> > deferring rescan insn with uid = 22.
> > modifying insn i3    23: r110:QI=flags:CCNO<=0
> >       REG_DEAD flags:CC
> > deferring rescan insn with uid = 23.
> >
> > where the REG_DEAD note in i2 is not correct, because the flags
> > register is still referenced in i3.  In try_combine() megafunction, we
> > have this part:
> >
> > --cut here--
> >     /* Distribute all the LOG_LINKS and REG_NOTES from I1, I2, and I3.  */
> >     if (i3notes)
> >       distribute_notes (i3notes, i3, i3, newi2pat ? i2 : NULL,
> >             elim_i2, elim_i1, elim_i0);
> >     if (i2notes)
> >       distribute_notes (i2notes, i2, i3, newi2pat ? i2 : NULL,
> >             elim_i2, elim_i1, elim_i0);
> >     if (i1notes)
> >       distribute_notes (i1notes, i1, i3, newi2pat ? i2 : NULL,
> >             elim_i2, local_elim_i1, local_elim_i0);
> >     if (i0notes)
> >       distribute_notes (i0notes, i0, i3, newi2pat ? i2 : NULL,
> >             elim_i2, elim_i1, local_elim_i0);
> >     if (midnotes)
> >       distribute_notes (midnotes, NULL, i3, newi2pat ? i2 : NULL,
> >             elim_i2, elim_i1, elim_i0);
> > --cut here--
> >
> > where the compiler distributes REG_UNUSED note from i2:
> >
> >    22: {r120:QI=r104:QI^0x1;clobber flags:CC;}
> >       REG_UNUSED flags:CC
> >
> > via distribute_notes() using the following:
> >
> > --cut here--
> >           /* Otherwise, if this register is now referenced in i2
> >          then the register used to be modified in one of the
> >          original insns.  If it was i3 (say, in an unused
> >          parallel), it's now completely gone, so the note can
> >          be discarded.  But if it was modified in i2, i1 or i0
> >          and we still reference it in i2, then we're
> >          referencing the previous value, and since the
> >          register was modified and REG_UNUSED, we know that
> >          the previous value is now dead.  So, if we only
> >          reference the register in i2, we change the note to
> >          REG_DEAD, to reflect the previous value.  However, if
> >          we're also setting or clobbering the register as
> >          scratch, we know (because the register was not
> >          referenced in i3) that it's unused, just as it was
> >          unused before, and we place the note in i2.  */
> >           if (from_insn != i3 && i2 && INSN_P (i2)
> >           && reg_referenced_p (XEXP (note, 0), PATTERN (i2)))
> >         {
> >           if (!reg_set_p (XEXP (note, 0), PATTERN (i2)))
> >             PUT_REG_NOTE_KIND (note, REG_DEAD);
> >           if (! (REG_P (XEXP (note, 0))
> >              ? find_regno_note (i2, REG_NOTE_KIND (note),
> >                         REGNO (XEXP (note, 0)))
> >              : find_reg_note (i2, REG_NOTE_KIND (note),
> >                       XEXP (note, 0))))
> >             place = i2;
> >         }
> > --cut here--
> >
> > However, the flags register is not UNUSED (or DEAD), because it is
> > used in i3.  The proposed solution is to remove the REG_UNUSED note
> > from i2 when the register is also mentioned in i3.
>
> But there is
>
>           /* Otherwise, if this register is used by I3, then this register
>              now dies here, so we must put a REG_DEAD note here unless there
>              is one already.  */
>           else if (reg_referenced_p (XEXP (note, 0), PATTERN (i3))
>                    && ! (REG_P (XEXP (note, 0))
>                          ? find_regno_note (i3, REG_DEAD,
>                                             REGNO (XEXP (note, 0)))
>                          : find_reg_note (i3, REG_DEAD, XEXP (note, 0))))
>             {
>               PUT_REG_NOTE_KIND (note, REG_DEAD);
>               place = i3;
>             }
>
> which of course isn't taken as there is a NOTE in i3 already.  Still the
> note is supposed to be moved there (just not emitted, it's already there.
> So a better fix is to refactor the above to
>
>           /* Otherwise, if this register is used by I3, then this register
>              now dies here, so we must put a REG_DEAD note here unless there
>              is one already.  */
>           else if (reg_referenced_p (XEXP (note, 0), PATTERN (i3))
>              {
>                  if (! (REG_P (XEXP (note, 0))
>                          ? find_regno_note (i3, REG_DEAD,
>                                             REGNO (XEXP (note, 0)))
>                          : find_reg_note (i3, REG_DEAD, XEXP (note, 0))))
>                {
>                  PUT_REG_NOTE_KIND (note, REG_DEAD);
>                  place = i3;
>                }
>              }
>
> ?  At least the else { case seems to assume the reg isn't refernced in i3.
> The comment wording might also need an update of course.

Yes, this revision works as well. I'll prepare a v2 patch.

Thanks,
Uros.
  
Uros Bizjak Feb. 26, 2025, 1:57 p.m. UTC | #7
On Mon, Feb 24, 2025 at 10:46 AM Richard Biener
<richard.guenther@gmail.com> wrote:

>           /* Otherwise, if this register is used by I3, then this register
>              now dies here, so we must put a REG_DEAD note here unless there
>              is one already.  */
>           else if (reg_referenced_p (XEXP (note, 0), PATTERN (i3))
>              {
>                  if (! (REG_P (XEXP (note, 0))
>                          ? find_regno_note (i3, REG_DEAD,
>                                             REGNO (XEXP (note, 0)))
>                          : find_reg_note (i3, REG_DEAD, XEXP (note, 0))))
>                {
>                  PUT_REG_NOTE_KIND (note, REG_DEAD);
>                  place = i3;
>                }
>              }
>
> ?  At least the else { case seems to assume the reg isn't refernced in i3.
> The comment wording might also need an update of course.

Reading the comment a couple of times, considering that "here" and
"there" means I3 and that "one" means REG_DEAD note for "this
register" in I3, the comment actually makes sense.

Uros.
  

Patch

diff --git a/gcc/combine.cc b/gcc/combine.cc
index 3beeb514b81..0589ddbaca7 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -14557,9 +14557,12 @@  distribute_notes (rtx notes, rtx_insn *from_insn, rtx_insn *i3, rtx_insn *i2,
 		 we're also setting or clobbering the register as
 		 scratch, we know (because the register was not
 		 referenced in i3) that it's unused, just as it was
-		 unused before, and we place the note in i2.  */
+		 unused before, and we place the note in i2.  If this
+		 register is still referenced in i3, the note can be
+		 discarded.  */
 	      if (from_insn != i3 && i2 && INSN_P (i2)
-		  && reg_referenced_p (XEXP (note, 0), PATTERN (i2)))
+		  && reg_referenced_p (XEXP (note, 0), PATTERN (i2))
+		  && !reg_referenced_p (XEXP (note, 0), PATTERN (i3)))
 		{
 		  if (!reg_set_p (XEXP (note, 0), PATTERN (i2)))
 		    PUT_REG_NOTE_KIND (note, REG_DEAD);
diff --git a/gcc/testsuite/gcc.target/i386/pr118739.c b/gcc/testsuite/gcc.target/i386/pr118739.c
new file mode 100644
index 00000000000..89bed546363
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr118739.c
@@ -0,0 +1,50 @@ 
+/* PR rtl-optimization/118739 */
+/* { dg-do run } */
+/* { dg-options "-O3 -fno-tree-forwprop -fno-tree-vrp" } */
+
+volatile int a;
+int b, c, d = 1, e, f, g;
+
+int h (void)
+{
+  int i = 1;
+
+ j:
+  for (b = 1; b; b--)
+    {
+      asm ("#");
+
+      g = 0;
+
+      for (; g <= 1; g++)
+	{
+	  int k = f = 0;
+
+	  for (; f <= 1; f++)
+	    k = (1 == i) >= k || ((d = 0) >= a) + k;
+	}
+    }
+
+  for (; i < 3; i++)
+    {
+      if (!c)
+	return g;
+
+      if (e)
+	goto j;
+
+      asm ("#");
+    }
+
+  return 0;
+}
+
+int main()
+{
+  h();
+
+  if (d != 1)
+    __builtin_abort();
+
+  return 0;
+}