[committed] [PR tree-optimization/103226] Clobber the condition code in the bfin doloop patterns

Message ID 5f9d27eb-6202-f087-e6d9-7b2c9e142d05@gmail.com
State Committed
Commit 7950c96ca667ddaab9d6e894da3958ebc2e2dccb

Commit Message

Jeff Law Nov. 20, 2021, 4:31 p.m. UTC
Per Aldy's excellent, but tough-to-follow, analysis in PR 103226, this
patch fixes the bfin-elf regression.

In simplest terms, the doloop patterns on this port may clobber the
condition code register (CC), but they do not expose that clobber until
after register allocation.  That would be fine, except that other
patterns expose CC earlier.  As a result, the dataflow information,
particularly for CC, is incorrect.

This leads the register allocators to assume that a value in CC outside
the loop is still valid inside the loop when, in fact, the value has
been clobbered.  This is what caused pr80974 to start failing.
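
As a rough illustration of the kind of source that could be affected,
consider a counted loop that tests a loop-invariant condition.  This is
a hypothetical sketch, not the pr80974 testcase: the function name
sum_selected and its parameters are invented here, and whether a given
compiler keeps the flag in CC or turns the loop into a hardware loop is
an assumption.

```c
/* Hypothetical example; not taken from the GCC testsuite. */
int sum_selected(const int *v, int n, int flag)
{
    /* Loop-invariant comparison.  A compiler may decide to keep this
       result live in a condition-code register across the loop.  */
    int use_double = (flag != 0);
    int sum = 0;

    /* Simple counted loop: a candidate for a doloop (hardware loop)
       pattern on targets that provide one.  */
    for (int i = 0; i < n; i++) {
        if (use_double)
            sum += 2 * v[i];
        else
            sum += v[i];
    }
    return sum;
}
```

If the loop-control pattern silently overwrites CC, the test of
use_double inside the loop could read a stale value, so the function
would take the wrong branch even though the source is correct.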

Besides fixing the pr80974 regression, this change fixes ~20 other
execution failures in the port.  It also reduces test time for the
port from ~90 minutes to ~60 minutes.

Committed to the trunk,
Jeff
commit 7950c96ca667ddaab9d6e894da3958ebc2e2dccb
Author: Jeff Law <jeffreyalaw@gmail.com>
Date:   Sat Nov 20 11:20:07 2021 -0500

    Clobber the condition code in the bfin doloop patterns
    
    Per Aldy's excellent, but tough to follow analysis in PR 103226, this patch
    fixes the bfin-elf regression.
    
    In simplest terms the doloop patterns on this port may clobber the condition
    code register, but they do not expose that until after register allocation.
    That would be fine, except that other patterns have exposed CC earlier.  As
    a result the dataflow, particularly for CC, is incorrect.
    
    This leads the register allocators to assume that a value in CC outside the
    loop is still valid inside the loop when in fact, the value has been
    clobbered.  This is what caused pr80974 to start failing.
    
    With this fix, not only do we fix the pr80974 regression, but we fix ~20
    other execution failures in the port.  It also reduces test time for the
    port from ~90 minutes to ~60 minutes.
    
            PR tree-optimization/103226
    gcc/
            * config/bfin/bfin.md (doloop pattern, splitter and expander): Clobber
            CC.
  

Patch

diff --git a/gcc/config/bfin/bfin.md b/gcc/config/bfin/bfin.md
index fd65f4d9e63..10a19aac23e 100644
--- a/gcc/config/bfin/bfin.md
+++ b/gcc/config/bfin/bfin.md
@@ -1959,7 +1959,8 @@ 
 		   (plus:SI (match_dup 0)
 			    (const_int -1)))
 	      (unspec [(const_int 0)] UNSPEC_LSETUP_END)
-	      (clobber (match_dup 2))])] ; match_scratch
+	      (clobber (match_dup 2))
+	      (clobber (reg:BI REG_CC))])] ; match_scratch
   ""
 {
   /* The loop optimizer doesn't check the predicates... */
@@ -1979,7 +1980,8 @@ 
 	(plus (match_dup 2)
 	      (const_int -1)))
    (unspec [(const_int 0)] UNSPEC_LSETUP_END)
-   (clobber (match_scratch:SI 3 "=X,&r,&r"))]
+   (clobber (match_scratch:SI 3 "=X,&r,&r"))
+   (clobber (reg:BI REG_CC))]
   ""
   "@
    /* loop end %0 %l1 */
@@ -1997,7 +1999,8 @@ 
 	(plus (match_dup 0)
 	      (const_int -1)))
    (unspec [(const_int 0)] UNSPEC_LSETUP_END)
-   (clobber (match_scratch:SI 2))]
+   (clobber (match_scratch:SI 2))
+   (clobber (reg:BI REG_CC))]
   "memory_operand (operands[0], SImode) || splitting_loops"
   [(set (match_dup 2) (match_dup 0))
    (set (match_dup 2) (plus:SI (match_dup 2) (const_int -1)))