[avr]: Improve bit-extractions as of PR109907.

Message ID 5829e492-0f43-138a-6e50-e3115a4abbf1@gjlay.de
State New
Series [avr]: Improve bit-extractions as of PR109907.

Checks

Context                                          Check    Description
linaro-tcwg-bot/tcwg_gcc_build--master-arm       success  Testing passed
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64   success  Testing passed
linaro-tcwg-bot/tcwg_gcc_check--master-arm       success  Testing passed

Commit Message

Georg-Johann Lay June 7, 2023, 8:41 a.m. UTC
This patch improves bit extractions on AVR.

Andrew added some patches so that more bit extractions are
recognized in the middle-end and the RTL optimizers.

The patch adds a pattern for "extzv<mode>", which replaces the
deprecated "extzv".

There are still situations where expensive shifts are passed
down to the backend, though, and in one such situation the backend
now uses a better sequence, namely a right shift whose offset is
the MSB position:

Instead of a ROL/CLR/ROL sequence that needs constraint "0" for
operand $1, BST/CLR/BLD only requires "r" for $1, which reduces
register pressure.  Moreover, no scratch register is required.
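The difference between the two sequences for an 8-bit shift by 7 can be modelled in plain C.  This is only a sketch: the function names are hypothetical and not part of the patch, and the carry and T flags are modelled as local variables.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "rol %0 ; clr %0 ; rol %0".  All three instructions work on
   the same register, so the input has to be tied to the output ("0").
   The value the first rol leaves in the register is discarded by clr,
   so only its effect on the carry flag is modelled.  */
static uint8_t lshr7_rol_clr_rol(uint8_t reg)
{
    unsigned carry = reg >> 7;            /* rol: bit 7 moves into carry */
    reg = 0;                              /* clr: carry is left unchanged */
    reg = (uint8_t)((reg << 1) | carry);  /* rol: carry moves into bit 0 */
    return reg;
}

/* Model of "bst %1,7 ; clr %0 ; bld %0,0".  The source is only read,
   so it may live in any register ("r"), including the destination.  */
static void lshr7_bst_clr_bld(const uint8_t *src, uint8_t *dest)
{
    unsigned t = (*src >> 7) & 1u;           /* bst: T flag = bit 7 of source */
    *dest = 0;                               /* clr */
    *dest = (uint8_t)((*dest & 0xFEu) | t);  /* bld: bit 0 = T flag */
}

/* Check both models against a plain ">> 7" for every 8-bit input.  */
static void check_lshr7_models(void)
{
    for (unsigned x = 0; x < 256; x++) {
        uint8_t src = (uint8_t)x;
        uint8_t dest = 0xAA;              /* arbitrary old contents */
        lshr7_bst_clr_bld(&src, &dest);
        assert(dest == (x >> 7));
        assert(src == x);                 /* source is not clobbered */
        assert(lshr7_rol_clr_rol((uint8_t)x) == (x >> 7));
    }
}
```

Calling check_lshr7_models() exercises all 256 inputs.  In this model the BST variant also works when source and destination are the same register, because the T flag is read before the destination is cleared.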

The asm output for (inverted) bit extraction was moved into a
C function, which is more convenient.
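As an illustration of why the output logic is easier to express in C, here is a hedged model of three of the instruction sequences that avr_out_extr_not in the patch emits for an inverted bit.  The C function names below are hypothetical, and the carry flag is modelled as a local variable.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "clr %0 ; sbrs %1,bit ; inc %0" (source != destination).  */
static uint8_t extr_not_generic(uint8_t src, unsigned bit)
{
    uint8_t dest = 0;                 /* clr %0 */
    if (!((src >> bit) & 1u))         /* sbrs: skip next insn if bit is set */
        dest++;                       /* inc %0 */
    return dest;
}

/* Model of "inc %0 ; andi %0,1" (bit 0, source == destination).  */
static uint8_t extr_not_bit0_in_place(uint8_t reg)
{
    reg = (uint8_t)(reg + 1);         /* inc: adding 1 flips bit 0 */
    reg &= 1;                         /* andi: keep only bit 0 */
    return reg;
}

/* Model of "cpi %1,0x80 ; sbc %0,%0 ; neg %0" (bit 7).  */
static uint8_t extr_not_bit7_cpi(uint8_t src)
{
    unsigned carry = src < 0x80;          /* cpi: carry set iff bit 7 is clear */
    uint8_t dest = (uint8_t)(0u - carry); /* sbc d,d: d - d - C = 0x00 or 0xFF */
    dest = (uint8_t)(0u - dest);          /* neg: 0x00 or 0x01 */
    return dest;
}

/* Check all three models against "(~x >> bit) & 1" for every input.  */
static void check_extr_not_models(void)
{
    for (unsigned x = 0; x < 256; x++) {
        for (unsigned bit = 0; bit < 8; bit++)
            assert(extr_not_generic((uint8_t)x, bit) == ((~x >> bit) & 1u));
        assert(extr_not_bit0_in_place((uint8_t)x) == (~x & 1u));
        assert(extr_not_bit7_cpi((uint8_t)x) == ((~x >> 7) & 1u));
    }
}
```

Which sequence applies depends on the register class and on whether source and destination coincide, which is the kind of case analysis that is awkward to express as alternatives in an insn template.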

Ok for master?

Johann

--

target/109907: Overhaul bit extractions.

o Logical right shift that shifts the MSB to position 0 can be performed in
   such a way that the input operand constraint can be relaxed from "0" to "r".
   This results in less register pressure.  Moreover, no scratch register is
   required in that case.

o The deprecated "extzv" pattern is replaced by "extzv<mode>" that allows
   inputs of scalar integer modes of different sizes (1 up to 4 bytes).

o Existing patterns are adjusted to the more generic "extzv<mode>" pattern.
   Some patterns are added as the middle-end has been reworked to spot
   more bit-extraction opportunities.

o A C function is used to print the asm for bit extractions, which is more
   convenient for complex output logic.
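For inputs wider than one byte, the patch reduces the extraction to a single byte by splitting the bit position into a byte number (bit / 8) and a bit number within that byte (bit % 8).  The following C sketch mirrors that arithmetic on a plain integer; extract_bit is a hypothetical helper, not a function from the patch, and it assumes AVR's little-endian byte order, where byte 0 is the least significant one.

```c
#include <assert.h>
#include <stdint.h>

/* Extract one bit from a value that is SIZE_BYTES (1 to 4) bytes wide by
   first selecting the byte that holds the bit, then the bit in that byte. */
static unsigned extract_bit(uint32_t value, unsigned size_bytes, unsigned bit)
{
    assert(size_bytes >= 1 && size_bytes <= 4);
    assert(bit < 8 * size_bytes);

    unsigned byte_no = bit / 8;       /* which byte holds the bit */
    unsigned bit_in_byte = bit % 8;   /* position inside that byte */

    /* Select the byte; byte 0 is the least significant one.  */
    uint8_t byte = (uint8_t)(value >> (8 * byte_no));

    /* From here on this is an ordinary 8-bit, 1-bit-wide extraction.  */
    return (byte >> bit_in_byte) & 1u;
}
```

For example, bit 23 of a 3-byte value lives in byte 2 at bit position 7, so only that one byte has to be read.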

gcc/
	PR target/109907
	* config/avr/avr.md (adjust_len) [extr, extr_not]: New elements.
	(MSB, SIZE): New mode attributes.
	(any_shift): New code iterator.
	(*lshr<mode>3_split, *lshr<mode>3, lshr<mode>3)
	(*lshr<mode>3_const_split): Add constraint alternative for
	the case of shift-offset = MSB.  Ditch "length" attribute.
	(extzv<mode>): New.  Replaces extzv.  Adjust following patterns.
	Use avr_out_extr, avr_out_extr_not to print asm.
	(*extzv.subreg.<mode>, *extzv.<mode>.subreg, *extzv.xor)
	(*extzv<mode>.ge, *neg.ashiftrt<mode>.msb, *extzv.io.lsr7): New.
	* config/avr/constraints.md (C15, C23, C31, Yil): New.
	* config/avr/predicates.md (reg_or_low_io_operand, const7_operand)
	(const15_operand, const_0_to_15_operand)
	(const23_operand, const_0_to_23_operand)
	(const31_operand, const_0_to_31_operand): New.
	* config/avr/avr-protos.h (avr_out_extr, avr_out_extr_not): New.
	* config/avr/avr.cc (avr_out_extr, avr_out_extr_not): New funcs.
	(lshrqi3_out, lshrhi3_out, lshrpsi3_out, lshrsi3_out): Adjust
	MSB case to new insn constraint "r" for operands[1].
	(avr_adjust_insn_length) [ADJUST_LEN_EXTR_NOT, ADJUST_LEN_EXTR]:
	Handle these cases.
	(avr_rtx_costs_1): Adjust cost for a new pattern.
gcc/testsuite/
	* gcc.target/avr/pr109907.c: New test.
	* gcc.target/avr/torture/pr109907-1.c: New test.
	* gcc.target/avr/torture/pr109907-2.c: New test.
  

Comments

Jeff Law June 10, 2023, 5:06 p.m. UTC | #1
On 6/7/23 02:41, Georg-Johann Lay wrote:
> Subject:
> [patch,avr]: Improve bit-extractions as of PR109907.
> From:
> Georg-Johann Lay <avr@gjlay.de>
> Date:
> 6/7/23, 02:41
> 
> To:
> gcc-patches@gcc.gnu.org
> CC:
> Jeff Law <jeffreyalaw@gmail.com>, Denis Chertykov <chertykov@gmail.com>
> 
> 
> pr109907-v2.diff
> 
> diff --git a/gcc/config/avr/avr-protos.h b/gcc/config/avr/avr-protos.h
> index ec96fd45865..229854a19db 100644
> --- a/gcc/config/avr/avr-protos.h
> +++ b/gcc/config/avr/avr-protos.h

> diff --git a/gcc/config/avr/avr.cc b/gcc/config/avr/avr.cc
> index a90cade35c7..f69d79bf14e 100644
> --- a/gcc/config/avr/avr.cc
> +++ b/gcc/config/avr/avr.cc
> @@ -7142,9 +7142,9 @@ lshrqi3_out (rtx_insn *insn, rtx operands[], int *len)
>   
>   	case 7:
>   	  *len = 3;
> -	  return ("rol %0" CR_TAB
> -		  "clr %0" CR_TAB
> -		  "rol %0");
> +	  return ("bst %1,7" CR_TAB
> +		  "clr %0"   CR_TAB
> +		  "bld %0,0");
>   	}
[ ... ]
Reminds me a lot of the H8 port.  The basic H8/300 variants can only 
shift a single bit at a time and the H8/S can only shift 2 at a time. 
So we synthesize all kinds of sequences to try and optimize shifts and 
bitfield extractions.


Anyway, much like the other patch, I did a cursory review, but you're 
really in a better position to judge correctness and profitability for 
the AVR bits.  So OK for the trunk.

jeff
  

Patch

diff --git a/gcc/config/avr/avr-protos.h b/gcc/config/avr/avr-protos.h
index ec96fd45865..229854a19db 100644
--- a/gcc/config/avr/avr-protos.h
+++ b/gcc/config/avr/avr-protos.h
@@ -58,6 +58,8 @@  extern const char *ret_cond_branch (rtx x, int len, int reverse);
 extern const char *avr_out_movpsi (rtx_insn *, rtx*, int*);
 extern const char *avr_out_sign_extend (rtx_insn *, rtx*, int*);
 extern const char *avr_out_insert_notbit (rtx_insn *, rtx*, rtx, int*);
+extern const char *avr_out_extr (rtx_insn *, rtx*, int*);
+extern const char *avr_out_extr_not (rtx_insn *, rtx*, int*);
 
 extern const char *ashlqi3_out (rtx_insn *insn, rtx operands[], int *len);
 extern const char *ashlhi3_out (rtx_insn *insn, rtx operands[], int *len);
diff --git a/gcc/config/avr/avr.cc b/gcc/config/avr/avr.cc
index a90cade35c7..f69d79bf14e 100644
--- a/gcc/config/avr/avr.cc
+++ b/gcc/config/avr/avr.cc
@@ -7142,9 +7142,9 @@  lshrqi3_out (rtx_insn *insn, rtx operands[], int *len)
 
 	case 7:
 	  *len = 3;
-	  return ("rol %0" CR_TAB
-		  "clr %0" CR_TAB
-		  "rol %0");
+	  return ("bst %1,7" CR_TAB
+		  "clr %0"   CR_TAB
+		  "bld %0,0");
 	}
     }
   else if (CONSTANT_P (operands[2]))
@@ -7401,10 +7401,10 @@  lshrhi3_out (rtx_insn *insn, rtx operands[], int *len)
 
 	case 15:
 	  *len = 4;
-	  return ("clr %A0" CR_TAB
-		  "lsl %B0" CR_TAB
-		  "rol %A0" CR_TAB
-		  "clr %B0");
+	  return ("bst %B1,7" CR_TAB
+		  "clr %A0"   CR_TAB
+		  "clr %B0"   CR_TAB
+		  "bld %A0,0");
 	}
       len = t;
     }
@@ -7453,11 +7453,11 @@  avr_out_lshrpsi3 (rtx_insn *insn, rtx *op, int *plen)
           /* fall through */
 
         case 23:
-          return avr_asm_len ("clr %A0"    CR_TAB
-                              "sbrc %C0,7" CR_TAB
-                              "inc %A0"    CR_TAB
-                              "clr %B0"    CR_TAB
-                              "clr %C0", op, plen, 5);
+          return avr_asm_len ("bst %C1,7" CR_TAB
+                              "clr %A0"   CR_TAB
+                              "clr %B0"   CR_TAB
+                              "clr %C0"   CR_TAB
+                              "bld %A0,0", op, plen, 5);
         } /* switch */
     }
 
@@ -7540,13 +7540,19 @@  lshrsi3_out (rtx_insn *insn, rtx operands[], int *len)
 			    "clr %D0");
 
 	case 31:
+	  if (AVR_HAVE_MOVW)
+	    return *len = 5, ("bst %D1,7"    CR_TAB
+			      "clr %A0"      CR_TAB
+			      "clr %B0"      CR_TAB
+			      "movw %C0,%A0" CR_TAB
+			      "bld %A0,0");
 	  *len = 6;
-	  return ("clr %A0"    CR_TAB
-		  "sbrc %D0,7" CR_TAB
-		  "inc %A0"    CR_TAB
-		  "clr %B0"    CR_TAB
-		  "clr %C0"    CR_TAB
-		  "clr %D0");
+	  return ("bst %D1,7" CR_TAB
+		  "clr %A0"   CR_TAB
+		  "clr %B0"   CR_TAB
+		  "clr %C0"   CR_TAB
+		  "clr %D0"   CR_TAB
+		  "bld %A0,0");
 	}
       len = t;
     }
@@ -8485,6 +8491,135 @@  avr_out_insert_notbit (rtx_insn *insn, rtx operands[], rtx xbitno, int *plen)
 }
 
 
+/* Output instructions to extract a bit to 8-bit register XOP[0].
+   The input XOP[1] is a register or an 8-bit MEM in the lower I/O range.
+   XOP[2] is the const_int bit position.  Return "".
+
+   PLEN != 0: Set *PLEN to the code length in words.  Don't output anything.
+   PLEN == 0: Output instructions.  */
+
+const char*
+avr_out_extr (rtx_insn *insn, rtx xop[], int *plen)
+{
+  rtx dest = xop[0];
+  rtx src = xop[1];
+  int bit = INTVAL (xop[2]);
+
+  if (GET_MODE (src) != QImode)
+    {
+      src = xop[1] = simplify_gen_subreg (QImode, src, GET_MODE (src), bit / 8);
+      bit %= 8;
+      xop[2] = GEN_INT (bit);
+    }
+
+  if (MEM_P (src))
+    {
+      xop[1] = XEXP (src, 0); // address
+      gcc_assert (low_io_address_operand (xop[1], Pmode));
+
+      return avr_asm_len ("clr %0"      CR_TAB
+			  "sbic %i1,%2" CR_TAB
+			  "inc %0", xop, plen, -3);
+    }
+
+  gcc_assert (REG_P (src));
+
+  bool ld_dest_p = test_hard_reg_class (LD_REGS, dest);
+  bool ld_src_p = test_hard_reg_class (LD_REGS, src);
+
+  if (ld_dest_p
+      && REGNO (src) == REGNO (dest))
+    {
+      if (bit == 0)
+	return avr_asm_len ("andi %0,1", xop, plen, -1);
+      if (bit == 1)
+	return avr_asm_len ("lsr %0" CR_TAB
+			    "andi %0,1", xop, plen, -2);
+      if (bit == 4)
+	return avr_asm_len ("swap %0" CR_TAB
+			    "andi %0,1", xop, plen, -2);
+    }
+
+  if (bit == 0
+      && REGNO (src) != REGNO (dest))
+  {
+    if (ld_dest_p)
+      return avr_asm_len ("mov %0,%1" CR_TAB
+			  "andi %0,1", xop, plen, -2);
+    if (ld_src_p
+	&& reg_unused_after (insn, src))
+      return avr_asm_len ("andi %1,1" CR_TAB
+			  "mov %0,%1", xop, plen, -2);
+  }
+
+  return avr_asm_len ("bst %1,%2" CR_TAB
+		      "clr %0"    CR_TAB
+		      "bld %0,0", xop, plen, -3);
+}
+
+
+/* Output instructions to extract a negated bit to 8-bit register XOP[0].
+   The input XOP[1] is an 8-bit register or MEM in the lower I/O range.
+   XOP[2] is the const_int bit position.  Return "".
+
+   PLEN != 0: Set *PLEN to the code length in words.  Don't output anything.
+   PLEN == 0: Output instructions.  */
+
+const char*
+avr_out_extr_not (rtx_insn* /* insn */, rtx xop[], int *plen)
+{
+  rtx dest = xop[0];
+  rtx src = xop[1];
+  int bit = INTVAL (xop[2]);
+
+  if (MEM_P (src))
+    {
+      xop[1] = XEXP (src, 0); // address
+      gcc_assert (low_io_address_operand (xop[1], Pmode));
+
+      return avr_asm_len ("clr %0"      CR_TAB
+			  "sbis %i1,%2" CR_TAB
+			  "inc %0", xop, plen, -3);
+    }
+
+  gcc_assert (REG_P (src));
+
+  bool ld_src_p = test_hard_reg_class (LD_REGS, src);
+
+  if (ld_src_p
+      && REGNO (src) == REGNO (dest))
+    {
+      if (bit == 0)
+	return avr_asm_len ("inc %0" CR_TAB
+			    "andi %0,1", xop, plen, -2);
+      if (bit == 1)
+	return avr_asm_len ("lsr %0" CR_TAB
+			    "inc %0" CR_TAB
+			    "andi %0,1", xop, plen, -3);
+      if (bit == 4)
+	return avr_asm_len ("swap %0" CR_TAB
+			    "inc %0"  CR_TAB
+			    "andi %0,1", xop, plen, -3);
+    }
+
+  if (bit == 7
+      && ld_src_p)
+    return avr_asm_len ("cpi %1,0x80" CR_TAB
+			"sbc %0,%0"   CR_TAB
+			"neg %0", xop, plen, -3);
+
+  if (REGNO (src) != REGNO (dest))
+    return avr_asm_len ("clr %0"     CR_TAB
+			"sbrs %1,%2" CR_TAB
+			"inc %0", xop, plen, -3);
+
+  return avr_asm_len ("clr __tmp_reg__" CR_TAB
+		      "sbrs %1,%2"      CR_TAB
+		      "inc __tmp_reg__" CR_TAB
+		      "mov %0,__tmp_reg__", xop, plen, -4);
+}
+
+
 /* Outputs instructions needed for fixed point type conversion.
    This includes converting between any fixed point type, as well
    as converting to any integer type.  Conversion between integer
@@ -9282,6 +9417,8 @@  avr_adjust_insn_length (rtx_insn *insn, int len)
     case ADJUST_LEN_RELOAD_IN32: output_reload_insisf (op, op[2], &len); break;
 
     case ADJUST_LEN_OUT_BITOP: avr_out_bitop (insn, op, &len); break;
+    case ADJUST_LEN_EXTR_NOT: avr_out_extr_not (insn, op, &len); break;
+    case ADJUST_LEN_EXTR: avr_out_extr (insn, op, &len); break;
 
     case ADJUST_LEN_PLUS: avr_out_plus (insn, op, &len); break;
     case ADJUST_LEN_ADDTO_SP: avr_out_addto_sp (op, &len); break;
@@ -10865,6 +11002,16 @@  avr_rtx_costs_1 (rtx x, machine_mode mode, int outer_code,
           *total = COSTS_N_INSNS (2);
           return true;
         }
+      if (AND == code
+          && single_one_operand (XEXP (x, 1), mode)
+          && (ASHIFT == GET_CODE (XEXP (x, 0))
+              || ASHIFTRT == GET_CODE (XEXP (x, 0))
+              || LSHIFTRT == GET_CODE (XEXP (x, 0))))
+        {
+          // "*insv.any_shift.<mode>"
+          *total = COSTS_N_INSNS (1 + GET_MODE_SIZE (mode));
+          return true;
+        }
       *total = COSTS_N_INSNS (GET_MODE_SIZE (mode));
       *total += avr_operand_rtx_cost (XEXP (x, 0), mode, code, 0, speed);
       if (!CONST_INT_P (XEXP (x, 1)))
diff --git a/gcc/config/avr/avr.md b/gcc/config/avr/avr.md
index 9f5fabc861f..01ebdf1c322 100644
--- a/gcc/config/avr/avr.md
+++ b/gcc/config/avr/avr.md
@@ -155,7 +155,7 @@  (define_attr "length" ""
 ;; Otherwise do special processing depending on the attribute.
 
 (define_attr "adjust_len"
-  "out_bitop, plus, addto_sp, sext,
+  "out_bitop, plus, addto_sp, sext, extr, extr_not,
    tsthi, tstpsi, tstsi, compare, compare64, call,
    mov8, mov16, mov24, mov32, reload_in16, reload_in24, reload_in32,
    ufract, sfract, round,
@@ -272,12 +272,25 @@  (define_mode_iterator ORDERED234 [HI SI PSI
 (define_mode_iterator SPLIT34 [SI SF PSI
                                SQ USQ SA USA])
 
+;; Where the most significant bit is located.
+(define_mode_attr MSB  [(QI "7") (QQ "7") (UQQ "7")
+                        (HI "15") (HQ "15") (UHQ "15") (HA "15") (UHA "15")
+                        (PSI "23")
+                        (SI "31") (SQ "31") (USQ "31") (SA "31") (USA "31") (SF "31")])
+
+;; Size in bytes of the mode.
+(define_mode_attr SIZE [(QI "1") (QQ "1") (UQQ "1")
+                        (HI "2") (HQ "2") (UHQ "2") (HA "2") (UHA "2")
+                        (PSI "3")
+                        (SI "4") (SQ "4") (USQ "4") (SA "4") (USA "4") (SF "4")])
+
 ;; Define code iterators
 ;; Define two incarnations so that we can build the cross product.
 (define_code_iterator any_extend  [sign_extend zero_extend])
 (define_code_iterator any_extend2 [sign_extend zero_extend])
 (define_code_iterator any_extract [sign_extract zero_extract])
 (define_code_iterator any_shiftrt [lshiftrt ashiftrt])
+(define_code_iterator any_shift   [lshiftrt ashiftrt ashift])
 
 (define_code_iterator piaop [plus ior and])
 (define_code_iterator bitop [xor ior and])
@@ -5696,9 +5709,9 @@  (define_split	; lshrqi3_const6
 ;; "*lshrqq3"
 ;; "*lshruqq3"
 (define_insn_and_split "*lshr<mode>3_split"
-  [(set (match_operand:ALL1 0 "register_operand"                  "=r,r,r,r,!d,r,r")
-        (lshiftrt:ALL1 (match_operand:ALL1 1 "register_operand"    "0,0,0,0,0 ,0,0")
-                       (match_operand:QI 2 "nop_general_operand"   "r,L,P,K,n ,n,Qm")))]
+  [(set (match_operand:ALL1 0 "register_operand"                  "=r,r,r,r,r  ,!d,r,r")
+        (lshiftrt:ALL1 (match_operand:ALL1 1 "register_operand"    "0,0,0,0,r  ,0 ,0,0")
+                       (match_operand:QI 2 "nop_general_operand"   "r,L,P,K,C07,n ,n,Qm")))]
   ""
   "#"
   "&& reload_completed"
@@ -5708,24 +5721,23 @@  (define_insn_and_split "*lshr<mode>3_split"
               (clobber (reg:CC REG_CC))])])
 
 (define_insn "*lshr<mode>3"
-  [(set (match_operand:ALL1 0 "register_operand"                  "=r,r,r,r,!d,r,r")
-        (lshiftrt:ALL1 (match_operand:ALL1 1 "register_operand"    "0,0,0,0,0 ,0,0")
-                       (match_operand:QI 2 "nop_general_operand"   "r,L,P,K,n ,n,Qm")))
+  [(set (match_operand:ALL1 0 "register_operand"                  "=r,r,r,r,r  ,!d,r,r")
+        (lshiftrt:ALL1 (match_operand:ALL1 1 "register_operand"    "0,0,0,0,r  ,0 ,0,0")
+                       (match_operand:QI 2 "nop_general_operand"   "r,L,P,K,C07,n ,n,Qm")))
    (clobber (reg:CC REG_CC))]
   "reload_completed"
   {
     return lshrqi3_out (insn, operands, NULL);
   }
-  [(set_attr "length" "5,0,1,2,4,6,9")
-   (set_attr "adjust_len" "lshrqi")])
+  [(set_attr "adjust_len" "lshrqi")])
 
 ;; "lshrhi3"
 ;; "lshrhq3"  "lshruhq3"
 ;; "lshrha3"  "lshruha3"
 (define_insn_and_split "lshr<mode>3"
-  [(set (match_operand:ALL2 0 "register_operand"                "=r,r,r,r,r,r,r")
-        (lshiftrt:ALL2 (match_operand:ALL2 1 "register_operand"    "0,0,0,r,0,0,0")
-                       (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))]
+  [(set (match_operand:ALL2 0 "register_operand"                "=r,r,r,r,r  ,r,r,r")
+        (lshiftrt:ALL2 (match_operand:ALL2 1 "register_operand"  "0,0,0,r,r  ,0,0,0")
+                       (match_operand:QI 2 "nop_general_operand" "r,L,P,O,C15,K,n,Qm")))]
   ""
   "#"
   "&& reload_completed"
@@ -5735,22 +5747,21 @@  (define_insn_and_split "lshr<mode>3"
               (clobber (reg:CC REG_CC))])])
 
 (define_insn "*lshr<mode>3"
-  [(set (match_operand:ALL2 0 "register_operand"                "=r,r,r,r,r,r,r")
-        (lshiftrt:ALL2 (match_operand:ALL2 1 "register_operand"    "0,0,0,r,0,0,0")
-                       (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))
+  [(set (match_operand:ALL2 0 "register_operand"                "=r,r,r,r,r  ,r,r,r")
+        (lshiftrt:ALL2 (match_operand:ALL2 1 "register_operand"  "0,0,0,r,r  ,0,0,0")
+                       (match_operand:QI 2 "nop_general_operand" "r,L,P,O,C15,K,n,Qm")))
    (clobber (reg:CC REG_CC))]
   "reload_completed"
   {
     return lshrhi3_out (insn, operands, NULL);
   }
-  [(set_attr "length" "6,0,2,2,4,10,10")
-   (set_attr "adjust_len" "lshrhi")])
+  [(set_attr "adjust_len" "lshrhi")])
 
 (define_insn_and_split "lshrpsi3"
-  [(set (match_operand:PSI 0 "register_operand"                 "=r,r,r,r,r")
-        (lshiftrt:PSI (match_operand:PSI 1 "register_operand"    "0,0,r,0,0")
-                      (match_operand:QI 2 "nonmemory_operand"    "r,P,O,K,n")))
-   (clobber (match_scratch:QI 3                                 "=X,X,X,X,&d"))]
+  [(set (match_operand:PSI 0 "register_operand"                 "=r,r,r,r  ,r,r")
+        (lshiftrt:PSI (match_operand:PSI 1 "register_operand"    "0,0,r,r  ,0,0")
+                      (match_operand:QI 2 "nonmemory_operand"    "r,P,O,C23,K,n")))
+   (clobber (match_scratch:QI 3                                 "=X,X,X,X  ,X,&d"))]
   ""
   "#"
   "&& reload_completed"
@@ -5761,10 +5772,10 @@  (define_insn_and_split "lshrpsi3"
               (clobber (reg:CC REG_CC))])])
 
 (define_insn "*lshrpsi3"
-  [(set (match_operand:PSI 0 "register_operand"                 "=r,r,r,r,r")
-        (lshiftrt:PSI (match_operand:PSI 1 "register_operand"    "0,0,r,0,0")
-                      (match_operand:QI 2 "nonmemory_operand"    "r,P,O,K,n")))
-   (clobber (match_scratch:QI 3                                 "=X,X,X,X,&d"))
+  [(set (match_operand:PSI 0 "register_operand"                 "=r,r,r,r  ,r,r")
+        (lshiftrt:PSI (match_operand:PSI 1 "register_operand"    "0,0,r,r  ,0,0")
+                      (match_operand:QI 2 "nonmemory_operand"    "r,P,O,C23,K,n")))
+   (clobber (match_scratch:QI 3                                 "=X,X,X,X  ,X,&d"))
    (clobber (reg:CC REG_CC))]
   "reload_completed"
   {
@@ -5776,9 +5787,9 @@  (define_insn "*lshrpsi3"
 ;; "lshrsq3"  "lshrusq3"
 ;; "lshrsa3"  "lshrusa3"
 (define_insn_and_split "lshr<mode>3"
-  [(set (match_operand:ALL4 0 "register_operand"                  "=r,r,r,r,r,r,r")
-        (lshiftrt:ALL4 (match_operand:ALL4 1 "register_operand"    "0,0,0,r,0,0,0")
-                       (match_operand:QI 2 "nop_general_operand"   "r,L,P,O,K,n,Qm")))]
+  [(set (match_operand:ALL4 0 "register_operand"                  "=r,r,r,r,r,r  ,r,r")
+        (lshiftrt:ALL4 (match_operand:ALL4 1 "register_operand"    "0,0,0,r,0,r  ,0,0")
+                       (match_operand:QI 2 "nop_general_operand"   "r,L,P,O,K,C31,n,Qm")))]
   ""
   "#"
   "&& reload_completed"
@@ -5788,16 +5799,15 @@  (define_insn_and_split "lshr<mode>3"
               (clobber (reg:CC REG_CC))])])
 
 (define_insn "*lshr<mode>3"
-  [(set (match_operand:ALL4 0 "register_operand"                  "=r,r,r,r,r,r,r")
-        (lshiftrt:ALL4 (match_operand:ALL4 1 "register_operand"    "0,0,0,r,0,0,0")
-                       (match_operand:QI 2 "nop_general_operand"   "r,L,P,O,K,n,Qm")))
+  [(set (match_operand:ALL4 0 "register_operand"                  "=r,r,r,r,r,r  ,r,r")
+        (lshiftrt:ALL4 (match_operand:ALL4 1 "register_operand"    "0,0,0,r,0,r  ,0,0")
+                       (match_operand:QI 2 "nop_general_operand"   "r,L,P,O,K,C31,n,Qm")))
    (clobber (reg:CC REG_CC))]
   "reload_completed"
   {
     return lshrsi3_out (insn, operands, NULL);
   }
-  [(set_attr "length" "8,0,4,4,8,10,12")
-   (set_attr "adjust_len" "lshrsi")])
+  [(set_attr "adjust_len" "lshrsi")])
 
 ;; Optimize if a scratch register from LD_REGS happens to be available.
 
@@ -5856,7 +5866,7 @@  (define_peephole2 ; lshrqi3_l_const6
     operands[2] = avr_to_int_mode (operands[0]);
   })
 
-(define_peephole2
+(define_peephole2 ; "*lshrhi3_const"
   [(match_scratch:QI 3 "d")
    (parallel [(set (match_operand:ALL2 0 "register_operand" "")
                    (lshiftrt:ALL2 (match_operand:ALL2 1 "register_operand" "")
@@ -5873,10 +5883,10 @@  (define_peephole2
 ;; "*lshrhq3_const"  "*lshruhq3_const"
 ;; "*lshrha3_const"  "*lshruha3_const"
 (define_insn_and_split "*lshr<mode>3_const_split"
-  [(set (match_operand:ALL2 0 "register_operand"                "=r,r,r,r,r")
-        (lshiftrt:ALL2 (match_operand:ALL2 1 "register_operand"  "0,0,r,0,0")
-                       (match_operand:QI 2 "const_int_operand"   "L,P,O,K,n")))
-   (clobber (match_scratch:QI 3                                 "=X,X,X,X,&d"))]
+  [(set (match_operand:ALL2 0 "register_operand"                "=r,r,r,r  ,r,r")
+        (lshiftrt:ALL2 (match_operand:ALL2 1 "register_operand"  "0,0,r,r  ,0,0")
+                       (match_operand:QI 2 "const_int_operand"   "L,P,O,C15,K,n")))
+   (clobber (match_scratch:QI 3                                 "=X,X,X,X  ,X,&d"))]
   "reload_completed"
   "#"
   "&& reload_completed"
@@ -5887,19 +5897,18 @@  (define_insn_and_split "*lshr<mode>3_const_split"
               (clobber (reg:CC REG_CC))])])
 
 (define_insn "*lshr<mode>3_const"
-  [(set (match_operand:ALL2 0 "register_operand"                "=r,r,r,r,r")
-        (lshiftrt:ALL2 (match_operand:ALL2 1 "register_operand"  "0,0,r,0,0")
-                       (match_operand:QI 2 "const_int_operand"   "L,P,O,K,n")))
-   (clobber (match_scratch:QI 3                                 "=X,X,X,X,&d"))
+  [(set (match_operand:ALL2 0 "register_operand"                "=r,r,r,r  ,r,r")
+        (lshiftrt:ALL2 (match_operand:ALL2 1 "register_operand"  "0,0,r,r  ,0,0")
+                       (match_operand:QI 2 "const_int_operand"   "L,P,O,C15,K,n")))
+   (clobber (match_scratch:QI 3                                 "=X,X,X,X  ,X,&d"))
    (clobber (reg:CC REG_CC))]
   "reload_completed"
   {
     return lshrhi3_out (insn, operands, NULL);
   }
-  [(set_attr "length" "0,2,2,4,10")
-   (set_attr "adjust_len" "lshrhi")])
+  [(set_attr "adjust_len" "lshrhi")])
 
-(define_peephole2
+(define_peephole2 ; "*lshrsi3_const"
   [(match_scratch:QI 3 "d")
    (parallel [(set (match_operand:ALL4 0 "register_operand" "")
                    (lshiftrt:ALL4 (match_operand:ALL4 1 "register_operand" "")
@@ -5916,10 +5925,10 @@  (define_peephole2
 ;; "*lshrsq3_const"  "*lshrusq3_const"
 ;; "*lshrsa3_const"  "*lshrusa3_const"
 (define_insn_and_split "*lshr<mode>3_const_split"
-  [(set (match_operand:ALL4 0 "register_operand"               "=r,r,r,r")
-        (lshiftrt:ALL4 (match_operand:ALL4 1 "register_operand" "0,0,r,0")
-                       (match_operand:QI 2 "const_int_operand"  "L,P,O,n")))
-   (clobber (match_scratch:QI 3                                "=X,X,X,&d"))]
+  [(set (match_operand:ALL4 0 "register_operand"               "=r,r,r,r  ,r")
+        (lshiftrt:ALL4 (match_operand:ALL4 1 "register_operand" "0,0,r,r  ,0")
+                       (match_operand:QI 2 "const_int_operand"  "L,P,O,C31,n")))
+   (clobber (match_scratch:QI 3                                "=X,X,X,X  ,&d"))]
   "reload_completed"
   "#"
   "&& reload_completed"
@@ -5930,17 +5939,16 @@  (define_insn_and_split "*lshr<mode>3_const_split"
               (clobber (reg:CC REG_CC))])])
 
 (define_insn "*lshr<mode>3_const"
-  [(set (match_operand:ALL4 0 "register_operand"               "=r,r,r,r")
-        (lshiftrt:ALL4 (match_operand:ALL4 1 "register_operand" "0,0,r,0")
-                       (match_operand:QI 2 "const_int_operand"  "L,P,O,n")))
-   (clobber (match_scratch:QI 3                                "=X,X,X,&d"))
+  [(set (match_operand:ALL4 0 "register_operand"               "=r,r,r,r  ,r")
+        (lshiftrt:ALL4 (match_operand:ALL4 1 "register_operand" "0,0,r,r  ,0")
+                       (match_operand:QI 2 "const_int_operand"  "L,P,O,C31,n")))
+   (clobber (match_scratch:QI 3                                "=X,X,X,X  ,&d"))
    (clobber (reg:CC REG_CC))]
   "reload_completed"
   {
     return lshrsi3_out (insn, operands, NULL);
   }
-  [(set_attr "length" "0,4,4,10")
-   (set_attr "adjust_len" "lshrsi")])
+  [(set_attr "adjust_len" "lshrsi")])
 
 ;; abs(x) abs(x) abs(x) abs(x) abs(x) abs(x) abs(x) abs(x) abs(x) abs(x) abs(x)
 ;; abs
@@ -9533,17 +9541,17 @@  (define_peephole2
               (clobber (reg:CC REG_CC))])])
 
 
-(define_expand "extzv"
-  [(set (match_operand:QI 0 "register_operand" "")
-        (zero_extract:QI (match_operand:QI 1 "register_operand"  "")
-                         (match_operand:QI 2 "const1_operand" "")
-                         (match_operand:QI 3 "const_0_to_7_operand" "")))])
+(define_expand "extzv<mode>"
+  [(set (match_operand:QI 0 "register_operand")
+        (zero_extract:QI (match_operand:QISI 1 "register_operand")
+                         (match_operand:QI 2 "const1_operand")
+                         (match_operand:QI 3 "const_0_to_<MSB>_operand")))])
 
-(define_insn_and_split "*extzv_split"
-  [(set (match_operand:QI 0 "register_operand"                   "=*d,*d,*d,*d,r")
-        (zero_extract:QI (match_operand:QI 1 "register_operand"     "0,r,0,0,r")
+(define_insn_and_split "*extzv<mode>_split"
+  [(set (match_operand:QI 0 "register_operand" "=r")
+        (zero_extract:QI (match_operand:QISI 1 "reg_or_low_io_operand" "r Yil")
                          (const_int 1)
-                         (match_operand:QI 2 "const_0_to_7_operand" "L,L,P,C04,n")))]
+                         (match_operand:QI 2 "const_0_to_<MSB>_operand" "n")))]
   ""
   "#"
   "&& reload_completed"
@@ -9551,22 +9559,28 @@  (define_insn_and_split "*extzv_split"
                    (zero_extract:QI (match_dup 1)
                                     (const_int 1)
                                     (match_dup 2)))
-              (clobber (reg:CC REG_CC))])])
+              (clobber (reg:CC REG_CC))])]
+  {
+    if (! MEM_P (operands[1]))
+      {
+        int bitno = INTVAL (operands[2]);
+        operands[1] = simplify_gen_subreg (QImode, operands[1], <MODE>mode, bitno / 8);
+        operands[2] = GEN_INT (bitno % 8);
+      }
+  })
 
-(define_insn "*extzv"
-  [(set (match_operand:QI 0 "register_operand"                   "=*d,*d,*d,*d,r")
-        (zero_extract:QI (match_operand:QI 1 "register_operand"     "0,r,0,0,r")
+(define_insn "*extzv<mode>"
+  [(set (match_operand:QI 0 "register_operand" "=r")
+        (zero_extract:QI (match_operand:QISI 1 "reg_or_low_io_operand" "r Yil")
                          (const_int 1)
-                         (match_operand:QI 2 "const_0_to_7_operand" "L,L,P,C04,n")))
+                         (match_operand:QI 2 "const_0_to_<MSB>_operand" "n")))
    (clobber (reg:CC REG_CC))]
   "reload_completed"
-  "@
-	andi %0,1
-	mov %0,%1\;andi %0,1
-	lsr %0\;andi %0,1
-	swap %0\;andi %0,1
-	bst %1,%2\;clr %0\;bld %0,0"
-  [(set_attr "length" "1,2,2,2,3")])
+  {
+    return avr_out_extr (insn, operands, nullptr);
+  }
+  [(set_attr "adjust_len" "extr")])
+
 
 (define_insn_and_split "*extzv.qihi1"
   [(set (match_operand:HI 0 "register_operand"                     "=r")
@@ -9587,16 +9601,197 @@  (define_insn_and_split "*extzv.qihi1"
     operands[4] = simplify_gen_subreg (QImode, operands[0], HImode, 1);
   })
 
-(define_insn_and_split "*extzv.qihi2"
-  [(set (match_operand:HI 0 "register_operand"                      "=r")
-        (zero_extend:HI
-         (zero_extract:QI (match_operand:QI 1 "register_operand"     "r")
+(define_insn_and_split "*extzv.not_split"
+  [(set (match_operand:QI 0 "register_operand" "=r")
+        (zero_extract:QI (not:QI (match_operand:QI 1 "reg_or_low_io_operand" "r Yil"))
+                         (const_int 1)
+                         (match_operand:QI 2 "const_0_to_7_operand" "n")))]
+  ""
+  "#"
+  "&& reload_completed"
+  [(parallel [(set (match_dup 0)
+                   (zero_extract:QI (not:QI (match_dup 1))
+                                    (const_int 1)
+                                    (match_dup 2)))
+              (clobber (reg:CC REG_CC))])])
+
+(define_insn "*extzv.not"
+  [(set (match_operand:QI 0 "register_operand" "=r")
+        (zero_extract:QI (not:QI (match_operand:QI 1 "reg_or_low_io_operand" "r Yil"))
+                         (const_int 1)
+                         (match_operand:QI 2 "const_0_to_7_operand" "n")))
+   (clobber (reg:CC REG_CC))]
+  "reload_completed"
+  {
+    return avr_out_extr_not (insn, operands, nullptr);
+  }
+  [(set_attr "adjust_len" "extr_not")])
+
+(define_insn_and_split "*extzv.subreg.<mode>"
+  [(set (match_operand:QI 0 "register_operand"                                "=r")
+        (subreg:QI (zero_extract:HISI (match_operand:HISI 1 "register_operand" "r")
+                                      (const_int 1)
+                                      (match_operand:QI 2 "const_0_to_<MSB>_operand" "n"))
+                   0))]
+   "! reload_completed"
+   { gcc_unreachable(); }
+   "&& 1"
+   [; "*extzv<mode>_split"
+    (set (match_dup 0)
+         (zero_extract:QI (match_dup 1)
                           (const_int 1)
-                          (match_operand:QI 2 "const_0_to_7_operand" "n"))))]
+                          (match_dup 2)))])
+
+;; Possible subreg bytes.
+(define_int_iterator SuReB [0 1 2 3])
+
+(define_insn_and_split "*extzv.<mode>.subreg<SuReB>"
+  [(set (match_operand:QI 0 "register_operand" "=r")
+        (zero_extract:QI (subreg:QI
+                          (and:HISI (match_operand:HISI 1 "register_operand" "r")
+                                    (match_operand:HISI 2 "single_one_operand" "n"))
+                          SuReB)
+                         (const_int 1)
+                         (match_operand:QI 3 "const_0_to_7_operand" "n")))]
+  "! reload_completed
+   && IN_RANGE (UINTVAL (operands[2]) & GET_MODE_MASK (<MODE>mode),
+                1U << (8 * <SuReB>), 0x80U << (8 * <SuReB>))
+   && exact_log2 (UINTVAL (operands[2]) & GET_MODE_MASK (<MODE>mode))
+      == 8 * <SuReB> + INTVAL (operands[3])"
+  { gcc_unreachable(); }
+  "&& 1"
+  [; "*extzv<mode>_split"
+   (set (match_dup 0)
+        (zero_extract:QI (match_dup 1)
+                         (const_int 1)
+                         (match_dup 4)))]
+  {
+    operands[4] = plus_constant (QImode, operands[3], 8 * <SuReB>);
+  })
+
+
+(define_insn_and_split "*extzv.xor"
+  [(set (match_operand:QI 0 "register_operand")
+        (zero_extract:QI (xor:QI (match_operand:QI 1 "reg_or_low_io_operand")
+                                 (match_operand:QI 2 "single_one_operand"))
+                         (const_int 1)
+                         (match_operand:QI 3 "const_0_to_7_operand")))]
+  "! reload_completed
+   && ((1 << INTVAL (operands[3])) & INTVAL (operands[2])) != 0"
+  { gcc_unreachable(); }
+  "&& 1"
+  [; "*extzv.not_split"
+   (set (match_dup 0)
+        (zero_extract:QI (not:QI (match_dup 1))
+                         (const_int 1)
+                         (match_dup 3)))])
+
+
+(define_insn_and_split "*extzv<mode>.ge"
+  [(set (match_operand:QI 0 "register_operand" "=r")
+        (ge:QI (match_operand:QISI 1 "reg_or_low_io_operand" "r Yil")
+               (match_operand:QISI 2 "const0_operand" "Y00")))]
   ""
   "#"
+  "reload_completed"
+  [; "*extzv.not"
+   (parallel [(set (match_dup 0)
+                   (zero_extract:QI (not:QI (match_dup 1))
+                                    (const_int 1)
+                                    (const_int 7)))
+              (clobber (reg:CC REG_CC))])]
+  {
+    if (! MEM_P (operands[1]))
+      {
+        int msb = <SIZE> - 1;
+        operands[1] = simplify_gen_subreg (QImode, operands[1], <MODE>mode, msb);
+      }
+  })
+
+(define_insn_and_split "*neg.ashiftrt<mode>.msb"
+  [(set (match_operand:QI 0 "register_operand" "=r")
+        (neg:QI (subreg:QI
+                 (ashiftrt:QISI (match_operand:QISI 1 "register_operand" "r")
+                                (match_operand:QI 2 "const<MSB>_operand" "n"))
+                 0)))]
+  "! reload_completed"
+  { gcc_unreachable(); }
+  "&& 1"
+  [; "*extzv<mode>_split"
+   (set (match_dup 0)
+        (zero_extract:QI (match_dup 1)
+                         (const_int 1)
+                         (match_dup 2)))])
+
+(define_insn_and_split "*extzv.io.lsr7"
+  [(set (match_operand:QI 0 "register_operand")
+        (lshiftrt:QI (match_operand:QI 1 "reg_or_low_io_operand")
+                     (const_int 7)))]
+  "! reload_completed"
+  { gcc_unreachable(); }
+  "&& 1"
+  [; "*extzvqi_split"
+   (set (match_dup 0)
+        (zero_extract:QI (match_dup 1)
+                         (const_int 1)
+                         (const_int 7)))])
+
+(define_insn_and_split "*insv.any_shift.<mode>_split"
+  [(set (match_operand:QISI 0 "register_operand" "=r")
+        (and:QISI (any_shift:QISI (match_operand:QISI 1 "register_operand" "r")
+                                  (match_operand:QI 2 "const_0_to_<MSB>_operand" "n"))
+                  (match_operand:QISI 3 "single_one_operand" "n")))]
   ""
-  [(set (match_dup 3)
+  "#"
+  "&& reload_completed"
+  [(parallel [(set (match_dup 0)
+                   (and:QISI (any_shift:QISI (match_dup 1)
+                                             (match_dup 2))
+                             (match_dup 3)))
+              (clobber (reg:CC REG_CC))])])
+
+(define_insn "*insv.any_shift.<mode>"
+  [(set (match_operand:QISI 0 "register_operand" "=r")
+        (and:QISI (any_shift:QISI (match_operand:QISI 1 "register_operand" "r")
+                                  (match_operand:QI 2 "const_0_to_<MSB>_operand" "n"))
+                  (match_operand:QISI 3 "single_one_operand" "n")))
+   (clobber (reg:CC REG_CC))]
+  "reload_completed"
+  {
+    int shift = <CODE> == ASHIFT ? INTVAL (operands[2]) : -INTVAL (operands[2]);
+    int mask = GET_MODE_MASK (<MODE>mode) & INTVAL (operands[3]);
+    // Position of the output / input bit, respectively.
+    int obit = exact_log2 (mask);
+    int ibit = obit - shift;
+    gcc_assert (IN_RANGE (obit, 0, <MSB>));
+    gcc_assert (IN_RANGE (ibit, 0, <MSB>));
+    operands[3] = GEN_INT (obit);
+    operands[2] = GEN_INT (ibit);
+
+    if (<SIZE> == 1) return "bst %T1%T2\;clr %0\;"                 "bld %T0%T3";
+    if (<SIZE> == 2) return "bst %T1%T2\;clr %A0\;clr %B0\;"       "bld %T0%T3";
+    if (<SIZE> == 3) return "bst %T1%T2\;clr %A0\;clr %B0\;clr %C0\;bld %T0%T3";
+    return AVR_HAVE_MOVW
+      ? "bst %T1%T2\;clr %A0\;clr %B0\;movw %C0,%A0\;"  "bld %T0%T3"
+      : "bst %T1%T2\;clr %A0\;clr %B0\;clr %C0\;clr %D0\;bld %T0%T3";
+  }
+  [(set (attr "length")
+        (minus (symbol_ref "2 + <SIZE>")
+               ; One less if we can use a MOVW to clear.
+               (symbol_ref "<SIZE> == 4 && AVR_HAVE_MOVW")))])
+
+
+(define_insn_and_split "*extzv.<mode>hi2"
+  [(set (match_operand:HI 0 "register_operand"                      "=r")
+        (zero_extend:HI
+         (zero_extract:QI (match_operand:QISI 1 "register_operand"   "r")
+                          (const_int 1)
+                          (match_operand:QI 2 "const_0_to_<MSB>_operand" "n"))))]
+  "! reload_completed"
+  { gcc_unreachable(); }
+  "&& 1"
+  [; "*extzv<mode>_split"
+   (set (match_dup 3)
         (zero_extract:QI (match_dup 1)
                          (const_int 1)
                          (match_dup 2)))
@@ -9610,24 +9805,20 @@  (define_insn_and_split "*extzv.qihi2"
 ;; ??? do_store_flag emits a hard-coded right shift to extract a bit without
 ;; even considering rtx_costs, extzv, or a bit-test.  See PR 55181 for an example.
 (define_insn_and_split "*extract.subreg.bit"
-  [(set (match_operand:QI 0 "register_operand"                                       "=r")
-        (and:QI (subreg:QI (any_shiftrt:HISI (match_operand:HISI 1 "register_operand" "r")
-                                             (match_operand:QI 2 "const_int_operand"  "n"))
-                           0)
+  [(set (match_operand:QI 0 "register_operand"                                   "=r")
+        (and:QI (subreg:QI
+                 (any_shiftrt:HISI (match_operand:HISI 1 "register_operand"       "r")
+                                   (match_operand:QI 2 "const_0_to_<MSB>_operand" "n"))
+                 0)
                 (const_int 1)))]
-  "INTVAL (operands[2]) < GET_MODE_BITSIZE (<MODE>mode)"
+  "! reload_completed"
   { gcc_unreachable(); }
-  "&& reload_completed"
-  [;; "*extzv"
+  "&& 1"
+  [;; "*extzv<mode>_split"
    (set (match_dup 0)
-        (zero_extract:QI (match_dup 3)
+        (zero_extract:QI (match_dup 1)
                          (const_int 1)
-                         (match_dup 4)))]
-  {
-    int bitno = INTVAL (operands[2]);
-    operands[3] = simplify_gen_subreg (QImode, operands[1], <MODE>mode, bitno / 8);
-    operands[4] = GEN_INT (bitno % 8);
-  })
+                         (match_dup 2)))])
 
 
 ;; Fixed-point instructions
diff --git a/gcc/config/avr/constraints.md b/gcc/config/avr/constraints.md
index ac43678872d..6ced8090d33 100644
--- a/gcc/config/avr/constraints.md
+++ b/gcc/config/avr/constraints.md
@@ -133,6 +133,21 @@  (define_constraint "C07"
   (and (match_code "const_int")
        (match_test "ival == 7")))
 
+(define_constraint "C15"
+  "Constant integer 15."
+  (and (match_code "const_int")
+       (match_test "ival == 15")))
+
+(define_constraint "C23"
+  "Constant integer 23."
+  (and (match_code "const_int")
+       (match_test "ival == 23")))
+
+(define_constraint "C31"
+  "Constant integer 31."
+  (and (match_code "const_int")
+       (match_test "ival == 31")))
+
 (define_constraint "Ca1"
   "Constant 1-byte integer that allows AND by means of CLT + BLD."
   (and (match_code "const_int")
@@ -257,3 +272,8 @@  (define_constraint "YIJ"
   "Fixed-point constant from @minus{}0x003f to 0x003f."
   (and (match_code "const_fixed")
        (match_test "IN_RANGE (INTVAL (avr_to_int_mode (op)), -63, 63)")))
+
+(define_constraint "Yil"
+  "Memory in the lower half of the I/O space."
+  (and (match_code "mem")
+       (match_test "low_io_address_operand (XEXP (op, 0), Pmode)")))
diff --git a/gcc/config/avr/predicates.md b/gcc/config/avr/predicates.md
index e352326466f..25767353be1 100644
--- a/gcc/config/avr/predicates.md
+++ b/gcc/config/avr/predicates.md
@@ -50,6 +50,16 @@  (define_special_predicate "low_io_address_operand"
        (and (match_code "symbol_ref")
 	    (match_test "SYMBOL_REF_FLAGS (op) & SYMBOL_FLAG_IO_LOW"))))
 
+;; Return true if OP is a register_operand or low_io_operand.
+(define_predicate "reg_or_low_io_operand"
+  (ior (match_operand 0 "register_operand")
+       (and (match_code "mem")
+            ; Deliberately only allow QImode no matter what the mode of
+            ; the operand is.  This effectively disallows any I/O that
+            ; is not QImode for that operand.
+            (match_test "GET_MODE (op) == QImode")
+            (match_test "low_io_address_operand (XEXP (op, 0), Pmode)"))))
+
 ;; Return true if OP is a valid address for high half of I/O space.
 (define_predicate "high_io_address_operand"
   (and (match_code "const_int")
@@ -86,12 +96,47 @@  (define_predicate "const1_operand"
   (and (match_code "const_int")
        (match_test "op == CONST1_RTX (mode)")))
 
+;; Return 1 if OP is the constant integer 7 for MODE.
+(define_predicate "const7_operand"
+  (and (match_code "const_int")
+       (match_test "INTVAL (op) == 7")))
+
+;; Return 1 if OP is the constant integer 15 for MODE.
+(define_predicate "const15_operand"
+  (and (match_code "const_int")
+       (match_test "INTVAL (op) == 15")))
+
+;; Return 1 if OP is the constant integer 23 for MODE.
+(define_predicate "const23_operand"
+  (and (match_code "const_int")
+       (match_test "INTVAL (op) == 23")))
+
+;; Return 1 if OP is the constant integer 31 for MODE.
+(define_predicate "const31_operand"
+  (and (match_code "const_int")
+       (match_test "INTVAL (op) == 31")))
+
 
 ;; Return 1 if OP is constant integer 0..7 for MODE.
 (define_predicate "const_0_to_7_operand"
   (and (match_code "const_int")
        (match_test "IN_RANGE (INTVAL (op), 0, 7)")))
 
+;; Return 1 if OP is constant integer 0..15 for MODE.
+(define_predicate "const_0_to_15_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 15)")))
+
+;; Return 1 if OP is constant integer 0..23 for MODE.
+(define_predicate "const_0_to_23_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 23)")))
+
+;; Return 1 if OP is constant integer 0..31 for MODE.
+(define_predicate "const_0_to_31_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 31)")))
+
 ;; Return 1 if OP is constant integer 2..7 for MODE.
 (define_predicate "const_2_to_7_operand"
   (and (match_code "const_int")
diff --git a/gcc/testsuite/gcc.target/avr/pr109907.c b/gcc/testsuite/gcc.target/avr/pr109907.c
new file mode 100644
index 00000000000..a7cb47bb131
--- /dev/null
+++ b/gcc/testsuite/gcc.target/avr/pr109907.c
@@ -0,0 +1,156 @@ 
+/* { dg-options { -Os -dp } } */
+/* { dg-final { scan-assembler-not {shr} } } */
+
+typedef __UINT8_TYPE__ uint8_t;
+typedef __UINT16_TYPE__ uint16_t;
+typedef __UINT32_TYPE__ uint32_t;
+
+#define SFR (* (volatile uint8_t*) (0 + __AVR_SFR_OFFSET__))
+
+#define HISFR (* (volatile uint8_t*) (0x20 + __AVR_SFR_OFFSET__))
+
+uint8_t cset_32bit31 (uint32_t x)
+{
+    return (x & (1ul << 31)) ? 1 : 0;
+}
+
+uint8_t cset_24bit23 (__uint24 x)
+{
+    return (x & (1ul << 23)) ? 1 : 0;
+}
+
+uint8_t cset_32bit30 (uint32_t x)
+{
+    return (x & (1ul << 30)) ? 1 : 0;
+}
+
+uint8_t cset_32bit30_not (uint32_t x)
+{
+    return (x & (1ul << 30)) ? 0 : 1;
+}
+
+uint8_t cset_32bit31_not (uint32_t x)
+{
+    return (x & (1ul << 31)) ? 0 : 1;
+}
+
+/*********************************************/
+
+uint8_t cset_sfr_7 (uint32_t x)
+{
+    return (SFR & (1 << 7)) ? 1 : 0;
+}
+
+uint8_t cset_sfr_7not (void)
+{
+    return (SFR & (1 << 7)) ? 0 : 1;
+}
+
+uint8_t cset_sfr_5 (void)
+{
+    return (SFR & (1 << 5)) ? 1 : 0;
+}
+
+uint8_t cset_sfr_5not (void)
+{
+    return (SFR & (1 << 5)) ? 0 : 1;
+}
+
+char zz;
+
+void set0_sfr_5_1 (void)
+{
+    if (SFR & (1 << 5))
+        zz = 0;
+}
+
+void set0_sfr_5_0 (void)
+{
+    if (! (SFR & (1 << 5)))
+        zz = 0;
+}
+
+void set0_sfr_7_1 (void)
+{
+    if (SFR & (1 << 7))
+        zz = 0;
+}
+
+void set0_sfr_7_0 (void)
+{
+    if (! (SFR & (1 << 7)))
+        zz = 0;
+}
+
+void set0_sfr_0_1 (void)
+{
+    if (SFR & (1 << 0))
+        zz = 0;
+}
+
+void set0_sfr_0_0 (void)
+{
+    if (! (SFR & (1 << 0)))
+        zz = 0;
+}
+
+
+/*********************************************/
+
+uint8_t cset_hisfr_7 (uint32_t x)
+{
+    return (HISFR & (1 << 7)) ? 1 : 0;
+}
+
+uint8_t cset_hisfr_7not (void)
+{
+    return (HISFR & (1 << 7)) ? 0 : 1;
+}
+
+uint8_t cset_hisfr_5 (void)
+{
+    return (HISFR & (1 << 5)) ? 1 : 0;
+}
+
+uint8_t cset_hisfr_5not (void)
+{
+    return (HISFR & (1 << 5)) ? 0 : 1;
+}
+
+char zz;
+
+void set0_hisfr_5_1 (void)
+{
+    if (HISFR & (1 << 5))
+        zz = 0;
+}
+
+void set0_hisfr_5_0 (void)
+{
+    if (! (HISFR & (1 << 5)))
+        zz = 0;
+}
+
+void set0_hisfr_7_1 (void)
+{
+    if (HISFR & (1 << 7))
+        zz = 0;
+}
+
+void set0_hisfr_7_0 (void)
+{
+    if (! (HISFR & (1 << 7)))
+        zz = 0;
+}
+
+void set0_hisfr_0_1 (void)
+{
+    if (HISFR & (1 << 0))
+        zz = 0;
+}
+
+void set0_hisfr_0_0 (void)
+{
+    if (! (HISFR & (1 << 0)))
+        zz = 0;
+}
diff --git a/gcc/testsuite/gcc.target/avr/torture/pr109907-1.c b/gcc/testsuite/gcc.target/avr/torture/pr109907-1.c
new file mode 100644
index 00000000000..975afb27700
--- /dev/null
+++ b/gcc/testsuite/gcc.target/avr/torture/pr109907-1.c
@@ -0,0 +1,95 @@ 
+/* { dg-do run } */
+
+#define NI __attribute__((__noinline__,__noclone__))
+#define AI static __inline__ __attribute__((__always_inline__))
+
+typedef __UINT8_TYPE__ uint8_t;
+typedef __UINT16_TYPE__ uint16_t;
+typedef __uint24 uint24_t;
+typedef __UINT32_TYPE__ uint32_t;
+
+AI uint16_t Aswap16 (uint16_t data)
+{
+  uint16_t d = 0;
+  if (data & (1u << 15)) d |= 1u << 6;
+  if (data & (1u << 14)) d |= 1u << 13;
+  if (data & (1u << 13)) d |= 1u << 14;
+  if (data & (1u << 12)) d |= 1u << 15;
+  return d;
+}
+
+AI uint32_t Aswap32 (uint32_t data)
+{
+  uint32_t d = 0;
+  if (data & (1ul << 31)) d |= 1ul << 6;
+  if (data & (1ul << 30)) d |= 1ul << 13;
+  if (data & (1ul << 13)) d |= 1ul << 30;
+  if (data & (1ul << 12)) d |= 1ul << 31;
+  return d;
+}
+
+AI uint24_t Aswap24 (uint24_t data)
+{
+  uint24_t d = 0;
+  if (data & (1ul << 23)) d |= 1ul << 6;
+  if (data & (1ul << 22)) d |= 1ul << 13;
+  if (data & (1ul << 13)) d |= 1ul << 22;
+  if (data & (1ul << 12)) d |= 1ul << 23;
+  return d;
+}
+
+AI uint8_t Aswap8 (uint8_t data)
+{
+  uint8_t d = 0;
+  if (data & (1 << 7)) d |= 1 << 2;
+  if (data & (1 << 6)) d |= 1 << 3;
+  if (data & (1 << 3)) d |= 1 << 6;
+  if (data & (1 << 2)) d |= 1 << 7;
+  return d;
+}
+
+NI uint8_t Nswap8 (uint8_t data) { return Aswap8 (data); }
+NI uint16_t Nswap16 (uint16_t data) { return Aswap16 (data); }
+NI uint24_t Nswap24 (uint24_t data) { return Aswap24 (data); }
+NI uint32_t Nswap32 (uint32_t data) { return Aswap32 (data); }
+
+void test8 (void)
+{
+  if (Nswap8 (0xaa) != Aswap8 (0xaa)) __builtin_abort();
+  if (Nswap8 (0xcc) != Aswap8 (0xcc)) __builtin_abort();
+  if (Nswap8 (0xf0) != Aswap8 (0xf0)) __builtin_abort();
+}
+
+void test16 (void)
+{
+  if (Nswap16 (0xaaaa) != Aswap16 (0xaaaa)) __builtin_abort();
+  if (Nswap16 (0xcccc) != Aswap16 (0xcccc)) __builtin_abort();
+  if (Nswap16 (0xf0f0) != Aswap16 (0xf0f0)) __builtin_abort();
+  if (Nswap16 (0xff00) != Aswap16 (0xff00)) __builtin_abort();
+}
+
+void test24 (void)
+{
+  if (Nswap24 (0xaaaaaa) != Aswap24 (0xaaaaaa)) __builtin_abort();
+  if (Nswap24 (0xcccccc) != Aswap24 (0xcccccc)) __builtin_abort();
+  if (Nswap24 (0xf0f0f0) != Aswap24 (0xf0f0f0)) __builtin_abort();
+  if (Nswap24 (0xfff000) != Aswap24 (0xfff000)) __builtin_abort();
+}
+
+void test32 (void)
+{
+  if (Nswap32 (0xaaaaaaaa) != Aswap32 (0xaaaaaaaa)) __builtin_abort();
+  if (Nswap32 (0xcccccccc) != Aswap32 (0xcccccccc)) __builtin_abort();
+  if (Nswap32 (0xf0f0f0f0) != Aswap32 (0xf0f0f0f0)) __builtin_abort();
+  if (Nswap32 (0xff00ff00) != Aswap32 (0xff00ff00)) __builtin_abort();
+  if (Nswap32 (0xffff0000) != Aswap32 (0xffff0000)) __builtin_abort();
+}
+
+int main (void)
+{
+  test8 ();
+  test16 ();
+  test24 ();
+  test32 ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/avr/torture/pr109907-2.c b/gcc/testsuite/gcc.target/avr/torture/pr109907-2.c
new file mode 100644
index 00000000000..db0cc72e590
--- /dev/null
+++ b/gcc/testsuite/gcc.target/avr/torture/pr109907-2.c
@@ -0,0 +1,294 @@ 
+/* { dg-do run } */
+
+#define NI __attribute__((__noinline__,__noclone__))
+#define AI static __inline__ __attribute__((__always_inline__))
+
+typedef __UINT8_TYPE__ uint8_t;
+typedef __UINT16_TYPE__ uint16_t;
+typedef __uint24 uint24_t;
+typedef __UINT32_TYPE__ uint32_t;
+
+typedef __INT32_TYPE__ int32_t;
+
+#define ADD(W,B,N)                              \
+  do                                            \
+    {                                           \
+      if (!c)                                   \
+        __asm ("sbrc %T1%T2 $ subi %0,%n3"      \
+               : "+d" (b)                       \
+               : "r" (num), "n" (B), "n" (N));  \
+      else                                      \
+        if (num & ((uint##W##_t) 1 << B))       \
+          b += N;                               \
+    } while (0)
+
+NI uint8_t Afun1 (uint32_t num)
+{
+  uint8_t b = 0;
+  int c = 0;
+  ADD (32, 31, 1);
+  ADD (32, 29, 1);
+  ADD (32, 13, 1);
+  return b;
+}
+NI uint8_t Cfun1 (uint32_t num)
+{
+  uint8_t b = 0;
+  int c = 1;
+  ADD (32, 31, 1);
+  ADD (32, 29, 1);
+  ADD (32, 13, 1);
+  return b;
+}
+
+NI uint8_t Afun2 (uint32_t num)
+{
+  uint8_t b = 0;
+  int c = 0;
+  ADD (32, 31, -1);
+  ADD (32, 29, 1);
+  ADD (32, 13, 1);
+  return b;
+}
+NI uint8_t Cfun2 (uint32_t num)
+{
+  uint8_t b = 0;
+  int c = 1;
+  ADD (32, 31, -1);
+  ADD (32, 29, 1);
+  ADD (32, 13, 1);
+  return b;
+}
+
+NI uint8_t Afun3 (uint32_t num)
+{
+  uint8_t b = 0;
+  int c = 0;
+  ADD (32, 13, 1);
+  ADD (32, 29, 1);
+  ADD (32, 31, 1);
+  return b;
+}
+NI uint8_t Cfun3 (uint32_t num)
+{
+  uint8_t b = 0;
+  int c = 1;
+  ADD (32, 13, 1);
+  ADD (32, 29, 1);
+  ADD (32, 31, 1);
+  return b;
+}
+
+NI uint8_t Afun4 (uint32_t num)
+{
+  uint8_t b = 0;
+  int c = 0;
+  ADD (32, 13, -1);
+  ADD (32, 29, 1);
+  ADD (32, 31, -1);
+  return b;
+}
+NI uint8_t Cfun4 (uint32_t num)
+{
+  uint8_t b = 0;
+  int c = 1;
+  ADD (32, 13, -1);
+  ADD (32, 29, 1);
+  ADD (32, 31, -1);
+  return b;
+}
+
+void test32_0 (uint32_t x)
+{
+  if (Afun1 (x) != Cfun1 (x)) __builtin_abort();
+  if (Afun2 (x) != Cfun2 (x)) __builtin_abort();
+  if (Afun3 (x) != Cfun3 (x)) __builtin_abort();
+  if (Afun4 (x) != Cfun4 (x)) __builtin_abort();
+}
+
+void test32_1 (uint32_t x)
+{
+  test32_0 (x);
+  test32_0 (~x);
+}
+
+void test32 (void)
+{
+  test32_1 (0);
+  test32_1 (0x55555555);
+  test32_1 (0xcccccccc);
+  test32_1 (0x0f0f0f0f);
+  test32_1 (0x00ff00ff);
+  test32_1 (0x0000ffff);
+}
+
+/*********************************************************************/
+
+NI uint8_t Afun5 (uint24_t num)
+{
+  uint8_t b = 0;
+  int c = 0;
+  ADD (24, 23, 1);
+  ADD (24, 21, 1);
+  ADD (24, 13, 1);
+  return b;
+}
+NI uint8_t Cfun5 (uint24_t num)
+{
+  uint8_t b = 0;
+  int c = 1;
+  ADD (24, 23, 1);
+  ADD (24, 21, 1);
+  ADD (24, 13, 1);
+  return b;
+}
+
+NI uint8_t Afun6 (uint24_t num)
+{
+  uint8_t b = 0;
+  int c = 0;
+  ADD (24, 23, -1);
+  ADD (24, 21, -1);
+  ADD (24, 13, 1);
+  return b;
+}
+NI uint8_t Cfun6 (uint24_t num)
+{
+  uint8_t b = 0;
+  int c = 1;
+  ADD (24, 23, -1);
+  ADD (24, 21, -1);
+  ADD (24, 13, 1);
+  return b;
+}
+
+NI uint8_t Afun7 (uint24_t num)
+{
+  uint8_t b = 0;
+  int c = 0;
+  ADD (24, 0, 1);
+  ADD (24, 21, 1);
+  ADD (24, 23, 1);
+  return b;
+}
+NI uint8_t Cfun7 (uint24_t num)
+{
+  uint8_t b = 0;
+  int c = 1;
+  ADD (24, 0, 1);
+  ADD (24, 21, 1);
+  ADD (24, 23, 1);
+  return b;
+}
+
+void test24_0 (uint24_t x)
+{
+  if (Afun5 (x) != Cfun5 (x)) __builtin_abort();
+  if (Afun6 (x) != Cfun6 (x)) __builtin_abort();
+  if (Afun7 (x) != Cfun7 (x)) __builtin_abort();
+}
+
+void test24_1 (uint24_t x)
+{
+  test24_0 (x);
+  test24_0 (~x);
+}
+
+void test24 (void)
+{
+  test24_1 (0);
+  test24_1 (0x555555);
+  test24_1 (0xcccccc);
+  test24_1 (07070707);
+  test24_1 (0x0f0f0f);
+  test24_1 (0x000fff);
+}
+
+/*********************************************************************/
+
+NI uint8_t Afun15 (uint16_t num)
+{
+  uint8_t b = 0;
+  int c = 0;
+  ADD (16, 15, 1);
+  ADD (16,  2, 1);
+  ADD (16, 13, 1);
+  return b;
+}
+NI uint8_t Cfun15 (uint16_t num)
+{
+  uint8_t b = 0;
+  int c = 1;
+  ADD (16, 15, 1);
+  ADD (16,  2, 1);
+  ADD (16, 13, 1);
+  return b;
+}
+
+NI uint8_t Afun16 (uint16_t num)
+{
+  uint8_t b = 0;
+  int c = 0;
+  ADD (16, 15, -1);
+  ADD (16,  2, 1);
+  ADD (16, 13, 1);
+  return b;
+}
+NI uint8_t Cfun16 (uint16_t num)
+{
+  uint8_t b = 0;
+  int c = 1;
+  ADD (16, 15, -1);
+  ADD (16,  2, 1);
+  ADD (16, 13, 1);
+  return b;
+}
+
+NI uint8_t Afun17 (uint16_t num)
+{
+  uint8_t b = 0;
+  int c = 0;
+  ADD (16, 9, 1);
+  ADD (16, 2, 1);
+  ADD (16, 15, 1);
+  return b;
+}
+NI uint8_t Cfun17 (uint16_t num)
+{
+  uint8_t b = 0;
+  int c = 1;
+  ADD (16, 9, 1);
+  ADD (16, 2, 1);
+  ADD (16, 15, 1);
+  return b;
+}
+
+void test16_0 (uint16_t x)
+{
+  if (Afun15 (x) != Cfun15 (x)) __builtin_abort();
+  if (Afun16 (x) != Cfun16 (x)) __builtin_abort();
+  if (Afun17 (x) != Cfun17 (x)) __builtin_abort();
+}
+
+void test16_1 (uint16_t x)
+{
+  test16_0 (x);
+  test16_0 (~x);
+}
+
+void test16 (void)
+{
+  test16_1 (0);
+  test16_1 (0x5555);
+  test16_1 (0xcccc);
+  test16_1 (0x0f0f);
+  test16_1 (0x00ff);
+}
+
+int main (void)
+{
+  test32 ();
+  test24 ();
+  test16 ();
+  return 0;
+}
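
Note on the bit arithmetic in "*insv.any_shift.<mode>": the output code
there derives the output bit from the single-bit mask and the input bit
from the shift offset, then emits BST / CLR / BLD.  The following is a
minimal host-side sketch of that arithmetic, not part of the patch.  The
helper names single_bit_pos, input_bit and move_bit are hypothetical,
single_bit_pos only stands in for GCC's exact_log2, and a 32-bit value is
assumed as the widest mode.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for GCC's exact_log2: position of the single
   set bit in MASK, or -1 if MASK is not a power of two.  */
static int
single_bit_pos (uint32_t mask)
{
  if (mask == 0 || (mask & (mask - 1)) != 0)
    return -1;

  int pos = 0;
  while ((mask >>= 1) != 0)
    pos++;
  return pos;
}

/* Input bit that ends up at the mask position after shifting by N.
   LEFT is nonzero for a left shift, zero for a right shift.
   Returns -1 if MASK is not a single bit or the input bit would lie
   outside a 32-bit value.  */
static int
input_bit (int n, int left, uint32_t mask)
{
  int shift = left ? n : -n;
  int obit = single_bit_pos (mask);
  if (obit < 0)
    return -1;
  int ibit = obit - shift;
  return (ibit >= 0 && ibit <= 31) ? ibit : -1;
}

/* Model of "bst src,ibit ; clr dest ; bld dest,obit".  */
static uint32_t
move_bit (uint32_t src, int ibit, int obit)
{
  assert (ibit >= 0 && ibit <= 31 && obit >= 0 && obit <= 31);
  uint32_t t = (src >> ibit) & 1u;  /* BST: copy the input bit to T.  */
  uint32_t dest = 0;                /* CLR: clear the destination.  */
  dest |= t << obit;                /* BLD: store T at the output bit.  */
  return dest;
}
```

For instance, (x << 17) & (1ul << 30) only depends on bit 13 of x, so
input_bit (17, 1, 1ul << 30) gives 13 and move_bit (x, 13, 30) gives the
same value as the shift-and-mask expression.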