[v2] rs6000: Remove unspecs for vec_mrghl[bhw]

Message ID 20211013015007.300795-1-luoxhu@linux.ibm.com
State Committed
Commit 0910c516a3d72af048af27308349167f25c406c2
Headers
Series [v2] rs6000: Remove unspecs for vec_mrghl[bhw] |

Commit Message

Xionghu Luo Oct. 13, 2021, 1:50 a.m. UTC
  Resend this patch.  Previous discussion is:

https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572330.html

vmrghb only accepts permute index {0, 16, 1, 17, 2, 18, 3, 19, 4, 20,
5, 21, 6, 22, 7, 23} no matter for BE or LE in ISA, similarly for vmrglb.
Remove UNSPEC_VMRGH_DIRECT/UNSPEC_VMRGL_DIRECT pattern as vec_select
+ vec_concat as normal RTL.

Tested pass on P8LE, P9LE and P8BE{m32}, ok for trunk?

gcc/ChangeLog:

	* config/rs6000/altivec.md (*altivec_vmrghb_internal): Delete.
	(altivec_vmrghb_direct): New.
	(*altivec_vmrghh_internal): Delete.
	(altivec_vmrghh_direct): New.
	(*altivec_vmrghw_internal): Delete.
	(altivec_vmrghw_direct_<mode>): New.
	(altivec_vmrghw_direct): Delete.
	(*altivec_vmrglb_internal): Delete.
	(altivec_vmrglb_direct): New.
	(*altivec_vmrglh_internal): Delete.
	(altivec_vmrglh_direct): New.
	(*altivec_vmrglw_internal): Delete.
	(altivec_vmrglw_direct_<mode>): New.
	(altivec_vmrglw_direct): Delete.
	* config/rs6000/rs6000-p8swap.c (rtx_is_swappable_p): Adjust.
	* config/rs6000/rs6000.c (altivec_expand_vec_perm_const):
	Adjust.
	* config/rs6000/vsx.md (vsx_xxmrghw_<mode>): Adjust.
	(vsx_xxmrglw_<mode>): Adjust.

gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/builtins-1.c: Update instruction counts.
---
 gcc/config/rs6000/altivec.md                  | 203 +++++-------------
 gcc/config/rs6000/rs6000-p8swap.c             |   2 -
 gcc/config/rs6000/rs6000.c                    |  75 +++----
 gcc/config/rs6000/vsx.md                      |  26 ++-
 gcc/testsuite/gcc.target/powerpc/builtins-1.c |   8 +-
 5 files changed, 116 insertions(+), 198 deletions(-)
  

Comments

David Edelsohn Oct. 18, 2021, 10:40 p.m. UTC | #1
On Tue, Oct 12, 2021 at 9:50 PM Xionghu Luo <luoxhu@linux.ibm.com> wrote:
>
> Resend this patch.  Previous discussion is:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572330.html
>
> vmrghb only accepts permute index {0, 16, 1, 17, 2, 18, 3, 19, 4, 20,
> 5, 21, 6, 22, 7, 23} no matter for BE or LE in ISA, similarly for vmrglb.
> Remove UNSPEC_VMRGH_DIRECT/UNSPEC_VMRGL_DIRECT pattern as vec_select
> + vec_concat as normal RTL.
>
> Tested pass on P8LE, P9LE and P8BE{m32}, ok for trunk?
>
> gcc/ChangeLog:
>
>         * config/rs6000/altivec.md (*altivec_vmrghb_internal): Delete.
>         (altivec_vmrghb_direct): New.
>         (*altivec_vmrghh_internal): Delete.
>         (altivec_vmrghh_direct): New.
>         (*altivec_vmrghw_internal): Delete.
>         (altivec_vmrghw_direct_<mode>): New.
>         (altivec_vmrghw_direct): Delete.
>         (*altivec_vmrglb_internal): Delete.
>         (altivec_vmrglb_direct): New.
>         (*altivec_vmrglh_internal): Delete.
>         (altivec_vmrglh_direct): New.
>         (*altivec_vmrglw_internal): Delete.
>         (altivec_vmrglw_direct_<mode>): New.
>         (altivec_vmrglw_direct): Delete.
>         * config/rs6000/rs6000-p8swap.c (rtx_is_swappable_p): Adjust.
>         * config/rs6000/rs6000.c (altivec_expand_vec_perm_const):
>         Adjust.
>         * config/rs6000/vsx.md (vsx_xxmrghw_<mode>): Adjust.
>         (vsx_xxmrglw_<mode>): Adjust.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/powerpc/builtins-1.c: Update instruction counts.

This patch is okay.

Thanks, David
  

Patch

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 208d6343225..097a127be07 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -143,8 +143,6 @@  (define_c_enum "unspec"
    UNSPEC_VUPKHU_V4SF
    UNSPEC_VUPKLU_V4SF
    UNSPEC_VGBBD
-   UNSPEC_VMRGH_DIRECT
-   UNSPEC_VMRGL_DIRECT
    UNSPEC_VSPLT_DIRECT
    UNSPEC_VMRGEW_DIRECT
    UNSPEC_VMRGOW_DIRECT
@@ -1291,19 +1289,17 @@  (define_expand "altivec_vmrghb"
    (use (match_operand:V16QI 2 "register_operand"))]
   "TARGET_ALTIVEC"
 {
-  rtvec v = gen_rtvec (16, GEN_INT (0), GEN_INT (16), GEN_INT (1), GEN_INT (17),
-		       GEN_INT (2), GEN_INT (18), GEN_INT (3), GEN_INT (19),
-		       GEN_INT (4), GEN_INT (20), GEN_INT (5), GEN_INT (21),
-		       GEN_INT (6), GEN_INT (22), GEN_INT (7), GEN_INT (23));
-  rtx x = gen_rtx_VEC_CONCAT (V32QImode, operands[1], operands[2]);
-  x = gen_rtx_VEC_SELECT (V16QImode, x, gen_rtx_PARALLEL (VOIDmode, v));
-  emit_insn (gen_rtx_SET (operands[0], x));
+  rtx (*fun) (rtx, rtx, rtx) = BYTES_BIG_ENDIAN ? gen_altivec_vmrghb_direct
+						: gen_altivec_vmrglb_direct;
+  if (!BYTES_BIG_ENDIAN)
+    std::swap (operands[1], operands[2]);
+  emit_insn (fun (operands[0], operands[1], operands[2]));
   DONE;
 })
 
-(define_insn "*altivec_vmrghb_internal"
+(define_insn "altivec_vmrghb_direct"
   [(set (match_operand:V16QI 0 "register_operand" "=v")
-        (vec_select:V16QI
+	(vec_select:V16QI
 	  (vec_concat:V32QI
 	    (match_operand:V16QI 1 "register_operand" "v")
 	    (match_operand:V16QI 2 "register_operand" "v"))
@@ -1316,20 +1312,6 @@  (define_insn "*altivec_vmrghb_internal"
 		     (const_int 6) (const_int 22)
 		     (const_int 7) (const_int 23)])))]
   "TARGET_ALTIVEC"
-{
-  if (BYTES_BIG_ENDIAN)
-    return "vmrghb %0,%1,%2";
-  else
-    return "vmrglb %0,%2,%1";
-}
-  [(set_attr "type" "vecperm")])
-
-(define_insn "altivec_vmrghb_direct"
-  [(set (match_operand:V16QI 0 "register_operand" "=v")
-	(unspec:V16QI [(match_operand:V16QI 1 "register_operand" "v")
-		       (match_operand:V16QI 2 "register_operand" "v")]
-		      UNSPEC_VMRGH_DIRECT))]
-  "TARGET_ALTIVEC"
   "vmrghb %0,%1,%2"
   [(set_attr "type" "vecperm")])
 
@@ -1339,16 +1321,15 @@  (define_expand "altivec_vmrghh"
    (use (match_operand:V8HI 2 "register_operand"))]
   "TARGET_ALTIVEC"
 {
-  rtvec v = gen_rtvec (8, GEN_INT (0), GEN_INT (8), GEN_INT (1), GEN_INT (9),
-		       GEN_INT (2), GEN_INT (10), GEN_INT (3), GEN_INT (11));
-  rtx x = gen_rtx_VEC_CONCAT (V16HImode, operands[1], operands[2]);
-
-  x = gen_rtx_VEC_SELECT (V8HImode, x, gen_rtx_PARALLEL (VOIDmode, v));
-  emit_insn (gen_rtx_SET (operands[0], x));
+  rtx (*fun) (rtx, rtx, rtx) = BYTES_BIG_ENDIAN ? gen_altivec_vmrghh_direct
+						: gen_altivec_vmrglh_direct;
+  if (!BYTES_BIG_ENDIAN)
+    std::swap (operands[1], operands[2]);
+  emit_insn (fun (operands[0], operands[1], operands[2]));
   DONE;
 })
 
-(define_insn "*altivec_vmrghh_internal"
+(define_insn "altivec_vmrghh_direct"
   [(set (match_operand:V8HI 0 "register_operand" "=v")
         (vec_select:V8HI
 	  (vec_concat:V16HI
@@ -1359,20 +1340,6 @@  (define_insn "*altivec_vmrghh_internal"
 		     (const_int 2) (const_int 10)
 		     (const_int 3) (const_int 11)])))]
   "TARGET_ALTIVEC"
-{
-  if (BYTES_BIG_ENDIAN)
-    return "vmrghh %0,%1,%2";
-  else
-    return "vmrglh %0,%2,%1";
-}
-  [(set_attr "type" "vecperm")])
-
-(define_insn "altivec_vmrghh_direct"
-  [(set (match_operand:V8HI 0 "register_operand" "=v")
-        (unspec:V8HI [(match_operand:V8HI 1 "register_operand" "v")
-                      (match_operand:V8HI 2 "register_operand" "v")]
-                     UNSPEC_VMRGH_DIRECT))]
-  "TARGET_ALTIVEC"
   "vmrghh %0,%1,%2"
   [(set_attr "type" "vecperm")])
 
@@ -1382,35 +1349,23 @@  (define_expand "altivec_vmrghw"
    (use (match_operand:V4SI 2 "register_operand"))]
   "VECTOR_MEM_ALTIVEC_P (V4SImode)"
 {
-  rtvec v = gen_rtvec (4, GEN_INT (0), GEN_INT (4), GEN_INT (1), GEN_INT (5));
-  rtx x = gen_rtx_VEC_CONCAT (V8SImode, operands[1], operands[2]);
-  x = gen_rtx_VEC_SELECT (V4SImode, x, gen_rtx_PARALLEL (VOIDmode, v));
-  emit_insn (gen_rtx_SET (operands[0], x));
+  rtx (*fun) (rtx, rtx, rtx);
+  fun = BYTES_BIG_ENDIAN ? gen_altivec_vmrghw_direct_v4si
+			 : gen_altivec_vmrglw_direct_v4si;
+  if (!BYTES_BIG_ENDIAN)
+    std::swap (operands[1], operands[2]);
+  emit_insn (fun (operands[0], operands[1], operands[2]));
   DONE;
 })
 
-(define_insn "*altivec_vmrghw_internal"
-  [(set (match_operand:V4SI 0 "register_operand" "=v")
-        (vec_select:V4SI
-	  (vec_concat:V8SI
-	    (match_operand:V4SI 1 "register_operand" "v")
-	    (match_operand:V4SI 2 "register_operand" "v"))
+(define_insn "altivec_vmrghw_direct_<mode>"
+  [(set (match_operand:VSX_W 0 "register_operand" "=wa,v")
+	(vec_select:VSX_W
+	  (vec_concat:<VS_double>
+	    (match_operand:VSX_W 1 "register_operand" "wa,v")
+	    (match_operand:VSX_W 2 "register_operand" "wa,v"))
 	  (parallel [(const_int 0) (const_int 4)
 		     (const_int 1) (const_int 5)])))]
-  "VECTOR_MEM_ALTIVEC_P (V4SImode)"
-{
-  if (BYTES_BIG_ENDIAN)
-    return "vmrghw %0,%1,%2";
-  else
-    return "vmrglw %0,%2,%1";
-}
-  [(set_attr "type" "vecperm")])
-
-(define_insn "altivec_vmrghw_direct"
-  [(set (match_operand:V4SI 0 "register_operand" "=wa,v")
-	(unspec:V4SI [(match_operand:V4SI 1 "register_operand" "wa,v")
-		      (match_operand:V4SI 2 "register_operand" "wa,v")]
-		     UNSPEC_VMRGH_DIRECT))]
   "TARGET_ALTIVEC"
   "@
    xxmrghw %x0,%x1,%x2
@@ -1440,19 +1395,17 @@  (define_expand "altivec_vmrglb"
    (use (match_operand:V16QI 2 "register_operand"))]
   "TARGET_ALTIVEC"
 {
-  rtvec v = gen_rtvec (16, GEN_INT (8), GEN_INT (24), GEN_INT (9), GEN_INT (25),
-		       GEN_INT (10), GEN_INT (26), GEN_INT (11), GEN_INT (27),
-		       GEN_INT (12), GEN_INT (28), GEN_INT (13), GEN_INT (29),
-		       GEN_INT (14), GEN_INT (30), GEN_INT (15), GEN_INT (31));
-  rtx x = gen_rtx_VEC_CONCAT (V32QImode, operands[1], operands[2]);
-  x = gen_rtx_VEC_SELECT (V16QImode, x, gen_rtx_PARALLEL (VOIDmode, v));
-  emit_insn (gen_rtx_SET (operands[0], x));
+  rtx (*fun) (rtx, rtx, rtx) = BYTES_BIG_ENDIAN ? gen_altivec_vmrglb_direct
+						: gen_altivec_vmrghb_direct;
+  if (!BYTES_BIG_ENDIAN)
+    std::swap (operands[1], operands[2]);
+  emit_insn (fun (operands[0], operands[1], operands[2]));
   DONE;
 })
 
-(define_insn "*altivec_vmrglb_internal"
+(define_insn "altivec_vmrglb_direct"
   [(set (match_operand:V16QI 0 "register_operand" "=v")
-        (vec_select:V16QI
+	(vec_select:V16QI
 	  (vec_concat:V32QI
 	    (match_operand:V16QI 1 "register_operand" "v")
 	    (match_operand:V16QI 2 "register_operand" "v"))
@@ -1465,20 +1418,6 @@  (define_insn "*altivec_vmrglb_internal"
 		     (const_int 14) (const_int 30)
 		     (const_int 15) (const_int 31)])))]
   "TARGET_ALTIVEC"
-{
-  if (BYTES_BIG_ENDIAN)
-    return "vmrglb %0,%1,%2";
-  else
-    return "vmrghb %0,%2,%1";
-}
-  [(set_attr "type" "vecperm")])
-
-(define_insn "altivec_vmrglb_direct"
-  [(set (match_operand:V16QI 0 "register_operand" "=v")
-	(unspec:V16QI [(match_operand:V16QI 1 "register_operand" "v")
-		       (match_operand:V16QI 2 "register_operand" "v")]
-		      UNSPEC_VMRGL_DIRECT))]
-  "TARGET_ALTIVEC"
   "vmrglb %0,%1,%2"
   [(set_attr "type" "vecperm")])
 
@@ -1488,15 +1427,15 @@  (define_expand "altivec_vmrglh"
    (use (match_operand:V8HI 2 "register_operand"))]
   "TARGET_ALTIVEC"
 {
-  rtvec v = gen_rtvec (8, GEN_INT (4), GEN_INT (12), GEN_INT (5), GEN_INT (13),
-		       GEN_INT (6), GEN_INT (14), GEN_INT (7), GEN_INT (15));
-  rtx x = gen_rtx_VEC_CONCAT (V16HImode, operands[1], operands[2]);
-  x = gen_rtx_VEC_SELECT (V8HImode, x, gen_rtx_PARALLEL (VOIDmode, v));
-  emit_insn (gen_rtx_SET (operands[0], x));
+  rtx (*fun) (rtx, rtx, rtx) = BYTES_BIG_ENDIAN ? gen_altivec_vmrglh_direct
+						: gen_altivec_vmrghh_direct;
+  if (!BYTES_BIG_ENDIAN)
+    std::swap (operands[1], operands[2]);
+  emit_insn (fun (operands[0], operands[1], operands[2]));
   DONE;
 })
 
-(define_insn "*altivec_vmrglh_internal"
+(define_insn "altivec_vmrglh_direct"
   [(set (match_operand:V8HI 0 "register_operand" "=v")
         (vec_select:V8HI
 	  (vec_concat:V16HI
@@ -1507,20 +1446,6 @@  (define_insn "*altivec_vmrglh_internal"
 		     (const_int 6) (const_int 14)
 		     (const_int 7) (const_int 15)])))]
   "TARGET_ALTIVEC"
-{
-  if (BYTES_BIG_ENDIAN)
-    return "vmrglh %0,%1,%2";
-  else
-    return "vmrghh %0,%2,%1";
-}
-  [(set_attr "type" "vecperm")])
-
-(define_insn "altivec_vmrglh_direct"
-  [(set (match_operand:V8HI 0 "register_operand" "=v")
-        (unspec:V8HI [(match_operand:V8HI 1 "register_operand" "v")
-		      (match_operand:V8HI 2 "register_operand" "v")]
-		     UNSPEC_VMRGL_DIRECT))]
-  "TARGET_ALTIVEC"
   "vmrglh %0,%1,%2"
   [(set_attr "type" "vecperm")])
 
@@ -1530,35 +1455,23 @@  (define_expand "altivec_vmrglw"
    (use (match_operand:V4SI 2 "register_operand"))]
   "VECTOR_MEM_ALTIVEC_P (V4SImode)"
 {
-  rtvec v = gen_rtvec (4, GEN_INT (2), GEN_INT (6), GEN_INT (3), GEN_INT (7));
-  rtx x = gen_rtx_VEC_CONCAT (V8SImode, operands[1], operands[2]);
-  x = gen_rtx_VEC_SELECT (V4SImode, x, gen_rtx_PARALLEL (VOIDmode, v));
-  emit_insn (gen_rtx_SET (operands[0], x));
+  rtx (*fun) (rtx, rtx, rtx);
+  fun = BYTES_BIG_ENDIAN ? gen_altivec_vmrglw_direct_v4si
+			 : gen_altivec_vmrghw_direct_v4si;
+  if (!BYTES_BIG_ENDIAN)
+    std::swap (operands[1], operands[2]);
+  emit_insn (fun (operands[0], operands[1], operands[2]));
   DONE;
 })
 
-(define_insn "*altivec_vmrglw_internal"
-  [(set (match_operand:V4SI 0 "register_operand" "=v")
-        (vec_select:V4SI
-	  (vec_concat:V8SI
-	    (match_operand:V4SI 1 "register_operand" "v")
-	    (match_operand:V4SI 2 "register_operand" "v"))
+(define_insn "altivec_vmrglw_direct_<mode>"
+  [(set (match_operand:VSX_W 0 "register_operand" "=wa,v")
+	(vec_select:VSX_W
+	  (vec_concat:<VS_double>
+	    (match_operand:VSX_W 1 "register_operand" "wa,v")
+	    (match_operand:VSX_W 2 "register_operand" "wa,v"))
 	  (parallel [(const_int 2) (const_int 6)
 		     (const_int 3) (const_int 7)])))]
-  "VECTOR_MEM_ALTIVEC_P (V4SImode)"
-{
-  if (BYTES_BIG_ENDIAN)
-    return "vmrglw %0,%1,%2";
-  else
-    return "vmrghw %0,%2,%1";
-}
-  [(set_attr "type" "vecperm")])
-
-(define_insn "altivec_vmrglw_direct"
-  [(set (match_operand:V4SI 0 "register_operand" "=wa,v")
-	(unspec:V4SI [(match_operand:V4SI 1 "register_operand" "wa,v")
-		      (match_operand:V4SI 2 "register_operand" "wa,v")]
-		     UNSPEC_VMRGL_DIRECT))]
   "TARGET_ALTIVEC"
   "@
    xxmrglw %x0,%x1,%x2
@@ -3929,13 +3842,13 @@  (define_expand "vec_widen_umult_hi_v8hi"
     {
       emit_insn (gen_altivec_vmuleuh (ve, operands[1], operands[2]));
       emit_insn (gen_altivec_vmulouh (vo, operands[1], operands[2]));
-      emit_insn (gen_altivec_vmrghw_direct (operands[0], ve, vo));
+      emit_insn (gen_altivec_vmrghw_direct_v4si (operands[0], ve, vo));
     }
   else
     {
       emit_insn (gen_altivec_vmulouh (ve, operands[1], operands[2]));
       emit_insn (gen_altivec_vmuleuh (vo, operands[1], operands[2]));
-      emit_insn (gen_altivec_vmrghw_direct (operands[0], vo, ve));
+      emit_insn (gen_altivec_vmrghw_direct_v4si (operands[0], vo, ve));
     }
   DONE;
 })
@@ -3954,13 +3867,13 @@  (define_expand "vec_widen_umult_lo_v8hi"
     {
       emit_insn (gen_altivec_vmuleuh (ve, operands[1], operands[2]));
       emit_insn (gen_altivec_vmulouh (vo, operands[1], operands[2]));
-      emit_insn (gen_altivec_vmrglw_direct (operands[0], ve, vo));
+      emit_insn (gen_altivec_vmrglw_direct_v4si (operands[0], ve, vo));
     }
   else
     {
       emit_insn (gen_altivec_vmulouh (ve, operands[1], operands[2]));
       emit_insn (gen_altivec_vmuleuh (vo, operands[1], operands[2]));
-      emit_insn (gen_altivec_vmrglw_direct (operands[0], vo, ve));
+      emit_insn (gen_altivec_vmrglw_direct_v4si (operands[0], vo, ve));
     }
   DONE;
 })
@@ -3979,13 +3892,13 @@  (define_expand "vec_widen_smult_hi_v8hi"
     {
       emit_insn (gen_altivec_vmulesh (ve, operands[1], operands[2]));
       emit_insn (gen_altivec_vmulosh (vo, operands[1], operands[2]));
-      emit_insn (gen_altivec_vmrghw_direct (operands[0], ve, vo));
+      emit_insn (gen_altivec_vmrghw_direct_v4si (operands[0], ve, vo));
     }
   else
     {
       emit_insn (gen_altivec_vmulosh (ve, operands[1], operands[2]));
       emit_insn (gen_altivec_vmulesh (vo, operands[1], operands[2]));
-      emit_insn (gen_altivec_vmrghw_direct (operands[0], vo, ve));
+      emit_insn (gen_altivec_vmrghw_direct_v4si (operands[0], vo, ve));
     }
   DONE;
 })
@@ -4004,13 +3917,13 @@  (define_expand "vec_widen_smult_lo_v8hi"
     {
       emit_insn (gen_altivec_vmulesh (ve, operands[1], operands[2]));
       emit_insn (gen_altivec_vmulosh (vo, operands[1], operands[2]));
-      emit_insn (gen_altivec_vmrglw_direct (operands[0], ve, vo));
+      emit_insn (gen_altivec_vmrglw_direct_v4si (operands[0], ve, vo));
     }
   else
     {
       emit_insn (gen_altivec_vmulosh (ve, operands[1], operands[2]));
       emit_insn (gen_altivec_vmulesh (vo, operands[1], operands[2]));
-      emit_insn (gen_altivec_vmrglw_direct (operands[0], vo, ve));
+      emit_insn (gen_altivec_vmrglw_direct_v4si (operands[0], vo, ve));
     }
   DONE;
 })
diff --git a/gcc/config/rs6000/rs6000-p8swap.c b/gcc/config/rs6000/rs6000-p8swap.c
index ad2b3023819..ec503ab742f 100644
--- a/gcc/config/rs6000/rs6000-p8swap.c
+++ b/gcc/config/rs6000/rs6000-p8swap.c
@@ -744,8 +744,6 @@  rtx_is_swappable_p (rtx op, unsigned int *special)
 	  default:
 	    break;
 	  case UNSPEC_VBPERMQ:
-	  case UNSPEC_VMRGH_DIRECT:
-	  case UNSPEC_VMRGL_DIRECT:
 	  case UNSPEC_VPACK_SIGN_SIGN_SAT:
 	  case UNSPEC_VPACK_SIGN_UNS_SAT:
 	  case UNSPEC_VPACK_UNS_UNS_MOD:
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 5a5202c455b..d0730253bcc 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -23009,43 +23009,44 @@  altivec_expand_vec_perm_const (rtx target, rtx op0, rtx op1,
     unsigned char perm[16];
   };
   static const struct altivec_perm_insn patterns[] = {
-    { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vpkuhum_direct,
-      {  1,  3,  5,  7,  9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31 } },
-    { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vpkuwum_direct,
-      {  2,  3,  6,  7, 10, 11, 14, 15, 18, 19, 22, 23, 26, 27, 30, 31 } },
-    { OPTION_MASK_ALTIVEC, 
-      (BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghb_direct
-       : CODE_FOR_altivec_vmrglb_direct),
-      {  0, 16,  1, 17,  2, 18,  3, 19,  4, 20,  5, 21,  6, 22,  7, 23 } },
-    { OPTION_MASK_ALTIVEC,
-      (BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghh_direct
-       : CODE_FOR_altivec_vmrglh_direct),
-      {  0,  1, 16, 17,  2,  3, 18, 19,  4,  5, 20, 21,  6,  7, 22, 23 } },
-    { OPTION_MASK_ALTIVEC,
-      (BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghw_direct
-       : CODE_FOR_altivec_vmrglw_direct),
-      {  0,  1,  2,  3, 16, 17, 18, 19,  4,  5,  6,  7, 20, 21, 22, 23 } },
-    { OPTION_MASK_ALTIVEC,
-      (BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglb_direct
-       : CODE_FOR_altivec_vmrghb_direct),
-      {  8, 24,  9, 25, 10, 26, 11, 27, 12, 28, 13, 29, 14, 30, 15, 31 } },
-    { OPTION_MASK_ALTIVEC,
-      (BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglh_direct
-       : CODE_FOR_altivec_vmrghh_direct),
-      {  8,  9, 24, 25, 10, 11, 26, 27, 12, 13, 28, 29, 14, 15, 30, 31 } },
-    { OPTION_MASK_ALTIVEC,
-      (BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglw_direct
-       : CODE_FOR_altivec_vmrghw_direct),
-      {  8,  9, 10, 11, 24, 25, 26, 27, 12, 13, 14, 15, 28, 29, 30, 31 } },
-    { OPTION_MASK_P8_VECTOR,
-      (BYTES_BIG_ENDIAN ? CODE_FOR_p8_vmrgew_v4sf_direct
-       : CODE_FOR_p8_vmrgow_v4sf_direct),
-      {  0,  1,  2,  3, 16, 17, 18, 19,  8,  9, 10, 11, 24, 25, 26, 27 } },
-    { OPTION_MASK_P8_VECTOR,
-      (BYTES_BIG_ENDIAN ? CODE_FOR_p8_vmrgow_v4sf_direct
-       : CODE_FOR_p8_vmrgew_v4sf_direct),
-      {  4,  5,  6,  7, 20, 21, 22, 23, 12, 13, 14, 15, 28, 29, 30, 31 } }
-  };
+    {OPTION_MASK_ALTIVEC,
+     CODE_FOR_altivec_vpkuhum_direct,
+     {1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31}},
+    {OPTION_MASK_ALTIVEC,
+     CODE_FOR_altivec_vpkuwum_direct,
+     {2, 3, 6, 7, 10, 11, 14, 15, 18, 19, 22, 23, 26, 27, 30, 31}},
+    {OPTION_MASK_ALTIVEC,
+     BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghb_direct
+		      : CODE_FOR_altivec_vmrglb_direct,
+     {0, 16, 1, 17, 2, 18, 3, 19, 4, 20, 5, 21, 6, 22, 7, 23}},
+    {OPTION_MASK_ALTIVEC,
+     BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghh_direct
+		      : CODE_FOR_altivec_vmrglh_direct,
+     {0, 1, 16, 17, 2, 3, 18, 19, 4, 5, 20, 21, 6, 7, 22, 23}},
+    {OPTION_MASK_ALTIVEC,
+     BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghw_direct_v4si
+		      : CODE_FOR_altivec_vmrglw_direct_v4si,
+     {0, 1, 2, 3, 16, 17, 18, 19, 4, 5, 6, 7, 20, 21, 22, 23}},
+    {OPTION_MASK_ALTIVEC,
+     BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglb_direct
+		      : CODE_FOR_altivec_vmrghb_direct,
+     {8, 24, 9, 25, 10, 26, 11, 27, 12, 28, 13, 29, 14, 30, 15, 31}},
+    {OPTION_MASK_ALTIVEC,
+     BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglh_direct
+		      : CODE_FOR_altivec_vmrghh_direct,
+     {8, 9, 24, 25, 10, 11, 26, 27, 12, 13, 28, 29, 14, 15, 30, 31}},
+    {OPTION_MASK_ALTIVEC,
+     BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglw_direct_v4si
+		      : CODE_FOR_altivec_vmrghw_direct_v4si,
+     {8, 9, 10, 11, 24, 25, 26, 27, 12, 13, 14, 15, 28, 29, 30, 31}},
+    {OPTION_MASK_P8_VECTOR,
+     BYTES_BIG_ENDIAN ? CODE_FOR_p8_vmrgew_v4sf_direct
+		      : CODE_FOR_p8_vmrgow_v4sf_direct,
+     {0, 1, 2, 3, 16, 17, 18, 19, 8, 9, 10, 11, 24, 25, 26, 27}},
+    {OPTION_MASK_P8_VECTOR,
+     BYTES_BIG_ENDIAN ? CODE_FOR_p8_vmrgow_v4sf_direct
+		      : CODE_FOR_p8_vmrgew_v4sf_direct,
+     {4, 5, 6, 7, 20, 21, 22, 23, 12, 13, 14, 15, 28, 29, 30, 31}}};
 
   unsigned int i, j, elt, which;
   unsigned char perm[16];
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 55d830d0f20..f1097326861 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -4568,7 +4568,7 @@  (define_insn "vsx_xxspltd_<mode>"
   [(set_attr "type" "vecperm")])
 
 ;; V4SF/V4SI interleave
-(define_insn "vsx_xxmrghw_<mode>"
+(define_expand "vsx_xxmrghw_<mode>"
   [(set (match_operand:VSX_W 0 "vsx_register_operand" "=wa")
         (vec_select:VSX_W
 	  (vec_concat:<VS_double>
@@ -4578,14 +4578,17 @@  (define_insn "vsx_xxmrghw_<mode>"
 		     (const_int 1) (const_int 5)])))]
   "VECTOR_MEM_VSX_P (<MODE>mode)"
 {
-  if (BYTES_BIG_ENDIAN)
-    return "xxmrghw %x0,%x1,%x2";
-  else
-    return "xxmrglw %x0,%x2,%x1";
+  rtx (*fun) (rtx, rtx, rtx);
+  fun = BYTES_BIG_ENDIAN ? gen_altivec_vmrghw_direct_<mode>
+			 : gen_altivec_vmrglw_direct_<mode>;
+  if (!BYTES_BIG_ENDIAN)
+    std::swap (operands[1], operands[2]);
+  emit_insn (fun (operands[0], operands[1], operands[2]));
+  DONE;
 }
   [(set_attr "type" "vecperm")])
 
-(define_insn "vsx_xxmrglw_<mode>"
+(define_expand "vsx_xxmrglw_<mode>"
   [(set (match_operand:VSX_W 0 "vsx_register_operand" "=wa")
 	(vec_select:VSX_W
 	  (vec_concat:<VS_double>
@@ -4595,10 +4598,13 @@  (define_insn "vsx_xxmrglw_<mode>"
 		     (const_int 3) (const_int 7)])))]
   "VECTOR_MEM_VSX_P (<MODE>mode)"
 {
-  if (BYTES_BIG_ENDIAN)
-    return "xxmrglw %x0,%x1,%x2";
-  else
-    return "xxmrghw %x0,%x2,%x1";
+  rtx (*fun) (rtx, rtx, rtx);
+  fun = BYTES_BIG_ENDIAN ? gen_altivec_vmrglw_direct_<mode>
+			 : gen_altivec_vmrghw_direct_<mode>;
+  if (!BYTES_BIG_ENDIAN)
+    std::swap (operands[1], operands[2]);
+  emit_insn (fun (operands[0], operands[1], operands[2]));
+  DONE;
 }
   [(set_attr "type" "vecperm")])
 
diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-1.c b/gcc/testsuite/gcc.target/powerpc/builtins-1.c
index 3ec1024a955..63fbd2e3be1 100644
--- a/gcc/testsuite/gcc.target/powerpc/builtins-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/builtins-1.c
@@ -317,10 +317,10 @@  int main ()
 /* { dg-final { scan-assembler-times "vctuxs" 2 } } */
 
 /* { dg-final { scan-assembler-times "vmrghb" 4 { target be } } } */
-/* { dg-final { scan-assembler-times "vmrghb" 5 { target le } } } */
+/* { dg-final { scan-assembler-times "vmrghb" 6 { target le } } } */
 /* { dg-final { scan-assembler-times "vmrghh" 8 } } */
-/* { dg-final { scan-assembler-times "xxmrghw" 8 } } */
-/* { dg-final { scan-assembler-times "xxmrglw" 8 } } */
+/* { dg-final { scan-assembler-times "xxmrghw" 4 } } */
+/* { dg-final { scan-assembler-times "xxmrglw" 4 } } */
 /* { dg-final { scan-assembler-times "vmrglh" 8 } } */
 /* { dg-final { scan-assembler-times "xxlnor" 6 } } */
 /* { dg-final { scan-assembler-times {\mvpkudus\M} 1 } } */
@@ -347,7 +347,7 @@  int main ()
 /* { dg-final { scan-assembler-times "vspltb" 6 } } */
 /* { dg-final { scan-assembler-times "vspltw" 0 } } */
 /* { dg-final { scan-assembler-times "vmrgow" 8 } } */
-/* { dg-final { scan-assembler-times "vmrglb" 5 { target le } } } */
+/* { dg-final { scan-assembler-times "vmrglb" 4 { target le } } } */
 /* { dg-final { scan-assembler-times "vmrglb" 6 { target be } } } */
 /* { dg-final { scan-assembler-times "vmrgew" 8 } } */
 /* { dg-final { scan-assembler-times "vsplth" 8 } } */