From patchwork Sat Nov 12 05:10:59 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michael Meissner <meissner@linux.ibm.com>
X-Patchwork-Id: 60478
Return-Path: <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>
X-Original-To: patchwork@sourceware.org
Delivered-To: patchwork@sourceware.org
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id 944E03858404
	for <patchwork@sourceware.org>; Sat, 12 Nov 2022 05:11:39 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 944E03858404
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1668229899;
	bh=RnJqHRj9EY9kubWL+m9VwJBfL5nkbktDvoXtMPWbYt8=;
	h=Date:To:Subject:References:In-Reply-To:List-Id:List-Unsubscribe:
	 List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:
	 From;
	b=mT5LrClukyM9OzGjmqr7VMmBGruoSJut2QrsjHFFl5AUKELbAu/Kz9+Yuw1T+WUe0
	 nYG+VGy6LX8WHo+n3aIcZ6wLy+uLTaYalLwZg2So8ozglk6yQkO2/qdOognRp70mYI
	 WgZMQy9JxuHZmJs58Y1OnnYOXXNOcx1ZVF0SZxuk=
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com
 [148.163.156.1])
 by sourceware.org (Postfix) with ESMTPS id E3A013858404
 for <gcc-patches@gcc.gnu.org>; Sat, 12 Nov 2022 05:11:08 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org E3A013858404
Received: from pps.filterd (m0187473.ppops.net [127.0.0.1])
 by mx0a-001b2d01.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id
 2AC4VXtX011492;
 Sat, 12 Nov 2022 05:11:08 GMT
Received: from pps.reinject (localhost [127.0.0.1])
 by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3kt3mf9d9v-1
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
 Sat, 12 Nov 2022 05:11:07 +0000
Received: from m0187473.ppops.net (m0187473.ppops.net [127.0.0.1])
 by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 2AC50mCp010473;
 Sat, 12 Nov 2022 05:11:07 GMT
Received: from ppma01dal.us.ibm.com (83.d6.3fa9.ip4.static.sl-reverse.com
 [169.63.214.131])
 by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3kt3mf9d9m-1
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
 Sat, 12 Nov 2022 05:11:07 +0000
Received: from pps.filterd (ppma01dal.us.ibm.com [127.0.0.1])
 by ppma01dal.us.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 2AC5ATfQ000355;
 Sat, 12 Nov 2022 05:11:06 GMT
Received: from b03cxnp07029.gho.boulder.ibm.com
 (b03cxnp07029.gho.boulder.ibm.com [9.17.130.16])
 by ppma01dal.us.ibm.com with ESMTP id 3kt348rm23-1
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
 Sat, 12 Nov 2022 05:11:06 +0000
Received: from smtpav06.dal12v.mail.ibm.com ([9.208.128.130])
 by b03cxnp07029.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id
 2AC5B3Cs16515816
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK);
 Sat, 12 Nov 2022 05:11:03 GMT
Received: from smtpav06.dal12v.mail.ibm.com (unknown [127.0.0.1])
 by IMSVA (Postfix) with ESMTP id 5FB0E58055;
 Sat, 12 Nov 2022 05:11:04 +0000 (GMT)
Received: from smtpav06.dal12v.mail.ibm.com (unknown [127.0.0.1])
 by IMSVA (Postfix) with ESMTP id 359F25805E;
 Sat, 12 Nov 2022 05:11:02 +0000 (GMT)
Received: from toto.the-meissners.org (unknown [9.160.5.6])
 by smtpav06.dal12v.mail.ibm.com (Postfix) with ESMTPS;
 Sat, 12 Nov 2022 05:11:01 +0000 (GMT)
Date: Sat, 12 Nov 2022 00:10:59 -0500
To: Michael Meissner <meissner@linux.ibm.com>, gcc-patches@gcc.gnu.org,
 Segher Boessenkool <segher@kernel.crashing.org>,
 "Kewen.Lin" <linkw@linux.ibm.com>, David Edelsohn <dje.gcc@gmail.com>,
 Peter Bergner <bergner@linux.ibm.com>,
 Will Schmidt <will_schmidt@vnet.ibm.com>
Subject: [PATCH 8] PowerPC: Support load/store vector with right length.
Message-ID: <Y28q48o93tWF223A@toto.the-meissners.org>
Mail-Followup-To: Michael Meissner <meissner@linux.ibm.com>,
 gcc-patches@gcc.gnu.org,
 Segher Boessenkool <segher@kernel.crashing.org>,
 "Kewen.Lin" <linkw@linux.ibm.com>,
 David Edelsohn <dje.gcc@gmail.com>,
 Peter Bergner <bergner@linux.ibm.com>,
 Will Schmidt <will_schmidt@vnet.ibm.com>
References: <Y2xlRIxBIy3VG/xL@toto.the-meissners.org>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To: <Y2xlRIxBIy3VG/xL@toto.the-meissners.org>
X-TM-AS-GCONF: 00
X-Proofpoint-GUID: GUnX9ifshf2orGrHidUyQAZRH8RtXfK8
X-Proofpoint-ORIG-GUID: xVk2qAdqsew04L159XVZ-dRaUi04jEjL
X-Proofpoint-Virus-Version: vendor=baseguard
 engine=ICAP:2.0.219,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1
 definitions=2022-11-12_02,2022-11-11_01,2022-06-22_01
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0
 malwarescore=0 bulkscore=0
 mlxlogscore=999 suspectscore=0 phishscore=0 impostorscore=0
 priorityscore=1501 adultscore=0 clxscore=1015 mlxscore=0 spamscore=0
 lowpriorityscore=0 classifier=spam adjust=0 reason=mlx scancount=1
 engine=8.12.0-2210170000 definitions=main-2211120035
X-Spam-Status: No, score=-10.6 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, KAM_MANYTO, KAM_SHORT,
 RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS,
 TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-Patchwork-Original-From: Michael Meissner via Gcc-patches
 <gcc-patches@gcc.gnu.org>
From: Michael Meissner <meissner@linux.ibm.com>
Reply-To: Michael Meissner <meissner@linux.ibm.com>
Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org
Sender: "Gcc-patches"
 <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>

This patch adds support for new instructions that may be added to the PowerPC
architecture in the future to enhance the load and store vector with length
instructions.

The current instructions (lxvl, lxvll, stxvl, and stxvll) are inconvient to use
since the count for the number of bytes must be in the top 8 bits of the GPR
register, instead of the bottom 8 bits.  This meant that code generating these
instructions typically had to do a shift left by 56 bits to get the count into
the right position.  In a future version of the PowerPC architecture, new
variants of these instructions might be added that expect the count to be in
the bottom 8 bits of the GPR register.  These patches add this support to GCC
if the user uses the -mcpu=future option.

I tested this patch on a little endian power10 system with long double using
the tradiational IBM double double format.  Assuming the other 6 patches for
-mcpu=future are checked in (or at least the first two patches), can I check
this patch into the master branch for GCC 13.

2022-11-11   Michael Meissner  <meissner@linux.ibm.com>

gcc/

	* config/rs6000/vsx.md (lxvl): If -mcpu=future, generate the lxvl with
	the shift count automaticaly used in the insn.
	(lxvrl): New insn for -mcpu=future.
	(lxvrll): Likewise.
	(stxvl): If -mcpu=future, generate the stxvl with the shift count
	automaticaly used in the insn.
	(stxvrl): New insn for -mcpu=future.
	(stxvrll): Likewise.

gcc/testsuite/

	* gcc.target/powerpc/lxvrl.c: New test.
---
 gcc/config/rs6000/vsx.md                 | 122 +++++++++++++++++++----
 gcc/testsuite/gcc.target/powerpc/lxvrl.c |  31 ++++++
 2 files changed, 132 insertions(+), 21 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/lxvrl.c

diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index fb5cf04147e..e4e73db9bb8 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -5582,20 +5582,32 @@ (define_expand "first_mismatch_or_eos_index_<mode>"
   DONE;
 })
 
-;; Load VSX Vector with Length
+;; Load VSX Vector with Length.  If we have lxvrl, we don't have to do an
+;; explicit shift left into a pseudo.
 (define_expand "lxvl"
-  [(set (match_dup 3)
-        (ashift:DI (match_operand:DI 2 "register_operand")
-                   (const_int 56)))
-   (set (match_operand:V16QI 0 "vsx_register_operand")
-	(unspec:V16QI
-	 [(match_operand:DI 1 "gpc_reg_operand")
-          (mem:V16QI (match_dup 1))
-	  (match_dup 3)]
-	 UNSPEC_LXVL))]
+  [(use (match_operand:V16QI 0 "vsx_register_operand"))
+   (use (match_operand:DI 1 "gpc_reg_operand"))
+   (use (match_operand:DI 2 "gpc_reg_operand"))]
   "TARGET_P9_VECTOR && TARGET_64BIT"
 {
-  operands[3] = gen_reg_rtx (DImode);
+  rtx shift_len = gen_rtx_ASHIFT (DImode, operands[2], GEN_INT (56));
+  rtx len;
+
+  if (TARGET_FUTURE)
+    len = shift_len;
+  else
+    {
+      len = gen_reg_rtx (DImode);
+      emit_insn (gen_rtx_SET (len, shift_len));
+    }
+
+  rtx dest = operands[0];
+  rtx addr = operands[1];
+  rtx mem = gen_rtx_MEM (V16QImode, addr);
+  rtvec rv = gen_rtvec (3, addr, mem, len);
+  rtx lxvl = gen_rtx_UNSPEC (V16QImode, rv, UNSPEC_LXVL);
+  emit_insn (gen_rtx_SET (dest, lxvl));
+  DONE;
 })
 
 (define_insn "*lxvl"
@@ -5619,6 +5631,34 @@ (define_insn "lxvll"
   "lxvll %x0,%1,%2"
   [(set_attr "type" "vecload")])
 
+;; For lxvrl and lxvrll, use the combiner to eliminate the shift.  The
+;; define_expand for lxvl will already incorporate the shift in generating the
+;; insn.  The lxvll buitl-in function required the user to have already done
+;; the shift.  Defining lxvrll this way, will optimize cases where the user has
+;; done the shift immediately before the built-in.
+(define_insn "*lxvrl"
+  [(set (match_operand:V16QI 0 "vsx_register_operand" "=wa")
+	(unspec:V16QI
+	 [(match_operand:DI 1 "gpc_reg_operand" "b")
+	  (mem:V16QI (match_dup 1))
+	  (ashift:DI (match_operand:DI 2 "register_operand" "r")
+		     (const_int 56))]
+	 UNSPEC_LXVL))]
+  "TARGET_FUTURE && TARGET_64BIT"
+  "lxvrl %x0,%1,%2"
+  [(set_attr "type" "vecload")])
+
+(define_insn "*lxvrll"
+  [(set (match_operand:V16QI 0 "vsx_register_operand" "=wa")
+	(unspec:V16QI [(match_operand:DI 1 "gpc_reg_operand" "b")
+                       (mem:V16QI (match_dup 1))
+		       (ashift:DI (match_operand:DI 2 "register_operand" "r")
+				  (const_int 56))]
+		      UNSPEC_LXVLL))]
+  "TARGET_FUTURE"
+  "lxvrll %x0,%1,%2"
+  [(set_attr "type" "vecload")])
+
 ;; Expand for builtin xl_len_r
 (define_expand "xl_len_r"
   [(match_operand:V16QI 0 "vsx_register_operand")
@@ -5650,18 +5690,29 @@ (define_insn "stxvll"
 
 ;; Store VSX Vector with Length
 (define_expand "stxvl"
-  [(set (match_dup 3)
-	(ashift:DI (match_operand:DI 2 "register_operand")
-		   (const_int 56)))
-   (set (mem:V16QI (match_operand:DI 1 "gpc_reg_operand"))
-	(unspec:V16QI
-	 [(match_operand:V16QI 0 "vsx_register_operand")
-	  (mem:V16QI (match_dup 1))
-	  (match_dup 3)]
-	 UNSPEC_STXVL))]
+  [(use (match_operand:V16QI 0 "vsx_register_operand"))
+   (use (match_operand:DI 1 "gpc_reg_operand"))
+   (use (match_operand:DI 2 "gpc_reg_operand"))]
   "TARGET_P9_VECTOR && TARGET_64BIT"
 {
-  operands[3] = gen_reg_rtx (DImode);
+  rtx shift_len = gen_rtx_ASHIFT (DImode, operands[2], GEN_INT (56));
+  rtx len;
+
+  if (TARGET_FUTURE)
+    len = shift_len;
+  else
+    {
+      len = gen_reg_rtx (DImode);
+      emit_insn (gen_rtx_SET (len, shift_len));
+    }
+
+  rtx src = operands[0];
+  rtx addr = operands[1];
+  rtx mem = gen_rtx_MEM (V16QImode, addr);
+  rtvec rv = gen_rtvec (3, src, mem, len);
+  rtx stxvl = gen_rtx_UNSPEC (V16QImode, rv, UNSPEC_STXVL);
+  emit_insn (gen_rtx_SET (mem, stxvl));
+  DONE;
 })
 
 ;; Define optab for vector access with length vectorization exploitation.
@@ -5705,6 +5756,35 @@ (define_insn "*stxvl"
   "stxvl %x0,%1,%2"
   [(set_attr "type" "vecstore")])
 
+;; For stxvrl and stxvrll, use the combiner to eliminate the shift.  The
+;; define_expand for stxvl will already incorporate the shift in generating the
+;; insn.  The stxvll buitl-in function required the user to have already done
+;; the shift.  Defining stxvrll this way, will optimize cases where the user
+;; has done the shift immediately before the built-in.
+
+(define_insn "*stxvrl"
+  [(set (mem:V16QI (match_operand:DI 1 "gpc_reg_operand" "b"))
+	(unspec:V16QI
+	 [(match_operand:V16QI 0 "vsx_register_operand" "wa")
+	  (mem:V16QI (match_dup 1))
+	  (ashift:DI (match_operand:DI 2 "register_operand" "r")
+		     (const_int 56))]
+	 UNSPEC_STXVL))]
+  "TARGET_FUTURE && TARGET_64BIT"
+  "stxvrl %x0,%1,%2"
+  [(set_attr "type" "vecstore")])
+
+(define_insn "*stxvrll"
+  [(set (mem:V16QI (match_operand:DI 1 "gpc_reg_operand" "b"))
+	(unspec:V16QI [(match_operand:V16QI 0 "vsx_register_operand" "wa")
+		       (mem:V16QI (match_dup 1))
+		       (ashift:DI (match_operand:DI 2 "register_operand" "r")
+				  (const_int 56))]
+	              UNSPEC_STXVLL))]
+  "TARGET_FUTURE"
+  "stxvrll %x0,%1,%2"
+  [(set_attr "type" "vecstore")])
+
 ;; Expand for builtin xst_len_r
 (define_expand "xst_len_r"
   [(match_operand:V16QI 0 "vsx_register_operand" "=wa")
diff --git a/gcc/testsuite/gcc.target/powerpc/lxvrl.c b/gcc/testsuite/gcc.target/powerpc/lxvrl.c
new file mode 100644
index 00000000000..83277dce6e4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/lxvrl.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_future_ok } */
+/* { dg-options "-mdejagnu-cpu=future -O2" } */
+
+/* Test whether the lxvrl and stxvrl instructions are generated for
+   -mcpu=future on memory copy operations.  */
+
+#ifndef VSIZE
+#define VSIZE 2
+#endif
+
+#ifndef LSIZE
+#define LSIZE 5
+#endif
+
+struct foo {
+  vector unsigned char vc[VSIZE];
+  unsigned char leftover[LSIZE];
+};
+
+void memcpy_ptr (struct foo *p, struct foo *q)
+{
+  __builtin_memcpy ((void *) p,		/* lxvrl and stxvrl.  */
+		    (void *) q,
+		    (sizeof (vector unsigned char) * VSIZE) + LSIZE);
+}
+
+/* { dg-final { scan-assembler     {\mlxvrl\M}  } } */
+/* { dg-final { scan-assembler     {\mstxvrl\M} } } */
+/* { dg-final { scan-assembler-not {\mlxvl\M}   } } */
+/* { dg-final { scan-assembler-not {\mstxvl\M}  } } */