From patchwork Fri Aug 12 10:13:17 2022
X-Patchwork-Submitter: Robin Dapp
X-Patchwork-Id: 56699
Message-ID: <0791cfa8-15be-022a-6495-2e6ddbe8ca95@linux.ibm.com>
Date: Fri, 12 Aug 2022 12:13:17 +0200
Subject: [PATCH] s390: Use vpdi and verllg in vec_reve.
To: GCC Patches
From: Robin Dapp

Hi,

swapping the two elements of a V2DImode or V2DFmode vector can be done
with vpdi instead of the generic approach of loading a permutation mask
from the literal pool and using vperm.

Analogous to the V2DI/V2DF case, reversing the elements of a four-element
vector can be done by first swapping the two doublewords with vpdi and
then rotating each doubleword by 32 bits with verllg, which swaps the two
elements within it.

Bootstrapped and regtested, no regressions.

Is it OK?

Regards
 Robin

gcc/ChangeLog:

	PR target/100869
	* config/s390/vector.md (@vpdi4_2): New pattern.
	(rotl3_di): New pattern.
	* config/s390/vx-builtins.md: Use vpdi and verll for reversing elements.

gcc/testsuite/ChangeLog:

	* gcc.target/s390/zvector/vec-reve-int-long.c: New test.
---
 gcc/config/s390/vector.md                     | 28 +++++++++++++
 gcc/config/s390/vx-builtins.md                | 41 +++++++++++++++++++
 .../s390/zvector/vec-reve-int-long.c          | 31 ++++++++++++++
 3 files changed, 100 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-reve-int-long.c

diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
index 16b162aae0e5..2207f39b80e4 100644
--- a/gcc/config/s390/vector.md
+++ b/gcc/config/s390/vector.md
@@ -791,6 +791,17 @@ (define_insn "@vpdi4"
   "vpdi\t%v0,%v1,%v2,4"
   [(set_attr "op_type" "VRR")])
 
+; Second DW of op1 and first DW of op2 (when interpreted as 2-element vector).
+(define_insn "@vpdi4_2"
+  [(set (match_operand:V_HW_4 0 "register_operand" "=v")
+        (vec_select:V_HW_4
+         (vec_concat:
+          (match_operand:V_HW_4 1 "register_operand" "v")
+          (match_operand:V_HW_4 2 "register_operand" "v"))
+         (parallel [(const_int 2) (const_int 3) (const_int 4) (const_int 5)])))]
+  "TARGET_VX"
+  "vpdi\t%v0,%v1,%v2,4"
+  [(set_attr "op_type" "VRR")])
 
 (define_insn "*vmrhb"
   [(set (match_operand:V16QI 0 "register_operand" "=v")
@@ -1249,6 +1260,23 @@ (define_insn "*3"
   "\t%v0,%v1,%Y2"
   [(set_attr "op_type" "VRS")])
 
+; verllg for V4SI/V4SF.  This swaps the first and the second two
+; elements of a vector and is only valid in that context.
+(define_expand "rotl3_di"
+  [
+   (set (match_dup 2)
+        (subreg:V2DI (match_operand:V_HW_4 1) 0))
+   (set (match_dup 3)
+        (rotate:V2DI
+         (match_dup 2)
+         (const_int 32)))
+   (set (match_operand:V_HW_4 0)
+        (subreg:V_HW_4 (match_dup 3) 0))]
+  "TARGET_VX"
+  {
+    operands[2] = gen_reg_rtx (V2DImode);
+    operands[3] = gen_reg_rtx (V2DImode);
+  })
 
 ; Shift each element by corresponding vector element
 
diff --git a/gcc/config/s390/vx-builtins.md b/gcc/config/s390/vx-builtins.md
index c46d16eae484..99c4c037b49a 100644
--- a/gcc/config/s390/vx-builtins.md
+++ b/gcc/config/s390/vx-builtins.md
@@ -2184,6 +2184,47 @@ (define_insn "*eltswap"
    vster\t%v1,%v0"
   [(set_attr "op_type" "*,VRX,VRX")])
 
+; Swapping v2df/v2di can be done via vpdi on z13 and z14.
+(define_split
+  [(set (match_operand:V_HW_2 0 "register_operand" "")
+        (unspec:V_HW_2 [(match_operand:V_HW_2 1 "register_operand" "")]
+                       UNSPEC_VEC_ELTSWAP))]
+  "TARGET_VX && can_create_pseudo_p ()"
+  [(set (match_operand:V_HW_2 0 "register_operand" "=v")
+        (vec_select:V_HW_2
+         (vec_concat:
+          (match_operand:V_HW_2 1 "register_operand" "v")
+          (match_dup 1))
+         (parallel [(const_int 1) (const_int 2)])))]
+)
+
+
+; Swapping v4sf/v4si can be done via vpdi and rot.
+(define_split
+  [(set (match_operand:V_HW_4 0 "register_operand" "")
+        (unspec:V_HW_4 [(match_operand:V_HW_4 1 "register_operand" "")]
+                       UNSPEC_VEC_ELTSWAP))]
+  "TARGET_VX && can_create_pseudo_p ()"
+  [(set (match_dup 2)
+        (vec_select:V_HW_4
+         (vec_concat:
+          (match_dup 1)
+          (match_dup 1))
+         (parallel [(const_int 2) (const_int 3) (const_int 4) (const_int 5)])))
+   (set (match_dup 3)
+        (subreg:V2DI (match_dup 2) 0))
+   (set (match_dup 4)
+        (rotate:V2DI
+         (match_dup 3)
+         (const_int 32)))
+   (set (match_operand:V_HW_4 0)
+        (subreg:V_HW_4 (match_dup 4) 0))]
+{
+  operands[2] = gen_reg_rtx (mode);
+  operands[3] = gen_reg_rtx (V2DImode);
+  operands[4] = gen_reg_rtx (V2DImode);
+})
+
 ; z15 has instructions for doing element reversal from mem to reg
 ; or the other way around.  For reg to reg or on pre z15 machines
 ; we have to emulate it with vector permute.
diff --git a/gcc/testsuite/gcc.target/s390/zvector/vec-reve-int-long.c b/gcc/testsuite/gcc.target/s390/zvector/vec-reve-int-long.c
new file mode 100644
index 000000000000..dff3a94066c7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/zvector/vec-reve-int-long.c
@@ -0,0 +1,31 @@
+/* Test that we use vpdi in order to reverse vectors
+   with two elements instead of creating a literal-pool entry
+   and permuting with vperm.  */
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-O2 -march=z14 -mzarch -mzvector -fno-unroll-loops" } */
+
+/* { dg-final { scan-assembler-times "vpdi\t" 4 } } */
+/* { dg-final { scan-assembler-times "verllg\t" 2 } } */
+/* { dg-final { scan-assembler-times "vperm" 0 } } */
+
+#include <vecintrin.h>
+
+vector double reved (vector double a)
+{
+  return vec_reve (a);
+}
+
+vector long long revel (vector long long a)
+{
+  return vec_reve (a);
+}
+
+vector float revef (vector float a)
+{
+  return vec_reve (a);
+}
+
+vector int revei (vector int a)
+{
+  return vec_reve (a);
+}
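
For reviewers who want to convince themselves that the vpdi + verllg
sequence really reverses a four-element vector, here is a rough scalar
model in portable C.  It is only a sketch and not part of the patch: the
type vec128 and the helpers model_vpdi4, model_verllg32, reverse2 and
reverse4 are made-up names, and the model assumes element 0 sits in the
high half of the first doubleword (big-endian element order).

```c
#include <stdint.h>

/* Hypothetical scalar model of a 128-bit vector register: two 64-bit
   doublewords, dw[0] being the leftmost one.  Not a GCC or s390 API.  */
typedef struct { uint64_t dw[2]; } vec128;

/* Models vpdi with mask 4: take the second doubleword of A followed by
   the first doubleword of B.  */
static vec128 model_vpdi4 (vec128 a, vec128 b)
{
  vec128 r = { { a.dw[1], b.dw[0] } };
  return r;
}

/* Models verllg with a rotate count of 32: rotate every doubleword left
   by 32 bits, which swaps its two 32-bit halves.  */
static vec128 model_verllg32 (vec128 a)
{
  vec128 r;
  for (int i = 0; i < 2; i++)
    r.dw[i] = (a.dw[i] << 32) | (a.dw[i] >> 32);
  return r;
}

/* Two-element case (V2DI/V2DF): a single vpdi of the vector with itself
   swaps the doublewords.  */
static void reverse2 (const uint64_t in[2], uint64_t out[2])
{
  vec128 v = { { in[0], in[1] } };
  v = model_vpdi4 (v, v);
  out[0] = v.dw[0];
  out[1] = v.dw[1];
}

/* Four-element case (V4SI/V4SF): swap the doublewords, then swap the two
   32-bit elements inside each doubleword.  */
static void reverse4 (const uint32_t in[4], uint32_t out[4])
{
  vec128 v;
  v.dw[0] = ((uint64_t) in[0] << 32) | in[1];
  v.dw[1] = ((uint64_t) in[2] << 32) | in[3];

  v = model_vpdi4 (v, v);   /* elements are now 2 3 0 1 */
  v = model_verllg32 (v);   /* elements are now 3 2 1 0 */

  out[0] = (uint32_t) (v.dw[0] >> 32);
  out[1] = (uint32_t) v.dw[0];
  out[2] = (uint32_t) (v.dw[1] >> 32);
  out[3] = (uint32_t) v.dw[1];
}
```

Feeding { 10, 20, 30, 40 } to reverse4 gives { 40, 30, 20, 10 }, which
matches the (2 3 4 5) vec_select followed by the rotate by 32 in the
splitter above.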