From patchwork Thu Oct 14 06:17:17 2021
X-Patchwork-Submitter: HAO CHEN GUI
X-Patchwork-Id: 46195
Message-ID: <9c39ac50-aee1-b50f-dfb3-badb6752e921@linux.ibm.com>
Date: Thu, 14 Oct 2021 14:17:17 +0800
From: HAO CHEN GUI
To: gcc-patches
Cc: Bill Schmidt, Segher Boessenkool, David
Subject: [PATCH, rs6000] Optimization for vec_xl_sext

Hi,
  The patch optimizes the code generation for the vec_xl_sext builtin. All sign extensions are now done directly on VSX registers.

  Bootstrapped and tested on powerpc64le-linux with no regressions. Is this okay for trunk? Any recommendations? Thanks a lot.

  I refined the patch according to Bill and David's advice. The patch.diff and ChangeLog are also attached in case the indentation doesn't show correctly in the email body.

ChangeLog

2021-10-11  Haochen Gui

gcc/
	* config/rs6000/rs6000-call.c (altivec_expand_lxvr_builtin): Modify
	the expansion for sign extension.  All extensions are done within
	VSX registers.

gcc/testsuite/
	* gcc.target/powerpc/p10_vec_xl_sext.c: New test.

patch.diff

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index b4e13af4dc6..587e9fa2a2a 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -9779,7 +9779,7 @@ altivec_expand_lxvr_builtin (enum insn_code icode, tree exp, rtx target, bool bl
 
   if (sign_extend)
     {
-      rtx discratch = gen_reg_rtx (DImode);
+      rtx discratch = gen_reg_rtx (V2DImode);
       rtx tiscratch = gen_reg_rtx (TImode);
 
       /* Emit the lxvr*x insn.  */
@@ -9788,20 +9788,31 @@ altivec_expand_lxvr_builtin (enum insn_code icode, tree exp, rtx target, bool bl
 	return 0;
       emit_insn (pat);
 
-      /* Emit a sign extension from QI,HI,WI to double (DI).  */
-      rtx scratch = gen_lowpart (smode, tiscratch);
+      /* Emit a sign extension from V16QI,V8HI,V4SI to V2DI.  */
+      rtx temp1, temp2;
       if (icode == CODE_FOR_vsx_lxvrbx)
-	emit_insn (gen_extendqidi2 (discratch, scratch));
+	{
+	  temp1 = simplify_gen_subreg (V16QImode, tiscratch, TImode, 0);
+	  emit_insn (gen_vsx_sign_extend_qi_v2di (discratch, temp1));
+	}
       else if (icode == CODE_FOR_vsx_lxvrhx)
-	emit_insn (gen_extendhidi2 (discratch, scratch));
+	{
+	  temp1 = simplify_gen_subreg (V8HImode, tiscratch, TImode, 0);
+	  emit_insn (gen_vsx_sign_extend_hi_v2di (discratch, temp1));
+	}
       else if (icode == CODE_FOR_vsx_lxvrwx)
-	emit_insn (gen_extendsidi2 (discratch, scratch));
-      /*  Assign discratch directly if scratch is already DI.  */
-      if (icode == CODE_FOR_vsx_lxvrdx)
-	discratch = scratch;
+	{
+	  temp1 = simplify_gen_subreg (V4SImode, tiscratch, TImode, 0);
+	  emit_insn (gen_vsx_sign_extend_si_v2di (discratch, temp1));
+	}
+      else if (icode == CODE_FOR_vsx_lxvrdx)
+	discratch = simplify_gen_subreg (V2DImode, tiscratch, TImode, 0);
+      else
+	gcc_unreachable ();
 
-      /* Emit the sign extension from DI (double) to TI (quad). */
-      emit_insn (gen_extendditi2 (target, discratch));
+      /* Emit the sign extension from V2DI (double) to TI (quad).  */
+      temp2 = simplify_gen_subreg (TImode, discratch, V2DImode, 0);
+      emit_insn (gen_extendditi2_vector (target, temp2));
 
       return target;
     }
diff --git a/gcc/testsuite/gcc.target/powerpc/p10_vec_xl_sext.c b/gcc/testsuite/gcc.target/powerpc/p10_vec_xl_sext.c
new file mode 100644
index 00000000000..78e72ac5425
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/p10_vec_xl_sext.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target int128 } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#include <altivec.h>
+
+vector signed __int128
+foo1 (signed long a, signed char *b)
+{
+  return vec_xl_sext (a, b);
+}
+
+vector signed __int128
+foo2 (signed long a, signed short *b)
+{
+  return vec_xl_sext (a, b);
+}
+
+vector signed __int128
+foo3 (signed long a, signed int *b)
+{
+  return vec_xl_sext (a, b);
+}
+
+vector signed __int128
+foo4 (signed long a, signed long *b)
+{
+  return vec_xl_sext (a, b);
+}
+
+/* { dg-final { scan-assembler-times {\mvextsd2q\M} 4 } } */
+/* { dg-final { scan-assembler-times {\mvextsb2d\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvextsh2d\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvextsw2d\M} 1 } } */
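
For readers checking the scan-assembler counts in the new test, here is a minimal sketch in portable C of the two-step extension that the new expansion keeps in VSX registers. It is only a model, not GCC code: model_q, model_load_zext, model_extend_to_d, model_extend_d2q and model_xl_sext are hypothetical names, the 128-bit result is represented as two 64-bit halves, and the first argument is assumed to be a byte offset, as with vec_xl.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model type (not a GCC type): a 128-bit value as two halves.  */
typedef struct
{
  uint64_t lo;	/* Low-order doubleword.  */
  uint64_t hi;	/* High-order doubleword.  */
} model_q;

/* Models the lxvr{b,h,w,d}x load: fetch WIDTH bytes and zero-extend them.
   Typed temporaries keep the model independent of host endianness.  */
static uint64_t
model_load_zext (const void *p, size_t width)
{
  uint8_t b; uint16_t h; uint32_t w; uint64_t d;
  switch (width)
    {
    case 1: memcpy (&b, p, 1); return b;
    case 2: memcpy (&h, p, 2); return h;
    case 4: memcpy (&w, p, 4); return w;
    default: memcpy (&d, p, 8); return d;
    }
}

/* Step 1: element to doubleword (the role of vextsb2d, vextsh2d, vextsw2d).
   A doubleword element needs no work in this step.  */
static int64_t
model_extend_to_d (uint64_t raw, size_t width)
{
  switch (width)
    {
    case 1: return (int8_t) raw;
    case 2: return (int16_t) raw;
    case 4: return (int32_t) raw;
    default: return (int64_t) raw;
    }
}

/* Step 2: doubleword to quadword (the role of vextsd2q).  */
static model_q
model_extend_d2q (int64_t d)
{
  model_q q;
  q.lo = (uint64_t) d;
  q.hi = d < 0 ? UINT64_MAX : 0;
  return q;
}

/* Model of vec_xl_sext (offset, base) for an element of WIDTH bytes.
   Assumption: OFFSET is in bytes.  */
static model_q
model_xl_sext (long offset, const void *base, size_t width)
{
  uint64_t raw = model_load_zext ((const char *) base + offset, width);
  return model_extend_d2q (model_extend_to_d (raw, width));
}
```

In this model every element width passes through the doubleword-to-quadword step, which matches the test expecting vextsd2q four times, while each narrower width needs its own first step exactly once.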