Message ID | CAAgBjMktT4K=DE7ZR3URaESfh1rpnFobbXtAd_mEXVhJxM+vKg@mail.gmail.com |
---|---|
State | New |
Headers |
Return-Path: <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org> X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 5DB04385840C for <patchwork@sourceware.org>; Wed, 18 May 2022 07:08:17 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 5DB04385840C DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1652857697; bh=rT8S88fiv8wcEJ1CyOSB7PUtK+9+hDRYbYiIqJF0vgA=; h=Date:Subject:To:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:From; b=FbB6NQa/kZ61jOXO+FvdjnBD+MiyOmYTdsJBuxLafxAn3GEVa76E1+Wd5iDdxQhvr tH+4WLaamb/Zq+Ue/UcsuripmjvkLaxQ+fYWrqb5YBuy50Y2ptHWLqEZWG55YAiBW8 rONfJs1HBU5lGCpARmN0gi2NezGSpbWj0DKo6ZQ4= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-ed1-x536.google.com (mail-ed1-x536.google.com [IPv6:2a00:1450:4864:20::536]) by sourceware.org (Postfix) with ESMTPS id 911CE3858C51 for <gcc-patches@gcc.gnu.org>; Wed, 18 May 2022 07:07:02 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 911CE3858C51 Received: by mail-ed1-x536.google.com with SMTP id eg11so1730921edb.11 for <gcc-patches@gcc.gnu.org>; Wed, 18 May 2022 00:07:02 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=rT8S88fiv8wcEJ1CyOSB7PUtK+9+hDRYbYiIqJF0vgA=; b=EkTIkXRwKXuRMRHFF9u2txhWKkQbuQrwBlrdCxFjhC1yeR8qAVsK+sRwyR8utxXhPO 5QP+O5fOVWQdHmUdT0dzZfLc2IZqDMQbFwOai1G3yPo6Gn78z2L2S3AJDkUxIpvLlVMC QKwOEOTHdW7m3vE5K7yHSiJEU+z63c/3WyzuxVUqHDH3jRdYFgqxDcnXU67zaFw7vd4H fCF4IG0CIOwZ5LbIEbNS1vQRji+N4cnOUgHsmhdcx22l/CT3pAnZISIT5a5qp1oQ2NKB rQIjHgSw7X974YB/cBG9To8w2P3Sl8PWYzPt2aYz5FvRP1ekewIKWjISvZsJJ0twTX72 ABHA== X-Gm-Message-State: AOAM530Qa7KasAnSGEmbUXHCzoBesr77NYmfbA+rH7QBd1gIjQ0bzyeB 12yEMHxQC5+jqCD7caFn6dw5+vtD5BLY8LnY6Tjz7UtletPdgA== X-Google-Smtp-Source: ABdhPJyBn9x9nZuE4/zPI5Wbtshb36G1zMg9YRMKyS3x3+u5m3tKz6XTsoPp8928iooQikGoQ8/gpm8L7dZpZevNQm4= X-Received: by 2002:aa7:d911:0:b0:42a:af69:e167 with SMTP id a17-20020aa7d911000000b0042aaf69e167mr15728596edr.54.1652857620845; Wed, 18 May 2022 00:07:00 -0700 (PDT) MIME-Version: 1.0 Date: Wed, 18 May 2022 12:36:24 +0530 Message-ID: <CAAgBjMktT4K=DE7ZR3URaESfh1rpnFobbXtAd_mEXVhJxM+vKg@mail.gmail.com> Subject: [0/9] [middle-end] Add param to vec_perm_const hook to specify mode of input operand To: gcc Patches <gcc-patches@gcc.gnu.org>, Richard Sandiford <richard.sandiford@arm.com> Content-Type: multipart/mixed; boundary="0000000000005b03ca05df43e769" X-Spam-Status: No, score=-9.0 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org> List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe> List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/> List-Post: <mailto:gcc-patches@gcc.gnu.org> List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help> List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe> From: Prathamesh Kulkarni via Gcc-patches <gcc-patches@gcc.gnu.org> Reply-To: Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org> |
Commit Message
Prathamesh Kulkarni
May 18, 2022, 7:06 a.m. UTC
Hi, The attached patch adds another parameter machine_mode op_mode to vec_perm_const hook to specify mode of input operands. The motivation for doing this is PR96463, where we create vec_perm_expr of the form: lhs = vec_perm_expr<rhs, mask> where lhs and rhs have different vector types but same element type (lhs is SVE and rhs is corresponding advsimd vector). It seems the following targets were affected: aarch64, i386, arm, ia64, mips, rs6000, s390, sparc, gcn. Bootstrapped+tested on x86_64-linux-gnu, aarch64-linux-gnu. For other targets, I did make all-gcc stage1, which seems to build OK. Thanks, Prathamesh
Comments
Hi Prathamesh, I am just looking at this as it interacts with a change I am trying to make, but I'm not a reviewer so take my comments with a pinch of salt ;) I copied in bits of your patch below to comment. > -@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode @var{mode}, rtx @var{output}, rtx @var{in0}, rtx @var{in1}, const vec_perm_indices @var{&sel}) > +@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode @var{mode}, machine_mode @var{op_mode}, rtx @var{output}, rtx @var{in0}, rtx @var{in1}, const vec_perm_indices @var{&sel}) > This hook is used to test whether the target can permute up to two > vectors of mode @var{mode} using the permutation vector @code{sel}, and > also to emit such a permutation. In the former case @var{in0}, @var{in1} Would be good to also explain what the new op_mode parameter is used for here. > @@ -6250,7 +6250,9 @@ expand_vec_perm_const (machine_mode mode, rtx v0, rtx v1, > if (single_arg_p) > v1 = v0; > > - if (targetm.vectorize.vec_perm_const (mode, target, v0, v1, indices)) > + gcc_checking_assert (GET_MODE (v0) == GET_MODE (v1)); > + machine_mode op_mode = GET_MODE (v0); > + if (targetm.vectorize.vec_perm_const (mode, op_mode, target, v0, v1, indices)) > return target; > } I was surprised by the assert check here. The docs seem to insinuate vec_perm needs both input vectors to be of the same mode, so I was expecting this to be checked/enforced elsewhere? But it shouldn't hurt I guess. > @@ -1894,7 +1894,7 @@ try the equivalent byte operation. If that also fails, it will try forcing\n\ > the selector into a register and using the @var{vec_perm@var{mode}}\n\ > instruction pattern. There is no need for the hook to handle these two\n\ > implementation approaches itself.", > - bool, (machine_mode mode, rtx output, rtx in0, rtx in1, > + bool, (machine_mode mode, machine_mode op_mode, rtx output, rtx in0, rtx in1, > const vec_perm_indices &sel), > NULL) Same comment as the first, it could do with an inclusion of op_mode in the comments explaining the function. On 18/05/2022 08:06, Prathamesh Kulkarni via Gcc-patches wrote: > Hi, > The attached patch adds another parameter machine_mode op_mode to vec_perm_const > hook to specify mode of input operands. The motivation for doing this > is PR96463, > where we create vec_perm_expr of the form: > lhs = vec_perm_expr<rhs, mask> > where lhs and rhs have different vector types but same element type > (lhs is SVE and rhs is corresponding advsimd vector). > > It seems the following targets were affected: aarch64, i386, arm, ia64, > mips, rs6000, s390, sparc, gcn. > > Bootstrapped+tested on x86_64-linux-gnu, aarch64-linux-gnu. > For other targets, I did make all-gcc stage1, which seems to build OK. > > Thanks, > Prathamesh
Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes: > Hi, > The attached patch adds another parameter machine_mode op_mode to vec_perm_const > hook to specify mode of input operands. The motivation for doing this > is PR96463, > where we create vec_perm_expr of the form: > lhs = vec_perm_expr<rhs, mask> > where lhs and rhs have different vector types but same element type > (lhs is SVE and rhs is corresponding advsimd vector). > > It seems the following targets were affected: aarch64, i386, arm, ia64, > mips, rs6000, s390, sparc, gcn. > > Bootstrapped+tested on x86_64-linux-gnu, aarch64-linux-gnu. > For other targets, I did make all-gcc stage1, which seems to build OK. > > Thanks, > Prathamesh > > diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi > index c5006afc00d..31ff6ef3f92 100644 > --- a/gcc/doc/tm.texi > +++ b/gcc/doc/tm.texi > @@ -6088,7 +6088,7 @@ for the given scalar type @var{type}. @var{is_packed} is false if the scalar > access using @var{type} is known to be naturally aligned. > @end deftypefn > > -@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode @var{mode}, rtx @var{output}, rtx @var{in0}, rtx @var{in1}, const vec_perm_indices @var{&sel}) > +@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode @var{mode}, machine_mode @var{op_mode}, rtx @var{output}, rtx @var{in0}, rtx @var{in1}, const vec_perm_indices @var{&sel}) > This hook is used to test whether the target can permute up to two > vectors of mode @var{mode} using the permutation vector @code{sel}, and > also to emit such a permutation. In the former case @var{in0}, @var{in1} Like Andre says, the documentation should describe op_mode (and mode). > diff --git a/gcc/optabs-query.cc b/gcc/optabs-query.cc > index 68dc679cc6a..aef9d4c5d28 100644 > --- a/gcc/optabs-query.cc > +++ b/gcc/optabs-query.cc > @@ -417,8 +417,8 @@ can_vec_perm_var_p (machine_mode mode) > with here. */ > > bool > -can_vec_perm_const_p (machine_mode mode, const vec_perm_indices &sel, > - bool allow_variable_p) > +can_vec_perm_const_p (machine_mode mode, machine_mode op_mode, > + const vec_perm_indices &sel, bool allow_variable_p) > { The function comment should describe the new parameter. > /* If the target doesn't implement a vector mode for the vector type, > then no operations are supported. */ > @@ -448,7 +448,7 @@ can_vec_perm_const_p (machine_mode mode, const vec_perm_indices &sel, > > if (targetm.vectorize.vec_perm_const != NULL) > { > - if (targetm.vectorize.vec_perm_const (mode, NULL_RTX, NULL_RTX, > + if (targetm.vectorize.vec_perm_const (mode, op_mode, NULL_RTX, NULL_RTX, > NULL_RTX, sel)) > return true; > > @@ -462,6 +462,13 @@ can_vec_perm_const_p (machine_mode mode, const vec_perm_indices &sel, > return false; > } > > +bool > +can_vec_perm_const_p (machine_mode mode, const vec_perm_indices &sel, > + bool allow_variable_p) > +{ > + return can_vec_perm_const_p (mode, mode, sel, allow_variable_p); > +} > + I can understand why you went for this, but now that we've opened the door to mismatched modes, I think it would be better if all callers specified the input mode explicitly. > diff --git a/gcc/optabs.cc b/gcc/optabs.cc > index 3d8fa3abdfe..55f10c41789 100644 > --- a/gcc/optabs.cc > +++ b/gcc/optabs.cc > @@ -6250,7 +6250,9 @@ expand_vec_perm_const (machine_mode mode, rtx v0, rtx v1, > if (single_arg_p) > v1 = v0; > > - if (targetm.vectorize.vec_perm_const (mode, target, v0, v1, indices)) > + gcc_checking_assert (GET_MODE (v0) == GET_MODE (v1)); > + machine_mode op_mode = GET_MODE (v0); > + if (targetm.vectorize.vec_perm_const (mode, op_mode, target, v0, v1, indices)) > return target; > } > (FWIW, I agree the assert is worth having.) Thanks, Richard > @@ -6264,7 +6266,7 @@ expand_vec_perm_const (machine_mode mode, rtx v0, rtx v1, > v0_qi = gen_lowpart (qimode, v0); > v1_qi = gen_lowpart (qimode, v1); > if (targetm.vectorize.vec_perm_const != NULL > - && targetm.vectorize.vec_perm_const (qimode, target_qi, v0_qi, > + && targetm.vectorize.vec_perm_const (qimode, qimode, target_qi, v0_qi, > v1_qi, qimode_indices)) > return gen_lowpart (mode, target_qi); > } > diff --git a/gcc/target.def b/gcc/target.def > index d85adf36a39..2713c31dc3f 100644 > --- a/gcc/target.def > +++ b/gcc/target.def > @@ -1894,7 +1894,7 @@ try the equivalent byte operation. If that also fails, it will try forcing\n\ > the selector into a register and using the @var{vec_perm@var{mode}}\n\ > instruction pattern. There is no need for the hook to handle these two\n\ > implementation approaches itself.", > - bool, (machine_mode mode, rtx output, rtx in0, rtx in1, > + bool, (machine_mode mode, machine_mode op_mode, rtx output, rtx in0, rtx in1, > const vec_perm_indices &sel), > NULL) >
On Wed, May 18, 2022 at 9:08 AM Prathamesh Kulkarni via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > Hi, > The attached patch adds another parameter machine_mode op_mode to vec_perm_const > hook to specify mode of input operands. The motivation for doing this > is PR96463, > where we create vec_perm_expr of the form: > lhs = vec_perm_expr<rhs, mask> > where lhs and rhs have different vector types but same element type > (lhs is SVE and rhs is corresponding advsimd vector). > > It seems the following targets were affected: aarch64, i386, arm, ia64, > mips, rs6000, s390, sparc, gcn. > > Bootstrapped+tested on x86_64-linux-gnu, aarch64-linux-gnu. > For other targets, I did make all-gcc stage1, which seems to build OK. So it would also allow to express extract even/odd and interleave operations with a VEC_PERM. The interleave currently has the issue that we have to artificially widen the inputs with "dont-care" elements. The lo/high variants can already do with a bitfield-ref or vector-of-vector CTOR. Richard. > > Thanks, > Prathamesh
On Wed, 18 May 2022 at 17:27, Richard Sandiford <richard.sandiford@arm.com> wrote: > > Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes: > > Hi, > > The attached patch adds another parameter machine_mode op_mode to vec_perm_const > > hook to specify mode of input operands. The motivation for doing this > > is PR96463, > > where we create vec_perm_expr of the form: > > lhs = vec_perm_expr<rhs, mask> > > where lhs and rhs have different vector types but same element type > > (lhs is SVE and rhs is corresponding advsimd vector). > > > > It seems the following targets were affected: aarch64, i386, arm, ia64, > > mips, rs6000, s390, sparc, gcn. > > > > Bootstrapped+tested on x86_64-linux-gnu, aarch64-linux-gnu. > > For other targets, I did make all-gcc stage1, which seems to build OK. > > > > Thanks, > > Prathamesh > > > > diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi > > index c5006afc00d..31ff6ef3f92 100644 > > --- a/gcc/doc/tm.texi > > +++ b/gcc/doc/tm.texi > > @@ -6088,7 +6088,7 @@ for the given scalar type @var{type}. @var{is_packed} is false if the scalar > > access using @var{type} is known to be naturally aligned. > > @end deftypefn > > > > -@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode @var{mode}, rtx @var{output}, rtx @var{in0}, rtx @var{in1}, const vec_perm_indices @var{&sel}) > > +@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode @var{mode}, machine_mode @var{op_mode}, rtx @var{output}, rtx @var{in0}, rtx @var{in1}, const vec_perm_indices @var{&sel}) > > This hook is used to test whether the target can permute up to two > > vectors of mode @var{mode} using the permutation vector @code{sel}, and > > also to emit such a permutation. In the former case @var{in0}, @var{in1} > > Like Andre says, the documentation should describe op_mode (and mode). > > > diff --git a/gcc/optabs-query.cc b/gcc/optabs-query.cc > > index 68dc679cc6a..aef9d4c5d28 100644 > > --- a/gcc/optabs-query.cc > > +++ b/gcc/optabs-query.cc > > @@ -417,8 +417,8 @@ can_vec_perm_var_p (machine_mode mode) > > with here. */ > > > > bool > > -can_vec_perm_const_p (machine_mode mode, const vec_perm_indices &sel, > > - bool allow_variable_p) > > +can_vec_perm_const_p (machine_mode mode, machine_mode op_mode, > > + const vec_perm_indices &sel, bool allow_variable_p) > > { > > The function comment should describe the new parameter. > > > /* If the target doesn't implement a vector mode for the vector type, > > then no operations are supported. */ > > @@ -448,7 +448,7 @@ can_vec_perm_const_p (machine_mode mode, const vec_perm_indices &sel, > > > > if (targetm.vectorize.vec_perm_const != NULL) > > { > > - if (targetm.vectorize.vec_perm_const (mode, NULL_RTX, NULL_RTX, > > + if (targetm.vectorize.vec_perm_const (mode, op_mode, NULL_RTX, NULL_RTX, > > NULL_RTX, sel)) > > return true; > > > > @@ -462,6 +462,13 @@ can_vec_perm_const_p (machine_mode mode, const vec_perm_indices &sel, > > return false; > > } > > > > +bool > > +can_vec_perm_const_p (machine_mode mode, const vec_perm_indices &sel, > > + bool allow_variable_p) > > +{ > > + return can_vec_perm_const_p (mode, mode, sel, allow_variable_p); > > +} > > + > > I can understand why you went for this, but now that we've opened > the door to mismatched modes, I think it would be better if all callers > specified the input mode explicitly. > > > diff --git a/gcc/optabs.cc b/gcc/optabs.cc > > index 3d8fa3abdfe..55f10c41789 100644 > > --- a/gcc/optabs.cc > > +++ b/gcc/optabs.cc > > @@ -6250,7 +6250,9 @@ expand_vec_perm_const (machine_mode mode, rtx v0, rtx v1, > > if (single_arg_p) > > v1 = v0; > > > > - if (targetm.vectorize.vec_perm_const (mode, target, v0, v1, indices)) > > + gcc_checking_assert (GET_MODE (v0) == GET_MODE (v1)); > > + machine_mode op_mode = GET_MODE (v0); > > + if (targetm.vectorize.vec_perm_const (mode, op_mode, target, v0, v1, indices)) > > return target; > > } > > > > (FWIW, I agree the assert is worth having.) Hi, I updated the patch with doc and adjusted callers to explicitly pass op_mode. Bootstrapped + tested on x86_64-linux-gnu and aarch64-linux-gnu. Does it look OK to commit ? Thanks, Prathamesh > > Thanks, > Richard > > > @@ -6264,7 +6266,7 @@ expand_vec_perm_const (machine_mode mode, rtx v0, rtx v1, > > v0_qi = gen_lowpart (qimode, v0); > > v1_qi = gen_lowpart (qimode, v1); > > if (targetm.vectorize.vec_perm_const != NULL > > - && targetm.vectorize.vec_perm_const (qimode, target_qi, v0_qi, > > + && targetm.vectorize.vec_perm_const (qimode, qimode, target_qi, v0_qi, > > v1_qi, qimode_indices)) > > return gen_lowpart (mode, target_qi); > > } > > diff --git a/gcc/target.def b/gcc/target.def > > index d85adf36a39..2713c31dc3f 100644 > > --- a/gcc/target.def > > +++ b/gcc/target.def > > @@ -1894,7 +1894,7 @@ try the equivalent byte operation. If that also fails, it will try forcing\n\ > > the selector into a register and using the @var{vec_perm@var{mode}}\n\ > > instruction pattern. There is no need for the hook to handle these two\n\ > > implementation approaches itself.", > > - bool, (machine_mode mode, rtx output, rtx in0, rtx in1, > > + bool, (machine_mode mode, machine_mode op_mode, rtx output, rtx in0, rtx in1, > > const vec_perm_indices &sel), > > NULL) > > diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi index c5006afc00d..f53068c5c53 100644 --- a/gcc/doc/tm.texi +++ b/gcc/doc/tm.texi @@ -6088,13 +6088,13 @@ for the given scalar type @var{type}. @var{is_packed} is false if the scalar access using @var{type} is known to be naturally aligned. @end deftypefn -@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode @var{mode}, rtx @var{output}, rtx @var{in0}, rtx @var{in1}, const vec_perm_indices @var{&sel}) +@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode @var{mode}, machine_mode @var{op_mode}, rtx @var{output}, rtx @var{in0}, rtx @var{in1}, const vec_perm_indices @var{&sel}) This hook is used to test whether the target can permute up to two -vectors of mode @var{mode} using the permutation vector @code{sel}, and +vectors of mode @var{op_mode} using the permutation vector @code{sel}, and also to emit such a permutation. In the former case @var{in0}, @var{in1} and @var{out} are all null. In the latter case @var{in0} and @var{in1} are -the source vectors and @var{out} is the destination vector; all three are -operands of mode @var{mode}. @var{in1} is the same as @var{in0} if +the source vectors of mode @var{op_mode} and @var{out} is the destination vector +of mode @var{mode}. @var{in1} is the same as @var{in0} if @var{sel} describes a permutation on one vector instead of two. Return true if the operation is possible, emitting instructions for it diff --git a/gcc/match.pd b/gcc/match.pd index f5efa77560c..b06d7013603 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -7703,12 +7703,14 @@ and, 2-argument version. */ tree oldop2 = op2; if (sel.ninputs () == 2 - || can_vec_perm_const_p (TYPE_MODE (type), sel, false)) + || can_vec_perm_const_p (TYPE_MODE (type), TYPE_MODE (type), + sel, false)) op2 = vec_perm_indices_to_tree (TREE_TYPE (op2), sel); else { vec_perm_indices sel2 (builder, 2, nelts); - if (can_vec_perm_const_p (TYPE_MODE (type), sel2, false)) + if (can_vec_perm_const_p (TYPE_MODE (type), TYPE_MODE (type), + sel2, false)) op2 = vec_perm_indices_to_tree (TREE_TYPE (op2), sel2); else /* Not directly supported with either encoding, diff --git a/gcc/optabs-query.cc b/gcc/optabs-query.cc index 68dc679cc6a..809482b8092 100644 --- a/gcc/optabs-query.cc +++ b/gcc/optabs-query.cc @@ -407,9 +407,9 @@ can_vec_perm_var_p (machine_mode mode) } /* Return true if the target directly supports VEC_PERM_EXPRs on vectors - of mode MODE using the selector SEL. ALLOW_VARIABLE_P is true if it - is acceptable to force the selector into a register and use a variable - permute (if the target supports that). + of mode OP_MODE and result vector of mode MODE using the selector SEL. + ALLOW_VARIABLE_P is true if it is acceptable to force the selector into a + register and use a variable permute (if the target supports that). Note that additional permutations representing whole-vector shifts may also be handled via the vec_shr or vec_shl optab, but only where the @@ -417,8 +417,8 @@ can_vec_perm_var_p (machine_mode mode) with here. */ bool -can_vec_perm_const_p (machine_mode mode, const vec_perm_indices &sel, - bool allow_variable_p) +can_vec_perm_const_p (machine_mode mode, machine_mode op_mode, + const vec_perm_indices &sel, bool allow_variable_p) { /* If the target doesn't implement a vector mode for the vector type, then no operations are supported. */ @@ -448,7 +448,7 @@ can_vec_perm_const_p (machine_mode mode, const vec_perm_indices &sel, if (targetm.vectorize.vec_perm_const != NULL) { - if (targetm.vectorize.vec_perm_const (mode, NULL_RTX, NULL_RTX, + if (targetm.vectorize.vec_perm_const (mode, op_mode, NULL_RTX, NULL_RTX, NULL_RTX, sel)) return true; @@ -534,7 +534,7 @@ can_mult_highpart_p (machine_mode mode, bool uns_p) + (i & ~1) + ((i & 1) ? nunits : 0)); vec_perm_indices indices (sel, 2, nunits); - if (can_vec_perm_const_p (mode, indices)) + if (can_vec_perm_const_p (mode, mode, indices)) return 2; } } @@ -550,7 +550,7 @@ can_mult_highpart_p (machine_mode mode, bool uns_p) for (unsigned int i = 0; i < 3; ++i) sel.quick_push (2 * i + (BYTES_BIG_ENDIAN ? 0 : 1)); vec_perm_indices indices (sel, 2, nunits); - if (can_vec_perm_const_p (mode, indices)) + if (can_vec_perm_const_p (mode, mode, indices)) return 3; } } diff --git a/gcc/optabs-query.h b/gcc/optabs-query.h index b9c9fd6f64d..32e52b20109 100644 --- a/gcc/optabs-query.h +++ b/gcc/optabs-query.h @@ -178,7 +178,7 @@ bool can_conditionally_move_p (machine_mode mode); opt_machine_mode qimode_for_vec_perm (machine_mode); bool selector_fits_mode_p (machine_mode, const vec_perm_indices &); bool can_vec_perm_var_p (machine_mode); -bool can_vec_perm_const_p (machine_mode, const vec_perm_indices &, +bool can_vec_perm_const_p (machine_mode, machine_mode, const vec_perm_indices &, bool = true); /* Find a widening optab even if it doesn't widen as much as we want. */ #define find_widening_optab_handler(A, B, C) \ diff --git a/gcc/optabs.cc b/gcc/optabs.cc index 3d8fa3abdfe..c0a68471d2d 100644 --- a/gcc/optabs.cc +++ b/gcc/optabs.cc @@ -6250,7 +6250,10 @@ expand_vec_perm_const (machine_mode mode, rtx v0, rtx v1, if (single_arg_p) v1 = v0; - if (targetm.vectorize.vec_perm_const (mode, target, v0, v1, indices)) + gcc_checking_assert (GET_MODE (v0) == GET_MODE (v1)); + machine_mode op_mode = GET_MODE (v0); + if (targetm.vectorize.vec_perm_const (mode, op_mode, target, v0, v1, + indices)) return target; } @@ -6264,7 +6267,7 @@ expand_vec_perm_const (machine_mode mode, rtx v0, rtx v1, v0_qi = gen_lowpart (qimode, v0); v1_qi = gen_lowpart (qimode, v1); if (targetm.vectorize.vec_perm_const != NULL - && targetm.vectorize.vec_perm_const (qimode, target_qi, v0_qi, + && targetm.vectorize.vec_perm_const (qimode, qimode, target_qi, v0_qi, v1_qi, qimode_indices)) return gen_lowpart (mode, target_qi); } diff --git a/gcc/target.def b/gcc/target.def index d85adf36a39..e8537b3eaf4 100644 --- a/gcc/target.def +++ b/gcc/target.def @@ -1878,11 +1878,11 @@ access using @var{type} is known to be naturally aligned.", DEFHOOK (vec_perm_const, "This hook is used to test whether the target can permute up to two\n\ -vectors of mode @var{mode} using the permutation vector @code{sel}, and\n\ +vectors of mode @var{op_mode} using the permutation vector @code{sel}, and\n\ also to emit such a permutation. In the former case @var{in0}, @var{in1}\n\ and @var{out} are all null. In the latter case @var{in0} and @var{in1} are\n\ -the source vectors and @var{out} is the destination vector; all three are\n\ -operands of mode @var{mode}. @var{in1} is the same as @var{in0} if\n\ +the source vectors of mode @var{op_mode} and @var{out} is the destination vector\n\ +of mode @var{mode}. @var{in1} is the same as @var{in0} if\n\ @var{sel} describes a permutation on one vector instead of two.\n\ \n\ Return true if the operation is possible, emitting instructions for it\n\ @@ -1894,7 +1894,7 @@ try the equivalent byte operation. If that also fails, it will try forcing\n\ the selector into a register and using the @var{vec_perm@var{mode}}\n\ instruction pattern. There is no need for the hook to handle these two\n\ implementation approaches itself.", - bool, (machine_mode mode, rtx output, rtx in0, rtx in1, + bool, (machine_mode mode, machine_mode op_mode, rtx output, rtx in0, rtx in1, const vec_perm_indices &sel), NULL) diff --git a/gcc/tree-ssa-forwprop.cc b/gcc/tree-ssa-forwprop.cc index 48cab5844e0..e253e848796 100644 --- a/gcc/tree-ssa-forwprop.cc +++ b/gcc/tree-ssa-forwprop.cc @@ -3016,7 +3016,8 @@ simplify_vector_constructor (gimple_stmt_iterator *gsi) : (elts[0].second == 0 && elts[0].first == 0 ? 0 : refnelts) + i); vec_perm_indices indices (sel, orig[1] ? 2 : 1, refnelts); - if (!can_vec_perm_const_p (TYPE_MODE (perm_type), indices)) + machine_mode vmode = TYPE_MODE (perm_type); + if (!can_vec_perm_const_p (vmode, vmode, indices)) return false; mask_type = build_vector_type (build_nonstandard_integer_type (elem_size, 1), @@ -3065,7 +3066,8 @@ simplify_vector_constructor (gimple_stmt_iterator *gsi) sel.quick_push (elts[i].first ? elts[i].second + nelts : i); vec_perm_indices indices (sel, 2, nelts); - if (!can_vec_perm_const_p (TYPE_MODE (type), indices)) + machine_mode vmode = TYPE_MODE (type); + if (!can_vec_perm_const_p (vmode, vmode, indices)) return false; mask_type = build_vector_type (build_nonstandard_integer_type (elem_size, 1), diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc index 09223baf718..d20a10a1524 100644 --- a/gcc/tree-vect-data-refs.cc +++ b/gcc/tree-vect-data-refs.cc @@ -5347,7 +5347,7 @@ vect_grouped_store_supported (tree vectype, unsigned HOST_WIDE_INT count) sel[3 * i + nelt2] = 0; } indices.new_vector (sel, 2, nelt); - if (!can_vec_perm_const_p (mode, indices)) + if (!can_vec_perm_const_p (mode, mode, indices)) { if (dump_enabled_p ()) dump_printf (MSG_MISSED_OPTIMIZATION, @@ -5365,7 +5365,7 @@ vect_grouped_store_supported (tree vectype, unsigned HOST_WIDE_INT count) sel[3 * i + nelt2] = nelt + j2++; } indices.new_vector (sel, 2, nelt); - if (!can_vec_perm_const_p (mode, indices)) + if (!can_vec_perm_const_p (mode, mode, indices)) { if (dump_enabled_p ()) dump_printf (MSG_MISSED_OPTIMIZATION, @@ -5390,12 +5390,12 @@ vect_grouped_store_supported (tree vectype, unsigned HOST_WIDE_INT count) sel[i * 2 + 1] = i + nelt; } vec_perm_indices indices (sel, 2, nelt); - if (can_vec_perm_const_p (mode, indices)) + if (can_vec_perm_const_p (mode, mode, indices)) { for (i = 0; i < 6; i++) sel[i] += exact_div (nelt, 2); indices.new_vector (sel, 2, nelt); - if (can_vec_perm_const_p (mode, indices)) + if (can_vec_perm_const_p (mode, mode, indices)) return true; } } @@ -5963,7 +5963,7 @@ vect_grouped_load_supported (tree vectype, bool single_element_p, else sel[i] = 0; indices.new_vector (sel, 2, nelt); - if (!can_vec_perm_const_p (mode, indices)) + if (!can_vec_perm_const_p (mode, mode, indices)) { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, @@ -5977,7 +5977,7 @@ vect_grouped_load_supported (tree vectype, bool single_element_p, else sel[i] = nelt + ((nelt + k) % 3) + 3 * (j++); indices.new_vector (sel, 2, nelt); - if (!can_vec_perm_const_p (mode, indices)) + if (!can_vec_perm_const_p (mode, mode, indices)) { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, @@ -6000,12 +6000,12 @@ vect_grouped_load_supported (tree vectype, bool single_element_p, for (i = 0; i < 3; i++) sel[i] = i * 2; vec_perm_indices indices (sel, 2, nelt); - if (can_vec_perm_const_p (mode, indices)) + if (can_vec_perm_const_p (mode, mode, indices)) { for (i = 0; i < 3; i++) sel[i] = i * 2 + 1; indices.new_vector (sel, 2, nelt); - if (can_vec_perm_const_p (mode, indices)) + if (can_vec_perm_const_p (mode, mode, indices)) return true; } } @@ -6327,6 +6327,7 @@ vect_shift_permute_load_chain (vec_info *vinfo, vec<tree> dr_chain, gimple *perm_stmt; tree vectype = STMT_VINFO_VECTYPE (stmt_info); + machine_mode vmode = TYPE_MODE (vectype); unsigned int i; loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo); @@ -6351,7 +6352,7 @@ vect_shift_permute_load_chain (vec_info *vinfo, vec<tree> dr_chain, for (i = 0; i < nelt / 2; ++i) sel[nelt / 2 + i] = i * 2 + 1; vec_perm_indices indices (sel, 2, nelt); - if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices)) + if (!can_vec_perm_const_p (vmode, vmode, indices)) { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, @@ -6366,7 +6367,7 @@ vect_shift_permute_load_chain (vec_info *vinfo, vec<tree> dr_chain, for (i = 0; i < nelt / 2; ++i) sel[nelt / 2 + i] = i * 2; indices.new_vector (sel, 2, nelt); - if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices)) + if (!can_vec_perm_const_p (vmode, vmode, indices)) { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, @@ -6381,7 +6382,7 @@ vect_shift_permute_load_chain (vec_info *vinfo, vec<tree> dr_chain, for (i = 0; i < nelt; i++) sel[i] = nelt / 2 + i; indices.new_vector (sel, 2, nelt); - if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices)) + if (!can_vec_perm_const_p (vmode, vmode, indices)) { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, @@ -6397,7 +6398,7 @@ vect_shift_permute_load_chain (vec_info *vinfo, vec<tree> dr_chain, for (i = nelt / 2; i < nelt; i++) sel[i] = nelt + i; indices.new_vector (sel, 2, nelt); - if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices)) + if (!can_vec_perm_const_p (vmode, vmode, indices)) { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, @@ -6461,7 +6462,7 @@ vect_shift_permute_load_chain (vec_info *vinfo, vec<tree> dr_chain, k++; } vec_perm_indices indices (sel, 2, nelt); - if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices)) + if (!can_vec_perm_const_p (vmode, vmode, indices)) { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, @@ -6476,7 +6477,7 @@ vect_shift_permute_load_chain (vec_info *vinfo, vec<tree> dr_chain, for (i = 0; i < nelt; i++) sel[i] = 2 * (nelt / 3) + (nelt % 3) + i; indices.new_vector (sel, 2, nelt); - if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices)) + if (!can_vec_perm_const_p (vmode, vmode, indices)) { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, @@ -6490,7 +6491,7 @@ vect_shift_permute_load_chain (vec_info *vinfo, vec<tree> dr_chain, for (i = 0; i < nelt; i++) sel[i] = 2 * (nelt / 3) + 1 + i; indices.new_vector (sel, 2, nelt); - if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices)) + if (!can_vec_perm_const_p (vmode, vmode, indices)) { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, @@ -6504,7 +6505,7 @@ vect_shift_permute_load_chain (vec_info *vinfo, vec<tree> dr_chain, for (i = 0; i < nelt; i++) sel[i] = (nelt / 3) + (nelt % 3) / 2 + i; indices.new_vector (sel, 2, nelt); - if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices)) + if (!can_vec_perm_const_p (vmode, vmode, indices)) { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, @@ -6518,7 +6519,7 @@ vect_shift_permute_load_chain (vec_info *vinfo, vec<tree> dr_chain, for (i = 0; i < nelt; i++) sel[i] = 2 * (nelt / 3) + (nelt % 3) / 2 + i; indices.new_vector (sel, 2, nelt); - if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices)) + if (!can_vec_perm_const_p (vmode, vmode, indices)) { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, diff --git a/gcc/tree-vect-generic.cc b/gcc/tree-vect-generic.cc index d99e3207fbe..17de3375127 100644 --- a/gcc/tree-vect-generic.cc +++ b/gcc/tree-vect-generic.cc @@ -1527,7 +1527,8 @@ lower_vec_perm (gimple_stmt_iterator *gsi) && tree_to_vec_perm_builder (&sel_int, mask)) { vec_perm_indices indices (sel_int, 2, elements); - if (can_vec_perm_const_p (TYPE_MODE (vect_type), indices)) + machine_mode vmode = TYPE_MODE (vect_type); + if (can_vec_perm_const_p (vmode, vmode, indices)) { gimple_assign_set_rhs3 (stmt, mask); update_stmt (stmt); diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc index 1d4337eb261..11dc6cbf576 100644 --- a/gcc/tree-vect-loop-manip.cc +++ b/gcc/tree-vect-loop-manip.cc @@ -312,7 +312,8 @@ interleave_supported_p (vec_perm_indices *indices, tree vectype, sel.quick_push (base + i + nelts); } indices->new_vector (sel, 2, nelts); - return can_vec_perm_const_p (TYPE_MODE (vectype), *indices); + return can_vec_perm_const_p (TYPE_MODE (vectype), TYPE_MODE (vectype), + *indices); } /* Try to use permutes to define the masks in DEST_RGM using the masks diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc index ab7dade1c74..7ac938b3c7b 100644 --- a/gcc/tree-vect-loop.cc +++ b/gcc/tree-vect-loop.cc @@ -4527,7 +4527,7 @@ have_whole_vector_shift (machine_mode mode) { calc_vec_perm_mask_for_shift (i, nelt, &sel); indices.new_vector (sel, 2, nelt); - if (!can_vec_perm_const_p (mode, indices, false)) + if (!can_vec_perm_const_p (mode, mode, indices, false)) return false; } return true; diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc index 8c61eb965a6..51803ec97c2 100644 --- a/gcc/tree-vect-patterns.cc +++ b/gcc/tree-vect-patterns.cc @@ -2643,7 +2643,8 @@ vect_recog_rotate_pattern (vec_info *vinfo, vec_perm_indices indices (elts, 1, TYPE_VECTOR_SUBPARTS (char_vectype)); - if (can_vec_perm_const_p (TYPE_MODE (char_vectype), indices)) + machine_mode vmode = TYPE_MODE (char_vectype); + if (can_vec_perm_const_p (vmode, vmode, indices)) { /* vectorizable_bswap can handle the __builtin_bswap16 if we undo the argument promotion. */ diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc index cdfff1ab9f6..fe9361c338e 100644 --- a/gcc/tree-vect-slp.cc +++ b/gcc/tree-vect-slp.cc @@ -420,8 +420,9 @@ can_duplicate_and_interleave_p (vec_info *vinfo, unsigned int count, } vec_perm_indices indices1 (sel1, 2, nelts); vec_perm_indices indices2 (sel2, 2, nelts); - if (can_vec_perm_const_p (TYPE_MODE (vector_type), indices1) - && can_vec_perm_const_p (TYPE_MODE (vector_type), indices2)) + machine_mode vmode = TYPE_MODE (vector_type); + if (can_vec_perm_const_p (vmode, vmode, indices1) + && can_vec_perm_const_p (vmode, vmode, indices2)) { if (nvectors_out) *nvectors_out = nvectors; @@ -6762,7 +6763,7 @@ vect_transform_slp_perm_load (vec_info *vinfo, if (index == count && !noop_p) { indices.new_vector (mask, second_vec_index == -1 ? 1 : 2, nunits); - if (!can_vec_perm_const_p (mode, indices)) + if (!can_vec_perm_const_p (mode, mode, indices)) { if (dump_enabled_p ()) { @@ -7122,8 +7123,9 @@ vectorizable_slp_permutation (vec_info *vinfo, gimple_stmt_iterator *gsi, { indices.new_vector (mask, second_vec.first == -1U ? 1 : 2, nunits); bool identity_p = indices.series_p (0, 1, 0, 1); + machine_mode vmode = TYPE_MODE (vectype); if (!identity_p - && !can_vec_perm_const_p (TYPE_MODE (vectype), indices)) + && !can_vec_perm_const_p (vmode, vmode, indices)) { if (dump_enabled_p ()) { diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index 8327e9d047e..207d1097d5b 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -2016,7 +2016,7 @@ perm_mask_for_reverse (tree vectype) sel.quick_push (nunits - 1 - i); vec_perm_indices indices (sel, 1, nunits); - if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices)) + if (!can_vec_perm_const_p (TYPE_MODE (vectype), TYPE_MODE (vectype), indices)) return NULL_TREE; return vect_gen_perm_mask_checked (vectype, indices); } @@ -3168,7 +3168,8 @@ vectorizable_bswap (vec_info *vinfo, elts.quick_push ((i + 1) * word_bytes - j - 1); vec_perm_indices indices (elts, 1, num_bytes); - if (!can_vec_perm_const_p (TYPE_MODE (char_vectype), indices)) + machine_mode vmode = TYPE_MODE (char_vectype); + if (!can_vec_perm_const_p (vmode, vmode, indices)) return false; if (! vec_stmt) @@ -6712,7 +6713,7 @@ scan_store_can_perm_p (tree vectype, tree init, sel[j] = nunits + k; } vec_perm_indices indices (sel, i == units_log2 ? 1 : 2, nunits); - if (!can_vec_perm_const_p (vec_mode, indices)) + if (!can_vec_perm_const_p (vec_mode, vec_mode, indices)) { if (i == units_log2) return -1; @@ -8582,7 +8583,8 @@ vect_gen_perm_mask_any (tree vectype, const vec_perm_indices &sel) tree vect_gen_perm_mask_checked (tree vectype, const vec_perm_indices &sel) { - gcc_assert (can_vec_perm_const_p (TYPE_MODE (vectype), sel)); + machine_mode vmode = TYPE_MODE (vectype); + gcc_assert (can_vec_perm_const_p (vmode, vmode, sel)); return vect_gen_perm_mask_any (vectype, sel); }
Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes: > On Wed, 18 May 2022 at 17:27, Richard Sandiford > <richard.sandiford@arm.com> wrote: >> >> Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes: >> > Hi, >> > The attached patch adds another parameter machine_mode op_mode to vec_perm_const >> > hook to specify mode of input operands. The motivation for doing this >> > is PR96463, >> > where we create vec_perm_expr of the form: >> > lhs = vec_perm_expr<rhs, mask> >> > where lhs and rhs have different vector types but same element type >> > (lhs is SVE and rhs is corresponding advsimd vector). >> > >> > It seems the following targets were affected: aarch64, i386, arm, ia64, >> > mips, rs6000, s390, sparc, gcn. >> > >> > Bootstrapped+tested on x86_64-linux-gnu, aarch64-linux-gnu. >> > For other targets, I did make all-gcc stage1, which seems to build OK. >> > >> > Thanks, >> > Prathamesh >> > >> > diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi >> > index c5006afc00d..31ff6ef3f92 100644 >> > --- a/gcc/doc/tm.texi >> > +++ b/gcc/doc/tm.texi >> > @@ -6088,7 +6088,7 @@ for the given scalar type @var{type}. @var{is_packed} is false if the scalar >> > access using @var{type} is known to be naturally aligned. >> > @end deftypefn >> > >> > -@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode @var{mode}, rtx @var{output}, rtx @var{in0}, rtx @var{in1}, const vec_perm_indices @var{&sel}) >> > +@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode @var{mode}, machine_mode @var{op_mode}, rtx @var{output}, rtx @var{in0}, rtx @var{in1}, const vec_perm_indices @var{&sel}) >> > This hook is used to test whether the target can permute up to two >> > vectors of mode @var{mode} using the permutation vector @code{sel}, and >> > also to emit such a permutation. In the former case @var{in0}, @var{in1} >> >> Like Andre says, the documentation should describe op_mode (and mode). >> >> > diff --git a/gcc/optabs-query.cc b/gcc/optabs-query.cc >> > index 68dc679cc6a..aef9d4c5d28 100644 >> > --- a/gcc/optabs-query.cc >> > +++ b/gcc/optabs-query.cc >> > @@ -417,8 +417,8 @@ can_vec_perm_var_p (machine_mode mode) >> > with here. */ >> > >> > bool >> > -can_vec_perm_const_p (machine_mode mode, const vec_perm_indices &sel, >> > - bool allow_variable_p) >> > +can_vec_perm_const_p (machine_mode mode, machine_mode op_mode, >> > + const vec_perm_indices &sel, bool allow_variable_p) >> > { >> >> The function comment should describe the new parameter. >> >> > /* If the target doesn't implement a vector mode for the vector type, >> > then no operations are supported. */ >> > @@ -448,7 +448,7 @@ can_vec_perm_const_p (machine_mode mode, const vec_perm_indices &sel, >> > >> > if (targetm.vectorize.vec_perm_const != NULL) >> > { >> > - if (targetm.vectorize.vec_perm_const (mode, NULL_RTX, NULL_RTX, >> > + if (targetm.vectorize.vec_perm_const (mode, op_mode, NULL_RTX, NULL_RTX, >> > NULL_RTX, sel)) >> > return true; >> > >> > @@ -462,6 +462,13 @@ can_vec_perm_const_p (machine_mode mode, const vec_perm_indices &sel, >> > return false; >> > } >> > >> > +bool >> > +can_vec_perm_const_p (machine_mode mode, const vec_perm_indices &sel, >> > + bool allow_variable_p) >> > +{ >> > + return can_vec_perm_const_p (mode, mode, sel, allow_variable_p); >> > +} >> > + >> >> I can understand why you went for this, but now that we've opened >> the door to mismatched modes, I think it would be better if all callers >> specified the input mode explicitly. >> >> > diff --git a/gcc/optabs.cc b/gcc/optabs.cc >> > index 3d8fa3abdfe..55f10c41789 100644 >> > --- a/gcc/optabs.cc >> > +++ b/gcc/optabs.cc >> > @@ -6250,7 +6250,9 @@ expand_vec_perm_const (machine_mode mode, rtx v0, rtx v1, >> > if (single_arg_p) >> > v1 = v0; >> > >> > - if (targetm.vectorize.vec_perm_const (mode, target, v0, v1, indices)) >> > + gcc_checking_assert (GET_MODE (v0) == GET_MODE (v1)); >> > + machine_mode op_mode = GET_MODE (v0); >> > + if (targetm.vectorize.vec_perm_const (mode, op_mode, target, v0, v1, indices)) >> > return target; >> > } >> > >> >> (FWIW, I agree the assert is worth having.) > Hi, > I updated the patch with doc and adjusted callers to explicitly pass op_mode. > Bootstrapped + tested on x86_64-linux-gnu and aarch64-linux-gnu. > Does it look OK to commit ? > > […] > diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi > index c5006afc00d..f53068c5c53 100644 > --- a/gcc/doc/tm.texi > +++ b/gcc/doc/tm.texi > @@ -6088,13 +6088,13 @@ for the given scalar type @var{type}. @var{is_packed} is false if the scalar > access using @var{type} is known to be naturally aligned. > @end deftypefn > > -@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode @var{mode}, rtx @var{output}, rtx @var{in0}, rtx @var{in1}, const vec_perm_indices @var{&sel}) > +@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode @var{mode}, machine_mode @var{op_mode}, rtx @var{output}, rtx @var{in0}, rtx @var{in1}, const vec_perm_indices @var{&sel}) > This hook is used to test whether the target can permute up to two > -vectors of mode @var{mode} using the permutation vector @code{sel}, and > +vectors of mode @var{op_mode} using the permutation vector @code{sel}, and > also to emit such a permutation. In the former case @var{in0}, @var{in1} > and @var{out} are all null. In the latter case @var{in0} and @var{in1} are How about: This hook is used to test whether the target can permute up to two vectors of mode @var{op_mode} using the permutation vector @code{sel}, producing a vector of mode @var{mode}. The hook is also used to emit such a permutation. When the hook is being to test whether the target supports a permutation, @var{in0}, @var{in1}, and @var{out} are all null. When the hook is being used to emit a permutation, @var{in0} and @var{in1} are … so that we describe @var{mode} for both the testing and non-testing cases. > -the source vectors and @var{out} is the destination vector; all three are > -operands of mode @var{mode}. @var{in1} is the same as @var{in0} if > +the source vectors of mode @var{op_mode} and @var{out} is the destination vector > +of mode @var{mode}. @var{in1} is the same as @var{in0} if > @var{sel} describes a permutation on one vector instead of two. > > Return true if the operation is possible, emitting instructions for it > diff --git a/gcc/match.pd b/gcc/match.pd > index f5efa77560c..b06d7013603 100644 > --- a/gcc/match.pd > +++ b/gcc/match.pd > @@ -7703,12 +7703,14 @@ and, > 2-argument version. */ > tree oldop2 = op2; > if (sel.ninputs () == 2 > - || can_vec_perm_const_p (TYPE_MODE (type), sel, false)) > + || can_vec_perm_const_p (TYPE_MODE (type), TYPE_MODE (type), > + sel, false)) > op2 = vec_perm_indices_to_tree (TREE_TYPE (op2), sel); > else > { > vec_perm_indices sel2 (builder, 2, nelts); > - if (can_vec_perm_const_p (TYPE_MODE (type), sel2, false)) > + if (can_vec_perm_const_p (TYPE_MODE (type), TYPE_MODE (type), > + sel2, false)) > op2 = vec_perm_indices_to_tree (TREE_TYPE (op2), sel2); > else > /* Not directly supported with either encoding, Since this is matching an existing VEC_PERM_EXPR, we need to distinguish between mode and op_mode. Might be worth adding both mode and op_mode to the outermost “with” and changing later tests accordingly. > diff --git a/gcc/optabs-query.h b/gcc/optabs-query.h > index b9c9fd6f64d..32e52b20109 100644 > --- a/gcc/optabs-query.h > +++ b/gcc/optabs-query.h > @@ -178,7 +178,7 @@ bool can_conditionally_move_p (machine_mode mode); > opt_machine_mode qimode_for_vec_perm (machine_mode); > bool selector_fits_mode_p (machine_mode, const vec_perm_indices &); > bool can_vec_perm_var_p (machine_mode); > -bool can_vec_perm_const_p (machine_mode, const vec_perm_indices &, > +bool can_vec_perm_const_p (machine_mode, machine_mode, const vec_perm_indices &, > bool = true); Nit: long line. > /* Find a widening optab even if it doesn't widen as much as we want. */ > #define find_widening_optab_handler(A, B, C) \ > […] > diff --git a/gcc/tree-vect-generic.cc b/gcc/tree-vect-generic.cc > index d99e3207fbe..17de3375127 100644 > --- a/gcc/tree-vect-generic.cc > +++ b/gcc/tree-vect-generic.cc > @@ -1527,7 +1527,8 @@ lower_vec_perm (gimple_stmt_iterator *gsi) > && tree_to_vec_perm_builder (&sel_int, mask)) > { > vec_perm_indices indices (sel_int, 2, elements); > - if (can_vec_perm_const_p (TYPE_MODE (vect_type), indices)) > + machine_mode vmode = TYPE_MODE (vect_type); > + if (can_vec_perm_const_p (vmode, vmode, indices)) Similarly here: since this is testing an existing VEC_PERM_EXPR, I think we need to read the mode of the lhs and pass both modes. > […] > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc > index 8327e9d047e..207d1097d5b 100644 > --- a/gcc/tree-vect-stmts.cc > +++ b/gcc/tree-vect-stmts.cc > @@ -2016,7 +2016,7 @@ perm_mask_for_reverse (tree vectype) > sel.quick_push (nunits - 1 - i); > > vec_perm_indices indices (sel, 1, nunits); > - if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices)) > + if (!can_vec_perm_const_p (TYPE_MODE (vectype), TYPE_MODE (vectype), indices)) Nit: long line. Thanks, Richard
On Mon, 23 May 2022 at 18:14, Richard Sandiford <richard.sandiford@arm.com> wrote: > > Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes: > > On Wed, 18 May 2022 at 17:27, Richard Sandiford > > <richard.sandiford@arm.com> wrote: > >> > >> Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes: > >> > Hi, > >> > The attached patch adds another parameter machine_mode op_mode to vec_perm_const > >> > hook to specify mode of input operands. The motivation for doing this > >> > is PR96463, > >> > where we create vec_perm_expr of the form: > >> > lhs = vec_perm_expr<rhs, mask> > >> > where lhs and rhs have different vector types but same element type > >> > (lhs is SVE and rhs is corresponding advsimd vector). > >> > > >> > It seems the following targets were affected: aarch64, i386, arm, ia64, > >> > mips, rs6000, s390, sparc, gcn. > >> > > >> > Bootstrapped+tested on x86_64-linux-gnu, aarch64-linux-gnu. > >> > For other targets, I did make all-gcc stage1, which seems to build OK. > >> > > >> > Thanks, > >> > Prathamesh > >> > > >> > diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi > >> > index c5006afc00d..31ff6ef3f92 100644 > >> > --- a/gcc/doc/tm.texi > >> > +++ b/gcc/doc/tm.texi > >> > @@ -6088,7 +6088,7 @@ for the given scalar type @var{type}. @var{is_packed} is false if the scalar > >> > access using @var{type} is known to be naturally aligned. > >> > @end deftypefn > >> > > >> > -@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode @var{mode}, rtx @var{output}, rtx @var{in0}, rtx @var{in1}, const vec_perm_indices @var{&sel}) > >> > +@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode @var{mode}, machine_mode @var{op_mode}, rtx @var{output}, rtx @var{in0}, rtx @var{in1}, const vec_perm_indices @var{&sel}) > >> > This hook is used to test whether the target can permute up to two > >> > vectors of mode @var{mode} using the permutation vector @code{sel}, and > >> > also to emit such a permutation. In the former case @var{in0}, @var{in1} > >> > >> Like Andre says, the documentation should describe op_mode (and mode). > >> > >> > diff --git a/gcc/optabs-query.cc b/gcc/optabs-query.cc > >> > index 68dc679cc6a..aef9d4c5d28 100644 > >> > --- a/gcc/optabs-query.cc > >> > +++ b/gcc/optabs-query.cc > >> > @@ -417,8 +417,8 @@ can_vec_perm_var_p (machine_mode mode) > >> > with here. */ > >> > > >> > bool > >> > -can_vec_perm_const_p (machine_mode mode, const vec_perm_indices &sel, > >> > - bool allow_variable_p) > >> > +can_vec_perm_const_p (machine_mode mode, machine_mode op_mode, > >> > + const vec_perm_indices &sel, bool allow_variable_p) > >> > { > >> > >> The function comment should describe the new parameter. > >> > >> > /* If the target doesn't implement a vector mode for the vector type, > >> > then no operations are supported. */ > >> > @@ -448,7 +448,7 @@ can_vec_perm_const_p (machine_mode mode, const vec_perm_indices &sel, > >> > > >> > if (targetm.vectorize.vec_perm_const != NULL) > >> > { > >> > - if (targetm.vectorize.vec_perm_const (mode, NULL_RTX, NULL_RTX, > >> > + if (targetm.vectorize.vec_perm_const (mode, op_mode, NULL_RTX, NULL_RTX, > >> > NULL_RTX, sel)) > >> > return true; > >> > > >> > @@ -462,6 +462,13 @@ can_vec_perm_const_p (machine_mode mode, const vec_perm_indices &sel, > >> > return false; > >> > } > >> > > >> > +bool > >> > +can_vec_perm_const_p (machine_mode mode, const vec_perm_indices &sel, > >> > + bool allow_variable_p) > >> > +{ > >> > + return can_vec_perm_const_p (mode, mode, sel, allow_variable_p); > >> > +} > >> > + > >> > >> I can understand why you went for this, but now that we've opened > >> the door to mismatched modes, I think it would be better if all callers > >> specified the input mode explicitly. > >> > >> > diff --git a/gcc/optabs.cc b/gcc/optabs.cc > >> > index 3d8fa3abdfe..55f10c41789 100644 > >> > --- a/gcc/optabs.cc > >> > +++ b/gcc/optabs.cc > >> > @@ -6250,7 +6250,9 @@ expand_vec_perm_const (machine_mode mode, rtx v0, rtx v1, > >> > if (single_arg_p) > >> > v1 = v0; > >> > > >> > - if (targetm.vectorize.vec_perm_const (mode, target, v0, v1, indices)) > >> > + gcc_checking_assert (GET_MODE (v0) == GET_MODE (v1)); > >> > + machine_mode op_mode = GET_MODE (v0); > >> > + if (targetm.vectorize.vec_perm_const (mode, op_mode, target, v0, v1, indices)) > >> > return target; > >> > } > >> > > >> > >> (FWIW, I agree the assert is worth having.) > > Hi, > > I updated the patch with doc and adjusted callers to explicitly pass op_mode. > > Bootstrapped + tested on x86_64-linux-gnu and aarch64-linux-gnu. > > Does it look OK to commit ? > > > > […] > > diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi > > index c5006afc00d..f53068c5c53 100644 > > --- a/gcc/doc/tm.texi > > +++ b/gcc/doc/tm.texi > > @@ -6088,13 +6088,13 @@ for the given scalar type @var{type}. @var{is_packed} is false if the scalar > > access using @var{type} is known to be naturally aligned. > > @end deftypefn > > > > -@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode @var{mode}, rtx @var{output}, rtx @var{in0}, rtx @var{in1}, const vec_perm_indices @var{&sel}) > > +@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode @var{mode}, machine_mode @var{op_mode}, rtx @var{output}, rtx @var{in0}, rtx @var{in1}, const vec_perm_indices @var{&sel}) > > This hook is used to test whether the target can permute up to two > > -vectors of mode @var{mode} using the permutation vector @code{sel}, and > > +vectors of mode @var{op_mode} using the permutation vector @code{sel}, and > > also to emit such a permutation. In the former case @var{in0}, @var{in1} > > and @var{out} are all null. In the latter case @var{in0} and @var{in1} are > > How about: > > This hook is used to test whether the target can permute up to two > vectors of mode @var{op_mode} using the permutation vector @code{sel}, > producing a vector of mode @var{mode}. The hook is also used to emit > such a permutation. > > When the hook is being to test whether the target supports a permutation, > @var{in0}, @var{in1}, and @var{out} are all null. When the hook is > being used to emit a permutation, @var{in0} and @var{in1} are … > > so that we describe @var{mode} for both the testing and non-testing cases. > > > -the source vectors and @var{out} is the destination vector; all three are > > -operands of mode @var{mode}. @var{in1} is the same as @var{in0} if > > +the source vectors of mode @var{op_mode} and @var{out} is the destination vector > > +of mode @var{mode}. @var{in1} is the same as @var{in0} if > > @var{sel} describes a permutation on one vector instead of two. > > > > Return true if the operation is possible, emitting instructions for it > > diff --git a/gcc/match.pd b/gcc/match.pd > > index f5efa77560c..b06d7013603 100644 > > --- a/gcc/match.pd > > +++ b/gcc/match.pd > > @@ -7703,12 +7703,14 @@ and, > > 2-argument version. */ > > tree oldop2 = op2; > > if (sel.ninputs () == 2 > > - || can_vec_perm_const_p (TYPE_MODE (type), sel, false)) > > + || can_vec_perm_const_p (TYPE_MODE (type), TYPE_MODE (type), > > + sel, false)) > > op2 = vec_perm_indices_to_tree (TREE_TYPE (op2), sel); > > else > > { > > vec_perm_indices sel2 (builder, 2, nelts); > > - if (can_vec_perm_const_p (TYPE_MODE (type), sel2, false)) > > + if (can_vec_perm_const_p (TYPE_MODE (type), TYPE_MODE (type), > > + sel2, false)) > > op2 = vec_perm_indices_to_tree (TREE_TYPE (op2), sel2); > > else > > /* Not directly supported with either encoding, > > Since this is matching an existing VEC_PERM_EXPR, we need to distinguish > between mode and op_mode. Might be worth adding both mode and op_mode > to the outermost “with” and changing later tests accordingly. > > > diff --git a/gcc/optabs-query.h b/gcc/optabs-query.h > > index b9c9fd6f64d..32e52b20109 100644 > > --- a/gcc/optabs-query.h > > +++ b/gcc/optabs-query.h > > @@ -178,7 +178,7 @@ bool can_conditionally_move_p (machine_mode mode); > > opt_machine_mode qimode_for_vec_perm (machine_mode); > > bool selector_fits_mode_p (machine_mode, const vec_perm_indices &); > > bool can_vec_perm_var_p (machine_mode); > > -bool can_vec_perm_const_p (machine_mode, const vec_perm_indices &, > > +bool can_vec_perm_const_p (machine_mode, machine_mode, const vec_perm_indices &, > > bool = true); > > Nit: long line. > > > /* Find a widening optab even if it doesn't widen as much as we want. */ > > #define find_widening_optab_handler(A, B, C) \ > > […] > > diff --git a/gcc/tree-vect-generic.cc b/gcc/tree-vect-generic.cc > > index d99e3207fbe..17de3375127 100644 > > --- a/gcc/tree-vect-generic.cc > > +++ b/gcc/tree-vect-generic.cc > > @@ -1527,7 +1527,8 @@ lower_vec_perm (gimple_stmt_iterator *gsi) > > && tree_to_vec_perm_builder (&sel_int, mask)) > > { > > vec_perm_indices indices (sel_int, 2, elements); > > - if (can_vec_perm_const_p (TYPE_MODE (vect_type), indices)) > > + machine_mode vmode = TYPE_MODE (vect_type); > > + if (can_vec_perm_const_p (vmode, vmode, indices)) > > Similarly here: since this is testing an existing VEC_PERM_EXPR, I think > we need to read the mode of the lhs and pass both modes. > > > […] > > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc > > index 8327e9d047e..207d1097d5b 100644 > > --- a/gcc/tree-vect-stmts.cc > > +++ b/gcc/tree-vect-stmts.cc > > @@ -2016,7 +2016,7 @@ perm_mask_for_reverse (tree vectype) > > sel.quick_push (nunits - 1 - i); > > > > vec_perm_indices indices (sel, 1, nunits); > > - if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices)) > > + if (!can_vec_perm_const_p (TYPE_MODE (vectype), TYPE_MODE (vectype), indices)) > > Nit: long line. Hi Richard, Thanks for the suggestions. Does the attached patch look OK ? Bootstrapped+tested on x86_64-linux-gnu and aarch64-linux-gnu. Thanks, Prathamesh > > Thanks, > Richard diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi index c5006afc00d..0a3c733ada9 100644 --- a/gcc/doc/tm.texi +++ b/gcc/doc/tm.texi @@ -6088,14 +6088,18 @@ for the given scalar type @var{type}. @var{is_packed} is false if the scalar access using @var{type} is known to be naturally aligned. @end deftypefn -@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode @var{mode}, rtx @var{output}, rtx @var{in0}, rtx @var{in1}, const vec_perm_indices @var{&sel}) +@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode @var{mode}, machine_mode @var{op_mode}, rtx @var{output}, rtx @var{in0}, rtx @var{in1}, const vec_perm_indices @var{&sel}) This hook is used to test whether the target can permute up to two -vectors of mode @var{mode} using the permutation vector @code{sel}, and -also to emit such a permutation. In the former case @var{in0}, @var{in1} -and @var{out} are all null. In the latter case @var{in0} and @var{in1} are -the source vectors and @var{out} is the destination vector; all three are -operands of mode @var{mode}. @var{in1} is the same as @var{in0} if -@var{sel} describes a permutation on one vector instead of two. +vectors of mode @var{op_mode} using the permutation vector @code{sel}, +producing a vector of mode @var{mode}.The hook is also used to emit such +a permutation. + +When the hook is being used to test whether the target supports a permutation, +@var{in0}, @var{in1}, and @var{out} are all null.When the hook is being used +to emit a permutation, @var{in0} and @var{in1} are the source vectors of mode +@var{op_mode} and @var{out} is the destination vector of mode @var{mode}. +@var{in1} is the same as @var{in0} if @var{sel} describes a permutation on one +vector instead of two. Return true if the operation is possible, emitting instructions for it if rtxes are provided. diff --git a/gcc/match.pd b/gcc/match.pd index f5efa77560c..f2a527d9c42 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -7596,6 +7596,8 @@ and, (with { tree op0 = @0, op1 = @1, op2 = @2; + machine_mode result_mode = TYPE_MODE (type); + machine_mode op_mode = TYPE_MODE (TREE_TYPE (op0)); /* Build a vector of integers from the tree mask. */ vec_perm_builder builder; @@ -7703,12 +7705,12 @@ and, 2-argument version. */ tree oldop2 = op2; if (sel.ninputs () == 2 - || can_vec_perm_const_p (TYPE_MODE (type), sel, false)) + || can_vec_perm_const_p (result_mode, op_mode, sel, false)) op2 = vec_perm_indices_to_tree (TREE_TYPE (op2), sel); else { vec_perm_indices sel2 (builder, 2, nelts); - if (can_vec_perm_const_p (TYPE_MODE (type), sel2, false)) + if (can_vec_perm_const_p (result_mode, op_mode, sel2, false)) op2 = vec_perm_indices_to_tree (TREE_TYPE (op2), sel2); else /* Not directly supported with either encoding, diff --git a/gcc/optabs-query.cc b/gcc/optabs-query.cc index 68dc679cc6a..809482b8092 100644 --- a/gcc/optabs-query.cc +++ b/gcc/optabs-query.cc @@ -407,9 +407,9 @@ can_vec_perm_var_p (machine_mode mode) } /* Return true if the target directly supports VEC_PERM_EXPRs on vectors - of mode MODE using the selector SEL. ALLOW_VARIABLE_P is true if it - is acceptable to force the selector into a register and use a variable - permute (if the target supports that). + of mode OP_MODE and result vector of mode MODE using the selector SEL. + ALLOW_VARIABLE_P is true if it is acceptable to force the selector into a + register and use a variable permute (if the target supports that). Note that additional permutations representing whole-vector shifts may also be handled via the vec_shr or vec_shl optab, but only where the @@ -417,8 +417,8 @@ can_vec_perm_var_p (machine_mode mode) with here. */ bool -can_vec_perm_const_p (machine_mode mode, const vec_perm_indices &sel, - bool allow_variable_p) +can_vec_perm_const_p (machine_mode mode, machine_mode op_mode, + const vec_perm_indices &sel, bool allow_variable_p) { /* If the target doesn't implement a vector mode for the vector type, then no operations are supported. */ @@ -448,7 +448,7 @@ can_vec_perm_const_p (machine_mode mode, const vec_perm_indices &sel, if (targetm.vectorize.vec_perm_const != NULL) { - if (targetm.vectorize.vec_perm_const (mode, NULL_RTX, NULL_RTX, + if (targetm.vectorize.vec_perm_const (mode, op_mode, NULL_RTX, NULL_RTX, NULL_RTX, sel)) return true; @@ -534,7 +534,7 @@ can_mult_highpart_p (machine_mode mode, bool uns_p) + (i & ~1) + ((i & 1) ? nunits : 0)); vec_perm_indices indices (sel, 2, nunits); - if (can_vec_perm_const_p (mode, indices)) + if (can_vec_perm_const_p (mode, mode, indices)) return 2; } } @@ -550,7 +550,7 @@ can_mult_highpart_p (machine_mode mode, bool uns_p) for (unsigned int i = 0; i < 3; ++i) sel.quick_push (2 * i + (BYTES_BIG_ENDIAN ? 0 : 1)); vec_perm_indices indices (sel, 2, nunits); - if (can_vec_perm_const_p (mode, indices)) + if (can_vec_perm_const_p (mode, mode, indices)) return 3; } } diff --git a/gcc/optabs-query.h b/gcc/optabs-query.h index b9c9fd6f64d..945d2a803b6 100644 --- a/gcc/optabs-query.h +++ b/gcc/optabs-query.h @@ -178,8 +178,8 @@ bool can_conditionally_move_p (machine_mode mode); opt_machine_mode qimode_for_vec_perm (machine_mode); bool selector_fits_mode_p (machine_mode, const vec_perm_indices &); bool can_vec_perm_var_p (machine_mode); -bool can_vec_perm_const_p (machine_mode, const vec_perm_indices &, - bool = true); +bool can_vec_perm_const_p (machine_mode, machine_mode, + const vec_perm_indices &, bool = true); /* Find a widening optab even if it doesn't widen as much as we want. */ #define find_widening_optab_handler(A, B, C) \ find_widening_optab_handler_and_mode (A, B, C, NULL) diff --git a/gcc/optabs.cc b/gcc/optabs.cc index 3d8fa3abdfe..c0a68471d2d 100644 --- a/gcc/optabs.cc +++ b/gcc/optabs.cc @@ -6250,7 +6250,10 @@ expand_vec_perm_const (machine_mode mode, rtx v0, rtx v1, if (single_arg_p) v1 = v0; - if (targetm.vectorize.vec_perm_const (mode, target, v0, v1, indices)) + gcc_checking_assert (GET_MODE (v0) == GET_MODE (v1)); + machine_mode op_mode = GET_MODE (v0); + if (targetm.vectorize.vec_perm_const (mode, op_mode, target, v0, v1, + indices)) return target; } @@ -6264,7 +6267,7 @@ expand_vec_perm_const (machine_mode mode, rtx v0, rtx v1, v0_qi = gen_lowpart (qimode, v0); v1_qi = gen_lowpart (qimode, v1); if (targetm.vectorize.vec_perm_const != NULL - && targetm.vectorize.vec_perm_const (qimode, target_qi, v0_qi, + && targetm.vectorize.vec_perm_const (qimode, qimode, target_qi, v0_qi, v1_qi, qimode_indices)) return gen_lowpart (mode, target_qi); } diff --git a/gcc/target.def b/gcc/target.def index d85adf36a39..db90552fc90 100644 --- a/gcc/target.def +++ b/gcc/target.def @@ -1878,12 +1878,16 @@ access using @var{type} is known to be naturally aligned.", DEFHOOK (vec_perm_const, "This hook is used to test whether the target can permute up to two\n\ -vectors of mode @var{mode} using the permutation vector @code{sel}, and\n\ -also to emit such a permutation. In the former case @var{in0}, @var{in1}\n\ -and @var{out} are all null. In the latter case @var{in0} and @var{in1} are\n\ -the source vectors and @var{out} is the destination vector; all three are\n\ -operands of mode @var{mode}. @var{in1} is the same as @var{in0} if\n\ -@var{sel} describes a permutation on one vector instead of two.\n\ +vectors of mode @var{op_mode} using the permutation vector @code{sel},\n\ +producing a vector of mode @var{mode}.The hook is also used to emit such\n\ +a permutation.\n\ +\n\ +When the hook is being used to test whether the target supports a permutation,\n\ +@var{in0}, @var{in1}, and @var{out} are all null.When the hook is being used\n\ +to emit a permutation, @var{in0} and @var{in1} are the source vectors of mode\n\ +@var{op_mode} and @var{out} is the destination vector of mode @var{mode}.\n\ +@var{in1} is the same as @var{in0} if @var{sel} describes a permutation on one\n\ +vector instead of two.\n\ \n\ Return true if the operation is possible, emitting instructions for it\n\ if rtxes are provided.\n\ @@ -1894,7 +1898,7 @@ try the equivalent byte operation. If that also fails, it will try forcing\n\ the selector into a register and using the @var{vec_perm@var{mode}}\n\ instruction pattern. There is no need for the hook to handle these two\n\ implementation approaches itself.", - bool, (machine_mode mode, rtx output, rtx in0, rtx in1, + bool, (machine_mode mode, machine_mode op_mode, rtx output, rtx in0, rtx in1, const vec_perm_indices &sel), NULL) diff --git a/gcc/tree-ssa-forwprop.cc b/gcc/tree-ssa-forwprop.cc index 48cab5844e0..e253e848796 100644 --- a/gcc/tree-ssa-forwprop.cc +++ b/gcc/tree-ssa-forwprop.cc @@ -3016,7 +3016,8 @@ simplify_vector_constructor (gimple_stmt_iterator *gsi) : (elts[0].second == 0 && elts[0].first == 0 ? 0 : refnelts) + i); vec_perm_indices indices (sel, orig[1] ? 2 : 1, refnelts); - if (!can_vec_perm_const_p (TYPE_MODE (perm_type), indices)) + machine_mode vmode = TYPE_MODE (perm_type); + if (!can_vec_perm_const_p (vmode, vmode, indices)) return false; mask_type = build_vector_type (build_nonstandard_integer_type (elem_size, 1), @@ -3065,7 +3066,8 @@ simplify_vector_constructor (gimple_stmt_iterator *gsi) sel.quick_push (elts[i].first ? elts[i].second + nelts : i); vec_perm_indices indices (sel, 2, nelts); - if (!can_vec_perm_const_p (TYPE_MODE (type), indices)) + machine_mode vmode = TYPE_MODE (type); + if (!can_vec_perm_const_p (vmode, vmode, indices)) return false; mask_type = build_vector_type (build_nonstandard_integer_type (elem_size, 1), diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc index 09223baf718..d20a10a1524 100644 --- a/gcc/tree-vect-data-refs.cc +++ b/gcc/tree-vect-data-refs.cc @@ -5347,7 +5347,7 @@ vect_grouped_store_supported (tree vectype, unsigned HOST_WIDE_INT count) sel[3 * i + nelt2] = 0; } indices.new_vector (sel, 2, nelt); - if (!can_vec_perm_const_p (mode, indices)) + if (!can_vec_perm_const_p (mode, mode, indices)) { if (dump_enabled_p ()) dump_printf (MSG_MISSED_OPTIMIZATION, @@ -5365,7 +5365,7 @@ vect_grouped_store_supported (tree vectype, unsigned HOST_WIDE_INT count) sel[3 * i + nelt2] = nelt + j2++; } indices.new_vector (sel, 2, nelt); - if (!can_vec_perm_const_p (mode, indices)) + if (!can_vec_perm_const_p (mode, mode, indices)) { if (dump_enabled_p ()) dump_printf (MSG_MISSED_OPTIMIZATION, @@ -5390,12 +5390,12 @@ vect_grouped_store_supported (tree vectype, unsigned HOST_WIDE_INT count) sel[i * 2 + 1] = i + nelt; } vec_perm_indices indices (sel, 2, nelt); - if (can_vec_perm_const_p (mode, indices)) + if (can_vec_perm_const_p (mode, mode, indices)) { for (i = 0; i < 6; i++) sel[i] += exact_div (nelt, 2); indices.new_vector (sel, 2, nelt); - if (can_vec_perm_const_p (mode, indices)) + if (can_vec_perm_const_p (mode, mode, indices)) return true; } } @@ -5963,7 +5963,7 @@ vect_grouped_load_supported (tree vectype, bool single_element_p, else sel[i] = 0; indices.new_vector (sel, 2, nelt); - if (!can_vec_perm_const_p (mode, indices)) + if (!can_vec_perm_const_p (mode, mode, indices)) { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, @@ -5977,7 +5977,7 @@ vect_grouped_load_supported (tree vectype, bool single_element_p, else sel[i] = nelt + ((nelt + k) % 3) + 3 * (j++); indices.new_vector (sel, 2, nelt); - if (!can_vec_perm_const_p (mode, indices)) + if (!can_vec_perm_const_p (mode, mode, indices)) { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, @@ -6000,12 +6000,12 @@ vect_grouped_load_supported (tree vectype, bool single_element_p, for (i = 0; i < 3; i++) sel[i] = i * 2; vec_perm_indices indices (sel, 2, nelt); - if (can_vec_perm_const_p (mode, indices)) + if (can_vec_perm_const_p (mode, mode, indices)) { for (i = 0; i < 3; i++) sel[i] = i * 2 + 1; indices.new_vector (sel, 2, nelt); - if (can_vec_perm_const_p (mode, indices)) + if (can_vec_perm_const_p (mode, mode, indices)) return true; } } @@ -6327,6 +6327,7 @@ vect_shift_permute_load_chain (vec_info *vinfo, vec<tree> dr_chain, gimple *perm_stmt; tree vectype = STMT_VINFO_VECTYPE (stmt_info); + machine_mode vmode = TYPE_MODE (vectype); unsigned int i; loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo); @@ -6351,7 +6352,7 @@ vect_shift_permute_load_chain (vec_info *vinfo, vec<tree> dr_chain, for (i = 0; i < nelt / 2; ++i) sel[nelt / 2 + i] = i * 2 + 1; vec_perm_indices indices (sel, 2, nelt); - if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices)) + if (!can_vec_perm_const_p (vmode, vmode, indices)) { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, @@ -6366,7 +6367,7 @@ vect_shift_permute_load_chain (vec_info *vinfo, vec<tree> dr_chain, for (i = 0; i < nelt / 2; ++i) sel[nelt / 2 + i] = i * 2; indices.new_vector (sel, 2, nelt); - if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices)) + if (!can_vec_perm_const_p (vmode, vmode, indices)) { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, @@ -6381,7 +6382,7 @@ vect_shift_permute_load_chain (vec_info *vinfo, vec<tree> dr_chain, for (i = 0; i < nelt; i++) sel[i] = nelt / 2 + i; indices.new_vector (sel, 2, nelt); - if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices)) + if (!can_vec_perm_const_p (vmode, vmode, indices)) { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, @@ -6397,7 +6398,7 @@ vect_shift_permute_load_chain (vec_info *vinfo, vec<tree> dr_chain, for (i = nelt / 2; i < nelt; i++) sel[i] = nelt + i; indices.new_vector (sel, 2, nelt); - if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices)) + if (!can_vec_perm_const_p (vmode, vmode, indices)) { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, @@ -6461,7 +6462,7 @@ vect_shift_permute_load_chain (vec_info *vinfo, vec<tree> dr_chain, k++; } vec_perm_indices indices (sel, 2, nelt); - if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices)) + if (!can_vec_perm_const_p (vmode, vmode, indices)) { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, @@ -6476,7 +6477,7 @@ vect_shift_permute_load_chain (vec_info *vinfo, vec<tree> dr_chain, for (i = 0; i < nelt; i++) sel[i] = 2 * (nelt / 3) + (nelt % 3) + i; indices.new_vector (sel, 2, nelt); - if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices)) + if (!can_vec_perm_const_p (vmode, vmode, indices)) { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, @@ -6490,7 +6491,7 @@ vect_shift_permute_load_chain (vec_info *vinfo, vec<tree> dr_chain, for (i = 0; i < nelt; i++) sel[i] = 2 * (nelt / 3) + 1 + i; indices.new_vector (sel, 2, nelt); - if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices)) + if (!can_vec_perm_const_p (vmode, vmode, indices)) { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, @@ -6504,7 +6505,7 @@ vect_shift_permute_load_chain (vec_info *vinfo, vec<tree> dr_chain, for (i = 0; i < nelt; i++) sel[i] = (nelt / 3) + (nelt % 3) / 2 + i; indices.new_vector (sel, 2, nelt); - if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices)) + if (!can_vec_perm_const_p (vmode, vmode, indices)) { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, @@ -6518,7 +6519,7 @@ vect_shift_permute_load_chain (vec_info *vinfo, vec<tree> dr_chain, for (i = 0; i < nelt; i++) sel[i] = 2 * (nelt / 3) + (nelt % 3) / 2 + i; indices.new_vector (sel, 2, nelt); - if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices)) + if (!can_vec_perm_const_p (vmode, vmode, indices)) { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, diff --git a/gcc/tree-vect-generic.cc b/gcc/tree-vect-generic.cc index d99e3207fbe..533ea68db7d 100644 --- a/gcc/tree-vect-generic.cc +++ b/gcc/tree-vect-generic.cc @@ -1527,7 +1527,10 @@ lower_vec_perm (gimple_stmt_iterator *gsi) && tree_to_vec_perm_builder (&sel_int, mask)) { vec_perm_indices indices (sel_int, 2, elements); - if (can_vec_perm_const_p (TYPE_MODE (vect_type), indices)) + machine_mode vmode = TYPE_MODE (vect_type); + tree lhs_type = TREE_TYPE (gimple_assign_lhs (stmt)); + machine_mode lhs_mode = TYPE_MODE (lhs_type); + if (can_vec_perm_const_p (lhs_mode, vmode, indices)) { gimple_assign_set_rhs3 (stmt, mask); update_stmt (stmt); diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc index 1d4337eb261..11dc6cbf576 100644 --- a/gcc/tree-vect-loop-manip.cc +++ b/gcc/tree-vect-loop-manip.cc @@ -312,7 +312,8 @@ interleave_supported_p (vec_perm_indices *indices, tree vectype, sel.quick_push (base + i + nelts); } indices->new_vector (sel, 2, nelts); - return can_vec_perm_const_p (TYPE_MODE (vectype), *indices); + return can_vec_perm_const_p (TYPE_MODE (vectype), TYPE_MODE (vectype), + *indices); } /* Try to use permutes to define the masks in DEST_RGM using the masks diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc index ab7dade1c74..7ac938b3c7b 100644 --- a/gcc/tree-vect-loop.cc +++ b/gcc/tree-vect-loop.cc @@ -4527,7 +4527,7 @@ have_whole_vector_shift (machine_mode mode) { calc_vec_perm_mask_for_shift (i, nelt, &sel); indices.new_vector (sel, 2, nelt); - if (!can_vec_perm_const_p (mode, indices, false)) + if (!can_vec_perm_const_p (mode, mode, indices, false)) return false; } return true; diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc index 8c61eb965a6..51803ec97c2 100644 --- a/gcc/tree-vect-patterns.cc +++ b/gcc/tree-vect-patterns.cc @@ -2643,7 +2643,8 @@ vect_recog_rotate_pattern (vec_info *vinfo, vec_perm_indices indices (elts, 1, TYPE_VECTOR_SUBPARTS (char_vectype)); - if (can_vec_perm_const_p (TYPE_MODE (char_vectype), indices)) + machine_mode vmode = TYPE_MODE (char_vectype); + if (can_vec_perm_const_p (vmode, vmode, indices)) { /* vectorizable_bswap can handle the __builtin_bswap16 if we undo the argument promotion. */ diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc index cdfff1ab9f6..fe9361c338e 100644 --- a/gcc/tree-vect-slp.cc +++ b/gcc/tree-vect-slp.cc @@ -420,8 +420,9 @@ can_duplicate_and_interleave_p (vec_info *vinfo, unsigned int count, } vec_perm_indices indices1 (sel1, 2, nelts); vec_perm_indices indices2 (sel2, 2, nelts); - if (can_vec_perm_const_p (TYPE_MODE (vector_type), indices1) - && can_vec_perm_const_p (TYPE_MODE (vector_type), indices2)) + machine_mode vmode = TYPE_MODE (vector_type); + if (can_vec_perm_const_p (vmode, vmode, indices1) + && can_vec_perm_const_p (vmode, vmode, indices2)) { if (nvectors_out) *nvectors_out = nvectors; @@ -6762,7 +6763,7 @@ vect_transform_slp_perm_load (vec_info *vinfo, if (index == count && !noop_p) { indices.new_vector (mask, second_vec_index == -1 ? 1 : 2, nunits); - if (!can_vec_perm_const_p (mode, indices)) + if (!can_vec_perm_const_p (mode, mode, indices)) { if (dump_enabled_p ()) { @@ -7122,8 +7123,9 @@ vectorizable_slp_permutation (vec_info *vinfo, gimple_stmt_iterator *gsi, { indices.new_vector (mask, second_vec.first == -1U ? 1 : 2, nunits); bool identity_p = indices.series_p (0, 1, 0, 1); + machine_mode vmode = TYPE_MODE (vectype); if (!identity_p - && !can_vec_perm_const_p (TYPE_MODE (vectype), indices)) + && !can_vec_perm_const_p (vmode, vmode, indices)) { if (dump_enabled_p ()) { diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index 8327e9d047e..346d8ce2804 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -2016,7 +2016,8 @@ perm_mask_for_reverse (tree vectype) sel.quick_push (nunits - 1 - i); vec_perm_indices indices (sel, 1, nunits); - if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices)) + if (!can_vec_perm_const_p (TYPE_MODE (vectype), TYPE_MODE (vectype), + indices)) return NULL_TREE; return vect_gen_perm_mask_checked (vectype, indices); } @@ -3168,7 +3169,8 @@ vectorizable_bswap (vec_info *vinfo, elts.quick_push ((i + 1) * word_bytes - j - 1); vec_perm_indices indices (elts, 1, num_bytes); - if (!can_vec_perm_const_p (TYPE_MODE (char_vectype), indices)) + machine_mode vmode = TYPE_MODE (char_vectype); + if (!can_vec_perm_const_p (vmode, vmode, indices)) return false; if (! vec_stmt) @@ -6712,7 +6714,7 @@ scan_store_can_perm_p (tree vectype, tree init, sel[j] = nunits + k; } vec_perm_indices indices (sel, i == units_log2 ? 1 : 2, nunits); - if (!can_vec_perm_const_p (vec_mode, indices)) + if (!can_vec_perm_const_p (vec_mode, vec_mode, indices)) { if (i == units_log2) return -1; @@ -8582,7 +8584,8 @@ vect_gen_perm_mask_any (tree vectype, const vec_perm_indices &sel) tree vect_gen_perm_mask_checked (tree vectype, const vec_perm_indices &sel) { - gcc_assert (can_vec_perm_const_p (TYPE_MODE (vectype), sel)); + machine_mode vmode = TYPE_MODE (vectype); + gcc_assert (can_vec_perm_const_p (vmode, vmode, sel)); return vect_gen_perm_mask_any (vectype, sel); }
Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes: > diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi > index c5006afc00d..0a3c733ada9 100644 > --- a/gcc/doc/tm.texi > +++ b/gcc/doc/tm.texi > @@ -6088,14 +6088,18 @@ for the given scalar type @var{type}. @var{is_packed} is false if the scalar > access using @var{type} is known to be naturally aligned. > @end deftypefn > > -@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode @var{mode}, rtx @var{output}, rtx @var{in0}, rtx @var{in1}, const vec_perm_indices @var{&sel}) > +@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode @var{mode}, machine_mode @var{op_mode}, rtx @var{output}, rtx @var{in0}, rtx @var{in1}, const vec_perm_indices @var{&sel}) > This hook is used to test whether the target can permute up to two > -vectors of mode @var{mode} using the permutation vector @code{sel}, and > -also to emit such a permutation. In the former case @var{in0}, @var{in1} > -and @var{out} are all null. In the latter case @var{in0} and @var{in1} are > -the source vectors and @var{out} is the destination vector; all three are > -operands of mode @var{mode}. @var{in1} is the same as @var{in0} if > -@var{sel} describes a permutation on one vector instead of two. > +vectors of mode @var{op_mode} using the permutation vector @code{sel}, > +producing a vector of mode @var{mode}.The hook is also used to emit such Should be two spaces between “@var{mode}.” and “The”. > +a permutation. > + > +When the hook is being used to test whether the target supports a permutation, > +@var{in0}, @var{in1}, and @var{out} are all null.When the hook is being used Same here: missing spaces before “When”. > +to emit a permutation, @var{in0} and @var{in1} are the source vectors of mode > +@var{op_mode} and @var{out} is the destination vector of mode @var{mode}. > +@var{in1} is the same as @var{in0} if @var{sel} describes a permutation on one > +vector instead of two. > > Return true if the operation is possible, emitting instructions for it > if rtxes are provided. > diff --git a/gcc/match.pd b/gcc/match.pd > index f5efa77560c..f2a527d9c42 100644 > --- a/gcc/match.pd > +++ b/gcc/match.pd > @@ -7596,6 +7596,8 @@ and, > (with > { > tree op0 = @0, op1 = @1, op2 = @2; > + machine_mode result_mode = TYPE_MODE (type); > + machine_mode op_mode = TYPE_MODE (TREE_TYPE (op0)); > > /* Build a vector of integers from the tree mask. */ > vec_perm_builder builder; > @@ -7703,12 +7705,12 @@ and, > 2-argument version. */ > tree oldop2 = op2; > if (sel.ninputs () == 2 > - || can_vec_perm_const_p (TYPE_MODE (type), sel, false)) > + || can_vec_perm_const_p (result_mode, op_mode, sel, false)) > op2 = vec_perm_indices_to_tree (TREE_TYPE (op2), sel); > else > { > vec_perm_indices sel2 (builder, 2, nelts); > - if (can_vec_perm_const_p (TYPE_MODE (type), sel2, false)) > + if (can_vec_perm_const_p (result_mode, op_mode, sel2, false)) > op2 = vec_perm_indices_to_tree (TREE_TYPE (op2), sel2); > else > /* Not directly supported with either encoding, Please replace the use of TYPE_MODE here: /* See if the permutation is performing a single element insert from a CONSTRUCTOR or constant and use a BIT_INSERT_EXPR in that case. But only if the vector mode is supported, otherwise this is invalid GIMPLE. */ if (TYPE_MODE (type) != BLKmode as well. OK with those changes, thanks. Richard
On Tue, 24 May 2022 at 14:50, Richard Sandiford <richard.sandiford@arm.com> wrote: > > Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes: > > diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi > > index c5006afc00d..0a3c733ada9 100644 > > --- a/gcc/doc/tm.texi > > +++ b/gcc/doc/tm.texi > > @@ -6088,14 +6088,18 @@ for the given scalar type @var{type}. @var{is_packed} is false if the scalar > > access using @var{type} is known to be naturally aligned. > > @end deftypefn > > > > -@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode @var{mode}, rtx @var{output}, rtx @var{in0}, rtx @var{in1}, const vec_perm_indices @var{&sel}) > > +@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode @var{mode}, machine_mode @var{op_mode}, rtx @var{output}, rtx @var{in0}, rtx @var{in1}, const vec_perm_indices @var{&sel}) > > This hook is used to test whether the target can permute up to two > > -vectors of mode @var{mode} using the permutation vector @code{sel}, and > > -also to emit such a permutation. In the former case @var{in0}, @var{in1} > > -and @var{out} are all null. In the latter case @var{in0} and @var{in1} are > > -the source vectors and @var{out} is the destination vector; all three are > > -operands of mode @var{mode}. @var{in1} is the same as @var{in0} if > > -@var{sel} describes a permutation on one vector instead of two. > > +vectors of mode @var{op_mode} using the permutation vector @code{sel}, > > +producing a vector of mode @var{mode}.The hook is also used to emit such > > Should be two spaces between “@var{mode}.” and “The”. > > > +a permutation. > > + > > +When the hook is being used to test whether the target supports a permutation, > > +@var{in0}, @var{in1}, and @var{out} are all null.When the hook is being used > > Same here: missing spaces before “When”. > > > +to emit a permutation, @var{in0} and @var{in1} are the source vectors of mode > > +@var{op_mode} and @var{out} is the destination vector of mode @var{mode}. > > +@var{in1} is the same as @var{in0} if @var{sel} describes a permutation on one > > +vector instead of two. > > > > Return true if the operation is possible, emitting instructions for it > > if rtxes are provided. > > diff --git a/gcc/match.pd b/gcc/match.pd > > index f5efa77560c..f2a527d9c42 100644 > > --- a/gcc/match.pd > > +++ b/gcc/match.pd > > @@ -7596,6 +7596,8 @@ and, > > (with > > { > > tree op0 = @0, op1 = @1, op2 = @2; > > + machine_mode result_mode = TYPE_MODE (type); > > + machine_mode op_mode = TYPE_MODE (TREE_TYPE (op0)); > > > > /* Build a vector of integers from the tree mask. */ > > vec_perm_builder builder; > > @@ -7703,12 +7705,12 @@ and, > > 2-argument version. */ > > tree oldop2 = op2; > > if (sel.ninputs () == 2 > > - || can_vec_perm_const_p (TYPE_MODE (type), sel, false)) > > + || can_vec_perm_const_p (result_mode, op_mode, sel, false)) > > op2 = vec_perm_indices_to_tree (TREE_TYPE (op2), sel); > > else > > { > > vec_perm_indices sel2 (builder, 2, nelts); > > - if (can_vec_perm_const_p (TYPE_MODE (type), sel2, false)) > > + if (can_vec_perm_const_p (result_mode, op_mode, sel2, false)) > > op2 = vec_perm_indices_to_tree (TREE_TYPE (op2), sel2); > > else > > /* Not directly supported with either encoding, > > Please replace the use of TYPE_MODE here: > > /* See if the permutation is performing a single element > insert from a CONSTRUCTOR or constant and use a BIT_INSERT_EXPR > in that case. But only if the vector mode is supported, > otherwise this is invalid GIMPLE. */ > if (TYPE_MODE (type) != BLKmode > > as well. > > OK with those changes, thanks. Thanks, committed the patch in ae8decf1d2b8329af59592b4fa78ee8dfab3ba5e. Thanks, Prathamesh > > Richard
On Tue, May 24, 2022 at 9:22 PM Prathamesh Kulkarni via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > On Tue, 24 May 2022 at 14:50, Richard Sandiford > <richard.sandiford@arm.com> wrote: > > > > Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes: > > > diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi > > > index c5006afc00d..0a3c733ada9 100644 > > > --- a/gcc/doc/tm.texi > > > +++ b/gcc/doc/tm.texi > > > @@ -6088,14 +6088,18 @@ for the given scalar type @var{type}. @var{is_packed} is false if the scalar > > > access using @var{type} is known to be naturally aligned. > > > @end deftypefn > > > > > > -@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode @var{mode}, rtx @var{output}, rtx @var{in0}, rtx @var{in1}, const vec_perm_indices @var{&sel}) > > > +@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode @var{mode}, machine_mode @var{op_mode}, rtx @var{output}, rtx @var{in0}, rtx @var{in1}, const vec_perm_indices @var{&sel}) > > > This hook is used to test whether the target can permute up to two > > > -vectors of mode @var{mode} using the permutation vector @code{sel}, and > > > -also to emit such a permutation. In the former case @var{in0}, @var{in1} > > > -and @var{out} are all null. In the latter case @var{in0} and @var{in1} are > > > -the source vectors and @var{out} is the destination vector; all three are > > > -operands of mode @var{mode}. @var{in1} is the same as @var{in0} if > > > -@var{sel} describes a permutation on one vector instead of two. > > > +vectors of mode @var{op_mode} using the permutation vector @code{sel}, > > > +producing a vector of mode @var{mode}.The hook is also used to emit such > > > > Should be two spaces between “@var{mode}.” and “The”. > > > > > +a permutation. > > > + > > > +When the hook is being used to test whether the target supports a permutation, > > > +@var{in0}, @var{in1}, and @var{out} are all null.When the hook is being used > > > > Same here: missing spaces before “When”. > > > > > +to emit a permutation, @var{in0} and @var{in1} are the source vectors of mode > > > +@var{op_mode} and @var{out} is the destination vector of mode @var{mode}. > > > +@var{in1} is the same as @var{in0} if @var{sel} describes a permutation on one > > > +vector instead of two. > > > > > > Return true if the operation is possible, emitting instructions for it > > > if rtxes are provided. > > > diff --git a/gcc/match.pd b/gcc/match.pd > > > index f5efa77560c..f2a527d9c42 100644 > > > --- a/gcc/match.pd > > > +++ b/gcc/match.pd > > > @@ -7596,6 +7596,8 @@ and, > > > (with > > > { > > > tree op0 = @0, op1 = @1, op2 = @2; > > > + machine_mode result_mode = TYPE_MODE (type); > > > + machine_mode op_mode = TYPE_MODE (TREE_TYPE (op0)); > > > > > > /* Build a vector of integers from the tree mask. */ > > > vec_perm_builder builder; > > > @@ -7703,12 +7705,12 @@ and, > > > 2-argument version. */ > > > tree oldop2 = op2; > > > if (sel.ninputs () == 2 > > > - || can_vec_perm_const_p (TYPE_MODE (type), sel, false)) > > > + || can_vec_perm_const_p (result_mode, op_mode, sel, false)) > > > op2 = vec_perm_indices_to_tree (TREE_TYPE (op2), sel); > > > else > > > { > > > vec_perm_indices sel2 (builder, 2, nelts); > > > - if (can_vec_perm_const_p (TYPE_MODE (type), sel2, false)) > > > + if (can_vec_perm_const_p (result_mode, op_mode, sel2, false)) > > > op2 = vec_perm_indices_to_tree (TREE_TYPE (op2), sel2); > > > else > > > /* Not directly supported with either encoding, > > > > Please replace the use of TYPE_MODE here: > > > > /* See if the permutation is performing a single element > > insert from a CONSTRUCTOR or constant and use a BIT_INSERT_EXPR > > in that case. But only if the vector mode is supported, > > otherwise this is invalid GIMPLE. */ > > if (TYPE_MODE (type) != BLKmode > > > > as well. > > > > OK with those changes, thanks. > Thanks, committed the patch in ae8decf1d2b8329af59592b4fa78ee8dfab3ba5e. So the present state allows to ask can_vec_perm_const_p but the implementation asks for 431 if (direct_optab_handler (vec_perm_optab, mode) != CODE_FOR_nothing) 432 return true; which then breaks. Also the VEC_PERMs are not yet valid. Any reason this was committed half-way through the review process of the series? At least I have a user in the vectorizer ready - allowing more permutes from existing vectors (of different sizes now) to be SLP vectorized. Thanks, Richard. > Thanks, > Prathamesh > > > > Richard
On Wed, 25 May 2022 at 18:27, Richard Biener <richard.guenther@gmail.com> wrote: > > On Tue, May 24, 2022 at 9:22 PM Prathamesh Kulkarni via Gcc-patches > <gcc-patches@gcc.gnu.org> wrote: > > > > On Tue, 24 May 2022 at 14:50, Richard Sandiford > > <richard.sandiford@arm.com> wrote: > > > > > > Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes: > > > > diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi > > > > index c5006afc00d..0a3c733ada9 100644 > > > > --- a/gcc/doc/tm.texi > > > > +++ b/gcc/doc/tm.texi > > > > @@ -6088,14 +6088,18 @@ for the given scalar type @var{type}. @var{is_packed} is false if the scalar > > > > access using @var{type} is known to be naturally aligned. > > > > @end deftypefn > > > > > > > > -@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode @var{mode}, rtx @var{output}, rtx @var{in0}, rtx @var{in1}, const vec_perm_indices @var{&sel}) > > > > +@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode @var{mode}, machine_mode @var{op_mode}, rtx @var{output}, rtx @var{in0}, rtx @var{in1}, const vec_perm_indices @var{&sel}) > > > > This hook is used to test whether the target can permute up to two > > > > -vectors of mode @var{mode} using the permutation vector @code{sel}, and > > > > -also to emit such a permutation. In the former case @var{in0}, @var{in1} > > > > -and @var{out} are all null. In the latter case @var{in0} and @var{in1} are > > > > -the source vectors and @var{out} is the destination vector; all three are > > > > -operands of mode @var{mode}. @var{in1} is the same as @var{in0} if > > > > -@var{sel} describes a permutation on one vector instead of two. > > > > +vectors of mode @var{op_mode} using the permutation vector @code{sel}, > > > > +producing a vector of mode @var{mode}.The hook is also used to emit such > > > > > > Should be two spaces between “@var{mode}.” and “The”. > > > > > > > +a permutation. > > > > + > > > > +When the hook is being used to test whether the target supports a permutation, > > > > +@var{in0}, @var{in1}, and @var{out} are all null.When the hook is being used > > > > > > Same here: missing spaces before “When”. > > > > > > > +to emit a permutation, @var{in0} and @var{in1} are the source vectors of mode > > > > +@var{op_mode} and @var{out} is the destination vector of mode @var{mode}. > > > > +@var{in1} is the same as @var{in0} if @var{sel} describes a permutation on one > > > > +vector instead of two. > > > > > > > > Return true if the operation is possible, emitting instructions for it > > > > if rtxes are provided. > > > > diff --git a/gcc/match.pd b/gcc/match.pd > > > > index f5efa77560c..f2a527d9c42 100644 > > > > --- a/gcc/match.pd > > > > +++ b/gcc/match.pd > > > > @@ -7596,6 +7596,8 @@ and, > > > > (with > > > > { > > > > tree op0 = @0, op1 = @1, op2 = @2; > > > > + machine_mode result_mode = TYPE_MODE (type); > > > > + machine_mode op_mode = TYPE_MODE (TREE_TYPE (op0)); > > > > > > > > /* Build a vector of integers from the tree mask. */ > > > > vec_perm_builder builder; > > > > @@ -7703,12 +7705,12 @@ and, > > > > 2-argument version. */ > > > > tree oldop2 = op2; > > > > if (sel.ninputs () == 2 > > > > - || can_vec_perm_const_p (TYPE_MODE (type), sel, false)) > > > > + || can_vec_perm_const_p (result_mode, op_mode, sel, false)) > > > > op2 = vec_perm_indices_to_tree (TREE_TYPE (op2), sel); > > > > else > > > > { > > > > vec_perm_indices sel2 (builder, 2, nelts); > > > > - if (can_vec_perm_const_p (TYPE_MODE (type), sel2, false)) > > > > + if (can_vec_perm_const_p (result_mode, op_mode, sel2, false)) > > > > op2 = vec_perm_indices_to_tree (TREE_TYPE (op2), sel2); > > > > else > > > > /* Not directly supported with either encoding, > > > > > > Please replace the use of TYPE_MODE here: > > > > > > /* See if the permutation is performing a single element > > > insert from a CONSTRUCTOR or constant and use a BIT_INSERT_EXPR > > > in that case. But only if the vector mode is supported, > > > otherwise this is invalid GIMPLE. */ > > > if (TYPE_MODE (type) != BLKmode > > > > > > as well. > > > > > > OK with those changes, thanks. > > Thanks, committed the patch in ae8decf1d2b8329af59592b4fa78ee8dfab3ba5e. > > So the present state allows to ask can_vec_perm_const_p but the > implementation asks for > > 431 if (direct_optab_handler (vec_perm_optab, mode) != > CODE_FOR_nothing) > 432 return true; > > which then breaks. Also the VEC_PERMs are not yet valid. Any reason this > was committed half-way through the review process of the series? Hi Richard, I am very sorry about that. I thought the patch was approved, and committed it after testing on x86_64 and aarch64. Should I revert it ? Um sorry to ask a silly question -- I am not sure why does the patch break the above condition ? The patch still passes mode to direct_optab_handler as before, and IIUC does not affect modes for vec_perm_optab, so it shouldn't affect the call to direct_optab_handler above ? Thanks, Prathamesh > > At least I have a user in the vectorizer ready - allowing more permutes > from existing vectors (of different sizes now) to be SLP vectorized. > > Thanks, > Richard. > > > Thanks, > > Prathamesh > > > > > > Richard
> Am 25.05.2022 um 21:03 schrieb Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>: > > On Wed, 25 May 2022 at 18:27, Richard Biener <richard.guenther@gmail.com> wrote: >> >>> On Tue, May 24, 2022 at 9:22 PM Prathamesh Kulkarni via Gcc-patches >>> <gcc-patches@gcc.gnu.org> wrote: >>> >>> On Tue, 24 May 2022 at 14:50, Richard Sandiford >>> <richard.sandiford@arm.com> wrote: >>>> >>>> Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes: >>>>> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi >>>>> index c5006afc00d..0a3c733ada9 100644 >>>>> --- a/gcc/doc/tm.texi >>>>> +++ b/gcc/doc/tm.texi >>>>> @@ -6088,14 +6088,18 @@ for the given scalar type @var{type}. @var{is_packed} is false if the scalar >>>>> access using @var{type} is known to be naturally aligned. >>>>> @end deftypefn >>>>> >>>>> -@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode @var{mode}, rtx @var{output}, rtx @var{in0}, rtx @var{in1}, const vec_perm_indices @var{&sel}) >>>>> +@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode @var{mode}, machine_mode @var{op_mode}, rtx @var{output}, rtx @var{in0}, rtx @var{in1}, const vec_perm_indices @var{&sel}) >>>>> This hook is used to test whether the target can permute up to two >>>>> -vectors of mode @var{mode} using the permutation vector @code{sel}, and >>>>> -also to emit such a permutation. In the former case @var{in0}, @var{in1} >>>>> -and @var{out} are all null. In the latter case @var{in0} and @var{in1} are >>>>> -the source vectors and @var{out} is the destination vector; all three are >>>>> -operands of mode @var{mode}. @var{in1} is the same as @var{in0} if >>>>> -@var{sel} describes a permutation on one vector instead of two. >>>>> +vectors of mode @var{op_mode} using the permutation vector @code{sel}, >>>>> +producing a vector of mode @var{mode}.The hook is also used to emit such >>>> >>>> Should be two spaces between “@var{mode}.” and “The”. >>>> >>>>> +a permutation. >>>>> + >>>>> +When the hook is being used to test whether the target supports a permutation, >>>>> +@var{in0}, @var{in1}, and @var{out} are all null.When the hook is being used >>>> >>>> Same here: missing spaces before “When”. >>>> >>>>> +to emit a permutation, @var{in0} and @var{in1} are the source vectors of mode >>>>> +@var{op_mode} and @var{out} is the destination vector of mode @var{mode}. >>>>> +@var{in1} is the same as @var{in0} if @var{sel} describes a permutation on one >>>>> +vector instead of two. >>>>> >>>>> Return true if the operation is possible, emitting instructions for it >>>>> if rtxes are provided. >>>>> diff --git a/gcc/match.pd b/gcc/match.pd >>>>> index f5efa77560c..f2a527d9c42 100644 >>>>> --- a/gcc/match.pd >>>>> +++ b/gcc/match.pd >>>>> @@ -7596,6 +7596,8 @@ and, >>>>> (with >>>>> { >>>>> tree op0 = @0, op1 = @1, op2 = @2; >>>>> + machine_mode result_mode = TYPE_MODE (type); >>>>> + machine_mode op_mode = TYPE_MODE (TREE_TYPE (op0)); >>>>> >>>>> /* Build a vector of integers from the tree mask. */ >>>>> vec_perm_builder builder; >>>>> @@ -7703,12 +7705,12 @@ and, >>>>> 2-argument version. */ >>>>> tree oldop2 = op2; >>>>> if (sel.ninputs () == 2 >>>>> - || can_vec_perm_const_p (TYPE_MODE (type), sel, false)) >>>>> + || can_vec_perm_const_p (result_mode, op_mode, sel, false)) >>>>> op2 = vec_perm_indices_to_tree (TREE_TYPE (op2), sel); >>>>> else >>>>> { >>>>> vec_perm_indices sel2 (builder, 2, nelts); >>>>> - if (can_vec_perm_const_p (TYPE_MODE (type), sel2, false)) >>>>> + if (can_vec_perm_const_p (result_mode, op_mode, sel2, false)) >>>>> op2 = vec_perm_indices_to_tree (TREE_TYPE (op2), sel2); >>>>> else >>>>> /* Not directly supported with either encoding, >>>> >>>> Please replace the use of TYPE_MODE here: >>>> >>>> /* See if the permutation is performing a single element >>>> insert from a CONSTRUCTOR or constant and use a BIT_INSERT_EXPR >>>> in that case. But only if the vector mode is supported, >>>> otherwise this is invalid GIMPLE. */ >>>> if (TYPE_MODE (type) != BLKmode >>>> >>>> as well. >>>> >>>> OK with those changes, thanks. >>> Thanks, committed the patch in ae8decf1d2b8329af59592b4fa78ee8dfab3ba5e. >> >> So the present state allows to ask can_vec_perm_const_p but the >> implementation asks for >> >> 431 if (direct_optab_handler (vec_perm_optab, mode) != >> CODE_FOR_nothing) >> 432 return true; >> >> which then breaks. Also the VEC_PERMs are not yet valid. Any reason this >> was committed half-way through the review process of the series? > Hi Richard, > I am very sorry about that. I thought the patch was approved, and > committed it after testing on x86_64 and aarch64. > Should I revert it ? No need. > Um sorry to ask a silly question -- I am not sure why does the patch > break the above condition ? > The patch still passes mode to direct_optab_handler as before, and > IIUC does not affect modes > for vec_perm_optab, so it shouldn't affect the call to > direct_optab_handler above ? x86 now accepts V4SI V8SI permutes because we don’t ask it correctly and thus my naive attempt to use the new function API breaks . Not to mention the VEC_PERM IL is still rejected. I will wait for the rest of the series to be approved and pushed. Richard. > Thanks, > Prathamesh >> >> At least I have a user in the vectorizer ready - allowing more permutes >> from existing vectors (of different sizes now) to be SLP vectorized. >> >> Thanks, >> Richard. >> >>> Thanks, >>> Prathamesh >>>> >>>> Richard
On Thu, 26 May 2022 at 00:37, Richard Biener <richard.guenther@gmail.com> wrote: > > > > > Am 25.05.2022 um 21:03 schrieb Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>: > > > > On Wed, 25 May 2022 at 18:27, Richard Biener <richard.guenther@gmail.com> wrote: > >> > >>> On Tue, May 24, 2022 at 9:22 PM Prathamesh Kulkarni via Gcc-patches > >>> <gcc-patches@gcc.gnu.org> wrote: > >>> > >>> On Tue, 24 May 2022 at 14:50, Richard Sandiford > >>> <richard.sandiford@arm.com> wrote: > >>>> > >>>> Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes: > >>>>> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi > >>>>> index c5006afc00d..0a3c733ada9 100644 > >>>>> --- a/gcc/doc/tm.texi > >>>>> +++ b/gcc/doc/tm.texi > >>>>> @@ -6088,14 +6088,18 @@ for the given scalar type @var{type}. @var{is_packed} is false if the scalar > >>>>> access using @var{type} is known to be naturally aligned. > >>>>> @end deftypefn > >>>>> > >>>>> -@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode @var{mode}, rtx @var{output}, rtx @var{in0}, rtx @var{in1}, const vec_perm_indices @var{&sel}) > >>>>> +@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode @var{mode}, machine_mode @var{op_mode}, rtx @var{output}, rtx @var{in0}, rtx @var{in1}, const vec_perm_indices @var{&sel}) > >>>>> This hook is used to test whether the target can permute up to two > >>>>> -vectors of mode @var{mode} using the permutation vector @code{sel}, and > >>>>> -also to emit such a permutation. In the former case @var{in0}, @var{in1} > >>>>> -and @var{out} are all null. In the latter case @var{in0} and @var{in1} are > >>>>> -the source vectors and @var{out} is the destination vector; all three are > >>>>> -operands of mode @var{mode}. @var{in1} is the same as @var{in0} if > >>>>> -@var{sel} describes a permutation on one vector instead of two. > >>>>> +vectors of mode @var{op_mode} using the permutation vector @code{sel}, > >>>>> +producing a vector of mode @var{mode}.The hook is also used to emit such > >>>> > >>>> Should be two spaces between “@var{mode}.” and “The”. > >>>> > >>>>> +a permutation. > >>>>> + > >>>>> +When the hook is being used to test whether the target supports a permutation, > >>>>> +@var{in0}, @var{in1}, and @var{out} are all null.When the hook is being used > >>>> > >>>> Same here: missing spaces before “When”. > >>>> > >>>>> +to emit a permutation, @var{in0} and @var{in1} are the source vectors of mode > >>>>> +@var{op_mode} and @var{out} is the destination vector of mode @var{mode}. > >>>>> +@var{in1} is the same as @var{in0} if @var{sel} describes a permutation on one > >>>>> +vector instead of two. > >>>>> > >>>>> Return true if the operation is possible, emitting instructions for it > >>>>> if rtxes are provided. > >>>>> diff --git a/gcc/match.pd b/gcc/match.pd > >>>>> index f5efa77560c..f2a527d9c42 100644 > >>>>> --- a/gcc/match.pd > >>>>> +++ b/gcc/match.pd > >>>>> @@ -7596,6 +7596,8 @@ and, > >>>>> (with > >>>>> { > >>>>> tree op0 = @0, op1 = @1, op2 = @2; > >>>>> + machine_mode result_mode = TYPE_MODE (type); > >>>>> + machine_mode op_mode = TYPE_MODE (TREE_TYPE (op0)); > >>>>> > >>>>> /* Build a vector of integers from the tree mask. */ > >>>>> vec_perm_builder builder; > >>>>> @@ -7703,12 +7705,12 @@ and, > >>>>> 2-argument version. */ > >>>>> tree oldop2 = op2; > >>>>> if (sel.ninputs () == 2 > >>>>> - || can_vec_perm_const_p (TYPE_MODE (type), sel, false)) > >>>>> + || can_vec_perm_const_p (result_mode, op_mode, sel, false)) > >>>>> op2 = vec_perm_indices_to_tree (TREE_TYPE (op2), sel); > >>>>> else > >>>>> { > >>>>> vec_perm_indices sel2 (builder, 2, nelts); > >>>>> - if (can_vec_perm_const_p (TYPE_MODE (type), sel2, false)) > >>>>> + if (can_vec_perm_const_p (result_mode, op_mode, sel2, false)) > >>>>> op2 = vec_perm_indices_to_tree (TREE_TYPE (op2), sel2); > >>>>> else > >>>>> /* Not directly supported with either encoding, > >>>> > >>>> Please replace the use of TYPE_MODE here: > >>>> > >>>> /* See if the permutation is performing a single element > >>>> insert from a CONSTRUCTOR or constant and use a BIT_INSERT_EXPR > >>>> in that case. But only if the vector mode is supported, > >>>> otherwise this is invalid GIMPLE. */ > >>>> if (TYPE_MODE (type) != BLKmode > >>>> > >>>> as well. > >>>> > >>>> OK with those changes, thanks. > >>> Thanks, committed the patch in ae8decf1d2b8329af59592b4fa78ee8dfab3ba5e. > >> > >> So the present state allows to ask can_vec_perm_const_p but the > >> implementation asks for > >> > >> 431 if (direct_optab_handler (vec_perm_optab, mode) != > >> CODE_FOR_nothing) > >> 432 return true; > >> > >> which then breaks. Also the VEC_PERMs are not yet valid. Any reason this > >> was committed half-way through the review process of the series? > > Hi Richard, > > I am very sorry about that. I thought the patch was approved, and > > committed it after testing on x86_64 and aarch64. > > Should I revert it ? > > No need. > > > Um sorry to ask a silly question -- I am not sure why does the patch > > break the above condition ? > > The patch still passes mode to direct_optab_handler as before, and > > IIUC does not affect modes > > for vec_perm_optab, so it shouldn't affect the call to > > direct_optab_handler above ? > > x86 now accepts V4SI V8SI permutes because we don’t ask it correctly and thus my naive attempt to use the new function API breaks . Not to mention the VEC_PERM IL is still rejected. I will wait for the rest of the series to be approved and pushed. Hi, I pushed the entire series in ae8decf1d2b8329af59592b4fa78ee8dfab3ba5e after it was approved by Richard S. Thanks, Prathamesh > > Richard. > > > Thanks, > > Prathamesh > >> > >> At least I have a user in the vectorizer ready - allowing more permutes > >> from existing vectors (of different sizes now) to be SLP vectorized. > >> > >> Thanks, > >> Richard. > >> > >>> Thanks, > >>> Prathamesh > >>>> > >>>> Richard
On Wed, May 25, 2022 at 10:24 PM Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> wrote: > > On Thu, 26 May 2022 at 00:37, Richard Biener <richard.guenther@gmail.com> wrote: [...] > > x86 now accepts V4SI V8SI permutes because we don’t ask it correctly and thus my naive attempt to use the new function API breaks . Not to mention the VEC_PERM IL is still rejected. I will wait for the rest of the series to be approved and pushed. > Hi, > I pushed the entire series in ae8decf1d2b8329af59592b4fa78ee8dfab3ba5e > after it was approved by Richard S. Thanks. Maybe I'm doing it wrong but I now see indices.new_vector (mask, second_vec.first == -1U ? 1 : 2, nunits); bool identity_p = indices.series_p (0, 1, 0, 1); where nunits is 4 and mask {4, 5, 6, 7}, the number of vectors is 1, and now indices.series_p (0, 1, 0, 1) returns true despite my input vector having 8 elements and 'indices' should select the upper half. That's because the function calls clamp() on the encoding but clamp() knows nothing about the different nunits of the input vectors. I suppose vec_perm_indices needs updating to allow for different nunits of the input vectors as well? Where else does this change need adjustments to other APIs? PR101668 has a naiive user of the new capability. The included testcase works OK but trying to expand test coverage quickly runs into wrong-code, like for typedef int v8si __attribute__((vector_size (32))); typedef long long v4di __attribute__((vector_size (32))); void bar_s32_s64 (v4di * dst, v8si src) { long long tem[8]; tem[0] = src[4]; tem[1] = src[5]; tem[2] = src[6]; tem[3] = src[7]; dst[0] = *(v4di *) tem; } which I expected to be rejected with -mavx2. Thanks, Richard. > Thanks, > Prathamesh > > > > Richard. > > > > > Thanks, > > > Prathamesh > > >> > > >> At least I have a user in the vectorizer ready - allowing more permutes > > >> from existing vectors (of different sizes now) to be SLP vectorized. > > >> > > >> Thanks, > > >> Richard. > > >> > > >>> Thanks, > > >>> Prathamesh > > >>>> > > >>>> Richard
(Sorry for the slow reply, was off on Friday) Richard Biener <richard.guenther@gmail.com> writes: > On Wed, May 25, 2022 at 10:24 PM Prathamesh Kulkarni > <prathamesh.kulkarni@linaro.org> wrote: >> >> On Thu, 26 May 2022 at 00:37, Richard Biener <richard.guenther@gmail.com> wrote: > [...] >> > x86 now accepts V4SI V8SI permutes because we don’t ask it correctly and thus my naive attempt to use the new function API breaks . Not to mention the VEC_PERM IL is still rejected. I will wait for the rest of the series to be approved and pushed. >> Hi, >> I pushed the entire series in ae8decf1d2b8329af59592b4fa78ee8dfab3ba5e >> after it was approved by Richard S. > > Thanks. > > Maybe I'm doing it wrong but I now see > > indices.new_vector (mask, second_vec.first == -1U ? 1 : 2, nunits); > bool identity_p = indices.series_p (0, 1, 0, 1); > > where nunits is 4 and mask {4, 5, 6, 7}, the number of vectors is 1, > and now indices.series_p (0, 1, 0, 1) returns true despite my input > vector having 8 elements and 'indices' should select the upper half. > That's because the function calls clamp() on the encoding but > clamp() knows nothing about the different nunits of the input vectors. > > I suppose vec_perm_indices needs updating to allow for different > nunits of the input vectors as well? The final argument to new_vector is supposed to be the number of elements per input vector, so it sounds like it should be 8 rather than 4 in this situation. The number of elements per output vector is taken from the mask argument. Thanks, Richard > > Where else does this change need adjustments to other APIs? > > PR101668 has a naiive user of the new capability. The included > testcase works OK but trying to expand test coverage quickly > runs into wrong-code, like for > > typedef int v8si __attribute__((vector_size (32))); > typedef long long v4di __attribute__((vector_size (32))); > > void > bar_s32_s64 (v4di * dst, v8si src) > { > long long tem[8]; > tem[0] = src[4]; > tem[1] = src[5]; > tem[2] = src[6]; > tem[3] = src[7]; > dst[0] = *(v4di *) tem; > } > > which I expected to be rejected with -mavx2. > > Thanks, > Richard. > >> Thanks, >> Prathamesh >> > >> > Richard. >> > >> > > Thanks, >> > > Prathamesh >> > >> >> > >> At least I have a user in the vectorizer ready - allowing more permutes >> > >> from existing vectors (of different sizes now) to be SLP vectorized. >> > >> >> > >> Thanks, >> > >> Richard. >> > >> >> > >>> Thanks, >> > >>> Prathamesh >> > >>>> >> > >>>> Richard
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi index c5006afc00d..31ff6ef3f92 100644 --- a/gcc/doc/tm.texi +++ b/gcc/doc/tm.texi @@ -6088,7 +6088,7 @@ for the given scalar type @var{type}. @var{is_packed} is false if the scalar access using @var{type} is known to be naturally aligned. @end deftypefn -@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode @var{mode}, rtx @var{output}, rtx @var{in0}, rtx @var{in1}, const vec_perm_indices @var{&sel}) +@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode @var{mode}, machine_mode @var{op_mode}, rtx @var{output}, rtx @var{in0}, rtx @var{in1}, const vec_perm_indices @var{&sel}) This hook is used to test whether the target can permute up to two vectors of mode @var{mode} using the permutation vector @code{sel}, and also to emit such a permutation. In the former case @var{in0}, @var{in1} diff --git a/gcc/optabs-query.cc b/gcc/optabs-query.cc index 68dc679cc6a..aef9d4c5d28 100644 --- a/gcc/optabs-query.cc +++ b/gcc/optabs-query.cc @@ -417,8 +417,8 @@ can_vec_perm_var_p (machine_mode mode) with here. */ bool -can_vec_perm_const_p (machine_mode mode, const vec_perm_indices &sel, - bool allow_variable_p) +can_vec_perm_const_p (machine_mode mode, machine_mode op_mode, + const vec_perm_indices &sel, bool allow_variable_p) { /* If the target doesn't implement a vector mode for the vector type, then no operations are supported. */ @@ -448,7 +448,7 @@ can_vec_perm_const_p (machine_mode mode, const vec_perm_indices &sel, if (targetm.vectorize.vec_perm_const != NULL) { - if (targetm.vectorize.vec_perm_const (mode, NULL_RTX, NULL_RTX, + if (targetm.vectorize.vec_perm_const (mode, op_mode, NULL_RTX, NULL_RTX, NULL_RTX, sel)) return true; @@ -462,6 +462,13 @@ can_vec_perm_const_p (machine_mode mode, const vec_perm_indices &sel, return false; } +bool +can_vec_perm_const_p (machine_mode mode, const vec_perm_indices &sel, + bool allow_variable_p) +{ + return can_vec_perm_const_p (mode, mode, sel, allow_variable_p); +} + /* Find a widening optab even if it doesn't widen as much as we want. E.g. if from_mode is HImode, and to_mode is DImode, and there is no direct HI->SI insn, then return SI->DI, if that exists. */ diff --git a/gcc/optabs-query.h b/gcc/optabs-query.h index b9c9fd6f64d..fa22c2210d8 100644 --- a/gcc/optabs-query.h +++ b/gcc/optabs-query.h @@ -178,6 +178,8 @@ bool can_conditionally_move_p (machine_mode mode); opt_machine_mode qimode_for_vec_perm (machine_mode); bool selector_fits_mode_p (machine_mode, const vec_perm_indices &); bool can_vec_perm_var_p (machine_mode); +bool can_vec_perm_const_p (machine_mode, machine_mode, const vec_perm_indices &, + bool = true); bool can_vec_perm_const_p (machine_mode, const vec_perm_indices &, bool = true); /* Find a widening optab even if it doesn't widen as much as we want. */ diff --git a/gcc/optabs.cc b/gcc/optabs.cc index 3d8fa3abdfe..55f10c41789 100644 --- a/gcc/optabs.cc +++ b/gcc/optabs.cc @@ -6250,7 +6250,9 @@ expand_vec_perm_const (machine_mode mode, rtx v0, rtx v1, if (single_arg_p) v1 = v0; - if (targetm.vectorize.vec_perm_const (mode, target, v0, v1, indices)) + gcc_checking_assert (GET_MODE (v0) == GET_MODE (v1)); + machine_mode op_mode = GET_MODE (v0); + if (targetm.vectorize.vec_perm_const (mode, op_mode, target, v0, v1, indices)) return target; } @@ -6264,7 +6266,7 @@ expand_vec_perm_const (machine_mode mode, rtx v0, rtx v1, v0_qi = gen_lowpart (qimode, v0); v1_qi = gen_lowpart (qimode, v1); if (targetm.vectorize.vec_perm_const != NULL - && targetm.vectorize.vec_perm_const (qimode, target_qi, v0_qi, + && targetm.vectorize.vec_perm_const (qimode, qimode, target_qi, v0_qi, v1_qi, qimode_indices)) return gen_lowpart (mode, target_qi); } diff --git a/gcc/target.def b/gcc/target.def index d85adf36a39..2713c31dc3f 100644 --- a/gcc/target.def +++ b/gcc/target.def @@ -1894,7 +1894,7 @@ try the equivalent byte operation. If that also fails, it will try forcing\n\ the selector into a register and using the @var{vec_perm@var{mode}}\n\ instruction pattern. There is no need for the hook to handle these two\n\ implementation approaches itself.", - bool, (machine_mode mode, rtx output, rtx in0, rtx in1, + bool, (machine_mode mode, machine_mode op_mode, rtx output, rtx in0, rtx in1, const vec_perm_indices &sel), NULL)