From patchwork Mon Sep 26 06:56:04 2022
X-Patchwork-Submitter: Liwei Xu
X-Patchwork-Id: 58016
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] Optimize nested permutation to single VEC_PERM_EXPR [PR54346]
Date: Mon, 26 Sep 2022 14:56:04 +0800
Message-Id: <20220926065604.783193-1-liwei.xu@intel.com>
From: Liwei Xu
Reply-To: Liwei Xu
Cc: wilson@tuliptree.org, admin@levyhsu.com

This patch implements the optimization requested in PR 54346, which merges

	c = VEC_PERM_EXPR <a, b, VCST0>;
	d = VEC_PERM_EXPR <c, c, VCST1>;

into

	d = VEC_PERM_EXPR <a, b, NEW_VCST>;

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.

tree-ssa/forwprop-19.c fails with this patch, but I am not sure whether
it is OK to remove that test.

gcc/ChangeLog:

	PR target/54346
	* match.pd: Merge the indices of the VCSTs, then generate the
	new vec_perm.

gcc/testsuite/ChangeLog:

	PR target/54346
	* gcc.dg/pr54346.c: New test.

Co-authored-by: liuhongt
---
 gcc/match.pd                   | 41 ++++++++++++++++++++++++++++++++++
 gcc/testsuite/gcc.dg/pr54346.c | 13 +++++++++++
 2 files changed, 54 insertions(+)
 create mode 100755 gcc/testsuite/gcc.dg/pr54346.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 345bcb701a5..9219b0a10e1 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -8086,6 +8086,47 @@ and,
   (minus (mult (vec_perm @1 @1 @3) @2) @4)))
 
 
+/* (PR54346) Merge
+   c = VEC_PERM_EXPR <a, b, VCST0>;
+   d = VEC_PERM_EXPR <c, c, VCST1>;
+   to
+   d = VEC_PERM_EXPR <a, b, NEW_VCST>;  */
+
+(simplify
+ (vec_perm (vec_perm@0 @1 @2 VECTOR_CST@3) @0 VECTOR_CST@4)
+ (with
+  {
+    if(!TYPE_VECTOR_SUBPARTS (type).is_constant())
+      return NULL_TREE;
+
+    tree op0;
+    machine_mode result_mode = TYPE_MODE (type);
+    machine_mode op_mode = TYPE_MODE (TREE_TYPE (@1));
+    int nelts = TYPE_VECTOR_SUBPARTS (type).to_constant();
+    vec_perm_builder builder0;
+    vec_perm_builder builder1;
+    vec_perm_builder builder2 (nelts, nelts, 1);
+
+    if (!tree_to_vec_perm_builder (&builder0, @3)
+    || !tree_to_vec_perm_builder (&builder1, @4))
+      return NULL_TREE;
+
+    vec_perm_indices sel0 (builder0, 2, nelts);
+    vec_perm_indices sel1 (builder1, 1, nelts);
+
+    for (int i = 0; i < nelts; i++)
+      builder2.quick_push (sel0[sel1[i].to_constant()]);
+
+    vec_perm_indices sel2 (builder2, 2, nelts);
+
+    if (!can_vec_perm_const_p (result_mode, op_mode, sel2, false))
+      return NULL_TREE;
+
+    op0 = vec_perm_indices_to_tree (TREE_TYPE (@4), sel2);
+  }
+  (vec_perm @1 @2 { op0; })))
+
+
 /* Match count trailing zeroes for simplify_count_trailing_zeroes in fwprop.
    The canonical form is array[((x & -x) * C) >> SHIFT] where C is a magic
    constant which when multiplied by a power of 2 contains a unique value
diff --git a/gcc/testsuite/gcc.dg/pr54346.c b/gcc/testsuite/gcc.dg/pr54346.c
new file mode 100755
index 00000000000..d87dc3a79a5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr54346.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-dse1" } */
+
+typedef int veci __attribute__ ((vector_size (4 * sizeof (int))));
+
+void fun (veci a, veci b, veci *i)
+{
+  veci c = __builtin_shuffle (a, b, __extension__ (veci) {1, 4, 2, 7});
+  *i = __builtin_shuffle (c, __extension__ (veci) { 7, 2, 1, 5 });
+}
+
+/* { dg-final { scan-tree-dump "VEC_PERM_EXPR.*{ 3, 6, 0, 0 }" "dse1" } } */
+/* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR" 1 "dse1" } } */
\ No newline at end of file
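
For reference, the index merge performed by the loop in the match.pd hunk can be sketched in plain C, outside of GCC. The helper names compose_selectors and swap_operands are hypothetical and exist only for this illustration; they are not part of the patch or of GCC. The sketch assumes that the outer selector addresses a single input vector (both of its operands are c), so its indices are reduced modulo the element count before indexing the inner selector, which mirrors constructing sel1 with one input.

```c
#include <assert.h>

/* Hypothetical helper: merged[i] = inner[outer[i] mod nelts].
   The outer permutation reads from <c, c>, so its indices wrap at nelts;
   the inner selector keeps its indices into the two-vector pair <a, b>.  */
static void
compose_selectors (const int *inner, const int *outer, int *merged, int nelts)
{
  assert (nelts > 0);
  for (int i = 0; i < nelts; i++)
    merged[i] = inner[outer[i] % nelts];
}

/* Hypothetical helper: rewrite a selector on <a, b> as the equivalent
   selector on <b, a> by rotating every index by nelts modulo 2 * nelts.  */
static void
swap_operands (const int *sel, int *swapped, int nelts)
{
  assert (nelts > 0);
  for (int i = 0; i < nelts; i++)
    swapped[i] = (sel[i] + nelts) % (2 * nelts);
}
```

With the selectors from the new test, inner = { 1, 4, 2, 7 } and outer = { 7, 2, 1, 5 }, the merge gives { 7, 2, 4, 4 } on <a, b>. Swapping the operands turns that into { 3, 6, 0, 0 } on <b, a>, which matches the string the test scans for. That the dump shows the swapped form because of a later canonicalization is my reading of the expected output, not something the patch states.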