From patchwork Fri Jun 2 01:00:15 2023
X-Patchwork-Submitter: liuhongt
X-Patchwork-Id: 70485
From: liuhongt
To: gcc-patches@gcc.gnu.org
Cc: crazylht@gmail.com, hjl.tools@gmail.com
Subject: [PATCH] [vect] Use intermediate integer type for float_expr/fix_trunc_expr when no direct optab exists.
Date: Fri, 2 Jun 2023 09:00:15 +0800
Message-Id: <20230602010015.2571612-1-hongtao.liu@intel.com>

We already use an intermediate type in the WIDEN case, but not for
NONE; this patch extends that to NONE.

I didn't do this in pattern recognition, since we need to know whether
the stmt belongs to any slp_node to decide the vectype, and the related
optabs are checked according to vectype_in and vectype_out. In the
non-SLP case, vec_pack/unpack are always used when the lhs has a
different size from the rhs; in the SLP case, sometimes vec_pack/unpack
is used and sometimes a direct conversion is used.

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?

gcc/ChangeLog:

	PR target/110018
	* tree-vect-stmts.cc (vectorizable_conversion): Use
	intermediate integer type for float_expr/fix_trunc_expr when
	no direct optab exists.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/pr110018-1.c: New test.
---
 gcc/testsuite/gcc.target/i386/pr110018-1.c | 94 ++++++++++++++++++++++
 gcc/tree-vect-stmts.cc                     | 56 ++++++++++++-
 2 files changed, 149 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr110018-1.c

diff --git a/gcc/testsuite/gcc.target/i386/pr110018-1.c b/gcc/testsuite/gcc.target/i386/pr110018-1.c
new file mode 100644
index 00000000000..b1baffd7af1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr110018-1.c
@@ -0,0 +1,94 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512fp16 -mavx512vl -O2 -mavx512dq" } */
+/* { dg-final { scan-assembler-times {(?n)vcvttp[dsh]2[dqw]} 5 } } */
+/* { dg-final { scan-assembler-times {(?n)vcvt[dqw]*2p[dsh]} 5 } } */
+
+void
+foo (double* __restrict a, char* b)
+{
+  a[0] = b[0];
+  a[1] = b[1];
+}
+
+void
+foo1 (float* __restrict a, char* b)
+{
+  a[0] = b[0];
+  a[1] = b[1];
+  a[2] = b[2];
+  a[3] = b[3];
+}
+
+void
+foo2 (_Float16* __restrict a, char* b)
+{
+  a[0] = b[0];
+  a[1] = b[1];
+  a[2] = b[2];
+  a[3] = b[3];
+  a[4] = b[4];
+  a[5] = b[5];
+  a[6] = b[6];
+  a[7] = b[7];
+}
+
+void
+foo3 (double* __restrict a, short* b)
+{
+  a[0] = b[0];
+  a[1] = b[1];
+}
+
+void
+foo4 (float* __restrict a, char* b)
+{
+  a[0] = b[0];
+  a[1] = b[1];
+  a[2] = b[2];
+  a[3] = b[3];
+}
+
+void
+foo5 (double* __restrict b, char* a)
+{
+  a[0] = b[0];
+  a[1] = b[1];
+}
+
+void
+foo6 (float* __restrict b, char* a)
+{
+  a[0] = b[0];
+  a[1] = b[1];
+  a[2] = b[2];
+  a[3] = b[3];
+}
+
+void
+foo7 (_Float16* __restrict b, char* a)
+{
+  a[0] = b[0];
+  a[1] = b[1];
+  a[2] = b[2];
+  a[3] = b[3];
+  a[4] = b[4];
+  a[5] = b[5];
+  a[6] = b[6];
+  a[7] = b[7];
+}
+
+void
+foo8 (double* __restrict b, short* a)
+{
+  a[0] = b[0];
+  a[1] = b[1];
+}
+
+void
+foo9 (float* __restrict b, char* a)
+{
+  a[0] = b[0];
+  a[1] = b[1];
+  a[2] = b[2];
+  a[3] = b[3];
+}
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index bd3b07a3aa1..1118c89686d 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -5162,6 +5162,49 @@ vectorizable_conversion (vec_info *vinfo,
         return false;
       if (supportable_convert_operation (code, vectype_out, vectype_in, &code1))
         break;
+      if ((code == FLOAT_EXPR
+           && GET_MODE_SIZE (lhs_mode) > GET_MODE_SIZE (rhs_mode))
+          || (code == FIX_TRUNC_EXPR
+              && GET_MODE_SIZE (rhs_mode) > GET_MODE_SIZE (lhs_mode)))
+        {
+          bool float_expr_p = code == FLOAT_EXPR;
+          scalar_mode imode = float_expr_p ? rhs_mode : lhs_mode;
+          fltsz = GET_MODE_SIZE (float_expr_p ? lhs_mode : rhs_mode);
+          code1 = float_expr_p ? code : NOP_EXPR;
+          codecvt1 = float_expr_p ? NOP_EXPR : code;
+          FOR_EACH_2XWIDER_MODE (rhs_mode_iter, imode)
+            {
+              imode = rhs_mode_iter.require ();
+              if (GET_MODE_SIZE (imode) > fltsz)
+                break;
+
+              cvt_type
+                = build_nonstandard_integer_type (GET_MODE_BITSIZE (imode),
+                                                  0);
+              cvt_type = get_vectype_for_scalar_type (vinfo, cvt_type,
+                                                      slp_node);
+              /* This should only happened for SLP as long as loop vectorizer
+                 only supports same-sized vector.  */
+              if (cvt_type == NULL_TREE
+                  || maybe_ne (TYPE_VECTOR_SUBPARTS (cvt_type), nunits_in)
+                  || !supportable_convert_operation (code1, vectype_out,
+                                                     cvt_type, &code1)
+                  || !supportable_convert_operation (codecvt1, cvt_type,
+                                                     vectype_in, &codecvt1))
+                continue;
+
+              found_mode = true;
+              break;
+            }
+
+          if (found_mode)
+            {
+              multi_step_cvt++;
+              interm_types.safe_push (cvt_type);
+              cvt_type = NULL_TREE;
+              break;
+            }
+        }
       /* FALLTHRU */
     unsupported:
       if (dump_enabled_p ())
@@ -5381,7 +5424,18 @@ vectorizable_conversion (vec_info *vinfo,
         {
           /* Arguments are ready, create the new vector stmt.  */
           gcc_assert (TREE_CODE_LENGTH (code1) == unary_op);
-          gassign *new_stmt = gimple_build_assign (vec_dest, code1, vop0);
+          gassign* new_stmt;
+          if (multi_step_cvt)
+            {
+              gcc_assert (multi_step_cvt == 1);
+              new_stmt = gimple_build_assign (vec_dest, codecvt1, vop0);
+              new_temp = make_ssa_name (vec_dest, new_stmt);
+              gimple_assign_set_lhs (new_stmt, new_temp);
+              vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
+              vop0 = new_temp;
+              vec_dest = vec_dsts[0];
+            }
+          new_stmt = gimple_build_assign (vec_dest, code1, vop0);
           new_temp = make_ssa_name (vec_dest, new_stmt);
           gimple_assign_set_lhs (new_stmt, new_temp);
           vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
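
For readers less familiar with the vectorizer, the following scalar C sketch may help illustrate what the two-step conversion does per element. It is not part of the patch. The function names are hypothetical, and the 64-bit signed intermediate is an assumption made for illustration; in the patch the intermediate width is whichever wider integer mode the search loop finds supported, up to the size of the floating-point mode.

```c
#include <assert.h>
#include <stdint.h>

/* FLOAT_EXPR direction (integer narrower than the float type):
   widen the integer first (a NOP_EXPR in the patch), then do the
   integer-to-float conversion.  The int64_t intermediate is an
   assumption for this sketch.  */
double
char_to_double_two_step (signed char x)
{
  int64_t wide = x;            /* step 1: integer widening */
  return (double) wide;        /* step 2: int -> float */
}

/* FIX_TRUNC_EXPR direction (float wider than the integer type):
   truncate to a wide integer first, then narrow the integer
   (a NOP_EXPR in the patch).  */
signed char
double_to_char_two_step (double x)
{
  int64_t wide = (int64_t) x;  /* step 1: truncating float -> int */
  return (signed char) wide;   /* step 2: integer narrowing */
}

/* Check that both routes agree with the direct casts for every value
   a signed char can hold.  Only in-range inputs are used, since a
   direct float-to-integer cast of an out-of-range value is undefined.  */
void
check_two_step_conversions (void)
{
  for (int i = -128; i <= 127; i++)
    {
      signed char c = (signed char) i;
      double d = (double) c + (c < 0 ? -0.25 : 0.25);

      assert (char_to_double_two_step (c) == (double) c);
      assert (double_to_char_two_step (d) == (signed char) d);
    }
}
```

Calling check_two_step_conversions () from a small main exercises all 256 inputs in both directions; the fractional offset is chosen so that truncation toward zero always lands back on a representable value.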