From patchwork Tue Nov 28 07:52:12 2023
X-Patchwork-Submitter: liuhongt
X-Patchwork-Id: 80870
From: liuhongt
To: gcc-patches@gcc.gnu.org
Cc: crazylht@gmail.com, hjl.tools@gmail.com
Subject: [PATCH] Take register pressure into account for vec_construct when the components are not loaded from memory.
Date: Tue, 28 Nov 2023 15:52:12 +0800
Message-Id: <20231128075212.3526692-1-hongtao.liu@intel.com>

For vec_construct, the components must all be live at the same time if
they are not loaded from memory. When the number of such components
exceeds the number of available registers, spills happen. Try to account
for that with a rough estimate.

??? Ideally, we would have an overall estimate of register pressure
based on the live ranges of all variables.

The patch avoids regressions caused by, e.g., a vec_construct with
32 char components.

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?

gcc/ChangeLog:

	* config/i386/i386.cc (ix86_vector_costs::add_stmt_cost): Take
	register pressure into account for vec_construct when the
	components are not loaded from memory.
---
 gcc/config/i386/i386.cc | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 683ac643bc8..f8417555930 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -24706,6 +24706,7 @@ ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
       stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign);
       unsigned i;
       tree op;
+      unsigned reg_needed = 0;
       FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_OPS (node), i, op)
         if (TREE_CODE (op) == SSA_NAME)
           TREE_VISITED (op) = 0;
@@ -24737,11 +24738,30 @@ ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
                   && (gimple_assign_rhs_code (def) != BIT_FIELD_REF
                       || !VECTOR_TYPE_P (TREE_TYPE
                                 (TREE_OPERAND (gimple_assign_rhs1 (def), 0))))))
-            stmt_cost += ix86_cost->sse_to_integer;
+            {
+              stmt_cost += ix86_cost->sse_to_integer;
+              reg_needed++;
+            }
         }
       FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_OPS (node), i, op)
         if (TREE_CODE (op) == SSA_NAME)
           TREE_VISITED (op) = 0;
+
+      /* For vec_construct, the components must be live at the same time if
+         they're not loaded from memory, when the number of those components
+         exceeds available registers, spill happens.  Try to account that with a
+         rough estimation.  Currently only handle integral modes since scalar fp
+         shares sse_regs with vectors.
+         ??? Ideally, we should have an overall estimation of register pressure
+         if we know the live range of all variables.  */
+      if (!fp && kind == vec_construct
+          && reg_needed > target_avail_regs)
+        {
+          unsigned spill_cost = ix86_builtin_vectorization_cost (scalar_store,
+                                                                 vectype,
+                                                                 misalign);
+          stmt_cost += spill_cost * (reg_needed - target_avail_regs);
+        }
     }
   if (stmt_cost == -1)
     stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign);
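
As a rough illustration of the penalty computed in the last hunk, the
arithmetic can be sketched in isolation. This is a standalone sketch, not
GCC code: spill_penalty and its parameter names are hypothetical, and it
models only the final multiplication, not the !fp or vec_construct checks
that guard it in the patch.

```cpp
// Hypothetical stand-in for the spill estimate added by the patch.
// reg_needed:        components that are not loaded from memory
// avail_regs:        registers assumed available (target_avail_regs
//                    in the patch)
// scalar_store_cost: assumed cost of one scalar store
unsigned
spill_penalty (unsigned reg_needed, unsigned avail_regs,
               unsigned scalar_store_cost)
{
  // No penalty while every component fits in a register.
  if (reg_needed <= avail_regs)
    return 0;
  // Charge one scalar store per component beyond the available registers.
  return scalar_store_cost * (reg_needed - avail_regs);
}
```

With assumed values of 32 components, 16 available registers and a store
cost of 3, spill_penalty (32, 16, 3) charges 16 extra stores, i.e. 48 cost
units. When the components fit in the available registers, nothing is
added.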