From patchwork Sat Feb 24 03:18:48 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Pinski
X-Patchwork-Id: 86317
X-Original-To: gcc-patches@gcc.gnu.org
From: Andrew Pinski
To:
CC: Andrew Pinski
Subject: [PATCH 2/2] aarch64: Support `{1.0f, 1.0f, 0.0, 0.0}` CST forming with fmov with a smaller vector type.
Date: Fri, 23 Feb 2024 19:18:48 -0800
Message-ID: <20240224031848.3866630-2-quic_apinski@quicinc.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240224031848.3866630-1-quic_apinski@quicinc.com>
References: <20240224031848.3866630-1-quic_apinski@quicinc.com>
List-Id: Gcc-patches mailing list

This enables construction of V4SF CST like `{1.0f, 1.0f, 0.0f, 0.0f}` (and other fp enabled CSTs) by using `fmov v0.2s, 1.0` as the instruction is designed to zero out the other bits.
This is a small extension on top of the code that creates fmov for the case
where all elements but the first are zero.

Built and tested for aarch64-linux-gnu with no regressions.

	PR target/113856

gcc/ChangeLog:

	* config/aarch64/aarch64.cc (simd_immediate_info): Add bool to the
	float mode constructor. Document modifier field for FMOV_SDH.
	(aarch64_simd_valid_immediate): Recognize where the first half
	of the const float vect is the same.
	(aarch64_output_simd_mov_immediate): Handle the case where insn is
	FMOV_SDH and modifier is MSL.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/fmov-zero-cst-3.c: New test.

Signed-off-by: Andrew Pinski
---
 gcc/config/aarch64/aarch64.cc                 | 34 ++++++++++++++++---
 .../gcc.target/aarch64/fmov-zero-cst-3.c      | 28 +++++++++++++++
 2 files changed, 57 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/fmov-zero-cst-3.c

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index c4386591a9b..89bd0c5e5a6 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -130,7 +130,7 @@ struct simd_immediate_info
   enum modifier_type { LSL, MSL };
 
   simd_immediate_info () {}
-  simd_immediate_info (scalar_float_mode, rtx, insn_type = MOV);
+  simd_immediate_info (scalar_float_mode, rtx, insn_type = MOV, bool = false);
   simd_immediate_info (scalar_int_mode, unsigned HOST_WIDE_INT,
                        insn_type = MOV, modifier_type = LSL,
                        unsigned int = 0);
@@ -153,6 +153,8 @@ struct simd_immediate_info
 
       /* The kind of shift modifier to use, and the number of bits to shift.
          This is (LSL, 0) if no shift is needed.  */
+      /* For FMOV_SDH, LSL says it is a single while MSL
+         says if it is either .4h/.2s fmov. */
       modifier_type modifier;
       unsigned int shift;
     } mov;
@@ -173,12 +175,12 @@ struct simd_immediate_info
 /* Construct a floating-point immediate in which each element has mode
    ELT_MODE_IN and value VALUE_IN.  */
 inline simd_immediate_info
-::simd_immediate_info (scalar_float_mode elt_mode_in, rtx value_in, insn_type insn_in)
+::simd_immediate_info (scalar_float_mode elt_mode_in, rtx value_in, insn_type insn_in, bool firsthalfsame)
   : elt_mode (elt_mode_in), insn (insn_in)
 {
   gcc_assert (insn_in == MOV || insn_in == FMOV_SDH);
   u.mov.value = value_in;
-  u.mov.modifier = LSL;
+  u.mov.modifier = firsthalfsame ? MSL : LSL;
   u.mov.shift = 0;
 }
 
@@ -22944,10 +22946,23 @@ aarch64_simd_valid_immediate (rtx op, simd_immediate_info *info,
           || aarch64_float_const_representable_p (elt))
         {
           bool valid = true;
+          bool firsthalfsame = false;
           for (unsigned int i = 1; i < n_elts; i++)
             {
               rtx elt1 = CONST_VECTOR_ENCODED_ELT (op, i);
               if (!aarch64_float_const_zero_rtx_p (elt1))
+                {
+                  if (i == 1)
+                    firsthalfsame = true;
+                  if (!firsthalfsame
+                      || i >= n_elts/2
+                      || !rtx_equal_p (elt, elt1))
+                    {
+                      valid = false;
+                      break;
+                    }
+                }
+              else if (firsthalfsame && i < n_elts/2)
                 {
                   valid = false;
                   break;
@@ -22957,7 +22972,8 @@ aarch64_simd_valid_immediate (rtx op, simd_immediate_info *info,
             {
               if (info)
                 *info = simd_immediate_info (elt_float_mode, elt,
-                                             simd_immediate_info::FMOV_SDH);
+                                             simd_immediate_info::FMOV_SDH,
+                                             firsthalfsame);
               return true;
             }
         }
@@ -25165,8 +25181,16 @@ aarch64_output_simd_mov_immediate (rtx const_vector, unsigned width,
           real_to_decimal_for_mode (float_buf,
                                     CONST_DOUBLE_REAL_VALUE (info.u.mov.value),
                                     buf_size, buf_size, 1, info.elt_mode);
-          if (info.insn == simd_immediate_info::FMOV_SDH)
+          if (info.insn == simd_immediate_info::FMOV_SDH
+              && info.u.mov.modifier == simd_immediate_info::LSL)
             snprintf (templ, sizeof (templ), "fmov\t%%%c0, %s", element_char, float_buf);
+          else if (info.insn == simd_immediate_info::FMOV_SDH
+              && info.u.mov.modifier == simd_immediate_info::MSL)
+            {
+              gcc_assert (element_char != 'd');
+              gcc_assert (lane_count > 2);
+              snprintf (templ, sizeof (templ), "fmov\t%%0.%d%c, %s", lane_count/2, element_char, float_buf);
+            }
           else if (lane_count == 1)
             snprintf (templ, sizeof (templ), "fmov\t%%d0, %s", float_buf);
           else
diff --git a/gcc/testsuite/gcc.target/aarch64/fmov-zero-cst-3.c b/gcc/testsuite/gcc.target/aarch64/fmov-zero-cst-3.c
new file mode 100644
index 00000000000..7a78b6d3caf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/fmov-zero-cst-3.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mcmodel=tiny" } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+/* PR target/113856 */
+
+#define vect64 __attribute__((vector_size(8) ))
+#define vect128 __attribute__((vector_size(16) ))
+
+/*
+** f2:
+**	fmov	v0.2s, 1.0e\+0
+**	ret
+*/
+vect128 float f2()
+{
+  return (vect128 float){1.0f, 1.0f, 0, 0};
+}
+
+/*
+** f3:
+**	ldr	q0, \.LC[0-9]+
+**	ret
+*/
+vect128 float f3()
+{
+  return (vect128 float){1.0f, 1.0f, 1.0f, 0.0};
+}
+
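
Note for readers: the acceptance rule added to aarch64_simd_valid_immediate
can be sketched outside GCC as plain C. This is only an illustration under
stated assumptions, not the code in the patch: classify_fmov and enum
fmov_kind are hypothetical names, the sketch compares float values instead
of rtx constants, it skips the aarch64_float_const_representable_p check on
the first element, and it treats -0.0 as zero.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical classification used only for this sketch; GCC itself
   records the result as FMOV_SDH plus an LSL/MSL modifier.  */
enum fmov_kind
{
  FMOV_NONE,        /* Pattern not handled; e.g. load from the literal pool.  */
  FMOV_SCALAR,      /* {c, 0, ..., 0}: scalar fmov (LSL in the patch).  */
  FMOV_HALF_VECTOR  /* {c, ..., c, 0, ..., 0}: half-width vector fmov (MSL).  */
};

/* Mirror the loop in the patch on a plain float array.  Assumption: the
   first element is already known to be a valid fmov immediate.  */
static enum fmov_kind
classify_fmov (const float *elts, size_t n_elts)
{
  bool firsthalfsame = false;

  for (size_t i = 1; i < n_elts; i++)
    {
      if (elts[i] != 0.0f)
        {
          /* A nonzero second element selects the half-vector form.  */
          if (i == 1)
            firsthalfsame = true;
          /* Nonzero elements must sit in the low half and equal element 0.  */
          if (!firsthalfsame || i >= n_elts / 2 || elts[i] != elts[0])
            return FMOV_NONE;
        }
      else if (firsthalfsame && i < n_elts / 2)
        /* A zero inside the low half breaks the repeated pattern.  */
        return FMOV_NONE;
    }

  return firsthalfsame ? FMOV_HALF_VECTOR : FMOV_SCALAR;
}
```

With this sketch, {1.0f, 1.0f, 0, 0} classifies as FMOV_HALF_VECTOR, which
corresponds to f2 in the new test, while {1.0f, 1.0f, 1.0f, 0.0} is
rejected, which corresponds to f3 falling back to a literal-pool load.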