From patchwork Thu Mar 21 14:22:25 2024
X-Patchwork-Submitter: Andrew Stubbs
X-Patchwork-Id: 87451
From: Andrew Stubbs
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] vect: more oversized bitmask fixups
Date: Thu, 21 Mar 2024 14:22:25 +0000
Message-ID: <20240321142225.52854-1-ams@baylibre.com>

My previous patch to fix this problem with xor was rejected because we
want to fix these issues only at the point of use.  That patch produced
slightly better code in this example, but this approach works too.

This patch fixes a failure in testcase vect/tsvc/vect-tsvc-s278.c when
configured to use V32 instead of V64 (I plan to do this for RDNA devices).

The problem was that a "not" operation on the mask inadvertently enabled
the inactive lanes 32-63 and corrupted the output.  The fix is to clear
the excess mask bits when calling internal functions (in this case
COND_MINUS), when doing masked loads and stores, and when doing
conditional jumps.

OK for mainline?

Andrew

gcc/ChangeLog:

	* dojump.cc (do_compare_rtx_and_jump): Clear excess bits in vector
	bitmaps.
	* internal-fn.cc (expand_fn_using_insn): Likewise.
	(add_mask_and_len_args): Likewise.
---
 gcc/dojump.cc      | 16 ++++++++++++++++
 gcc/internal-fn.cc | 26 ++++++++++++++++++++++++++
 2 files changed, 42 insertions(+)

diff --git a/gcc/dojump.cc b/gcc/dojump.cc
index 88600cb42d3..8df86957e83 100644
--- a/gcc/dojump.cc
+++ b/gcc/dojump.cc
@@ -1235,6 +1235,22 @@ do_compare_rtx_and_jump (rtx op0, rtx op1, enum rtx_code code, int unsignedp,
         }
     }
 
+  if (val
+      && VECTOR_BOOLEAN_TYPE_P (TREE_TYPE (val))
+      && SCALAR_INT_MODE_P (mode))
+    {
+      auto nunits = TYPE_VECTOR_SUBPARTS (TREE_TYPE (val)).to_constant ();
+      if (maybe_ne (GET_MODE_PRECISION (mode), nunits))
+        {
+          op0 = expand_binop (mode, and_optab, op0,
+                              GEN_INT ((HOST_WIDE_INT_1U << nunits) - 1),
+                              NULL_RTX, true, OPTAB_WIDEN);
+          op1 = expand_binop (mode, and_optab, op1,
+                              GEN_INT ((HOST_WIDE_INT_1U << nunits) - 1),
+                              NULL_RTX, true, OPTAB_WIDEN);
+        }
+    }
+
   emit_cmp_and_jump_insns (op0, op1, code, size, mode, unsignedp, val,
                            if_true_label, prob);
 }
diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index fcf47c7fa12..5269f0ac528 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -245,6 +245,18 @@ expand_fn_using_insn (gcall *stmt, insn_code icode, unsigned int noutputs,
           && SSA_NAME_IS_DEFAULT_DEF (rhs)
           && VAR_P (SSA_NAME_VAR (rhs)))
         create_undefined_input_operand (&ops[opno], TYPE_MODE (rhs_type));
+      else if (VECTOR_BOOLEAN_TYPE_P (rhs_type)
+               && SCALAR_INT_MODE_P (TYPE_MODE (rhs_type))
+               && maybe_ne (GET_MODE_PRECISION (TYPE_MODE (rhs_type)),
+                            TYPE_VECTOR_SUBPARTS (rhs_type).to_constant ()))
+        {
+          /* Ensure that the vector bitmasks do not have excess bits. */
+          int nunits = TYPE_VECTOR_SUBPARTS (rhs_type).to_constant ();
+          rtx tmp = expand_binop (TYPE_MODE (rhs_type), and_optab, rhs_rtx,
+                                  GEN_INT ((HOST_WIDE_INT_1U << nunits) - 1),
+                                  NULL_RTX, true, OPTAB_WIDEN);
+          create_input_operand (&ops[opno], tmp, TYPE_MODE (rhs_type));
+        }
       else
         create_input_operand (&ops[opno], rhs_rtx, TYPE_MODE (rhs_type));
       opno += 1;
@@ -312,6 +324,20 @@ add_mask_and_len_args (expand_operand *ops, unsigned int opno, gcall *stmt)
     {
       tree mask = gimple_call_arg (stmt, mask_index);
       rtx mask_rtx = expand_normal (mask);
+
+      tree mask_type = TREE_TYPE (mask);
+      if (VECTOR_BOOLEAN_TYPE_P (mask_type)
+          && SCALAR_INT_MODE_P (TYPE_MODE (mask_type))
+          && maybe_ne (GET_MODE_PRECISION (TYPE_MODE (mask_type)),
+                       TYPE_VECTOR_SUBPARTS (mask_type).to_constant ()))
+        {
+          /* Ensure that the vector bitmasks do not have excess bits. */
+          int nunits = TYPE_VECTOR_SUBPARTS (mask_type).to_constant ();
+          mask_rtx = expand_binop (TYPE_MODE (mask_type), and_optab, mask_rtx,
+                                   GEN_INT ((HOST_WIDE_INT_1U << nunits) - 1),
+                                   NULL_RTX, true, OPTAB_WIDEN);
+        }
+
       create_input_operand (&ops[opno++], mask_rtx,
                             TYPE_MODE (TREE_TYPE (mask)));
     }
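
For readers who want to see the arithmetic in isolation, here is a minimal standalone sketch of the masking that the hunks above emit as RTL. It is not GCC code: the helper names `lane_mask` and `clear_excess_bits` are hypothetical, a plain `uint64_t` stands in for a 64-bit scalar mask mode, and `nunits` is assumed to be smaller than the mode precision (the case the `maybe_ne` checks guard). It shows a "not" on a 32-lane mask setting bits 32-63, and the AND with `(1 << nunits) - 1` clearing them again.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: a value with the low NUNITS bits set.  This mirrors
   the (HOST_WIDE_INT_1U << nunits) - 1 expression in the patch.  The shift
   is only well defined for nunits < 64, so that is asserted here.  */
static inline uint64_t
lane_mask (unsigned nunits)
{
  assert (nunits > 0 && nunits < 64);
  return (UINT64_C (1) << nunits) - 1;
}

/* Hypothetical helper: drop any bits at or above bit NUNITS, leaving only
   the bits that correspond to real vector lanes.  */
static inline uint64_t
clear_excess_bits (uint64_t mask, unsigned nunits)
{
  return mask & lane_mask (nunits);
}
```

With a 32-lane mask `m = 0x0F0F` held in a 64-bit integer, `~m` has every bit in 32-63 set, which is the situation described in the message above; `clear_excess_bits (~m, 32)` leaves only the inverted low 32 bits.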