From patchwork Sat Nov 2 12:58:21 2024
X-Patchwork-Submitter: Robin Dapp
X-Patchwork-Id: 100106
From: Robin Dapp
To: gcc-patches@gcc.gnu.org
Cc: rdapp.gcc@gmail.com, rguenther@suse.de, richard.sandiford@arm.com, jeffreyalaw@gmail.com, ams@baylibre.com, crazylht@gmail.com
Subject: [PATCH v3 1/8] docs: Document maskload else operand and behavior.
Date: Sat, 2 Nov 2024 13:58:21 +0100
Message-ID: <20241102125828.29183-2-rdapp.gcc@gmail.com>
In-Reply-To: <20241102125828.29183-1-rdapp.gcc@gmail.com>
References: <20241102125828.29183-1-rdapp.gcc@gmail.com>

This patch amends the documentation for masked loads (maskload,
vec_mask_load_lanes, and mask_gather_load as well as their len
counterparts) with an else operand.

gcc/ChangeLog:

	* doc/md.texi: Document masked load else operand.
---
 gcc/doc/md.texi | 63 ++++++++++++++++++++++++++++++++-----------------
 1 file changed, 41 insertions(+), 22 deletions(-)

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 6d9c8643739..38d839ac4c9 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5014,8 +5014,10 @@ This pattern is not allowed to @code{FAIL}.
 @item @samp{vec_mask_load_lanes@var{m}@var{n}}
 Like @samp{vec_load_lanes@var{m}@var{n}}, but takes an additional
 mask operand (operand 2) that specifies which elements of the destination
-vectors should be loaded.  Other elements of the destination
-vectors are set to zero.  The operation is equivalent to:
+vectors should be loaded.  Other elements of the destination vectors are
+taken from operand 3, which is an else operand similar to the one in
+@code{maskload}.
+The operation is equivalent to:
 
 @smallexample
 int c = GET_MODE_SIZE (@var{m}) / GET_MODE_SIZE (@var{n});
@@ -5025,7 +5027,7 @@ for (j = 0; j < GET_MODE_NUNITS (@var{n}); j++)
       operand0[i][j] = operand1[j * c + i];
   else
     for (i = 0; i < c; i++)
-      operand0[i][j] = 0;
+      operand0[i][j] = operand3[j];
 @end smallexample
 
 This pattern is not allowed to @code{FAIL}.
@@ -5033,16 +5035,20 @@ This pattern is not allowed to @code{FAIL}.
 @cindex @code{vec_mask_len_load_lanes@var{m}@var{n}} instruction pattern
 @item @samp{vec_mask_len_load_lanes@var{m}@var{n}}
 Like @samp{vec_load_lanes@var{m}@var{n}}, but takes an additional
-mask operand (operand 2), length operand (operand 3) as well as bias operand (operand 4)
-that specifies which elements of the destination vectors should be loaded.
-Other elements of the destination vectors are undefined.  The operation is equivalent to:
+mask operand (operand 2), length operand (operand 4) as well as bias operand
+(operand 5) that specifies which elements of the destination vectors should be
+loaded.  Other elements of the destination vectors are taken from operand 3,
+which is an else operand similar to the one in @code{maskload}.
+The operation is equivalent to:
 
 @smallexample
 int c = GET_MODE_SIZE (@var{m}) / GET_MODE_SIZE (@var{n});
-for (j = 0; j < operand3 + operand4; j++)
-  if (operand2[j])
-    for (i = 0; i < c; i++)
+for (j = 0; j < operand4 + operand5; j++)
+  for (i = 0; i < c; i++)
+    if (operand2[j])
       operand0[i][j] = operand1[j * c + i];
+    else
+      operand0[i][j] = operand3[j];
 @end smallexample
 
 This pattern is not allowed to @code{FAIL}.
@@ -5122,18 +5128,25 @@ address width.
 
 @cindex @code{mask_gather_load@var{m}@var{n}} instruction pattern
 @item @samp{mask_gather_load@var{m}@var{n}}
 Like @samp{gather_load@var{m}@var{n}}, but takes an extra mask operand as
-operand 5.  Bit @var{i} of the mask is set if element @var{i}
+operand 5.
+Other elements of the destination vectors are taken from operand 6,
+which is an else operand similar to the one in @code{maskload}.
+Bit @var{i} of the mask is set if element @var{i}
 of the result should be loaded from memory and clear if element @var{i}
-of the result should be set to zero.
+of the result should be set to operand 6.
 
 @cindex @code{mask_len_gather_load@var{m}@var{n}} instruction pattern
 @item @samp{mask_len_gather_load@var{m}@var{n}}
-Like @samp{gather_load@var{m}@var{n}}, but takes an extra mask operand (operand 5),
-a len operand (operand 6) as well as a bias operand (operand 7).  Similar to mask_len_load,
-the instruction loads at most (operand 6 + operand 7) elements from memory.
+Like @samp{gather_load@var{m}@var{n}}, but takes an extra mask operand
+(operand 5) and an else operand (operand 6) as well as a len operand
+(operand 7) and a bias operand (operand 8).
+
+Similar to mask_len_load the instruction loads at
+most (operand 7 + operand 8) elements from memory.
 Bit @var{i} of the mask is set if element @var{i} of the result should
-be loaded from memory and clear if element @var{i} of the result should be undefined.
-Mask elements @var{i} with @var{i} > (operand 6 + operand 7) are ignored.
+be loaded from memory and clear if element @var{i} of the result should
+be set to element @var{i} of operand 6.
+Mask elements @var{i} with @var{i} > (operand 7 + operand 8) are ignored.
 
 @cindex @code{scatter_store@var{m}@var{n}} instruction pattern
 @item @samp{scatter_store@var{m}@var{n}}
@@ -5365,8 +5378,13 @@ Operands 4 and 5 have a target-dependent scalar integer mode.
 
 @cindex @code{maskload@var{m}@var{n}} instruction pattern
 @item @samp{maskload@var{m}@var{n}}
 Perform a masked load of vector from memory operand 1 of mode @var{m}
-into register operand 0.  Mask is provided in register operand 2 of
-mode @var{n}.
+into register operand 0.  The mask is provided in register operand 2 of
+mode @var{n}.  Operand 3 (the ``else value'') is of mode @var{m} and
+specifies which value is loaded when the mask is unset.
+The predicate of operand 3 must only accept the else values that the target
+actually supports.  Currently three values are attempted, zero, -1, and
+undefined.  GCC handles an else value of zero more efficiently than -1 or
+undefined.
 
 This pattern is not allowed to @code{FAIL}.
@@ -5432,15 +5450,16 @@ Operands 0 and 1 have mode @var{m}, which must be a vector mode.  Operand 3 has
 whichever integer mode the target prefers.  A mask is specified in
 operand 2 which must be of type @var{n}.  The mask has lower precedence than
 the length and is itself subject to length masking,
-i.e. only mask indices < (operand 3 + operand 4) are used.
+i.e. only mask indices < (operand 4 + operand 5) are used.
+Operand 3 is an else operand similar to the one in @code{maskload}.
 Operand 4 conceptually has mode @code{QI}.
 
-Operand 2 can be a variable or a constant amount.  Operand 4 specifies a
+Operand 4 can be a variable or a constant amount.  Operand 5 specifies a
 constant bias: it is either a constant 0 or a constant -1.  The predicate on
-operand 4 must only accept the bias values that the target actually supports.
+operand 5 must only accept the bias values that the target actually supports.
 GCC handles a bias of 0 more efficiently than a bias of -1.
 
-If (operand 2 + operand 4) exceeds the number of elements in mode
+If (operand 4 + operand 5) exceeds the number of elements in mode
 @var{m}, the behavior is undefined.
 
 If the target prefers the length to be measured in bytes

From patchwork Sat Nov 2 12:58:22 2024
X-Patchwork-Submitter: Robin Dapp
X-Patchwork-Id: 100105
From: Robin Dapp
To: gcc-patches@gcc.gnu.org
Cc: rdapp.gcc@gmail.com, rguenther@suse.de, richard.sandiford@arm.com, jeffreyalaw@gmail.com, ams@baylibre.com, crazylht@gmail.com
Subject: [PATCH v3 2/8] ifn: Add else-operand handling.
Date: Sat, 2 Nov 2024 13:58:22 +0100
Message-ID: <20241102125828.29183-3-rdapp.gcc@gmail.com>
In-Reply-To: <20241102125828.29183-1-rdapp.gcc@gmail.com>
References: <20241102125828.29183-1-rdapp.gcc@gmail.com>

This patch adds else-operand handling to the internal functions.

gcc/ChangeLog:

	* internal-fn.cc (add_mask_and_len_args): Rename...
	(add_mask_else_and_len_args): ...to this and add else handling.
	(expand_partial_load_optab_fn): Use adjusted function.
	(expand_partial_store_optab_fn): Ditto.
	(expand_scatter_store_optab_fn): Ditto.
	(expand_gather_load_optab_fn): Ditto.
	(internal_fn_len_index): Add else handling.
	(internal_fn_else_index): Ditto.
	(internal_fn_mask_index): Ditto.
	(get_supported_else_vals): New function.
	(supported_else_val_p): New function.
	(internal_gather_scatter_fn_supported_p): Add else operand.
	* internal-fn.h (internal_gather_scatter_fn_supported_p):
	Define else constants.
	(MASK_LOAD_ELSE_ZERO): Ditto.
	(MASK_LOAD_ELSE_M1): Ditto.
	(MASK_LOAD_ELSE_UNDEFINED): Ditto.
	(get_supported_else_vals): Declare.
	(supported_else_val_p): Ditto.
---
 gcc/internal-fn.cc | 148 ++++++++++++++++++++++++++++++++++++++-------
 gcc/internal-fn.h  |  13 +++-
 2 files changed, 139 insertions(+), 22 deletions(-)

diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 1b3fe7be047..25d3c0c3c7e 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -333,17 +333,18 @@ get_multi_vector_move (tree array_type, convert_optab optab)
   return convert_optab_handler (optab, imode, vmode);
 }
 
-/* Add mask and len arguments according to the STMT.  */
+/* Add mask, else, and len arguments according to the STMT.  */
 
 static unsigned int
-add_mask_and_len_args (expand_operand *ops, unsigned int opno, gcall *stmt)
+add_mask_else_and_len_args (expand_operand *ops, unsigned int opno, gcall *stmt)
 {
   internal_fn ifn = gimple_call_internal_fn (stmt);
   int len_index = internal_fn_len_index (ifn);
 
   /* BIAS is always consecutive next of LEN.  */
   int bias_index = len_index + 1;
   int mask_index = internal_fn_mask_index (ifn);
-  /* The order of arguments are always {len,bias,mask}.  */
+
+  /* The order of arguments is always {mask, else, len, bias}.  */
   if (mask_index >= 0)
     {
       tree mask = gimple_call_arg (stmt, mask_index);
@@ -365,6 +366,22 @@ add_mask_and_len_args (expand_operand *ops, unsigned int opno, gcall *stmt)
       create_input_operand (&ops[opno++], mask_rtx,
 			    TYPE_MODE (TREE_TYPE (mask)));
     }
+
+  int els_index = internal_fn_else_index (ifn);
+  if (els_index >= 0)
+    {
+      tree els = gimple_call_arg (stmt, els_index);
+      tree els_type = TREE_TYPE (els);
+      if (TREE_CODE (els) == SSA_NAME
+	  && SSA_NAME_IS_DEFAULT_DEF (els)
+	  && VAR_P (SSA_NAME_VAR (els)))
+	create_undefined_input_operand (&ops[opno++], TYPE_MODE (els_type));
+      else
+	{
+	  rtx els_rtx = expand_normal (els);
+	  create_input_operand (&ops[opno++], els_rtx, TYPE_MODE (els_type));
+	}
+    }
   if (len_index >= 0)
     {
       tree len = gimple_call_arg (stmt, len_index);
@@ -3016,7 +3033,7 @@ static void
 expand_partial_load_optab_fn (internal_fn ifn, gcall *stmt, convert_optab optab)
 {
   int i = 0;
-  class expand_operand ops[5];
+  class expand_operand ops[6];
   tree type, lhs, rhs, maskt;
   rtx mem, target;
   insn_code icode;
@@ -3046,7 +3063,7 @@ expand_partial_load_optab_fn (internal_fn ifn, gcall *stmt, convert_optab optab)
   target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
   create_call_lhs_operand (&ops[i++], target, TYPE_MODE (type));
   create_fixed_operand (&ops[i++], mem);
-  i = add_mask_and_len_args (ops, i, stmt);
+  i = add_mask_else_and_len_args (ops, i, stmt);
   expand_insn (icode, i, ops);
 
   assign_call_lhs (lhs, target, &ops[0]);
@@ -3092,7 +3109,7 @@ expand_partial_store_optab_fn (internal_fn ifn, gcall *stmt, convert_optab optab)
   reg = expand_normal (rhs);
   create_fixed_operand (&ops[i++], mem);
   create_input_operand (&ops[i++], reg, TYPE_MODE (type));
-  i = add_mask_and_len_args (ops, i, stmt);
+  i = add_mask_else_and_len_args (ops, i, stmt);
   expand_insn (icode, i, ops);
 }
 
@@ -3678,7 +3695,7 @@ expand_scatter_store_optab_fn (internal_fn, gcall *stmt, direct_optab optab)
   create_integer_operand (&ops[i++], TYPE_UNSIGNED (TREE_TYPE (offset)));
   create_integer_operand (&ops[i++], scale_int);
   create_input_operand (&ops[i++], rhs_rtx, TYPE_MODE (TREE_TYPE (rhs)));
-  i = add_mask_and_len_args (ops, i, stmt);
+  i = add_mask_else_and_len_args (ops, i, stmt);
 
   insn_code icode = convert_optab_handler (optab, TYPE_MODE (TREE_TYPE (rhs)),
 					   TYPE_MODE (TREE_TYPE (offset)));
@@ -3701,13 +3718,13 @@ expand_gather_load_optab_fn (internal_fn, gcall *stmt, direct_optab optab)
   HOST_WIDE_INT scale_int = tree_to_shwi (scale);
 
   int i = 0;
-  class expand_operand ops[8];
+  class expand_operand ops[9];
   create_call_lhs_operand (&ops[i++], lhs_rtx, TYPE_MODE (TREE_TYPE (lhs)));
   create_address_operand (&ops[i++], base_rtx);
   create_input_operand (&ops[i++], offset_rtx, TYPE_MODE (TREE_TYPE (offset)));
   create_integer_operand (&ops[i++], TYPE_UNSIGNED (TREE_TYPE (offset)));
   create_integer_operand (&ops[i++], scale_int);
-  i = add_mask_and_len_args (ops, i, stmt);
+  i = add_mask_else_and_len_args (ops, i, stmt);
   insn_code icode = convert_optab_handler (optab, TYPE_MODE (TREE_TYPE (lhs)),
 					   TYPE_MODE (TREE_TYPE (offset)));
   expand_insn (icode, i, ops);
@@ -3729,14 +3746,14 @@ expand_strided_load_optab_fn (ATTRIBUTE_UNUSED internal_fn, gcall *stmt,
   rtx stride_rtx = expand_normal (stride);
 
   unsigned i = 0;
-  class expand_operand ops[6];
+  class expand_operand ops[7];
   machine_mode mode = TYPE_MODE (TREE_TYPE (lhs));
 
   create_output_operand (&ops[i++], lhs_rtx, mode);
   create_address_operand (&ops[i++], base_rtx);
   create_address_operand (&ops[i++], stride_rtx);
-  i = add_mask_and_len_args (ops, i, stmt);
+  i = add_mask_else_and_len_args (ops, i, stmt);
 
   expand_insn (direct_optab_handler (optab, mode), i, ops);
   if (!rtx_equal_p (lhs_rtx, ops[0].value))
@@ -3768,7 +3785,7 @@ expand_strided_store_optab_fn (ATTRIBUTE_UNUSED internal_fn, gcall *stmt,
   create_address_operand (&ops[i++], stride_rtx);
   create_input_operand (&ops[i++], rhs_rtx, mode);
 
-  i = add_mask_and_len_args (ops, i, stmt);
+  i = add_mask_else_and_len_args (ops, i, stmt);
   expand_insn (direct_optab_handler (optab, mode), i, ops);
 }
 
@@ -4662,6 +4679,18 @@ get_len_internal_fn (internal_fn fn)
     case IFN_COND_##NAME:                                                     \
       return IFN_COND_LEN_##NAME;
 #include "internal-fn.def"
+    default:
+      break;
+    }
+
+  switch (fn)
+    {
+    case IFN_MASK_LOAD:
+      return IFN_MASK_LEN_LOAD;
+    case IFN_MASK_LOAD_LANES:
+      return IFN_MASK_LEN_LOAD_LANES;
+    case IFN_MASK_GATHER_LOAD:
+      return IFN_MASK_LEN_GATHER_LOAD;
     default:
       return IFN_LAST;
     }
@@ -4847,8 +4876,13 @@ internal_fn_len_index (internal_fn fn)
     case IFN_LEN_STORE:
       return 2;
 
-    case IFN_MASK_LEN_GATHER_LOAD:
     case IFN_MASK_LEN_SCATTER_STORE:
+    case IFN_MASK_LEN_STRIDED_LOAD:
+      return 5;
+
+    case IFN_MASK_LEN_GATHER_LOAD:
+      return 6;
+
     case IFN_COND_LEN_FMA:
     case IFN_COND_LEN_FMS:
     case IFN_COND_LEN_FNMA:
@@ -4870,18 +4904,19 @@ internal_fn_len_index (internal_fn fn)
     case IFN_COND_LEN_XOR:
     case IFN_COND_LEN_SHL:
     case IFN_COND_LEN_SHR:
-    case IFN_MASK_LEN_STRIDED_LOAD:
     case IFN_MASK_LEN_STRIDED_STORE:
       return 4;
 
     case IFN_COND_LEN_NEG:
-    case IFN_MASK_LEN_LOAD:
     case IFN_MASK_LEN_STORE:
-    case IFN_MASK_LEN_LOAD_LANES:
     case IFN_MASK_LEN_STORE_LANES:
     case IFN_VCOND_MASK_LEN:
       return 3;
 
+    case IFN_MASK_LEN_LOAD:
+    case IFN_MASK_LEN_LOAD_LANES:
+      return 4;
+
     default:
       return -1;
     }
@@ -4931,6 +4966,12 @@ internal_fn_else_index (internal_fn fn)
     case IFN_COND_LEN_SHR:
       return 3;
 
+    case IFN_MASK_LOAD:
+    case IFN_MASK_LEN_LOAD:
+    case IFN_MASK_LOAD_LANES:
+    case IFN_MASK_LEN_LOAD_LANES:
+      return 3;
+
     case IFN_COND_FMA:
     case IFN_COND_FMS:
     case IFN_COND_FNMA:
@@ -4939,8 +4980,13 @@
     case IFN_COND_LEN_FMS:
     case IFN_COND_LEN_FNMA:
     case IFN_COND_LEN_FNMS:
+    case IFN_MASK_LEN_STRIDED_LOAD:
       return 4;
 
+    case IFN_MASK_GATHER_LOAD:
+    case IFN_MASK_LEN_GATHER_LOAD:
+      return 5;
+
     default:
       return -1;
     }
@@ -4976,6 +5022,7 @@ internal_fn_mask_index (internal_fn fn)
     case IFN_MASK_LEN_SCATTER_STORE:
       return 4;
 
+    case IFN_VCOND_MASK:
     case IFN_VCOND_MASK_LEN:
       return 0;
 
@@ -5015,6 +5062,52 @@ internal_fn_stored_value_index (internal_fn fn)
     }
 }
 
+/* Push all supported else values for the optab referred to by ICODE
+   into ELSE_VALS.  The index of the else operand must be specified in
+   ELSE_INDEX.  */
+
+void
+get_supported_else_vals (enum insn_code icode, unsigned else_index,
+			 vec<int> &else_vals)
+{
+  const struct insn_data_d *data = &insn_data[icode];
+  if ((char) else_index >= data->n_operands)
+    return;
+
+  machine_mode else_mode = data->operand[else_index].mode;
+
+  else_vals.truncate (0);
+
+  /* For now we only support else values of 0, -1, and "undefined".  */
+  if (insn_operand_matches (icode, else_index, CONST0_RTX (else_mode)))
+    else_vals.safe_push (MASK_LOAD_ELSE_ZERO);
+
+  if (insn_operand_matches (icode, else_index, gen_rtx_SCRATCH (else_mode)))
+    else_vals.safe_push (MASK_LOAD_ELSE_UNDEFINED);
+
+  if (GET_MODE_CLASS (else_mode) == MODE_VECTOR_INT
+      && insn_operand_matches (icode, else_index, CONSTM1_RTX (else_mode)))
+    else_vals.safe_push (MASK_LOAD_ELSE_M1);
+}
+
+/* Return true if the else value ELSE_VAL (one of MASK_LOAD_ELSE_ZERO,
+   MASK_LOAD_ELSE_M1, and MASK_LOAD_ELSE_UNDEFINED) is valid for the optab
+   referred to by ICODE.  The index of the else operand must be specified
+   in ELSE_INDEX.  */
+
+bool
+supported_else_val_p (enum insn_code icode, unsigned else_index, int else_val)
+{
+  if (else_val != MASK_LOAD_ELSE_ZERO && else_val != MASK_LOAD_ELSE_M1
+      && else_val != MASK_LOAD_ELSE_UNDEFINED)
+    gcc_unreachable ();
+
+  auto_vec<int> else_vals;
+  get_supported_else_vals (icode, else_index, else_vals);
+  return else_vals.contains (else_val);
+}
+
 /* Return true if the target supports gather load or scatter store function
    IFN.  For loads, VECTOR_TYPE is the vector type of the load result,
    while for stores it is the vector type of the stored data argument.
@@ -5022,12 +5115,15 @@ internal_fn_stored_value_index (internal_fn fn)
    or stored.
    OFFSET_VECTOR_TYPE is the vector type that holds the offset from the
    shared base address of each loaded or stored element.
    SCALE is the amount by which these offsets should be multiplied
-   *after* they have been extended to address width.  */
+   *after* they have been extended to address width.
+   If the target supports the gather load the supported else values
+   will be added to the vector ELSVALS points to if it is nonzero.  */
 
 bool
 internal_gather_scatter_fn_supported_p (internal_fn ifn, tree vector_type,
 					tree memory_element_type,
-					tree offset_vector_type, int scale)
+					tree offset_vector_type, int scale,
+					vec<int> *elsvals)
 {
   if (!tree_int_cst_equal (TYPE_SIZE (TREE_TYPE (vector_type)),
 			   TYPE_SIZE (memory_element_type)))
@@ -5040,9 +5136,19 @@ internal_gather_scatter_fn_supported_p (internal_fn ifn, tree vector_type,
 				    TYPE_MODE (offset_vector_type));
   int output_ops = internal_load_fn_p (ifn) ? 1 : 0;
   bool unsigned_p = TYPE_UNSIGNED (TREE_TYPE (offset_vector_type));
-  return (icode != CODE_FOR_nothing
-	  && insn_operand_matches (icode, 2 + output_ops, GEN_INT (unsigned_p))
-	  && insn_operand_matches (icode, 3 + output_ops, GEN_INT (scale)));
+  bool ok = icode != CODE_FOR_nothing
+    && insn_operand_matches (icode, 2 + output_ops, GEN_INT (unsigned_p))
+    && insn_operand_matches (icode, 3 + output_ops, GEN_INT (scale));
+
+  /* For gather the optab's operand indices do not match the IFN's because
+     the latter does not have the extension operand (operand 3).  It is
+     implicitly added during expansion so we use the IFN's else index + 1.
+   */
+  if (ok && elsvals)
+    get_supported_else_vals
+      (icode, internal_fn_else_index (IFN_MASK_GATHER_LOAD) + 1, *elsvals);
+
+  return ok;
 }
 
 /* Return true if the target supports IFN_CHECK_{RAW,WAR}_PTRS function IFN

diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
index 2785a5a95a2..37fbc60f6dd 100644
--- a/gcc/internal-fn.h
+++ b/gcc/internal-fn.h
@@ -240,9 +240,20 @@ extern int internal_fn_len_index (internal_fn);
 extern int internal_fn_else_index (internal_fn);
 extern int internal_fn_stored_value_index (internal_fn);
 extern bool internal_gather_scatter_fn_supported_p (internal_fn, tree,
-						    tree, tree, int);
+						    tree, tree, int,
+						    vec<int> * = nullptr);
 extern bool internal_check_ptrs_fn_supported_p (internal_fn, tree,
 						poly_uint64, unsigned int);
+
+/* Integer constants representing which else value is supported for masked load
+   functions.  */
+#define MASK_LOAD_ELSE_ZERO -1
+#define MASK_LOAD_ELSE_M1 -2
+#define MASK_LOAD_ELSE_UNDEFINED -3
+
+extern void get_supported_else_vals (enum insn_code, unsigned, vec<int> &);
+extern bool supported_else_val_p (enum insn_code, unsigned, int);
+
 #define VECT_PARTIAL_BIAS_UNSUPPORTED 127
 
 extern signed char internal_len_load_store_bias (internal_fn ifn,

From patchwork Sat Nov 2 12:58:23 2024
X-Patchwork-Submitter: Robin Dapp
X-Patchwork-Id: 100110
From: Robin Dapp
To: gcc-patches@gcc.gnu.org
Cc: rdapp.gcc@gmail.com, rguenther@suse.de, richard.sandiford@arm.com, jeffreyalaw@gmail.com, ams@baylibre.com, crazylht@gmail.com
Subject: [PATCH v3 3/8] tree-ifcvt: Add zero maskload else value.
Date: Sat, 2 Nov 2024 13:58:23 +0100
Message-ID: <20241102125828.29183-4-rdapp.gcc@gmail.com>
In-Reply-To: <20241102125828.29183-1-rdapp.gcc@gmail.com>
References: <20241102125828.29183-1-rdapp.gcc@gmail.com>

When predicating a load we implicitly assume that the else value is
zero.  This matters when the loaded value is padded (e.g. a Bool) and
we must ensure that the padding bytes are zero on targets that don't
implicitly zero inactive elements.

A former version of this patch still had this handling in ifcvt; the
latest version defers it to the vectorizer.

gcc/ChangeLog:

	* tree-if-conv.cc (predicate_load_or_store): Add zero else
	operand and comment.
---
 gcc/tree-if-conv.cc | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/gcc/tree-if-conv.cc b/gcc/tree-if-conv.cc
index eb981642bae..f1a1f8fd0d3 100644
--- a/gcc/tree-if-conv.cc
+++ b/gcc/tree-if-conv.cc
@@ -2555,9 +2555,17 @@ predicate_load_or_store (gimple_stmt_iterator *gsi, gassign *stmt, tree mask)
 				 ref);
   if (TREE_CODE (lhs) == SSA_NAME)
     {
+      /* Get a zero else value.  This might not be what a target actually uses
+	 but we cannot be sure about which vector mode the vectorizer will
+	 choose.  Therefore, leave the decision whether we need to force the
+	 inactive elements to zero to the vectorizer.  */
+      tree els = vect_get_mask_load_else (MASK_LOAD_ELSE_ZERO,
+					  TREE_TYPE (lhs));
+
       new_stmt
-	= gimple_build_call_internal (IFN_MASK_LOAD, 3, addr,
-				      ptr, mask);
+	= gimple_build_call_internal (IFN_MASK_LOAD, 4, addr,
+				      ptr, mask, els);
+
       gimple_call_set_lhs (new_stmt, lhs);
       gimple_set_vuse (new_stmt, gimple_vuse (stmt));
     }

From patchwork Sat Nov 2 12:58:24 2024
X-Patchwork-Submitter: Robin Dapp
X-Patchwork-Id: 100107
From: Robin Dapp
To: gcc-patches@gcc.gnu.org
Cc: rdapp.gcc@gmail.com, rguenther@suse.de, richard.sandiford@arm.com, jeffreyalaw@gmail.com, ams@baylibre.com, crazylht@gmail.com
Subject: [PATCH v3 4/8] vect: Add maskload else value support.
Date: Sat, 2 Nov 2024 13:58:24 +0100
Message-ID: <20241102125828.29183-5-rdapp.gcc@gmail.com>
In-Reply-To: <20241102125828.29183-1-rdapp.gcc@gmail.com>
References: <20241102125828.29183-1-rdapp.gcc@gmail.com>

This patch adds an else operand to vectorized masked load calls.  The
current implementation adds else-value arguments to the respective
target-querying functions; these are used to supply the vectorizer with
the proper else value.

We query the target for its supported else operand and use that for the
maskload call.  If necessary, i.e. if the mode has padding bits and the
else operand is nonzero, a VEC_COND enforcing a zero else value is
emitted.
gcc/ChangeLog:

	* optabs-query.cc (supports_vec_convert_optab_p): Return icode.
	(get_supported_else_val): Return supported else value for optab's
	operand at index.
	(supports_vec_gather_load_p): Add else argument.
	(supports_vec_scatter_store_p): Ditto.
	* optabs-query.h (supports_vec_gather_load_p): Ditto.
	(get_supported_else_val): Ditto.
	* optabs-tree.cc (target_supports_mask_load_store_p): Ditto.
	(can_vec_mask_load_store_p): Ditto.
	(target_supports_len_load_store_p): Ditto.
	(get_len_load_store_mode): Ditto.
	* optabs-tree.h (target_supports_mask_load_store_p): Ditto.
	(can_vec_mask_load_store_p): Ditto.
	* tree-vect-data-refs.cc (vect_lanes_optab_supported_p): Ditto.
	(vect_gather_scatter_fn_p): Ditto.
	(vect_check_gather_scatter): Ditto.
	(vect_load_lanes_supported): Ditto.
	* tree-vect-patterns.cc (vect_recog_gather_scatter_pattern): Ditto.
	* tree-vect-slp.cc (vect_get_operand_map): Adjust indices for else
	operand.
	(vect_slp_analyze_node_operations): Skip undefined else operand.
	* tree-vect-stmts.cc (exist_non_indexing_operands_for_use_p): Add
	else operand handling.
	(vect_get_vec_defs_for_operand): Handle undefined else operand.
	(check_load_store_for_partial_vectors): Add else argument.
	(vect_truncate_gather_scatter_offset): Ditto.
	(vect_use_strided_gather_scatters_p): Ditto.
	(get_group_load_store_type): Ditto.
	(get_load_store_type): Ditto.
	(vect_get_mask_load_else): Ditto.
	(vect_get_else_val_from_tree): Ditto.
	(vect_build_one_gather_load_call): Add zero else operand.
	(vectorizable_load): Use else operand.
	* tree-vectorizer.h (vect_gather_scatter_fn_p): Add else argument.
	(vect_load_lanes_supported): Ditto.
	(vect_get_mask_load_else): Ditto.
	(vect_get_else_val_from_tree): Ditto.
--- gcc/optabs-query.cc | 70 +++++--- gcc/optabs-query.h | 3 +- gcc/optabs-tree.cc | 66 ++++++-- gcc/optabs-tree.h | 8 +- gcc/tree-vect-data-refs.cc | 74 ++++++--- gcc/tree-vect-patterns.cc | 12 +- gcc/tree-vect-slp.cc | 25 ++- gcc/tree-vect-stmts.cc | 324 ++++++++++++++++++++++++++++++------- gcc/tree-vectorizer.h | 10 +- 9 files changed, 463 insertions(+), 129 deletions(-) diff --git a/gcc/optabs-query.cc b/gcc/optabs-query.cc index cc52bc0f5ea..c1f3558af92 100644 --- a/gcc/optabs-query.cc +++ b/gcc/optabs-query.cc @@ -29,6 +29,9 @@ along with GCC; see the file COPYING3. If not see #include "rtl.h" #include "recog.h" #include "vec-perm-indices.h" +#include "internal-fn.h" +#include "memmodel.h" +#include "optabs.h" struct target_optabs default_target_optabs; struct target_optabs *this_fn_optabs = &default_target_optabs; @@ -672,34 +675,57 @@ lshift_cheap_p (bool speed_p) that mode, given that the second mode is always an integer vector. If MODE is VOIDmode, return true if OP supports any vector mode. */ -static bool -supports_vec_convert_optab_p (optab op, machine_mode mode) +static enum insn_code +supported_vec_convert_optab (optab op, machine_mode mode) { int start = mode == VOIDmode ? 0 : mode; int end = mode == VOIDmode ? MAX_MACHINE_MODE - 1 : mode; + enum insn_code icode = CODE_FOR_nothing; for (int i = start; i <= end; ++i) if (VECTOR_MODE_P ((machine_mode) i)) for (int j = MIN_MODE_VECTOR_INT; j < MAX_MODE_VECTOR_INT; ++j) - if (convert_optab_handler (op, (machine_mode) i, - (machine_mode) j) != CODE_FOR_nothing) - return true; + { + if ((icode + = convert_optab_handler (op, (machine_mode) i, + (machine_mode) j)) != CODE_FOR_nothing) + return icode; + } - return false; + return icode; } /* If MODE is not VOIDmode, return true if vec_gather_load is available for that mode. If MODE is VOIDmode, return true if gather_load is available - for at least one vector mode. */ + for at least one vector mode. 
+ In that case, and if ELSVALS is nonzero, store the supported else values + into the vector it points to. */ bool -supports_vec_gather_load_p (machine_mode mode) +supports_vec_gather_load_p (machine_mode mode, vec *elsvals) { - if (!this_fn_optabs->supports_vec_gather_load[mode]) - this_fn_optabs->supports_vec_gather_load[mode] - = (supports_vec_convert_optab_p (gather_load_optab, mode) - || supports_vec_convert_optab_p (mask_gather_load_optab, mode) - || supports_vec_convert_optab_p (mask_len_gather_load_optab, mode) - ? 1 : -1); + enum insn_code icode = CODE_FOR_nothing; + if (!this_fn_optabs->supports_vec_gather_load[mode] || elsvals) + { + /* Try the masked variants first. In case we later decide that we + need a mask after all (thus requiring an else operand) we need + to query it below and we cannot do that when using the + non-masked optab. */ + icode = supported_vec_convert_optab (mask_gather_load_optab, mode); + if (icode == CODE_FOR_nothing) + icode = supported_vec_convert_optab (mask_len_gather_load_optab, mode); + if (icode == CODE_FOR_nothing) + icode = supported_vec_convert_optab (gather_load_optab, mode); + this_fn_optabs->supports_vec_gather_load[mode] + = (icode != CODE_FOR_nothing) ? 1 : -1; + } + + /* For gather the optab's operand indices do not match the IFN's because + the latter does not have the extension operand (operand 3). It is + implicitly added during expansion so we use the IFN's else index + 1. 
+ */ + if (elsvals && icode != CODE_FOR_nothing) + get_supported_else_vals + (icode, internal_fn_else_index (IFN_MASK_GATHER_LOAD) + 1, *elsvals); return this_fn_optabs->supports_vec_gather_load[mode] > 0; } @@ -711,12 +737,18 @@ supports_vec_gather_load_p (machine_mode mode) bool supports_vec_scatter_store_p (machine_mode mode) { + enum insn_code icode; if (!this_fn_optabs->supports_vec_scatter_store[mode]) - this_fn_optabs->supports_vec_scatter_store[mode] - = (supports_vec_convert_optab_p (scatter_store_optab, mode) - || supports_vec_convert_optab_p (mask_scatter_store_optab, mode) - || supports_vec_convert_optab_p (mask_len_scatter_store_optab, mode) - ? 1 : -1); + { + icode = supported_vec_convert_optab (scatter_store_optab, mode); + if (icode == CODE_FOR_nothing) + icode = supported_vec_convert_optab (mask_scatter_store_optab, mode); + if (icode == CODE_FOR_nothing) + icode = supported_vec_convert_optab (mask_len_scatter_store_optab, + mode); + this_fn_optabs->supports_vec_scatter_store[mode] + = (icode != CODE_FOR_nothing) ? 1 : -1; + } return this_fn_optabs->supports_vec_scatter_store[mode] > 0; } diff --git a/gcc/optabs-query.h b/gcc/optabs-query.h index 0cb2c21ba85..f38b1e5d5bb 100644 --- a/gcc/optabs-query.h +++ b/gcc/optabs-query.h @@ -191,7 +191,8 @@ bool can_compare_and_swap_p (machine_mode, bool); bool can_atomic_exchange_p (machine_mode, bool); bool can_atomic_load_p (machine_mode); bool lshift_cheap_p (bool); -bool supports_vec_gather_load_p (machine_mode = E_VOIDmode); +bool supports_vec_gather_load_p (machine_mode = E_VOIDmode, + vec * = nullptr); bool supports_vec_scatter_store_p (machine_mode = E_VOIDmode); bool can_vec_extract (machine_mode, machine_mode); diff --git a/gcc/optabs-tree.cc b/gcc/optabs-tree.cc index b69a5bc3676..3d2d782ea32 100644 --- a/gcc/optabs-tree.cc +++ b/gcc/optabs-tree.cc @@ -29,6 +29,7 @@ along with GCC; see the file COPYING3. 
If not see #include "optabs.h" #include "optabs-tree.h" #include "stor-layout.h" +#include "internal-fn.h" /* Return the optab used for computing the operation given by the tree code, CODE and the tree EXP. This function is not always usable (for example, it @@ -552,24 +553,38 @@ target_supports_op_p (tree type, enum tree_code code, or mask_len_{load,store}. This helper function checks whether target supports masked load/store and return corresponding IFN in the last argument - (IFN_MASK_{LOAD,STORE} or IFN_MASK_LEN_{LOAD,STORE}). */ + (IFN_MASK_{LOAD,STORE} or IFN_MASK_LEN_{LOAD,STORE}). + If there is support and ELSVALS is nonzero store the possible else values + in the vector it points to. */ -static bool +bool target_supports_mask_load_store_p (machine_mode mode, machine_mode mask_mode, - bool is_load, internal_fn *ifn) + bool is_load, internal_fn *ifn, + vec *elsvals) { optab op = is_load ? maskload_optab : maskstore_optab; optab len_op = is_load ? mask_len_load_optab : mask_len_store_optab; - if (convert_optab_handler (op, mode, mask_mode) != CODE_FOR_nothing) + enum insn_code icode; + if ((icode = convert_optab_handler (op, mode, mask_mode)) + != CODE_FOR_nothing) { if (ifn) *ifn = is_load ? IFN_MASK_LOAD : IFN_MASK_STORE; + if (elsvals && is_load) + get_supported_else_vals (icode, + internal_fn_else_index (IFN_MASK_LOAD), + *elsvals); return true; } - else if (convert_optab_handler (len_op, mode, mask_mode) != CODE_FOR_nothing) + else if ((icode = convert_optab_handler (len_op, mode, mask_mode)) + != CODE_FOR_nothing) { if (ifn) *ifn = is_load ? IFN_MASK_LEN_LOAD : IFN_MASK_LEN_STORE; + if (elsvals && is_load) + get_supported_else_vals (icode, + internal_fn_else_index (IFN_MASK_LEN_LOAD), + *elsvals); return true; } return false; @@ -578,19 +593,23 @@ target_supports_mask_load_store_p (machine_mode mode, machine_mode mask_mode, /* Return true if target supports vector masked load/store for mode. 
An additional output in the last argument which is the IFN pointer. We set IFN as MASK_{LOAD,STORE} or MASK_LEN_{LOAD,STORE} according - which optab is supported in the target. */ + which optab is supported in the target. + If there is support and ELSVALS is nonzero store the possible else values + in the vector it points to. */ bool can_vec_mask_load_store_p (machine_mode mode, machine_mode mask_mode, bool is_load, - internal_fn *ifn) + internal_fn *ifn, + vec *elsvals) { machine_mode vmode; /* If mode is vector mode, check it directly. */ if (VECTOR_MODE_P (mode)) - return target_supports_mask_load_store_p (mode, mask_mode, is_load, ifn); + return target_supports_mask_load_store_p (mode, mask_mode, is_load, ifn, + elsvals); /* Otherwise, return true if there is some vector mode with the mask load/store supported. */ @@ -604,7 +623,8 @@ can_vec_mask_load_store_p (machine_mode mode, vmode = targetm.vectorize.preferred_simd_mode (smode); if (VECTOR_MODE_P (vmode) && targetm.vectorize.get_mask_mode (vmode).exists (&mask_mode) - && target_supports_mask_load_store_p (vmode, mask_mode, is_load, ifn)) + && target_supports_mask_load_store_p (vmode, mask_mode, is_load, ifn, + elsvals)) return true; auto_vector_modes vector_modes; @@ -612,7 +632,8 @@ can_vec_mask_load_store_p (machine_mode mode, for (machine_mode base_mode : vector_modes) if (related_vector_mode (base_mode, smode).exists (&vmode) && targetm.vectorize.get_mask_mode (vmode).exists (&mask_mode) - && target_supports_mask_load_store_p (vmode, mask_mode, is_load, ifn)) + && target_supports_mask_load_store_p (vmode, mask_mode, is_load, ifn, + elsvals)) return true; return false; } @@ -622,11 +643,13 @@ can_vec_mask_load_store_p (machine_mode mode, or mask_len_{load,store}. This helper function checks whether target supports len load/store and return corresponding IFN in the last argument - (IFN_LEN_{LOAD,STORE} or IFN_MASK_LEN_{LOAD,STORE}). */ + (IFN_LEN_{LOAD,STORE} or IFN_MASK_LEN_{LOAD,STORE}). 
+ If there is support and ELSVALS is nonzero store the possible + else values in the vector it points to. */ static bool target_supports_len_load_store_p (machine_mode mode, bool is_load, - internal_fn *ifn) + internal_fn *ifn, vec *elsvals) { optab op = is_load ? len_load_optab : len_store_optab; optab masked_op = is_load ? mask_len_load_optab : mask_len_store_optab; @@ -638,11 +661,17 @@ target_supports_len_load_store_p (machine_mode mode, bool is_load, return true; } machine_mode mask_mode; + enum insn_code icode; if (targetm.vectorize.get_mask_mode (mode).exists (&mask_mode) - && convert_optab_handler (masked_op, mode, mask_mode) != CODE_FOR_nothing) + && ((icode = convert_optab_handler (masked_op, mode, mask_mode)) + != CODE_FOR_nothing)) { if (ifn) *ifn = is_load ? IFN_MASK_LEN_LOAD : IFN_MASK_LEN_STORE; + if (elsvals && is_load) + get_supported_else_vals (icode, + internal_fn_else_index (IFN_MASK_LEN_LOAD), + *elsvals); return true; } return false; @@ -656,22 +685,25 @@ target_supports_len_load_store_p (machine_mode mode, bool is_load, VnQI to wrap the other supportable same size vector modes. An additional output in the last argument which is the IFN pointer. We set IFN as LEN_{LOAD,STORE} or MASK_LEN_{LOAD,STORE} according - which optab is supported in the target. */ + which optab is supported in the target. + If there is support and ELSVALS is nonzero store the possible else values + in the vector it points to. */ opt_machine_mode -get_len_load_store_mode (machine_mode mode, bool is_load, internal_fn *ifn) +get_len_load_store_mode (machine_mode mode, bool is_load, internal_fn *ifn, + vec *elsvals) { gcc_assert (VECTOR_MODE_P (mode)); /* Check if length in lanes supported for this mode directly. */ - if (target_supports_len_load_store_p (mode, is_load, ifn)) + if (target_supports_len_load_store_p (mode, is_load, ifn, elsvals)) return mode; /* Check if length in bytes supported for same vector size VnQI.
*/ machine_mode vmode; poly_uint64 nunits = GET_MODE_SIZE (mode); if (related_vector_mode (mode, QImode, nunits).exists (&vmode) - && target_supports_len_load_store_p (vmode, is_load, ifn)) + && target_supports_len_load_store_p (vmode, is_load, ifn, elsvals)) return vmode; return opt_machine_mode (); diff --git a/gcc/optabs-tree.h b/gcc/optabs-tree.h index 85805fd8296..37102c94f0c 100644 --- a/gcc/optabs-tree.h +++ b/gcc/optabs-tree.h @@ -47,9 +47,13 @@ bool expand_vec_cond_expr_p (tree, tree, enum tree_code = ERROR_MARK); void init_tree_optimization_optabs (tree); bool target_supports_op_p (tree, enum tree_code, enum optab_subtype = optab_default); +bool target_supports_mask_load_store_p (machine_mode, machine_mode, + bool, internal_fn *, vec *); bool can_vec_mask_load_store_p (machine_mode, machine_mode, bool, - internal_fn * = nullptr); + internal_fn * = nullptr, + vec * = nullptr); opt_machine_mode get_len_load_store_mode (machine_mode, bool, - internal_fn * = nullptr); + internal_fn * = nullptr, + vec * = nullptr); #endif diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc index 54ad5c8f3dc..6fa293e607c 100644 --- a/gcc/tree-vect-data-refs.cc +++ b/gcc/tree-vect-data-refs.cc @@ -55,13 +55,18 @@ along with GCC; see the file COPYING3. If not see #include "vec-perm-indices.h" #include "internal-fn.h" #include "gimple-fold.h" +#include "optabs-query.h" /* Return true if load- or store-lanes optab OPTAB is implemented for - COUNT vectors of type VECTYPE. NAME is the name of OPTAB. */ + COUNT vectors of type VECTYPE. NAME is the name of OPTAB. + + If it is implemented and ELSVALS is nonzero add the possible else values + to the vector it points to. 
*/ static bool vect_lanes_optab_supported_p (const char *name, convert_optab optab, - tree vectype, unsigned HOST_WIDE_INT count) + tree vectype, unsigned HOST_WIDE_INT count, + vec *elsvals = nullptr) { machine_mode mode, array_mode; bool limit_p; @@ -81,7 +86,9 @@ vect_lanes_optab_supported_p (const char *name, convert_optab optab, } } - if (convert_optab_handler (optab, array_mode, mode) == CODE_FOR_nothing) + enum insn_code icode; + if ((icode = convert_optab_handler (optab, array_mode, mode)) + == CODE_FOR_nothing) { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, @@ -92,8 +99,13 @@ vect_lanes_optab_supported_p (const char *name, convert_optab optab, if (dump_enabled_p ()) dump_printf_loc (MSG_NOTE, vect_location, - "can use %s<%s><%s>\n", name, GET_MODE_NAME (array_mode), - GET_MODE_NAME (mode)); + "can use %s<%s><%s>\n", name, GET_MODE_NAME (array_mode), + GET_MODE_NAME (mode)); + + if (elsvals) + get_supported_else_vals (icode, + internal_fn_else_index (IFN_MASK_LEN_LOAD_LANES), + *elsvals); return true; } @@ -4184,13 +4196,15 @@ vect_prune_runtime_alias_test_list (loop_vec_info loop_vinfo) be multiplied *after* it has been converted to address width. Return true if the function is supported, storing the function id in - *IFN_OUT and the vector type for the offset in *OFFSET_VECTYPE_OUT. */ + *IFN_OUT and the vector type for the offset in *OFFSET_VECTYPE_OUT. + + If we can use gather and add the possible else values to ELSVALS. */ bool vect_gather_scatter_fn_p (vec_info *vinfo, bool read_p, bool masked_p, tree vectype, tree memory_type, tree offset_type, int scale, internal_fn *ifn_out, - tree *offset_vectype_out) + tree *offset_vectype_out, vec *elsvals) { unsigned int memory_bits = tree_to_uhwi (TYPE_SIZE (memory_type)); unsigned int element_bits = vector_element_bits (vectype); @@ -4228,7 +4242,8 @@ vect_gather_scatter_fn_p (vec_info *vinfo, bool read_p, bool masked_p, /* Test whether the target supports this combination. 
*/ if (internal_gather_scatter_fn_supported_p (ifn, vectype, memory_type, - offset_vectype, scale)) + offset_vectype, scale, + elsvals)) { *ifn_out = ifn; *offset_vectype_out = offset_vectype; @@ -4238,7 +4253,7 @@ vect_gather_scatter_fn_p (vec_info *vinfo, bool read_p, bool masked_p, && internal_gather_scatter_fn_supported_p (alt_ifn, vectype, memory_type, offset_vectype, - scale)) + scale, elsvals)) { *ifn_out = alt_ifn; *offset_vectype_out = offset_vectype; @@ -4246,7 +4261,8 @@ vect_gather_scatter_fn_p (vec_info *vinfo, bool read_p, bool masked_p, } else if (internal_gather_scatter_fn_supported_p (alt_ifn2, vectype, memory_type, - offset_vectype, scale)) + offset_vectype, scale, + elsvals)) { *ifn_out = alt_ifn2; *offset_vectype_out = offset_vectype; @@ -4285,11 +4301,13 @@ vect_describe_gather_scatter_call (stmt_vec_info stmt_info, } /* Return true if a non-affine read or write in STMT_INFO is suitable for a - gather load or scatter store. Describe the operation in *INFO if so. */ + gather load or scatter store. Describe the operation in *INFO if so. + If it is suitable and ELSVALS is nonzero add the supported else values + to the vector it points to. */ bool vect_check_gather_scatter (stmt_vec_info stmt_info, loop_vec_info loop_vinfo, - gather_scatter_info *info) + gather_scatter_info *info, vec *elsvals) { HOST_WIDE_INT scale = 1; poly_int64 pbitpos, pbitsize; @@ -4314,6 +4332,13 @@ vect_check_gather_scatter (stmt_vec_info stmt_info, loop_vec_info loop_vinfo, if (internal_gather_scatter_fn_p (ifn)) { vect_describe_gather_scatter_call (stmt_info, info); + + /* In pattern recog we simply used a ZERO else value that + we need to correct here. To that end just re-use the + (already successful) check if we support a gather IFN + and have it populate the else values.
*/ + if (DR_IS_READ (dr) && internal_fn_mask_index (ifn) >= 0 && elsvals) + supports_vec_gather_load_p (TYPE_MODE (vectype), elsvals); return true; } masked_p = (ifn == IFN_MASK_LOAD || ifn == IFN_MASK_STORE); @@ -4322,7 +4347,8 @@ vect_check_gather_scatter (stmt_vec_info stmt_info, loop_vec_info loop_vinfo, /* True if we should aim to use internal functions rather than built-in functions. */ bool use_ifn_p = (DR_IS_READ (dr) - ? supports_vec_gather_load_p (TYPE_MODE (vectype)) + ? supports_vec_gather_load_p (TYPE_MODE (vectype), + elsvals) : supports_vec_scatter_store_p (TYPE_MODE (vectype))); base = DR_REF (dr); @@ -4479,12 +4505,14 @@ vect_check_gather_scatter (stmt_vec_info stmt_info, loop_vec_info loop_vinfo, masked_p, vectype, memory_type, signed_char_type_node, new_scale, &ifn, - &offset_vectype) + &offset_vectype, + elsvals) && !vect_gather_scatter_fn_p (loop_vinfo, DR_IS_READ (dr), masked_p, vectype, memory_type, unsigned_char_type_node, new_scale, &ifn, - &offset_vectype)) + &offset_vectype, + elsvals)) break; scale = new_scale; off = op0; @@ -4507,7 +4535,7 @@ vect_check_gather_scatter (stmt_vec_info stmt_info, loop_vec_info loop_vinfo, && vect_gather_scatter_fn_p (loop_vinfo, DR_IS_READ (dr), masked_p, vectype, memory_type, TREE_TYPE (off), scale, &ifn, - &offset_vectype)) + &offset_vectype, elsvals)) break; if (TYPE_PRECISION (TREE_TYPE (op0)) @@ -4561,7 +4589,7 @@ vect_check_gather_scatter (stmt_vec_info stmt_info, loop_vec_info loop_vinfo, { if (!vect_gather_scatter_fn_p (loop_vinfo, DR_IS_READ (dr), masked_p, vectype, memory_type, offtype, scale, - &ifn, &offset_vectype)) + &ifn, &offset_vectype, elsvals)) ifn = IFN_LAST; decl = NULL_TREE; } @@ -6398,27 +6426,29 @@ vect_grouped_load_supported (tree vectype, bool single_element_p, } /* Return FN if vec_{masked_,mask_len_}load_lanes is available for COUNT vectors - of type VECTYPE. MASKED_P says whether the masked form is needed. */ + of type VECTYPE. MASKED_P says whether the masked form is needed. 
+ If it is available and ELSVALS is nonzero add the possible else values + to the vector it points to. */ internal_fn vect_load_lanes_supported (tree vectype, unsigned HOST_WIDE_INT count, - bool masked_p) + bool masked_p, vec *elsvals) { if (vect_lanes_optab_supported_p ("vec_mask_len_load_lanes", vec_mask_len_load_lanes_optab, vectype, - count)) + count, elsvals)) return IFN_MASK_LEN_LOAD_LANES; else if (masked_p) { if (vect_lanes_optab_supported_p ("vec_mask_load_lanes", vec_mask_load_lanes_optab, vectype, - count)) + count, elsvals)) return IFN_MASK_LOAD_LANES; } else { if (vect_lanes_optab_supported_p ("vec_load_lanes", vec_load_lanes_optab, - vectype, count)) + vectype, count, elsvals)) return IFN_LOAD_LANES; } return IFN_LAST; diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc index a708234304f..eb0e5808f7f 100644 --- a/gcc/tree-vect-patterns.cc +++ b/gcc/tree-vect-patterns.cc @@ -6021,12 +6021,20 @@ vect_recog_gather_scatter_pattern (vec_info *vinfo, /* Build the new pattern statement. 
*/ tree scale = size_int (gs_info.scale); gcall *pattern_stmt; + if (DR_IS_READ (dr)) { tree zero = build_zero_cst (gs_info.element_type); if (mask != NULL) - pattern_stmt = gimple_build_call_internal (gs_info.ifn, 5, base, - offset, scale, zero, mask); + { + int elsval = MASK_LOAD_ELSE_ZERO; + + tree vec_els + = vect_get_mask_load_else (elsval, TREE_TYPE (gs_vectype)); + pattern_stmt = gimple_build_call_internal (gs_info.ifn, 6, base, + offset, scale, zero, mask, + vec_els); + } else pattern_stmt = gimple_build_call_internal (gs_info.ifn, 4, base, offset, scale, zero); diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc index 97c362d24f8..2986cc3fc4c 100644 --- a/gcc/tree-vect-slp.cc +++ b/gcc/tree-vect-slp.cc @@ -511,13 +511,13 @@ static const int cond_expr_maps[3][5] = { static const int no_arg_map[] = { 0 }; static const int arg0_map[] = { 1, 0 }; static const int arg1_map[] = { 1, 1 }; -static const int arg2_map[] = { 1, 2 }; -static const int arg1_arg4_map[] = { 2, 1, 4 }; +static const int arg2_arg3_map[] = { 2, 2, 3 }; +static const int arg1_arg4_arg5_map[] = { 3, 1, 4, 5 }; static const int arg3_arg2_map[] = { 2, 3, 2 }; static const int op1_op0_map[] = { 2, 1, 0 }; static const int off_map[] = { 1, -3 }; static const int off_op0_map[] = { 2, -3, 0 }; -static const int off_arg2_map[] = { 2, -3, 2 }; +static const int off_arg2_arg3_map[] = { 3, -3, 2, 3 }; static const int off_arg3_arg2_map[] = { 3, -3, 3, 2 }; static const int mask_call_maps[6][7] = { { 1, 1, }, @@ -564,14 +564,14 @@ vect_get_operand_map (const gimple *stmt, bool gather_scatter_p = false, switch (gimple_call_internal_fn (call)) { case IFN_MASK_LOAD: - return gather_scatter_p ? off_arg2_map : arg2_map; + return gather_scatter_p ? off_arg2_arg3_map : arg2_arg3_map; case IFN_GATHER_LOAD: return arg1_map; case IFN_MASK_GATHER_LOAD: case IFN_MASK_LEN_GATHER_LOAD: - return arg1_arg4_map; + return arg1_arg4_arg5_map; case IFN_MASK_STORE: return gather_scatter_p ? 
off_arg3_arg2_map : arg3_arg2_map; @@ -2675,7 +2675,8 @@ out: tree op0; tree uniform_val = op0 = oprnd_info->ops[0]; for (j = 1; j < oprnd_info->ops.length (); ++j) - if (!operand_equal_p (uniform_val, oprnd_info->ops[j])) + if (!oprnd_info->ops[j] + || !operand_equal_p (uniform_val, oprnd_info->ops[j])) { uniform_val = NULL_TREE; break; @@ -7928,6 +7929,18 @@ vect_slp_analyze_node_operations (vec_info *vinfo, slp_tree node, tree vector_type = SLP_TREE_VECTYPE (child); if (!vector_type) { + /* Masked loads can have an undefined (default SSA definition) + else operand. We do not need to cost it. */ + vec ops = SLP_TREE_SCALAR_OPS (child); + if ((STMT_VINFO_TYPE (SLP_TREE_REPRESENTATIVE (node)) + == load_vec_info_type) + && ((ops.length () + && TREE_CODE (ops[0]) == SSA_NAME + && SSA_NAME_IS_DEFAULT_DEF (ops[0]) + && VAR_P (SSA_NAME_VAR (ops[0]))) + || SLP_TREE_DEF_TYPE (child) == vect_constant_def)) + continue; + /* For shifts with a scalar argument we don't need to cost or code-generate anything. ??? Represent this more explicitely. */ diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index 9a2c2ea753e..02b16ebf533 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -58,6 +58,7 @@ along with GCC; see the file COPYING3. If not see #include "regs.h" #include "attribs.h" #include "optabs-libfuncs.h" +#include "tree-dfa.h" /* For lang_hooks.types.type_for_mode. */ #include "langhooks.h" @@ -157,28 +158,45 @@ create_vector_array (tree elem_type, unsigned HOST_WIDE_INT nelems) /* ARRAY is an array of vectors created by create_vector_array. Return an SSA_NAME for the vector in index N. The reference is part of the vectorization of STMT_INFO and the vector is associated - with scalar destination SCALAR_DEST. */ + with scalar destination SCALAR_DEST. + If we need to ensure that inactive elements are set to zero, + NEED_ZEROING is true, MASK contains the loop mask to be used. 
*/ static tree read_vector_array (vec_info *vinfo, stmt_vec_info stmt_info, gimple_stmt_iterator *gsi, - tree scalar_dest, tree array, unsigned HOST_WIDE_INT n) + tree scalar_dest, tree array, unsigned HOST_WIDE_INT n, + bool need_zeroing, tree mask) { - tree vect_type, vect, vect_name, array_ref; + tree vect_type, vect, vect_name, tmp, tmp_name, array_ref; gimple *new_stmt; gcc_assert (TREE_CODE (TREE_TYPE (array)) == ARRAY_TYPE); vect_type = TREE_TYPE (TREE_TYPE (array)); + tmp = vect_create_destination_var (scalar_dest, vect_type); vect = vect_create_destination_var (scalar_dest, vect_type); array_ref = build4 (ARRAY_REF, vect_type, array, build_int_cst (size_type_node, n), NULL_TREE, NULL_TREE); - new_stmt = gimple_build_assign (vect, array_ref); - vect_name = make_ssa_name (vect, new_stmt); - gimple_assign_set_lhs (new_stmt, vect_name); + new_stmt = gimple_build_assign (tmp, array_ref); + tmp_name = make_ssa_name (vect, new_stmt); + gimple_assign_set_lhs (new_stmt, tmp_name); vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); + if (need_zeroing) + { + tree vec_els = vect_get_mask_load_else (MASK_LOAD_ELSE_ZERO, + vect_type); + vect_name = make_ssa_name (vect, new_stmt); + new_stmt + = gimple_build_assign (vect_name, VEC_COND_EXPR, + mask, tmp_name, vec_els); + vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); + } + else + vect_name = tmp_name; + return vect_name; } @@ -469,6 +487,10 @@ exist_non_indexing_operands_for_use_p (tree use, stmt_vec_info stmt_info) if (mask_index >= 0 && use == gimple_call_arg (call, mask_index)) return true; + int els_index = internal_fn_else_index (ifn); + if (els_index >= 0 + && use == gimple_call_arg (call, els_index)) + return true; int stored_value_index = internal_fn_stored_value_index (ifn); if (stored_value_index >= 0 && use == gimple_call_arg (call, stored_value_index)) @@ -1280,7 +1302,17 @@ vect_get_vec_defs_for_operand (vec_info *vinfo, stmt_vec_info stmt_vinfo, vector_type = 
get_vectype_for_scalar_type (loop_vinfo, TREE_TYPE (op)); gcc_assert (vector_type); - tree vop = vect_init_vector (vinfo, stmt_vinfo, op, vector_type, NULL); + /* A masked load can have a default SSA definition as else operand. + We should "vectorize" this instead of creating a duplicate from the + scalar default. */ + tree vop; + if (TREE_CODE (op) == SSA_NAME + && SSA_NAME_IS_DEFAULT_DEF (op) + && VAR_P (SSA_NAME_VAR (op))) + vop = get_or_create_ssa_default_def (cfun, + create_tmp_var (vector_type)); + else + vop = vect_init_vector (vinfo, stmt_vinfo, op, vector_type, NULL); while (ncopies--) vec_oprnds->quick_push (vop); } @@ -1492,7 +1524,10 @@ static tree permute_vec_elements (vec_info *, tree, tree, tree, stmt_vec_info, Clear LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P if a loop using partial vectors is not supported, otherwise record the required rgroup control - types. */ + types. + + If partial vectors can be used and ELSVALS is nonzero the supported + else values will be added to the vector ELSVALS points to. */ static void check_load_store_for_partial_vectors (loop_vec_info loop_vinfo, tree vectype, @@ -1502,7 +1537,8 @@ check_load_store_for_partial_vectors (loop_vec_info loop_vinfo, tree vectype, vect_memory_access_type memory_access_type, gather_scatter_info *gs_info, - tree scalar_mask) + tree scalar_mask, + vec *elsvals = nullptr) { /* Invariant loads need no special support. */ if (memory_access_type == VMAT_INVARIANT) @@ -1518,7 +1554,8 @@ check_load_store_for_partial_vectors (loop_vec_info loop_vinfo, tree vectype, if (slp_node) nvectors /= group_size; internal_fn ifn - = (is_load ? vect_load_lanes_supported (vectype, group_size, true) + = (is_load ? 
vect_load_lanes_supported (vectype, group_size, true, + elsvals) : vect_store_lanes_supported (vectype, group_size, true)); if (ifn == IFN_MASK_LEN_LOAD_LANES || ifn == IFN_MASK_LEN_STORE_LANES) vect_record_loop_len (loop_vinfo, lens, nvectors, vectype, 1); @@ -1548,12 +1585,14 @@ check_load_store_for_partial_vectors (loop_vec_info loop_vinfo, tree vectype, if (internal_gather_scatter_fn_supported_p (len_ifn, vectype, gs_info->memory_type, gs_info->offset_vectype, - gs_info->scale)) + gs_info->scale, + elsvals)) vect_record_loop_len (loop_vinfo, lens, nvectors, vectype, 1); else if (internal_gather_scatter_fn_supported_p (ifn, vectype, gs_info->memory_type, gs_info->offset_vectype, - gs_info->scale)) + gs_info->scale, + elsvals)) vect_record_loop_mask (loop_vinfo, masks, nvectors, vectype, scalar_mask); else @@ -1607,7 +1646,8 @@ check_load_store_for_partial_vectors (loop_vec_info loop_vinfo, tree vectype, machine_mode mask_mode; machine_mode vmode; bool using_partial_vectors_p = false; - if (get_len_load_store_mode (vecmode, is_load).exists (&vmode)) + if (get_len_load_store_mode + (vecmode, is_load, nullptr, elsvals).exists (&vmode)) { nvectors = group_memory_nvectors (group_size * vf, nunits); unsigned factor = (vecmode == vmode) ? 1 : GET_MODE_UNIT_SIZE (vecmode); @@ -1615,7 +1655,8 @@ check_load_store_for_partial_vectors (loop_vec_info loop_vinfo, tree vectype, using_partial_vectors_p = true; } else if (targetm.vectorize.get_mask_mode (vecmode).exists (&mask_mode) - && can_vec_mask_load_store_p (vecmode, mask_mode, is_load)) + && can_vec_mask_load_store_p (vecmode, mask_mode, is_load, NULL, + elsvals)) { nvectors = group_memory_nvectors (group_size * vf, nunits); vect_record_loop_mask (loop_vinfo, masks, nvectors, vectype, scalar_mask); @@ -1672,12 +1713,16 @@ prepare_vec_mask (loop_vec_info loop_vinfo, tree mask_type, tree loop_mask, without loss of precision, where X is STMT_INFO's DR_STEP. 
Return true if this is possible, describing the gather load or scatter - store in GS_INFO. MASKED_P is true if the load or store is conditional. */ + store in GS_INFO. MASKED_P is true if the load or store is conditional. + + If we can use gather/scatter and ELSVALS is nonzero the supported + else values will be stored in the vector ELSVALS points to. */ static bool vect_truncate_gather_scatter_offset (stmt_vec_info stmt_info, loop_vec_info loop_vinfo, bool masked_p, - gather_scatter_info *gs_info) + gather_scatter_info *gs_info, + vec *elsvals) { dr_vec_info *dr_info = STMT_VINFO_DR_INFO (stmt_info); data_reference *dr = dr_info->dr; @@ -1734,7 +1779,8 @@ vect_truncate_gather_scatter_offset (stmt_vec_info stmt_info, tree memory_type = TREE_TYPE (DR_REF (dr)); if (!vect_gather_scatter_fn_p (loop_vinfo, DR_IS_READ (dr), masked_p, vectype, memory_type, offset_type, scale, - &gs_info->ifn, &gs_info->offset_vectype) + &gs_info->ifn, &gs_info->offset_vectype, + elsvals) || gs_info->ifn == IFN_LAST) continue; @@ -1762,17 +1808,21 @@ vect_truncate_gather_scatter_offset (stmt_vec_info stmt_info, vectorize STMT_INFO, which is a grouped or strided load or store. MASKED_P is true if load or store is conditional. When returning true, fill in GS_INFO with the information required to perform the - operation. */ + operation. + + If we can use gather/scatter and ELSVALS is nonzero the supported + else values will be stored in the vector ELSVALS points to. 
*/ static bool vect_use_strided_gather_scatters_p (stmt_vec_info stmt_info, loop_vec_info loop_vinfo, bool masked_p, - gather_scatter_info *gs_info) + gather_scatter_info *gs_info, + vec *elsvals) { - if (!vect_check_gather_scatter (stmt_info, loop_vinfo, gs_info) + if (!vect_check_gather_scatter (stmt_info, loop_vinfo, gs_info, elsvals) || gs_info->ifn == IFN_LAST) return vect_truncate_gather_scatter_offset (stmt_info, loop_vinfo, - masked_p, gs_info); + masked_p, gs_info, elsvals); tree old_offset_type = TREE_TYPE (gs_info->offset); tree new_offset_type = TREE_TYPE (gs_info->offset_vectype); @@ -1974,7 +2024,11 @@ vector_vector_composition_type (tree vtype, poly_uint64 nelts, tree *ptype) For stores, the statements in the group are all consecutive and there is no gap at the end. For loads, the statements in the group might not be consecutive; there can be gaps between statements - as well as at the end. */ + as well as at the end. + + If we can use gather/scatter and ELSVALS is nonzero the supported + else values will be stored in the vector ELSVALS points to. +*/ static bool get_group_load_store_type (vec_info *vinfo, stmt_vec_info stmt_info, @@ -1985,7 +2039,8 @@ get_group_load_store_type (vec_info *vinfo, stmt_vec_info stmt_info, dr_alignment_support *alignment_support_scheme, int *misalignment, gather_scatter_info *gs_info, - internal_fn *lanes_ifn) + internal_fn *lanes_ifn, + vec *elsvals) { loop_vec_info loop_vinfo = dyn_cast (vinfo); class loop *loop = loop_vinfo ? LOOP_VINFO_LOOP (loop_vinfo) : NULL; @@ -2074,7 +2129,8 @@ get_group_load_store_type (vec_info *vinfo, stmt_vec_info stmt_info, else if (slp_node->ldst_lanes && (*lanes_ifn = (vls_type == VLS_LOAD - ? vect_load_lanes_supported (vectype, group_size, masked_p) + ? 
vect_load_lanes_supported (vectype, group_size, + masked_p, elsvals) : vect_store_lanes_supported (vectype, group_size, masked_p))) != IFN_LAST) *memory_access_type = VMAT_LOAD_STORE_LANES; @@ -2244,7 +2300,8 @@ get_group_load_store_type (vec_info *vinfo, stmt_vec_info stmt_info, /* Otherwise try using LOAD/STORE_LANES. */ *lanes_ifn = vls_type == VLS_LOAD - ? vect_load_lanes_supported (vectype, group_size, masked_p) + ? vect_load_lanes_supported (vectype, group_size, masked_p, + elsvals) : vect_store_lanes_supported (vectype, group_size, masked_p); if (*lanes_ifn != IFN_LAST) @@ -2278,7 +2335,7 @@ get_group_load_store_type (vec_info *vinfo, stmt_vec_info stmt_info, && single_element_p && loop_vinfo && vect_use_strided_gather_scatters_p (stmt_info, loop_vinfo, - masked_p, gs_info)) + masked_p, gs_info, elsvals)) *memory_access_type = VMAT_GATHER_SCATTER; if (*memory_access_type == VMAT_GATHER_SCATTER @@ -2340,7 +2397,10 @@ get_group_load_store_type (vec_info *vinfo, stmt_vec_info stmt_info, SLP says whether we're performing SLP rather than loop vectorization. MASKED_P is true if the statement is conditional on a vectorized mask. VECTYPE is the vector type that the vectorized statements will use. - NCOPIES is the number of vector statements that will be needed. */ + NCOPIES is the number of vector statements that will be needed. + + If ELSVALS is nonzero the supported else values will be stored in the + vector ELSVALS points to. 
*/ static bool get_load_store_type (vec_info *vinfo, stmt_vec_info stmt_info, @@ -2352,7 +2412,8 @@ get_load_store_type (vec_info *vinfo, stmt_vec_info stmt_info, dr_alignment_support *alignment_support_scheme, int *misalignment, gather_scatter_info *gs_info, - internal_fn *lanes_ifn) + internal_fn *lanes_ifn, + vec *elsvals = nullptr) { loop_vec_info loop_vinfo = dyn_cast (vinfo); poly_uint64 nunits = TYPE_VECTOR_SUBPARTS (vectype); @@ -2361,7 +2422,8 @@ get_load_store_type (vec_info *vinfo, stmt_vec_info stmt_info, if (STMT_VINFO_GATHER_SCATTER_P (stmt_info)) { *memory_access_type = VMAT_GATHER_SCATTER; - if (!vect_check_gather_scatter (stmt_info, loop_vinfo, gs_info)) + if (!vect_check_gather_scatter (stmt_info, loop_vinfo, gs_info, + elsvals)) gcc_unreachable (); /* When using internal functions, we rely on pattern recognition to convert the type of the offset to the type that the target @@ -2415,7 +2477,8 @@ get_load_store_type (vec_info *vinfo, stmt_vec_info stmt_info, masked_p, vls_type, memory_access_type, poffset, alignment_support_scheme, - misalignment, gs_info, lanes_ifn)) + misalignment, gs_info, lanes_ifn, + elsvals)) return false; } else if (STMT_VINFO_STRIDED_P (stmt_info)) @@ -2423,7 +2486,7 @@ get_load_store_type (vec_info *vinfo, stmt_vec_info stmt_info, gcc_assert (!slp_node); if (loop_vinfo && vect_use_strided_gather_scatters_p (stmt_info, loop_vinfo, - masked_p, gs_info)) + masked_p, gs_info, elsvals)) *memory_access_type = VMAT_GATHER_SCATTER; else *memory_access_type = VMAT_ELEMENTWISE; @@ -2692,6 +2755,30 @@ vect_build_zero_merge_argument (vec_info *vinfo, return vect_init_vector (vinfo, stmt_info, merge, vectype, NULL); } +/* Return the corresponding else value for an else value constant + ELSVAL with type TYPE. */ + +tree +vect_get_mask_load_else (int elsval, tree type) +{ + tree els; + if (elsval == MASK_LOAD_ELSE_UNDEFINED) + { + tree tmp = create_tmp_var (type); + /* No need to warn about anything. 
*/ + TREE_NO_WARNING (tmp) = 1; + els = get_or_create_ssa_default_def (cfun, tmp); + } + else if (elsval == MASK_LOAD_ELSE_M1) + els = build_minus_one_cst (type); + else if (elsval == MASK_LOAD_ELSE_ZERO) + els = build_zero_cst (type); + else + gcc_unreachable (); + + return els; +} + /* Build a gather load call while vectorizing STMT_INFO. Insert new instructions before GSI and add them to VEC_STMT. GS_INFO describes the gather load operation. If the load is conditional, MASK is the @@ -2773,8 +2860,14 @@ vect_build_one_gather_load_call (vec_info *vinfo, stmt_vec_info stmt_info, } tree scale = build_int_cst (scaletype, gs_info->scale); - gimple *new_stmt = gimple_build_call (gs_info->decl, 5, src_op, ptr, op, - mask_op, scale); + gimple *new_stmt; + + if (!mask) + new_stmt = gimple_build_call (gs_info->decl, 5, src_op, ptr, op, + mask_op, scale); + else + new_stmt = gimple_build_call (gs_info->decl, 5, src_op, ptr, op, + mask_op, scale); if (!useless_type_conversion_p (vectype, rettype)) { @@ -9989,6 +10082,7 @@ vectorizable_load (vec_info *vinfo, gather_scatter_info gs_info; tree ref_type; enum vect_def_type mask_dt = vect_unknown_def_type; + enum vect_def_type els_dt = vect_unknown_def_type; if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo) return false; @@ -10001,8 +10095,12 @@ vectorizable_load (vec_info *vinfo, return false; tree mask = NULL_TREE, mask_vectype = NULL_TREE; + tree els = NULL_TREE; tree els_vectype = NULL_TREE; + int mask_index = -1; + int els_index = -1; slp_tree slp_op = NULL; + slp_tree els_op = NULL; if (gassign *assign = dyn_cast (stmt_info->stmt)) { scalar_dest = gimple_assign_lhs (assign); @@ -10042,6 +10140,15 @@ vectorizable_load (vec_info *vinfo, && !vect_check_scalar_mask (vinfo, stmt_info, slp_node, mask_index, &mask, &slp_op, &mask_dt, &mask_vectype)) return false; + + els_index = internal_fn_else_index (ifn); + if (els_index >= 0 && slp_node) + els_index = vect_slp_child_index_for_operand + (call, els_index, 
STMT_VINFO_GATHER_SCATTER_P (stmt_info)); + if (els_index >= 0 + && !vect_is_simple_use (vinfo, stmt_info, slp_node, els_index, + &els, &els_op, &els_dt, &els_vectype)) + return false; } tree vectype = STMT_VINFO_VECTYPE (stmt_info); @@ -10144,12 +10251,21 @@ vectorizable_load (vec_info *vinfo, int misalignment; poly_int64 poffset; internal_fn lanes_ifn; + auto_vec elsvals; if (!get_load_store_type (vinfo, stmt_info, vectype, slp_node, mask, VLS_LOAD, ncopies, &memory_access_type, &poffset, &alignment_support_scheme, &misalignment, &gs_info, - &lanes_ifn)) + &lanes_ifn, &elsvals)) return false; + + /* We might need to explicitly zero inactive elements if there are + padding bits in the type that might leak otherwise. + Refer to PR115336. */ + tree scalar_type = TREE_TYPE (scalar_dest); + bool type_mode_padding_p + = TYPE_PRECISION (scalar_type) < GET_MODE_PRECISION (GET_MODE_INNER (mode)); + /* ??? The following checks should really be part of get_group_load_store_type. */ if (slp @@ -10213,7 +10329,8 @@ vectorizable_load (vec_info *vinfo, machine_mode vec_mode = TYPE_MODE (vectype); if (!VECTOR_MODE_P (vec_mode) || !can_vec_mask_load_store_p (vec_mode, - TYPE_MODE (mask_vectype), true)) + TYPE_MODE (mask_vectype), + true, NULL, &elsvals)) return false; } else if (memory_access_type != VMAT_LOAD_STORE_LANES @@ -10282,6 +10399,16 @@ vectorizable_load (vec_info *vinfo, STMT_VINFO_TYPE (stmt_info) = load_vec_info_type; } + else + { + /* Here just get the else values. 
*/ + if (loop_vinfo + && LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo)) + check_load_store_for_partial_vectors (loop_vinfo, vectype, slp_node, + VLS_LOAD, group_size, + memory_access_type, &gs_info, + mask, &elsvals); + } if (!slp) gcc_assert (memory_access_type @@ -10952,6 +11079,8 @@ vectorizable_load (vec_info *vinfo, } tree vec_mask = NULL_TREE; + tree vec_els = NULL_TREE; + bool need_zeroing = false; if (memory_access_type == VMAT_LOAD_STORE_LANES) { gcc_assert (alignment_support_scheme == dr_aligned @@ -11042,6 +11171,16 @@ vectorizable_load (vec_info *vinfo, } } + if (final_mask) + { + vec_els = vect_get_mask_load_else + (elsvals.contains (MASK_LOAD_ELSE_ZERO) + ? MASK_LOAD_ELSE_ZERO : *elsvals.begin (), vectype); + if (type_mode_padding_p + && !elsvals.contains (MASK_LOAD_ELSE_ZERO)) + need_zeroing = true; + } + gcall *call; if (final_len && final_mask) { @@ -11050,9 +11189,10 @@ vectorizable_load (vec_info *vinfo, VEC_MASK, LEN, BIAS). */ unsigned int align = TYPE_ALIGN (TREE_TYPE (vectype)); tree alias_ptr = build_int_cst (ref_type, align); - call = gimple_build_call_internal (IFN_MASK_LEN_LOAD_LANES, 5, + call = gimple_build_call_internal (IFN_MASK_LEN_LOAD_LANES, 6, dataref_ptr, alias_ptr, - final_mask, final_len, bias); + final_mask, vec_els, + final_len, bias); } else if (final_mask) { @@ -11061,9 +11201,9 @@ vectorizable_load (vec_info *vinfo, VEC_MASK). 
*/ unsigned int align = TYPE_ALIGN (TREE_TYPE (vectype)); tree alias_ptr = build_int_cst (ref_type, align); - call = gimple_build_call_internal (IFN_MASK_LOAD_LANES, 3, + call = gimple_build_call_internal (IFN_MASK_LOAD_LANES, 4, dataref_ptr, alias_ptr, - final_mask); + final_mask, vec_els); } else { @@ -11082,7 +11222,8 @@ vectorizable_load (vec_info *vinfo, for (unsigned i = 0; i < group_size; i++) { new_temp = read_vector_array (vinfo, stmt_info, gsi, scalar_dest, - vec_array, i); + vec_array, i, need_zeroing, + final_mask); if (slp) slp_node->push_vec_def (new_temp); else @@ -11212,25 +11353,37 @@ vectorizable_load (vec_info *vinfo, } } + if (final_mask) + { + vec_els = vect_get_mask_load_else + (elsvals.contains (MASK_LOAD_ELSE_ZERO) + ? MASK_LOAD_ELSE_ZERO : *elsvals.begin (), vectype); + if (type_mode_padding_p + && !elsvals.contains (MASK_LOAD_ELSE_ZERO)) + need_zeroing = true; + } + gcall *call; if (final_len && final_mask) { if (VECTOR_TYPE_P (TREE_TYPE (vec_offset))) call = gimple_build_call_internal ( - IFN_MASK_LEN_GATHER_LOAD, 7, dataref_ptr, vec_offset, - scale, zero, final_mask, final_len, bias); + IFN_MASK_LEN_GATHER_LOAD, 8, dataref_ptr, vec_offset, + scale, zero, final_mask, vec_els, final_len, bias); else /* Non-vector offset indicates that prefer to take MASK_LEN_STRIDED_LOAD instead of the MASK_LEN_GATHER_LOAD with direct stride arg. 
*/ call = gimple_build_call_internal ( - IFN_MASK_LEN_STRIDED_LOAD, 6, dataref_ptr, vec_offset, - zero, final_mask, final_len, bias); + IFN_MASK_LEN_STRIDED_LOAD, 7, dataref_ptr, vec_offset, + zero, final_mask, vec_els, final_len, bias); } else if (final_mask) - call = gimple_build_call_internal (IFN_MASK_GATHER_LOAD, 5, - dataref_ptr, vec_offset, - scale, zero, final_mask); + call = gimple_build_call_internal (IFN_MASK_GATHER_LOAD, + 6, dataref_ptr, + vec_offset, scale, + zero, final_mask, + vec_els); else call = gimple_build_call_internal (IFN_GATHER_LOAD, 4, dataref_ptr, vec_offset, @@ -11441,10 +11594,28 @@ vectorizable_load (vec_info *vinfo, vect_copy_ref_info (data_ref, DR_REF (first_dr_info->dr)); new_stmt = gimple_build_assign (vec_dest, data_ref); } - new_temp = make_ssa_name (vec_dest, new_stmt); + new_temp = need_zeroing + ? make_ssa_name (vectype) + : make_ssa_name (vec_dest, new_stmt); gimple_set_lhs (new_stmt, new_temp); vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); + /* If we need to explicitly zero inactive elements emit a + VEC_COND_EXPR that does so. */ + if (need_zeroing) + { + vec_els = vect_get_mask_load_else (MASK_LOAD_ELSE_ZERO, + vectype); + + tree new_temp2 = make_ssa_name (vec_dest, new_stmt); + new_stmt + = gimple_build_assign (new_temp2, VEC_COND_EXPR, + final_mask, new_temp, vec_els); + vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, + gsi); + new_temp = new_temp2; + } + /* Store vector loads in the corresponding SLP_NODE. 
*/ if (slp) slp_node->push_vec_def (new_stmt); @@ -11539,11 +11710,13 @@ vectorizable_load (vec_info *vinfo, dr_chain.create (vec_num); gimple *new_stmt = NULL; + bool need_zeroing = false; for (i = 0; i < vec_num; i++) { tree final_mask = NULL_TREE; tree final_len = NULL_TREE; tree bias = NULL_TREE; + if (!costing_p) { if (mask) @@ -11596,7 +11769,8 @@ vectorizable_load (vec_info *vinfo, if (loop_lens) { opt_machine_mode new_ovmode - = get_len_load_store_mode (vmode, true, &partial_ifn); + = get_len_load_store_mode (vmode, true, &partial_ifn, + &elsvals); new_vmode = new_ovmode.require (); unsigned factor = (new_ovmode == vmode) ? 1 : GET_MODE_UNIT_SIZE (vmode); @@ -11608,7 +11782,7 @@ vectorizable_load (vec_info *vinfo, { if (!can_vec_mask_load_store_p ( vmode, TYPE_MODE (TREE_TYPE (final_mask)), true, - &partial_ifn)) + &partial_ifn, &elsvals)) gcc_unreachable (); } @@ -11636,15 +11810,25 @@ vectorizable_load (vec_info *vinfo, bias = build_int_cst (intQI_type_node, biasval); } + tree vec_els; + if (final_len) { tree ptr = build_int_cst (ref_type, align * BITS_PER_UNIT); gcall *call; if (partial_ifn == IFN_MASK_LEN_LOAD) - call = gimple_build_call_internal (IFN_MASK_LEN_LOAD, 5, - dataref_ptr, ptr, - final_mask, final_len, - bias); + { + vec_els = vect_get_mask_load_else + (elsvals.contains (MASK_LOAD_ELSE_ZERO) + ? MASK_LOAD_ELSE_ZERO : *elsvals.begin (), vectype); + if (type_mode_padding_p + && !elsvals.contains (MASK_LOAD_ELSE_ZERO)) + need_zeroing = true; + call = gimple_build_call_internal (IFN_MASK_LEN_LOAD, + 6, dataref_ptr, ptr, + final_mask, vec_els, + final_len, bias); + } else call = gimple_build_call_internal (IFN_LEN_LOAD, 4, dataref_ptr, ptr, @@ -11671,9 +11855,16 @@ vectorizable_load (vec_info *vinfo, else if (final_mask) { tree ptr = build_int_cst (ref_type, align * BITS_PER_UNIT); - gcall *call = gimple_build_call_internal (IFN_MASK_LOAD, 3, + vec_els = vect_get_mask_load_else + (elsvals.contains (MASK_LOAD_ELSE_ZERO) + ? 
MASK_LOAD_ELSE_ZERO : *elsvals.begin (), vectype); + if (type_mode_padding_p + && !elsvals.contains (MASK_LOAD_ELSE_ZERO)) + need_zeroing = true; + gcall *call = gimple_build_call_internal (IFN_MASK_LOAD, 4, dataref_ptr, ptr, - final_mask); + final_mask, + vec_els); gimple_call_set_nothrow (call, true); new_stmt = call; data_ref = NULL_TREE; @@ -11954,9 +12145,28 @@ vectorizable_load (vec_info *vinfo, vect_copy_ref_info (data_ref, DR_REF (first_dr_info->dr)); new_stmt = gimple_build_assign (vec_dest, data_ref); } - new_temp = make_ssa_name (vec_dest, new_stmt); + + new_temp = need_zeroing + ? make_ssa_name (vectype) + : make_ssa_name (vec_dest, new_stmt); gimple_set_lhs (new_stmt, new_temp); vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); + + /* If we need to explicitly zero inactive elements emit a + VEC_COND_EXPR that does so. */ + if (need_zeroing) + { + vec_els = vect_get_mask_load_else (MASK_LOAD_ELSE_ZERO, + vectype); + + tree new_temp2 = make_ssa_name (vec_dest, new_stmt); + new_stmt + = gimple_build_assign (new_temp2, VEC_COND_EXPR, + final_mask, new_temp, vec_els); + vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, + gsi); + new_temp = new_temp2; + } } /* 3. Handle explicit realignment if necessary/supported. 
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h index 24227a69d4a..0bd759a92ea 100644 --- a/gcc/tree-vectorizer.h +++ b/gcc/tree-vectorizer.h @@ -2418,9 +2418,11 @@ extern bool vect_slp_analyze_instance_alignment (vec_info *, slp_instance); extern opt_result vect_analyze_data_ref_accesses (vec_info *, vec *); extern opt_result vect_prune_runtime_alias_test_list (loop_vec_info); extern bool vect_gather_scatter_fn_p (vec_info *, bool, bool, tree, tree, - tree, int, internal_fn *, tree *); + tree, int, internal_fn *, tree *, + vec * = nullptr); extern bool vect_check_gather_scatter (stmt_vec_info, loop_vec_info, - gather_scatter_info *); + gather_scatter_info *, + vec * = nullptr); extern opt_result vect_find_stmt_data_reference (loop_p, gimple *, vec *, vec *, int); @@ -2438,7 +2440,8 @@ extern tree vect_create_destination_var (tree, tree); extern bool vect_grouped_store_supported (tree, unsigned HOST_WIDE_INT); extern internal_fn vect_store_lanes_supported (tree, unsigned HOST_WIDE_INT, bool); extern bool vect_grouped_load_supported (tree, bool, unsigned HOST_WIDE_INT); -extern internal_fn vect_load_lanes_supported (tree, unsigned HOST_WIDE_INT, bool); +extern internal_fn vect_load_lanes_supported (tree, unsigned HOST_WIDE_INT, + bool, vec * = nullptr); extern void vect_permute_store_chain (vec_info *, vec &, unsigned int, stmt_vec_info, gimple_stmt_iterator *, vec *); @@ -2584,6 +2587,7 @@ extern int vect_slp_child_index_for_operand (const gimple *, int op, bool); extern tree prepare_vec_mask (loop_vec_info, tree, tree, tree, gimple_stmt_iterator *); +extern tree vect_get_mask_load_else (int, tree); /* In tree-vect-patterns.cc. 
*/ extern void

From patchwork Sat Nov 2 12:58:25 2024
X-Patchwork-Submitter: Robin Dapp
X-Patchwork-Id: 100112
From: Robin Dapp
To: gcc-patches@gcc.gnu.org
Cc: rdapp.gcc@gmail.com, rguenther@suse.de, richard.sandiford@arm.com,
 jeffreyalaw@gmail.com, ams@baylibre.com
Subject: [PATCH v3 5/8] aarch64: Add masked-load else operands.
Date: Sat, 2 Nov 2024 13:58:25 +0100
Message-ID: <20241102125828.29183-6-rdapp.gcc@gmail.com>
In-Reply-To: <20241102125828.29183-1-rdapp.gcc@gmail.com>
References: <20241102125828.29183-1-rdapp.gcc@gmail.com>

This adds zero else operands to masked loads and their intrinsics.
I needed to adjust more than initially thought because we rely on
combine for several instructions and a change in a "base" pattern
needs to propagate to all those.

gcc/ChangeLog:

	* config/aarch64/aarch64-sve-builtins-base.cc: Add else handling.
	* config/aarch64/aarch64-sve-builtins.cc
	(function_expander::use_contiguous_load_insn): Ditto.
	* config/aarch64/aarch64-sve-builtins.h: Add else operand to
	contiguous load.
	* config/aarch64/aarch64-sve.md (@aarch64_load _): Split and add
	else operand.
	(@aarch64_load_): Ditto.
	(*aarch64_load__mov): Ditto.
	* config/aarch64/aarch64-sve2.md: Ditto.
* config/aarch64/iterators.md: Remove unused iterators. * config/aarch64/predicates.md (aarch64_maskload_else_operand): Add zero else operand. --- .../aarch64/aarch64-sve-builtins-base.cc | 24 +++++---- gcc/config/aarch64/aarch64-sve-builtins.cc | 12 ++++- gcc/config/aarch64/aarch64-sve-builtins.h | 2 +- gcc/config/aarch64/aarch64-sve.md | 52 ++++++++++++++++--- gcc/config/aarch64/aarch64-sve2.md | 3 +- gcc/config/aarch64/iterators.md | 4 -- gcc/config/aarch64/predicates.md | 4 ++ 7 files changed, 77 insertions(+), 24 deletions(-) diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc b/gcc/config/aarch64/aarch64-sve-builtins-base.cc index fe16d93adcd..d840f590202 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc @@ -1523,11 +1523,12 @@ public: gimple_seq stmts = NULL; tree pred = f.convert_pred (stmts, vectype, 0); tree base = f.fold_contiguous_base (stmts, vectype); + tree els = build_zero_cst (vectype); gsi_insert_seq_before (f.gsi, stmts, GSI_SAME_STMT); tree cookie = f.load_store_cookie (TREE_TYPE (vectype)); - gcall *new_call = gimple_build_call_internal (IFN_MASK_LOAD, 3, - base, cookie, pred); + gcall *new_call = gimple_build_call_internal (IFN_MASK_LOAD, 4, + base, cookie, pred, els); gimple_call_set_lhs (new_call, f.lhs); return new_call; } @@ -1541,7 +1542,7 @@ public: e.vector_mode (0), e.gp_mode (0)); else icode = code_for_aarch64 (UNSPEC_LD1_COUNT, e.tuple_mode (0)); - return e.use_contiguous_load_insn (icode); + return e.use_contiguous_load_insn (icode, true); } }; @@ -1554,10 +1555,10 @@ public: rtx expand (function_expander &e) const override { - insn_code icode = code_for_aarch64_load (UNSPEC_LD1_SVE, extend_rtx_code (), + insn_code icode = code_for_aarch64_load (extend_rtx_code (), e.vector_mode (0), e.memory_vector_mode ()); - return e.use_contiguous_load_insn (icode); + return e.use_contiguous_load_insn (icode, true); } }; @@ -1576,6 +1577,8 @@ public: 
e.prepare_gather_address_operands (1); /* Put the predicate last, as required by mask_gather_load_optab. */ e.rotate_inputs_left (0, 5); + /* Add the else operand. */ + e.args.quick_push (CONST0_RTX (e.vector_mode (0))); machine_mode mem_mode = e.memory_vector_mode (); machine_mode int_mode = aarch64_sve_int_mode (mem_mode); insn_code icode = convert_optab_handler (mask_gather_load_optab, @@ -1599,6 +1602,8 @@ public: e.rotate_inputs_left (0, 5); /* Add a constant predicate for the extension rtx. */ e.args.quick_push (CONSTM1_RTX (VNx16BImode)); + /* Add the else operand. */ + e.args.quick_push (CONST0_RTX (e.vector_mode (1))); insn_code icode = code_for_aarch64_gather_load (extend_rtx_code (), e.vector_mode (0), e.memory_vector_mode ()); @@ -1741,6 +1746,7 @@ public: /* Get the predicate and base pointer. */ gimple_seq stmts = NULL; tree pred = f.convert_pred (stmts, vectype, 0); + tree els = build_zero_cst (vectype); tree base = f.fold_contiguous_base (stmts, vectype); gsi_insert_seq_before (f.gsi, stmts, GSI_SAME_STMT); @@ -1759,8 +1765,8 @@ public: /* Emit the load itself. */ tree cookie = f.load_store_cookie (TREE_TYPE (vectype)); - gcall *new_call = gimple_build_call_internal (IFN_MASK_LOAD_LANES, 3, - base, cookie, pred); + gcall *new_call = gimple_build_call_internal (IFN_MASK_LOAD_LANES, 4, + base, cookie, pred, els); gimple_call_set_lhs (new_call, lhs_array); gsi_insert_after (f.gsi, new_call, GSI_SAME_STMT); @@ -1773,7 +1779,7 @@ public: machine_mode tuple_mode = e.result_mode (); insn_code icode = convert_optab_handler (vec_mask_load_lanes_optab, tuple_mode, e.vector_mode (0)); - return e.use_contiguous_load_insn (icode); + return e.use_contiguous_load_insn (icode, true); } }; @@ -1844,7 +1850,7 @@ public: ? 
code_for_aarch64_ldnt1 (e.vector_mode (0)) : code_for_aarch64 (UNSPEC_LDNT1_COUNT, e.tuple_mode (0))); - return e.use_contiguous_load_insn (icode); + return e.use_contiguous_load_insn (icode, true); } }; diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc index ef14f8cd39d..0db9a7e9dbe 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc @@ -4227,9 +4227,12 @@ function_expander::use_vcond_mask_insn (insn_code icode, /* Implement the call using instruction ICODE, which loads memory operand 1 into register operand 0 under the control of predicate operand 2. Extending loads have a further predicate (operand 3) that nominally - controls the extension. */ + controls the extension. + HAS_ELSE is true if the pattern has an additional operand that specifies + the values of inactive lanes. This exists to match the general maskload + interface and is always zero for AArch64. */ rtx -function_expander::use_contiguous_load_insn (insn_code icode) +function_expander::use_contiguous_load_insn (insn_code icode, bool has_else) { machine_mode mem_mode = memory_vector_mode (); @@ -4238,6 +4241,11 @@ function_expander::use_contiguous_load_insn (insn_code icode) add_input_operand (icode, args[0]); if (GET_MODE_UNIT_BITSIZE (mem_mode) < type_suffix (0).element_bits) add_input_operand (icode, CONSTM1_RTX (VNx16BImode)); + + /* If we have an else operand, add it. 
*/ + if (has_else) + add_input_operand (icode, CONST0_RTX (mem_mode)); + return generate_insn (icode); } diff --git a/gcc/config/aarch64/aarch64-sve-builtins.h b/gcc/config/aarch64/aarch64-sve-builtins.h index 4cdc0541bdc..1aa9caf84ba 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.h +++ b/gcc/config/aarch64/aarch64-sve-builtins.h @@ -695,7 +695,7 @@ public: rtx use_pred_x_insn (insn_code); rtx use_cond_insn (insn_code, unsigned int = DEFAULT_MERGE_ARGNO); rtx use_vcond_mask_insn (insn_code, unsigned int = DEFAULT_MERGE_ARGNO); - rtx use_contiguous_load_insn (insn_code); + rtx use_contiguous_load_insn (insn_code, bool = false); rtx use_contiguous_prefetch_insn (insn_code); rtx use_contiguous_store_insn (insn_code); diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md index 06bd3e4bb2c..17cca97555c 100644 --- a/gcc/config/aarch64/aarch64-sve.md +++ b/gcc/config/aarch64/aarch64-sve.md @@ -1291,7 +1291,8 @@ (define_insn "maskload" [(set (match_operand:SVE_ALL 0 "register_operand" "=w") (unspec:SVE_ALL [(match_operand: 2 "register_operand" "Upl") - (match_operand:SVE_ALL 1 "memory_operand" "m")] + (match_operand:SVE_ALL 1 "memory_operand" "m") + (match_operand:SVE_ALL 3 "aarch64_maskload_else_operand")] UNSPEC_LD1_SVE))] "TARGET_SVE" "ld1\t%0., %2/z, %1" @@ -1302,11 +1303,13 @@ (define_expand "vec_load_lanes" [(set (match_operand:SVE_STRUCT 0 "register_operand") (unspec:SVE_STRUCT [(match_dup 2) - (match_operand:SVE_STRUCT 1 "memory_operand")] + (match_operand:SVE_STRUCT 1 "memory_operand") + (match_dup 3)] UNSPEC_LDN))] "TARGET_SVE" { operands[2] = aarch64_ptrue_reg (mode); + operands[3] = CONST0_RTX (mode); } ) @@ -1315,7 +1318,8 @@ (define_insn "vec_mask_load_lanes" [(set (match_operand:SVE_STRUCT 0 "register_operand" "=w") (unspec:SVE_STRUCT [(match_operand: 2 "register_operand" "Upl") - (match_operand:SVE_STRUCT 1 "memory_operand" "m")] + (match_operand:SVE_STRUCT 1 "memory_operand" "m") + (match_operand 3 
"aarch64_maskload_else_operand")] UNSPEC_LDN))] "TARGET_SVE" "ld\t%0, %2/z, %1" @@ -1334,15 +1338,16 @@ (define_insn "vec_mask_load_lanes" ;; ------------------------------------------------------------------------- ;; Predicated load and extend, with 8 elements per 128-bit block. -(define_insn_and_rewrite "@aarch64_load_" +(define_insn_and_rewrite "@aarch64_load_" [(set (match_operand:SVE_HSDI 0 "register_operand" "=w") (unspec:SVE_HSDI [(match_operand: 3 "general_operand" "UplDnm") (ANY_EXTEND:SVE_HSDI (unspec:SVE_PARTIAL_I [(match_operand: 2 "register_operand" "Upl") - (match_operand:SVE_PARTIAL_I 1 "memory_operand" "m")] - SVE_PRED_LOAD))] + (match_operand:SVE_PARTIAL_I 1 "memory_operand" "m") + (match_operand:SVE_PARTIAL_I 4 "aarch64_maskload_else_operand")] + UNSPEC_LD1_SVE))] UNSPEC_PRED_X))] "TARGET_SVE && (~ & ) == 0" "ld1\t%0., %2/z, %1" @@ -1352,6 +1357,26 @@ (define_insn_and_rewrite "@aarch64_load__mov" + [(set (match_operand:SVE_HSDI 0 "register_operand" "=w") + (unspec:SVE_HSDI + [(match_operand: 3 "general_operand" "UplDnm") + (ANY_EXTEND:SVE_HSDI + (unspec:SVE_PARTIAL_I + [(match_operand: 2 "register_operand" "Upl") + (match_operand:SVE_PARTIAL_I 1 "memory_operand" "m")] + UNSPEC_PRED_X))] + UNSPEC_PRED_X))] + "TARGET_SVE && (~ & ) == 0" + "ld1\t%0., %2/z, %1" + "&& !CONSTANT_P (operands[3])" + { + operands[3] = CONSTM1_RTX (mode); + } +) + ;; ------------------------------------------------------------------------- ;; ---- First-faulting contiguous loads ;; ------------------------------------------------------------------------- @@ -1433,7 +1458,8 @@ (define_insn "@aarch64_ldnt1" [(set (match_operand:SVE_FULL 0 "register_operand" "=w") (unspec:SVE_FULL [(match_operand: 2 "register_operand" "Upl") - (match_operand:SVE_FULL 1 "memory_operand" "m")] + (match_operand:SVE_FULL 1 "memory_operand" "m") + (match_operand:SVE_FULL 3 "aarch64_maskload_else_operand")] UNSPEC_LDNT1_SVE))] "TARGET_SVE" "ldnt1\t%0., %2/z, %1" @@ -1456,11 +1482,13 @@ 
(define_expand "gather_load" (match_operand: 2 "register_operand") (match_operand:DI 3 "const_int_operand") (match_operand:DI 4 "aarch64_gather_scale_operand_") + (match_dup 6) (mem:BLK (scratch))] UNSPEC_LD1_GATHER))] "TARGET_SVE && TARGET_NON_STREAMING" { operands[5] = aarch64_ptrue_reg (mode); + operands[6] = CONST0_RTX (mode); } ) @@ -1474,6 +1502,7 @@ (define_insn "mask_gather_load" (match_operand:VNx4SI 2 "register_operand") (match_operand:DI 3 "const_int_operand") (match_operand:DI 4 "aarch64_gather_scale_operand_") + (match_operand:SVE_4 6 "aarch64_maskload_else_operand") (mem:BLK (scratch))] UNSPEC_LD1_GATHER))] "TARGET_SVE && TARGET_NON_STREAMING" @@ -1503,6 +1532,7 @@ (define_insn "mask_gather_load" (match_operand:VNx2DI 2 "register_operand") (match_operand:DI 3 "const_int_operand") (match_operand:DI 4 "aarch64_gather_scale_operand_") + (match_operand:SVE_2 6 "aarch64_maskload_else_operand") (mem:BLK (scratch))] UNSPEC_LD1_GATHER))] "TARGET_SVE && TARGET_NON_STREAMING" @@ -1531,6 +1561,7 @@ (define_insn_and_rewrite "*mask_gather_load_xtw_unpac UNSPEC_PRED_X) (match_operand:DI 3 "const_int_operand") (match_operand:DI 4 "aarch64_gather_scale_operand_") + (match_operand:SVE_2 7 "aarch64_maskload_else_operand") (mem:BLK (scratch))] UNSPEC_LD1_GATHER))] "TARGET_SVE && TARGET_NON_STREAMING" @@ -1561,6 +1592,7 @@ (define_insn_and_rewrite "*mask_gather_load_sxtw" UNSPEC_PRED_X) (match_operand:DI 3 "const_int_operand") (match_operand:DI 4 "aarch64_gather_scale_operand_") + (match_operand:SVE_2 7 "aarch64_maskload_else_operand") (mem:BLK (scratch))] UNSPEC_LD1_GATHER))] "TARGET_SVE && TARGET_NON_STREAMING" @@ -1588,6 +1620,7 @@ (define_insn "*mask_gather_load_uxtw" (match_operand:VNx2DI 6 "aarch64_sve_uxtw_immediate")) (match_operand:DI 3 "const_int_operand") (match_operand:DI 4 "aarch64_gather_scale_operand_") + (match_operand:SVE_2 7 "aarch64_maskload_else_operand") (mem:BLK (scratch))] UNSPEC_LD1_GATHER))] "TARGET_SVE && TARGET_NON_STREAMING" @@ -1624,6 +1657,7 
@@ (define_insn_and_rewrite "@aarch64_gather_load_ (match_operand:VNx4SI 2 "register_operand") (match_operand:DI 3 "const_int_operand") (match_operand:DI 4 "aarch64_gather_scale_operand_") + (match_operand:SVE_4BHI 7 "aarch64_maskload_else_operand") (mem:BLK (scratch))] UNSPEC_LD1_GATHER))] UNSPEC_PRED_X))] @@ -1663,6 +1697,7 @@ (define_insn_and_rewrite "@aarch64_gather_load_") + (match_operand:SVE_2BHSI 7 "aarch64_maskload_else_operand") (mem:BLK (scratch))] UNSPEC_LD1_GATHER))] UNSPEC_PRED_X))] @@ -1701,6 +1736,7 @@ (define_insn_and_rewrite "*aarch64_gather_load_") + (match_operand:SVE_2BHSI 8 "aarch64_maskload_else_operand") (mem:BLK (scratch))] UNSPEC_LD1_GATHER))] UNSPEC_PRED_X))] @@ -1738,6 +1774,7 @@ (define_insn_and_rewrite "*aarch64_gather_load_") + (match_operand:SVE_2BHSI 8 "aarch64_maskload_else_operand") (mem:BLK (scratch))] UNSPEC_LD1_GATHER))] UNSPEC_PRED_X))] @@ -1772,6 +1809,7 @@ (define_insn_and_rewrite "*aarch64_gather_load_") + (match_operand:SVE_2BHSI 8 "aarch64_maskload_else_operand") (mem:BLK (scratch))] UNSPEC_LD1_GATHER))] UNSPEC_PRED_X))] diff --git a/gcc/config/aarch64/aarch64-sve2.md b/gcc/config/aarch64/aarch64-sve2.md index 5f2697c3179..22e8632af80 100644 --- a/gcc/config/aarch64/aarch64-sve2.md +++ b/gcc/config/aarch64/aarch64-sve2.md @@ -138,7 +138,8 @@ (define_insn "@aarch64_" [(set (match_operand:SVE_FULLx24 0 "aligned_register_operand" "=Uw") (unspec:SVE_FULLx24 [(match_operand:VNx16BI 2 "register_operand" "Uph") - (match_operand:SVE_FULLx24 1 "memory_operand" "m")] + (match_operand:SVE_FULLx24 1 "memory_operand" "m") + (match_operand:SVE_FULLx24 3 "aarch64_maskload_else_operand")] LD1_COUNT))] "TARGET_STREAMING_SME2" "\t%0, %K2/z, %1" diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index 0bc98315bb6..6592b3df3b2 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -3224,10 +3224,6 @@ (define_int_iterator SVE_SHIFT_WIDE [UNSPEC_ASHIFT_WIDE (define_int_iterator 
SVE_LDFF1_LDNF1 [UNSPEC_LDFF1 UNSPEC_LDNF1])

-(define_int_iterator SVE_PRED_LOAD [UNSPEC_PRED_X UNSPEC_LD1_SVE])
-
-(define_int_attr pred_load [(UNSPEC_PRED_X "_x") (UNSPEC_LD1_SVE "")])
-
 (define_int_iterator LD1_COUNT [UNSPEC_LD1_COUNT UNSPEC_LDNT1_COUNT])

 (define_int_iterator ST1_COUNT [UNSPEC_ST1_COUNT UNSPEC_STNT1_COUNT])

diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md
index 6ad9a4bd8b9..26cfaed2402 100644
--- a/gcc/config/aarch64/predicates.md
+++ b/gcc/config/aarch64/predicates.md
@@ -1067,3 +1067,7 @@ (define_predicate "aarch64_granule16_simm9"
   (and (match_code "const_int")
        (match_test "IN_RANGE (INTVAL (op), -4096, 4080) && !(INTVAL (op) & 0xf)")))
+
+(define_predicate "aarch64_maskload_else_operand"
+  (and (match_code "const_int,const_vector")
+       (match_test "op == CONST0_RTX (GET_MODE (op))")))

From patchwork Sat Nov 2 12:58:26 2024
X-Patchwork-Submitter: Robin Dapp
X-Patchwork-Id: 100109

From: Robin Dapp
To: gcc-patches@gcc.gnu.org
Cc: rdapp.gcc@gmail.com, rguenther@suse.de, richard.sandiford@arm.com, jeffreyalaw@gmail.com, ams@baylibre.com, crazylht@gmail.com
Subject: [PATCH v3 6/8] gcn: Add else operand to masked loads.
Date: Sat, 2 Nov 2024 13:58:26 +0100
Message-ID: <20241102125828.29183-7-rdapp.gcc@gmail.com>
In-Reply-To: <20241102125828.29183-1-rdapp.gcc@gmail.com>
References: <20241102125828.29183-1-rdapp.gcc@gmail.com>

This patch adds an undefined else operand to the masked loads.
gcc/ChangeLog: * config/gcn/predicates.md (maskload_else_operand): New predicate. * config/gcn/gcn-valu.md: Use new predicate. --- gcc/config/gcn/gcn-valu.md | 14 +++++--------- gcc/config/gcn/predicates.md | 2 ++ 2 files changed, 7 insertions(+), 9 deletions(-) diff --git a/gcc/config/gcn/gcn-valu.md b/gcc/config/gcn/gcn-valu.md index cb2f4a78035..0e65521cf37 100644 --- a/gcc/config/gcn/gcn-valu.md +++ b/gcc/config/gcn/gcn-valu.md @@ -3989,7 +3989,8 @@ (define_expand "while_ultsidi" (define_expand "maskloaddi" [(match_operand:V_MOV 0 "register_operand") (match_operand:V_MOV 1 "memory_operand") - (match_operand 2 "")] + (match_operand 2 "") + (match_operand:V_MOV 3 "maskload_else_operand")] "" { rtx exec = force_reg (DImode, operands[2]); @@ -3998,11 +3999,8 @@ (define_expand "maskloaddi" rtx as = gen_rtx_CONST_INT (VOIDmode, MEM_ADDR_SPACE (operands[1])); rtx v = gen_rtx_CONST_INT (VOIDmode, MEM_VOLATILE_P (operands[1])); - /* Masked lanes are required to hold zero. */ - emit_move_insn (operands[0], gcn_vec_constant (mode, 0)); - emit_insn (gen_gather_expr_exec (operands[0], addr, as, v, - operands[0], exec)); + gcn_gen_undef(mode), exec)); DONE; }) @@ -4027,7 +4025,8 @@ (define_expand "mask_gather_load" (match_operand: 2 "register_operand") (match_operand 3 "immediate_operand") (match_operand:SI 4 "gcn_alu_operand") - (match_operand:DI 5 "")] + (match_operand:DI 5 "") + (match_operand:V_MOV 6 "maskload_else_operand")] "" { rtx exec = force_reg (DImode, operands[5]); @@ -4036,9 +4035,6 @@ (define_expand "mask_gather_load" operands[2], operands[4], INTVAL (operands[3]), exec); - /* Masked lanes are required to hold zero. 
*/
-  emit_move_insn (operands[0], gcn_vec_constant (mode, 0));
-
   if (GET_MODE (addr) == mode)
     emit_insn (gen_gather_insn_1offset_exec (operands[0], addr,
					      const0_rtx, const0_rtx,

diff --git a/gcc/config/gcn/predicates.md b/gcc/config/gcn/predicates.md
index 3f59396a649..21beeb586a4 100644
--- a/gcc/config/gcn/predicates.md
+++ b/gcc/config/gcn/predicates.md
@@ -228,3 +228,5 @@ (define_predicate "ascending_zero_int_parallel"
   return gcn_stepped_zero_int_parallel_p (op, 1);
 })
+
+(define_predicate "maskload_else_operand"
+  (match_operand 0 "scratch_operand"))

From patchwork Sat Nov 2 12:58:27 2024
X-Patchwork-Submitter: Robin Dapp
X-Patchwork-Id: 100108

From: Robin Dapp
To: gcc-patches@gcc.gnu.org
Cc: rdapp.gcc@gmail.com, rguenther@suse.de, richard.sandiford@arm.com, jeffreyalaw@gmail.com, ams@baylibre.com, crazylht@gmail.com
Subject: [PATCH v3 7/8] i386: Add else operand to masked loads.
Date: Sat, 2 Nov 2024 13:58:27 +0100
Message-ID: <20241102125828.29183-8-rdapp.gcc@gmail.com>
In-Reply-To: <20241102125828.29183-1-rdapp.gcc@gmail.com>
References: <20241102125828.29183-1-rdapp.gcc@gmail.com>

This patch adds a zero else operand to masked loads, in particular the
masked gather load builtins that are used for gather vectorization.

gcc/ChangeLog:

	* config/i386/i386-expand.cc (ix86_expand_special_args_builtin):
	Add else-operand handling.
	(ix86_expand_builtin): Ditto.
	* config/i386/predicates.md (vcvtne2ps2bf_parallel): New predicate.
(maskload_else_operand): Ditto. * config/i386/sse.md: Use predicate. --- gcc/config/i386/i386-expand.cc | 26 ++++++-- gcc/config/i386/predicates.md | 4 ++ gcc/config/i386/sse.md | 112 +++++++++++++++++++++------------ 3 files changed, 97 insertions(+), 45 deletions(-) diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc index 0de0e842731..6c61f9f87c2 100644 --- a/gcc/config/i386/i386-expand.cc +++ b/gcc/config/i386/i386-expand.cc @@ -12995,10 +12995,11 @@ ix86_expand_special_args_builtin (const struct builtin_description *d, { tree arg; rtx pat, op; - unsigned int i, nargs, arg_adjust, memory; + unsigned int i, nargs, arg_adjust, memory = -1; unsigned int constant = 100; bool aligned_mem = false; - rtx xops[4]; + rtx xops[4] = {}; + bool add_els = false; enum insn_code icode = d->icode; const struct insn_data_d *insn_p = &insn_data[icode]; machine_mode tmode = insn_p->operand[0].mode; @@ -13125,6 +13126,9 @@ ix86_expand_special_args_builtin (const struct builtin_description *d, case V4DI_FTYPE_PCV4DI_V4DI: case V4SI_FTYPE_PCV4SI_V4SI: case V2DI_FTYPE_PCV2DI_V2DI: + /* Two actual args but an additional else operand. */ + add_els = true; + /* Fallthru. 
*/ case VOID_FTYPE_INT_INT64: nargs = 2; klass = load; @@ -13397,6 +13401,12 @@ ix86_expand_special_args_builtin (const struct builtin_description *d, xops[i]= op; } + if (add_els) + { + xops[i] = CONST0_RTX (GET_MODE (xops[0])); + nargs++; + } + switch (nargs) { case 0: @@ -13653,7 +13663,7 @@ ix86_expand_builtin (tree exp, rtx target, rtx subtarget, enum insn_code icode, icode2; tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0); tree arg0, arg1, arg2, arg3, arg4; - rtx op0, op1, op2, op3, op4, pat, pat2, insn; + rtx op0, op1, op2, op3, op4, opels, pat, pat2, insn; machine_mode mode0, mode1, mode2, mode3, mode4; unsigned int fcode = DECL_MD_FUNCTION_CODE (fndecl); HOST_WIDE_INT bisa, bisa2; @@ -15560,12 +15570,15 @@ rdseed_step: op3 = copy_to_reg (op3); op3 = lowpart_subreg (mode3, op3, GET_MODE (op3)); } + if (!insn_data[icode].operand[5].predicate (op4, mode4)) { - error ("the last argument must be scale 1, 2, 4, 8"); - return const0_rtx; + error ("the last argument must be scale 1, 2, 4, 8"); + return const0_rtx; } + opels = CONST0_RTX (GET_MODE (subtarget)); + /* Optimize. If mask is known to have all high bits set, replace op0 with pc_rtx to signal that the instruction overwrites the whole destination and doesn't use its @@ -15634,7 +15647,8 @@ rdseed_step: } } - pat = GEN_FCN (icode) (subtarget, op0, op1, op2, op3, op4); + pat = GEN_FCN (icode) (subtarget, op0, op1, op2, op3, op4, opels); + if (! 
pat) return const0_rtx; emit_insn (pat); diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md index 053312bbe27..7c7d8f61f11 100644 --- a/gcc/config/i386/predicates.md +++ b/gcc/config/i386/predicates.md @@ -2346,3 +2346,7 @@ (define_predicate "apx_evex_add_memory_operand" return true; }) + +(define_predicate "maskload_else_operand" + (and (match_code "const_int,const_vector") + (match_test "op == CONST0_RTX (GET_MODE (op))"))) diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 36f8567b66f..41c1badbc00 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -28632,7 +28632,7 @@ (define_insn "_maskstore" (set_attr "btver2_decode" "vector") (set_attr "mode" "")]) -(define_expand "maskload" +(define_expand "maskload_1" [(set (match_operand:V48_128_256 0 "register_operand") (unspec:V48_128_256 [(match_operand: 2 "register_operand") @@ -28640,13 +28640,28 @@ (define_expand "maskload" UNSPEC_MASKMOV))] "TARGET_AVX") +(define_expand "maskload" + [(set (match_operand:V48_128_256 0 "register_operand") + (unspec:V48_128_256 + [(match_operand: 2 "register_operand") + (match_operand:V48_128_256 1 "memory_operand") + (match_operand:V48_128_256 3 "const0_operand")] + UNSPEC_MASKMOV))] + "TARGET_AVX" +{ + emit_insn (gen_maskload_1 (operands[0], + operands[1], + operands[2])); + DONE; +}) + (define_expand "maskload" [(set (match_operand:V48_AVX512VL 0 "register_operand") (vec_merge:V48_AVX512VL (unspec:V48_AVX512VL [(match_operand:V48_AVX512VL 1 "memory_operand")] UNSPEC_MASKLOAD) - (match_dup 0) + (match_operand:V48_AVX512VL 3 "const0_operand") (match_operand: 2 "register_operand")))] "TARGET_AVX512F") @@ -28656,8 +28671,9 @@ (define_expand "maskload" (unspec:VI12HFBF_AVX512VL [(match_operand:VI12HFBF_AVX512VL 1 "memory_operand")] UNSPEC_MASKLOAD) - (match_dup 0) - (match_operand: 2 "register_operand")))] + (match_operand:VI12HFBF_AVX512VL 3 "const0_operand") + (match_operand: 2 "register_operand"))) + ] "TARGET_AVX512BW") 
(define_expand "maskstore" @@ -29223,20 +29239,22 @@ (define_expand "avx2_gathersi" (unspec:VEC_GATHER_MODE [(match_operand:VEC_GATHER_MODE 1 "register_operand") (mem: - (match_par_dup 6 + (match_par_dup 7 [(match_operand 2 "vsib_address_operand") (match_operand: 3 "register_operand") - (match_operand:SI 5 "const1248_operand ")])) + (match_operand:SI 5 "const1248_operand ") + (match_operand:VEC_GATHER_MODE 6 "maskload_else_operand")])) (mem:BLK (scratch)) (match_operand:VEC_GATHER_MODE 4 "register_operand")] UNSPEC_GATHER)) - (clobber (match_scratch:VEC_GATHER_MODE 7))])] + (clobber (match_scratch:VEC_GATHER_MODE 8))])] "TARGET_AVX2" { - operands[6] - = gen_rtx_UNSPEC (Pmode, gen_rtvec (3, operands[2], operands[3], - operands[5]), UNSPEC_VSIBADDR); + operands[7] + = gen_rtx_UNSPEC (Pmode, gen_rtvec (4, operands[2], operands[3], + operands[5], operands[6]), + UNSPEC_VSIBADDR); }) (define_insn "*avx2_gathersi" @@ -29247,7 +29265,8 @@ (define_insn "*avx2_gathersi" [(unspec:P [(match_operand:P 3 "vsib_address_operand" "jb") (match_operand: 4 "register_operand" "x") - (match_operand:SI 6 "const1248_operand")] + (match_operand:SI 6 "const1248_operand") + (match_operand:VEC_GATHER_MODE 8 "maskload_else_operand")] UNSPEC_VSIBADDR)]) (mem:BLK (scratch)) (match_operand:VEC_GATHER_MODE 5 "register_operand" "1")] @@ -29268,7 +29287,8 @@ (define_insn "*avx2_gathersi_2" [(unspec:P [(match_operand:P 2 "vsib_address_operand" "jb") (match_operand: 3 "register_operand" "x") - (match_operand:SI 5 "const1248_operand")] + (match_operand:SI 5 "const1248_operand") + (match_operand:VEC_GATHER_MODE 7 "maskload_else_operand")] UNSPEC_VSIBADDR)]) (mem:BLK (scratch)) (match_operand:VEC_GATHER_MODE 4 "register_operand" "1")] @@ -29286,20 +29306,22 @@ (define_expand "avx2_gatherdi" (unspec:VEC_GATHER_MODE [(match_operand: 1 "register_operand") (mem: - (match_par_dup 6 + (match_par_dup 7 [(match_operand 2 "vsib_address_operand") (match_operand: 3 "register_operand") - (match_operand:SI 5 
"const1248_operand ")])) + (match_operand:SI 5 "const1248_operand ") + (match_operand:VEC_GATHER_MODE 6 "maskload_else_operand")])) (mem:BLK (scratch)) (match_operand: 4 "register_operand")] UNSPEC_GATHER)) - (clobber (match_scratch:VEC_GATHER_MODE 7))])] + (clobber (match_scratch:VEC_GATHER_MODE 8))])] "TARGET_AVX2" { - operands[6] - = gen_rtx_UNSPEC (Pmode, gen_rtvec (3, operands[2], operands[3], - operands[5]), UNSPEC_VSIBADDR); + operands[7] + = gen_rtx_UNSPEC (Pmode, gen_rtvec (4, operands[2], operands[3], + operands[5], operands[6]), + UNSPEC_VSIBADDR); }) (define_insn "*avx2_gatherdi" @@ -29310,7 +29332,8 @@ (define_insn "*avx2_gatherdi" [(unspec:P [(match_operand:P 3 "vsib_address_operand" "jb") (match_operand: 4 "register_operand" "x") - (match_operand:SI 6 "const1248_operand")] + (match_operand:SI 6 "const1248_operand") + (match_operand:VEC_GATHER_MODE 8 "maskload_else_operand")] UNSPEC_VSIBADDR)]) (mem:BLK (scratch)) (match_operand: 5 "register_operand" "1")] @@ -29331,7 +29354,8 @@ (define_insn "*avx2_gatherdi_2" [(unspec:P [(match_operand:P 2 "vsib_address_operand" "jb") (match_operand: 3 "register_operand" "x") - (match_operand:SI 5 "const1248_operand")] + (match_operand:SI 5 "const1248_operand") + (match_operand:VEC_GATHER_MODE 7 "maskload_else_operand")] UNSPEC_VSIBADDR)]) (mem:BLK (scratch)) (match_operand: 4 "register_operand" "1")] @@ -29357,7 +29381,8 @@ (define_insn "*avx2_gatherdi_3" [(unspec:P [(match_operand:P 3 "vsib_address_operand" "jb") (match_operand: 4 "register_operand" "x") - (match_operand:SI 6 "const1248_operand")] + (match_operand:SI 6 "const1248_operand") + (match_operand:VI4F_256 8 "maskload_else_operand")] UNSPEC_VSIBADDR)]) (mem:BLK (scratch)) (match_operand: 5 "register_operand" "1")] @@ -29381,7 +29406,8 @@ (define_insn "*avx2_gatherdi_4" [(unspec:P [(match_operand:P 2 "vsib_address_operand" "jb") (match_operand: 3 "register_operand" "x") - (match_operand:SI 5 "const1248_operand")] + (match_operand:SI 5 "const1248_operand") 
+ (match_operand:VI4F_256 7 "maskload_else_operand")] UNSPEC_VSIBADDR)]) (mem:BLK (scratch)) (match_operand: 4 "register_operand" "1")] @@ -29402,17 +29428,19 @@ (define_expand "_gathersi" [(match_operand:VI48F 1 "register_operand") (match_operand: 4 "register_operand") (mem: - (match_par_dup 6 + (match_par_dup 7 [(match_operand 2 "vsib_address_operand") (match_operand: 3 "register_operand") - (match_operand:SI 5 "const1248_operand")]))] + (match_operand:SI 5 "const1248_operand") + (match_operand:VI48F 6 "maskload_else_operand")]))] UNSPEC_GATHER)) - (clobber (match_scratch: 7))])] + (clobber (match_scratch: 8))])] "TARGET_AVX512F" { - operands[6] - = gen_rtx_UNSPEC (Pmode, gen_rtvec (3, operands[2], operands[3], - operands[5]), UNSPEC_VSIBADDR); + operands[7] + = gen_rtx_UNSPEC (Pmode, gen_rtvec (4, operands[2], operands[3], + operands[5], operands[6]), + UNSPEC_VSIBADDR); }) (define_insn "*avx512f_gathersi" @@ -29424,7 +29452,8 @@ (define_insn "*avx512f_gathersi" [(unspec:P [(match_operand:P 4 "vsib_address_operand" "Tv") (match_operand: 3 "register_operand" "v") - (match_operand:SI 5 "const1248_operand")] + (match_operand:SI 5 "const1248_operand") + (match_operand:VI48F 8 "maskload_else_operand")] UNSPEC_VSIBADDR)])] UNSPEC_GATHER)) (clobber (match_scratch: 2 "=&Yk"))] @@ -29445,7 +29474,8 @@ (define_insn "*avx512f_gathersi_2" [(unspec:P [(match_operand:P 3 "vsib_address_operand" "Tv") (match_operand: 2 "register_operand" "v") - (match_operand:SI 4 "const1248_operand")] + (match_operand:SI 4 "const1248_operand") + (match_operand:VI48F 7 "maskload_else_operand")] UNSPEC_VSIBADDR)])] UNSPEC_GATHER)) (clobber (match_scratch: 1 "=&Yk"))] @@ -29464,17 +29494,19 @@ (define_expand "_gatherdi" [(match_operand: 1 "register_operand") (match_operand:QI 4 "register_operand") (mem: - (match_par_dup 6 + (match_par_dup 7 [(match_operand 2 "vsib_address_operand") (match_operand: 3 "register_operand") - (match_operand:SI 5 "const1248_operand")]))] + (match_operand:SI 5 
"const1248_operand") + (match_operand:VI48F 6 "maskload_else_operand")]))] UNSPEC_GATHER)) - (clobber (match_scratch:QI 7))])] + (clobber (match_scratch:QI 8))])] "TARGET_AVX512F" { - operands[6] - = gen_rtx_UNSPEC (Pmode, gen_rtvec (3, operands[2], operands[3], - operands[5]), UNSPEC_VSIBADDR); + operands[7] + = gen_rtx_UNSPEC (Pmode, gen_rtvec (4, operands[2], operands[3], + operands[5], operands[6]), + UNSPEC_VSIBADDR); }) (define_insn "*avx512f_gatherdi" @@ -29486,7 +29518,8 @@ (define_insn "*avx512f_gatherdi" [(unspec:P [(match_operand:P 4 "vsib_address_operand" "Tv") (match_operand: 3 "register_operand" "v") - (match_operand:SI 5 "const1248_operand")] + (match_operand:SI 5 "const1248_operand") + (match_operand:VI48F 8 "maskload_else_operand")] UNSPEC_VSIBADDR)])] UNSPEC_GATHER)) (clobber (match_scratch:QI 2 "=&Yk"))] @@ -29507,7 +29540,8 @@ (define_insn "*avx512f_gatherdi_2" [(unspec:P [(match_operand:P 3 "vsib_address_operand" "Tv") (match_operand: 2 "register_operand" "v") - (match_operand:SI 4 "const1248_operand")] + (match_operand:SI 4 "const1248_operand") + (match_operand:VI48F 7 "maskload_else_operand")] UNSPEC_VSIBADDR)])] UNSPEC_GATHER)) (clobber (match_scratch:QI 1 "=&Yk"))] @@ -29544,7 +29578,7 @@ (define_expand "_scattersi" operands[5] = gen_rtx_UNSPEC (Pmode, gen_rtvec (4, operands[0], operands[2], operands[4], operands[1]), - UNSPEC_VSIBADDR); + UNSPEC_VSIBADDR); }) (define_insn "*avx512f_scattersi" From patchwork Sat Nov 2 12:58:28 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Robin Dapp X-Patchwork-Id: 100111 X-Patchwork-Delegate: jlaw@ventanamicro.com Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 7D3673857BA7 for ; Sat, 2 Nov 2024 13:01:38 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: 
From: Robin Dapp
To: gcc-patches@gcc.gnu.org
Cc: rdapp.gcc@gmail.com, rguenther@suse.de, richard.sandiford@arm.com, jeffreyalaw@gmail.com, ams@baylibre.com, crazylht@gmail.com
Subject: [PATCH v3 8/8] RISC-V: Add else operand to masked loads [PR115336].
Date: Sat, 2 Nov 2024 13:58:28 +0100
Message-ID: <20241102125828.29183-9-rdapp.gcc@gmail.com>
In-Reply-To: <20241102125828.29183-1-rdapp.gcc@gmail.com>
References: <20241102125828.29183-1-rdapp.gcc@gmail.com>

From: Robin Dapp

This patch adds else operands to masked loads.  Currently the default
else operand predicate just accepts "undefined" (i.e. SCRATCH) values.

	PR middle-end/115336
	PR middle-end/116059

gcc/ChangeLog:

	* config/riscv/autovec.md: Add else operand.
	* config/riscv/predicates.md (maskload_else_operand): New
	predicate.
	* config/riscv/riscv-v.cc (get_else_operand): Remove static.
	(expand_load_store): Use get_else_operand and adjust index.
	(expand_gather_scatter): Ditto.
	(expand_lanes_load_store): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/pr115336.c: New test.
	* gcc.target/riscv/rvv/autovec/pr116059.c: New test.
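The situation the else operand addresses can be illustrated with a small, self-contained C sketch (not part of the patch; `cond_load` is a hypothetical example function).  When the vectorizer turns the conditional read into a masked load, lanes whose mask bit is clear must not touch memory, and the new operand tells the backend what those inactive lanes contain instead of leaving them undefined (SCRATCH):

```c
#include <assert.h>

/* Hypothetical illustration: a conditional read that a vectorizing
   compiler can implement as a masked load of b[].  Elements with
   cond[i] == 0 must not be loaded from b; with an explicit else
   operand the inactive lanes get a well-defined value rather than
   an "undefined" one.  */
void
cond_load (int *restrict a, const int *restrict b,
	   const int *restrict cond, int n)
{
  for (int i = 0; i < n; i++)
    a[i] = cond[i] ? b[i] : 0;
}
```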
---
 gcc/config/riscv/autovec.md                   | 50 +++++++++++--------
 gcc/config/riscv/predicates.md                |  3 ++
 gcc/config/riscv/riscv-v.cc                   | 30 +++++++----
 .../gcc.target/riscv/rvv/autovec/pr115336.c   | 20 ++++++++
 .../gcc.target/riscv/rvv/autovec/pr116059.c   | 15 ++++++
 5 files changed, 88 insertions(+), 30 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr115336.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr116059.c

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 1f1849d5237..26489e537c6 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -26,8 +26,9 @@ (define_expand "mask_len_load"
   [(match_operand:V 0 "register_operand")
    (match_operand:V 1 "memory_operand")
    (match_operand: 2 "vector_mask_operand")
-   (match_operand 3 "autovec_length_operand")
-   (match_operand 4 "const_0_operand")]
+   (match_operand:V 3 "maskload_else_operand")
+   (match_operand 4 "autovec_length_operand")
+   (match_operand 5 "const_0_operand")]
   "TARGET_VECTOR"
 {
   riscv_vector::expand_load_store (operands, true);
@@ -57,8 +58,9 @@ (define_expand "mask_len_gather_load"
    (match_operand 3 "")
    (match_operand 4 "")
    (match_operand: 5 "vector_mask_operand")
-   (match_operand 6 "autovec_length_operand")
-   (match_operand 7 "const_0_operand")]
+   (match_operand 6 "maskload_else_operand")
+   (match_operand 7 "autovec_length_operand")
+   (match_operand 8 "const_0_operand")]
   "TARGET_VECTOR && riscv_vector::gather_scatter_valid_offset_p (mode)"
 {
   riscv_vector::expand_gather_scatter (operands, true);
@@ -72,8 +74,9 @@ (define_expand "mask_len_gather_load"
    (match_operand 3 "")
    (match_operand 4 "")
    (match_operand: 5 "vector_mask_operand")
-   (match_operand 6 "autovec_length_operand")
-   (match_operand 7 "const_0_operand")]
+   (match_operand 6 "maskload_else_operand")
+   (match_operand 7 "autovec_length_operand")
+   (match_operand 8 "const_0_operand")]
   "TARGET_VECTOR && riscv_vector::gather_scatter_valid_offset_p (mode)"
 {
   riscv_vector::expand_gather_scatter (operands, true);
@@ -87,8 +90,9 @@ (define_expand "mask_len_gather_load"
    (match_operand 3 "")
    (match_operand 4 "")
    (match_operand: 5 "vector_mask_operand")
-   (match_operand 6 "autovec_length_operand")
-   (match_operand 7 "const_0_operand")]
+   (match_operand 6 "maskload_else_operand")
+   (match_operand 7 "autovec_length_operand")
+   (match_operand 8 "const_0_operand")]
   "TARGET_VECTOR && riscv_vector::gather_scatter_valid_offset_p (mode)"
 {
   riscv_vector::expand_gather_scatter (operands, true);
@@ -102,8 +106,9 @@ (define_expand "mask_len_gather_load"
    (match_operand 3 "")
    (match_operand 4 "")
    (match_operand: 5 "vector_mask_operand")
-   (match_operand 6 "autovec_length_operand")
-   (match_operand 7 "const_0_operand")]
+   (match_operand 6 "maskload_else_operand")
+   (match_operand 7 "autovec_length_operand")
+   (match_operand 8 "const_0_operand")]
   "TARGET_VECTOR && riscv_vector::gather_scatter_valid_offset_p (mode)"
 {
   riscv_vector::expand_gather_scatter (operands, true);
@@ -117,8 +122,9 @@ (define_expand "mask_len_gather_load"
    (match_operand 3 "")
    (match_operand 4 "")
    (match_operand: 5 "vector_mask_operand")
-   (match_operand 6 "autovec_length_operand")
-   (match_operand 7 "const_0_operand")]
+   (match_operand 6 "maskload_else_operand")
+   (match_operand 7 "autovec_length_operand")
+   (match_operand 8 "const_0_operand")]
   "TARGET_VECTOR && riscv_vector::gather_scatter_valid_offset_p (mode)"
 {
   riscv_vector::expand_gather_scatter (operands, true);
@@ -132,8 +138,9 @@ (define_expand "mask_len_gather_load"
    (match_operand 3 "")
    (match_operand 4 "")
    (match_operand: 5 "vector_mask_operand")
-   (match_operand 6 "autovec_length_operand")
-   (match_operand 7 "const_0_operand")]
+   (match_operand 6 "maskload_else_operand")
+   (match_operand 7 "autovec_length_operand")
+   (match_operand 8 "const_0_operand")]
   "TARGET_VECTOR && riscv_vector::gather_scatter_valid_offset_p (mode)"
 {
   riscv_vector::expand_gather_scatter (operands, true);
@@ -151,8 +158,9 @@ (define_expand "mask_len_gather_load"
    (match_operand 3 "")
    (match_operand 4 "")
    (match_operand: 5 "vector_mask_operand")
-   (match_operand 6 "autovec_length_operand")
-   (match_operand 7 "const_0_operand")]
+   (match_operand 6 "maskload_else_operand")
+   (match_operand 7 "autovec_length_operand")
+   (match_operand 8 "const_0_operand")]
   "TARGET_VECTOR"
 {
   riscv_vector::expand_gather_scatter (operands, true);
@@ -280,8 +288,9 @@ (define_expand "vec_mask_len_load_lanes"
   [(match_operand:VT 0 "register_operand")
    (match_operand:VT 1 "memory_operand")
    (match_operand: 2 "vector_mask_operand")
-   (match_operand 3 "autovec_length_operand")
-   (match_operand 4 "const_0_operand")]
+   (match_operand 3 "maskload_else_operand")
+   (match_operand 4 "autovec_length_operand")
+   (match_operand 5 "const_0_operand")]
   "TARGET_VECTOR_AUTOVEC_SEGMENT"
 {
   riscv_vector::expand_lanes_load_store (operands, true);
@@ -2898,8 +2907,9 @@ (define_expand "mask_len_strided_load_"
    (match_operand 1 "pmode_reg_or_0_operand")
    (match_operand 2 "pmode_reg_or_0_operand")
    (match_operand: 3 "vector_mask_operand")
-   (match_operand 4 "autovec_length_operand")
-   (match_operand 5 "const_0_operand")]
+   (match_operand 4 "maskload_else_operand")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")]
   "TARGET_VECTOR"
 {
   riscv_vector::expand_strided_load (mode, operands);
diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index 55bcfa4fa4f..ffa90948dd7 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -528,6 +528,9 @@ (define_predicate "autovec_else_operand"
   (ior (match_operand 0 "register_operand")
        (match_operand 0 "scratch_operand")))
 
+(define_predicate "maskload_else_operand"
+  (match_operand 0 "scratch_operand"))
+
 (define_predicate "vector_arith_operand"
   (ior (match_operand 0 "register_operand")
        (and (match_code "const_vector")
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 209b7ee88f1..4cec3e64aa9 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -3793,12 +3793,23 @@ expand_select_vl (rtx *ops)
   emit_insn (gen_no_side_effects_vsetvl_rtx (rvv_mode, ops[0], ops[1]));
 }
 
+/* Return RVV_VUNDEF if the ELSE value is scratch rtx.  */
+static rtx
+get_else_operand (rtx op)
+{
+  return GET_CODE (op) == SCRATCH ? RVV_VUNDEF (GET_MODE (op)) : op;
+}
+
 /* Expand MASK_LEN_{LOAD,STORE}.  */
 void
 expand_load_store (rtx *ops, bool is_load)
 {
-  rtx mask = ops[2];
-  rtx len = ops[3];
+  int idx = 2;
+  rtx mask = ops[idx++];
+  /* A masked load has a merge/else operand.  */
+  if (is_load)
+    get_else_operand (ops[idx++]);
+  rtx len = ops[idx];
   machine_mode mode = GET_MODE (ops[0]);
 
   if (is_vlmax_len_p (mode, len))
@@ -3841,7 +3852,9 @@ expand_strided_load (machine_mode mode, rtx *ops)
   rtx base = ops[1];
   rtx stride = ops[2];
   rtx mask = ops[3];
-  rtx len = ops[4];
+  int idx = 4;
+  get_else_operand (ops[idx++]);
+  rtx len = ops[idx];
   poly_int64 len_val;
 
   insn_code icode = code_for_pred_strided_load (mode);
@@ -3943,13 +3956,6 @@ expand_cond_len_op (unsigned icode, insn_flags op_type, rtx *ops, rtx len)
   emit_nonvlmax_insn (icode, insn_flags, ops, len);
 }
 
-/* Return RVV_VUNDEF if the ELSE value is scratch rtx.  */
-static rtx
-get_else_operand (rtx op)
-{
-  return GET_CODE (op) == SCRATCH ? RVV_VUNDEF (GET_MODE (op)) : op;
-}
-
 /* Expand unary ops COND_LEN_*.  */
 void
 expand_cond_len_unop (unsigned icode, rtx *ops)
@@ -4070,6 +4076,8 @@ expand_gather_scatter (rtx *ops, bool is_load)
   int shift;
   rtx mask = ops[5];
   rtx len = ops[6];
+  if (is_load)
+    len = ops[7];
   if (is_load)
     {
       vec_reg = ops[0];
@@ -4292,6 +4300,8 @@ expand_lanes_load_store (rtx *ops, bool is_load)
 {
   rtx mask = ops[2];
   rtx len = ops[3];
+  if (is_load)
+    len = ops[4];
   rtx addr = is_load ? XEXP (ops[1], 0) : XEXP (ops[0], 0);
   rtx reg = is_load ? ops[0] : ops[1];
   machine_mode mode = GET_MODE (ops[0]);
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr115336.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr115336.c
new file mode 100644
index 00000000000..aa2d02309be
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr115336.c
@@ -0,0 +1,20 @@
+/* { dg-do run } */
+/* { dg-options { -O3 -march=rv64gcv_zvl256b -mabi=lp64d } } */
+/* { dg-require-effective-target rvv_zvl256b_ok } */
+
+short d[19];
+_Bool e[100][19][19];
+_Bool f[10000];
+
+int main()
+{
+  for (long g = 0; g < 19; ++g)
+    d[g] = 3;
+  _Bool(*h)[19][19] = e;
+  for (short g = 0; g < 9; g++)
+    for (int i = 4; i < 16; i += 3)
+      f[i * 9 + g] = d[i] ? d[i] : h[g][i][2];
+  for (long i = 120; i < 122; ++i)
+    if (f[i] != 1)
+      __builtin_abort ();
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr116059.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr116059.c
new file mode 100644
index 00000000000..93700acd485
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr116059.c
@@ -0,0 +1,15 @@
+/* { dg-do run } */
+/* { dg-require-effective-target riscv_v_ok } */
+/* { dg-add-options riscv_v } */
+/* { dg-additional-options "-O2 -std=gnu99" } */
+
+char a;
+_Bool b[11] = {1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1};
+int main() {
+  _Bool *c = b;
+  for (signed d = 0; d < 11; d += 1)
+    a = d % 2 == 0 ? c[d] / c[d]
+                   : c[d];
+  if (a != 1)
+    __builtin_abort ();
+}