[v4,07/12] arm: Fix vcond_mask expander for MVE (PR target/100757)

From: Christophe Lyon <christophe.lyon.oss@gmail.com>

The problem in this PR is that we call VPSEL with a mask of vector
type instead of HImode. This happens because operand 3 in vcond_mask
is the pre-computed vector comparison and has vector type.

This patch fixes it by implementing TARGET_VECTORIZE_GET_MASK_MODE,
returning the appropriate VxBI mode when targeting MVE.  In turn, this
implies implementing vec_cmp<mode><MVE_vpred>,
vec_cmpu<mode><MVE_vpred> and vcond_mask_<mode><MVE_vpred>, and we can
move vec_cmp<mode><v_cmp_result>, vec_cmpu<mode><mode> and
vcond_mask_<mode><v_cmp_result> back to neon.md since they are not
used by MVE anymore.  The new *<MVE_vpred> patterns listed above are
implemented in mve.md since they are only valid for MVE. However, this
may make maintenance and comparison more painful than keeping all of
them in vec-common.md.
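
As an illustration only (this is not the code in arm.cc), the choice of
mask mode can be modelled in plain C as a mapping from the number of
lanes of a 128-bit vector to a boolean-vector mode.  The enum and the
function name below are hypothetical, and the sketch assumes the VxBI
modes are V16BI, V8BI and V4BI:

```c
/* Hypothetical stand-ins for machine modes; these are not GCC's
   real enumerators.  */
enum model_mode { MODEL_VOID, MODEL_V16BI, MODEL_V8BI, MODEL_V4BI };

/* Return the boolean-vector mode modelling the MVE predicate of a
   128-bit vector with NUNITS lanes, or MODEL_VOID if there is none.  */
static enum model_mode
model_get_mask_mode (unsigned nunits)
{
  switch (nunits)
    {
    case 16: return MODEL_V16BI;	/* e.g. V16QI */
    case 8:  return MODEL_V8BI;		/* e.g. V8HI, V8HF */
    case 4:  return MODEL_V4BI;		/* e.g. V4SI, V4SF */
    default: return MODEL_VOID;
    }
}
```

With this model, a V4SF comparison produces a 4-lane boolean mask
rather than a V4SI vector of 0 and -1.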

In the process, we can get rid of the recently added vcond_mve
parameter of arm_expand_vector_compare.

Compared to neon.md's vcond_mask_<mode><v_cmp_result> before my "arm:
Auto-vectorization for MVE: vcmp" patch (r12-834), it keeps the VDQWH
iterator added in r12-835 (to have V4HF/V8HF support), as well as the
(!<Is_float_mode> || flag_unsafe_math_optimizations) condition which
was not present before r12-834 although SF modes were enabled by VDQW
(I think this was a bug).

Using TARGET_VECTORIZE_GET_MASK_MODE has the advantage that we no
longer need to generate vpsel with vectors of 0 and 1: the masks are
now merged via scalar 'ands' instructions operating on 16-bit masks
after converting the boolean vectors.
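
A minimal scalar sketch of that merge, assuming the usual MVE layout
where the 16-bit predicate holds one bit per byte, so that each 32-bit
lane owns four consecutive bits.  The function names are hypothetical
and only model the behaviour, they are not part of the patch:

```c
#include <stdint.h>

/* Model of "vcmp.f32 eq": build a 16-bit predicate from four float
   lanes.  A lane that compares equal sets all four of its bits.  */
static uint16_t
model_vcmp_eq_f32 (const float a[4], const float b[4])
{
  uint16_t pred = 0;
  for (int lane = 0; lane < 4; lane++)
    if (a[lane] == b[lane])
      pred |= (uint16_t) (0xFu << (4 * lane));
  return pred;
}

/* Merging two predicates is a plain scalar AND on 16-bit values,
   which is what the 'ands' instruction does once both masks have been
   moved to core registers.  */
static uint16_t
model_merge (uint16_t p0, uint16_t p1)
{
  return (uint16_t) (p0 & p1);
}
```

For example, if the first comparison fails only in lane 2 and the second
only in lane 1, the two masks are 0xF0FF and 0xFF0F and the merged mask
is 0xF00F.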

In addition, this patch fixes a problem in arm_expand_vcond() where
the result would be a vector of 0 or 1 instead of operand 1 or 2.
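
The intended semantics can be sketched in scalar C as below.  This is
only a model with a hypothetical function name, assuming 32-bit lanes
and testing the lowest predicate bit of each lane; it is not the
expander itself:

```c
#include <stdint.h>

/* Value of 32-bit lane LANE after a modelled vcond_mask/VPSEL: the
   lane comes from IF_TRUE when its predicate bit is set and from
   IF_FALSE otherwise.  The result is one of the two operands, never a
   literal 0 or 1.  */
static float
model_vpsel_f32_lane (const float if_true[4], const float if_false[4],
		      uint16_t pred, int lane)
{
  return ((pred >> (4 * lane)) & 1) ? if_true[lane] : if_false[lane];
}
```

With if_true = (4.0,4.0,4.0,4.0), if_false = (5.0,5.0,5.0,5.0) and a
predicate of 0xF00F, lanes 0 and 3 are 4.0 and lanes 1 and 2 are 5.0.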

Since we want to skip gcc.dg/signbit-2.c for MVE, we also add a new
arm_mve effective target.

Reducing the number of iterations in pr100757-3.c from 32 to 8, we
generate the code below:

float a[32];
float fn1(int d) {
  float c = 4.0f;
  for (int b = 0; b < 8; b++)
    if (a[b] != 2.0f)
      c = 5.0f;
  return c;
}

fn1:
	ldr     r3, .L3+48
	vldr.64 d4, .L3              // q2=(2.0,2.0,2.0,2.0)
	vldr.64 d5, .L3+8
	vldrw.32        q0, [r3]     // q0=a(0..3)
	adds    r3, r3, #16
	vcmp.f32        eq, q0, q2   // cmp a(0..3) == (2.0,2.0,2.0,2.0)
	vldrw.32        q1, [r3]     // q1=a(4..7)
	vmrs     r3, P0
	vcmp.f32        eq, q1, q2   // cmp a(4..7) == (2.0,2.0,2.0,2.0)
	vmrs    r2, P0  @ movhi
	ands    r3, r3, r2           // r3=select(a(0..3)) & select(a(4..7))
	vldr.64 d4, .L3+16           // q2=(5.0,5.0,5.0,5.0)
	vldr.64 d5, .L3+24
	vmsr     P0, r3
	vldr.64 d6, .L3+32           // q3=(4.0,4.0,4.0,4.0)
	vldr.64 d7, .L3+40
	vpsel q3, q3, q2             // q3=vcond_mask(4.0,5.0)
	vmov.32 r2, q3[1]            // keep the scalar max
	vmov.32 r0, q3[3]
	vmov.32 r3, q3[2]
	vmov.f32        s11, s12
	vmov    s15, r2
	vmov    s14, r3
	vmaxnm.f32      s15, s11, s15
	vmaxnm.f32      s15, s15, s14
	vmov    s14, r0
	vmaxnm.f32      s15, s15, s14
	vmov    r0, s15
	bx      lr
	.L4:
	.align  3
	.L3:
	.word   1073741824	// 2.0f
	.word   1073741824
	.word   1073741824
	.word   1073741824
	.word   1084227584	// 5.0f
	.word   1084227584
	.word   1084227584
	.word   1084227584
	.word   1082130432	// 4.0f
	.word   1082130432
	.word   1082130432
	.word   1082130432

This patch adds tests that trigger an ICE without this fix.

The pr100757*.c testcases are derived from
gcc.c-torture/compile/20160205-1.c, forcing the use of MVE, and using
various types and return values different from 0 and 1 so that they
cannot be commoned with boolean masks.  In addition, since we should
not need these masks, the tests check that they are not present.

Most of the work of this patch series was carried out while I was
working at STMicroelectronics as a Linaro assignee.

2022-02-22  Christophe Lyon  <christophe.lyon@arm.com>

	PR target/100757
	gcc/
	* config/arm/arm-protos.h (arm_get_mask_mode): New prototype.
	(arm_expand_vector_compare): Update prototype.
	* config/arm/arm.cc (TARGET_VECTORIZE_GET_MASK_MODE): New.
	(arm_vector_mode_supported_p): Add support for VxBI modes.
	(arm_expand_vector_compare): Remove useless generation of vpsel.
	(arm_expand_vcond): Fix select operands.
	(arm_get_mask_mode): New.
	* config/arm/mve.md (vec_cmp<mode><MVE_vpred>): New.
	(vec_cmpu<mode><MVE_vpred>): New.
	(vcond_mask_<mode><MVE_vpred>): New.
	* config/arm/vec-common.md (vec_cmp<mode><v_cmp_result>)
	(vec_cmpu<mode><mode>, vcond_mask_<mode><v_cmp_result>): Move to ...
	* config/arm/neon.md (vec_cmp<mode><v_cmp_result>)
	(vec_cmpu<mode><mode>, vcond_mask_<mode><v_cmp_result>): ... here
	and disable for MVE.
	* doc/sourcebuild.texi (arm_mve): Document new effective-target.

	gcc/testsuite/
	PR target/100757
	* gcc.target/arm/simd/pr100757-2.c: New.
	* gcc.target/arm/simd/pr100757-3.c: New.
	* gcc.target/arm/simd/pr100757-4.c: New.
	* gcc.target/arm/simd/pr100757.c: New.
	* gcc.dg/signbit-2.c: Skip when targeting ARM/MVE.
	* lib/target-supports.exp (check_effective_target_arm_mve): New.

Message ID	20220222150020.22852-8-christophe.lyon@linaro.org
State	Committed
Series	ARM/MVE use vectors of boolean for predicates
	[v4,00/12] ARM/MVE use vectors of boolean for predicates
	[v4,01/12] arm: Add new tests for comparison vectorization with Neon and MVE
	[v4,02/12] arm: Add GENERAL_AND_VPR_REGS regclass
	[v4,03/12] arm: Add support for VPR_REG in arm_class_likely_spilled_p
	[v4,04/12] arm: Fix mve_vmvnq_n_<supf><mode> argument mode
	[v4,05/12] arm: Implement MVE predicates as vectors of booleans
	[v4,06/12] arm: Implement auto-vectorized MVE comparisons with vectors of boolean predicates
	[v4,07/12] arm: Fix vcond_mask expander for MVE (PR target/100757)
	[v4,08/12] arm: Convert remaining MVE vcmp builtins to predicate qualifiers
	[v4,09/12] arm: Convert more MVE builtins to predicate qualifiers
	[v4,10/12] arm: Convert more load/store MVE builtins to predicate qualifiers
	[v4,11/12] arm: Convert more MVE/CDE builtins to predicate qualifiers
	[v4,12/12] arm: Add VPR_REG to ALL_REGS
