From patchwork Sat Jan 22 22:24:26 2022
X-Patchwork-Submitter: "H.J. Lu"
X-Patchwork-Id: 50366
From: "H.J. Lu"
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] x86: Also check VALID_BCST_MODE_P on memory broadcast
Date: Sat, 22 Jan 2022 14:24:26 -0800
Message-Id: <20220122222427.625476-1-hjl.tools@gmail.com>

Return false for invalid broadcast mode in bcst_mem_operand on memory
broadcast:

(vec_duplicate:V16SF (mem/j:V4SF (reg/v/f:DI 109 [ b ])))

gcc/

	PR target/104188
	* config/i386/predicates.md (bcst_mem_operand): Also check
	VALID_BCST_MODE_P on memory broadcast.

gcc/testsuite/

	PR target/104188
	* gcc.target/i386/pr104188.c: New test.
---
 gcc/config/i386/predicates.md            |  3 +-
 gcc/testsuite/gcc.target/i386/pr104188.c | 70 ++++++++++++++++++++++++
 2 files changed, 72 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr104188.c

diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index eae6ab58e23..fd716f006f3 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -1157,7 +1157,8 @@ (define_predicate "bcst_mem_operand"
             (ior (match_test "TARGET_AVX512VL")
                  (match_test "GET_MODE_SIZE (GET_MODE (op)) == 64")))
        (match_test "VALID_BCST_MODE_P (GET_MODE_INNER (GET_MODE (op)))")
-       (match_test "memory_operand (XEXP (op, 0), GET_MODE (XEXP (op, 0)))")))
+       (match_test "memory_operand (XEXP (op, 0), GET_MODE (XEXP (op, 0)))")
+       (match_test "VALID_BCST_MODE_P (GET_MODE (XEXP (op, 0)))")))
 
 ; Return true when OP is bcst_mem_operand or vector_memory_operand.
 (define_predicate "bcst_vector_operand"
diff --git a/gcc/testsuite/gcc.target/i386/pr104188.c b/gcc/testsuite/gcc.target/i386/pr104188.c
new file mode 100644
index 00000000000..c6f615b9625
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr104188.c
@@ -0,0 +1,70 @@
+/* { dg-do run { target avx512f } } */
+/* { dg-options "-O2 -mfpmath=sse" } */
+
+#include <x86intrin.h>
+
+union U {
+  float m[4][4];
+  __m128 r[4];
+  __m512 s;
+};
+
+__attribute__((noipa, target("avx512f")))
+void
+foo (union U *x, union U *a, union U *b)
+{
+  __m512 c = _mm512_loadu_ps (&a->s);
+  __m512 d = _mm512_broadcast_f32x4 (b->r[0]);
+  __m512 e = _mm512_broadcast_f32x4 (b->r[1]);
+  __m512 f = _mm512_broadcast_f32x4 (b->r[2]);
+  __m512 g = _mm512_broadcast_f32x4 (b->r[3]);
+  __m512 h = _mm512_mul_ps (_mm512_permute_ps (c, 0x00), d);
+  h = _mm512_fmadd_ps (_mm512_permute_ps (c, 0x55), e, h);
+  h = _mm512_fmadd_ps (_mm512_permute_ps (c, 0xaa), f, h);
+  h = _mm512_fmadd_ps (_mm512_permute_ps (c, 0xff), g, h);
+  _mm512_storeu_ps (&x->s, h);
+}
+
+__attribute__((noipa, target("avx512f")))
+void
+do_test (void)
+{
+  union U a = { .m = { { 1.0f, 2.0f, 3.0f, 4.0f },
+                       { 5.0f, 6.0f, 7.0f, 8.0f },
+                       { 9.0f, 10.0f, 11.0f, 12.0f },
+                       { 13.0f, 14.0f, 15.0f, 16.0f } } };
+  union U b = { .m = { { 17.0f, 18.0f, 19.0f, 20.0f },
+                       { 21.0f, 22.0f, 23.0f, 24.0f },
+                       { 25.0f, 26.0f, 27.0f, 28.0f },
+                       { 29.0f, 30.0f, 31.0f, 32.0f } } };
+  union U c;
+  foo (&c, &a, &b);
+  if (c.m[0][0] != 250.0f
+      || c.m[0][1] != 260.0f
+      || c.m[0][2] != 270.0f
+      || c.m[0][3] != 280.0f)
+    __builtin_abort ();
+  if (c.m[1][0] != 618.0f
+      || c.m[1][1] != 644.0f
+      || c.m[1][2] != 670.0f
+      || c.m[1][3] != 696.0f)
+    __builtin_abort ();
+  if (c.m[2][0] != 986.0f
+      || c.m[2][1] != 1028.0f
+      || c.m[2][2] != 1070.0f
+      || c.m[2][3] != 1112.0f)
+    __builtin_abort ();
+  if (c.m[3][0] != 1354.0f
+      || c.m[3][1] != 1412.0f
+      || c.m[3][2] != 1470.0f
+      || c.m[3][3] != 1528.0f)
+    __builtin_abort ();
+}
+
+int
+main ()
+{
+  if (__builtin_cpu_supports ("avx512f"))
+    do_test ();
+  return 0;
+}
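
Note on the expected values in the new test: foo () appears to compute a
4x4 single-precision matrix product, x = a * b. Each 128-bit lane of c
holds one row of a, _mm512_permute_ps with 0x00/0x55/0xaa/0xff replicates
element 0/1/2/3 of that row within its lane, and each
_mm512_broadcast_f32x4 copies one row of b into all four lanes. The scalar
sketch below reproduces the constants checked in do_test (). It is not
part of the patch, and ref_matmul4 is a hypothetical helper name used only
for illustration.

```c
#include <stddef.h>

/* Scalar reference for what foo () in pr104188.c is assumed to compute:
   x = a * b for row-major 4x4 float matrices.  ref_matmul4 is a
   hypothetical name, not something the patch defines.  */
void
ref_matmul4 (float x[4][4], float a[4][4], float b[4][4])
{
  for (size_t i = 0; i < 4; i++)
    for (size_t j = 0; j < 4; j++)
      {
        /* Row i of a times column j of b; the vector code accumulates
           the same four products per element via mul plus three fmadds.  */
        float sum = 0.0f;
        for (size_t k = 0; k < 4; k++)
          sum += a[i][k] * b[k][j];
        x[i][j] = sum;
      }
}
```

With the a and b initializers from do_test (), x[0][0] is
1*17 + 2*21 + 3*25 + 4*29 = 250 and x[3][3] is
13*20 + 14*24 + 15*28 + 16*32 = 1528, matching the test. All inputs are
small integers, so the products and sums are exact in float and the
equality comparisons in the test are safe with or without fused
multiply-add.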