Message ID | 20211020053033.67934-1-hongyu.wang@intel.com |
---|---|
State | New |
Series | i386: Fix wrong codegen for V8HF move without TARGET_AVX512F |
Commit Message
Hongyu Wang
Oct. 20, 2021, 5:30 a.m. UTC
Since the _Float16 type is enabled for SSE2 targets, returning a
V8HFmode vector without an AVX512F target would generate a wrong
vmovdqa64 instruction. Adjust ix86_get_ssemov to avoid this.

Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,} and sde.

OK for master?

gcc/ChangeLog:
    PR target/102812
    * config/i386/i386.c (ix86_get_ssemov): Adjust HFmode vector
    move without AVX512F target.

gcc/testsuite/ChangeLog:
    PR target/102812
    * gcc.target/i386/pr102812.c: New test.
---
 gcc/config/i386/i386.c                   |  9 ++++++---
 gcc/testsuite/gcc.target/i386/pr102812.c | 12 ++++++++++++
 2 files changed, 18 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr102812.c
Comments
On Wed, Oct 20, 2021 at 1:31 PM Hongyu Wang via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>
> Since _Float16 type is enabled under sse2 target, returning
> V8HFmode vector without AVX512F target would generate wrong
> vmovdqa64 instruction. Adjust ix86_get_ssemov to avoid this.
>
> Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,} and sde.
>
> OK for master?
>
> gcc/ChangeLog:
>     PR target/102812
>     * config/i386/i386.c (ix86_get_ssemov): Adjust HFmode vector
>     move without AVX512F target.
>
> gcc/testsuite/ChangeLog:
>     PR target/102812
>     * gcc.target/i386/pr102812.c: New test.
> ---
>  gcc/config/i386/i386.c                   |  9 ++++++---
>  gcc/testsuite/gcc.target/i386/pr102812.c | 12 ++++++++++++
>  2 files changed, 18 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr102812.c
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 9cc903e826b..1d79180da9a 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -5399,9 +5399,12 @@ ix86_get_ssemov (rtx *operands, unsigned size,
>        switch (scalar_mode)
>          {
>          case E_HFmode:
> -          opcode = (misaligned_p
> -                    ? (TARGET_AVX512BW ? "vmovdqu16" : "vmovdqu64")
> -                    : "vmovdqa64");
> +          if (!TARGET_AVX512F)
> +            opcode = misaligned_p ? "%vmovdqu" : "%vmovdqa";
> +          else
> +            opcode = (misaligned_p
> +                      ? (TARGET_AVX512BW ? "vmovdqu16" : "vmovdqu64")
> +                      : "vmovdqa64");
>            break;

Could we just use similar logic as HI?

        case E_HImode:
          if (evex_reg_p)
            opcode = (need_unaligned_p
                      ? (TARGET_AVX512BW
                         ? "vmovdqu16"
                         : "vmovdqu64")
                      : "vmovdqa64");
          else
            opcode = (need_unaligned_p
                      ? (TARGET_AVX512BW
                         ? "vmovdqu16"
                         : "%vmovdqu")
                      : "%vmovdqa");
          break;

>          case E_SFmode:
>            opcode = misaligned_p ? "%vmovups" : "%vmovaps";
> diff --git a/gcc/testsuite/gcc.target/i386/pr102812.c b/gcc/testsuite/gcc.target/i386/pr102812.c
> new file mode 100644
> index 00000000000..bad4fa9394e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr102812.c
> @@ -0,0 +1,12 @@
> +/* PR target/102812 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -msse4 -mno-avx" } */
> +/* { dg-final { scan-assembler-not "vmovdqa64\t" } } */
> +/* { dg-final { scan-assembler "movdqa\t" } } */
> +
> +typedef _Float16 v8hf __attribute__((__vector_size__ (16)));
> +
> +v8hf t (_Float16 a)
> +{
> +  return (v8hf) {a, 0, 0, 0, 0, 0, 0, 0};
> +}
> --
> 2.18.1
>
Yes, updated patch.

gcc/ChangeLog:
    PR target/102812
    * config/i386/i386.c (ix86_get_ssemov): Adjust HFmode vector
    move to use the same logic as HImode.

gcc/testsuite/ChangeLog:
    PR target/102812
    * gcc.target/i386/pr102812.c: New test.
---
 gcc/config/i386/i386.c                   | 15 ++++++++++++---
 gcc/testsuite/gcc.target/i386/pr102812.c | 12 ++++++++++++
 2 files changed, 24 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr102812.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 9cc903e826b..159684ce549 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -5399,9 +5399,18 @@ ix86_get_ssemov (rtx *operands, unsigned size,
       switch (scalar_mode)
         {
         case E_HFmode:
-          opcode = (misaligned_p
-                    ? (TARGET_AVX512BW ? "vmovdqu16" : "vmovdqu64")
-                    : "vmovdqa64");
+          if (evex_reg_p)
+            opcode = (misaligned_p
+                      ? (TARGET_AVX512BW
+                         ? "vmovdqu16"
+                         : "vmovdqu64")
+                      : "vmovdqa64");
+          else
+            opcode = (misaligned_p
+                      ? (TARGET_AVX512BW
+                         ? "vmovdqu16"
+                         : "%vmovdqu")
+                      : "%vmovdqa");
           break;
         case E_SFmode:
           opcode = misaligned_p ? "%vmovups" : "%vmovaps";
diff --git a/gcc/testsuite/gcc.target/i386/pr102812.c b/gcc/testsuite/gcc.target/i386/pr102812.c
new file mode 100644
index 00000000000..bad4fa9394e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr102812.c
@@ -0,0 +1,12 @@
+/* PR target/102812 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse4 -mno-avx" } */
+/* { dg-final { scan-assembler-not "vmovdqa64\t" } } */
+/* { dg-final { scan-assembler "movdqa\t" } } */
+
+typedef _Float16 v8hf __attribute__((__vector_size__ (16)));
+
+v8hf t (_Float16 a)
+{
+  return (v8hf) {a, 0, 0, 0, 0, 0, 0, 0};
+}
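
For readers following the thread, the selection logic in the updated hunk can be modeled outside of GCC. The sketch below is only an illustration: the function names and the plain `bool` parameters are hypothetical stand-ins for `evex_reg_p`, `misaligned_p` and `TARGET_AVX512BW`, and it is not GCC's real interface. It returns the same template strings as the hunk, with the pre-patch behavior alongside for comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical model of the patched E_HFmode case: the EVEX-only
   mnemonics are chosen only when an EVEX register is involved.  */
const char *
hf_move_template (bool evex_reg_p, bool misaligned_p, bool avx512bw)
{
  if (evex_reg_p)
    return misaligned_p
           ? (avx512bw ? "vmovdqu16" : "vmovdqu64")
           : "vmovdqa64";
  return misaligned_p
         ? (avx512bw ? "vmovdqu16" : "%vmovdqu")
         : "%vmovdqa";
}

/* Hypothetical model of the code before the patch: an aligned move
   always yields the EVEX-only vmovdqa64, whatever the target.  */
const char *
hf_move_template_old (bool misaligned_p, bool avx512bw)
{
  return misaligned_p
         ? (avx512bw ? "vmovdqu16" : "vmovdqu64")
         : "vmovdqa64";
}

/* A few spot checks of the two models.  */
void
check_hf_move_templates (void)
{
  /* Aligned move, no EVEX register: old code picks vmovdqa64.  */
  assert (strcmp (hf_move_template_old (false, false), "vmovdqa64") == 0);
  /* Same situation with the patched logic.  */
  assert (strcmp (hf_move_template (false, false, false), "%vmovdqa") == 0);
  /* An EVEX register keeps the EVEX encoding.  */
  assert (strcmp (hf_move_template (true, false, false), "vmovdqa64") == 0);
}
```

In this model the only inputs that change the result for the PR's test case are the first two: an aligned move with no EVEX register now yields `%vmovdqa` rather than `vmovdqa64`.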
On Thu, Oct 21, 2021 at 1:53 PM Hongyu Wang <wwwhhhyyy333@gmail.com> wrote:
>
> Yes, updated patch.
>
> gcc/ChangeLog:
>     PR target/102812
>     * config/i386/i386.c (ix86_get_ssemov): Adjust HFmode vector
>     move to use the same logic as HImode.
>
> gcc/testsuite/ChangeLog:
>     PR target/102812
>     * gcc.target/i386/pr102812.c: New test.
> ---
>  gcc/config/i386/i386.c                   | 15 ++++++++++++---
>  gcc/testsuite/gcc.target/i386/pr102812.c | 12 ++++++++++++
>  2 files changed, 24 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr102812.c
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 9cc903e826b..159684ce549 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -5399,9 +5399,18 @@ ix86_get_ssemov (rtx *operands, unsigned size,
>        switch (scalar_mode)
>          {
>          case E_HFmode:
> -          opcode = (misaligned_p
> -                    ? (TARGET_AVX512BW ? "vmovdqu16" : "vmovdqu64")
> -                    : "vmovdqa64");
> +          if (evex_reg_p)
> +            opcode = (misaligned_p
> +                      ? (TARGET_AVX512BW
> +                         ? "vmovdqu16"
> +                         : "vmovdqu64")
> +                      : "vmovdqa64");
> +          else
> +            opcode = (misaligned_p
> +                      ? (TARGET_AVX512BW
> +                         ? "vmovdqu16"
> +                         : "%vmovdqu")
> +                      : "%vmovdqa");
>            break;

Assuming Gmail has swallowed the tabs and there is no indentation issue in the original code, LGTM.

>          case E_SFmode:
>            opcode = misaligned_p ? "%vmovups" : "%vmovaps";
> diff --git a/gcc/testsuite/gcc.target/i386/pr102812.c b/gcc/testsuite/gcc.target/i386/pr102812.c
> new file mode 100644
> index 00000000000..bad4fa9394e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr102812.c
> @@ -0,0 +1,12 @@
> +/* PR target/102812 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -msse4 -mno-avx" } */
> +/* { dg-final { scan-assembler-not "vmovdqa64\t" } } */
> +/* { dg-final { scan-assembler "movdqa\t" } } */
> +
> +typedef _Float16 v8hf __attribute__((__vector_size__ (16)));
> +
> +v8hf t (_Float16 a)
> +{
> +  return (v8hf) {a, 0, 0, 0, 0, 0, 0, 0};
> +}
> --
> 2.18.1
On 10/21/21 07:47, Hongyu Wang via Gcc-patches wrote:
> Yes, updated patch.
Note the patch caused the following test failure:
FAIL: gcc.target/i386/avx512fp16-13.c scan-assembler-times vmovdqa64[ \\t]+[^{\n]*%ymm[0-9]+[^\n]*\\) 1
FAIL: gcc.target/i386/avx512fp16-13.c scan-assembler-times vmovdqa64[ \\t]+[^{\n]*%xmm[0-9]+[^\n]*\\) 1
FAIL: gcc.target/i386/avx512fp16-13.c scan-assembler-times vmovdqa64[ \\t]+[^{\n]*%ymm[0-9]+[^\n]*\\) 1
FAIL: gcc.target/i386/avx512fp16-13.c scan-assembler-times vmovdqa64[ \\t]+[^{\n]*%xmm[0-9]+[^\n]*\\) 1
Martin
Thanks for the reminder. I will adjust the testcase, since the output for
128/256-bit HFmode loads has changed.

Martin Liška <mliska@suse.cz> wrote on Thu, Oct 21, 2021 at 8:49 PM:
>
> On 10/21/21 07:47, Hongyu Wang via Gcc-patches wrote:
> > Yes, updated patch.
>
> Note the patch caused the following test failure:
>
> FAIL: gcc.target/i386/avx512fp16-13.c scan-assembler-times vmovdqa64[ \\t]+[^{\n]*%ymm[0-9]+[^\n]*\\) 1
> FAIL: gcc.target/i386/avx512fp16-13.c scan-assembler-times vmovdqa64[ \\t]+[^{\n]*%xmm[0-9]+[^\n]*\\) 1
> FAIL: gcc.target/i386/avx512fp16-13.c scan-assembler-times vmovdqa64[ \\t]+[^{\n]*%ymm[0-9]+[^\n]*\\) 1
> FAIL: gcc.target/i386/avx512fp16-13.c scan-assembler-times vmovdqa64[ \\t]+[^{\n]*%xmm[0-9]+[^\n]*\\) 1
>
> Martin
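
To see why a scan written for `vmovdqa64` stops matching once the mnemonic changes, the failing pattern can be tried against sample lines with POSIX regular expressions. This is a rough sketch under stated assumptions: the two assembly lines are made up for illustration (the thread does not show the real compiler output), the Tcl pattern from the FAIL line is translated by hand into a POSIX extended regex, and `line_matches` is a hypothetical helper, not part of the GCC testsuite.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hand translation of the %ymm pattern from the FAIL lines into a POSIX
   extended regex.  The C escapes \t and \n become literal tab and
   newline characters inside the bracket expressions.  */
static const char ymm_scan[] = "vmovdqa64[ \t]+[^{\n]*%ymm[0-9]+[^\n]*\\)";

/* Hypothetical helper: true if LINE matches the extended regex PATTERN.  */
bool
line_matches (const char *pattern, const char *line)
{
  regex_t re;
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return false;
  bool matched = regexec (&re, line, 0, NULL, 0) == 0;
  regfree (&re);
  return matched;
}

/* Made-up assembly lines: the EVEX mnemonic matches, the plain one
   does not, so a scan-assembler-times count of 1 would drop to 0.  */
void
check_scan_examples (void)
{
  assert (line_matches (ymm_scan, "\tvmovdqa64\t%ymm0, (%rdi)"));
  assert (!line_matches (ymm_scan, "\tvmovdqa\t%ymm0, (%rdi)"));
}
```

If the 128/256-bit moves are now emitted with the plain mnemonic, as the reply above suggests, the existing scans would count zero matches, which is consistent with the reported failures.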
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 9cc903e826b..1d79180da9a 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -5399,9 +5399,12 @@ ix86_get_ssemov (rtx *operands, unsigned size,
       switch (scalar_mode)
         {
         case E_HFmode:
-          opcode = (misaligned_p
-                    ? (TARGET_AVX512BW ? "vmovdqu16" : "vmovdqu64")
-                    : "vmovdqa64");
+          if (!TARGET_AVX512F)
+            opcode = misaligned_p ? "%vmovdqu" : "%vmovdqa";
+          else
+            opcode = (misaligned_p
+                      ? (TARGET_AVX512BW ? "vmovdqu16" : "vmovdqu64")
+                      : "vmovdqa64");
           break;
         case E_SFmode:
           opcode = misaligned_p ? "%vmovups" : "%vmovaps";
diff --git a/gcc/testsuite/gcc.target/i386/pr102812.c b/gcc/testsuite/gcc.target/i386/pr102812.c
new file mode 100644
index 00000000000..bad4fa9394e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr102812.c
@@ -0,0 +1,12 @@
+/* PR target/102812 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse4 -mno-avx" } */
+/* { dg-final { scan-assembler-not "vmovdqa64\t" } } */
+/* { dg-final { scan-assembler "movdqa\t" } } */
+
+typedef _Float16 v8hf __attribute__((__vector_size__ (16)));
+
+v8hf t (_Float16 a)
+{
+  return (v8hf) {a, 0, 0, 0, 0, 0, 0, 0};
+}
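
The new test scans for `movdqa\t` rather than `vmovdqa\t` because it is compiled with `-mno-avx`, and the `%v` at the start of an i386 output template is, as far as I understand it, emitted as a `v` prefix only when AVX is enabled. The sketch below models that expansion outside of GCC; `expand_v_prefix` and its `target_avx` parameter are hypothetical names chosen for illustration and are not GCC functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model of the "%v" template prefix: write TMPL to OUT,
   turning a leading "%v" into "v" when TARGET_AVX is true and dropping
   it otherwise.  Templates without the prefix are copied unchanged.  */
void
expand_v_prefix (const char *tmpl, bool target_avx,
                 char *out, size_t out_size)
{
  if (strncmp (tmpl, "%v", 2) == 0)
    snprintf (out, out_size, "%s%s", target_avx ? "v" : "", tmpl + 2);
  else
    snprintf (out, out_size, "%s", tmpl);
}

/* Spot checks: the -mno-avx case from the test, and the AVX case.  */
void
check_v_prefix (void)
{
  char buf[32];

  expand_v_prefix ("%vmovdqa", false, buf, sizeof buf);
  assert (strcmp (buf, "movdqa") == 0);

  expand_v_prefix ("%vmovdqa", true, buf, sizeof buf);
  assert (strcmp (buf, "vmovdqa") == 0);
}
```

Under this model, a template such as `vmovdqa64` has no `%v` to expand and is emitted verbatim even without AVX, which is the wrong-code situation the PR describes.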