From patchwork Thu Sep 23 09:15:45 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hongyu Wang
X-Patchwork-Id: 45347
To: hongtao.liu@intel.com
Subject: [PATCH] AVX512FP16: Support cond_op for HFmode
Date: Thu, 23 Sep 2021 17:15:45 +0800
Message-Id: <20210923091545.57315-1-hongyu.wang@intel.com>
X-Mailer: git-send-email 2.18.1
From: Hongyu Wang
Reply-To: Hongyu Wang
Cc: gcc-patches@gcc.gnu.org

Hi,

This patch extends the cond_op expanders to support vector HF modes.

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
OK for master?

gcc/ChangeLog:

	* config/i386/sse.md (cond_): Extend to support vector HFmodes.
	(cond_mul): Likewise.
	(cond_div): Likewise.
	(cond_): Likewise.
	(cond_fma): Likewise.
	(cond_fms): Likewise.
	(cond_fnma): Likewise.
	(cond_fnms): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/cond_op_addsubmuldiv__Float16-1.c: New test.
	* gcc.target/i386/cond_op_addsubmuldiv__Float16-2.c: Ditto.
	* gcc.target/i386/cond_op_fma__Float16-1.c: Ditto.
	* gcc.target/i386/cond_op_fma__Float16-2.c: Ditto.
	* gcc.target/i386/cond_op_maxmin__Float16-1.c: Ditto.
	* gcc.target/i386/cond_op_maxmin__Float16-2.c: Ditto.
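For readers less familiar with cond_op, the vec_merge form used by these expanders can be read lane by lane: where the mask (operand 1) is set, the lane receives the result of the operation on operands 2 and 3; elsewhere it takes the fallback value (operand 4). Below is a rough scalar sketch of that reading for the add case. The names cond_add_model and cond_add_selfcheck, and the float default for TYPE, are illustrative assumptions and not part of the patch (the tests pass -DTYPE=_Float16).

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: default to float so the sketch builds on any C compiler.
   The patch's tests select the half-precision type with -DTYPE=_Float16.  */
#ifndef TYPE
#define TYPE float
#endif

/* Scalar model of a masked add: active lanes get a[i] + b[i],
   inactive lanes keep the fallback value.  Hypothetical helper.  */
void
cond_add_model (TYPE *dst, const unsigned char *mask,
                const TYPE *a, const TYPE *b,
                const TYPE *fallback, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = mask[i] ? a[i] + b[i] : fallback[i];
}

/* Lanes 0 and 2 are active; lanes 1 and 3 keep the fallback.  */
void
cond_add_selfcheck (void)
{
  const unsigned char mask[4] = { 1, 0, 1, 0 };
  const TYPE a[4] = { 1, 2, 3, 4 };
  const TYPE b[4] = { 10, 20, 30, 40 };
  const TYPE fallback[4] = { -1, -2, -3, -4 };
  TYPE dst[4];

  cond_add_model (dst, mask, a, b, fallback, 4);
  assert (dst[0] == 11 && dst[1] == -2);
  assert (dst[2] == 33 && dst[3] == -4);
}
```

The sub, mul, div, max and min expanders follow the same shape with a different operation in place of the addition.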
---
 gcc/config/i386/sse.md                        | 112 +++++++++---------
 .../i386/cond_op_addsubmuldiv__Float16-1.c    |   9 ++
 .../i386/cond_op_addsubmuldiv__Float16-2.c    |   7 ++
 .../gcc.target/i386/cond_op_fma__Float16-1.c  |  20 ++++
 .../gcc.target/i386/cond_op_fma__Float16-2.c  |   7 ++
 .../i386/cond_op_maxmin__Float16-1.c          |   8 ++
 .../i386/cond_op_maxmin__Float16-2.c          |   6 +
 7 files changed, 113 insertions(+), 56 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/cond_op_addsubmuldiv__Float16-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/cond_op_addsubmuldiv__Float16-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/cond_op_fma__Float16-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/cond_op_fma__Float16-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/cond_op_maxmin__Float16-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/cond_op_maxmin__Float16-2.c

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 1ca95984afc..c2eeb7b1517 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -2118,12 +2118,12 @@
    [(set_attr "isa" "noavx,noavx,avx,avx")])
 
 (define_expand "cond_"
-  [(set (match_operand:VF 0 "register_operand")
-        (vec_merge:VF
-          (plusminus:VF
-            (match_operand:VF 2 "vector_operand")
-            (match_operand:VF 3 "vector_operand"))
-          (match_operand:VF 4 "nonimm_or_0_operand")
+  [(set (match_operand:VFH 0 "register_operand")
+        (vec_merge:VFH
+          (plusminus:VFH
+            (match_operand:VFH 2 "vector_operand")
+            (match_operand:VFH 3 "vector_operand"))
+          (match_operand:VFH 4 "nonimm_or_0_operand")
           (match_operand: 1 "register_operand")))]
   " == 64 || TARGET_AVX512VL"
 {
@@ -2207,12 +2207,12 @@
    (set_attr "mode" "")])
 
 (define_expand "cond_mul"
-  [(set (match_operand:VF 0 "register_operand")
-        (vec_merge:VF
-          (mult:VF
-            (match_operand:VF 2 "vector_operand")
-            (match_operand:VF 3 "vector_operand"))
-          (match_operand:VF 4 "nonimm_or_0_operand")
+  [(set (match_operand:VFH 0 "register_operand")
+        (vec_merge:VFH
+          (mult:VFH
+            (match_operand:VFH 2 "vector_operand")
+            (match_operand:VFH 3 "vector_operand"))
+          (match_operand:VFH 4 "nonimm_or_0_operand")
           (match_operand: 1 "register_operand")))]
   " == 64 || TARGET_AVX512VL"
 {
@@ -2322,12 +2322,12 @@
 })
 
 (define_expand "cond_div"
-  [(set (match_operand:VF 0 "register_operand")
-        (vec_merge:VF
-          (div:VF
-            (match_operand:VF 2 "register_operand")
-            (match_operand:VF 3 "vector_operand"))
-          (match_operand:VF 4 "nonimm_or_0_operand")
+  [(set (match_operand:VFH 0 "register_operand")
+        (vec_merge:VFH
+          (div:VFH
+            (match_operand:VFH 2 "register_operand")
+            (match_operand:VFH 3 "vector_operand"))
+          (match_operand:VFH 4 "nonimm_or_0_operand")
           (match_operand: 1 "register_operand")))]
   " == 64 || TARGET_AVX512VL"
 {
@@ -2660,12 +2660,12 @@
    (set_attr "mode" "HF")])
 
 (define_expand "cond_"
-  [(set (match_operand:VF 0 "register_operand")
-        (vec_merge:VF
-          (smaxmin:VF
-            (match_operand:VF 2 "vector_operand")
-            (match_operand:VF 3 "vector_operand"))
-          (match_operand:VF 4 "nonimm_or_0_operand")
+  [(set (match_operand:VFH 0 "register_operand")
+        (vec_merge:VFH
+          (smaxmin:VFH
+            (match_operand:VFH 2 "vector_operand")
+            (match_operand:VFH 3 "vector_operand"))
+          (match_operand:VFH 4 "nonimm_or_0_operand")
           (match_operand: 1 "register_operand")))]
   " == 64 || TARGET_AVX512VL"
 {
@@ -4785,13 +4785,13 @@
    (set_attr "mode" "")])
 
 (define_expand "cond_fma"
-  [(set (match_operand:VF_AVX512VL 0 "register_operand")
-        (vec_merge:VF_AVX512VL
-          (fma:VF_AVX512VL
-            (match_operand:VF_AVX512VL 2 "vector_operand")
-            (match_operand:VF_AVX512VL 3 "vector_operand")
-            (match_operand:VF_AVX512VL 4 "vector_operand"))
-          (match_operand:VF_AVX512VL 5 "nonimm_or_0_operand")
+  [(set (match_operand:VFH_AVX512VL 0 "register_operand")
+        (vec_merge:VFH_AVX512VL
+          (fma:VFH_AVX512VL
+            (match_operand:VFH_AVX512VL 2 "vector_operand")
+            (match_operand:VFH_AVX512VL 3 "vector_operand")
+            (match_operand:VFH_AVX512VL 4 "vector_operand"))
+          (match_operand:VFH_AVX512VL 5 "nonimm_or_0_operand")
           (match_operand: 1 "register_operand")))]
   "TARGET_AVX512F"
 {
@@ -4885,14 +4885,14 @@
    (set_attr "mode" "")])
 
 (define_expand "cond_fms"
-  [(set (match_operand:VF_AVX512VL 0 "register_operand")
-        (vec_merge:VF_AVX512VL
-          (fma:VF_AVX512VL
-            (match_operand:VF_AVX512VL 2 "vector_operand")
-            (match_operand:VF_AVX512VL 3 "vector_operand")
-            (neg:VF_AVX512VL
-              (match_operand:VF_AVX512VL 4 "vector_operand")))
-          (match_operand:VF_AVX512VL 5 "nonimm_or_0_operand")
+  [(set (match_operand:VFH_AVX512VL 0 "register_operand")
+        (vec_merge:VFH_AVX512VL
+          (fma:VFH_AVX512VL
+            (match_operand:VFH_AVX512VL 2 "vector_operand")
+            (match_operand:VFH_AVX512VL 3 "vector_operand")
+            (neg:VFH_AVX512VL
+              (match_operand:VFH_AVX512VL 4 "vector_operand")))
+          (match_operand:VFH_AVX512VL 5 "nonimm_or_0_operand")
           (match_operand: 1 "register_operand")))]
   "TARGET_AVX512F"
 {
@@ -4988,14 +4988,14 @@
    (set_attr "mode" "")])
 
 (define_expand "cond_fnma"
-  [(set (match_operand:VF_AVX512VL 0 "register_operand")
-        (vec_merge:VF_AVX512VL
-          (fma:VF_AVX512VL
-            (neg:VF_AVX512VL
-              (match_operand:VF_AVX512VL 2 "vector_operand"))
-            (match_operand:VF_AVX512VL 3 "vector_operand")
-            (match_operand:VF_AVX512VL 4 "vector_operand"))
-          (match_operand:VF_AVX512VL 5 "nonimm_or_0_operand")
+  [(set (match_operand:VFH_AVX512VL 0 "register_operand")
+        (vec_merge:VFH_AVX512VL
+          (fma:VFH_AVX512VL
+            (neg:VFH_AVX512VL
+              (match_operand:VFH_AVX512VL 2 "vector_operand"))
+            (match_operand:VFH_AVX512VL 3 "vector_operand")
+            (match_operand:VFH_AVX512VL 4 "vector_operand"))
+          (match_operand:VFH_AVX512VL 5 "nonimm_or_0_operand")
           (match_operand: 1 "register_operand")))]
   "TARGET_AVX512F"
 {
@@ -5093,15 +5093,15 @@
    (set_attr "mode" "")])
 
 (define_expand "cond_fnms"
-  [(set (match_operand:VF_AVX512VL 0 "register_operand")
-        (vec_merge:VF_AVX512VL
-          (fma:VF_AVX512VL
-            (neg:VF_AVX512VL
-              (match_operand:VF_AVX512VL 2 "vector_operand"))
-            (match_operand:VF_AVX512VL 3 "vector_operand")
-            (neg:VF_AVX512VL
-              (match_operand:VF_AVX512VL 4 "vector_operand")))
-          (match_operand:VF_AVX512VL 5 "nonimm_or_0_operand")
+  [(set (match_operand:VFH_AVX512VL 0 "register_operand")
+        (vec_merge:VFH_AVX512VL
+          (fma:VFH_AVX512VL
+            (neg:VFH_AVX512VL
+              (match_operand:VFH_AVX512VL 2 "vector_operand"))
+            (match_operand:VFH_AVX512VL 3 "vector_operand")
+            (neg:VFH_AVX512VL
+              (match_operand:VFH_AVX512VL 4 "vector_operand")))
+          (match_operand:VFH_AVX512VL 5 "nonimm_or_0_operand")
           (match_operand: 1 "register_operand")))]
   "TARGET_AVX512F"
 {
diff --git a/gcc/testsuite/gcc.target/i386/cond_op_addsubmuldiv__Float16-1.c b/gcc/testsuite/gcc.target/i386/cond_op_addsubmuldiv__Float16-1.c
new file mode 100644
index 00000000000..b503b75d548
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cond_op_addsubmuldiv__Float16-1.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=sapphirerapids -DTYPE=_Float16 -fdump-tree-vect" } */
+/* { dg-final { scan-tree-dump ".COND_ADD" "vect" } } */
+/* { dg-final { scan-tree-dump ".COND_SUB" "vect" } } */
+/* { dg-final { scan-tree-dump ".COND_MUL" "vect" } } */
+/* { dg-final { scan-tree-dump ".COND_RDIV" "vect" } } */
+
+#include "cond_op_addsubmuldiv_double-1.c"
+
diff --git a/gcc/testsuite/gcc.target/i386/cond_op_addsubmuldiv__Float16-2.c b/gcc/testsuite/gcc.target/i386/cond_op_addsubmuldiv__Float16-2.c
new file mode 100644
index 00000000000..e8397bbc5b1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cond_op_addsubmuldiv__Float16-2.c
@@ -0,0 +1,7 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -mavx512fp16 -mavx512vl -mprefer-vector-width=256 -DTYPE=_Float16" } */
+/* { dg-require-effective-target avx512vl } */
+/* { dg-require-effective-target avx512fp16 } */
+
+#define AVX512FP16
+#include "cond_op_addsubmuldiv_double-2.c"
diff --git a/gcc/testsuite/gcc.target/i386/cond_op_fma__Float16-1.c b/gcc/testsuite/gcc.target/i386/cond_op_fma__Float16-1.c
new file mode 100644
index 00000000000..9ea45d690e2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cond_op_fma__Float16-1.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=sapphirerapids -DTYPE=_Float16 -fdump-tree-optimized -D__BUILTIN_FMA=__builtin_fmaf16" } */
+/* { dg-final { scan-tree-dump-times ".COND_FMA" 3 "optimized" } } */
+/* { dg-final { scan-tree-dump-times ".COND_FNMA" 3 "optimized" } } */
+/* { dg-final { scan-tree-dump-times ".COND_FMS" 3 "optimized" } } */
+/* { dg-final { scan-tree-dump-times ".COND_FNMS" 3 "optimized" } } */
+/* { dg-final { scan-assembler-times "vfmadd132ph\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vfnmadd132ph\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vfmsub132ph\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vfnmsub132ph\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vfmadd231ph\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vfnmadd231ph\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vfmsub231ph\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vfnmsub231ph\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vfmadd132ph\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vfnmadd132ph\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vfmsub132ph\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vfnmsub132ph\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+
+#include "cond_op_fma_double-1.c"
diff --git a/gcc/testsuite/gcc.target/i386/cond_op_fma__Float16-2.c b/gcc/testsuite/gcc.target/i386/cond_op_fma__Float16-2.c
new file mode 100644
index 00000000000..b7ee1cb8c95
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cond_op_fma__Float16-2.c
@@ -0,0 +1,7 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -mavx512fp16 -mavx512vl -mprefer-vector-width=256 -DTYPE=_Float16 -D__BUILTIN_FMA=__builtin_fmaf16 -DNUM=100" } */
+/* { dg-require-effective-target avx512fp16 } */
+/* { dg-require-effective-target avx512vl } */
+
+#define AVX512FP16
+#include "cond_op_fma_double-2.c"
diff --git a/gcc/testsuite/gcc.target/i386/cond_op_maxmin__Float16-1.c b/gcc/testsuite/gcc.target/i386/cond_op_maxmin__Float16-1.c
new file mode 100644
index 00000000000..b09410248f0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cond_op_maxmin__Float16-1.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=sapphirerapids -DTYPE=_Float16 -fdump-tree-optimized -DFN_MAX=__builtin_fmaxf16 -DFN_MIN=__builtin_fminf16" } */
+/* { dg-final { scan-tree-dump ".COND_MAX" "optimized" } } */
+/* { dg-final { scan-tree-dump ".COND_MIN" "optimized" } } */
+/* { dg-final { scan-assembler-times "vmaxph" 1 } } */
+/* { dg-final { scan-assembler-times "vminph" 1 } } */
+
+#include "cond_op_maxmin_double-1.c"
diff --git a/gcc/testsuite/gcc.target/i386/cond_op_maxmin__Float16-2.c b/gcc/testsuite/gcc.target/i386/cond_op_maxmin__Float16-2.c
new file mode 100644
index 00000000000..b67adc8b2d3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cond_op_maxmin__Float16-2.c
@@ -0,0 +1,6 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -mavx512fp16 -mavx512vl -mprefer-vector-width=256 -DTYPE=_Float16 -DFN_MAX=__builtin_fmaxf16 -DFN_MIN=__builtin_fminf16 -ffast-math" } */
+/* { dg-require-effective-target avx512vl } */
+/* { dg-require-effective-target avx512fp16 } */
+
+#include "cond_op_maxmin_double-2.c"
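
As a reading aid for the four FMA expanders in this patch: they differ only in which operands the RTL wraps in neg. The scalar sketch below models those sign conventions and the mask/fallback selection. It is an illustration under stated assumptions: plain a * b + c stands in for a fused operation (it rounds twice, a real FMA rounds once), and the enum and function names are hypothetical, not taken from the patch or its tests.

```c
#include <assert.h>

/* Hypothetical tags for the four expanders: fma, fms, fnma, fnms.  */
enum fma_kind { KIND_FMA, KIND_FMS, KIND_FNMA, KIND_FNMS };

/* Sign conventions read off the RTL in the patch.  Not fused: this
   only models which operands are negated, not the rounding.  */
double
fma_variant (enum fma_kind kind, double a, double b, double c)
{
  switch (kind)
    {
    case KIND_FMA:
      return a * b + c;       /* (fma a b c) */
    case KIND_FMS:
      return a * b - c;       /* (fma a b (neg c)) */
    case KIND_FNMA:
      return -a * b + c;      /* (fma (neg a) b c) */
    default:
      return -a * b - c;      /* (fma (neg a) b (neg c)) */
    }
}

/* One lane of the conditional form: an active lane gets the FMA
   result, an inactive lane gets the fallback (operand 5).  */
double
cond_fma_variant (int active, enum fma_kind kind,
                  double a, double b, double c, double fallback)
{
  return active ? fma_variant (kind, a, b, c) : fallback;
}

/* Checks the four variants with a = 2, b = 3, c = 4.  */
void
fma_variant_selfcheck (void)
{
  assert (fma_variant (KIND_FMA, 2, 3, 4) == 10);
  assert (fma_variant (KIND_FMS, 2, 3, 4) == 2);
  assert (fma_variant (KIND_FNMA, 2, 3, 4) == -2);
  assert (fma_variant (KIND_FNMS, 2, 3, 4) == -10);
}
```

Operand 5 accepts a zero constant (nonimm_or_0_operand), so a fallback of 0 in this model presumably corresponds to the zero-masking {z} forms that cond_op_fma__Float16-1.c scans for, and a non-zero fallback to the merge-masking forms.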