From patchwork Mon Oct 31 01:10:36 2022
X-Patchwork-Submitter: liuhongt
X-Patchwork-Id: 59638
From: liuhongt
To: gcc-patches@gcc.gnu.org
Subject: [PATCH V2] [x86] Fix incorrect digit constraint
Date: Mon, 31 Oct 2022 09:10:36 +0800
Message-Id: <20221031011036.1158443-1-hongtao.liu@intel.com>

>You have a couple of other patterns where operand 1 is matched to
>produce vmovddup insn. These are *avx512f_unpcklpd512 and
>avx_unpcklpd256. You can also remove expander in both
>cases.
Yes, changed in V2 patch.

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?

Matching constraints are used in these circumstances. More precisely,
the two operands that match must include one input-only operand and
one output-only operand. Moreover, the digit must be a smaller number
than the number of the operand that uses it in the constraint.

In PR107057, the two matched operands in the pattern are both input
operands, which violates this requirement.

gcc/ChangeLog:

	PR target/107057
	* config/i386/sse.md (*vec_interleave_highv2df): Remove
	constraint 1.
	(*vec_interleave_lowv2df): Ditto.
	(vec_concatv2df): Ditto.
	(*avx512f_unpcklpd512): Ditto and renamed to ..
	(avx512f_unpcklpd512): .. this.
	(avx512f_movddup512): Change to define_insn.
	(avx_movddup256): Ditto.
	(*avx_unpcklpd256): Remove constraint 1 and renamed to ..
	(avx_unpcklpd256): .. this.
	* config/i386/i386.cc (ix86_vec_interleave_v2df_operator_ok):
	Disallow MEM_P (op1) && MEM_P (op2).

gcc/testsuite/ChangeLog:

	* gcc.target/i386/pr107057.c: New test.
---
 gcc/config/i386/i386.cc                  |   2 +-
 gcc/config/i386/sse.md                   | 140 +++++++++--------------
 gcc/testsuite/gcc.target/i386/pr107057.c |  19 +++
 3 files changed, 77 insertions(+), 84 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr107057.c

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index aeea26ef4be..e3b7bea0d68 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -15652,7 +15652,7 @@ ix86_vec_interleave_v2df_operator_ok (rtx operands[3], bool high)
   if (MEM_P (operands[0]))
     return rtx_equal_p (operands[0], operands[1 + high]);
   if (MEM_P (operands[1]) && MEM_P (operands[2]))
-    return TARGET_SSE3 && rtx_equal_p (operands[1], operands[2]);
+    return false;
   return true;
 }
 
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index f4b5506703f..b7922521734 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -12170,107 +12170,88 @@ (define_expand "vec_interleave_highv2df"
 })
 
 (define_insn "*vec_interleave_highv2df"
-  [(set (match_operand:V2DF 0 "nonimmediate_operand" "=x,v,v,x,v,m")
+  [(set (match_operand:V2DF 0 "nonimmediate_operand" "=x,v,x,v,m")
         (vec_select:V2DF
           (vec_concat:V4DF
-            (match_operand:V2DF 1 "nonimmediate_operand" " 0,v,o,o,o,v")
-            (match_operand:V2DF 2 "nonimmediate_operand" " x,v,1,0,v,0"))
+            (match_operand:V2DF 1 "nonimmediate_operand" " 0,v,o,o,v")
+            (match_operand:V2DF 2 "nonimmediate_operand" " x,v,0,v,0"))
           (parallel [(const_int 1)
                      (const_int 3)])))]
   "TARGET_SSE2 && ix86_vec_interleave_v2df_operator_ok (operands, 1)"
   "@
    unpckhpd\t{%2, %0|%0, %2}
    vunpckhpd\t{%2, %1, %0|%0, %1, %2}
-   %vmovddup\t{%H1, %0|%0, %H1}
    movlpd\t{%H1, %0|%0, %H1}
    vmovlpd\t{%H1, %2, %0|%0, %2, %H1}
    %vmovhpd\t{%1, %0|%q0, %1}"
-  [(set_attr "isa" "noavx,avx,sse3,noavx,avx,*")
-   (set_attr "type" "sselog,sselog,sselog,ssemov,ssemov,ssemov")
+  [(set_attr "isa" "noavx,avx,noavx,avx,*")
+   (set_attr "type" "sselog,sselog,ssemov,ssemov,ssemov")
    (set (attr "prefix_data16")
-     (if_then_else (eq_attr "alternative" "3,5")
+     (if_then_else (eq_attr "alternative" "2,4")
        (const_string "1")
        (const_string "*")))
-   (set_attr "prefix" "orig,maybe_evex,maybe_vex,orig,maybe_evex,maybe_vex")
-   (set_attr "mode" "V2DF,V2DF,DF,V1DF,V1DF,V1DF")])
+   (set_attr "prefix" "orig,maybe_evex,orig,maybe_evex,maybe_vex")
+   (set_attr "mode" "V2DF,V2DF,V1DF,V1DF,V1DF")])
 
-(define_expand "avx512f_movddup512"
-  [(set (match_operand:V8DF 0 "register_operand")
+(define_insn "avx512f_movddup512"
+  [(set (match_operand:V8DF 0 "register_operand" "=v")
         (vec_select:V8DF
           (vec_concat:V16DF
-            (match_operand:V8DF 1 "nonimmediate_operand")
+            (match_operand:V8DF 1 "memory_operand" "m")
             (match_dup 1))
           (parallel [(const_int 0) (const_int 8)
                      (const_int 2) (const_int 10)
                      (const_int 4) (const_int 12)
                      (const_int 6) (const_int 14)])))]
-  "TARGET_AVX512F")
-
-(define_expand "avx512f_unpcklpd512"
-  [(set (match_operand:V8DF 0 "register_operand")
-        (vec_select:V8DF
-          (vec_concat:V16DF
-            (match_operand:V8DF 1 "register_operand")
-            (match_operand:V8DF 2 "nonimmediate_operand"))
-          (parallel [(const_int 0) (const_int 8)
-                     (const_int 2) (const_int 10)
-                     (const_int 4) (const_int 12)
-                     (const_int 6) (const_int 14)])))]
-  "TARGET_AVX512F")
+  "TARGET_AVX512F"
+  "vmovddup\t{%1, %0|%0, %1}"
+  [(set_attr "type" "sselog")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "V8DF")])
 
-(define_insn "*avx512f_unpcklpd512"
-  [(set (match_operand:V8DF 0 "register_operand" "=v,v")
+(define_insn "avx512f_unpcklpd512"
+  [(set (match_operand:V8DF 0 "register_operand" "=v")
         (vec_select:V8DF
           (vec_concat:V16DF
-            (match_operand:V8DF 1 "nonimmediate_operand" "vm, v")
-            (match_operand:V8DF 2 "nonimmediate_operand" "1 ,vm"))
+            (match_operand:V8DF 1 "register_operand" "v")
+            (match_operand:V8DF 2 "nonimmediate_operand" "vm"))
           (parallel [(const_int 0) (const_int 8)
                      (const_int 2) (const_int 10)
                      (const_int 4) (const_int 12)
                      (const_int 6) (const_int 14)])))]
   "TARGET_AVX512F"
-  "@
-   vmovddup\t{%1, %0|%0, %1}
-   vunpcklpd\t{%2, %1, %0|%0, %1, %2}"
+  "vunpcklpd\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sselog")
    (set_attr "prefix" "evex")
    (set_attr "mode" "V8DF")])
 
 ;; Recall that the 256-bit unpck insns only shuffle within their lanes.
-(define_expand "avx_movddup256"
-  [(set (match_operand:V4DF 0 "register_operand")
+(define_insn "avx_movddup256"
+  [(set (match_operand:V4DF 0 "register_operand" "=v")
         (vec_select:V4DF
           (vec_concat:V8DF
-            (match_operand:V4DF 1 "nonimmediate_operand")
+            (match_operand:V4DF 1 "memory_operand" "m")
             (match_dup 1))
           (parallel [(const_int 0) (const_int 4)
                      (const_int 2) (const_int 6)])))]
-  "TARGET_AVX && ")
-
-(define_expand "avx_unpcklpd256"
-  [(set (match_operand:V4DF 0 "register_operand")
-        (vec_select:V4DF
-          (vec_concat:V8DF
-            (match_operand:V4DF 1 "register_operand")
-            (match_operand:V4DF 2 "nonimmediate_operand"))
-          (parallel [(const_int 0) (const_int 4)
-                     (const_int 2) (const_int 6)])))]
-  "TARGET_AVX && ")
+  "TARGET_AVX && "
+  "vmovddup\t{%1, %0|%0, %1}"
+  [(set_attr "type" "sselog")
+   (set_attr "prefix" "")
+   (set_attr "mode" "V4DF")])
 
-(define_insn "*avx_unpcklpd256"
-  [(set (match_operand:V4DF 0 "register_operand" "=v,v")
+(define_insn "avx_unpcklpd256"
+  [(set (match_operand:V4DF 0 "register_operand" "=v")
         (vec_select:V4DF
           (vec_concat:V8DF
-            (match_operand:V4DF 1 "nonimmediate_operand" " v,m")
-            (match_operand:V4DF 2 "nonimmediate_operand" "vm,1"))
+            (match_operand:V4DF 1 "register_operand" " v")
+            (match_operand:V4DF 2 "nonimmediate_operand" "vm"))
           (parallel [(const_int 0) (const_int 4)
                      (const_int 2) (const_int 6)])))]
   "TARGET_AVX && "
-  "@
-   vunpcklpd\t{%2, %1, %0|%0, %1, %2}
-   vmovddup\t{%1, %0|%0, %1}"
+  "vunpcklpd\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sselog")
-   (set_attr "prefix" "vex")
+   (set_attr "prefix" "")
    (set_attr "mode" "V4DF")])
 
 (define_expand "vec_interleave_lowv4df"
@@ -12332,29 +12313,28 @@ (define_expand "vec_interleave_lowv2df"
 })
 
 (define_insn "*vec_interleave_lowv2df"
-  [(set (match_operand:V2DF 0 "nonimmediate_operand" "=x,v,v,x,v,o")
+  [(set (match_operand:V2DF 0 "nonimmediate_operand" "=x,v,x,v,o")
         (vec_select:V2DF
           (vec_concat:V4DF
-            (match_operand:V2DF 1 "nonimmediate_operand" " 0,v,m,0,v,0")
-            (match_operand:V2DF 2 "nonimmediate_operand" " x,v,1,m,m,v"))
+            (match_operand:V2DF 1 "nonimmediate_operand" " 0,v,0,v,0")
+            (match_operand:V2DF 2 "nonimmediate_operand" " x,v,m,m,v"))
           (parallel [(const_int 0)
                      (const_int 2)])))]
   "TARGET_SSE2 && ix86_vec_interleave_v2df_operator_ok (operands, 0)"
   "@
    unpcklpd\t{%2, %0|%0, %2}
    vunpcklpd\t{%2, %1, %0|%0, %1, %2}
-   %vmovddup\t{%1, %0|%0, %q1}
    movhpd\t{%2, %0|%0, %q2}
    vmovhpd\t{%2, %1, %0|%0, %1, %q2}
    %vmovlpd\t{%2, %H0|%H0, %2}"
-  [(set_attr "isa" "noavx,avx,sse3,noavx,avx,*")
-   (set_attr "type" "sselog,sselog,sselog,ssemov,ssemov,ssemov")
+  [(set_attr "isa" "noavx,avx,noavx,avx,*")
+   (set_attr "type" "sselog,sselog,ssemov,ssemov,ssemov")
    (set (attr "prefix_data16")
-     (if_then_else (eq_attr "alternative" "3,5")
+     (if_then_else (eq_attr "alternative" "2,4")
        (const_string "1")
        (const_string "*")))
-   (set_attr "prefix" "orig,maybe_evex,maybe_vex,orig,maybe_evex,maybe_vex")
-   (set_attr "mode" "V2DF,V2DF,DF,V1DF,V1DF,V1DF")])
+   (set_attr "prefix" "orig,maybe_evex,orig,maybe_evex,maybe_vex")
+   (set_attr "mode" "V2DF,V2DF,V1DF,V1DF,V1DF")])
 
 (define_split
   [(set (match_operand:V2DF 0 "memory_operand")
@@ -13560,56 +13540,50 @@ (define_insn "vec_dupv2df"
    (set_attr "mode" "V2DF,DF,DF")])
 
 (define_insn "vec_concatv2df"
-  [(set (match_operand:V2DF 0 "register_operand" "=x,x,v,x,v,x,x, v,x,x")
+  [(set (match_operand:V2DF 0 "register_operand" "=x,x,v,x,x, v,x,x")
         (vec_concat:V2DF
-          (match_operand:DF 1 "nonimmediate_operand" " 0,x,v,m,m,0,x,vm,0,0")
-          (match_operand:DF 2 "nonimm_or_0_operand" " x,x,v,1,1,m,m, C,x,m")))]
-  "TARGET_SSE
-   && (!(MEM_P (operands[1]) && MEM_P (operands[2]))
-       || (TARGET_SSE3 && rtx_equal_p (operands[1], operands[2])))"
+          (match_operand:DF 1 "nonimmediate_operand" " 0,x,v,0,x,vm,0,0")
+          (match_operand:DF 2 "nonimm_or_0_operand" " x,x,v,m,m, C,x,m")))]
+  "TARGET_SSE && !(MEM_P (operands[1]) && MEM_P (operands[2]))"
   "@
    unpcklpd\t{%2, %0|%0, %2}
    vunpcklpd\t{%2, %1, %0|%0, %1, %2}
    vunpcklpd\t{%2, %1, %0|%0, %1, %2}
-   %vmovddup\t{%1, %0|%0, %1}
-   vmovddup\t{%1, %0|%0, %1}
    movhpd\t{%2, %0|%0, %2}
    vmovhpd\t{%2, %1, %0|%0, %1, %2}
    %vmovq\t{%1, %0|%0, %1}
    movlhps\t{%2, %0|%0, %2}
    movhps\t{%2, %0|%0, %2}"
   [(set (attr "isa")
-     (cond [(eq_attr "alternative" "0,5")
+     (cond [(eq_attr "alternative" "0,3")
               (const_string "sse2_noavx")
-            (eq_attr "alternative" "1,6")
+            (eq_attr "alternative" "1,4")
               (const_string "avx")
-            (eq_attr "alternative" "2,4")
+            (eq_attr "alternative" "2")
               (const_string "avx512vl")
-            (eq_attr "alternative" "3")
-              (const_string "sse3")
-            (eq_attr "alternative" "7")
+            (eq_attr "alternative" "5")
               (const_string "sse2")
            ]
            (const_string "noavx")))
    (set (attr "type")
      (if_then_else
-       (eq_attr "alternative" "0,1,2,3,4")
+       (eq_attr "alternative" "0,1,2")
        (const_string "sselog")
        (const_string "ssemov")))
    (set (attr "prefix_data16")
-     (if_then_else (eq_attr "alternative" "5")
+     (if_then_else (eq_attr "alternative" "3")
        (const_string "1")
        (const_string "*")))
    (set (attr "prefix")
-     (cond [(eq_attr "alternative" "1,6")
+     (cond [(eq_attr "alternative" "1,4")
               (const_string "vex")
-            (eq_attr "alternative" "2,4")
+            (eq_attr "alternative" "2")
               (const_string "evex")
-            (eq_attr "alternative" "3,7")
+            (eq_attr "alternative" "5")
               (const_string "maybe_vex")
            ]
            (const_string "orig")))
-   (set_attr "mode" "V2DF,V2DF,V2DF, DF, DF, V1DF,V1DF,DF,V4SF,V2SF")])
+   (set_attr "mode" "V2DF,V2DF,V2DF,V1DF,V1DF,DF,V4SF,V2SF")])
 
 ;; vmovq clears also the higher bits.
 (define_insn "vec_set_0"
diff --git a/gcc/testsuite/gcc.target/i386/pr107057.c b/gcc/testsuite/gcc.target/i386/pr107057.c
new file mode 100644
index 00000000000..40b49ac21ec
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr107057.c
@@ -0,0 +1,19 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-mavx -mcmodel=large -O3" } */
+
+typedef double v2df __attribute__ ((vector_size (16)));
+v2df f (double a, double b)
+{
+  v2df v;
+  double *c = (double *)&v;
+  *c = a;
+  *(c+1) = b;
+  return v;
+}
+void g ()
+{
+  v2df x = f (1.0, 1.0);
+  v2df y = f (2.0, 2.0);
+  for (;*(double *)&x<=8; x+=y)
+    g ();
+}
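
For readers less familiar with the matching-constraint rule quoted in the commit message, here is a minimal sketch of a valid use in GNU C extended asm, which shares the constraint syntax with machine descriptions. It is not part of the patch: the function name tie_input_to_output is hypothetical, and the example assumes a GCC-compatible compiler. It pairs one output-only operand with one input-only operand, which is the shape the removed alternatives did not have because they tied two input operands together.

```c
/* Hypothetical helper, not part of the patch: shows a valid matching
   (digit) constraint in GNU C extended asm.  */
int
tie_input_to_output (int in)
{
  int out;

  /* Operand 0 is output-only ("=r").  Operand 1 is input-only and uses
     the digit constraint "0", so it must occupy the same register as
     operand 0.  The digit (0) is smaller than the number of the operand
     that uses it (1).  The asm template is empty, so OUT simply ends up
     holding the value that was placed in the shared register for IN.  */
  __asm__ ("" : "=r" (out) : "0" (in));
  return out;
}
```

Calling tie_input_to_output (42) should return 42, since no instruction modifies the shared register between the input and the output.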