From patchwork Tue Dec 19 05:38:53 2023
X-Patchwork-Submitter: liuhongt
X-Patchwork-Id: 82428
From: liuhongt
To: gcc-patches@gcc.gnu.org
Cc: crazylht@gmail.com, hjl.tools@gmail.com
Subject: [PATCH] Optimize A < B ? A : B to MIN_EXPR.
Date: Tue, 19 Dec 2023 13:38:53 +0800
Message-Id: <20231219053853.3764283-1-hongtao.liu@intel.com>
X-Mailer: git-send-email 2.31.1

Similarly, optimize A < B ? B : A to MAX_EXPR.

There is code in the front end to optimize such patterns, but it fails to
handle the testcase in the PR, because there the pattern is only exposed
at the GIMPLE level, when backend builtins are folded. The new match.pd
patterns apply only to integer vector types for which the target supports
the corresponding min/max operation.

pr95906 can now be optimized to MAX_EXPR, as the comment in the testcase
asks for:

// FIXME: this should further optimize to a MAX_EXPR
typedef signed char v16i8 __attribute__((vector_size(16)));
v16i8 f(v16i8 a, v16i8 b)

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk? (or maybe wait for GCC 15).

gcc/ChangeLog:

	PR target/104401
	* match.pd (A < B ? A : B -> MIN_EXPR): New pattern match.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/pr104401.c: New test.
	* gcc.dg/tree-ssa/pr95906.c: Adjust testcase.
---
 gcc/match.pd                             | 20 ++++++++++++++++++
 gcc/testsuite/gcc.dg/tree-ssa/pr95906.c  |  3 +--
 gcc/testsuite/gcc.target/i386/pr104401.c | 27 ++++++++++++++++++++++++
 3 files changed, 48 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr104401.c

diff --git a/gcc/match.pd b/gcc/match.pd
index d57e29bfe1d..9584a70aa3d 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -5263,6 +5263,26 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
        (view_convert:type
         (vec_cond @4 (view_convert:vtype @2) (view_convert:vtype @3)))))))
 
+/* Optimize A < B ? A : B to MIN (A, B)
+            A > B ? A : B to MAX (A, B). */
+(for cmp (lt le gt ge)
+     minmax (min min max max)
+     MINMAX (MIN_EXPR MIN_EXPR MAX_EXPR MAX_EXPR)
+ (simplify
+  (vec_cond (cmp @0 @1) @0 @1)
+   (if (VECTOR_INTEGER_TYPE_P (type)
+        && target_supports_op_p (type, MINMAX, optab_vector))
+    (minmax @0 @1))))
+
+(for cmp (lt le gt ge)
+     minmax (max max min min)
+     MINMAX (MAX_EXPR MAX_EXPR MIN_EXPR MIN_EXPR)
+ (simplify
+  (vec_cond (cmp @0 @1) @1 @0)
+   (if (VECTOR_INTEGER_TYPE_P (type)
+        && target_supports_op_p (type, MINMAX, optab_vector))
+    (minmax @0 @1))))
+
 /* c1 ? c2 ? a : b : b --> (c1 & c2) ? a : b */
 (simplify
  (vec_cond @0 (vec_cond:s @1 @2 @3) @3)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr95906.c b/gcc/testsuite/gcc.dg/tree-ssa/pr95906.c
index 3d820a58e93..d15670f3e9e 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr95906.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr95906.c
@@ -1,7 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-forwprop3-raw -w -Wno-psabi" } */
 
-// FIXME: this should further optimize to a MAX_EXPR
 typedef signed char v16i8 __attribute__((vector_size(16)));
 v16i8 f(v16i8 a, v16i8 b)
 {
@@ -10,4 +9,4 @@ v16i8 f(v16i8 a, v16i8 b)
 }
 
 /* { dg-final { scan-tree-dump-not "bit_(and|ior)_expr" "forwprop3" } } */
-/* { dg-final { scan-tree-dump-times "vec_cond_expr" 1 "forwprop3" } } */
+/* { dg-final { scan-tree-dump-times "max_expr" 1 "forwprop3" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr104401.c b/gcc/testsuite/gcc.target/i386/pr104401.c
new file mode 100644
index 00000000000..8ce7ff88d9e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr104401.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse4.1" } */
+/* { dg-final { scan-assembler-times "pminsd" 2 } } */
+/* { dg-final { scan-assembler-times "pmaxsd" 2 } } */
+
+#include 
+
+__m128i min32(__m128i value, __m128i input)
+{
+  return _mm_blendv_epi8(input, value, _mm_cmplt_epi32(value, input));
+}
+
+__m128i max32(__m128i value, __m128i input)
+{
+  return _mm_blendv_epi8(input, value, _mm_cmpgt_epi32(value, input));
+}
+
+__m128i min32_1(__m128i value, __m128i input)
+{
+  return _mm_blendv_epi8(input, value, _mm_cmpgt_epi32(input, value));
+}
+
+__m128i max32_1(__m128i value, __m128i input)
+{
+  return _mm_blendv_epi8(input, value, _mm_cmplt_epi32(input, value));
+}
+
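
Not part of the patch: as a rough illustration of the equivalence the new
match.pd patterns rely on, the sketch below writes A < B ? A : B and
A < B ? B : A as mask-and-blend operations using GCC's generic vector
extensions, then compares each lane against the scalar min and max. The
names v4si, select_min, select_max, matches_scalar_minmax and
check_examples are hypothetical; they do not appear in the patch or in the
testsuite.

```c
#include <assert.h>

/* Hypothetical type and function names, for illustration only.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* A < B ? A : B as a blend.  A vector comparison yields -1 (all bits
   set) in lanes where it holds and 0 in the others.  */
static v4si
select_min (v4si a, v4si b)
{
  v4si mask = a < b;
  return (a & mask) | (b & ~mask);
}

/* A < B ? B : A, the MAX counterpart with the arms swapped.  */
static v4si
select_max (v4si a, v4si b)
{
  v4si mask = a < b;
  return (b & mask) | (a & ~mask);
}

/* Return 1 if every lane of both blends equals the scalar min/max.  */
static int
matches_scalar_minmax (v4si a, v4si b)
{
  v4si lo = select_min (a, b);
  v4si hi = select_max (a, b);
  for (int i = 0; i < 4; i++)
    {
      int smin = a[i] < b[i] ? a[i] : b[i];
      int smax = a[i] < b[i] ? b[i] : a[i];
      if (lo[i] != smin || hi[i] != smax)
        return 0;
    }
  return 1;
}

/* Exercise a few lanes, including equal and negative values.  */
static void
check_examples (void)
{
  v4si a = { 1, -5, 7, 0 };
  v4si b = { 2, -9, 7, -1 };
  assert (matches_scalar_minmax (a, b));
  assert (matches_scalar_minmax (b, a));
}
```

Calling check_examples () from a small driver is enough to run the check.
Whether a blend of this shape actually ends up as MIN_EXPR or MAX_EXPR
depends on earlier folding producing a VEC_COND_EXPR and on the target
supporting the operation for the vector type.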