From patchwork Tue Oct 29 08:25:19 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: "Li, Pan2"
X-Patchwork-Id: 99746
From: pan2.li@intel.com
To: gcc-patches@gcc.gnu.org
Cc: richard.guenther@gmail.com, Tamar.Christina@arm.com, juzhe.zhong@rivai.ai, kito.cheng@gmail.com, jeffreyalaw@gmail.com, rdapp.gcc@gmail.com, Pan Li
Subject: [PATCH 3/5] Match: Simplify branch form 8 of unsigned SAT_ADD into branchless
Date: Tue, 29 Oct 2024 16:25:19 +0800
Message-ID: <20241029082521.3638409-3-pan2.li@intel.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20241029082521.3638409-1-pan2.li@intel.com>
References: <20241029082521.3638409-1-pan2.li@intel.com>

From: Pan Li

There are several forms of the unsigned SAT_ADD.  Some of them are
complicated while others are cheap.  This patch simplifies the
complicated branch form into the cheap branchless one.  For example:

From the form 8 (branch):
  SAT_U_ADD = x > (T)(x + y) ? -1 : (x + y).

To (branchless):
  SAT_U_ADD = (X + Y) | - ((X + Y) < X).

  #define T uint8_t

  T sat_add_u_1 (T x, T y)
  {
    return x > (T)(x + y) ? -1 : (x + y);
  }

Before this patch:
  uint8_t sat_add_u_1 (uint8_t x, uint8_t y)
  {
    uint8_t D.2809;

    _1 = x + y;
    if (x <= _1) goto ; else goto ;
    :
    D.2809 = x + y;
    goto ;
    :
    D.2809 = 255;
    :
    return D.2809;
  }

After this patch:
  uint8_t sat_add_u_1 (uint8_t x, uint8_t y)
  {
    uint8_t D.2809;

    _1 = x + y;
    _2 = x + y;
    _3 = x > _2;
    _4 = (unsigned char) _3;
    _5 = -_4;
    D.2809 = _1 | _5;
    return D.2809;
  }

The simplification does not need to check whether the target supports
SAT_ADD, because it is an optimization at the GIMPLE level.

The following test suites passed for this patch:
* The rv64gcv full regression test.
* The x86 bootstrap test.
* The x86 full regression test.

gcc/ChangeLog:

	* match.pd: Remove unsigned branch form 8 for SAT_ADD, and
	add simplify to branchless instead.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/sat_u_add-simplify-4-u16.c: New test.
	* gcc.dg/tree-ssa/sat_u_add-simplify-4-u32.c: New test.
	* gcc.dg/tree-ssa/sat_u_add-simplify-4-u64.c: New test.
	* gcc.dg/tree-ssa/sat_u_add-simplify-4-u8.c: New test.

Signed-off-by: Pan Li
---
 gcc/match.pd                                    | 13 ++++++++-----
 .../gcc.dg/tree-ssa/sat_u_add-simplify-4-u16.c  | 15 +++++++++++++++
 .../gcc.dg/tree-ssa/sat_u_add-simplify-4-u32.c  | 15 +++++++++++++++
 .../gcc.dg/tree-ssa/sat_u_add-simplify-4-u64.c  | 15 +++++++++++++++
 .../gcc.dg/tree-ssa/sat_u_add-simplify-4-u8.c   | 15 +++++++++++++++
 5 files changed, 68 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/sat_u_add-simplify-4-u16.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/sat_u_add-simplify-4-u32.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/sat_u_add-simplify-4-u64.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/sat_u_add-simplify-4-u8.c

diff --git a/gcc/match.pd b/gcc/match.pd
index d871fb8c24e..7105aedb40c 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3170,6 +3170,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
       && types_match (type, @0, @1))
   (bit_ior @2 (negate (convert (lt @2 @0))))))
 
+/* Simplify SAT_U_ADD to the cheap form
+   From: SAT_U_ADD = x > (X + Y) ? -1 : (X + Y).
+   To:   SAT_U_ADD = (X + Y) | - ((X + Y) < X).  */
+(simplify (cond (gt @0 (plus:c@2 @0 @1)) integer_minus_onep @2)
+ (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
+      && types_match (type, @0, @1))
+  (bit_ior @2 (negate (convert (lt @2 @0))))))
+
 /* Unsigned saturation add, case 5 (branch with eq .ADD_OVERFLOW):
    SAT_U_ADD = REALPART_EXPR <.ADD_OVERFLOW> == 0 ? .ADD_OVERFLOW : -1.  */
 (match (unsigned_integer_sat_add @0 @1)
@@ -3182,11 +3190,6 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (cond^ (ne (imagpart (IFN_ADD_OVERFLOW:c @0 @1)) integer_zerop)
    integer_minus_onep (usadd_left_part_2 @0 @1)))
 
-/* Unsigned saturation add, case 8 (branch with gt):
-   SAT_ADD = x > (X + Y) ? -1 : (X + Y).  */
-(match (unsigned_integer_sat_add @0 @1)
- (cond^ (gt @0 (usadd_left_part_1@2 @0 @1)) integer_minus_onep @2))
-
 /* Unsigned saturation add, case 9 (one op is imm):
    SAT_U_ADD = (X + 3) >= x ? (X + 3) : -1.  */
 (match (unsigned_integer_sat_add @0 @1)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/sat_u_add-simplify-4-u16.c b/gcc/testsuite/gcc.dg/tree-ssa/sat_u_add-simplify-4-u16.c
new file mode 100644
index 00000000000..bc899715dd6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/sat_u_add-simplify-4-u16.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-gimple-details" } */
+
+#include <stdint.h>
+
+#define T uint16_t
+
+T sat_add_u_1 (T x, T y)
+{
+  return x > (T)(x + y) ? -1 : (x + y);
+}
+
+/* { dg-final { scan-tree-dump-not " if " "gimple" } } */
+/* { dg-final { scan-tree-dump-not " else " "gimple" } } */
+/* { dg-final { scan-tree-dump-not " goto " "gimple" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/sat_u_add-simplify-4-u32.c b/gcc/testsuite/gcc.dg/tree-ssa/sat_u_add-simplify-4-u32.c
new file mode 100644
index 00000000000..53d6033563a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/sat_u_add-simplify-4-u32.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-gimple-details" } */
+
+#include <stdint.h>
+
+#define T uint32_t
+
+T sat_add_u_1 (T x, T y)
+{
+  return x > (T)(x + y) ? -1 : (x + y);
+}
+
+/* { dg-final { scan-tree-dump-not " if " "gimple" } } */
+/* { dg-final { scan-tree-dump-not " else " "gimple" } } */
+/* { dg-final { scan-tree-dump-not " goto " "gimple" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/sat_u_add-simplify-4-u64.c b/gcc/testsuite/gcc.dg/tree-ssa/sat_u_add-simplify-4-u64.c
new file mode 100644
index 00000000000..772f64a9bae
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/sat_u_add-simplify-4-u64.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-gimple-details" } */
+
+#include <stdint.h>
+
+#define T uint64_t
+
+T sat_add_u_1 (T x, T y)
+{
+  return x > (T)(x + y) ? -1 : (x + y);
+}
+
+/* { dg-final { scan-tree-dump-not " if " "gimple" } } */
+/* { dg-final { scan-tree-dump-not " else " "gimple" } } */
+/* { dg-final { scan-tree-dump-not " goto " "gimple" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/sat_u_add-simplify-4-u8.c b/gcc/testsuite/gcc.dg/tree-ssa/sat_u_add-simplify-4-u8.c
new file mode 100644
index 00000000000..6c91fe7ec5d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/sat_u_add-simplify-4-u8.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-gimple-details" } */
+
+#include <stdint.h>
+
+#define T uint8_t
+
+T sat_add_u_1 (T x, T y)
+{
+  return x > (T)(x + y) ? -1 : (x + y);
+}
+
+/* { dg-final { scan-tree-dump-not " if " "gimple" } } */
+/* { dg-final { scan-tree-dump-not " else " "gimple" } } */
+/* { dg-final { scan-tree-dump-not " goto " "gimple" } } */
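
Reviewer note (not part of the patch): the rewrite from branch form 8 to the branchless form can be cross-checked in plain C. The following is a minimal sketch under stated assumptions, not the GCC implementation. All function names (sat_add_u8_branch, sat_add_u8_branchless, sat_add_u8_reference, count_mismatches_u8, sat_add_u8_self_check) are hypothetical and exist only for this illustration. It compares both forms against a widened reference for every uint8_t input pair.

```c
#include <assert.h>
#include <stdint.h>

/* Branch form 8 as written in the commit message:
   x > (T)(x + y) ? -1 : (x + y), with T assumed to be uint8_t.  */
uint8_t sat_add_u8_branch (uint8_t x, uint8_t y)
{
  return x > (uint8_t)(x + y) ? -1 : (x + y);
}

/* Branchless form: (X + Y) | -((X + Y) < X).
   The comparison yields 0 or 1; negating it gives 0 or an all-ones mask.  */
uint8_t sat_add_u8_branchless (uint8_t x, uint8_t y)
{
  uint8_t sum = (uint8_t)(x + y);
  return (uint8_t)(sum | -(uint8_t)(sum < x));
}

/* Reference (hypothetical helper): add in a wider type, then clamp.  */
uint8_t sat_add_u8_reference (uint8_t x, uint8_t y)
{
  unsigned int wide = (unsigned int)x + (unsigned int)y;
  return wide > UINT8_MAX ? UINT8_MAX : (uint8_t)wide;
}

/* Count the input pairs for which either form disagrees with the
   reference.  All 256 * 256 pairs are visited.  */
unsigned int count_mismatches_u8 (void)
{
  unsigned int mismatches = 0;
  for (unsigned int x = 0; x <= UINT8_MAX; x++)
    for (unsigned int y = 0; y <= UINT8_MAX; y++)
      {
        uint8_t expected = sat_add_u8_reference ((uint8_t)x, (uint8_t)y);
        if (sat_add_u8_branch ((uint8_t)x, (uint8_t)y) != expected
            || sat_add_u8_branchless ((uint8_t)x, (uint8_t)y) != expected)
          mismatches++;
      }
  return mismatches;
}

/* Abort if the two forms are not equivalent on uint8_t.  */
void sat_add_u8_self_check (void)
{
  assert (count_mismatches_u8 () == 0);
}
```

Calling sat_add_u8_self_check () from a small main is enough to run the comparison. The check covers only the 8-bit case; the wider types in the new tests rely on the same unsigned wrap-around argument rather than on exhaustive enumeration.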