From patchwork Thu Jan 11 01:38:42 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Pan2" X-Patchwork-Id: 83805 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 70A46385801E for ; Thu, 11 Jan 2024 01:39:33 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.115]) by sourceware.org (Postfix) with ESMTPS id 098BC3858C66 for ; Thu, 11 Jan 2024 01:39:01 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 098BC3858C66 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=intel.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 098BC3858C66 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=192.55.52.115 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1704937144; cv=none; b=oHpNYdmzgqxFf2ku1oYIsLJFcPKywZiC4o1FcHA/wJpKQhSgcugMjTK2429OyfrHEqX84qSjgwfsPnnIZ46vmU7kyzWtnB9dhLkpAirsSejjDwgVlYxY1IGCWqjbhK51mBWnQClFtqG0CSFWU1GandWkA06D5xrZCuFhA1hx9pg= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1704937144; c=relaxed/simple; bh=3FYinKd2g6B2CCIYtC2K3XPHgY6e92oTFN7o8in4pgw=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=wmPJ079tp36tKIAi8E4q9z6pAIF1hNmq3YXJ0xzJCnZ/GB4YKdyci7ddKLiPtXe7ufzCeymxLs+H0joQxEECpWMPM9cGMd5BDUr3i83i7qNGXs8hFwOPU9FQLzRvGesL0R5FeECZGNI2CtKIpVYGfbKPiR7knPEXOp23i8hFkS0= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1704937142; x=1736473142; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=3FYinKd2g6B2CCIYtC2K3XPHgY6e92oTFN7o8in4pgw=; b=j5UdhzcrnKz02UPRtnuFnPiOWbhRtu+C/htzvuEDQctMuqpkSQG6BRSM i+/vnq54VRp416HCK+gKrnUvVSo/GIQGhQQ3lxEhl5AjLEcGusGUGRHcz +pFWvLVo6iwokpaZl6EMz9B1/qsmUXJ1B9zOnvS/eE1r2kMBQuTTeKSb/ 7Zm2smt9igyhiBglMnCgkDpdwDwmYRVbMLKNKA3l+fWA4447cX/XumsX+ nQ9eWRdKuZ9OSmlBn542Ay3l8aUyK+iQ6N2lin0sRcGskINUvLFeWHbWQ RVatuc+Td8Wb/qbYZCOf+0S+fwKicwMlCD8saX9Vc97PZ1oEtFOPRy8/2 w==; X-IronPort-AV: E=McAfee;i="6600,9927,10949"; a="398401057" X-IronPort-AV: E=Sophos;i="6.04,185,1695711600"; d="scan'208";a="398401057" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Jan 2024 17:39:00 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10949"; a="816544611" X-IronPort-AV: E=Sophos;i="6.04,185,1695711600"; d="scan'208";a="816544611" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orsmga001.jf.intel.com with ESMTP; 10 Jan 2024 17:38:53 -0800 Received: from pli-ubuntu.sh.intel.com (pli-ubuntu.sh.intel.com [10.239.159.47]) by shvmail03.sh.intel.com (Postfix) with ESMTP id 9126510056F5; Thu, 11 Jan 2024 09:38:52 +0800 (CST) From: pan2.li@intel.com To: gcc-patches@gcc.gnu.org Cc: juzhe.zhong@rivai.ai, pan2.li@intel.com, yanzhang.wang@intel.com, kito.cheng@gmail.com, richard.guenther@gmail.com, jeffreyalaw@gmail.com Subject: [PATCH v4] LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion Date: Thu, 11 Jan 2024 09:38:42 +0800 Message-Id: <20240111013842.925454-1-pan2.li@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231223110733.2565292-1-pan2.li@intel.com> References: <20231223110733.2565292-1-pan2.li@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-12.1 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org From: Pan Li The insert_var_expansion_initialization depends on the HONOR_SIGNED_ZEROS to initialize the unrolling variables to +0.0f when -0.0f and no-signed-option. Unfortunately, we should always keep the -0.0f here because: * The -0.0f is always the correct initial value. * We need to support the target that always honor signed zero. Thus, we need to leverage MODE_HAS_SIGNED_ZEROS when initialize instead of HONOR_SIGNED_ZEROS. Then the target/backend can decide to honor the no-signed-zero or not. The below tests are passed for this patch: * The riscv regression tests. * The aarch64 regression tests. * The x86 bootstrap and regression tests. gcc/ChangeLog: * loop-unroll.cc (insert_var_expansion_initialization): Leverage MODE_HAS_SIGNED_ZEROS for expansion variable initialization. gcc/testsuite/ChangeLog: * gcc.dg/pr30957-1.c: Adjust tests cases for different scenarios. Signed-off-by: Pan Li --- gcc/loop-unroll.cc | 4 +-- gcc/testsuite/gcc.dg/pr30957-1.c | 48 ++++++++++++++++++++++++++++---- 2 files changed, 44 insertions(+), 8 deletions(-) diff --git a/gcc/loop-unroll.cc b/gcc/loop-unroll.cc index 4176a21e308..bfdfe6c2bb7 100644 --- a/gcc/loop-unroll.cc +++ b/gcc/loop-unroll.cc @@ -1855,7 +1855,7 @@ insert_var_expansion_initialization (struct var_to_expand *ve, rtx var, zero_init; unsigned i; machine_mode mode = GET_MODE (ve->reg); - bool honor_signed_zero_p = HONOR_SIGNED_ZEROS (mode); + bool has_signed_zero_p = MODE_HAS_SIGNED_ZEROS (mode); if (ve->var_expansions.length () == 0) return; @@ -1869,7 +1869,7 @@ insert_var_expansion_initialization (struct var_to_expand *ve, case MINUS: FOR_EACH_VEC_ELT (ve->var_expansions, i, var) { - if (honor_signed_zero_p) + if (has_signed_zero_p) zero_init = simplify_gen_unary (NEG, mode, CONST0_RTX (mode), mode); else zero_init = CONST0_RTX (mode); diff --git a/gcc/testsuite/gcc.dg/pr30957-1.c b/gcc/testsuite/gcc.dg/pr30957-1.c index 564410913ab..6a9d3d87932 100644 --- a/gcc/testsuite/gcc.dg/pr30957-1.c +++ b/gcc/testsuite/gcc.dg/pr30957-1.c @@ -20,16 +20,52 @@ foo (float d, int n) return accum; } +float __attribute__((noinline)) +get_minus_zero() +{ + return 0.0 / -5.0; +} + int main () { - /* When compiling standard compliant we expect foo to return -0.0. But the - variable expansion during unrolling optimization (for this testcase enabled - by non-compliant -fassociative-math) instantiates copy(s) of the - accumulator which it initializes with +0.0. Hence we expect that foo - returns +0.0. */ - if (__builtin_copysignf (1.0, foo (0.0 / -5.0, 10)) != 1.0) + /* The variable expansion in unroll requires option unsafe-math-optimizations + (aka -fno-signed-zeros, -fno-trapping-math, -fassociative-math + and -freciprocal-math). + + When loop like above will have expansion after unrolling as below: + + accum_1 += d_1; + accum_2 += d_2; + accum_3 += d_3; + ... + + The accum_1, accum_2 and accum_3 need to be initialized. Given the + floating-point we have + +0.0f + -0.0f = +0.0f. + + Thus, we should initialize the accum_* to -0.0 for correctness. But + the things become more complicated when no-signed-zeros, as well as VLA + vectorizer mode which doesn't trigger variable expansion. Then we have: + + Case 1: Trigger variable expansion but target doesn't honor no-signed-zero. + minus_zero will be -0.0f and foo (minus_zero, 10) will be -0.0f. + Case 2: Trigger variable expansion but target does honor no-signed-zero. + minus_zero will be +0.0f and foo (minus_zero, 10) will be +0.0f. + Case 3: No variable expansion but target doesn't honor no-signed-zero. + minus_zero will be -0.0f and foo (minus_zero, 10) will be -0.0f. + Case 4: No variable expansion but target does honor no-signed-zero. + minus_zero will be +0.0f and foo (minus_zero, 10) will be +0.0f. + + The test case covers above 4 cases for running. + */ + float minus_zero = get_minus_zero (); + float a = __builtin_copysignf (1.0, minus_zero); + float b = __builtin_copysignf (1.0, foo (minus_zero, 10)); + + if (a != b) abort (); + exit (0); }