From patchwork Fri Sep  9 12:20:11 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Tobias Burnus <tobias@codesourcery.com>
X-Patchwork-Id: 57521
Return-Path: <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>
X-Original-To: patchwork@sourceware.org
Delivered-To: patchwork@sourceware.org
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id 60FC93857014
	for <patchwork@sourceware.org>; Fri,  9 Sep 2022 12:20:45 +0000 (GMT)
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from esa4.mentor.iphmx.com (esa4.mentor.iphmx.com [68.232.137.252])
 by sourceware.org (Postfix) with ESMTPS id 08E2B38582BF
 for <gcc-patches@gcc.gnu.org>; Fri,  9 Sep 2022 12:20:26 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 08E2B38582BF
Authentication-Results: sourceware.org; dmarc=none (p=none dis=none)
 header.from=codesourcery.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mentor.com
X-IronPort-AV: E=Sophos;i="5.93,303,1654588800";
 d="diff'?scan'208";a="82715123"
Received: from orw-gwy-02-in.mentorg.com ([192.94.38.167])
 by esa4.mentor.iphmx.com with ESMTP; 09 Sep 2022 04:20:19 -0800
IronPort-SDR: 
 62tu7zWTgka5uvV4piMKs0Gk5fFK2fMgralgYWFxgpxUq+OWvHb3Dqnvd6t2mlk+5fyb5zk92k
 SGsuCZw54CvENtihziDQh4UG7++QyA/VO4FgZSak+kqMJhjwysylAQeZWGlDgeuIOJzSkwLGpB
 uShCAZOAEvPWus7hm0oGMLxXFHImp30ALrZrHorFqmHZEAIWdP/GO7NkFiz81/UfgzPvmf6R3q
 qp44oS/uoEssr1MYptdTMaJ46DCo+NFoKr70PfQVC+YeusMER/oqQI+QqZyDNjkpye7047G95m
 XWI=
Message-ID: <5d7ca95c-8a61-6d07-dd5b-e7c2d03072ff@codesourcery.com>
Date: Fri, 9 Sep 2022 14:20:11 +0200
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
 Thunderbird/102.2.1
Subject: GCN: Add -mlow-precision-sqrt for double-precision sqrt [PR105246]
 (was: Re: [PATCH] amdgcn: Add support for additional natively supported
 floating-point operations)
Content-Language: en-US
To: Andrew Stubbs <ams@codesourcery.com>
References: <c44b8158-f20d-ff08-fa83-2de19b77c78b@codesourcery.com>
 <c39077e6-aa57-7728-bbef-46756b51af68@codesourcery.com>
 <08966068-719a-30d1-5b71-7cf839e507e7@codesourcery.com>
 <nycvar.YFH.7.77.849.2209091011440.20505@jbgna.fhfr.qr>
From: Tobias Burnus <tobias@codesourcery.com>
In-Reply-To: <nycvar.YFH.7.77.849.2209091011440.20505@jbgna.fhfr.qr>
X-Originating-IP: [137.202.0.90]
X-ClientProxiedBy: svr-ies-mbx-13.mgc.mentorg.com (139.181.222.13) To
 svr-ies-mbx-12.mgc.mentorg.com (139.181.222.12)
X-Spam-Status: No, score=-11.4 required=5.0 tests=BAYES_00, GIT_PATCH_0,
 HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, KAM_SHORT, SPF_HELO_PASS,
 SPF_PASS, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
Cc: Richard Biener <rguenther@suse.de>,
 Kwok Cheung Yeung <kcy@codesourcery.com>,
 gcc-patches <gcc-patches@gcc.gnu.org>
Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org
Sender: "Gcc-patches"
 <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>

On 09.09.22 12:16, Richard Biener wrote:
> On Fri, 9 Sep 2022, Tobias Burnus wrote:
>> -funsafe-math-optimizations implies -fno-signed-zeros, -fno-trapping-math,
>> -fassociative-math,
>> and -freciprocal-math. All of them reduce precision and may violate IEEE or
>> ISO/language standards.
>>
>> However, I think it is rather surprising to have all of a sudden only a
>> precision of the order of 100,000,000 ULP instead of the ~4 ULP to be expected.
>> That's a precision loss of the order of 10^8 or 2^29 which is huge!
>> ...
> I agree - for example powerpc has -mrecip= to control which instructions
> to use (float/double rsqrt or inverse) and -mrecip-precision to
> specify whether further iteration is done or not.
> [...]
> Your suggested huge reduction in precision isn't usually acceptable
> and should always be explicitly enabled.

First, I have to correct myself – Kwok's -munsafe-math-optimizations is
only about single-precision functions, which do not have this problem.

However, the pre-existing 'sqrt' problem is still real. It also applies
to reciprocal sqrt ("v_rsq"), but that is, for whatever reason, not used for GCN.

This patch now adds a command-line flag (off by default) to choose
whether this behavior is wanted. I used the same name as AArch64,
https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html#index-mlow-precision-sqrt
(AArch64 also has -mlow-precision-recip-sqrt, which is not (yet)
sensible for GCN.)

This patch was manually tested for all combinations. Given that it is my
first .md patch, I also looked at insn-recog.cc, and it seems to work
fine.

OK for mainline, or are there comments or more suggestions? I also
included some wording for the release notes.

Tobias

gcc-13/changes.html - GCN: document -mlow-precision-sqrt

diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
index 390193ca..d335eab3 100644
--- a/htdocs/gcc-13/changes.html
+++ b/htdocs/gcc-13/changes.html
@@ -179,6 +179,12 @@ a work-in-progress.</p>
   <li>Support for the Instinct MI200 series devices (<a
       href="https://gcc.gnu.org/onlinedocs/gcc/AMD-GCN-Options.html">
       <code>gfx90a</code></a>) has been added.</li>
+  <li>The <code>-mlow-precision-sqrt</code> option (disabled by default)
+      has been added to use the hardware <code>sqrt</code> also for
+      double-precision floating-point arguments; note that the result
+      has a much reduced accuracy, with errors of about 2<sup>29</sup> ULP.
+      This option additionally requires <code>-funsafe-math-optimizations</code>
+      (implied by <code>-ffast-math</code>).</li>
 </ul>
 
 <!-- <h3 id="arc">ARC</h3> -->