GCN: Add -mlow-precision-sqrt for double-precision sqrt [PR105246] (was: Re: [PATCH] amdgcn: Add support for additional natively supported floating-point operations)

Message ID 5d7ca95c-8a61-6d07-dd5b-e7c2d03072ff@codesourcery.com
State Committed
Headers
Series GCN: Add -mlow-precision-sqrt for double-precision sqrt [PR105246] (was: Re: [PATCH] amdgcn: Add support for additional natively supported floating-point operations) |

Commit Message

Tobias Burnus Sept. 9, 2022, 12:20 p.m. UTC
  On 09.09.22 12:16, Richard Biener wrote:
> On Fri, 9 Sep 2022, Tobias Burnus wrote:
>> -funsafe-math-optimizations implies -fno-signed-zeros, -fno-trapping-math,
>> -fassociative-math,
>> and -freciprocal-math. All of them reduce precision and my violate IEEE or
>> ISO/language standards.
>>
>> However, I think it is rather surprising to have all of the sudden only a
>> precision of the order of 100,000,000 ULP instead of ~4 ULP as to be expected.
>> That's a precision loss of the order of 10^8 or 2^29 which is huge!
>> ...
> I agree - for example powerpc has -mrecip= to control which instructions
> to use (float/double rsqrt or inverse) and -mrecip-precision to
> specify whether further iteration is done or not.
> [...]
> Your suggested huge reduction in precision isn't usually acceptable
> and should be always explicitely enabled.

First, I have to correct myself – Kwok's -munsafe-math-optimizations is
only about single-precision functions, which do not have this problem.

However, the pre-existing 'sqrt' problem still is real. It also applies
to reverse sqrt ("v_rsq"), but that's for whatever reason not used for GCN.

This patch now adds a commandline flag - off by default - to choose
whether this behavior is wanted. I did use the same name as aarch64,
https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html#index-mlow-precision-sqrt
(the latter also has -mlow-precision-recip-sqrt, which is not (yet)
sensible for GCN.)

This patch was manually tested for all combinations and I also looked at
insn-recog.cc, given that it is my first .md patch – it it seems to work
fine.

OK for mainline – or are there comments or more suggestions? I also
included some word for the release notes.

Tobias
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
  

Comments

Andrew Stubbs Sept. 9, 2022, 12:40 p.m. UTC | #1
On 09/09/2022 13:20, Tobias Burnus wrote:
> However, the pre-existing 'sqrt' problem still is real. It also applies 
> to reverse sqrt ("v_rsq"), but that's for whatever reason not used for GCN.
> 
> This patch now adds a commandline flag - off by default - to choose 
> whether this behavior is wanted. I did use the same name as aarch64, 
> https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html#index-mlow-precision-sqrt (the latter also has -mlow-precision-recip-sqrt, which is not (yet) sensible for GCN.)
> 
> This patch was manually tested for all combinations and I also looked at 
> insn-recog.cc, given that it is my first .md patch – it it seems to work 
> fine.
> 
> OK for mainline – or are there comments or more suggestions? I also 
> included some word for the release notes.

No, thank you.

I don't see any value in adding an option no one cares about (but we 
still have to maintain and test).

I think it will make sense to drop the double-precision insn definition 
and fall back to libm in that case.

Kwok is currently reviewing all the libm functions and can probably 
include this one.

Andrew
  

Patch

gcc-13/changes.html - GCN: document -mlow-precision-sqrt

diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
index 390193ca..d335eab3 100644
--- a/htdocs/gcc-13/changes.html
+++ b/htdocs/gcc-13/changes.html
@@ -179,6 +179,12 @@  a work-in-progress.</p>
   <li>Support for the Instinct MI200 series devices (<a
       href="https://gcc.gnu.org/onlinedocs/gcc/AMD-GCN-Options.html">
       <code>gfx90a</code></a>) has been added.</li>
+  <li>The <code>-mlow-precision-sqrt</code> option (disabled by default)
+      has been added to use the hardware <code>sqrt</code> also for
+      double-precision floating point arguments; note that the result
+      only has much a reduced accurary of 2<sup>29</sup> ULP. This
+      option requires <code>-funsafe-math-optimizations</code>
+      (implied by <code>-ffast-math</code>) in addition.</li>
 </ul>
 
 <!-- <h3 id="arc">ARC</h3> -->