From: Richard Henderson
Date: Thu, 03 Jul 2014 11:11:43 -0700
To: libc-alpha
CC: "Joseph S. Myers"
Subject: [PATCH, RFC] fma vs gcc 4.9
Message-ID: <53B59CDF.1010604@twiddle.net>
X-Patchwork-Id: 1911

GCC 4.9 optimizes

  double a1 = z + m1;
  double t1 = a1 - z;
  double t2 = a1 - t1;
  t1 = m1 - t1;
  t2 = z - t2;
  double a2 = t1 + t2;
  feclearexcept (FE_INEXACT);

  if (a1 == 0 && m2 == 0)
    ...
    return ...

into

  double a1 = z + m1;
  feclearexcept (FE_INEXACT);

  if (a1 == 0 && m2 == 0)
    ...
    return ...

  double t1 = a1 - z;
  double t2 = a1 - t1;
  t1 = m1 - t1;
  t2 = z - t2;
  double a2 = t1 + t2;

because the later computation is partially redundant for the path
leading to the return.

I noticed this because, of course, the moved computation raises INEXACT,
leading to incorrect results.

At first I was simply going to add a math_force_eval to make sure
everything is complete before the feclearexcept, but on further
reflection it seemed odd that most of the computation is in fact
redundant.

It seems to me that there's a typo on that exact zero test: a2 should be
used, not m2.  Correct, or have I mis-read the code?


r~

diff --git a/ChangeLog b/ChangeLog
index 5eb0d43..4a3e672 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,9 @@
 2014-07-03  Richard Henderson
 
+	* sysdeps/ieee754/dbl-64/s_fma.c (__fma): Use math_force_eval before
+	feclearexcept; use math_opt_barrier instead of open-coded asm; fix
+	typo in exact zero test.
+
 	* sysdeps/alpha/fpu/s_nearbyintf.c: Remove file.
 	* sysdeps/alpha/fpu/s_nearbyint.c (__nearbyint): Remove; include
 	sysdeps/ieee754/dbl-64/wordsize-64/s_nearbyint.c.
diff --git a/sysdeps/ieee754/dbl-64/s_fma.c b/sysdeps/ieee754/dbl-64/s_fma.c
index 389acd4..3bd7f71 100644
--- a/sysdeps/ieee754/dbl-64/s_fma.c
+++ b/sysdeps/ieee754/dbl-64/s_fma.c
@@ -198,16 +198,16 @@ __fma (double x, double y, double z)
   t1 = m1 - t1;
   t2 = z - t2;
   double a2 = t1 + t2;
+  /* Ensure the addition is not scheduled after feclearexcept call.  */
+  math_force_eval (a2);
   feclearexcept (FE_INEXACT);
 
-  /* If the result is an exact zero, ensure it has the correct
-     sign.  */
-  if (a1 == 0 && m2 == 0)
+  /* If the result is an exact zero, ensure it has the correct sign.  */
+  if (a1 == 0 && a2 == 0)
     {
       libc_feupdateenv (&env);
-      /* Ensure that round-to-nearest value of z + m1 is not
-	 reused.  */
-      asm volatile ("" : "=m" (z) : "m" (z));
+      /* Ensure that round-to-nearest value of z + m1 is not reused.  */
+      z = math_opt_barrier (z);
       return z + m1;
     }
 