From patchwork Wed Jul 16 22:50:40 2014
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 2097
Message-ID: <53C701C0.70503@twiddle.net>
Date: Wed, 16 Jul 2014 15:50:40 -0700
From: Richard Henderson
To: "Joseph S. Myers"
CC: libc-alpha
Subject: [PATCH v2] fma vs gcc 4.9
References: <53B59CDF.1010604@twiddle.net>

On 07/16/2014 01:56 PM, Joseph S. Myers wrote:
> On Thu, 3 Jul 2014, Richard Henderson wrote:
>
>> It seems to me that there's a typo on that exact zero test: a2 should be used,
>> not m2.  Correct, or have I mis-read the code?
>
> The existing exact zero test seems correct to me.  The conditions for such
> an exact zero are that the result of the multiplication is exactly
> representable in 53 bits (i.e., m2 == 0, with m1 being the exact result of
> the multiplication), and that the result of the addition (of z to the high
> part of the multiplication result) is an exact zero (i.e. a1 == 0, which
> implies a2 == 0).
>

Thanks.

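For concreteness, the exact-zero condition quoted above can be sketched
outside glibc roughly as follows.  This is only an illustration: the names
exact_pair, two_product, two_sum and fma_is_exact_zero are hypothetical, and
the sketch assumes round-to-nearest and ignores the overflow, underflow and
rounding-mode handling that the real function has to deal with.

```c
/* Illustrative sketch only; none of these names come from glibc.  */
struct exact_pair { double hi, lo; };

/* Dekker-style product: hi + lo == x * y exactly (barring overflow and
   underflow).  hi plays the role of m1 and lo the role of m2.  */
struct exact_pair two_product (double x, double y)
{
  const double C = 134217729.0;	/* 2^27 + 1, splits a 53-bit mantissa.  */
  double x1 = x * C;
  double y1 = y * C;
  double hi = x * y;
  x1 = (x - x1) + x1;		/* High half of x.  */
  y1 = (y - y1) + y1;		/* High half of y.  */
  double x2 = x - x1;		/* Low half of x.  */
  double y2 = y - y1;		/* Low half of y.  */
  double lo = (((x1 * y1 - hi) + x1 * y2) + x2 * y1) + x2 * y2;
  return (struct exact_pair) { hi, lo };
}

/* Knuth-style sum: hi + lo == a + b exactly (barring overflow).
   hi plays the role of a1 and lo the role of a2.  */
struct exact_pair two_sum (double a, double b)
{
  double hi = a + b;
  double t1 = hi - a;
  double t2 = hi - t1;
  t1 = b - t1;
  t2 = a - t2;
  return (struct exact_pair) { hi, t1 + t2 };
}

/* Nonzero when x * y + z is an exact zero: the product fits in 53 bits
   (m2 == 0) and adding z to it cancels completely (a1 == 0).  */
int fma_is_exact_zero (double x, double y, double z)
{
  struct exact_pair m = two_product (x, y);
  struct exact_pair a = two_sum (z, m.hi);
  return a.hi == 0 && m.lo == 0;
}
```

With x = y = 1 + 2^-30 and z = -(1 + 2^-29), the high parts cancel (a1 == 0)
but m2 is 2^-60, so the true result is nonzero.  That is the case where
testing a2 instead of m2 would go wrong.
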
This second version, then, does not s/m2/a2/, but merely adds the appropriate
barriers.  It also makes sure that m2 is complete before clearing inexact, even
though I saw no evidence of it being scheduled after the call.

Ok?


r~

diff --git a/sysdeps/ieee754/dbl-64/s_fma.c b/sysdeps/ieee754/dbl-64/s_fma.c
index 389acd4..77065aa 100644
--- a/sysdeps/ieee754/dbl-64/s_fma.c
+++ b/sysdeps/ieee754/dbl-64/s_fma.c
@@ -198,16 +198,17 @@ __fma (double x, double y, double z)
   t1 = m1 - t1;
   t2 = z - t2;
   double a2 = t1 + t2;
+  /* Ensure the arithmetic is not scheduled after feclearexcept call.  */
+  math_force_eval (m2);
+  math_force_eval (a2);
   feclearexcept (FE_INEXACT);
 
-  /* If the result is an exact zero, ensure it has the correct
-     sign.  */
+  /* If the result is an exact zero, ensure it has the correct sign.  */
   if (a1 == 0 && m2 == 0)
     {
       libc_feupdateenv (&env);
-      /* Ensure that round-to-nearest value of z + m1 is not
-	 reused.  */
-      asm volatile ("" : "=m" (z) : "m" (z));
+      /* Ensure that round-to-nearest value of z + m1 is not reused.  */
+      z = math_opt_barrier (z);
       return z + m1;
     }
 