From patchwork Fri Sep 24 17:59:58 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Joseph Myers <joseph@codesourcery.com>
X-Patchwork-Id: 45427
Date: Fri, 24 Sep 2021 17:59:58 +0000
From: Joseph Myers <joseph@codesourcery.com>
X-X-Sender: jsm28@digraph.polyomino.org.uk
To: <libc-alpha@sourceware.org>
Subject: Fix sysdeps/x86/fpu/s_ffma.c for 32-bit FMA processor case
 [committed]
Message-ID: 
 <alpine.DEB.2.22.394.2109241759380.4038824@digraph.polyomino.org.uk>
User-Agent: Alpine 2.22 (DEB 394 2020-01-19)
MIME-Version: 1.0

The __SSE2_MATH__ conditional in sysdeps/x86/fpu/s_ffma.c does not
cover all cases where the x86 fenv_private.h macros might manipulate
only one of the SSE and 387 floating-point states while the actual fma
implementation uses the other.  Specifically, in the 32-bit case, with
a compiler not defaulting to -mfpmath=sse but testing on a processor
with hardware FMA support, the multiarch fma function implementations
end up using SSE, while the fenv_private.h macros use the 387 state
for double.  Change the conditional to use the default macros rather
than the optimized ones in all cases except when the compiler inlines
an fma instruction.  In that case, all those instructions are SSE
instructions and -mfpmath=sse must be in effect for them to be
inlined, so the optimized macros use only the SSE state and that is
sufficient.
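
As a rough illustration of the change in the conditional, the sketch
below models the old and new preprocessor tests as ordinary C
functions.  It is not glibc code: the function names and the
HAVE_SSE2_MATH / HAVE_FP_FAST_FMA helper macros are hypothetical,
introduced only to show which configurations select the default
(SSE plus 387) macros before and after the patch.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helpers: record what the current compiler predefines.  */
#ifdef __SSE2_MATH__
# define HAVE_SSE2_MATH 1
#else
# define HAVE_SSE2_MATH 0
#endif

#ifdef __FP_FAST_FMA
# define HAVE_FP_FAST_FMA 1
#else
# define HAVE_FP_FAST_FMA 0
#endif

/* Old test: "#if defined __SSE2_MATH__ && !defined __FP_FAST_FMA".
   Returns true when the default macros would be selected.  */
bool
old_selects_default_macros (bool sse2_math, bool fp_fast_fma)
{
  return sse2_math && !fp_fast_fma;
}

/* New test: "#ifndef __FP_FAST_FMA".  __SSE2_MATH__ no longer
   matters, so the parameter is deliberately ignored.  */
bool
new_selects_default_macros (bool sse2_math, bool fp_fast_fma)
{
  (void) sse2_math;
  return !fp_fast_fma;
}

/* Print the selection for all four combinations, then for the
   compiler actually building this file.  Per the commit message,
   __FP_FAST_FMA without SSE math is not expected in practice.  */
void
print_selection_table (void)
{
  for (int sse = 0; sse <= 1; sse++)
    for (int fast = 0; fast <= 1; fast++)
      printf ("sse2_math=%d fp_fast_fma=%d old=%d new=%d\n", sse, fast,
              old_selects_default_macros (sse, fast),
              new_selects_default_macros (sse, fast));
  printf ("this build: old=%d new=%d\n",
          old_selects_default_macros (HAVE_SSE2_MATH, HAVE_FP_FAST_FMA),
          new_selects_default_macros (HAVE_SSE2_MATH, HAVE_FP_FAST_FMA));
}
```

The row that differs is the one with neither macro defined, which
corresponds to the 32-bit case described above: the old test kept the
optimized macros there, while the new test selects the default ones.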

Tested for x86_64 and x86.  H.J. reports in
<https://sourceware.org/pipermail/libc-alpha/2021-September/131367.html>
that it fixes the problems he observed.
---

Committed.

diff --git a/sysdeps/x86/fpu/s_ffma.c b/sysdeps/x86/fpu/s_ffma.c
index 95c2dcd7b7..da4bb55f9a 100644
--- a/sysdeps/x86/fpu/s_ffma.c
+++ b/sysdeps/x86/fpu/s_ffma.c
@@ -27,10 +27,14 @@
 
 #include <math-narrow.h>
 
-#if defined __SSE2_MATH__ && !defined __FP_FAST_FMA
+#ifndef __FP_FAST_FMA
 /* Depending on the details of the glibc configuration, fma might use
    either SSE or 387 arithmetic; ensure that both parts of the
-   floating-point state are handled in the round-to-odd code.  */
+   floating-point state are handled in the round-to-odd code.  If
+   __FP_FAST_FMA is defined, that implies that the compiler is using
+   SSE floating point and that the fma call will be inlined, so the
+   x86 macros will work with only the SSE state and that is
+   sufficient.  */
 # undef libc_feholdexcept_setround
 # define libc_feholdexcept_setround	default_libc_feholdexcept_setround
 # undef libc_feupdateenv_test