[3/4] Add ILP32 support to aarch64

Message ID	1502215837.3962.127.camel@cavium.com
State	New, archived

Headers

Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
Precedence: bulk
Sender: libc-alpha-owner@sourceware.org
Message-ID: <1502215837.3962.127.camel@cavium.com>
Subject: Re: [PATCH 3/4] Add ILP32 support to aarch64
From: Steve Ellcey <sellcey@cavium.com>
Reply-To: sellcey@cavium.com
To: Szabolcs Nagy <szabolcs.nagy@arm.com>, Joseph Myers
	<joseph@codesourcery.com>, Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Cc: nd@arm.com, "Ellcey, Steve" <Steve.Ellcey@cavium.com>, 
	"libc-alpha@sourceware.org" <libc-alpha@sourceware.org>
Date: Tue, 08 Aug 2017 11:10:37 -0700
In-Reply-To: <5989D25B.7000209@arm.com>
References: <DB6PR0801MB20533095035144673B49342083B10@DB6PR0801MB2053.eurprd08.prod.outlook.com>
	<alpine.DEB.2.20.1708040011260.23567@digraph.polyomino.org.uk>
	<1501888532.3962.92.camel@cavium.com> <5989D25B.7000209@arm.com>
Content-Type: multipart/mixed; boundary="=-uRc6vCFaCxYRXz50lNEB"
Mime-Version: 1.0
Received-SPF: None (protection.outlook.com: cavium.com does not designate
	permitted sender hosts)
SpamDiagnosticOutput: 1:99
SpamDiagnosticMetadata: NSPM
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 08 Aug 2017 18:10:39.7873
	(UTC)
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted
X-MS-Exchange-Transport-CrossTenantHeadersStamped: MWHPR07MB3551

Commit Message

Steve Ellcey Aug. 8, 2017, 6:10 p.m. UTC

  On Tue, 2017-08-08 at 16:01 +0100, Szabolcs Nagy wrote:
> 
> > +#if IREG_SIZE == 64 && OREG_SIZE == 32
> > +  if (__builtin_fabs (x) > INT32_MAX - 2)
> i don't understand the -2 here.

I was confused and was trying to handle the fact that fabs(INT32_MIN) !=
INT32_MAX.  I have removed the -2 and now just compare against INT32_MAX,
which seems to work fine.  Since fabs(INT32_MIN) is greater than
INT32_MAX, we may unnecessarily enter this if statement for values
between INT32_MIN and INT32_MIN+1, but that should not cause any
failures, just a slowdown.
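
As a rough illustration of that boundary case (a sketch only, not part
of the patch): the helper names below are hypothetical, and the check
uses plain fabs rather than the builtin.  It shows that INT32_MIN itself,
and a value just above it, pass the guard even though both convert to a
32-bit int without overflow.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Hypothetical helper mirroring the guard in the patch: nonzero when X
   would enter the exception fixup path.  */
int
takes_fixup_path (double x)
{
  return fabs (x) > INT32_MAX;
}

/* Hypothetical helper: nonzero when X lies within [INT32_MIN, INT32_MAX],
   so converting it to a 32-bit int cannot overflow.  */
int
in_int32_range (double x)
{
  return x >= INT32_MIN && x <= INT32_MAX;
}

/* Walk the boundary values discussed above.  */
void
check_boundaries (void)
{
  /* INT32_MAX itself does not trigger the guard.  */
  assert (!takes_fixup_path ((double) INT32_MAX));

  /* INT32_MIN is in range, yet fabs (INT32_MIN) is 2^31 > INT32_MAX.  */
  assert (takes_fixup_path ((double) INT32_MIN));
  assert (in_int32_range ((double) INT32_MIN));

  /* The same holds for a value between INT32_MIN and INT32_MIN+1.  */
  assert (takes_fixup_path (-2147483647.5));
  assert (in_int32_range (-2147483647.5));
}
```

These inputs only cost the extra save/restore work; the conversion
result is the same on either path.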

> > +    {
> > +      /* Converting large values to a 32 bit in may cause the
> > frintx/fcvtza
> s/in/int/

Fixed that.

> > +      invalid_p = libc_fetestexcept (FE_INVALID);
> > +      inexact_p = libc_fetestexcept (FE_INEXACT);
> multiple flags can be tested/raised in a single call.

Good point.  I changed this to one call and saved the flags in an
integer variable for checking later.

> > +      libc_fesetenv (&env);
> > +
> > +      if (invalid_p)
> > +	feraiseexcept (FE_INVALID);
> > +      else if (inexact_p)
> > +	feraiseexcept (FE_INEXACT);
> > +
> i think correct trapping is not guaranteed by glibc,
> only correct status flags when the function returns,
> so spurious inexact is not a problem if it is already
> raised, and then i expect better code gen for the
> inexact clearing approach:
> 
> if (fabs (x) > INT32_MAX && fetestexcept (FE_INEXACT) == 0)
>   {
>     asm (...);
>     if (fetestexcept (FE_INVALID|FE_INEXACT) ==
> (FE_INVALID|FE_INEXACT))
>       feclearexcept (FE_INEXACT);
>   }
> else
>   asm (...);

As you mentioned in your followup email, we have to worry about
FE_INVALID being set on entry too.  I have attached an updated
version of my patch.

Steve Ellcey
sellcey@cavium.com


2017-08-08  Steve Ellcey  <sellcey@cavium.com>

	* sysdeps/aarch64/fpu/s_llrint.c (OREG_SIZE): New macro.
	* sysdeps/aarch64/fpu/s_llround.c (OREG_SIZE): Likewise.
	* sysdeps/aarch64/fpu/s_llrintf.c (IREGS): Remove.
	(IREG_SIZE, OREG_SIZE): New macros.
	* sysdeps/aarch64/fpu/s_llroundf.c (IREGS): Remove.
	(IREG_SIZE, OREG_SIZE): New macros.
	* sysdeps/aarch64/fpu/s_lrintf.c (IREGS): Remove.
	(IREG_SIZE): New macro.
	* sysdeps/aarch64/fpu/s_lroundf.c (IREGS): Remove.
	(IREG_SIZE): New macro.
	* sysdeps/aarch64/fpu/s_lrint.c (math_private.h, fenv.h, stdint.h):
	New includes.
	(IREG_SIZE, OREG_SIZE): Initialize if not already set.
	(OREGS, IREGS): Set based on IREG_SIZE and OREG_SIZE.
	(__CONCATX): Handle exceptions correctly on large values that may
	set FE_INVALID.
	* sysdeps/aarch64/fpu/s_lround.c (IREG_SIZE, OREG_SIZE):
	Initialize if not already set.
	(OREGS, IREGS): Set based on IREG_SIZE and OREG_SIZE.


Patch

diff --git a/sysdeps/aarch64/fpu/s_llrint.c b/sysdeps/aarch64/fpu/s_llrint.c
index c0d0d0e..57821c0 100644
--- a/sysdeps/aarch64/fpu/s_llrint.c
+++ b/sysdeps/aarch64/fpu/s_llrint.c
@@ -18,4 +18,5 @@ 
 
 #define FUNC llrint
 #define OTYPE long long int
+#define OREG_SIZE 64
 #include <s_lrint.c>
diff --git a/sysdeps/aarch64/fpu/s_llrintf.c b/sysdeps/aarch64/fpu/s_llrintf.c
index 67724c6..98ed4f8 100644
--- a/sysdeps/aarch64/fpu/s_llrintf.c
+++ b/sysdeps/aarch64/fpu/s_llrintf.c
@@ -18,6 +18,7 @@ 
 
 #define FUNC llrintf
 #define ITYPE float
-#define IREGS "s"
+#define IREG_SIZE 32
 #define OTYPE long long int
+#define OREG_SIZE 64
 #include <s_lrint.c>
diff --git a/sysdeps/aarch64/fpu/s_llround.c b/sysdeps/aarch64/fpu/s_llround.c
index ed4b192..ef7aedf 100644
--- a/sysdeps/aarch64/fpu/s_llround.c
+++ b/sysdeps/aarch64/fpu/s_llround.c
@@ -18,4 +18,5 @@ 
 
 #define FUNC llround
 #define OTYPE long long int
+#define OREG_SIZE 64
 #include <s_lround.c>
diff --git a/sysdeps/aarch64/fpu/s_llroundf.c b/sysdeps/aarch64/fpu/s_llroundf.c
index 360ce8b..294f0f4 100644
--- a/sysdeps/aarch64/fpu/s_llroundf.c
+++ b/sysdeps/aarch64/fpu/s_llroundf.c
@@ -18,6 +18,7 @@ 
 
 #define FUNC llroundf
 #define ITYPE float
-#define IREGS "s"
+#define IREG_SIZE 32
 #define OTYPE long long int
+#define OREG_SIZE 64
 #include <s_lround.c>
diff --git a/sysdeps/aarch64/fpu/s_lrint.c b/sysdeps/aarch64/fpu/s_lrint.c
index 8c61a03..ed0135c 100644
--- a/sysdeps/aarch64/fpu/s_lrint.c
+++ b/sysdeps/aarch64/fpu/s_lrint.c
@@ -16,7 +16,10 @@ 
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
+#include <math_private.h>
 #include <math.h>
+#include <fenv.h>
+#include <stdint.h>
 
 #ifndef FUNC
 # define FUNC lrint
@@ -24,18 +27,37 @@ 
 
 #ifndef ITYPE
 # define ITYPE double
-# define IREGS "d"
+# define IREG_SIZE 64
 #else
-# ifndef IREGS
-#  error IREGS not defined
+# ifndef IREG_SIZE
+#  error IREG_SIZE not defined
 # endif
 #endif
 
 #ifndef OTYPE
 # define OTYPE long int
+# ifdef __ILP32__
+#  define OREG_SIZE 32
+# else
+#  define OREG_SIZE 64
+# endif
+#else
+# ifndef OREG_SIZE
+#  error OREG_SIZE not defined
+# endif
+#endif
+
+#if IREG_SIZE == 32
+# define IREGS "s"
+#else
+# define IREGS "d"
 #endif
 
-#define OREGS "x"
+#if OREG_SIZE == 32
+# define OREGS "w"
+#else
+# define OREGS "x"
+#endif
 
 #define __CONCATX(a,b) __CONCAT(a,b)
 
@@ -44,6 +66,32 @@  __CONCATX(__,FUNC) (ITYPE x)
 {
   OTYPE result;
   ITYPE temp;
+
+#if IREG_SIZE == 64 && OREG_SIZE == 32
+  if (__builtin_fabs (x) > INT32_MAX)
+    {
+      /* Converting large values to a 32 bit int may cause the frintx/fcvtzs
+	 sequence to set both FE_INVALID and FE_INEXACT.  To avoid this
+	 we save and restore the fenv and only raise one or the other.  */
+
+      fenv_t env;
+      int feflags;
+
+      libc_feholdexcept (&env);
+      asm ( "frintx" "\t%" IREGS "1, %" IREGS "2\n\t"
+	    "fcvtzs" "\t%" OREGS "0, %" IREGS "1"
+	    : "=r" (result), "=w" (temp) : "w" (x) );
+      feflags = libc_fetestexcept (FE_INVALID | FE_INEXACT);
+      libc_fesetenv (&env);
+
+      if (feflags & FE_INVALID)
+	feraiseexcept (FE_INVALID);
+      else if (feflags & FE_INEXACT)
+	feraiseexcept (FE_INEXACT);
+
+      return result;
+    }
+#endif
   asm ( "frintx" "\t%" IREGS "1, %" IREGS "2\n\t"
         "fcvtzs" "\t%" OREGS "0, %" IREGS "1"
         : "=r" (result), "=w" (temp) : "w" (x) );
diff --git a/sysdeps/aarch64/fpu/s_lrintf.c b/sysdeps/aarch64/fpu/s_lrintf.c
index a995e4b..2e73271 100644
--- a/sysdeps/aarch64/fpu/s_lrintf.c
+++ b/sysdeps/aarch64/fpu/s_lrintf.c
@@ -18,5 +18,5 @@ 
 
 #define FUNC lrintf
 #define ITYPE float
-#define IREGS "s"
+#define IREG_SIZE 32
 #include <s_lrint.c>
diff --git a/sysdeps/aarch64/fpu/s_lround.c b/sysdeps/aarch64/fpu/s_lround.c
index 9be9e7f..1f77d82 100644
--- a/sysdeps/aarch64/fpu/s_lround.c
+++ b/sysdeps/aarch64/fpu/s_lround.c
@@ -24,18 +24,37 @@ 
 
 #ifndef ITYPE
 # define ITYPE double
-# define IREGS "d"
+# define IREG_SIZE 64
 #else
-# ifndef IREGS
-#  error IREGS not defined
+# ifndef IREG_SIZE
+#  error IREG_SIZE not defined
 # endif
 #endif
 
 #ifndef OTYPE
 # define OTYPE long int
+# ifdef __ILP32__
+#  define OREG_SIZE 32
+# else
+#  define OREG_SIZE 64
+# endif
+#else
+# ifndef OREG_SIZE
+#  error OREG_SIZE not defined
+# endif
+#endif
+
+#if IREG_SIZE == 32
+# define IREGS "s"
+#else
+# define IREGS "d"
 #endif
 
-#define OREGS "x"
+#if OREG_SIZE == 32
+# define OREGS "w"
+#else
+# define OREGS "x"
+#endif
 
 #define __CONCATX(a,b) __CONCAT(a,b)
 
diff --git a/sysdeps/aarch64/fpu/s_lroundf.c b/sysdeps/aarch64/fpu/s_lroundf.c
index 4a066d4..b30ddb6 100644
--- a/sysdeps/aarch64/fpu/s_lroundf.c
+++ b/sysdeps/aarch64/fpu/s_lroundf.c
@@ -18,5 +18,5 @@ 
 
 #define FUNC lroundf
 #define ITYPE float
-#define IREGS "s"
+#define IREG_SIZE 32
 #include <s_lround.c>