[v5] x86_64: Update THREAD_SETMEM/THREAD_SETMEM_NC for IMM64 [BZ #27591]

Message ID CAMe9rOpFyz495-B_VrvNAMENLGSPiSbLFZk6cO_0deB3ei373w@mail.gmail.com
State Superseded
Headers
Series [v5] x86_64: Update THREAD_SETMEM/THREAD_SETMEM_NC for IMM64 [BZ #27591] |

Commit Message

H.J. Lu March 16, 2021, 3:45 p.m. UTC
  On Tue, Mar 16, 2021 at 2:04 AM Andreas Schwab <schwab@linux-m68k.org> wrote:
>
> On Mär 15 2021, Carlos O'Donell via Libc-alpha wrote:
>
> > Then we try to fix this with the new sequence:
> >
> >   asm volatile ("movq %q0,%%fs:%P1" :
> >               : "i" ((uint64_t) cast_to_integer (value)),
> >                 "i" (offsetof (struct pthread, member)));
> >
> > Note that "%q0" is a machine constraint for any register.
>
> An operand modifier isn't a constraint.  It does not change what the
> compiler matches for the operand, it only modifies the way the operand
> is output.  'q' has no meaning for an operand matched by the 'i'
> constraint, since that is never a register.

That is correct.

Since movq takes a register or a signed 32-bit immediate source operand,
select the immediate source operand only if it is known to be 32-bit
signed immediate at compile-time.

Here is the v5 patch with a testcase.  OK for master?

Thanks.

--
H.J.
  

Comments

Andreas Schwab March 16, 2021, 4:02 p.m. UTC | #1
On Mär 16 2021, H.J. Lu wrote:

> diff --git a/sysdeps/x86_64/nptl/tls.h b/sysdeps/x86_64/nptl/tls.h
> index 20f0958780..b19d90433d 100644
> --- a/sysdeps/x86_64/nptl/tls.h
> +++ b/sysdeps/x86_64/nptl/tls.h
> @@ -271,9 +271,18 @@ _Static_assert (offsetof (tcbhead_t, __glibc_unused2) == 0x80,
>  		       "i" (offsetof (struct pthread, member)));	      \
>       else /* 8 */							      \
>         {								      \
> -	 asm volatile ("movq %q0,%%fs:%P1" :				      \
> -		       : IMM_MODE ((uint64_t) cast_to_integer (value)),	      \
> -			 "i" (offsetof (struct pthread, member)));	      \
> +	 /* Since movq takes a register or a signed 32-bit immediate source   \
> +	    operand, select the immediate source operand only if it is known  \
> +	    to be 32-bit signed immediate at compile-time.  */		      \
> +	 if (__builtin_constant_p (value)				      \
> +	     && (int64_t) (int32_t) (uintptr_t) value == (uintptr_t) value)   \
> +	   asm volatile ("movq %0,%%fs:%P1" :				      \
> +			 : "i" ((uint64_t) cast_to_integer (value)),	      \
> +			   "i" (offsetof (struct pthread, member)));	      \
> +	 else								      \
> +	   asm volatile ("movq %q0,%%fs:%P1" :				      \
> +			 : "r" ((uint64_t) cast_to_integer (value)),	      \
> +			   "i" (offsetof (struct pthread, member)));	      \

I don't think this will work in general.  Wouldn't it be possible that
an operands optimizes to a 64-bit constant in later passes?  What you
really need is a constraint that only matches a 32-bit immediate or a
register.  Like "rWf"?

Andreas.
  
Florian Weimer March 16, 2021, 4:10 p.m. UTC | #2
* Andreas Schwab:

> I don't think this will work in general.  Wouldn't it be possible that
> an operands optimizes to a 64-bit constant in later passes?  What you
> really need is a constraint that only matches a 32-bit immediate or a
> register.  Like "rWf"?

Maybe we should to the __seg_fs namespace instead?  Wouldn't that avoid
these issues?

Thanks,
Florian
  
Carlos O'Donell March 16, 2021, 4:34 p.m. UTC | #3
On 3/16/21 12:10 PM, Florian Weimer via Libc-alpha wrote:
> * Andreas Schwab:
> 
>> I don't think this will work in general.  Wouldn't it be possible that
>> an operands optimizes to a 64-bit constant in later passes?  What you
>> really need is a constraint that only matches a 32-bit immediate or a
>> register.  Like "rWf"?
> 
> Maybe we should to the __seg_fs namespace instead?  Wouldn't that avoid
> these issues?

It might.

HJ, Does it work to rewrite this using __seg_fs?

Otherwise I'll review v5, which I think is better than it was before and
makes forward progress.
  
H.J. Lu March 16, 2021, 4:50 p.m. UTC | #4
On Tue, Mar 16, 2021 at 9:34 AM Carlos O'Donell <carlos@redhat.com> wrote:
>
> On 3/16/21 12:10 PM, Florian Weimer via Libc-alpha wrote:
> > * Andreas Schwab:
> >
> >> I don't think this will work in general.  Wouldn't it be possible that
> >> an operands optimizes to a 64-bit constant in later passes?  What you
> >> really need is a constraint that only matches a 32-bit immediate or a
> >> register.  Like "rWf"?

config/i386/constraints.md in GCC has

(define_constraint "e"
  "32-bit signed integer constant, or a symbolic reference known
   to fit that range (for immediate operands in sign-extending x86-64
   instructions)."
  (match_operand 0 "x86_64_immediate_operand"))

"er" is the correct constraint for movq.

> > Maybe we should to the __seg_fs namespace instead?  Wouldn't that avoid
> > these issues?
>
> It might.
>
> HJ, Does it work to rewrite this using __seg_fs?

I tried it.  GCC generates worse code.

> Otherwise I'll review v5, which I think is better than it was before and
> makes forward progress.

Here is the v6 patch to use the "er" constraint.   OK for master?

Thanks.
  
H.J. Lu March 16, 2021, 4:52 p.m. UTC | #5
On Tue, Mar 16, 2021 at 9:50 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Tue, Mar 16, 2021 at 9:34 AM Carlos O'Donell <carlos@redhat.com> wrote:
> >
> > On 3/16/21 12:10 PM, Florian Weimer via Libc-alpha wrote:
> > > * Andreas Schwab:
> > >
> > >> I don't think this will work in general.  Wouldn't it be possible that
> > >> an operands optimizes to a 64-bit constant in later passes?  What you
> > >> really need is a constraint that only matches a 32-bit immediate or a
> > >> register.  Like "rWf"?
>
> config/i386/constraints.md in GCC has
>
> (define_constraint "e"
>   "32-bit signed integer constant, or a symbolic reference known
>    to fit that range (for immediate operands in sign-extending x86-64
>    instructions)."
>   (match_operand 0 "x86_64_immediate_operand"))
>
> "er" is the correct constraint for movq.
>
> > > Maybe we should to the __seg_fs namespace instead?  Wouldn't that avoid
> > > these issues?
> >
> > It might.
> >
> > HJ, Does it work to rewrite this using __seg_fs?
>
> I tried it.  GCC generates worse code.
>
> > Otherwise I'll review v5, which I think is better than it was before and
> > makes forward progress.
>
> Here is the v6 patch to use the "er" constraint.   OK for master?
>

Include the patch.
  

Patch

From 199f147a2c95549890fcb1c8d4c3598bce45545b Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Tue, 16 Mar 2021 07:41:46 -0700
Subject: [PATCH v5] x86_64: Update THREAD_SETMEM/THREAD_SETMEM_NC for IMM64
 [BZ #27591]

Since movq takes a register or a signed 32-bit immediate source operand,
select the immediate source operand only if it is known to be 32-bit
signed immediate at compile-time.
---
 sysdeps/x86_64/Makefile           |  2 ++
 sysdeps/x86_64/nptl/tls.h         | 33 +++++++++++++++-----
 sysdeps/x86_64/tst-x86-64-tls-1.c | 50 +++++++++++++++++++++++++++++++
 3 files changed, 78 insertions(+), 7 deletions(-)
 create mode 100644 sysdeps/x86_64/tst-x86-64-tls-1.c

diff --git a/sysdeps/x86_64/Makefile b/sysdeps/x86_64/Makefile
index d1d7cb9d2e..06a444b6af 100644
--- a/sysdeps/x86_64/Makefile
+++ b/sysdeps/x86_64/Makefile
@@ -183,6 +183,8 @@  ifeq (no,$(build-hardcoded-path-in-tests))
 tests-container += tst-glibc-hwcaps-cache
 endif
 
+tests-internal += tst-x86-64-tls-1
+
 endif # $(subdir) == elf
 
 ifeq ($(subdir),csu)
diff --git a/sysdeps/x86_64/nptl/tls.h b/sysdeps/x86_64/nptl/tls.h
index 20f0958780..b19d90433d 100644
--- a/sysdeps/x86_64/nptl/tls.h
+++ b/sysdeps/x86_64/nptl/tls.h
@@ -271,9 +271,18 @@  _Static_assert (offsetof (tcbhead_t, __glibc_unused2) == 0x80,
 		       "i" (offsetof (struct pthread, member)));	      \
      else /* 8 */							      \
        {								      \
-	 asm volatile ("movq %q0,%%fs:%P1" :				      \
-		       : IMM_MODE ((uint64_t) cast_to_integer (value)),	      \
-			 "i" (offsetof (struct pthread, member)));	      \
+	 /* Since movq takes a register or a signed 32-bit immediate source   \
+	    operand, select the immediate source operand only if it is known  \
+	    to be 32-bit signed immediate at compile-time.  */		      \
+	 if (__builtin_constant_p (value)				      \
+	     && (int64_t) (int32_t) (uintptr_t) value == (uintptr_t) value)   \
+	   asm volatile ("movq %0,%%fs:%P1" :				      \
+			 : "i" ((uint64_t) cast_to_integer (value)),	      \
+			   "i" (offsetof (struct pthread, member)));	      \
+	 else								      \
+	   asm volatile ("movq %q0,%%fs:%P1" :				      \
+			 : "r" ((uint64_t) cast_to_integer (value)),	      \
+			   "i" (offsetof (struct pthread, member)));	      \
        }})
 
 
@@ -296,10 +305,20 @@  _Static_assert (offsetof (tcbhead_t, __glibc_unused2) == 0x80,
 		       "r" (idx));					      \
      else /* 8 */							      \
        {								      \
-	 asm volatile ("movq %q0,%%fs:%P1(,%q2,8)" :			      \
-		       : IMM_MODE ((uint64_t) cast_to_integer (value)),	      \
-			 "i" (offsetof (struct pthread, member[0])),	      \
-			 "r" (idx));					      \
+	 /* Since movq takes a register or a signed 32-bit immediate source   \
+	    operand, select the immediate source operand only if it is known  \
+	    to be 32-bit signed immediate at compile-time.  */		      \
+	 if (__builtin_constant_p (value)				      \
+	     && (int64_t) (int32_t) (uintptr_t) value == (uintptr_t) value)   \
+	   asm volatile ("movq %0,%%fs:%P1(,%q2,8)" :			      \
+			 : "i" ((uint64_t) cast_to_integer (value)),	      \
+			   "i" (offsetof (struct pthread, member[0])),	      \
+			   "r" (idx));					      \
+	 else								      \
+	   asm volatile ("movq %q0,%%fs:%P1(,%q2,8)" :			      \
+			 : "r" ((uint64_t) cast_to_integer (value)),	      \
+			   "i" (offsetof (struct pthread, member[0])),	      \
+			   "r" (idx));					      \
        }})
 
 
diff --git a/sysdeps/x86_64/tst-x86-64-tls-1.c b/sysdeps/x86_64/tst-x86-64-tls-1.c
new file mode 100644
index 0000000000..ac6b13d3b7
--- /dev/null
+++ b/sysdeps/x86_64/tst-x86-64-tls-1.c
@@ -0,0 +1,50 @@ 
+/* Test THREAD_SETMEM and THREAD_SETMEM_NC for IMM64.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <tls.h>
+#include <support/check.h>
+
+static int
+do_test (void)
+{
+  unsigned long long int saved_ssp_base
+    = THREAD_GETMEM (THREAD_SELF, header.ssp_base);
+  THREAD_SETMEM (THREAD_SELF, header.ssp_base, (1ULL << 57) - 1);
+  unsigned long long int ssp_base
+    = THREAD_GETMEM (THREAD_SELF, header.ssp_base);
+  if (ssp_base != ((1ULL << 57) - 1))
+    FAIL_EXIT1 ("THREAD_SETMEM: 0x%llx != 0x%llx",
+		ssp_base, (1ULL << 57) - 1);
+  THREAD_SETMEM (THREAD_SELF, header.ssp_base, saved_ssp_base);
+#ifndef __ILP32__
+  struct pthread_key_data *saved_specific
+    = THREAD_GETMEM_NC (THREAD_SELF, specific, 1);
+  uintptr_t value = (1UL << 57) - 1;
+  THREAD_SETMEM_NC (THREAD_SELF, specific, 1,
+		    (struct pthread_key_data *) value);
+  struct pthread_key_data *specific
+    = THREAD_GETMEM_NC (THREAD_SELF, specific, 1);
+  if (specific != (struct pthread_key_data *) value)
+    FAIL_EXIT1 ("THREAD_GETMEM_NC: %p != %p",
+		specific, (struct pthread_key_data *) value);
+  THREAD_SETMEM_NC (THREAD_SELF, specific, 1, saved_specific);
+#endif
+  return 0;
+}
+
+#include <support/test-driver.c>
-- 
2.30.2