x86-64: Align the stack in __tls_get_addr [BZ #21609]
Commit Message
On Tue, Jun 20, 2017 at 03:59:12PM +0200, Florian Weimer wrote:
> On 06/19/2017 12:18 PM, Florian Weimer wrote:
> > This is a follow-up to bug 15128, where an unaligned stack was
> > compensated for in the dynamic linker relocation trampoline.
> > __tls_get_addr itself was not fixed, though, and the function can reach
> > deeply into libc due to its malloc dependency, so can result crashes
> > with an unaligned stack, too.
> >
> > The attached patch adds a compatibility implementation of __tls_get_addr
> > which aligns the stack, but only on the slow path. Internal calls go to
> > the default implementation (from elf/dl-tls.c) which does not perform
> > stack realignment.
> >
> > I plan to submit a follow-up patch which adds a new symbol version for
> > __tls_get_addr which bypasses the stack alignment for new binaries.
> >
> > In my patch, the CFI annotations need review. I have never written
> > those before.
>
> I went over the CFI annotations with Jakub. He suggested some
> simplifications, incorporated in the attached patch. We don't need
> .cfi_restore because the %rbp value at the specified offset is protected
> by the red zone until the function returns.
>
> Thanks,
> Florian
> x86-64: Align the stack in __tls_get_addr [BZ #21609]
>
> This change forces realignment of the stack pointer in
> __tls_get_addr, so that old GCC-compiled binaries continue to work
> even if vector instructions are used in glibc which require the
> ABI stack realignment.
>
> The new assembler implementation of __tls_get_addr calls the default
> implementation (from elf/dl-tls.c, now call __tls_get_addr_default)
> after realigning the stack (but only does so on the slow path).
> Internal calls go directly to __tls_get_addr_default because they do not
> need stack realignment.
>
> 2017-06-20 Florian Weimer <fweimer@redhat.com>
>
> [BZ #21609]
> __tls_get_addr implementation with stack alignment for older GCC.
> * sysdeps/x86_64/dl-tls.h: Add multiple inclusion guards.
> * sysdeps/x86_64/tls_get_addr_compat.c: New file.
> * sysdeps/x86_64/dl-tls.c: Likewise.
> * sysdeps/x86_64/rtld-offets.h: Likewise.
> * sysdeps/x86_64/dl-tlsdesc.h: Include <dl-tls.h>.
> (tls_index): Remove definition.
> * sysdeps/x86_64/tlsdesc.sym (TI_MODULE_OFFSET, TI_OFFSET_OFFSET):
> Add.
> * sysdeps/x86_64/Makefile (sysdep-dl-routines): Add
> tls_get_addr_compat.
> (gen-as-const-headers): Add rtld-offsets.sym.
>
Hi Florian,
I prefer to let GCC realign the stack for us. What do you think?
H.J.
---
From 4781d00aa6656221d1ca82c863a07689b93172c0 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Mon, 3 Jul 2017 13:16:57 -0700
Subject: [PATCH] x86-64: Align the stack in __tls_get_addr [BZ #21609]
This change forces realignment of the stack pointer in __tls_get_addr, so
that old GCC-compiled binaries continue to work even if vector instructions
are used in glibc which require the ABI stack realignment.
The implementation of __tls_get_addr calls the default implementation
(from elf/dl-tls.c, now calls __tls_get_addr_default) after realigning
the stack. Internal calls go directly to __tls_get_addr_default because
they do not need stack realignment.
2017-07-03 Florian Weimer <fweimer@redhat.com>
H.J. Lu <hongjiu.lu@intel.com>
[BZ #21609]
* sysdeps/x86_64/dl-tls.c: New file.
* sysdeps/x86_64/dl-tls.h: Add multiple inclusion guards.
---
sysdeps/x86_64/dl-tls.c | 47 +++++++++++++++++++++++++++++++++++++++++++++++
sysdeps/x86_64/dl-tls.h | 5 +++++
2 files changed, 52 insertions(+)
create mode 100644 sysdeps/x86_64/dl-tls.c
Comments
On 07/03/2017 01:25 PM, H.J. Lu wrote:
>
> Hi Florian,
>
> I prefer to let GCC realign the stack for us. What do you think?
>
> H.J.
That's what I originally suggested, but Florian's assembly version has
the advantage that it only realigns the stack in the slow path.
On 07/03/2017 10:25 PM, H.J. Lu wrote:
> I prefer to let GCC realign the stack for us. What do you think?
I tried that first, but it does not work with all the GCC versions we
support. We would have to add a configure check and hope that older GCC
versions never generate code which needs an aligned stack.
Furthermore, the code GCC generates for stack realignment is really bad,
and __tls_get_addr appears to be quite performance-critical because not
all callers cache the result.
Thanks,
Florian
new file mode 100644
@@ -0,0 +1,47 @@
+/* Thread-local storage handling in the ELF dynamic linker. x86-64 version.
+ Copyright (C) 2017 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <http://www.gnu.org/licenses/>. */
+
+#ifdef SHARED
+/* Work around GCC PR58066, due to which __tls_get_addr may be called
+ with an unaligned stack. The compat implementation is in
+ tls_get_addr-compat.S. */
+
+# include <dl-tls.h>
+
+/* Define __tls_get_addr within elf/dl-tls.c under a different
+ name. */
+extern __typeof__ (__tls_get_addr) __tls_get_addr_default;
+
+# define __tls_get_addr __tls_get_addr_default
+# include <elf/dl-tls.c>
+# undef __tls_get_addr
+
+hidden_ver (__tls_get_addr_default, __tls_get_addr)
+
+__attribute__ ((force_align_arg_pointer))
+void *
+__tls_get_addr (tls_index *ti)
+{
+ return __tls_get_addr_default (ti);
+}
+#else
+
+/* No compatibility symbol needed. */
+# include <elf/dl-tls.c>
+
+#endif
@@ -16,6 +16,9 @@
License along with the GNU C Library; if not, see
<http://www.gnu.org/licenses/>. */
+#ifndef _X86_64_DL_TLS_H
+#define _X86_64_DL_TLS_H
+
#include <stdint.h>
/* Type used for the representation of TLS information in the GOT. */
@@ -27,3 +30,5 @@ typedef struct dl_tls_index
extern void *__tls_get_addr (tls_index *ti);
+
+#endif /* _X86_64_DL_TLS_H */