[BZ,#19679,x86_64] Set DL_RUNIME_UNALIGNED_VEC_SIZE to 8

Message ID 20160219220955.GA25540@intel.com
State New, archived
Headers

Commit Message

Lu, Hongjiu Feb. 19, 2016, 10:09 p.m. UTC
  Due to GCC bug:

   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58066

__tls_get_addr may be called with 8-byte stack alignment.  Although
this bug has been fixed in GCC 4.9.4, 5.3 and 6, we can't assume
that stack will be always aligned at 16 bytes.  Since SSE optimized
memory/string functions with aligned SSE register load and store are
used in the dynamic linker, we must set DL_RUNIME_UNALIGNED_VEC_SIZE
to 8 so that _dl_runtime_resolve_sse will align the stack before
calling _dl_fixup:

Dump of assembler code for function _dl_runtime_resolve_sse:
   0x00007ffff7deea90 <+0>:	push   %rbx
   0x00007ffff7deea91 <+1>:	mov    %rsp,%rbx
   0x00007ffff7deea94 <+4>:	and    $0xfffffffffffffff0,%rsp
                                ^^^^^^^^^^^ Align stack to 16 bytes
   0x00007ffff7deea98 <+8>:	sub    $0x100,%rsp
   0x00007ffff7deea9f <+15>:	mov    %rax,0xc0(%rsp)
   0x00007ffff7deeaa7 <+23>:	mov    %rcx,0xc8(%rsp)
   0x00007ffff7deeaaf <+31>:	mov    %rdx,0xd0(%rsp)
   0x00007ffff7deeab7 <+39>:	mov    %rsi,0xd8(%rsp)
   0x00007ffff7deeabf <+47>:	mov    %rdi,0xe0(%rsp)
   0x00007ffff7deeac7 <+55>:	mov    %r8,0xe8(%rsp)
   0x00007ffff7deeacf <+63>:	mov    %r9,0xf0(%rsp)
   0x00007ffff7deead7 <+71>:	movaps %xmm0,(%rsp)
   0x00007ffff7deeadb <+75>:	movaps %xmm1,0x10(%rsp)
   0x00007ffff7deeae0 <+80>:	movaps %xmm2,0x20(%rsp)
   0x00007ffff7deeae5 <+85>:	movaps %xmm3,0x30(%rsp)
   0x00007ffff7deeaea <+90>:	movaps %xmm4,0x40(%rsp)
   0x00007ffff7deeaef <+95>:	movaps %xmm5,0x50(%rsp)
   0x00007ffff7deeaf4 <+100>:	movaps %xmm6,0x60(%rsp)
   0x00007ffff7deeaf9 <+105>:	movaps %xmm7,0x70(%rsp)

Tested on x86-64.  Any comments?

H.J.
---
	[BZ #19679]
	* sysdeps/x86_64/dl-trampoline.S (DL_RUNIME_UNALIGNED_VEC_SIZE):
	Set to 8.
---
 sysdeps/x86_64/dl-trampoline.S | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
  

Comments

Carlos O'Donell Feb. 19, 2016, 10:54 p.m. UTC | #1
On 02/19/2016 05:09 PM, H.J. Lu wrote:
> Tested on x86-64.  Any comments?

Why isn't this DL_RUN*T*IME_UNALIGNED_VEC_SIZE.

Typos in macro interfaces are the whole reason
we added -Wundef. This looks very very typo
prone, I always want to type "DL_RUNTIME_..."
 
> H.J.
> ---
> 	[BZ #19679]
> 	* sysdeps/x86_64/dl-trampoline.S (DL_RUNIME_UNALIGNED_VEC_SIZE):
> 	Set to 8.
> ---
>  sysdeps/x86_64/dl-trampoline.S | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/sysdeps/x86_64/dl-trampoline.S b/sysdeps/x86_64/dl-trampoline.S
> index 9fb6b13..0a480b6 100644
> --- a/sysdeps/x86_64/dl-trampoline.S
> +++ b/sysdeps/x86_64/dl-trampoline.S
> @@ -34,8 +34,11 @@
>  #endif
>  
>  #ifndef DL_RUNIME_UNALIGNED_VEC_SIZE
> -/* The maximum size of unaligned vector load and store.  */
> -# define DL_RUNIME_UNALIGNED_VEC_SIZE 16
> +/* The maximum size of unaligned vector load and store in the dynamic

The maximum size *in bytes* ...

> +   linker.  Since SSE optimized memory/string functions with aligned
> +   SSE register load and store are used in the dynamic linker, we must
> +   set this to 8.  */
> +# define DL_RUNIME_UNALIGNED_VEC_SIZE 8

OK.

>  #endif
>  
>  /* True if _dl_runtime_resolve should align stack to VEC_SIZE bytes.  */
> 

OK if you:
* Fix the name to be DL_RUNTIME_*
* Adjust the comment to use a unit of size.

Cheers,
Carlos.
  

Patch

diff --git a/sysdeps/x86_64/dl-trampoline.S b/sysdeps/x86_64/dl-trampoline.S
index 9fb6b13..0a480b6 100644
--- a/sysdeps/x86_64/dl-trampoline.S
+++ b/sysdeps/x86_64/dl-trampoline.S
@@ -34,8 +34,11 @@ 
 #endif
 
 #ifndef DL_RUNIME_UNALIGNED_VEC_SIZE
-/* The maximum size of unaligned vector load and store.  */
-# define DL_RUNIME_UNALIGNED_VEC_SIZE 16
+/* The maximum size of unaligned vector load and store in the dynamic
+   linker.  Since SSE optimized memory/string functions with aligned
+   SSE register load and store are used in the dynamic linker, we must
+   set this to 8.  */
+# define DL_RUNIME_UNALIGNED_VEC_SIZE 8
 #endif
 
 /* True if _dl_runtime_resolve should align stack to VEC_SIZE bytes.  */