diff mbox

Add hidden __tls_get_addr/___tls_get_addr alias

Message ID 20141220203247.GA22423@gmail.com
State Superseded
Headers show

Commit Message

H.J. Lu Dec. 20, 2014, 8:32 p.m. UTC
On Fri, Dec 19, 2014 at 11:01:39PM -0500, Carlos O'Donell wrote:
> On 12/19/2014 08:05 PM, Chris Metcalf wrote:
> > In https://sourceware.org/ml/libc-alpha/2014-11/msg00123.html,
> > Carlos added support for checking ld.so for appropriate PLT use.
> > Running on tilegx just now, it fails, complaining that there is a
> > missing required PLT reference to __tls_get_addr in ld.so.
> 
> It is always safe to remove entries from localplt.data, but you
> should not add them without realizing you *need* the PLT entry.
> 
> > However, it's not at all clear to me that this is in fact a bug.  Looking at
> > libc.so for x86, for example, it seems that the only reason there is a PLT
> > reference is because it is called via an explicit @PLT reference from
> > _dl_tlsdesc_dynamic in sysdeps/x86_64/dl-tlsdesc.S.
> 
> Correct.
> 
> > With tilegx we don't have any calls to __tls_get_addr from within
> > ld.so itself, so there's no PLT created, so the test fails.
> 
> OK.
> 
> > It seems that Ulrich wrote the original x86 version of _dl_tlsdesc_dynamic
> > with the @PLT reference, but it's certainly not clear to me why; I
> > think this code is used only in ld.so, and I can't see a good reason
> > why you'd want to allow an override from another shared library for
> > __tls_get_addr.  So there are multiple mysteries here.
> 
> I don't know either, I took only the existing status quo and maintained it.
> 
> > I can fix this by creating a local copy of localplt.data for tile,
> > of course, but I wonder why it's not "ld.so: __tls_get_addr ?"
> > in the generic localplt.data, i.e. marked as an optional symbol.
> > Or, more deeply, why the PLT references are needed on any platform
> > in the first place.
> 
> All very good questions. Someone would have to dig into it to determine
> if it is possible to remove the reference.
> 
> > Any guidance for an appropriate change would be appreciated.
>  
> The appropriate immediate change is for tile to have it's own copy of
> localplt.data with the right entries. Less entries is always better.
> 
> Making the generic localplt.data better is secondary with the freeze
> so close.
> 

The problem was the missing hidden __tls_get_addr/___tls_get_addr
definiton.  Callers inside ld.so have to use the public interface.
__tls_get_addr/___tls_get_addr is always defined in ld.so.  There is
no need to call them via PLT inside ld.so.  This patch adds the hidden
__tls_get_addr/___tls_get_addr aliases and calls them directly from
_dl_tlsdesc_dynamic.  There is no need to set up the EBX register in
i386 _dl_tlsdesc_dynamic when calling the hidden ___tls_get_addr.
Tested on x32, x86 and x86-64.  OK to install?

Thanks.

H.J.
---
	* elf/dl-tls.c (__tls_get_addr): Provide the hidden definition
	if not defined.
	* sysdeps/i386/dl-tls.h (___tls_get_addr): Provide the hidden
	definition.
	* sysdeps/i386/dl-tlsdesc.S (_dl_tlsdesc_dynamic): Call the
	hidden ___tls_get_addr.
	* sysdeps/x86_64/dl-tlsdesc.S (_dl_tlsdesc_dynamic): Call the
	hidden __tls_get_addr.
	* sysdeps/generic/localplt.data (__tls_get_addr): Removed.
	* sysdeps/unix/sysv/linux/i386/localplt.data (___tls_get_addr):
	Likewise.
---
 ChangeLog                                  | 14 ++++++++++++++
 elf/dl-tls.c                               |  5 +++++
 sysdeps/generic/localplt.data              |  7 +++----
 sysdeps/i386/dl-tls.h                      |  2 ++
 sysdeps/i386/dl-tlsdesc.S                  |  5 +----
 sysdeps/unix/sysv/linux/i386/localplt.data |  9 +++------
 sysdeps/x86_64/dl-tlsdesc.S                |  2 +-
 7 files changed, 29 insertions(+), 15 deletions(-)

Comments

Andreas Schwab Dec. 20, 2014, 8:57 p.m. UTC | #1
"H.J. Lu" <hjl.tools@gmail.com> writes:

> +# The dynamic loader uses __libc_memalign # internally to allocate aligned

Spurious #.

Andreas.
Carlos O'Donell Dec. 21, 2014, 3:50 p.m. UTC | #2
On 12/20/2014 03:32 PM, H.J. Lu wrote:
> On Fri, Dec 19, 2014 at 11:01:39PM -0500, Carlos O'Donell wrote:
>> On 12/19/2014 08:05 PM, Chris Metcalf wrote:
>>> In https://sourceware.org/ml/libc-alpha/2014-11/msg00123.html,
>>> Carlos added support for checking ld.so for appropriate PLT use.
>>> Running on tilegx just now, it fails, complaining that there is a
>>> missing required PLT reference to __tls_get_addr in ld.so.
>>
>> It is always safe to remove entries from localplt.data, but you
>> should not add them without realizing you *need* the PLT entry.
>>
>>> However, it's not at all clear to me that this is in fact a bug.  Looking at
>>> libc.so for x86, for example, it seems that the only reason there is a PLT
>>> reference is because it is called via an explicit @PLT reference from
>>> _dl_tlsdesc_dynamic in sysdeps/x86_64/dl-tlsdesc.S.
>>
>> Correct.
>>
>>> With tilegx we don't have any calls to __tls_get_addr from within
>>> ld.so itself, so there's no PLT created, so the test fails.
>>
>> OK.
>>
>>> It seems that Ulrich wrote the original x86 version of _dl_tlsdesc_dynamic
>>> with the @PLT reference, but it's certainly not clear to me why; I
>>> think this code is used only in ld.so, and I can't see a good reason
>>> why you'd want to allow an override from another shared library for
>>> __tls_get_addr.  So there are multiple mysteries here.
>>
>> I don't know either, I took only the existing status quo and maintained it.
>>
>>> I can fix this by creating a local copy of localplt.data for tile,
>>> of course, but I wonder why it's not "ld.so: __tls_get_addr ?"
>>> in the generic localplt.data, i.e. marked as an optional symbol.
>>> Or, more deeply, why the PLT references are needed on any platform
>>> in the first place.
>>
>> All very good questions. Someone would have to dig into it to determine
>> if it is possible to remove the reference.
>>
>>> Any guidance for an appropriate change would be appreciated.
>>  
>> The appropriate immediate change is for tile to have it's own copy of
>> localplt.data with the right entries. Less entries is always better.
>>
>> Making the generic localplt.data better is secondary with the freeze
>> so close.
>>
> 
> The problem was the missing hidden __tls_get_addr/___tls_get_addr
> definiton.  Callers inside ld.so have to use the public interface.
> __tls_get_addr/___tls_get_addr is always defined in ld.so.  There is
> no need to call them via PLT inside ld.so.  This patch adds the hidden
> __tls_get_addr/___tls_get_addr aliases and calls them directly from
> _dl_tlsdesc_dynamic.  There is no need to set up the EBX register in
> i386 _dl_tlsdesc_dynamic when calling the hidden ___tls_get_addr.
> Tested on x32, x86 and x86-64.  OK to install?

This looks like it should work.

OK with the addition of the comment below.

> Thanks.
> 
> H.J.
> ---
> 	* elf/dl-tls.c (__tls_get_addr): Provide the hidden definition
> 	if not defined.
> 	* sysdeps/i386/dl-tls.h (___tls_get_addr): Provide the hidden
> 	definition.
> 	* sysdeps/i386/dl-tlsdesc.S (_dl_tlsdesc_dynamic): Call the
> 	hidden ___tls_get_addr.
> 	* sysdeps/x86_64/dl-tlsdesc.S (_dl_tlsdesc_dynamic): Call the
> 	hidden __tls_get_addr.
> 	* sysdeps/generic/localplt.data (__tls_get_addr): Removed.
> 	* sysdeps/unix/sysv/linux/i386/localplt.data (___tls_get_addr):
> 	Likewise.
> ---
>  ChangeLog                                  | 14 ++++++++++++++
>  elf/dl-tls.c                               |  5 +++++
>  sysdeps/generic/localplt.data              |  7 +++----
>  sysdeps/i386/dl-tls.h                      |  2 ++
>  sysdeps/i386/dl-tlsdesc.S                  |  5 +----
>  sysdeps/unix/sysv/linux/i386/localplt.data |  9 +++------
>  sysdeps/x86_64/dl-tlsdesc.S                |  2 +-
>  7 files changed, 29 insertions(+), 15 deletions(-)
> 
> diff --git a/ChangeLog b/ChangeLog
> index a354c93..a397544 100644
> --- a/ChangeLog
> +++ b/ChangeLog
> @@ -1,5 +1,19 @@
>  2014-12-20  H.J. Lu  <hongjiu.lu@intel.com>
>  
> +	* elf/dl-tls.c (__tls_get_addr): Provide the hidden definition
> +	if not defined.
> +	* sysdeps/i386/dl-tls.h (___tls_get_addr): Provide the hidden
> +	definition.
> +	* sysdeps/i386/dl-tlsdesc.S (_dl_tlsdesc_dynamic): Call the
> +	hidden ___tls_get_addr.
> +	* sysdeps/x86_64/dl-tlsdesc.S (_dl_tlsdesc_dynamic): Call the
> +	hidden __tls_get_addr.
> +	* sysdeps/generic/localplt.data (__tls_get_addr): Removed.
> +	* sysdeps/unix/sysv/linux/i386/localplt.data (___tls_get_addr):
> +	Likewise.
> +
> +2014-12-20  H.J. Lu  <hongjiu.lu@intel.com>
> +
>  	* sysdeps/i386/dl-machine.h (_dl_start_user): Remove @PLT
>  	from "call _dl_init@PLT".
>  	* sysdeps/x86_64/dl-machine.h (_dl_start_user): Likewise.
> diff --git a/elf/dl-tls.c b/elf/dl-tls.c
> index 76b8b36..cff213f 100644
> --- a/elf/dl-tls.c
> +++ b/elf/dl-tls.c
> @@ -809,6 +809,11 @@ update_get_addr (GET_ADDR_ARGS)
>    return (void *) p + GET_ADDR_OFFSET;
>  }
>  

Suggest comment:

/* For all machines that have a non-macro version of __tls_get_addr
   we want to use rtld_hidden_proto/*_def in order to call the internal
   alias for __tls_get_addr from ld.so. This avoids a PLT entry in
   ld.so for __tls_get_addr.  */

> +#ifndef __tls_get_addr
> +extern void * __tls_get_addr (GET_ADDR_ARGS);
> +rtld_hidden_proto (__tls_get_addr)
> +rtld_hidden_def (__tls_get_addr)
> +#endif

OK. This is true for all targets that don't define __tls_get_addr
as a macro in dl-tls.h (which is all the new targets).

>  /* The generic dynamic and local dynamic model cannot be used in
>     statically linked applications.  */
> diff --git a/sysdeps/generic/localplt.data b/sysdeps/generic/localplt.data
> index d7d6734..8d49a34 100644
> --- a/sysdeps/generic/localplt.data
> +++ b/sysdeps/generic/localplt.data
> @@ -7,10 +7,9 @@ libc.so: malloc
>  libc.so: memalign
>  libc.so: realloc
>  libm.so: matherr
> -# The dynamic loader needs __tls_get_addr for TLS, and uses __libc_memalign
> -# internally to allocate aligned TLS storage. The other malloc family of
> -# functions are expected to allow user symbol interposition.
> -ld.so: __tls_get_addr
> +# The dynamic loader uses __libc_memalign # internally to allocate aligned
> +# TLS storage. The other malloc family of functions are expected to allow
> +# user symbol interposition.
>  ld.so: __libc_memalign
>  ld.so: malloc
>  ld.so: calloc
> diff --git a/sysdeps/i386/dl-tls.h b/sysdeps/i386/dl-tls.h
> index 48809a5..99b86f9 100644
> --- a/sysdeps/i386/dl-tls.h
> +++ b/sysdeps/i386/dl-tls.h
> @@ -50,6 +50,8 @@ __tls_get_addr (tls_index *ti)
>     version of this file.  */
>  # define __tls_get_addr __attribute__ ((__regparm__ (1))) ___tls_get_addr
>  strong_alias (___tls_get_addr, ___tls_get_addr_internal)
> +rtld_hidden_proto (___tls_get_addr)
> +rtld_hidden_def (___tls_get_addr)
>  #else
>  
>  /* Users should get the better interface.  */
> diff --git a/sysdeps/i386/dl-tlsdesc.S b/sysdeps/i386/dl-tlsdesc.S
> index e6753e9..570b180 100644
> --- a/sysdeps/i386/dl-tlsdesc.S
> +++ b/sysdeps/i386/dl-tlsdesc.S
> @@ -126,10 +126,7 @@ _dl_tlsdesc_dynamic:
>  	.p2align 4,,7
>  .Lslow:
>  	cfi_adjust_cfa_offset (28)
> -	movl	%ebx, 16(%esp)
> -	LOAD_PIC_REG (bx)
> -	call	___tls_get_addr@PLT
> -	movl	16(%esp), %ebx
> +	call	HIDDEN_JUMPTARGET (___tls_get_addr)
>  	jmp	.Lret
>  	cfi_endproc
>  	.size	_dl_tlsdesc_dynamic, .-_dl_tlsdesc_dynamic
> diff --git a/sysdeps/unix/sysv/linux/i386/localplt.data b/sysdeps/unix/sysv/linux/i386/localplt.data
> index 009797b..b25abf8 100644
> --- a/sysdeps/unix/sysv/linux/i386/localplt.data
> +++ b/sysdeps/unix/sysv/linux/i386/localplt.data
> @@ -5,12 +5,9 @@ libc.so: malloc
>  libc.so: memalign
>  libc.so: realloc
>  libm.so: matherr
> -# The dynamic loader needs ___tls_get_addr for TLS, and uses __libc_memalign
> -# internally to allocate aligned TLS storage. The other malloc family of
> -# functions are expected to allow user symbol interposition.
> -# Note that it is triple underscore for ___tls_get_addr e.g. the alternate
> -# ABI.
> -ld.so: ___tls_get_addr
> +# The dynamic loader uses __libc_memalign internally to allocate aligned
> +# TLS storage. The other malloc family of functions are expected to allow
> +# user symbol interposition.
>  ld.so: __libc_memalign
>  ld.so: malloc
>  ld.so: calloc
> diff --git a/sysdeps/x86_64/dl-tlsdesc.S b/sysdeps/x86_64/dl-tlsdesc.S
> index 92e18a5..03f5ca4 100644
> --- a/sysdeps/x86_64/dl-tlsdesc.S
> +++ b/sysdeps/x86_64/dl-tlsdesc.S
> @@ -128,7 +128,7 @@ _dl_tlsdesc_dynamic:
>  	movq	%r10, 40(%rsp)
>  	movq	%r11, 48(%rsp)
>  	/* %rdi already points to the tlsinfo data structure.  */
> -	call	__tls_get_addr@PLT
> +	call	HIDDEN_JUMPTARGET (__tls_get_addr)
>  	movq	8(%rsp), %rdx
>  	movq	16(%rsp), %rcx
>  	movq	24(%rsp), %r8
>
diff mbox

Patch

diff --git a/ChangeLog b/ChangeLog
index a354c93..a397544 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,19 @@ 
 2014-12-20  H.J. Lu  <hongjiu.lu@intel.com>
 
+	* elf/dl-tls.c (__tls_get_addr): Provide the hidden definition
+	if not defined.
+	* sysdeps/i386/dl-tls.h (___tls_get_addr): Provide the hidden
+	definition.
+	* sysdeps/i386/dl-tlsdesc.S (_dl_tlsdesc_dynamic): Call the
+	hidden ___tls_get_addr.
+	* sysdeps/x86_64/dl-tlsdesc.S (_dl_tlsdesc_dynamic): Call the
+	hidden __tls_get_addr.
+	* sysdeps/generic/localplt.data (__tls_get_addr): Removed.
+	* sysdeps/unix/sysv/linux/i386/localplt.data (___tls_get_addr):
+	Likewise.
+
+2014-12-20  H.J. Lu  <hongjiu.lu@intel.com>
+
 	* sysdeps/i386/dl-machine.h (_dl_start_user): Remove @PLT
 	from "call _dl_init@PLT".
 	* sysdeps/x86_64/dl-machine.h (_dl_start_user): Likewise.
diff --git a/elf/dl-tls.c b/elf/dl-tls.c
index 76b8b36..cff213f 100644
--- a/elf/dl-tls.c
+++ b/elf/dl-tls.c
@@ -809,6 +809,11 @@  update_get_addr (GET_ADDR_ARGS)
   return (void *) p + GET_ADDR_OFFSET;
 }
 
+#ifndef __tls_get_addr
+extern void * __tls_get_addr (GET_ADDR_ARGS);
+rtld_hidden_proto (__tls_get_addr)
+rtld_hidden_def (__tls_get_addr)
+#endif
 
 /* The generic dynamic and local dynamic model cannot be used in
    statically linked applications.  */
diff --git a/sysdeps/generic/localplt.data b/sysdeps/generic/localplt.data
index d7d6734..8d49a34 100644
--- a/sysdeps/generic/localplt.data
+++ b/sysdeps/generic/localplt.data
@@ -7,10 +7,9 @@  libc.so: malloc
 libc.so: memalign
 libc.so: realloc
 libm.so: matherr
-# The dynamic loader needs __tls_get_addr for TLS, and uses __libc_memalign
-# internally to allocate aligned TLS storage. The other malloc family of
-# functions are expected to allow user symbol interposition.
-ld.so: __tls_get_addr
+# The dynamic loader uses __libc_memalign # internally to allocate aligned
+# TLS storage. The other malloc family of functions are expected to allow
+# user symbol interposition.
 ld.so: __libc_memalign
 ld.so: malloc
 ld.so: calloc
diff --git a/sysdeps/i386/dl-tls.h b/sysdeps/i386/dl-tls.h
index 48809a5..99b86f9 100644
--- a/sysdeps/i386/dl-tls.h
+++ b/sysdeps/i386/dl-tls.h
@@ -50,6 +50,8 @@  __tls_get_addr (tls_index *ti)
    version of this file.  */
 # define __tls_get_addr __attribute__ ((__regparm__ (1))) ___tls_get_addr
 strong_alias (___tls_get_addr, ___tls_get_addr_internal)
+rtld_hidden_proto (___tls_get_addr)
+rtld_hidden_def (___tls_get_addr)
 #else
 
 /* Users should get the better interface.  */
diff --git a/sysdeps/i386/dl-tlsdesc.S b/sysdeps/i386/dl-tlsdesc.S
index e6753e9..570b180 100644
--- a/sysdeps/i386/dl-tlsdesc.S
+++ b/sysdeps/i386/dl-tlsdesc.S
@@ -126,10 +126,7 @@  _dl_tlsdesc_dynamic:
 	.p2align 4,,7
 .Lslow:
 	cfi_adjust_cfa_offset (28)
-	movl	%ebx, 16(%esp)
-	LOAD_PIC_REG (bx)
-	call	___tls_get_addr@PLT
-	movl	16(%esp), %ebx
+	call	HIDDEN_JUMPTARGET (___tls_get_addr)
 	jmp	.Lret
 	cfi_endproc
 	.size	_dl_tlsdesc_dynamic, .-_dl_tlsdesc_dynamic
diff --git a/sysdeps/unix/sysv/linux/i386/localplt.data b/sysdeps/unix/sysv/linux/i386/localplt.data
index 009797b..b25abf8 100644
--- a/sysdeps/unix/sysv/linux/i386/localplt.data
+++ b/sysdeps/unix/sysv/linux/i386/localplt.data
@@ -5,12 +5,9 @@  libc.so: malloc
 libc.so: memalign
 libc.so: realloc
 libm.so: matherr
-# The dynamic loader needs ___tls_get_addr for TLS, and uses __libc_memalign
-# internally to allocate aligned TLS storage. The other malloc family of
-# functions are expected to allow user symbol interposition.
-# Note that it is triple underscore for ___tls_get_addr e.g. the alternate
-# ABI.
-ld.so: ___tls_get_addr
+# The dynamic loader uses __libc_memalign internally to allocate aligned
+# TLS storage. The other malloc family of functions are expected to allow
+# user symbol interposition.
 ld.so: __libc_memalign
 ld.so: malloc
 ld.so: calloc
diff --git a/sysdeps/x86_64/dl-tlsdesc.S b/sysdeps/x86_64/dl-tlsdesc.S
index 92e18a5..03f5ca4 100644
--- a/sysdeps/x86_64/dl-tlsdesc.S
+++ b/sysdeps/x86_64/dl-tlsdesc.S
@@ -128,7 +128,7 @@  _dl_tlsdesc_dynamic:
 	movq	%r10, 40(%rsp)
 	movq	%r11, 48(%rsp)
 	/* %rdi already points to the tlsinfo data structure.  */
-	call	__tls_get_addr@PLT
+	call	HIDDEN_JUMPTARGET (__tls_get_addr)
 	movq	8(%rsp), %rdx
 	movq	16(%rsp), %rcx
 	movq	24(%rsp), %r8