[BZ,#17078] ARM: R_ARM_TLS_DESC prelinker support

Message ID alpine.DEB.1.10.1406230322570.25395@tp.orcam.me.uk
State Committed
Headers

Commit Message

Maciej W. Rozycki July 1, 2014, 1:28 p.m. UTC
  Hello,

 Here is a change to the dynamic linker to add prelinker support for the 
R_ARM_TLS_DESC relocation.  Two cases can be considered here, the usual 
one where lazy binding is in use and the less frequent one, where 
immediate binding is requested via the use of the DF_BIND_NOW dynamic flag 
(e.g. by using the GNU linker's "-z now" option).

 The change below only handle the first case.  In this scenario the 
prelinker does what the dynamic linker would do, that is preinitialises 
R_ARM_TLS_DESC relocations with a pointer to the lazy specialisation as 
provided with the DT_TLSDESC_PLT dynamic tag.  A conflict is additionally 
created and in the conflict resolution path the dynamic linker complements 
the work by initialising the object's pointer as indicated by the 
DT_TLSDESC_GOT dynamic tag to the linker's internal lazy specialisation 
worker function and also providing the associated link map in the second 
entry of the GOT.  This step is required, because if prelinking is 
successful at the run time, then the dynamic linker's 
elf_machine_runtime_setup() function isn't called that would normally do 
so.

 The second case remains unresolved, because support for that scenario has 
not been implemented in the prelinker.  In this case the lazy 
specialisation is unavailable and the DT_TLSDESC_PLT dynamic tag is not 
present.

 The prelinker could assume the common case of static specialisation and 
resolve the relocation, but that would require the exposure of dynamic 
linker's specialisation worker function.  Furthermore the dynamic linker 
would have to handle the relocation in the conflict resoultion path and 
see if the dynamic specialisation shouldn't be used instead.  This however 
would require access to data structures currently not made available to 
the conflict resoultion path and therefore a redesign of this part of the 
dynamic linker.

 Alternatively the prelinker could defer all processing to the dynamic 
linker's conflict resolution path, but that would require similar access 
to the said data structures.

 Therefore the prelinker issues an error instead and the dynamic linker 
has assertions to check that DT_TLSDESC_PLT and DT_TLSDESC_GOT are in use 
in its conflict resolution path.

 The changes below resolve all TLS failures in the prelinker testsuite, 
as noted in the bug report, as well as the small test case provided there.  
Unfortunately we don't seem to have any hooks to factor in the prelinker 
(if present on a system) to testing, so unless someone has an idea how to 
do that, this fix will have to rely on using the prelinker test suite and 
enabling TLS descriptors there for coverage.

 This has been verified across a number of multilibs with no regressions 
in our test suite.  OK to apply?

2014-07-01  Maciej W. Rozycki  <macro@codesourcery.com>

	[BZ #17078]
	* sysdeps/arm/dl-machine.h (elf_machine_rela)
	[RESOLVE_CONFLICT_FIND_MAP]: Handle R_ARM_TLS_DESC relocation.
	(elf_machine_lazy_rel): Handle prelinked R_ARM_TLS_DESC entries.

  Maciej

glibc-arm-tlsdesc-prelink.patch
  

Comments

Rich Felker July 1, 2014, 4:16 p.m. UTC | #1
On Tue, Jul 01, 2014 at 02:28:04PM +0100, Maciej W. Rozycki wrote:
> Hello,
> 
>  Here is a change to the dynamic linker to add prelinker support for the 
> R_ARM_TLS_DESC relocation.  Two cases can be considered here, the usual 
> one where lazy binding is in use and the less frequent one, where 
> immediate binding is requested via the use of the DF_BIND_NOW dynamic flag 
> (e.g. by using the GNU linker's "-z now" option).
> 
>  The change below only handle the first case.  In this scenario the 

Given that there seems to be an intent for glibc to move towards safe
allocation of TLS at dlopen/pthread_create time rather than lazy
allocation (which inherently leads to crashing under memory
exhaustion), I don't think it's useful to focus on the lazy case,
which also crashes if there's no memory for the tls index structure.
Rather the lazy case should eventually be removed.

Technically it's possible to support the lazy case without crashing on
memory exhaustion, by simply re-doing the lookup every time the symbol
is referenced if allocation fails. But I really doubt anybody wants to
implement that hideously-slow fallback. And the allocation also makes
the accesses async-signal-unsafe, which is another bug that should be
fixed.

Rich
  
Maciej W. Rozycki July 2, 2014, 8:39 a.m. UTC | #2
On Tue, 1 Jul 2014, Rich Felker wrote:

> >  Here is a change to the dynamic linker to add prelinker support for the 
> > R_ARM_TLS_DESC relocation.  Two cases can be considered here, the usual 
> > one where lazy binding is in use and the less frequent one, where 
> > immediate binding is requested via the use of the DF_BIND_NOW dynamic flag 
> > (e.g. by using the GNU linker's "-z now" option).
> > 
> >  The change below only handle the first case.  In this scenario the 
> 
> Given that there seems to be an intent for glibc to move towards safe
> allocation of TLS at dlopen/pthread_create time rather than lazy
> allocation (which inherently leads to crashing under memory
> exhaustion), I don't think it's useful to focus on the lazy case,
> which also crashes if there's no memory for the tls index structure.
> Rather the lazy case should eventually be removed.
> 
> Technically it's possible to support the lazy case without crashing on
> memory exhaustion, by simply re-doing the lookup every time the symbol
> is referenced if allocation fails. But I really doubt anybody wants to
> implement that hideously-slow fallback. And the allocation also makes
> the accesses async-signal-unsafe, which is another bug that should be
> fixed.

 Thank you for your input.  I think it is good that you think about glibc 
improvements, and you are welcome to submit patches to back up your 
considerations.

 However this change handles what is already there and supported across 
the toolchain and glibc, fixing a legitimate use case that does not work 
although it should.  It is also mostly agnostic about dynamic loading 
implementation internals by merely copying data from the DT_TLSDESC_PLT 
dynamic tag to the memory location pointed by the DT_TLSDESC_GOT tag as 
per the TLS descriptor ARM/Linux psABI addendum.  Any change to this 
semantics that your suggested improvements may require will be an 
incompatible ABI change and support for preexisting binaries will have to 
be retained anyway.  There is also GOT[1] initialised here as per our 
practice elsewhere, which is our dynamic linker's internal protocol, and 
obviously any incompatible change of yours will have to address the 
different treatment of GOT[1] throughout our code anyway.  But at the 
point you'll have redefined or removed DT_TLSDESC_PLT/DT_TLSDESC_GOT this 
is going to be the least of a problem.

 Therefore I fail to see how the observations you have made relate to the 
bug fix I have proposed; as I say you're free to submit any improvements 
on top of it.

  Maciej
  
Rich Felker July 2, 2014, 2:41 p.m. UTC | #3
On Wed, Jul 02, 2014 at 09:39:44AM +0100, Maciej W. Rozycki wrote:
>  However this change handles what is already there and supported across 
> the toolchain and glibc, fixing a legitimate use case that does not work 
> although it should.

If prelink is really broken (errors/crashes at runtime, etc.) on ARM
right now because of this issue, then yes, by all means fix it.

> It is also mostly agnostic about dynamic loading 
> implementation internals by merely copying data from the DT_TLSDESC_PLT 
> dynamic tag to the memory location pointed by the DT_TLSDESC_GOT tag as 
> per the TLS descriptor ARM/Linux psABI addendum.  Any change to this 
> semantics that your suggested improvements may require will be an 
> incompatible ABI change and support for preexisting binaries will have to 
> be retained anyway.

I don't think there are any ABI incompatibilities; it's all an
internal implementation detail of the dynamic linker. Of course the
changed dynamic linker would always overwrite these prelinked TLSDESC
relocations at startup since they'd all be lazy.

> There is also GOT[1] initialised here as per our 
> practice elsewhere, which is our dynamic linker's internal protocol, and 
> obviously any incompatible change of yours will have to address the 
> different treatment of GOT[1] throughout our code anyway.  But at the 
> point you'll have redefined or removed DT_TLSDESC_PLT/DT_TLSDESC_GOT this 
> is going to be the least of a problem.

These aren't my proposed changes; they're a natural continuation of
the project that was started (and reverted because it wasn't ready) to
make TLS async-signal-safe and eliminate the possibility of error
conditions that applications fundamentally can't handle at runtime.

Rich
  
Joseph Myers July 16, 2014, 8:39 p.m. UTC | #4
On Tue, 1 Jul 2014, Maciej W. Rozycki wrote:

> 2014-07-01  Maciej W. Rozycki  <macro@codesourcery.com>
> 
> 	[BZ #17078]
> 	* sysdeps/arm/dl-machine.h (elf_machine_rela)
> 	[RESOLVE_CONFLICT_FIND_MAP]: Handle R_ARM_TLS_DESC relocation.
> 	(elf_machine_lazy_rel): Handle prelinked R_ARM_TLS_DESC entries.

OK.
  
Maciej W. Rozycki July 17, 2014, 6:50 p.m. UTC | #5
On Wed, 16 Jul 2014, Joseph S. Myers wrote:

> > 2014-07-01  Maciej W. Rozycki  <macro@codesourcery.com>
> > 
> > 	[BZ #17078]
> > 	* sysdeps/arm/dl-machine.h (elf_machine_rela)
> > 	[RESOLVE_CONFLICT_FIND_MAP]: Handle R_ARM_TLS_DESC relocation.
> > 	(elf_machine_lazy_rel): Handle prelinked R_ARM_TLS_DESC entries.
> 
> OK.

 Applied, thanks.

  Maciej
  

Patch

Index: glibc-fsf-trunk-quilt/sysdeps/arm/dl-machine.h
===================================================================
--- glibc-fsf-trunk-quilt.orig/sysdeps/arm/dl-machine.h	2014-06-23 03:36:48.621961686 +0100
+++ glibc-fsf-trunk-quilt/sysdeps/arm/dl-machine.h	2014-06-23 03:37:04.621789475 +0100
@@ -573,6 +573,32 @@  elf_machine_rela (struct link_map *map, 
 	case R_ARM_ABS32:
 	  *reloc_addr = value + reloc->r_addend;
 	  break;
+#  ifdef RESOLVE_CONFLICT_FIND_MAP
+	case R_ARM_TLS_DESC:
+	  {
+	    struct tlsdesc volatile *td =
+	      (struct tlsdesc volatile *) reloc_addr;
+
+	    RESOLVE_CONFLICT_FIND_MAP (map, reloc_addr);
+
+	    /* Make sure we know what's going on.  */
+	    assert (td->entry
+		    == (void *) (D_PTR (map, l_info[ADDRIDX (DT_TLSDESC_PLT)])
+				 + map->l_addr));
+	    assert (map->l_info[ADDRIDX (DT_TLSDESC_GOT)]);
+
+	    /* Set up the lazy resolver and store the pointer to our link
+	       map in _GLOBAL_OFFSET_TABLE[1] now as for a prelinked
+	       binary elf_machine_runtime_setup() is not called and hence
+	       neither has been initialized.  */
+	    *(Elf32_Addr *) (D_PTR (map, l_info[ADDRIDX (DT_TLSDESC_GOT)])
+			     + map->l_addr)
+	      = (Elf32_Addr) &_dl_tlsdesc_lazy_resolver;
+	    ((Elf32_Addr *) D_PTR (map, l_info[DT_PLTGOT]))[1]
+	      = (Elf32_Addr) map;
+	  }
+	  break;
+#  endif /* RESOLVE_CONFLICT_FIND_MAP */
 	case R_ARM_PC24:
           relocate_pc24 (map, value, reloc_addr, reloc->r_addend);
 	  break;
@@ -652,9 +678,11 @@  elf_machine_lazy_rel (struct link_map *m
 	(struct tlsdesc volatile *)reloc_addr;
 
       /* The linker must have given us the parameter we need in the
-	 first GOT entry, and left the second one empty. We fill the
-	 last with the resolver address */
-      assert (td->entry == 0);
+	 first GOT entry, and left the second one empty.  The latter
+	 will have been preset by the prelinker if used though.
+	 We fill it with the resolver address.  */
+      assert (td->entry == 0
+	      || map->l_info[VALIDX (DT_GNU_PRELINKED)] != NULL);
       td->entry = (void*)(D_PTR (map, l_info[ADDRIDX (DT_TLSDESC_PLT)])
 			  + map->l_addr);
     }