ppc64le: expected localentry:0 `pthread_condattr_destroy'

Message ID 20170729044418.GE1956@bubble.grove.modra.org
State Not applicable
Headers

Commit Message

Alan Modra July 29, 2017, 4:44 a.m. UTC
  On Fri, Jul 28, 2017 at 03:17:45PM +0200, Florian Weimer wrote:
> This absolutely has to be disabled by default.

I'd already come to that conclusion.  I'll commit the following to
binutils master and 2.29 after some testing.

Subject: [PATCH] PR 21847, Don't default PowerPC64 to --plt-localentry

The big comment in ppc64_elf_tls_setup says why.  I've also added some
code to the bfd linker that catches the -lpthread -lc symbol
differences and disable generation of optimized call stubs even when
--plt-localentry is activated.  Gold doesn't yet have that.

	PR 21847
bfd/
	* elf64-ppc.c (struct ppc_link_hash_entry): Add non_zero_localentry.
	(ppc64_elf_merge_symbol): Set non_zero_localentry.
	(is_elfv2_localentry0): Test non_zero_localentry.
	(ppc64_elf_tls_setup): Default to --no-plt-localentry.
gold/
	* powerpc.cc (Target_powerpc::scan_relocs): Default to
	--no-plt-localentry.
ld/
	* ld.texinfo (plt-localentry): Document.
  

Comments

Florian Weimer July 30, 2017, 9:15 a.m. UTC | #1
On 07/29/2017 06:44 AM, Alan Modra wrote:
> +@cindex PowerPC64 ELFv2 PLT localentry optimization
> +@kindex --plt-localentry
> +@kindex --no-plt-localentry
> +@item --plt-localentry
> +@itemx --no-localentry
> +ELFv2 functions with localentry:0 are those with a single entry point,
> +ie. global entry == local entry, and that have no requirement on r2
> +(the TOC/GOT pointer) or r12, and guarantee r2 is unchanged on return.
> +Such an external function can be called via the PLT without saving r2
> +or restoring it on return, avoiding a common load-hit-store for small
> +functions.   The optimization is attractive, with up to 40% reduction
> +in execution time for a small function, but can result in symbol
> +interposition failures.  Use with care.  @option{--no-plt-localentry}
> +is the default.

I think this is still very misleading.

The documentation should make the following things clear:

* If objects are linked with --plt-localentry, it is likely that at run
time, the exact same version of the object will be required by the
dynamic linker.  Even supposedly ABI-compatible library updates may not
work.

* Mere recompilation of shared object dependencies can prevent loading
of objects linked with --plt-localentry even if the sources are
unchanged.  Compiler flag changes (such as optimization levels) and
compiler version changes affect compatibility.

* There is no promise  that applications which link to the core GNU
toolchain libraries (libgcc_s, glibc, libstdc++) will continue to load
after any change to these libraries, even though these libraries
generally maintain ABI backward compatibility.

The point about symbol interposition is valid, but it is really not the
core issue here.

The source code comment should be adjusted in a similar way.

Thanks,
Florian
  
Andreas Schwab July 30, 2017, 10:22 a.m. UTC | #2
On Jul 30 2017, Florian Weimer <fweimer@redhat.com> wrote:

> * If objects are linked with --plt-localentry, it is likely that at run
> time, the exact same version of the object will be required by the
> dynamic linker.  Even supposedly ABI-compatible library updates may not
> work.

Or, in other words, --plt-localentry changes the ABI.

Andreas.
  
Florian Weimer July 30, 2017, 10:29 a.m. UTC | #3
On 07/30/2017 12:22 PM, Andreas Schwab wrote:
> On Jul 30 2017, Florian Weimer <fweimer@redhat.com> wrote:
> 
>> * If objects are linked with --plt-localentry, it is likely that at run
>> time, the exact same version of the object will be required by the
>> dynamic linker.  Even supposedly ABI-compatible library updates may not
>> work.
> 
> Or, in other words, --plt-localentry changes the ABI.

It's worse than that: It changes what is part of the ABI.  It makes ABI
dependencies more strict, i.e., changes which were ABI-compatible before
no longer are.  And what I think it is important to convey is that none
of the GNU toolchain libraries (and other system libraries, presumably)
are prepared to preserve ABI compatibility according to this new
definition of what is part of the ABI of a shared object.

Thanks,
Florian
  

Patch

diff --git a/bfd/ChangeLog b/bfd/ChangeLog
index 0f71358..41c935d 100644
--- a/bfd/ChangeLog
+++ b/bfd/ChangeLog
@@ -1,3 +1,11 @@ 
+2017-07-29  Alan Modra  <amodra@gmail.com>
+
+	PR 21847
+	* elf64-ppc.c (struct ppc_link_hash_entry): Add non_zero_localentry.
+	(ppc64_elf_merge_symbol): Set non_zero_localentry.
+	(is_elfv2_localentry0): Test non_zero_localentry.
+	(ppc64_elf_tls_setup): Default to --no-plt-localentry.
+
 2017-07-28  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
 
         * elf32-s390.c (elf_s390_finish_dynamic_sections): Add NULL
diff --git a/bfd/elf64-ppc.c b/bfd/elf64-ppc.c
index cc0e8ee..5f3c79f 100644
--- a/bfd/elf64-ppc.c
+++ b/bfd/elf64-ppc.c
@@ -4013,6 +4013,10 @@  struct ppc_link_hash_entry
      with non-standard calling convention.  */
   unsigned int save_res:1;
 
+  /* Set if a duplicate symbol with non-zero localentry is detected,
+     even when the duplicate symbol does not provide a definition.  */
+  unsigned int non_zero_localentry:1;
+
   /* Contexts in which symbol is used in the GOT (or TOC).
      TLS_GD .. TLS_EXPLICIT bits are or'd into the mask as the
      corresponding relocs are encountered during check_relocs.
@@ -5021,7 +5025,7 @@  ppc64_elf_merge_symbol_attribute (struct elf_link_hash_entry *h,
 
 static bfd_boolean
 ppc64_elf_merge_symbol (struct elf_link_hash_entry *h,
-			const Elf_Internal_Sym *isym ATTRIBUTE_UNUSED,
+			const Elf_Internal_Sym *isym,
 			asection **psec ATTRIBUTE_UNUSED,
 			bfd_boolean newdef ATTRIBUTE_UNUSED,
 			bfd_boolean olddef ATTRIBUTE_UNUSED,
@@ -5029,6 +5033,8 @@  ppc64_elf_merge_symbol (struct elf_link_hash_entry *h,
 			const asection *oldsec ATTRIBUTE_UNUSED)
 {
   ((struct ppc_link_hash_entry *) h)->fake = 0;
+  if ((STO_PPC64_LOCAL_MASK & isym->st_other) != 0)
+    ((struct ppc_link_hash_entry *) h)->non_zero_localentry = 1;
   return TRUE;
 }
 
@@ -6335,6 +6341,7 @@  is_elfv2_localentry0 (struct elf_link_hash_entry *h)
 	  && h->type == STT_FUNC
 	  && h->root.type == bfd_link_hash_defined
 	  && (STO_PPC64_LOCAL_MASK & h->other) == 0
+	  && !((struct ppc_link_hash_entry *) h)->non_zero_localentry
 	  && is_ppc64_elf (h->root.u.def.section->owner)
 	  && abiversion (h->root.u.def.section->owner) >= 2);
 }
@@ -8349,10 +8356,22 @@  ppc64_elf_tls_setup (struct bfd_link_info *info)
   else if (!htab->do_multi_toc)
     htab->params->no_multi_toc = 1;
 
+  /* Default to --no-plt-localentry, as this option can cause problems
+     with symbol interposition.  For example, glibc libpthread.so and
+     libc.so duplicate many pthread symbols, with a fallback
+     implementation in libc.so.  In some cases the fallback does more
+     work than the pthread implementation.  __pthread_condattr_destroy
+     is one such symbol: the libpthread.so implementation is
+     localentry:0 while the libc.so implementation is localentry:8.
+     An app that "cleverly" uses dlopen to only load necessary
+     libraries at runtime may omit loading libpthread.so when not
+     running multi-threaded, which then results in the libc.so
+     fallback symbols being used and ld.so complaining.  Now there
+     are workarounds in ld (see non_zero_localentry) to detect the
+     pthread situation, but that may not be the only case where
+     --plt-localentry can cause trouble.  */
   if (htab->params->plt_localentry0 < 0)
-    htab->params->plt_localentry0
-      = elf_link_hash_lookup (&htab->elf, "GLIBC_2.26",
-			      FALSE, FALSE, FALSE) != NULL;
+    htab->params->plt_localentry0 = 0;
 
   htab->tls_get_addr = ((struct ppc_link_hash_entry *)
 			elf_link_hash_lookup (&htab->elf, ".__tls_get_addr",
diff --git a/gold/ChangeLog b/gold/ChangeLog
index fdac931..a019393 100644
--- a/gold/ChangeLog
+++ b/gold/ChangeLog
@@ -1,3 +1,9 @@ 
+2017-07-29  Alan Modra  <amodra@gmail.com>
+
+	PR 21847
+	* powerpc.cc (Target_powerpc::scan_relocs): Default to
+	--no-plt-localentry.
+
 2017-07-28  H.J. Lu  <hongjiu.lu@intel.com>
 
 	PR gold/21857
diff --git a/gold/powerpc.cc b/gold/powerpc.cc
index a4966b8..e322d6f 100644
--- a/gold/powerpc.cc
+++ b/gold/powerpc.cc
@@ -7660,8 +7660,6 @@  Target_powerpc<size, big_endian>::scan_relocs(
 	{
 	  if (parameters->options().user_set_plt_localentry())
 	    plt_localentry0 = parameters->options().plt_localentry();
-	  else
-	    plt_localentry0 = symtab->lookup("GLIBC_2.26", NULL) != NULL;
 	}
       this->plt_localentry0_ = plt_localentry0;
       this->plt_localentry0_init_ = true;
diff --git a/ld/ChangeLog b/ld/ChangeLog
index 9345785..2a371b9 100644
--- a/ld/ChangeLog
+++ b/ld/ChangeLog
@@ -1,3 +1,7 @@ 
+2017-07-29  Alan Modra  <amodra@gmail.com>
+
+	* ld.texinfo (plt-localentry): Document.
+
 2017-07-28  Andrew Burgess  <andrew.burgess@embecosm.com>
 
 	* ldgram.y (ldgram_had_keep): Make static.
diff --git a/ld/ld.texinfo b/ld/ld.texinfo
index 987816f..172c1dd 100644
--- a/ld/ld.texinfo
+++ b/ld/ld.texinfo
@@ -7600,6 +7600,21 @@  barrier in the call stub, or use LD_BIND_NOW=1.  By default, @code{ld}
 looks for calls to commonly used functions that create threads, and if
 seen, adds the necessary barriers.  Use these options to change the
 default behaviour.
+
+@cindex PowerPC64 ELFv2 PLT localentry optimization
+@kindex --plt-localentry
+@kindex --no-plt-localentry
+@item --plt-localentry
+@itemx --no-localentry
+ELFv2 functions with localentry:0 are those with a single entry point,
+ie. global entry == local entry, and that have no requirement on r2
+(the TOC/GOT pointer) or r12, and guarantee r2 is unchanged on return.
+Such an external function can be called via the PLT without saving r2
+or restoring it on return, avoiding a common load-hit-store for small
+functions.   The optimization is attractive, with up to 40% reduction
+in execution time for a small function, but can result in symbol
+interposition failures.  Use with care.  @option{--no-plt-localentry}
+is the default.
 @end table
 
 @ifclear GENERIC