Use __ehdr_start rather than _begin in _dl_start_final

Message ID Ymk4dQAkaGYfZWWT@squeak.grove.modra.org
State Committed
Commit 6f043e0ee7e477f50a44024ed0cb579d5e3f511d
Headers
Series Use __ehdr_start rather than _begin in _dl_start_final |

Checks

Context Check Description
dj/TryBot-apply_patch success Patch applied to master at the time it was sent
dj/TryBot-32bit success Build for i686

Commit Message

Alan Modra April 27, 2022, 12:35 p.m. UTC
  __ehdr_start is already used in rltld.c:dl_main, and can serve the
same purpose as _begin.  Besides tidying the code, using linker
defined section relative symbols rather than "-defsym _begin=0" better
reflects the intent of _dl_start_final use of _begin, which is to
refer to the load address of ld.so rather than absolute address zero.

The motive for this patch is to finally tidy ppc32 GNU ld treatment of
absolute symbols.  On ppc32, the reference to _begin generates a GOT
entry.  A GOT entry for an absolute symbol shouldn't be dynamically
relocated, so this usage in glibc of an absolute _begin will fail once
I fix ppc32 ld.  Note that on many other targets, including ppc64, gcc
generates pc relative or got-pointer relative addressing for _begin.
ie. the compiler assumes _begin is *not* at an absolute address, and
generally linkers go along with that.

Other uses of absolute symbols in glibc, eg. see
_NL_CURRENT_DEFINE_ABS should not be dynamically relocated, but are
with current ppc32 GNU ld.  This doesn't cause a problem in glibc
since the references are either undefined weak (and value zero) or
non-zero and relocated to another non-zero value, and glibc just tests
for zero/non-zero.
  

Comments

Florian Weimer April 27, 2022, 4:08 p.m. UTC | #1
* Alan Modra via Libc-alpha:

> __ehdr_start is already used in rltld.c:dl_main, and can serve the
> same purpose as _begin.  Besides tidying the code, using linker
> defined section relative symbols rather than "-defsym _begin=0" better
> reflects the intent of _dl_start_final use of _begin, which is to
> refer to the load address of ld.so rather than absolute address zero.

Looks reasonable.  I verified it still builds everywhere.

Thanks,
Florian
  
Adhemerval Zanella Netto April 27, 2022, 6:54 p.m. UTC | #2
On 27/04/2022 09:35, Alan Modra via Libc-alpha wrote:
> __ehdr_start is already used in rltld.c:dl_main, and can serve the
> same purpose as _begin.  Besides tidying the code, using linker
> defined section relative symbols rather than "-defsym _begin=0" better
> reflects the intent of _dl_start_final use of _begin, which is to
> refer to the load address of ld.so rather than absolute address zero.
> 
> The motive for this patch is to finally tidy ppc32 GNU ld treatment of
> absolute symbols.  On ppc32, the reference to _begin generates a GOT
> entry.  A GOT entry for an absolute symbol shouldn't be dynamically
> relocated, so this usage in glibc of an absolute _begin will fail once
> I fix ppc32 ld.  Note that on many other targets, including ppc64, gcc
> generates pc relative or got-pointer relative addressing for _begin.
> ie. the compiler assumes _begin is *not* at an absolute address, and
> generally linkers go along with that.

Does it mean that once you fix it, newer binutils will start to fail
to build older glibc releases for powerpc32?
  
Alan Modra April 28, 2022, 7:07 a.m. UTC | #3
On Wed, Apr 27, 2022 at 03:54:27PM -0300, Adhemerval Zanella wrote:
> 
> 
> On 27/04/2022 09:35, Alan Modra via Libc-alpha wrote:
> > __ehdr_start is already used in rltld.c:dl_main, and can serve the
> > same purpose as _begin.  Besides tidying the code, using linker
> > defined section relative symbols rather than "-defsym _begin=0" better
> > reflects the intent of _dl_start_final use of _begin, which is to
> > refer to the load address of ld.so rather than absolute address zero.
> > 
> > The motive for this patch is to finally tidy ppc32 GNU ld treatment of
> > absolute symbols.  On ppc32, the reference to _begin generates a GOT
> > entry.  A GOT entry for an absolute symbol shouldn't be dynamically
> > relocated, so this usage in glibc of an absolute _begin will fail once
> > I fix ppc32 ld.  Note that on many other targets, including ppc64, gcc
> > generates pc relative or got-pointer relative addressing for _begin.
> > ie. the compiler assumes _begin is *not* at an absolute address, and
> > generally linkers go along with that.
> 
> Does it mean that once you fix it, newer binutils will start to fail
> to build older glibc releases for powerpc32?

Not necessarily.  I found these issues with absolute symbols in glibc
by inspecting object files.  Possibly dl_rtld_map.l_map_start == 0
causes no problem, I'll know when I have set up a test environment for
ppc32 that lets me build and run a ppc32 glibc.
  
Alan Modra April 29, 2022, 11:01 p.m. UTC | #4
On Thu, Apr 28, 2022 at 04:37:53PM +0930, Alan Modra wrote:
> On Wed, Apr 27, 2022 at 03:54:27PM -0300, Adhemerval Zanella wrote:
> > 
> > 
> > On 27/04/2022 09:35, Alan Modra via Libc-alpha wrote:
> > > __ehdr_start is already used in rltld.c:dl_main, and can serve the
> > > same purpose as _begin.  Besides tidying the code, using linker
> > > defined section relative symbols rather than "-defsym _begin=0" better
> > > reflects the intent of _dl_start_final use of _begin, which is to
> > > refer to the load address of ld.so rather than absolute address zero.
> > > 
> > > The motive for this patch is to finally tidy ppc32 GNU ld treatment of
> > > absolute symbols.  On ppc32, the reference to _begin generates a GOT
> > > entry.  A GOT entry for an absolute symbol shouldn't be dynamically
> > > relocated, so this usage in glibc of an absolute _begin will fail once
> > > I fix ppc32 ld.  Note that on many other targets, including ppc64, gcc
> > > generates pc relative or got-pointer relative addressing for _begin.
> > > ie. the compiler assumes _begin is *not* at an absolute address, and
> > > generally linkers go along with that.
> > 
> > Does it mean that once you fix it, newer binutils will start to fail
> > to build older glibc releases for powerpc32?
> 
> Not necessarily.  I found these issues with absolute symbols in glibc
> by inspecting object files.  Possibly dl_rtld_map.l_map_start == 0
> causes no problem, I'll know when I have set up a test environment for
> ppc32 that lets me build and run a ppc32 glibc.

Just tst-dl_find_object regresses on a glibc ld.so that uses _begin.
  
Florian Weimer April 30, 2022, 11:27 a.m. UTC | #5
* Alan Modra via Libc-alpha:

> On Thu, Apr 28, 2022 at 04:37:53PM +0930, Alan Modra wrote:
>> On Wed, Apr 27, 2022 at 03:54:27PM -0300, Adhemerval Zanella wrote:
>> > 
>> > 
>> > On 27/04/2022 09:35, Alan Modra via Libc-alpha wrote:
>> > > __ehdr_start is already used in rltld.c:dl_main, and can serve the
>> > > same purpose as _begin.  Besides tidying the code, using linker
>> > > defined section relative symbols rather than "-defsym _begin=0" better
>> > > reflects the intent of _dl_start_final use of _begin, which is to
>> > > refer to the load address of ld.so rather than absolute address zero.
>> > > 
>> > > The motive for this patch is to finally tidy ppc32 GNU ld treatment of
>> > > absolute symbols.  On ppc32, the reference to _begin generates a GOT
>> > > entry.  A GOT entry for an absolute symbol shouldn't be dynamically
>> > > relocated, so this usage in glibc of an absolute _begin will fail once
>> > > I fix ppc32 ld.  Note that on many other targets, including ppc64, gcc
>> > > generates pc relative or got-pointer relative addressing for _begin.
>> > > ie. the compiler assumes _begin is *not* at an absolute address, and
>> > > generally linkers go along with that.
>> > 
>> > Does it mean that once you fix it, newer binutils will start to fail
>> > to build older glibc releases for powerpc32?
>> 
>> Not necessarily.  I found these issues with absolute symbols in glibc
>> by inspecting object files.  Possibly dl_rtld_map.l_map_start == 0
>> causes no problem, I'll know when I have set up a test environment for
>> ppc32 that lets me build and run a ppc32 glibc.
>
> Just tst-dl_find_object regresses on a glibc ld.so that uses _begin.

I think _dl_find_dso_for_object will go wrong as well, potentially
misidentifying dlopen/dlsym/atexit callers.

But I think you should proceed with your binutils fix, and we should
backport the glibc fix.

Thanks,
Florian
  

Patch

diff --git a/elf/Makefile b/elf/Makefile
index ad253defdd..c8a351e2ae 100644
--- a/elf/Makefile
+++ b/elf/Makefile
@@ -1326,8 +1326,7 @@  $(objpfx)ld.so: $(objpfx)librtld.os $(ld-map)
 	$(LINK.o) -nostdlib -nostartfiles -shared -o $@.new		\
 		  $(LDFLAGS-rtld) -Wl,-z,defs $(z-now-$(bind-now))	\
 		  $(filter-out $(map-file),$^) $(load-map-file)		\
-		  -Wl,-soname=$(rtld-installed-name)			\
-		  -Wl,-defsym=_begin=0
+		  -Wl,-soname=$(rtld-installed-name)
 	$(call after-link,$@.new)
 	$(READELF) -s $@.new \
 	  | $(AWK) '($$7 ~ /^UND(|EF)$$/ && $$1 != "0:" && $$4 != "REGISTER") { print; p=1 } END { exit p != 0 }'
diff --git a/elf/rtld.c b/elf/rtld.c
index be6daa1c44..3b2e05bf4c 100644
--- a/elf/rtld.c
+++ b/elf/rtld.c
@@ -440,8 +440,8 @@  static ElfW(Addr) _dl_start_final (void *arg,
 				   struct dl_start_final_info *info);
 #endif
 
-/* These defined magically in the linker script.  */
-extern char _begin[] attribute_hidden;
+/* These are defined magically by the linker.  */
+extern const ElfW(Ehdr) __ehdr_start attribute_hidden;
 extern char _etext[] attribute_hidden;
 extern char _end[] attribute_hidden;
 
@@ -490,7 +490,7 @@  _dl_start_final (void *arg, struct dl_start_final_info *info)
 #endif
   _dl_setup_hash (&GL(dl_rtld_map));
   GL(dl_rtld_map).l_real = &GL(dl_rtld_map);
-  GL(dl_rtld_map).l_map_start = (ElfW(Addr)) _begin;
+  GL(dl_rtld_map).l_map_start = (ElfW(Addr)) &__ehdr_start;
   GL(dl_rtld_map).l_map_end = (ElfW(Addr)) _end;
   GL(dl_rtld_map).l_text_end = (ElfW(Addr)) _etext;
   /* Copy the TLS related data if necessary.  */
@@ -1741,7 +1741,6 @@  dl_main (const ElfW(Phdr) *phdr,
      segment that also includes the phdrs.  If that's not available, we use
      the old method that assumes the beginning of the file is part of the
      lowest-addressed PT_LOAD segment.  */
-  extern const ElfW(Ehdr) __ehdr_start __attribute__ ((visibility ("hidden")));
 
   /* Set up the program header information for the dynamic linker
      itself.  It is needed in the dl_iterate_phdr callbacks.  */