[v2] elf: Replace a --defsym trick with an object file to be compatible with LLD

Message ID: 20210129192824.804615-1-maskray@google.com
State: Committed
Delegated to: Adhemerval Zanella Netto
Series: [v2] elf: Replace a --defsym trick with an object file to be compatible with LLD

Commit Message

Fangrui Song Jan. 29, 2021, 7:28 p.m. UTC
The existing code specifies -Wl,--defsym=malloc=0 and the other malloc.os
definitions before libc_pic.a so that libc_pic.a(malloc.os) is not
fetched. This trick avoids multiple definition errors, which would
otherwise result from the following chain of archive member fetches:

  dl-allobjs.os has an undefined __libc_scratch_buffer_set_array_size
  __libc_scratch_buffer_set_array_size fetches libc_pic.a(scratch_buffer_set_array_size.os)
  libc_pic.a(scratch_buffer_set_array_size.os) has an undefined free
  free fetches libc_pic.a(malloc.os)
  libc_pic.a(malloc.os) has an undefined __libc_message
  __libc_message fetches libc_pic.a(libc_fatal.os)

  libc_fatal.os will cause a multiple definition error (__GI___libc_fatal)
  >>> defined at dl-fxstatat64.c
  >>>            /tmp/p/glibc/Release/elf/dl-allobjs.os:(__GI___libc_fatal)
  >>> defined at libc_fatal.c
  >>>            libc_fatal.os:(.text+0x240) in archive /tmp/p/glibc/Release/libc_pic.a

LLD processes --defsym after all input files, so this trick does not
suppress multiple definition errors with LLD. Split the step into two
and use an object file to make the intention more obvious and make LLD
work.
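
As a rough illustration of the fetch chain above, here is a toy shell
model of archive member fetching. It is not part of the patch: the
function names are hypothetical, each member is assumed to have at most
one undefined reference, and the stub symbol list passed in the second
call is an assumption. It only shows why a definition of free seen
before libc_pic.a stops the chain before libc_fatal.os is reached.

```shell
#!/bin/sh
# Toy model only; the symbol and member names are taken from the chain above.

# member_for SYMBOL: print the archive member that defines SYMBOL.
member_for() {
  case $1 in
    __libc_scratch_buffer_set_array_size) echo scratch_buffer_set_array_size.os ;;
    free) echo malloc.os ;;
    __libc_message) echo libc_fatal.os ;;
  esac
}

# undefined_in MEMBER: print the symbol MEMBER references but does not define.
undefined_in() {
  case $1 in
    scratch_buffer_set_array_size.os) echo free ;;
    malloc.os) echo __libc_message ;;
  esac
}

# fetch_chain DEFINED SYMBOL: print every member fetched while resolving
# SYMBOL, given the space-separated list DEFINED of symbols that earlier
# inputs already define.
fetch_chain() {
  defined=" $1 "
  sym=$2
  while [ -n "$sym" ]; do
    # A symbol that is already defined never triggers a fetch.
    case $defined in *" $sym "*) break ;; esac
    member=$(member_for "$sym")
    [ -n "$member" ] || break
    echo "$member"
    sym=$(undefined_in "$member")
  done
}

# No stub definitions: the chain runs all the way to libc_fatal.os.
fetch_chain "" __libc_scratch_buffer_set_array_size

# Stub definitions seen first (assumed list): only the first member is fetched.
fetch_chain "free malloc realloc" __libc_scratch_buffer_set_array_size
```

In this model the first call lists all three members, while the second
stops after scratch_buffer_set_array_size.os because free is already
defined when the reference to it is seen.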

This is conceptually more appropriate because --defsym defines a SHN_ABS
symbol while a normal definition is relative to the image base.

See https://sourceware.org/pipermail/libc-alpha/2020-March/111910.html
for discussions about the --defsym semantics.

This commit is available on https://sourceware.org/git/?p=glibc.git;a=shortlog;h=refs/heads/maskray/lld
---
 elf/Makefile | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)
  

Comments

Adhemerval Zanella Feb. 1, 2021, 1:19 p.m. UTC | #1
On 29/01/2021 16:28, Fangrui Song wrote:
> The existing code specifies -Wl,--defsym=malloc=0 and other malloc.os
> definitions before libc_pic.a so that libc_pic.a(malloc.os) is not
> fetched. This trick is used to avoid multiple definition errors which
> would happen as a chain result:
> 
>   dl-allobjs.os has an undefined __libc_scratch_buffer_set_array_size
>   __libc_scratch_buffer_set_array_size fetches libc_pic.a(scratch_buffer_set_array_size.os)
>   libc_pic.a(scratch_buffer_set_array_size.os) has an undefined free
>   free fetches libc_pic.a(malloc.os)
>   libc_pic.a(malloc.os) has an undefined __libc_message
>   __libc_message fetches libc_pic.a(libc_fatal.os)
> 
>   libc_fatal.os will cause a multiple definition error (__GI___libc_fatal)
>   >>> defined at dl-fxstatat64.c
>   >>>            /tmp/p/glibc/Release/elf/dl-allobjs.os:(__GI___libc_fatal)
>   >>> defined at libc_fatal.c
>   >>>            libc_fatal.os:(.text+0x240) in archive /tmp/p/glibc/Release/libc_pic.a
> 
> LLD processes --defsym after all input files, so this trick does not
> suppress multiple definition errors with LLD. Split the step into two
> and use an object file to make the intention more obvious and make LLD
> work.
> 
> This is conceptually more appropriate because --defsym defines a SHN_ABS
> symbol while a normal definition is relative to the image base.
> 
> See https://sourceware.org/pipermail/libc-alpha/2020-March/111910.html
> for discussions about the --defsym semantics.
> 
> This commit is available on https://sourceware.org/git/?p=glibc.git;a=shortlog;h=refs/heads/maskray/lld
> ---
>  elf/Makefile | 11 ++++-------
>  1 file changed, 4 insertions(+), 7 deletions(-)
> 
> diff --git a/elf/Makefile b/elf/Makefile
> index 5d666b1b0c..03e4034cc7 100644
> --- a/elf/Makefile
> +++ b/elf/Makefile
> @@ -526,10 +526,6 @@ rtld-stubbed-symbols = \
>    malloc \
>    realloc \
>  
> -# The GCC arguments that implement $(rtld-stubbed-symbols).
> -rtld-stubbed-symbols-args = \
> -  $(patsubst %,-Wl$(comma)--defsym=%=0, $(rtld-stubbed-symbols))
> -
>  ifeq ($(have-ssp),yes)
>  # rtld is not built with the stack protector, so these references will
>  # go away in the rebuilds.
> @@ -538,9 +534,10 @@ endif
>  
>  $(objpfx)librtld.map: $(objpfx)dl-allobjs.os $(common-objpfx)libc_pic.a
>  	@-rm -f $@T
> -	$(reloc-link) -o $@.o $(rtld-stubbed-symbols-args) \
> -		'-Wl,-(' $^ -lgcc '-Wl,-)' -Wl,-Map,$@T
> -	rm -f $@.o
> +	printf '$(patsubst %,.globl %\n,$(rtld-stubbed-symbols)) $(patsubst %,%:\n,$(rtld-stubbed-symbols))' | \
> +		$(CC) -o $@T.o $(ASFLAGS) -c -x assembler -
> +	$(reloc-link) -o $@.o $@T.o '-Wl,-(' $^ -lgcc '-Wl,-)' -Wl,-Map,$@T
> +	rm -f $@T.o $@.o
>  	mv -f $@T $@
>  
>  # For lld, skip preceding addresses and values before matching the archive and the member.

Looks OK; I haven't seen any build issues.  The '=' seems unnecessary, although
'=' seems to work on all supported architectures (handling
HAVE_ASM_SET_DIRECTIVE does not seem to be necessary here).

It is OK for 2.34 from my side.  H.J., do you see any issues with not setting the
symbol to '0'?
  
Florian Weimer Feb. 1, 2021, 1:43 p.m. UTC | #2
* Fangrui Song via Libc-alpha:

>  $(objpfx)librtld.map: $(objpfx)dl-allobjs.os $(common-objpfx)libc_pic.a
>  	@-rm -f $@T
> -	$(reloc-link) -o $@.o $(rtld-stubbed-symbols-args) \
> -		'-Wl,-(' $^ -lgcc '-Wl,-)' -Wl,-Map,$@T
> -	rm -f $@.o
> +	printf '$(patsubst %,.globl %\n,$(rtld-stubbed-symbols)) $(patsubst %,%:\n,$(rtld-stubbed-symbols))' | \
> +		$(CC) -o $@T.o $(ASFLAGS) -c -x assembler -
> +	$(reloc-link) -o $@.o $@T.o '-Wl,-(' $^ -lgcc '-Wl,-)' -Wl,-Map,$@T
> +	rm -f $@T.o $@.o
>  	mv -f $@T $@

It may be a little bit cleaner to replace the shell/make mix with just
shell, like (untested):

	for symbol in $(rtld-stubbed-symbols) ; do \
	   echo ".globl $$symbol" ; \
	   echo "$$symbol:" ; \
	done | $(CC) -o $@T.o $(ASFLAGS) -c -x assembler -

Thanks,
Florian
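
For reference, a standalone version of the loop above, runnable outside
make, could look like the sketch below. The symbol list is a
hypothetical stand-in for $(rtld-stubbed-symbols), and emit_stubs is a
name invented for this sketch. Inside a Makefile recipe every shell
variable reference needs a doubled dollar sign ($$symbol); in a plain
script a single one is used.

```shell
#!/bin/sh
# Hypothetical symbol list; the real one is $(rtld-stubbed-symbols) in elf/Makefile.
symbols="malloc realloc"

# emit_stubs LIST: print assembler source that defines each symbol in
# LIST as a global label, mirroring the recipe loop.
emit_stubs() {
  for symbol in $1; do
    echo ".globl $symbol"
    echo "$symbol:"
  done
}

emit_stubs "$symbols"
```

With the list above this prints four lines: ".globl malloc", "malloc:",
".globl realloc" and "realloc:". In the recipe that text is piped to
$(CC) ... -c -x assembler - to produce the stub object.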
  
H.J. Lu Feb. 1, 2021, 1:45 p.m. UTC | #3
On Mon, Feb 1, 2021 at 5:19 AM Adhemerval Zanella
<adhemerval.zanella@linaro.org> wrote:
>
> Looks OK; I haven't seen any build issues.  The '=' seems unnecessary, although
> '=' seems to work on all supported architectures (handling
> HAVE_ASM_SET_DIRECTIVE does not seem to be necessary here).
>
> It is OK for 2.34 from my side.  H.J., do you see any issues with not setting the
> symbol to '0'?

LGTM.

Thanks.
  

Patch

diff --git a/elf/Makefile b/elf/Makefile
index 5d666b1b0c..03e4034cc7 100644
--- a/elf/Makefile
+++ b/elf/Makefile
@@ -526,10 +526,6 @@  rtld-stubbed-symbols = \
   malloc \
   realloc \
 
-# The GCC arguments that implement $(rtld-stubbed-symbols).
-rtld-stubbed-symbols-args = \
-  $(patsubst %,-Wl$(comma)--defsym=%=0, $(rtld-stubbed-symbols))
-
 ifeq ($(have-ssp),yes)
 # rtld is not built with the stack protector, so these references will
 # go away in the rebuilds.
@@ -538,9 +534,10 @@  endif
 
 $(objpfx)librtld.map: $(objpfx)dl-allobjs.os $(common-objpfx)libc_pic.a
 	@-rm -f $@T
-	$(reloc-link) -o $@.o $(rtld-stubbed-symbols-args) \
-		'-Wl,-(' $^ -lgcc '-Wl,-)' -Wl,-Map,$@T
-	rm -f $@.o
+	printf '$(patsubst %,.globl %\n,$(rtld-stubbed-symbols)) $(patsubst %,%:\n,$(rtld-stubbed-symbols))' | \
+		$(CC) -o $@T.o $(ASFLAGS) -c -x assembler -
+	$(reloc-link) -o $@.o $@T.o '-Wl,-(' $^ -lgcc '-Wl,-)' -Wl,-Map,$@T
+	rm -f $@T.o $@.o
 	mv -f $@T $@
 
 # For lld, skip preceding addresses and values before matching the archive and the member.