[v2] dl: Use "adr" assembler command to get proper load address on ARM

Message ID 20211015075417.29931-1-lukma@denx.de
State Changes Requested, archived

Checks

Context                 Check    Description
dj/TryBot-apply_patch   success  Patch applied to master at the time it was sent
dj/TryBot-32bit         success  Build for i686

Commit Message

Lukasz Majewski Oct. 15, 2021, 7:54 a.m. UTC
  This change is a partial revert of commit
bca0f5cbc9257c13322b99e55235c4f21ba0bd82
"arm: Simplify elf_machine_{load_address,dynamic}" which imposed usage
of __ehdr_start linker variable to get the address of loaded program.

The elf_machine_load_address() function is declared in the
sysdeps/arm/dl-machine.h header. It is called from (very early)
_dl_start() entry point for the program. It shall return the load
address of the dynamic linker program.

With this revert the 'adr' assembler instruction is used instead of a
placeholder:

arm-poky-linux-gnueabi-objdump -t ld-linux-armhf.so.3 | grep ehdr
00000000 l       .note.gnu.build-id     00000000      __ehdr_start

which is pre-set by binutils.

The problem starts when one runs 'prelink' on a rootfs created with,
for example, OE/Yocto.
Then __ehdr_start stays at 0x0, but the ELF header's sections have
different addresses - for example 0x41000000 instead of the originally
set 0x0.

This is crucial when /sbin/init is executed. The value set in the
__ehdr_start symbol is not updated. This causes the program to crash very
early when ld-linux-armhf.so.3's _dl_start is executed, as the calculated
offset for loader relocation ends up in kernel space (0xf7xxyyyy).

It looks like the correct way to obtain the _dl_start offset on ARM is
to use assembler instruction 'adr' at execution time (so the prelink
assigned offset is taken into consideration) instead of __ehdr_start.

With this patch we only modify the elf_machine_load_address() function,
as it is called very early, before ld-linux-armhf.so.3 performs any
relocation (including its own).

HW:
Hardware name:
	- ARM-Versatile Express (Run with QEMU)
	- Beagle Bone Black

Build Environment: OE/Yocto -> poky
SHA1: 1e2e9a84d6dd81d7f6dd69c0d119d0149d10ade1

Fixes: BZ #28293
---
 sysdeps/arm/dl-machine.h | 28 +++++++++++++++++++++++++---
 1 file changed, 25 insertions(+), 3 deletions(-)
  

Comments

Szabolcs Nagy Oct. 15, 2021, 12:09 p.m. UTC | #1
The 10/15/2021 09:54, Lukasz Majewski wrote:
> This change is a partial revert of commit
> bca0f5cbc9257c13322b99e55235c4f21ba0bd82
> "arm: Simplify elf_machine_{load_address,dynamic}" which imposed usage
> of __ehdr_start linker variable to get the address of loaded program.
> 
> The elf_machine_load_address() function is declared in the
> sysdeps/arm/dl-machine.h header. It is called from (very early)
> _dl_start() entry point for the program. It shall return the load
> address of the dynamic linker program.
> 
> With this revert the 'adr' assembler instruction is used instead of a
> place holder:
> 
> arm-poky-linux-gnueabi-objdump -t ld-linux-armhf.so.3 | grep ehdr
> 00000000 l       .note.gnu.build-id     00000000      __ehdr_start
> 
> which is pre-set by binutils.
> 
> The problem starts when one runs 'prelink' on the rootfs created with
> for example OE/Yocto.
> Then the __ehdr_start stays as 0x0, but the ELF header's sections have
> different addresses - for example 0x41000000 instead of the originally
> set 0x0.
> 
> This is crucial when /sbin/init is executed. Value set in __ehdr_start
> symbol is not updated. This causes the program to crash very early
> when ld-linux-armhf.so.3's _dl_start is executed, as calculated offset
> for loader relocation is going to hit the kernel space (0xf7xxyyyy).
> 
> It looks like the correct way to obtain the _dl_start offset on ARM is
> to use assembler instruction 'adr' at execution time (so the prelink
> assigned offset is taken into consideration) instead of __ehdr_start.
> 
> With this patch we only modify the elf_machine_load_address() function,
> as it is called very early, before the ld-linux-armhf.so.3 is performing
> relocation (also its own one).

i'd use an explanation like:

__ehdr_start is a linker created symbol that points to the elf header.
The elf header is at the beginning of the elf file and normally its
virtual address is 0 in a shared library.  This means the runtime
address of __ehdr_start is the load address of the module.  However if
prelinking is applied to ld.so then all virtual addresses are moved by
an offset so the runtime address of the elf header becomes the load
address + prelink offset.  The kernel does not treat prelinked ld.so
specially so the load address is not 0, it still has to be computed,
but simply using __ehdr_start no longer gives a correct value for that.
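
A rough numeric sketch of the effect described above, written as plain C
arithmetic rather than real loader code. The mapping base 0xb6fc0000 used in
the note after the code is a made-up value; 0x41000000 and 0x41036ef4 are the
prelink base and the .dynamic address reported in this thread. All function
names are hypothetical and are not glibc API.

```c
#include <stdint.h>

typedef uint32_t addr_t;  /* 32-bit ARM address; arithmetic wraps modulo 2^32 */

/* Run-time value of &__ehdr_start: wherever the kernel mapped the ELF
   header (map_base is an assumed, illustrative value).  */
static inline addr_t
ehdr_runtime_addr (addr_t map_base)
{
  return map_base;
}

/* The load address ld.so actually needs: run-time address minus
   link-time address of the same object.  */
static inline addr_t
load_bias (addr_t map_base, addr_t ehdr_link_vaddr)
{
  return map_base - ehdr_link_vaddr;
}

/* How a link-time address is turned into a run-time one.  */
static inline addr_t
relocate (addr_t bias, addr_t link_vaddr)
{
  return bias + link_vaddr;
}
```

With a hypothetical map_base of 0xb6fc0000 and the header prelinked to
0x41000000, load_bias gives 0x75fc0000 while ehdr_runtime_addr gives
0xb6fc0000. Relocating the .dynamic address 0x41036ef4 with the latter yields
0xf7ff6ef4, which falls in the 0xf7xxyyyy range mentioned in the report. With
a link-time address of 0 both functions agree.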

This issue affects all targets with prelinking support, but so far we
only got reports from OE/Yocto builds for arm that have a prelinked ld.so.

but i think a better fix is possible than revert:

ElfW(Addr)
elf_machine_load_address ()
{
  extern ElfW(Dyn) _DYNAMIC[] attribute_hidden;
  extern ElfW(Dyn) extern_DYNAMIC[] asm ("_DYNAMIC");

  /* Uses pc-relative address computation.  */
  ElfW(Addr) runtime_addr = (ElfW(Addr)) &_DYNAMIC;

  /* Loads an unrelocated GOT entry.  */
  ElfW(Addr) linktime_addr = (ElfW(Addr)) &extern_DYNAMIC;

  return runtime_addr - linktime_addr;
}

I expect this to work on most targets, and it is very similar to the code
that was originally used on other targets: only a new GOT entry is
introduced instead of using GOT[0]. (that new got entry will have a
relative relocation which means there must be a dynamic section even
in a static PIE, so i expect _DYNAMIC to be defined. this also means
that it's slightly more expensive than &__ehdr_start, so it is for
targets that want to support prelinked ld.so)

The original arm code used _dl_start symbol, likely because that's
within range for the adr instruction for more efficient pc-relative
computation. But that's a function symbol that requires fixups due
to thumb interworking issues and is not available in static PIE, so
using _DYNAMIC sounds better even on arm.
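
To illustrate the fixup issue with a function symbol: in Thumb mode the
address taken through the GOT has its low bit set, while the pc-relative
value may or may not, depending on the assembler (see the comment in the
patch quoted below). A small sketch in plain C with invented addresses; the
helper names are hypothetical and not glibc API.

```c
#include <stdint.h>

typedef uint32_t addr_t;

/* Clear the Thumb marker bit (lsb) of a function address.  */
static inline addr_t
strip_thumb_bit (addr_t a)
{
  return a & ~(addr_t) 1;
}

/* Load address derived from a function symbol: strip the lsb from both
   the pc-relative (run-time) and the GOT (link-time) value before
   subtracting, as the quoted patch does.  */
static inline addr_t
load_address_from_function (addr_t pcrel_addr, addr_t got_addr)
{
  return strip_thumb_bit (pcrel_addr) - strip_thumb_bit (got_addr);
}
```

With the invented pair 0xb6fc9504 (pc-relative, lsb clear) and 0x41009505
(GOT, lsb set) the stripped difference is 0x75fc0000, whereas subtracting
the raw values gives 0x75fbffff. A data symbol such as _DYNAMIC never
carries this bit, so no such step is needed there.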

> 
> HW:
> Hardware name:
> 	- ARM-Versatile Express (Run with QEMU)
> 	- Beagle Bone Black
> 
> Build Environment: OE/Yocto -> poky
> SHA1: 1e2e9a84d6dd81d7f6dd69c0d119d0149d10ade1
> 
> Fixes: BZ #28293
> ---
>  sysdeps/arm/dl-machine.h | 28 +++++++++++++++++++++++++---
>  1 file changed, 25 insertions(+), 3 deletions(-)
> 
> diff --git a/sysdeps/arm/dl-machine.h b/sysdeps/arm/dl-machine.h
> index dfa05eee44..d6e5f1d5ec 100644
> --- a/sysdeps/arm/dl-machine.h
> +++ b/sysdeps/arm/dl-machine.h
> @@ -39,11 +39,33 @@ elf_machine_matches_host (const Elf32_Ehdr *ehdr)
>  }
>  
>  /* Return the run-time load address of the shared object.  */
> -static inline ElfW(Addr) __attribute__ ((unused))
> +static inline Elf32_Addr __attribute__ ((unused))
>  elf_machine_load_address (void)
>  {
> -  extern const ElfW(Ehdr) __ehdr_start attribute_hidden;
> -  return (ElfW(Addr)) &__ehdr_start;
> +  Elf32_Addr pcrel_addr;
> +#ifdef SHARED
> +  extern Elf32_Addr __dl_start (void *) asm ("_dl_start");
> +  Elf32_Addr got_addr = (Elf32_Addr) &__dl_start;
> +  asm ("adr %0, _dl_start" : "=r" (pcrel_addr));
> +#else
> +  extern Elf32_Addr __dl_relocate_static_pie (void *)
> +    asm ("_dl_relocate_static_pie") attribute_hidden;
> +  Elf32_Addr got_addr = (Elf32_Addr) &__dl_relocate_static_pie;
> +  asm ("adr %0, _dl_relocate_static_pie" : "=r" (pcrel_addr));
> +#endif
> +#ifdef __thumb__
> +  /* Clear the low bit of the function address.
> +
> +     NOTE: got_addr is from GOT table whose lsb is always set by linker if it's
> +     Thumb function address.  PCREL_ADDR comes from PC-relative calculation
> +     which will finish during assembling.  GAS assembler before the fix for
> +     PR gas/21458 was not setting the lsb but does after that.  Always do the
> +     strip for both, so the code works with various combinations of glibc and
> +     Binutils.  */
> +  got_addr &= ~(Elf32_Addr) 1;
> +  pcrel_addr &= ~(Elf32_Addr) 1;
> +#endif
> +  return pcrel_addr - got_addr;
>  }
>  
>  /* Return the link-time address of _DYNAMIC.  */
> -- 
> 2.20.1
>
  
H.J. Lu Oct. 15, 2021, 12:21 p.m. UTC | #2
On Fri, Oct 15, 2021 at 5:09 AM Szabolcs Nagy via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>
> The 10/15/2021 09:54, Lukasz Majewski wrote:
> > This change is a partial revert of commit
> > bca0f5cbc9257c13322b99e55235c4f21ba0bd82
> > "arm: Simplify elf_machine_{load_address,dynamic}" which imposed usage
> > of __ehdr_start linker variable to get the address of loaded program.
> >
> > The elf_machine_load_address() function is declared in the
> > sysdeps/arm/dl-machine.h header. It is called from (very early)
> > _dl_start() entry point for the program. It shall return the load
> > address of the dynamic linker program.
> >
> > With this revert the 'adr' assembler instruction is used instead of a
> > place holder:
> >
> > arm-poky-linux-gnueabi-objdump -t ld-linux-armhf.so.3 | grep ehdr
> > 00000000 l       .note.gnu.build-id     00000000      __ehdr_start
> >
> > which is pre-set by binutils.
> >
> > The problem starts when one runs 'prelink' on the rootfs created with
> > for example OE/Yocto.
> > Then the __ehdr_start stays as 0x0, but the ELF header's sections have
> > different addresses - for example 0x41000000 instead of the originally
> > set 0x0.
> >
> > This is crucial when /sbin/init is executed. Value set in __ehdr_start
> > symbol is not updated. This causes the program to crash very early
> > when ld-linux-armhf.so.3's _dl_start is executed, as calculated offset
> > for loader relocation is going to hit the kernel space (0xf7xxyyyy).
> >
> > It looks like the correct way to obtain the _dl_start offset on ARM is
> > to use assembler instruction 'adr' at execution time (so the prelink
> > assigned offset is taken into consideration) instead of __ehdr_start.
> >
> > With this patch we only modify the elf_machine_load_address() function,
> > as it is called very early, before the ld-linux-armhf.so.3 is performing
> > relocation (also its own one).
>
> i'd use an explanation like:
>
> __ehdr_start is a linker created symbol that points to the elf header.
> The elf header is at the beginning of the elf file and normally its
> virtual address is 0 in a shared library.  This means the runtime
> address of __ehdr_start is the load address of the module.  However if
> prelinking is applied to ld.so then all virtual addresses are moved by
> an offset so the runtime address of the elf header becomes the load
> address + prelink offset.  The kernel does not treat prelinked ld.so
> specially so the load address is not 0, it still has to be computed,
> but simply using __ehdr_start no longer gives a correct value for that.
>
> This issue affects all targets with prelinking support, but so far we
> only got reports from OE/Yocto builds for arm that has prelinked ld.so.
>
> but i think a better fix is possible than revert:

I think either prelink should be fixed not to prelink ld.so or Yocto
should be fixed not to prelink ld.so.
  
Lukasz Majewski Oct. 15, 2021, 12:59 p.m. UTC | #3
On Fri, 15 Oct 2021 05:21:23 -0700
"H.J. Lu" <hjl.tools@gmail.com> wrote:

> On Fri, Oct 15, 2021 at 5:09 AM Szabolcs Nagy via Libc-alpha
> <libc-alpha@sourceware.org> wrote:
> >
> > The 10/15/2021 09:54, Lukasz Majewski wrote:  
> > > This change is a partial revert of commit
> > > bca0f5cbc9257c13322b99e55235c4f21ba0bd82
> > > "arm: Simplify elf_machine_{load_address,dynamic}" which imposed
> > > usage of __ehdr_start linker variable to get the address of
> > > loaded program.
> > >
> > > The elf_machine_load_address() function is declared in the
> > > sysdeps/arm/dl-machine.h header. It is called from (very early)
> > > _dl_start() entry point for the program. It shall return the load
> > > address of the dynamic linker program.
> > >
> > > With this revert the 'adr' assembler instruction is used instead
> > > of a place holder:
> > >
> > > arm-poky-linux-gnueabi-objdump -t ld-linux-armhf.so.3 | grep ehdr
> > > 00000000 l       .note.gnu.build-id     00000000      __ehdr_start
> > >
> > > which is pre-set by binutils.
> > >
> > > The problem starts when one runs 'prelink' on the rootfs created
> > > with for example OE/Yocto.
> > > > Then the __ehdr_start stays as 0x0, but the ELF header's sections
> > > have different addresses - for example 0x41000000 instead of the
> > > originally set 0x0.
> > >
> > > This is crucial when /sbin/init is executed. Value set in
> > > __ehdr_start symbol is not updated. This causes the program to
> > > crash very early when ld-linux-armhf.so.3's _dl_start is
> > > executed, as calculated offset for loader relocation is going to
> > > hit the kernel space (0xf7xxyyyy).
> > >
> > > It looks like the correct way to obtain the _dl_start offset on
> > > ARM is to use assembler instruction 'adr' at execution time (so
> > > the prelink assigned offset is taken into consideration) instead
> > > of __ehdr_start.
> > >
> > > With this patch we only modify the elf_machine_load_address()
> > > function, as it is called very early, before the
> > > ld-linux-armhf.so.3 is performing relocation (also its own one).  
> >
> > i'd use an explanation like:
> >
> > __ehdr_start is a linker created symbol that points to the elf
> > header. The elf header is at the beginning of the elf file and
> > normally its virtual address is 0 in a shared library.  This means
> > the runtime address of __ehdr_start is the load address of the
> > module.  However if prelinking is applied to ld.so then all virtual
> > addresses are moved by an offset so the runtime address of the elf
> > header becomes the load address + prelink offset.  The kernel does
> > not treat prelinked ld.so specially so the load address is not 0,
> > it still has to be computed, but simply using __ehdr_start no
> > longer gives a correct value for that.
> >
> > This issue affects all targets with prelinking support, but so far
> > we only got reports from OE/Yocto builds for arm that has prelinked
> > ld.so.
> >
> > but i think a better fix is possible than revert:  
> 
> I think either prelink should be fixed not to prelink ld.so or Yocto
> should be fixed not to prelink ld.so.
> 

Could you explain why?

Was the relocation of ld.so (I guess that ld.so = ld-linux-arm.so) a
bug from the very beginning that has become apparent only now?

From my point of view - the original change to use __ehdr_start broke
working setups, so it is a regression and shall be fixed in glibc.

Anyway, it would be beneficial to have input from other glibc
developers how to proceed with this issue.


Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de
  
Lukasz Majewski Oct. 15, 2021, 1:59 p.m. UTC | #4
Hi Szabolcs,

> The 10/15/2021 09:54, Lukasz Majewski wrote:
> > This change is a partial revert of commit
> > bca0f5cbc9257c13322b99e55235c4f21ba0bd82
> > "arm: Simplify elf_machine_{load_address,dynamic}" which imposed
> > usage of __ehdr_start linker variable to get the address of loaded
> > program.
> > 
> > The elf_machine_load_address() function is declared in the
> > sysdeps/arm/dl-machine.h header. It is called from (very early)
> > _dl_start() entry point for the program. It shall return the load
> > address of the dynamic linker program.
> > 
> > With this revert the 'adr' assembler instruction is used instead of
> > a place holder:
> > 
> > arm-poky-linux-gnueabi-objdump -t ld-linux-armhf.so.3 | grep ehdr
> > 00000000 l       .note.gnu.build-id     00000000      __ehdr_start
> > 
> > which is pre-set by binutils.
> > 
> > The problem starts when one runs 'prelink' on the rootfs created
> > with for example OE/Yocto.
> > > Then the __ehdr_start stays as 0x0, but the ELF header's sections
> > have different addresses - for example 0x41000000 instead of the
> > originally set 0x0.
> > 
> > This is crucial when /sbin/init is executed. Value set in
> > __ehdr_start symbol is not updated. This causes the program to
> > crash very early when ld-linux-armhf.so.3's _dl_start is executed,
> > as calculated offset for loader relocation is going to hit the
> > kernel space (0xf7xxyyyy).
> > 
> > It looks like the correct way to obtain the _dl_start offset on ARM
> > is to use assembler instruction 'adr' at execution time (so the
> > prelink assigned offset is taken into consideration) instead of
> > __ehdr_start.
> > 
> > With this patch we only modify the elf_machine_load_address()
> > function, as it is called very early, before the
> > ld-linux-armhf.so.3 is performing relocation (also its own one).  
> 
> i'd use an explanation like:
> 
> __ehdr_start is a linker created symbol that points to the elf header.
> The elf header is at the beginning of the elf file and normally its
> virtual address is 0 in a shared library.  This means the runtime
> address of __ehdr_start is the load address of the module.  However if
> prelinking is applied to ld.so then all virtual addresses are moved by
> an offset so the runtime address of the elf header becomes the load
> address + prelink offset.  The kernel does not treat prelinked ld.so
> specially so the load address is not 0, it still has to be computed,
> but simply using __ehdr_start no longer gives a correct value for
> that.
> 
> This issue affects all targets with prelinking support, but so far we
> only got reports from OE/Yocto builds for arm that has prelinked
> ld.so.
> 

Thanks for a very detailed description.

> but i think a better fix is possible than revert:
> 
> ElfW(Addr)
> elf_machine_load_address ()
> {
>   extern ElfW(Dyn) _DYNAMIC[] attribute_hidden;
>   extern ElfW(Dyn) extern_DYNAMIC[] asm ("_DYNAMIC");
> 

So the _DYNAMIC = GOT[0] (and it points into the .dynamic section)
objdump -d -j .got ld-linux-armhf.so.3

Disassembly of section .got:

41036fbc <.got+0x41000000>:
41036fbc:       41036ef4        .word   0x41036ef4

So it indeed points into the
  [16] .dynamic          DYNAMIC         41036ef4

>   /* Uses pc-relative address computation.  */
>   ElfW(Addr) runtime_addr = (ElfW(Addr)) &_DYNAMIC;

I guess that the &_DYNAMIC gives the address around which $pc runs 
(e.g. 0xb6fc9504) - this is the actual address of run program.

(A side question - is there any way to read the _DYNAMIC symbol value
directly - via e.g. readelf or objdump?)

> 
>   /* Loads an unrelocated GOT entry.  */
>   ElfW(Addr) linktime_addr = (ElfW(Addr)) &extern_DYNAMIC;
> 

This is the prelink'ed address -> 0x41036ef4 in our case?

>   return runtime_addr - linktime_addr;

And the address to which we shall relocate would be:

0xb6fc9504 - 0x41036ef4 = 0x75f92610 - which is the address to which
the ld.so (ld-linux-armhf.so.3) will re-relocate itself?
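
The subtraction above can be checked mechanically. This is only a sketch in
plain C using the two example values from this mail; whether 0xb6fc9504 is
really the run-time address of _DYNAMIC is a guess carried over from the text
above, the function names are made up, and the 4 KiB page size is an
assumption.

```c
#include <stdint.h>

/* Difference between the run-time and link-time address of the same
   object, computed modulo 2^32 as on 32-bit ARM.  */
static inline uint32_t
addr_difference (uint32_t runtime_addr, uint32_t linktime_addr)
{
  return runtime_addr - linktime_addr;
}

/* Sanity check: the kernel maps segments at page granularity, so a
   genuine load offset should be a multiple of the page size
   (assumed to be 4 KiB here).  */
static inline int
is_page_aligned (uint32_t value)
{
  return (value & 0xfffu) == 0;
}
```

addr_difference (0xb6fc9504, 0x41036ef4) evaluates to 0x75f92610, matching
the figure above. That value is not page aligned, which suggests that the two
example addresses do not belong to the same object; with the real run-time
address of _DYNAMIC the result would be expected to pass is_page_aligned.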

> }
> 
> I expect this to work on most targets and very similar to the code
> that was originally used on other targets: only a new GOT entry is
> introduced instead of using GOT[0].

In fact we only rely on _DYNAMIC symbol -> which points into .dynamic
section.

> (that new got entry will have a
> relative relocation which means there must be a dynamic section even
> in a static PIE, so i expect _DYNAMIC to be defined.

Ok.

> this also means
> that it's slightly more expensive than &__ehdr_start, so it is for
> targets that want to support prelinked ld.so)
> 
> The original arm code used _dl_start symbol, likely because that's
> within range for the adr instruction for more efficient pc-relative
> computation. But that's a function symbol that requires fixups due
> to thumb interworking issues and is not available in static PIE, so
> using _DYNAMIC sounds better even on arm.

+1.

> 
> > 
> > HW:
> > Hardware name:
> > 	- ARM-Versatile Express (Run with QEMU)
> > 	- Beagle Bone Black
> > 
> > Build Environment: OE/Yocto -> poky
> > SHA1: 1e2e9a84d6dd81d7f6dd69c0d119d0149d10ade1
> > 
> > Fixes: BZ #28293
> > ---
> >  sysdeps/arm/dl-machine.h | 28 +++++++++++++++++++++++++---
> >  1 file changed, 25 insertions(+), 3 deletions(-)
> > 
> > diff --git a/sysdeps/arm/dl-machine.h b/sysdeps/arm/dl-machine.h
> > index dfa05eee44..d6e5f1d5ec 100644
> > --- a/sysdeps/arm/dl-machine.h
> > +++ b/sysdeps/arm/dl-machine.h
> > @@ -39,11 +39,33 @@ elf_machine_matches_host (const Elf32_Ehdr *ehdr)
> >  }
> >  
> >  /* Return the run-time load address of the shared object.  */
> > -static inline ElfW(Addr) __attribute__ ((unused))
> > +static inline Elf32_Addr __attribute__ ((unused))
> >  elf_machine_load_address (void)
> >  {
> > -  extern const ElfW(Ehdr) __ehdr_start attribute_hidden;
> > -  return (ElfW(Addr)) &__ehdr_start;
> > +  Elf32_Addr pcrel_addr;
> > +#ifdef SHARED
> > +  extern Elf32_Addr __dl_start (void *) asm ("_dl_start");
> > +  Elf32_Addr got_addr = (Elf32_Addr) &__dl_start;
> > +  asm ("adr %0, _dl_start" : "=r" (pcrel_addr));
> > +#else
> > +  extern Elf32_Addr __dl_relocate_static_pie (void *)
> > +    asm ("_dl_relocate_static_pie") attribute_hidden;
> > +  Elf32_Addr got_addr = (Elf32_Addr) &__dl_relocate_static_pie;
> > +  asm ("adr %0, _dl_relocate_static_pie" : "=r" (pcrel_addr));
> > +#endif
> > +#ifdef __thumb__
> > +  /* Clear the low bit of the function address.
> > +
> > +     NOTE: got_addr is from GOT table whose lsb is always set by linker if it's
> > +     Thumb function address.  PCREL_ADDR comes from PC-relative calculation
> > +     which will finish during assembling.  GAS assembler before the fix for
> > +     PR gas/21458 was not setting the lsb but does after that.  Always do the
> > +     strip for both, so the code works with various combinations of glibc and
> > +     Binutils.  */
> > +  got_addr &= ~(Elf32_Addr) 1;
> > +  pcrel_addr &= ~(Elf32_Addr) 1;
> > +#endif
> > +  return pcrel_addr - got_addr;
> >  }
> >  
> >  /* Return the link-time address of _DYNAMIC.  */
> > -- 
> > 2.20.1
> >   




Best regards,

Lukasz Majewski

  
Fangrui Song Oct. 15, 2021, 11:53 p.m. UTC | #5
On Fri, Oct 15, 2021 at 6:00 AM Lukasz Majewski <lukma@denx.de> wrote:
>
> On Fri, 15 Oct 2021 05:21:23 -0700
> "H.J. Lu" <hjl.tools@gmail.com> wrote:
>
> > On Fri, Oct 15, 2021 at 5:09 AM Szabolcs Nagy via Libc-alpha
> > <libc-alpha@sourceware.org> wrote:
> > >
> > > The 10/15/2021 09:54, Lukasz Majewski wrote:
> > > > This change is a partial revert of commit
> > > > bca0f5cbc9257c13322b99e55235c4f21ba0bd82
> > > > "arm: Simplify elf_machine_{load_address,dynamic}" which imposed
> > > > usage of __ehdr_start linker variable to get the address of
> > > > loaded program.
> > > >
> > > > The elf_machine_load_address() function is declared in the
> > > > sysdeps/arm/dl-machine.h header. It is called from (very early)
> > > > _dl_start() entry point for the program. It shall return the load
> > > > address of the dynamic linker program.
> > > >
> > > > With this revert the 'adr' assembler instruction is used instead
> > > > of a place holder:
> > > >
> > > > arm-poky-linux-gnueabi-objdump -t ld-linux-armhf.so.3 | grep ehdr
> > > > 00000000 l       .note.gnu.build-id     00000000      __ehdr_start
> > > >
> > > > which is pre-set by binutils.
> > > >
> > > > The problem starts when one runs 'prelink' on the rootfs created
> > > > with for example OE/Yocto.
> > > > > Then the __ehdr_start stays as 0x0, but the ELF header's sections
> > > > have different addresses - for example 0x41000000 instead of the
> > > > originally set 0x0.
> > > >
> > > > This is crucial when /sbin/init is executed. Value set in
> > > > __ehdr_start symbol is not updated. This causes the program to
> > > > crash very early when ld-linux-armhf.so.3's _dl_start is
> > > > executed, as calculated offset for loader relocation is going to
> > > > hit the kernel space (0xf7xxyyyy).
> > > >
> > > > It looks like the correct way to obtain the _dl_start offset on
> > > > ARM is to use assembler instruction 'adr' at execution time (so
> > > > the prelink assigned offset is taken into consideration) instead
> > > > of __ehdr_start.
> > > >
> > > > With this patch we only modify the elf_machine_load_address()
> > > > function, as it is called very early, before the
> > > > ld-linux-armhf.so.3 is performing relocation (also its own one).
> > >
> > > i'd use an explanation like:
> > >
> > > __ehdr_start is a linker created symbol that points to the elf
> > > header. The elf header is at the beginning of the elf file and
> > > normally its virtual address is 0 in a shared library.  This means
> > > the runtime address of __ehdr_start is the load address of the
> > > module.  However if prelinking is applied to ld.so then all virtual
> > > addresses are moved by an offset so the runtime address of the elf
> > > header becomes the load address + prelink offset.  The kernel does
> > > not treat prelinked ld.so specially so the load address is not 0,
> > > it still has to be computed, but simply using __ehdr_start no
> > > longer gives a correct value for that.
> > >
> > > This issue affects all targets with prelinking support, but so far
> > > we only got reports from OE/Yocto builds for arm that has prelinked
> > > ld.so.
> > >
> > > but i think a better fix is possible than revert:
> >
> > I think either prelink should be fixed not to prelink ld.so or Yocto
> > should be fixed not to prelink ld.so.
> >
>
> Could you explain why?
>
> Was the relocation of ld.so (I guess that ld.so = ld-linux-arm.so) a
> bug from the very beginning and it was apparent just now?

Prelinking improves application relocation performance but prelinking
ld.so itself doesn't provide any saving.
It is very likely that the prelink program doesn't intend to prelink
ld.so. It just doesn't provide a diagnostic.
If we look at the problem from this angle, prelinking ld.so is a pilot
error: OE/Yocto used an unsupported thing which happened to work in
the past.
Now, the unsupported (well, it can be supported if prelink correctly
prelinks ld.so) thing fails.
I sent the original commit trying to untangle the messy arm code.
Although Szabolcs's version is still short, I'd prefer that we don't add
workarounds to glibc for user/prelink errors.

> From my point of view - the original change to use __ehdr_start broke
> working setups, so it is a regression and shall be fixed in glibc.
>
> Anyway, it would be beneficial to have input from other glibc
> developers how to proceed with this issue.
>
>
> Best regards,
>
> Lukasz Majewski
>
  
Szabolcs Nagy Oct. 18, 2021, 11:08 a.m. UTC | #6
The 10/15/2021 16:53, Fāng-ruì Sòng wrote:
> On Fri, Oct 15, 2021 at 6:00 AM Lukasz Majewski <lukma@denx.de> wrote:
> > On Fri, 15 Oct 2021 05:21:23 -0700
> > "H.J. Lu" <hjl.tools@gmail.com> wrote:
> > > I think either prelink should be fixed not to prelink ld.so or Yocto
> > > should be fixed not to prelink ld.so.
> >
> > Could you explain why?
> >
> > Was the relocation of ld.so (I guess that ld.so = ld-linux-arm.so) a
> > bug from the very beginning and it was apparent just now?
> 
> Prelinking improves application relocation performance but prelinking
> ld.so itself doesn't provide any saving.
> It is very likely that the prelink program doesn't intend to prelink
> ld.so. It just doesn't provide a diagnostic.
> If we look at the problem from this angle, prelinking ld.so is a pilot
> error: OE/Yocto used an unsupported thing which happened to work in
> the past.
> Now, the unsupported (well, it can be supported if prelink correctly
> prelinks ld.so) thing fails.
> I sent the original commit trying to untangle the messy arm code.
> Although Szabolcs's version is still short, I'd prefer we don't work
> around glibc for error/prelink errors.

i don't know much about prelinking, but i'd expect that ld.so
has to be prelinked for it to work:

if the kernel can load ld.so anywhere it will conflict with
other libraries that prelinking allocated to a fixed location.

instead ld.so has to be prelinked to an offset that comes after
all other prelinked libraries in the system, then the kernel
will place it after all other libraries at runtime.

i don't have a prelinked system to check if this is the case.
  
Florian Weimer Oct. 18, 2021, 11:35 a.m. UTC | #7
* Szabolcs Nagy:

> i don't know much about prelinking, but i'd expect that ld.so
> has to be prelinked for it to work:
>
> if the kernel can load ld.so anywhere it will conflict with
> other libraries that prelinking allocated to a fixed location.

I think ld.so can back out prelinking if it detects any conflicts.
(ld.so doesn't use MAP_FIXED for the initial ET_DYN mapping even when
prelinking.)

> instead ld.so has to be prelinked to an offset that comes after
> all other prelinked libraries in the system, then the kernel
> will place it after all other libraries at runtime.
>
> i don't have a prelinked system to check if this is the case.

I tried on glibc 2.12-based system with prelink enabled and got this:

# ldd /bin/bash
	linux-vdso.so.1 =>  (0x00007ffc7e798000)
	libtinfo.so.5 => /lib64/libtinfo.so.5 (0x0000003da9800000)
	libdl.so.2 => /lib64/libdl.so.2 (0x0000003da7400000)
	libc.so.6 => /lib64/libc.so.6 (0x0000003da7800000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f8dc919c000)
# ldd /bin/bash
	linux-vdso.so.1 =>  (0x00007ffef3bf4000)
	libtinfo.so.5 => /lib64/libtinfo.so.5 (0x0000003da9800000)
	libdl.so.2 => /lib64/libdl.so.2 (0x0000003da7400000)
	libc.so.6 => /lib64/libc.so.6 (0x0000003da7800000)
	/lib64/ld-linux-x86-64.so.2 (0x00007ff9e66a6000)
# eu-readelf -d /lib64/ld-linux-x86-64.so.2

Dynamic segment contains 25 entries:
 Addr: 0x0000003da7220df0  Offset: 0x020df0  Link to section: [ 5] '.dynstr'
  Type              Value
  SONAME            Library soname: [ld-linux-x86-64.so.2]
  HASH              0x0000003da70001f0
  GNU_HASH          0x0000003da70002a8
  STRTAB            0x0000003da7000608
  SYMTAB            0x0000003da7000380
  STRSZ             380 (bytes)
  SYMENT            24 (bytes)
  PLTGOT            0x0000003da7220f80
  PLTRELSZ          144 (bytes)
  PLTREL            RELA
  JMPREL            0x0000003da7000a30
  RELA              0x0000003da7000868
  RELASZ            456 (bytes)
  RELAENT           24 (bytes)
  VERDEF            0x0000003da70007c0
  VERDEFNUM         5
  BIND_NOW          
  FLAGS_1           NOW
  VERSYM            0x0000003da7000784
  RELACOUNT         16
  CHECKSUM          0x00000000e90e92bc
  GNU_PRELINKED     0x00000000616d5a26
  NULL              
  NULL              
  NULL
# 

As expected based on the previous discussion here, the kernel maps ld.so
at random addresses even though it has been prelinked.

This looks like another place where ASLR layout has to be tweaked
carefully to avoid obscure failure modes.

Thanks,
Florian
  
Lukasz Majewski Oct. 19, 2021, 12:03 p.m. UTC | #8
Hi Florian,

> * Szabolcs Nagy:
> 
> > i don't know much about prelinking, but i'd expect that ld.so
> > has to be prelinked for it to work:
> >
> > if the kernel can load ld.so anywhere it will conflict with
> > other libraries that prelinking allocated to a fixed location.  
> 
> I think ld.so can back out prelinking if it detects any conflicts.
> (ld.so doesn't use MAP_FIXED for the initial ET_DYN mapping even when
> prelinking.)
> 
> > instead ld.so has to be prelinked to an offset that comes after
> > all other prelinked libraries in the system, then the kernel
> > will place it after all other libraries at runtime.
> >
> > i don't have a prelinked system to check if this is the case.  
> 
> I tried on glibc 2.12-based system with prelink enabled and got this:
> 
> # ldd /bin/bash
> 	linux-vdso.so.1 =>  (0x00007ffc7e798000)
> 	libtinfo.so.5 => /lib64/libtinfo.so.5 (0x0000003da9800000)
> 	libdl.so.2 => /lib64/libdl.so.2 (0x0000003da7400000)
> 	libc.so.6 => /lib64/libc.so.6 (0x0000003da7800000)
> 	/lib64/ld-linux-x86-64.so.2 (0x00007f8dc919c000)
> # ldd /bin/bash
> 	linux-vdso.so.1 =>  (0x00007ffef3bf4000)
> 	libtinfo.so.5 => /lib64/libtinfo.so.5 (0x0000003da9800000)
> 	libdl.so.2 => /lib64/libdl.so.2 (0x0000003da7400000)
> 	libc.so.6 => /lib64/libc.so.6 (0x0000003da7800000)
> 	/lib64/ld-linux-x86-64.so.2 (0x00007ff9e66a6000)
> # eu-readelf -d /lib64/ld-linux-x86-64.so.2
> 
> Dynamic segment contains 25 entries:
>  Addr: 0x0000003da7220df0  Offset: 0x020df0  Link to section: [ 5] '.dynstr'
>   Type              Value
>   SONAME            Library soname: [ld-linux-x86-64.so.2]
>   HASH              0x0000003da70001f0
>   GNU_HASH          0x0000003da70002a8
>   STRTAB            0x0000003da7000608
>   SYMTAB            0x0000003da7000380
>   STRSZ             380 (bytes)
>   SYMENT            24 (bytes)
>   PLTGOT            0x0000003da7220f80
>   PLTRELSZ          144 (bytes)
>   PLTREL            RELA
>   JMPREL            0x0000003da7000a30
>   RELA              0x0000003da7000868
>   RELASZ            456 (bytes)
>   RELAENT           24 (bytes)
>   VERDEF            0x0000003da70007c0
>   VERDEFNUM         5
>   BIND_NOW          
>   FLAGS_1           NOW
>   VERSYM            0x0000003da7000784
>   RELACOUNT         16
>   CHECKSUM          0x00000000e90e92bc
>   GNU_PRELINKED     0x00000000616d5a26
>   NULL              
>   NULL              
>   NULL
> # 
> 
> As expected based on the previous discussion here, the kernel maps
> ld.so at random addresses even though it has been prelinked.
> 
> This looks like another place where ASLR layout has to be tweaked
> carefully to avoid obscure failure modes.

Is the approach proposed by Szabolcs acceptable for 32-bit ARM? Or
shall we look for another way to proceed with this issue?

> 
> Thanks,
> Florian
> 


Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de
  
Lukasz Majewski Oct. 25, 2021, 10:18 a.m. UTC | #9
Dear Community,

> * Szabolcs Nagy:
> 
> > i don't know much about prelinking, but i'd expect that ld.so
> > has to be prelinked for it to work:
> >
> > if the kernel can load ld.so anywhere it will conflict with
> > other libraries that prelinking allocated to a fixed location.  
> 
> I think ld.so can back out prelinking if it detects any conflicts.
> (ld.so doesn't use MAP_FIXED for the initial ET_DYN mapping even when
> prelinking.)
> 
> > instead ld.so has to be prelinked to an offset that comes after
> > all other prelinked libraries in the system, then the kernel
> > will place it after all other libraries at runtime.
> >
> > i don't have a prelinked system to check if this is the case.  
> 
> I tried on glibc 2.12-based system with prelink enabled and got this:
> 
> # ldd /bin/bash
> 	linux-vdso.so.1 =>  (0x00007ffc7e798000)
> 	libtinfo.so.5 => /lib64/libtinfo.so.5 (0x0000003da9800000)
> 	libdl.so.2 => /lib64/libdl.so.2 (0x0000003da7400000)
> 	libc.so.6 => /lib64/libc.so.6 (0x0000003da7800000)
> 	/lib64/ld-linux-x86-64.so.2 (0x00007f8dc919c000)
> # ldd /bin/bash
> 	linux-vdso.so.1 =>  (0x00007ffef3bf4000)
> 	libtinfo.so.5 => /lib64/libtinfo.so.5 (0x0000003da9800000)
> 	libdl.so.2 => /lib64/libdl.so.2 (0x0000003da7400000)
> 	libc.so.6 => /lib64/libc.so.6 (0x0000003da7800000)
> 	/lib64/ld-linux-x86-64.so.2 (0x00007ff9e66a6000)
> # eu-readelf -d /lib64/ld-linux-x86-64.so.2
> 
> Dynamic segment contains 25 entries:
>  Addr: 0x0000003da7220df0  Offset: 0x020df0  Link to section: [ 5] '.dynstr'
>   Type              Value
>   SONAME            Library soname: [ld-linux-x86-64.so.2]
>   HASH              0x0000003da70001f0
>   GNU_HASH          0x0000003da70002a8
>   STRTAB            0x0000003da7000608
>   SYMTAB            0x0000003da7000380
>   STRSZ             380 (bytes)
>   SYMENT            24 (bytes)
>   PLTGOT            0x0000003da7220f80
>   PLTRELSZ          144 (bytes)
>   PLTREL            RELA
>   JMPREL            0x0000003da7000a30
>   RELA              0x0000003da7000868
>   RELASZ            456 (bytes)
>   RELAENT           24 (bytes)
>   VERDEF            0x0000003da70007c0
>   VERDEFNUM         5
>   BIND_NOW          
>   FLAGS_1           NOW
>   VERSYM            0x0000003da7000784
>   RELACOUNT         16
>   CHECKSUM          0x00000000e90e92bc
>   GNU_PRELINKED     0x00000000616d5a26
>   NULL              
>   NULL              
>   NULL
> # 
> 
> As expected based on the previous discussion here, the kernel maps
> ld.so at random addresses even though it has been prelinked.
> 
> This looks like another place where ASLR layout has to be tweaked
> carefully to avoid obscure failure modes.
> 

Do we have any idea on how to move forward with this issue?

> Thanks,
> Florian
> 

Best regards,

Lukasz Majewski

  
Florian Weimer Oct. 25, 2021, 10:25 a.m. UTC | #10
* Lukasz Majewski:

> Do we have any idea on how to move forward with this issue?

Either fix the prelink tool not to prelink shared objects that do not
have a dependency on libc.so.6, or fix the dynamic loader to work if
prelinked on AArch64.  I do not have a strong opinion.

Thanks,
Florian
  
Lukasz Majewski Oct. 25, 2021, 10:53 a.m. UTC | #11
Hi Florian,

> * Lukasz Majewski:
> 
> > Do we have any idea on how to move forward with this issue?  
> 
> Either fix the prelink tool not to prelink shared objects that do not
> have a dependency on libc.so.6, or fix the dynamic loader to work if
> prelinked on AArch64.

Just to be precise: both 64-bit and 32-bit ARM are affected.

> I do not have a strong opinion.

Thanks for your opinion. Let's wait for opinions from other community
members.

> 
> Thanks,
> Florian
> 



Best regards,

Lukasz Majewski

  
Szabolcs Nagy Oct. 25, 2021, 1:34 p.m. UTC | #12
The 10/25/2021 12:53, Lukasz Majewski wrote:
> Hi Florian,
> 
> > * Lukasz Majewski:
> > 
> > > Do we have any idea on how to move forward with this issue?  
> > 
> > Either fix the prelink tool not to prelink shared objects that do not
> > have a dependency on libc.so.6, or fix the dynamic loader to work if
> > prelinked on AArch64.
> 
> Just for the correctness - both 64 and 32 bit ARMs are affected.

last time i looked, prelinking did not support tlsdesc
correctly so it is unusable for aarch64.

does yocto/oe use prelinking on aarch64?

> 
> > I do not have a strong opinion.
> 
> Thanks for your opinion. Let's wait for other community members
> opinions.

i think fixing the arm load address computation makes
sense (small extra cost of a relative reloc). i think
the c code proposal i made in the thread is nicer than
the old asm.

(i'm happy to make the same change on aarch64 too if
prelinking is used there, but i think that's broken.)
  
Lukasz Majewski Oct. 25, 2021, 2:04 p.m. UTC | #13
Hi Szabolcs,

> The 10/25/2021 12:53, Lukasz Majewski wrote:
> > Hi Florian,
> >   
> > > * Lukasz Majewski:
> > >   
> > > > Do we have any idea on how to move forward with this issue?    
> > > 
> > > Either fix the prelink tool not to prelink shared objects that do
> > > not have a dependency on libc.so.6, or fix the dynamic loader to
> > > work if prelinked on AArch64.  
> > 
> > Just for the correctness - both 64 and 32 bit ARMs are affected.  
> 
> last time i looked, prelinking did not support tlsdesc
> correctly so it is unusable for aarch64.
> 
> does yocto/oe use prelinking on aarch64?

I think so. The line
USER_CLASSES ?= "buildstats image-prelink"

is added to local.conf by default.

> 
> >   
> > > I do not have a strong opinion.  
> > 
> > Thanks for your opinion. Let's wait for other community members
> > opinions.  
> 
> i think fixing the arm load address computation makes
> sense (small extra cost of a relative reloc). i think
> the c code proposal i made in the thread is nicer than
> the old asm.
> 
> (i'm happy to make the same change on aarch64 too if
> prelinking is used there, but i think that's broken.)

+1

Best regards,

Lukasz Majewski

  
Szabolcs Nagy Oct. 25, 2021, 3:09 p.m. UTC | #14
The 10/25/2021 16:04, Lukasz Majewski wrote:
> > > > Either fix the prelink tool not to prelink shared objects that do
> > > > not have a dependency on libc.so.6, or fix the dynamic loader to
> > > > work if prelinked on AArch64.  
> > > 
> > > Just for the correctness - both 64 and 32 bit ARMs are affected.  
> > 
> > last time i looked, prelinking did not support tlsdesc
> > correctly so it is unusable for aarch64.
> > 
> > does yocto/oe use prelinking on aarch64?
> 
> I think yes - the 
> USER_CLASSES ?= "buildstats image-prelink"
> 
> is added by default to local.conf

ok, i think we need the patches upstream for that like
https://sourceware.org/pipermail/libc-alpha/2015-November/066153.html

> > > > I do not have a strong opinion.  
> > > 
> > > Thanks for your opinion. Let's wait for other community members
> > > opinions.  
> > 
> > i think fixing the arm load address computation makes
> > sense (small extra cost of a relative reloc). i think
> > the c code proposal i made in the thread is nicer than
> > the old asm.
> > 
> > (i'm happy to make the same change on aarch64 too if
> > prelinking is used there, but i think that's broken.)
> 
> +1

since you have a prelink setup, can you prepare the
arm and aarch64 patches?

(i suspect x86 would need the same fix, but probably
prelink is not used there anymore..?)
  
Joseph Myers Oct. 25, 2021, 5:26 p.m. UTC | #15
On Mon, 25 Oct 2021, Szabolcs Nagy via Libc-alpha wrote:

> ok, i think we need the patches upstream for that like
> https://sourceware.org/pipermail/libc-alpha/2015-November/066153.html

The AArch64 prelink support isn't in the upstream Yocto cross-prelink, and 
the version written by Samsung in 2015 and on the cross_prelink_aarch64 
branch has various problems resulting in test failures, in my experience.

I sent patches (on top of a merge of the upstream cross_prelink and 
cross_prelink_aarch64 branches) to the maintainer in May 2020 (the Yocto 
project mailing list doesn't accept email from non-subscribers, so won't 
have seen those patches), which made it work well enough to get clean 
prelink test results, but so far they haven't been committed to the 
upstream cross_prelink branch (or any other upstream branch).  I've now 
made those available at https://github.com/jsm28/prelink 
(cross_prelink_aarch64_fixes branch).

Note however that, like the original patches from Samsung, this version 
does indeed depend on a hack in _dl_tlsdesc_undefweak to work with TLS 
descriptors.
  
Lukasz Majewski Oct. 25, 2021, 6:25 p.m. UTC | #16
Hi Szabolcs,

> The 10/25/2021 16:04, Lukasz Majewski wrote:
> > > > > Either fix the prelink tool not to prelink shared objects
> > > > > that do not have a dependency on libc.so.6, or fix the
> > > > > dynamic loader to work if prelinked on AArch64.    
> > > > 
> > > > Just for the correctness - both 64 and 32 bit ARMs are
> > > > affected.    
> > > 
> > > last time i looked, prelinking did not support tlsdesc
> > > correctly so it is unusable for aarch64.
> > > 
> > > does yocto/oe use prelinking on aarch64?  
> > 
> > I think yes - the 
> > USER_CLASSES ?= "buildstats image-prelink"
> > 
> > is added by default to local.conf  
> 
> ok, i think we need the patches upstream for that like
> https://sourceware.org/pipermail/libc-alpha/2015-November/066153.html
> 

Oh... I see.

> > > > > I do not have a strong opinion.    
> > > > 
> > > > Thanks for your opinion. Let's wait for other community members
> > > > opinions.    
> > > 
> > > i think fixing the arm load address computation makes
> > > sense (small extra cost of a relative reloc). i think
> > > the c code proposal i made in the thread is nicer than
> > > the old asm.
> > > 
> > > (i'm happy to make the same change on aarch64 too if
> > > prelinking is used there, but i think that's broken.)  
> > 
> > +1  
> 
> since you have a prelink setup, can you prepare the
> arm and aarch64 patches?
> 

I can prepare the patch - no problem.

First, I would like to hear from the community whether we have a
consensus on this solution.

> (i suspect x86 would need the same fix, but probably
> prelink is not used there anymore..?)

I assume that, at least in Yocto, prelink is used by default on x86 as
well.

Best regards,

Lukasz Majewski

  
Lukasz Majewski Oct. 26, 2021, 1:52 p.m. UTC | #17
Hi Joseph, Szabolcs 

> On Mon, 25 Oct 2021, Szabolcs Nagy via Libc-alpha wrote:
> 
> > ok, i think we need the patches upstream for that like
> > https://sourceware.org/pipermail/libc-alpha/2015-November/066153.html
> >  
> 
> The AArch64 prelink support isn't in the upstream Yocto
> cross-prelink, and the version written by Samsung in 2015 and on the
> cross_prelink_aarch64 branch has various problems resulting in test
> failures, in my experience.

Ok.

> 
> I sent patches (on top of a merge of the upstream cross_prelink and 
> cross_prelink_aarch64 branches) to the maintainer in May 2020 (the
> Yocto project mailing list doesn't accept email from non-subscribers,
> so won't have seen those patches), which made it work well enough to
> get clean prelink test results, but so far they haven't been
> committed to the upstream cross_prelink branch (or any other upstream
> branch).  I've now made those available at
> https://github.com/jsm28/prelink (cross_prelink_aarch64_fixes branch).
> 

So this branch shall be pulled by yocto's cross-prelink maintainer.
Without it the cross-prelink doesn't support aarch64?

> Note however that, like the original patches from Samsung, this
> version does indeed depend on a hack in _dl_tlsdesc_undefweak to work
> with TLS descriptors.
> 

To solve this issue properly, we should pick one option per
architecture:

1. For arm

- Fix the cross-prelink (no patches available)

or

- Fix glibc (as proposed by Szabolcs)

2. For aarch64

- Try to upstream patches from Joseph to OE/Yocto's cross-prelink

or

- Fix glibc (if required)

or

- Do nothing (aarch64 will not be prelinked in OE/Yocto, which means
  that it will work correctly)


Best regards,

Lukasz Majewski

  
Joseph Myers Oct. 26, 2021, 8:55 p.m. UTC | #18
On Tue, 26 Oct 2021, Lukasz Majewski wrote:

> > I sent patches (on top of a merge of the upstream cross_prelink and 
> > cross_prelink_aarch64 branches) to the maintainer in May 2020 (the
> > Yocto project mailing list doesn't accept email from non-subscribers,
> > so won't have seen those patches), which made it work well enough to
> > get clean prelink test results, but so far they haven't been
> > committed to the upstream cross_prelink branch (or any other upstream
> > branch).  I've now made those available at
> > https://github.com/jsm28/prelink (cross_prelink_aarch64_fixes branch).
> 
> So this branch shall be pulled by yocto's cross-prelink maintainer.

It should be.

> Without it the cross-prelink doesn't support aarch64?

Correct.

With that branch, basic AArch64 support is there, but the following still 
applies:

> > Note however that, like the original patches from Samsung, this
> > version does indeed depend on a hack in _dl_tlsdesc_undefweak to work
> > with TLS descriptors.

(And given that hack, test results with my branch should be clear on 
AArch64.)
  
Szabolcs Nagy Oct. 27, 2021, 9:38 a.m. UTC | #19
The 10/26/2021 20:55, Joseph Myers wrote:
> On Tue, 26 Oct 2021, Lukasz Majewski wrote:
> 
> > > I sent patches (on top of a merge of the upstream cross_prelink and 
> > > cross_prelink_aarch64 branches) to the maintainer in May 2020 (the
> > > Yocto project mailing list doesn't accept email from non-subscribers,
> > > so won't have seen those patches), which made it work well enough to
> > > get clean prelink test results, but so far they haven't been
> > > committed to the upstream cross_prelink branch (or any other upstream
> > > branch).  I've now made those available at
> > > https://github.com/jsm28/prelink (cross_prelink_aarch64_fixes branch).
> > 
> > So this branch shall be pulled by yocto's cross-prelink maintainer.
> 
> It should be.
> 
> > Without it the cross-prelink doesn't support aarch64?
> 
> Correct.
> 
> With that branch, basic AArch64 support is there, but the following still 
> applies:
> 
> > > Note however that, like the original patches from Samsung, this
> > > version does indeed depend on a hack in _dl_tlsdesc_undefweak to work
> > > with TLS descriptors.
> 
> (And given that hack, test results with my branch should be clear on 
> AArch64.)

i think the undefweak hack is acceptable.
(in the non-prelinked case that path is rare)

and then i'm happy to take the load address change
to support prelinked ld.so.
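
For reference, a minimal sketch of the arithmetic behind such a load
address computation on 32-bit ARM: the offset is the run-time address of
a symbol minus the address the not-yet-relocated object records for that
symbol, with the Thumb state bit cleared on both. The helpers
strip_thumb_bit and load_offset are hypothetical names used only for
illustration. This is neither glibc code nor the C proposal mentioned in
the thread, and it leaves out how the two input addresses are obtained
(PC-relative addressing and a GOT read, respectively).

```c
#include <stdint.h>

/* Clears bit 0, which marks a Thumb function address on ARM.
   Hypothetical helper, not part of glibc.  */
static inline uint32_t
strip_thumb_bit (uint32_t addr)
{
  return addr & ~(uint32_t) 1;
}

/* Returns the offset between where a symbol actually is at run time and
   where the unrelocated object says it is.  Unsigned arithmetic wraps
   modulo 2^32, so the result is also well defined when the object is
   mapped below its recorded address.  Hypothetical helper.  */
static inline uint32_t
load_offset (uint32_t runtime_addr, uint32_t recorded_addr)
{
  return strip_thumb_bit (runtime_addr) - strip_thumb_bit (recorded_addr);
}
```

For example, if prelink assigned a Thumb function the address 0x41001235
and the kernel mapped the object so that the function ends up at
0xb6f01235, load_offset yields 0x75f00000. An object that is mapped
exactly where prelink placed it yields 0.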
  

Patch

diff --git a/sysdeps/arm/dl-machine.h b/sysdeps/arm/dl-machine.h
index dfa05eee44..d6e5f1d5ec 100644
--- a/sysdeps/arm/dl-machine.h
+++ b/sysdeps/arm/dl-machine.h
@@ -39,11 +39,33 @@  elf_machine_matches_host (const Elf32_Ehdr *ehdr)
 }
 
 /* Return the run-time load address of the shared object.  */
-static inline ElfW(Addr) __attribute__ ((unused))
+static inline Elf32_Addr __attribute__ ((unused))
 elf_machine_load_address (void)
 {
-  extern const ElfW(Ehdr) __ehdr_start attribute_hidden;
-  return (ElfW(Addr)) &__ehdr_start;
+  Elf32_Addr pcrel_addr;
+#ifdef SHARED
+  extern Elf32_Addr __dl_start (void *) asm ("_dl_start");
+  Elf32_Addr got_addr = (Elf32_Addr) &__dl_start;
+  asm ("adr %0, _dl_start" : "=r" (pcrel_addr));
+#else
+  extern Elf32_Addr __dl_relocate_static_pie (void *)
+    asm ("_dl_relocate_static_pie") attribute_hidden;
+  Elf32_Addr got_addr = (Elf32_Addr) &__dl_relocate_static_pie;
+  asm ("adr %0, _dl_relocate_static_pie" : "=r" (pcrel_addr));
+#endif
+#ifdef __thumb__
+  /* Clear the low bit of the function address.
+
+     NOTE: got_addr is from GOT table whose lsb is always set by linker if it's
+     Thumb function address.  PCREL_ADDR comes from PC-relative calculation
+     which will finish during assembling.  GAS assembler before the fix for
+     PR gas/21458 was not setting the lsb but does after that.  Always do the
+     strip for both, so the code works with various combinations of glibc and
+     Binutils.  */
+  got_addr &= ~(Elf32_Addr) 1;
+  pcrel_addr &= ~(Elf32_Addr) 1;
+#endif
+  return pcrel_addr - got_addr;
 }
 
 /* Return the link-time address of _DYNAMIC.  */