elf: Support DT_RELR relative relocation format [BZ #27924]

Message ID 20211008065740.1485737-1-maskray@google.com
State Superseded
Delegated to: Adhemerval Zanella Netto
Headers
Series elf: Support DT_RELR relative relocation format [BZ #27924] |

Checks

Context Check Description
dj/TryBot-apply_patch success Patch applied to master at the time it was sent
dj/TryBot-32bit success Build for i686

Commit Message

Fangrui Song Oct. 8, 2021, 6:57 a.m. UTC
  PIC objects (especially PIE and symbolic shared objects) usually have many
relative relocations. In 2017/2018, SHT_RELR/DT_RELR was proposed on
https://groups.google.com/g/generic-abi/c/bX460iggiKg/m/GxjM0L-PBAAJ
("Proposal for a new section type SHT_RELR") and welcomed by many parties
(including Solaris). This packed format can typically save 95% dynamic
relocation section size for PIE. The vaddr size of a PIE can be 10% smaller.

* Chrome OS folks have carried a local patch for a while (latest version:
  https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/refs/heads/main/sys-libs/glibc/files/local/glibc-2.32).
  I.e. this feature has been battle tested.
* Android bionic supports DT_RELR.
* The Linux kernel has supported CONFIG_RELR since 2019-08
  (https://git.kernel.org/linus/5cf896fb6be3effd9aea455b22213e27be8bdb1d).
* A musl patch (by me) exists but is not applied:
  https://www.openwall.com/lists/musl/2019/03/06/3

I believe upstream glibc should support DT_RELR to benefit all Linux
distributions.

As of linker support (to the best of my knowledge):

* LLD support DT_RELR.
* https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/refs/heads/main/sys-devel/binutils/files/
  has a gold patch.
* GNU ld feature request https://sourceware.org/bugzilla/show_bug.cgi?id=27923

I wish that GNU ld and gold maintainers can implement the feature as well :)

This patch is simpler than Chrome OS's glibc patch and makes ELF_DYNAMIC_DO_RELR
available to all ports.

I have adjusted aclocal.m4, otherwise it thinks ld.lld doesn't support
--pack-dyn-relocs=relr just because ld.lld -v --help doesn't contain the literal
string.

    % ld.lld -v --help | grep pack-dyn-relocs
    --pack-dyn-relocs=[none,android,relr,android+relr]

(`$gnu_ld` is a lie: both gold and ld.lld's "User-Agent:" strings contain
"GNU" and therefore make gnu_ld=true.)

Tested on aarch64 and x86_64.
---
 aclocal.m4             |  19 +++-----
 configure              | 107 ++++++++++++++++++++++++-----------------
 configure.ac           |   4 ++
 elf/Makefile           |   4 ++
 elf/dynamic-link.h     |  28 +++++++++++
 elf/elf.h              |  13 ++++-
 elf/get-dynamic-info.h |   3 ++
 elf/tst-relr.c         |  20 ++++++++
 8 files changed, 141 insertions(+), 57 deletions(-)
 create mode 100644 elf/tst-relr.c
  

Comments

Florian Weimer Oct. 8, 2021, 3:39 p.m. UTC | #1
* Fangrui Song via Libc-alpha:

> @@ -887,7 +893,10 @@ typedef struct
>  #define DT_PREINIT_ARRAY 32		/* Array with addresses of preinit fct*/
>  #define DT_PREINIT_ARRAYSZ 33		/* size in bytes of DT_PREINIT_ARRAY */
>  #define DT_SYMTAB_SHNDX	34		/* Address of SYMTAB_SHNDX section */
> -#define	DT_NUM		35		/* Number used */
> +#define DT_RELRSZ	35
> +#define DT_RELR		36
> +#define DT_RELRENT	37
> +#define	DT_NUM		38		/* Number used */
>  #define DT_LOOS		0x6000000d	/* Start of OS-specific */
>  #define DT_HIOS		0x6ffff000	/* End of OS-specific */
>  #define DT_LOPROC	0x70000000	/* Start of processor-specific */

<http://www.sco.com/developers/gabi/latest/ch5.dynamic.html#tag_encodings>
(Figure 5-10: Dynamic Array Tags, d_tag) still ends at 34, and I'm
worried about collisions with these numbers.

Do you know what the official allocation status for the new constants
is?
  
Fangrui Song Oct. 8, 2021, 4:36 p.m. UTC | #2
On 2021-10-08, Florian Weimer wrote:
>* Fangrui Song via Libc-alpha:
>
>> @@ -887,7 +893,10 @@ typedef struct
>>  #define DT_PREINIT_ARRAY 32		/* Array with addresses of preinit fct*/
>>  #define DT_PREINIT_ARRAYSZ 33		/* size in bytes of DT_PREINIT_ARRAY */
>>  #define DT_SYMTAB_SHNDX	34		/* Address of SYMTAB_SHNDX section */
>> -#define	DT_NUM		35		/* Number used */
>> +#define DT_RELRSZ	35
>> +#define DT_RELR		36
>> +#define DT_RELRENT	37
>> +#define	DT_NUM		38		/* Number used */
>>  #define DT_LOOS		0x6000000d	/* Start of OS-specific */
>>  #define DT_HIOS		0x6ffff000	/* End of OS-specific */
>>  #define DT_LOPROC	0x70000000	/* Start of processor-specific */
>
><http://www.sco.com/developers/gabi/latest/ch5.dynamic.html#tag_encodings>
>(Figure 5-10: Dynamic Array Tags, d_tag) still ends at 34, and I'm
>worried about collisions with these numbers.
>
>Do you know what the official allocation status for the new constants
>is?

AIUI the last update of the website was in 2015 when SHF_COMPRESSED was
added.

See https://groups.google.com/g/generic-abi/c/XsQpUE6s02o for
information about no maintenance from xinuous.

Cary took over maintenanceship (
https://groups.google.com/g/generic-abi/c/9OO5vhxb00Y "Ongoing
Maintenance of the gABI") but (AIUI) does not have access to that
website.

Updates on SHF_LINK_ORDER and SHT_RELR/DT_RELR are scattered in
generic-abi posts these days. The current reality is if Solaris/GNU/LLVM
folks have reached agreement, we can make some progress on
clarification/semantics update. (Personally when I make linker changes,
I may even reach out to FreeBSD folks on IRC.)
(For new features getting ack from Solaris is the most difficult part in
these years. They are happy with RELR.)

 From Cary Coutant
(https://sourceware.org/bugzilla/show_bug.cgi?id=27924#c7)
"I am now under contract with Xinuos to convert the documentation
to a more maintainable form and maintain it from there. Current plans
are to translate it to ReStructured Text and place it on github."

If Cary doesn't assign DT_RELRSZ/DT_RELR/DT_RELRENT to other semantics
when then the new site is available, I think we are free to use the 3
values now:)
  
H.J. Lu Oct. 8, 2021, 4:51 p.m. UTC | #3
On Thu, Oct 7, 2021 at 11:58 PM Fangrui Song via Binutils
<binutils@sourceware.org> wrote:
>
> PIC objects (especially PIE and symbolic shared objects) usually have many
> relative relocations. In 2017/2018, SHT_RELR/DT_RELR was proposed on
> https://groups.google.com/g/generic-abi/c/bX460iggiKg/m/GxjM0L-PBAAJ
> ("Proposal for a new section type SHT_RELR") and welcomed by many parties
> (including Solaris). This packed format can typically save 95% dynamic
> relocation section size for PIE. The vaddr size of a PIE can be 10% smaller.
>
> * Chrome OS folks have carried a local patch for a while (latest version:
>   https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/refs/heads/main/sys-libs/glibc/files/local/glibc-2.32).
>   I.e. this feature has been battle tested.
> * Android bionic supports DT_RELR.
> * The Linux kernel has supported CONFIG_RELR since 2019-08
>   (https://git.kernel.org/linus/5cf896fb6be3effd9aea455b22213e27be8bdb1d).
> * A musl patch (by me) exists but is not applied:
>   https://www.openwall.com/lists/musl/2019/03/06/3
>
> I believe upstream glibc should support DT_RELR to benefit all Linux
> distributions.
>
> As of linker support (to the best of my knowledge):
>
> * LLD support DT_RELR.
> * https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/refs/heads/main/sys-devel/binutils/files/
>   has a gold patch.
> * GNU ld feature request https://sourceware.org/bugzilla/show_bug.cgi?id=27923
>
> I wish that GNU ld and gold maintainers can implement the feature as well :)

This feature should be implemented in GNU linker first before getting
into glibc.


H.J.
  
Fangrui Song Oct. 8, 2021, 5:37 p.m. UTC | #4
On 2021-10-08, H.J. Lu wrote:
>On Thu, Oct 7, 2021 at 11:58 PM Fangrui Song via Binutils
><binutils@sourceware.org> wrote:
>>
>> PIC objects (especially PIE and symbolic shared objects) usually have many
>> relative relocations. In 2017/2018, SHT_RELR/DT_RELR was proposed on
>> https://groups.google.com/g/generic-abi/c/bX460iggiKg/m/GxjM0L-PBAAJ
>> ("Proposal for a new section type SHT_RELR") and welcomed by many parties
>> (including Solaris). This packed format can typically save 95% dynamic
>> relocation section size for PIE. The vaddr size of a PIE can be 10% smaller.
>>
>> * Chrome OS folks have carried a local patch for a while (latest version:
>>   https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/refs/heads/main/sys-libs/glibc/files/local/glibc-2.32).
>>   I.e. this feature has been battle tested.
>> * Android bionic supports DT_RELR.
>> * The Linux kernel has supported CONFIG_RELR since 2019-08
>>   (https://git.kernel.org/linus/5cf896fb6be3effd9aea455b22213e27be8bdb1d).
>> * A musl patch (by me) exists but is not applied:
>>   https://www.openwall.com/lists/musl/2019/03/06/3
>>
>> I believe upstream glibc should support DT_RELR to benefit all Linux
>> distributions.
>>
>> As of linker support (to the best of my knowledge):
>>
>> * LLD support DT_RELR.
>> * https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/refs/heads/main/sys-devel/binutils/files/
>>   has a gold patch.
>> * GNU ld feature request https://sourceware.org/bugzilla/show_bug.cgi?id=27923
>>
>> I wish that GNU ld and gold maintainers can implement the feature as well :)
>
>This feature should be implemented in GNU linker first before getting
>into glibc.
>
>
>H.J.

Why? I think the situation is quite different from other features where
we generally want producers before consumers.  The feature is validated
by multiple parties and has multiple independent implementations and has
an implementation in a glibc supported linker (LLD).

Landing into glibc first can do GNU linker folks a favor: they don't
need to apply the glibc patch themselves to validate correctness of the
linker change.
  
H.J. Lu Oct. 8, 2021, 5:43 p.m. UTC | #5
On Fri, Oct 8, 2021 at 10:37 AM Fangrui Song <maskray@google.com> wrote:
>
> On 2021-10-08, H.J. Lu wrote:
> >On Thu, Oct 7, 2021 at 11:58 PM Fangrui Song via Binutils
> ><binutils@sourceware.org> wrote:
> >>
> >> PIC objects (especially PIE and symbolic shared objects) usually have many
> >> relative relocations. In 2017/2018, SHT_RELR/DT_RELR was proposed on
> >> https://groups.google.com/g/generic-abi/c/bX460iggiKg/m/GxjM0L-PBAAJ
> >> ("Proposal for a new section type SHT_RELR") and welcomed by many parties
> >> (including Solaris). This packed format can typically save 95% dynamic
> >> relocation section size for PIE. The vaddr size of a PIE can be 10% smaller.
> >>
> >> * Chrome OS folks have carried a local patch for a while (latest version:
> >>   https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/refs/heads/main/sys-libs/glibc/files/local/glibc-2.32).
> >>   I.e. this feature has been battle tested.
> >> * Android bionic supports DT_RELR.
> >> * The Linux kernel has supported CONFIG_RELR since 2019-08
> >>   (https://git.kernel.org/linus/5cf896fb6be3effd9aea455b22213e27be8bdb1d).
> >> * A musl patch (by me) exists but is not applied:
> >>   https://www.openwall.com/lists/musl/2019/03/06/3
> >>
> >> I believe upstream glibc should support DT_RELR to benefit all Linux
> >> distributions.
> >>
> >> As of linker support (to the best of my knowledge):
> >>
> >> * LLD support DT_RELR.
> >> * https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/refs/heads/main/sys-devel/binutils/files/
> >>   has a gold patch.
> >> * GNU ld feature request https://sourceware.org/bugzilla/show_bug.cgi?id=27923
> >>
> >> I wish that GNU ld and gold maintainers can implement the feature as well :)
> >
> >This feature should be implemented in GNU linker first before getting
> >into glibc.
> >
> >
> >H.J.
>
> Why? I think the situation is quite different from other features where
> we generally want producers before consumers.  The feature is validated
> by multiple parties and has multiple independent implementations and has
> an implementation in a glibc supported linker (LLD).

This feature should be validated by the default glibc linker.   Otherwise,
things can go wrong unnoticed.

> Landing into glibc first can do GNU linker folks a favor: they don't
> need to apply the glibc patch themselves to validate correctness of the
> linker change.

This has never been a real issue.  We prefer to test linker correctness
independent of host, library and/or CPU.
  
Fangrui Song Oct. 8, 2021, 6:46 p.m. UTC | #6
On 2021-10-08, H.J. Lu wrote:
>On Fri, Oct 8, 2021 at 10:37 AM Fangrui Song <maskray@google.com> wrote:
>>
>> On 2021-10-08, H.J. Lu wrote:
>> >On Thu, Oct 7, 2021 at 11:58 PM Fangrui Song via Binutils
>> ><binutils@sourceware.org> wrote:
>> >>
>> >> PIC objects (especially PIE and symbolic shared objects) usually have many
>> >> relative relocations. In 2017/2018, SHT_RELR/DT_RELR was proposed on
>> >> https://groups.google.com/g/generic-abi/c/bX460iggiKg/m/GxjM0L-PBAAJ
>> >> ("Proposal for a new section type SHT_RELR") and welcomed by many parties
>> >> (including Solaris). This packed format can typically save 95% dynamic
>> >> relocation section size for PIE. The vaddr size of a PIE can be 10% smaller.
>> >>
>> >> * Chrome OS folks have carried a local patch for a while (latest version:
>> >>   https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/refs/heads/main/sys-libs/glibc/files/local/glibc-2.32).
>> >>   I.e. this feature has been battle tested.
>> >> * Android bionic supports DT_RELR.
>> >> * The Linux kernel has supported CONFIG_RELR since 2019-08
>> >>   (https://git.kernel.org/linus/5cf896fb6be3effd9aea455b22213e27be8bdb1d).
>> >> * A musl patch (by me) exists but is not applied:
>> >>   https://www.openwall.com/lists/musl/2019/03/06/3
>> >>
>> >> I believe upstream glibc should support DT_RELR to benefit all Linux
>> >> distributions.
>> >>
>> >> As of linker support (to the best of my knowledge):
>> >>
>> >> * LLD support DT_RELR.
>> >> * https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/refs/heads/main/sys-devel/binutils/files/
>> >>   has a gold patch.
>> >> * GNU ld feature request https://sourceware.org/bugzilla/show_bug.cgi?id=27923
>> >>
>> >> I wish that GNU ld and gold maintainers can implement the feature as well :)
>> >
>> >This feature should be implemented in GNU linker first before getting
>> >into glibc.
>> >
>> >
>> >H.J.
>>
>> Why? I think the situation is quite different from other features where
>> we generally want producers before consumers.  The feature is validated
>> by multiple parties and has multiple independent implementations and has
>> an implementation in a glibc supported linker (LLD).
>
>This feature should be validated by the default glibc linker.   Otherwise,
>things can go wrong unnoticed.

I wish you did not get overly cautious on this matter:)

The original ChromeOS patch
https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/refs/heads/main/sys-libs/glibc/files/local/glibc-2.32/0004-sys-libs-glibc-add-support-for-SHT_RELR-sections.patch
is indeed intrusive and may cause maintenance problem in my view
(with my recent experience working on the loader).
Its elf_machine_relr_relative callback and its change to elf/do-rel.h
make me concerned.

This patch is clean and touches just one place. The core logic is just
the 26-line ELF_DYNAMIC_DO_RELR in an isolated place.
The ELF_DYNAMIC_RELOCATE framework hasn't change since 1995.
The new logic doesn't touch symbol resolution, so it will require
minimum maintenance burden.

Since I put up the patch, I certainly sign up for maintaining this piece
of code in case of a rare break.

>> Landing into glibc first can do GNU linker folks a favor: they don't
>> need to apply the glibc patch themselves to validate correctness of the
>> linker change.
>
>This has never been a real issue.  We prefer to test linker correctness
>independent of host, library and/or CPU.

May I ask you, Alan, or Nick, or other GNU ld maintainers to take up the work? :)

Sorry that I cannot deal with the sheer mount of GNU ld complexity by
myself. I will be happy to do validation, though.

I can give one hint that since .relr.dyn is sensitive to the relocated
locations' addresses care needs  to be taken to avoid size oscillation
especially on architectures which need range extension
thunks(veneers/stub groups).  A nice approach is to never shrink the
size of .relr.dyn (https://reviews.llvm.org/D67164)

https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/refs/heads/main/sys-devel/binutils/files/0006-gold-readelf-add-experimental-support-for-SHT_RELR-s.patch
mixes gold and readelf patches. Perhaps a maintainer wants to split it.
(--experimental-use-relr may be better spelled as --pack-dyn-relocs=relr
to be consistent with LLD. If you insist that ELF specific options
should use -z relr I am fine as well.)

(I just checked: Rahul Chaudhry <rahulchaudhry@google.com> no longer
works at Google so the email address will not work.)
  
Cary Coutant Oct. 8, 2021, 7:41 p.m. UTC | #7
> >Do you know what the official allocation status for the new constants
> >is?
>
> Cary took over maintenanceship (
> https://groups.google.com/g/generic-abi/c/9OO5vhxb00Y "Ongoing
> Maintenance of the gABI") but (AIUI) does not have access to that
> website.

I am nearing completion of the conversion of the gABI/ELF spec to
reStructuredText, after which I will start applying the changes we've
agreed to over the years. The plan is to maintain it on github. Stay
tuned.

-cary
  
Jan Beulich Oct. 11, 2021, 7:48 a.m. UTC | #8
On 08.10.2021 08:57, Fangrui Song via Binutils wrote:
> --- a/elf/dynamic-link.h
> +++ b/elf/dynamic-link.h
> @@ -192,6 +192,33 @@ elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[],
>  #  define ELF_DYNAMIC_DO_RELA(map, scope, lazy, skip_ifunc) /* Nothing to do.  */
>  # endif
>  
> +# define ELF_DYNAMIC_DO_RELR(map)					      \
> +  do {									      \
> +    ElfW(Addr) l_addr = (map)->l_addr, base = 0, start;			      \
> +    const ElfW(Relr) *r = 0, *end = 0;					      \
> +    if (!(map)->l_info[DT_RELR])					      \
> +      break;								      \
> +    start = D_PTR((map), l_info[DT_RELR]);				      \
> +    r = (const ElfW(Relr) *)start;					      \
> +    end = (const ElfW(Relr) *)(start + (map)->l_info[DT_RELRSZ]->d_un.d_val); \
> +    for (; r < end; ++r) {						      \
> +      ElfW(Relr) entry = *r;						      \
> +      if ((entry & 1) == 0) {						      \
> +	*((ElfW(Addr) *)(l_addr + entry)) += l_addr;			      \
> +	base = entry + sizeof(ElfW(Addr));				      \
> +	continue;							      \
> +      }									      \
> +      ElfW(Addr) offset = base;						      \
> +      do {								      \
> +	entry >>= 1;							      \
> +	if ((entry & 1) != 0)						      \
> +	  *((ElfW(Addr) *)(l_addr + offset)) += l_addr;			      \
> +	offset += sizeof(ElfW(Addr));					      \
> +      } while (entry != 0);						      \
> +      base += (8 * sizeof(ElfW(Relr)) - 1) * sizeof(ElfW(Addr));	      \

While in line with the proposed spec additions I'm afraid the uses of
ElfW(Addr) here aren't universally correct: You assume that ELF
container type (size) expresses an aspect of the ABI. While this is
indeed the case for several arch-es, I think this has been a mistake.
IA-64, while meanwhile mostly dead, is (was) an example where 64-bit
code can validly live in a 32-bit ELF container (at least as far as
the psABI is concerned; I have no idea whether glibc actually
followed the spec). There's a separate ELF header flag indicating the
ABI, and hence the size of a pointer.

> --- /dev/null
> +++ b/elf/tst-relr.c
> @@ -0,0 +1,20 @@
> +static int o, x;
> +void *arr[] = {
> +  &o, &o, &o, &o, &o, &o, &o, &o, &o, &o, &o, &o, &o, &o, &o, &o,
> +  0,
> +  &x, &x, &x, &x, &x, &x, &x, &x, &x, &x, &x, &x, &x, &x, &x, &x,
> +};

Personally I consider this overly simplistic a testcase. The two
sequences are fully identical, have no gaps except for the one in
the middle, and there's also no interleaving / mixing of pointers.
None of this should matter as the specific symbol the relocation
is for is not supposed to be of interest, but in a testcase you
want to make sure none of this has an effect.

I also wonder whether it is a good idea to have a testcase with
"fundamentally" different behavior on 32-bit vs 64-bit: The former
will require two RELR entries afaict, while the latter will get
away with one here.

Jan

> +static int
> +do_test (void)
> +{
> +  for (int i = 0; i < 16; i++)
> +    if (arr[i] != &o)
> +      return 1;
> +  for (int i = 17; i < 33; i++)
> +    if (arr[i] != &x)
> +      return 1;
> +  return 0;
> +}
> +
> +#include <support/test-driver.c>
>
  
Fangrui Song Oct. 11, 2021, 6:43 p.m. UTC | #9
On Mon, Oct 11, 2021 at 12:48 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 08.10.2021 08:57, Fangrui Song via Binutils wrote:
> > --- a/elf/dynamic-link.h
> > +++ b/elf/dynamic-link.h
> > @@ -192,6 +192,33 @@ elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[],
> >  #  define ELF_DYNAMIC_DO_RELA(map, scope, lazy, skip_ifunc) /* Nothing to do.  */
> >  # endif
> >
> > +# define ELF_DYNAMIC_DO_RELR(map)                                          \
> > +  do {                                                                             \
> > +    ElfW(Addr) l_addr = (map)->l_addr, base = 0, start;                            \
> > +    const ElfW(Relr) *r = 0, *end = 0;                                             \
> > +    if (!(map)->l_info[DT_RELR])                                           \
> > +      break;                                                               \
> > +    start = D_PTR((map), l_info[DT_RELR]);                                 \
> > +    r = (const ElfW(Relr) *)start;                                         \
> > +    end = (const ElfW(Relr) *)(start + (map)->l_info[DT_RELRSZ]->d_un.d_val); \
> > +    for (; r < end; ++r) {                                                 \
> > +      ElfW(Relr) entry = *r;                                               \
> > +      if ((entry & 1) == 0) {                                                      \
> > +     *((ElfW(Addr) *)(l_addr + entry)) += l_addr;                          \
> > +     base = entry + sizeof(ElfW(Addr));                                    \
> > +     continue;                                                             \
> > +      }                                                                            \
> > +      ElfW(Addr) offset = base;                                                    \
> > +      do {                                                                 \
> > +     entry >>= 1;                                                          \
> > +     if ((entry & 1) != 0)                                                 \
> > +       *((ElfW(Addr) *)(l_addr + offset)) += l_addr;                       \
> > +     offset += sizeof(ElfW(Addr));                                         \
> > +      } while (entry != 0);                                                \
> > +      base += (8 * sizeof(ElfW(Relr)) - 1) * sizeof(ElfW(Addr));           \
>
> While in line with the proposed spec additions I'm afraid the uses of
> ElfW(Addr) here aren't universally correct: You assume that ELF
> container type (size) expresses an aspect of the ABI. While this is
> indeed the case for several arch-es, I think this has been a mistake.
> IA-64, while meanwhile mostly dead, is (was) an example where 64-bit
> code can validly live in a 32-bit ELF container (at least as far as
> the psABI is concerned; I have no idea whether glibc actually
> followed the spec). There's a separate ELF header flag indicating the
> ABI, and hence the size of a pointer.

Thanks for chiming in.

As of ia64 buildability, it works for me:

scripts/build-many-glibcs.py /tmp/glibc-many compilers ia64-linux-gnu
mkdir -p out/ia64; cd out/ia64
../../configure --prefix=/tmp/glibc/ia64 --host=ia64-linux-gnu
CC=/tmp/glibc-many/install/compilers/ia64-linux-gnu/bin/ia64-glibc-linux-gnu-gcc
CXX=/tmp/glibc-many/install/compilers/ia64-linux-gnu/bin/ia64-glibc-linux-gnu-g++
make -j 50

As of the actual functionality, ugh, I cannot find ia64 in my Debian
testing's qemu-user-static package:( So I cannot test.

That said, gold and LLD don't support ia64.
If we have a concern that ia64 may not work, the GNU ld maintainers
can simply not add ia64 support:)
I hope that means we don't need ELF_MACHINE_NO_RELR.

> > --- /dev/null
> > +++ b/elf/tst-relr.c
> > @@ -0,0 +1,20 @@
> > +static int o, x;
> > +void *arr[] = {
> > +  &o, &o, &o, &o, &o, &o, &o, &o, &o, &o, &o, &o, &o, &o, &o, &o,
> > +  0,
> > +  &x, &x, &x, &x, &x, &x, &x, &x, &x, &x, &x, &x, &x, &x, &x, &x,
> > +};
>
> Personally I consider this overly simplistic a testcase. The two
> sequences are fully identical, have no gaps except for the one in
> the middle, and there's also no interleaving / mixing of pointers.
> None of this should matter as the specific symbol the relocation
> is for is not supposed to be of interest, but in a testcase you
> want to make sure none of this has an effect.
>
> I also wonder whether it is a good idea to have a testcase with
> "fundamentally" different behavior on 32-bit vs 64-bit: The former
> will require two RELR entries afaict, while the latter will get
> away with one here.

Thanks for the test suggestion. I can use some fancy patterns.

> Jan
>
> > +static int
> > +do_test (void)
> > +{
> > +  for (int i = 0; i < 16; i++)
> > +    if (arr[i] != &o)
> > +      return 1;
> > +  for (int i = 17; i < 33; i++)
> > +    if (arr[i] != &x)
> > +      return 1;
> > +  return 0;
> > +}
> > +
> > +#include <support/test-driver.c>
> >
>
  
Joseph Myers Oct. 11, 2021, 9:47 p.m. UTC | #10
On Thu, 7 Oct 2021, Fangrui Song via Libc-alpha wrote:

> I have adjusted aclocal.m4, otherwise it thinks ld.lld doesn't support
> --pack-dyn-relocs=relr just because ld.lld -v --help doesn't contain the literal
> string.

Since you're changing LIBC_LINKER_FEATURE, I think 
sysdeps/unix/sysv/linux/powerpc/configure also needs regenerating 
(sysdeps/unix/sysv/linux/powerpc/configure.ac uses LIBC_LINKER_FEATURE).
  
Joseph Myers Oct. 11, 2021, 10:08 p.m. UTC | #11
On Mon, 11 Oct 2021, Fāng-ruì Sòng via Libc-alpha wrote:

> > While in line with the proposed spec additions I'm afraid the uses of
> > ElfW(Addr) here aren't universally correct: You assume that ELF
> > container type (size) expresses an aspect of the ABI. While this is
> > indeed the case for several arch-es, I think this has been a mistake.
> > IA-64, while meanwhile mostly dead, is (was) an example where 64-bit
> > code can validly live in a 32-bit ELF container (at least as far as
> > the psABI is concerned; I have no idea whether glibc actually
> > followed the spec). There's a separate ELF header flag indicating the
> > ABI, and hence the size of a pointer.
> 
> Thanks for chiming in.
> 
> As of ia64 buildability, it works for me:

As far as I know, glibc and the Linux kernel never supported ILP32 on ia64 
(HP-UX did).  But I'm not entirely clear whether the ILP32 ABI is what's 
being referred to here (cf. x32, MIPS n32, etc. - ILP32 ABIs using 64-bit 
instructions, and 32-bit ELF), or something else.
  
Jan Beulich Oct. 12, 2021, 8:14 a.m. UTC | #12
On 12.10.2021 00:08, Joseph Myers wrote:
> On Mon, 11 Oct 2021, Fāng-ruì Sòng via Libc-alpha wrote:
> 
>>> While in line with the proposed spec additions I'm afraid the uses of
>>> ElfW(Addr) here aren't universally correct: You assume that ELF
>>> container type (size) expresses an aspect of the ABI. While this is
>>> indeed the case for several arch-es, I think this has been a mistake.
>>> IA-64, while meanwhile mostly dead, is (was) an example where 64-bit
>>> code can validly live in a 32-bit ELF container (at least as far as
>>> the psABI is concerned; I have no idea whether glibc actually
>>> followed the spec). There's a separate ELF header flag indicating the
>>> ABI, and hence the size of a pointer.
>>
>> Thanks for chiming in.
>>
>> As of ia64 buildability, it works for me:
> 
> As far as I know, glibc and the Linux kernel never supported ILP32 on ia64 
> (HP-UX did).  But I'm not entirely clear whether the ILP32 ABI is what's 
> being referred to here (cf. x32, MIPS n32, etc. - ILP32 ABIs using 64-bit 
> instructions, and 32-bit ELF), or something else.

Yes, it's ILP32 that I'm referring to, but specifically not 32-bit ELF
to express that (unlike e.g. x32 or Arm64's respective mode). Whether
MIPS is comparable I don't know - there are at least a couple of header
flags which suggest the ELF container size may be independent of the
ABI.

Jan
  
Jan Beulich Oct. 12, 2021, 8:18 a.m. UTC | #13
On 11.10.2021 20:43, Fāng-ruì Sòng wrote:
> On Mon, Oct 11, 2021 at 12:48 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 08.10.2021 08:57, Fangrui Song via Binutils wrote:
>>> --- a/elf/dynamic-link.h
>>> +++ b/elf/dynamic-link.h
>>> @@ -192,6 +192,33 @@ elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[],
>>>  #  define ELF_DYNAMIC_DO_RELA(map, scope, lazy, skip_ifunc) /* Nothing to do.  */
>>>  # endif
>>>
>>> +# define ELF_DYNAMIC_DO_RELR(map)                                          \
>>> +  do {                                                                             \
>>> +    ElfW(Addr) l_addr = (map)->l_addr, base = 0, start;                            \
>>> +    const ElfW(Relr) *r = 0, *end = 0;                                             \
>>> +    if (!(map)->l_info[DT_RELR])                                           \
>>> +      break;                                                               \
>>> +    start = D_PTR((map), l_info[DT_RELR]);                                 \
>>> +    r = (const ElfW(Relr) *)start;                                         \
>>> +    end = (const ElfW(Relr) *)(start + (map)->l_info[DT_RELRSZ]->d_un.d_val); \
>>> +    for (; r < end; ++r) {                                                 \
>>> +      ElfW(Relr) entry = *r;                                               \
>>> +      if ((entry & 1) == 0) {                                                      \
>>> +     *((ElfW(Addr) *)(l_addr + entry)) += l_addr;                          \
>>> +     base = entry + sizeof(ElfW(Addr));                                    \
>>> +     continue;                                                             \
>>> +      }                                                                            \
>>> +      ElfW(Addr) offset = base;                                                    \
>>> +      do {                                                                 \
>>> +     entry >>= 1;                                                          \
>>> +     if ((entry & 1) != 0)                                                 \
>>> +       *((ElfW(Addr) *)(l_addr + offset)) += l_addr;                       \
>>> +     offset += sizeof(ElfW(Addr));                                         \
>>> +      } while (entry != 0);                                                \
>>> +      base += (8 * sizeof(ElfW(Relr)) - 1) * sizeof(ElfW(Addr));           \
>>
>> While in line with the proposed spec additions I'm afraid the uses of
>> ElfW(Addr) here aren't universally correct: You assume that ELF
>> container type (size) expresses an aspect of the ABI. While this is
>> indeed the case for several arch-es, I think this has been a mistake.
>> IA-64, while meanwhile mostly dead, is (was) an example where 64-bit
>> code can validly live in a 32-bit ELF container (at least as far as
>> the psABI is concerned; I have no idea whether glibc actually
>> followed the spec). There's a separate ELF header flag indicating the
>> ABI, and hence the size of a pointer.
> 
> Thanks for chiming in.
> 
> As of ia64 buildability, it works for me:
> 
> scripts/build-many-glibcs.py /tmp/glibc-many compilers ia64-linux-gnu
> mkdir -p out/ia64; cd out/ia64
> ../../configure --prefix=/tmp/glibc/ia64 --host=ia64-linux-gnu
> CC=/tmp/glibc-many/install/compilers/ia64-linux-gnu/bin/ia64-glibc-linux-gnu-gcc
> CXX=/tmp/glibc-many/install/compilers/ia64-linux-gnu/bin/ia64-glibc-linux-gnu-g++
> make -j 50

I didn't suggest the build would fail. What I said is that I don't
think the code is correct there.

> As of the actual functionality, ugh, I cannot find ia64 in my Debian
> testing's qemu-user-static package:( So I cannot test.
> 
> That said, gold and LLD don't support ia64.
> If we have a concern that ia64 may not work, the GNU ld maintainers
> can simply not add ia64 support:)

But you realize that I took ia64 only as example, as that's where
I know ABI (pointer size) and ELF container size aren't connected.
As per my looking at merely EF_MIPS_* in context of reading
Joseph's reply, it might be that MIPS is another such example. But
I lack sufficient knowledge of MIPS ...

Jan
  
H.J. Lu Oct. 12, 2021, 2:09 p.m. UTC | #14
On Tue, Oct 12, 2021 at 1:18 AM Jan Beulich via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>
> On 11.10.2021 20:43, Fāng-ruì Sòng wrote:
> > On Mon, Oct 11, 2021 at 12:48 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 08.10.2021 08:57, Fangrui Song via Binutils wrote:
> >>> --- a/elf/dynamic-link.h
> >>> +++ b/elf/dynamic-link.h
> >>> @@ -192,6 +192,33 @@ elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[],
> >>>  #  define ELF_DYNAMIC_DO_RELA(map, scope, lazy, skip_ifunc) /* Nothing to do.  */
> >>>  # endif
> >>>
> >>> +# define ELF_DYNAMIC_DO_RELR(map)                                          \
> >>> +  do {                                                                             \
> >>> +    ElfW(Addr) l_addr = (map)->l_addr, base = 0, start;                            \
> >>> +    const ElfW(Relr) *r = 0, *end = 0;                                             \
> >>> +    if (!(map)->l_info[DT_RELR])                                           \
> >>> +      break;                                                               \
> >>> +    start = D_PTR((map), l_info[DT_RELR]);                                 \
> >>> +    r = (const ElfW(Relr) *)start;                                         \
> >>> +    end = (const ElfW(Relr) *)(start + (map)->l_info[DT_RELRSZ]->d_un.d_val); \
> >>> +    for (; r < end; ++r) {                                                 \
> >>> +      ElfW(Relr) entry = *r;                                               \
> >>> +      if ((entry & 1) == 0) {                                                      \
> >>> +     *((ElfW(Addr) *)(l_addr + entry)) += l_addr;                          \
> >>> +     base = entry + sizeof(ElfW(Addr));                                    \
> >>> +     continue;                                                             \
> >>> +      }                                                                            \
> >>> +      ElfW(Addr) offset = base;                                                    \
> >>> +      do {                                                                 \
> >>> +     entry >>= 1;                                                          \
> >>> +     if ((entry & 1) != 0)                                                 \
> >>> +       *((ElfW(Addr) *)(l_addr + offset)) += l_addr;                       \
> >>> +     offset += sizeof(ElfW(Addr));                                         \
> >>> +      } while (entry != 0);                                                \
> >>> +      base += (8 * sizeof(ElfW(Relr)) - 1) * sizeof(ElfW(Addr));           \
> >>
> >> While in line with the proposed spec additions I'm afraid the uses of
> >> ElfW(Addr) here aren't universally correct: You assume that ELF
> >> container type (size) expresses an aspect of the ABI. While this is
> >> indeed the case for several arch-es, I think this has been a mistake.
> >> IA-64, while meanwhile mostly dead, is (was) an example where 64-bit
> >> code can validly live in a 32-bit ELF container (at least as far as
> >> the psABI is concerned; I have no idea whether glibc actually
> >> followed the spec). There's a separate ELF header flag indicating the
> >> ABI, and hence the size of a pointer.
> >
> > Thanks for chiming in.
> >
> > As of ia64 buildability, it works for me:
> >
> > scripts/build-many-glibcs.py /tmp/glibc-many compilers ia64-linux-gnu
> > mkdir -p out/ia64; cd out/ia64
> > ../../configure --prefix=/tmp/glibc/ia64 --host=ia64-linux-gnu
> > CC=/tmp/glibc-many/install/compilers/ia64-linux-gnu/bin/ia64-glibc-linux-gnu-gcc
> > CXX=/tmp/glibc-many/install/compilers/ia64-linux-gnu/bin/ia64-glibc-linux-gnu-g++
> > make -j 50
>
> I didn't suggest the build would fail. What I said is that I don't
> think the code is correct there.
>
> > As of the actual functionality, ugh, I cannot find ia64 in my Debian
> > testing's qemu-user-static package:( So I cannot test.
> >
> > That said, gold and LLD don't support ia64.
> > If we have a concern that ia64 may not work, the GNU ld maintainers
> > can simply not add ia64 support:)
>
> But you realize that I took ia64 only as example, as that's where
> I know ABI (pointer size) and ELF container size aren't connected.
> As per my looking at merely EF_MIPS_* in context of reading
> Joseph's reply, it might be that MIPS is another such example. But
> I lack sufficient knowledge of MIPS ...
>

The new code should be tested and verified on all supported
targets.  That is another reason to implement this in binutils
ld first.
  
Fangrui Song Oct. 12, 2021, 4:07 p.m. UTC | #15
On Tue, Oct 12, 2021 at 7:10 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Tue, Oct 12, 2021 at 1:18 AM Jan Beulich via Libc-alpha
> <libc-alpha@sourceware.org> wrote:
> >
> > On 11.10.2021 20:43, Fāng-ruì Sòng wrote:
> > > On Mon, Oct 11, 2021 at 12:48 AM Jan Beulich <jbeulich@suse.com> wrote:
> > >>
> > >> On 08.10.2021 08:57, Fangrui Song via Binutils wrote:
> > >>> --- a/elf/dynamic-link.h
> > >>> +++ b/elf/dynamic-link.h
> > >>> @@ -192,6 +192,33 @@ elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[],
> > >>>  #  define ELF_DYNAMIC_DO_RELA(map, scope, lazy, skip_ifunc) /* Nothing to do.  */
> > >>>  # endif
> > >>>
> > >>> +# define ELF_DYNAMIC_DO_RELR(map)                                          \
> > >>> +  do {                                                                             \
> > >>> +    ElfW(Addr) l_addr = (map)->l_addr, base = 0, start;                            \
> > >>> +    const ElfW(Relr) *r = 0, *end = 0;                                             \
> > >>> +    if (!(map)->l_info[DT_RELR])                                           \
> > >>> +      break;                                                               \
> > >>> +    start = D_PTR((map), l_info[DT_RELR]);                                 \
> > >>> +    r = (const ElfW(Relr) *)start;                                         \
> > >>> +    end = (const ElfW(Relr) *)(start + (map)->l_info[DT_RELRSZ]->d_un.d_val); \
> > >>> +    for (; r < end; ++r) {                                                 \
> > >>> +      ElfW(Relr) entry = *r;                                               \
> > >>> +      if ((entry & 1) == 0) {                                                      \
> > >>> +     *((ElfW(Addr) *)(l_addr + entry)) += l_addr;                          \
> > >>> +     base = entry + sizeof(ElfW(Addr));                                    \
> > >>> +     continue;                                                             \
> > >>> +      }                                                                            \
> > >>> +      ElfW(Addr) offset = base;                                                    \
> > >>> +      do {                                                                 \
> > >>> +     entry >>= 1;                                                          \
> > >>> +     if ((entry & 1) != 0)                                                 \
> > >>> +       *((ElfW(Addr) *)(l_addr + offset)) += l_addr;                       \
> > >>> +     offset += sizeof(ElfW(Addr));                                         \
> > >>> +      } while (entry != 0);                                                \
> > >>> +      base += (8 * sizeof(ElfW(Relr)) - 1) * sizeof(ElfW(Addr));           \
> > >>
> > >> While in line with the proposed spec additions I'm afraid the uses of
> > >> ElfW(Addr) here aren't universally correct: You assume that ELF
> > >> container type (size) expresses an aspect of the ABI. While this is
> > >> indeed the case for several arch-es, I think this has been a mistake.
> > >> IA-64, while meanwhile mostly dead, is (was) an example where 64-bit
> > >> code can validly live in a 32-bit ELF container (at least as far as
> > >> the psABI is concerned; I have no idea whether glibc actually
> > >> followed the spec). There's a separate ELF header flag indicating the
> > >> ABI, and hence the size of a pointer.
> > >
> > > Thanks for chiming in.
> > >
> > > As of ia64 buildability, it works for me:
> > >
> > > scripts/build-many-glibcs.py /tmp/glibc-many compilers ia64-linux-gnu
> > > mkdir -p out/ia64; cd out/ia64
> > > ../../configure --prefix=/tmp/glibc/ia64 --host=ia64-linux-gnu
> > > CC=/tmp/glibc-many/install/compilers/ia64-linux-gnu/bin/ia64-glibc-linux-gnu-gcc
> > > CXX=/tmp/glibc-many/install/compilers/ia64-linux-gnu/bin/ia64-glibc-linux-gnu-g++
> > > make -j 50
> >
> > I didn't suggest the build would fail. What I said is that I don't
> > think the code is correct there.
> >
> > > As of the actual functionality, ugh, I cannot find ia64 in my Debian
> > > testing's qemu-user-static package:( So I cannot test.
> > >
> > > That said, gold and LLD don't support ia64.
> > > If we have a concern that ia64 may not work, the GNU ld maintainers
> > > can simply not add ia64 support:)
> >
> > But you realize that I took ia64 only as example, as that's where
> > I know ABI (pointer size) and ELF container size aren't connected.
> > As per my looking at merely EF_MIPS_* in context of reading
> > Joseph's reply, it might be that MIPS is another such example. But
> > I lack sufficient knowledge of MIPS ...
> >
>
> The new code should be tested and verified on all supported
> targets.  That is another reason to implement this in binutils
> ld first.

--pack-dyn-relocs=relr is well tested on arm, aarch64, and x86, and
works on popular arches like ppc64 as well.
For mips, it is no harm to keep the DT_RELR code path. Its
elf_machine_rel_relative is empty and it has no relative relocation
anyway.
I wish that our reasonable design decisions are not restricted by the
few retrocomputing architectures, especially when the concern is still
at the theoretical stage.
Well, if DT_RELR has issue, don't use it; nobody forces them to use DT_RELR.
After all, --pack-dyn-relocs=relr is opt in.

On the other side, DT_RELR is a really great addition to save code
size for non-exotic architectures. Distributions will benefit a lot.
LLD users can immediately benefit.
gold users can considerably leverage the feature relatively easily
since a patch is available.
  
Jan Beulich Oct. 13, 2021, 6 a.m. UTC | #16
On 12.10.2021 18:07, Fāng-ruì Sòng wrote:
> On Tue, Oct 12, 2021 at 7:10 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>>
>> On Tue, Oct 12, 2021 at 1:18 AM Jan Beulich via Libc-alpha
>> <libc-alpha@sourceware.org> wrote:
>>>
>>> On 11.10.2021 20:43, Fāng-ruì Sòng wrote:
>>>> On Mon, Oct 11, 2021 at 12:48 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>>>
>>>>> On 08.10.2021 08:57, Fangrui Song via Binutils wrote:
>>>>>> --- a/elf/dynamic-link.h
>>>>>> +++ b/elf/dynamic-link.h
>>>>>> @@ -192,6 +192,33 @@ elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[],
>>>>>>  #  define ELF_DYNAMIC_DO_RELA(map, scope, lazy, skip_ifunc) /* Nothing to do.  */
>>>>>>  # endif
>>>>>>
>>>>>> +# define ELF_DYNAMIC_DO_RELR(map)                                          \
>>>>>> +  do {                                                                             \
>>>>>> +    ElfW(Addr) l_addr = (map)->l_addr, base = 0, start;                            \
>>>>>> +    const ElfW(Relr) *r = 0, *end = 0;                                             \
>>>>>> +    if (!(map)->l_info[DT_RELR])                                           \
>>>>>> +      break;                                                               \
>>>>>> +    start = D_PTR((map), l_info[DT_RELR]);                                 \
>>>>>> +    r = (const ElfW(Relr) *)start;                                         \
>>>>>> +    end = (const ElfW(Relr) *)(start + (map)->l_info[DT_RELRSZ]->d_un.d_val); \
>>>>>> +    for (; r < end; ++r) {                                                 \
>>>>>> +      ElfW(Relr) entry = *r;                                               \
>>>>>> +      if ((entry & 1) == 0) {                                                      \
>>>>>> +     *((ElfW(Addr) *)(l_addr + entry)) += l_addr;                          \
>>>>>> +     base = entry + sizeof(ElfW(Addr));                                    \
>>>>>> +     continue;                                                             \
>>>>>> +      }                                                                            \
>>>>>> +      ElfW(Addr) offset = base;                                                    \
>>>>>> +      do {                                                                 \
>>>>>> +     entry >>= 1;                                                          \
>>>>>> +     if ((entry & 1) != 0)                                                 \
>>>>>> +       *((ElfW(Addr) *)(l_addr + offset)) += l_addr;                       \
>>>>>> +     offset += sizeof(ElfW(Addr));                                         \
>>>>>> +      } while (entry != 0);                                                \
>>>>>> +      base += (8 * sizeof(ElfW(Relr)) - 1) * sizeof(ElfW(Addr));           \
>>>>>
>>>>> While in line with the proposed spec additions I'm afraid the uses of
>>>>> ElfW(Addr) here aren't universally correct: You assume that ELF
>>>>> container type (size) expresses an aspect of the ABI. While this is
>>>>> indeed the case for several arch-es, I think this has been a mistake.
>>>>> IA-64, while meanwhile mostly dead, is (was) an example where 64-bit
>>>>> code can validly live in a 32-bit ELF container (at least as far as
>>>>> the psABI is concerned; I have no idea whether glibc actually
>>>>> followed the spec). There's a separate ELF header flag indicating the
>>>>> ABI, and hence the size of a pointer.
>>>>
>>>> Thanks for chiming in.
>>>>
>>>> As of ia64 buildability, it works for me:
>>>>
>>>> scripts/build-many-glibcs.py /tmp/glibc-many compilers ia64-linux-gnu
>>>> mkdir -p out/ia64; cd out/ia64
>>>> ../../configure --prefix=/tmp/glibc/ia64 --host=ia64-linux-gnu
>>>> CC=/tmp/glibc-many/install/compilers/ia64-linux-gnu/bin/ia64-glibc-linux-gnu-gcc
>>>> CXX=/tmp/glibc-many/install/compilers/ia64-linux-gnu/bin/ia64-glibc-linux-gnu-g++
>>>> make -j 50
>>>
>>> I didn't suggest the build would fail. What I said is that I don't
>>> think the code is correct there.
>>>
>>>> As of the actual functionality, ugh, I cannot find ia64 in my Debian
>>>> testing's qemu-user-static package:( So I cannot test.
>>>>
>>>> That said, gold and LLD don't support ia64.
>>>> If we have a concern that ia64 may not work, the GNU ld maintainers
>>>> can simply not add ia64 support:)
>>>
>>> But you realize that I took ia64 only as example, as that's where
>>> I know ABI (pointer size) and ELF container size aren't connected.
>>> As per my looking at merely EF_MIPS_* in context of reading
>>> Joseph's reply, it might be that MIPS is another such example. But
>>> I lack sufficient knowledge of MIPS ...
>>>
>>
>> The new code should be tested and verified on all supported
>> targets.  That is another reason to implement this in binutils
>> ld first.
> 
> --pack-dyn-relocs=relr is well tested on arm, aarch64, and x86, and
> works on popular arches like ppc64 as well.
> For mips, it is no harm to keep the DT_RELR code path. Its
> elf_machine_rel_relative is empty and it has no relative relocation
> anyway.
> I wish that our reasonable design decisions are not restricted by the
> few retrocomputing architectures, especially when the concern is still
> at the theoretical stage.

For ia64 it's not theoretical at all, as long as you leave aside the
fact the deprecation state of that architecture. I also have to admit
that I have trouble seeing why the design can't be adjusted to fit
original ELF intentions rather than (as said, imo bad) decisions
taken by a few (popular) architectures. Besides adjusting the wording
accordingly, all it takes for your implementation is to parameterize
word (pointer) size.

Jan
  
Fangrui Song Oct. 13, 2021, 6:13 a.m. UTC | #17
On Tue, Oct 12, 2021 at 11:00 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 12.10.2021 18:07, Fāng-ruì Sòng wrote:
> > On Tue, Oct 12, 2021 at 7:10 AM H.J. Lu <hjl.tools@gmail.com> wrote:
> >>
> >> On Tue, Oct 12, 2021 at 1:18 AM Jan Beulich via Libc-alpha
> >> <libc-alpha@sourceware.org> wrote:
> >>>
> >>> On 11.10.2021 20:43, Fāng-ruì Sòng wrote:
> >>>> On Mon, Oct 11, 2021 at 12:48 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>>
> >>>>> On 08.10.2021 08:57, Fangrui Song via Binutils wrote:
> >>>>>> --- a/elf/dynamic-link.h
> >>>>>> +++ b/elf/dynamic-link.h
> >>>>>> @@ -192,6 +192,33 @@ elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[],
> >>>>>>  #  define ELF_DYNAMIC_DO_RELA(map, scope, lazy, skip_ifunc) /* Nothing to do.  */
> >>>>>>  # endif
> >>>>>>
> >>>>>> +# define ELF_DYNAMIC_DO_RELR(map)                                          \
> >>>>>> +  do {                                                                             \
> >>>>>> +    ElfW(Addr) l_addr = (map)->l_addr, base = 0, start;                            \
> >>>>>> +    const ElfW(Relr) *r = 0, *end = 0;                                             \
> >>>>>> +    if (!(map)->l_info[DT_RELR])                                           \
> >>>>>> +      break;                                                               \
> >>>>>> +    start = D_PTR((map), l_info[DT_RELR]);                                 \
> >>>>>> +    r = (const ElfW(Relr) *)start;                                         \
> >>>>>> +    end = (const ElfW(Relr) *)(start + (map)->l_info[DT_RELRSZ]->d_un.d_val); \
> >>>>>> +    for (; r < end; ++r) {                                                 \
> >>>>>> +      ElfW(Relr) entry = *r;                                               \
> >>>>>> +      if ((entry & 1) == 0) {                                                      \
> >>>>>> +     *((ElfW(Addr) *)(l_addr + entry)) += l_addr;                          \
> >>>>>> +     base = entry + sizeof(ElfW(Addr));                                    \
> >>>>>> +     continue;                                                             \
> >>>>>> +      }                                                                            \
> >>>>>> +      ElfW(Addr) offset = base;                                                    \
> >>>>>> +      do {                                                                 \
> >>>>>> +     entry >>= 1;                                                          \
> >>>>>> +     if ((entry & 1) != 0)                                                 \
> >>>>>> +       *((ElfW(Addr) *)(l_addr + offset)) += l_addr;                       \
> >>>>>> +     offset += sizeof(ElfW(Addr));                                         \
> >>>>>> +      } while (entry != 0);                                                \
> >>>>>> +      base += (8 * sizeof(ElfW(Relr)) - 1) * sizeof(ElfW(Addr));           \
> >>>>>
> >>>>> While in line with the proposed spec additions I'm afraid the uses of
> >>>>> ElfW(Addr) here aren't universally correct: You assume that ELF
> >>>>> container type (size) expresses an aspect of the ABI. While this is
> >>>>> indeed the case for several arch-es, I think this has been a mistake.
> >>>>> IA-64, while meanwhile mostly dead, is (was) an example where 64-bit
> >>>>> code can validly live in a 32-bit ELF container (at least as far as
> >>>>> the psABI is concerned; I have no idea whether glibc actually
> >>>>> followed the spec). There's a separate ELF header flag indicating the
> >>>>> ABI, and hence the size of a pointer.
> >>>>
> >>>> Thanks for chiming in.
> >>>>
> >>>> As of ia64 buildability, it works for me:
> >>>>
> >>>> scripts/build-many-glibcs.py /tmp/glibc-many compilers ia64-linux-gnu
> >>>> mkdir -p out/ia64; cd out/ia64
> >>>> ../../configure --prefix=/tmp/glibc/ia64 --host=ia64-linux-gnu
> >>>> CC=/tmp/glibc-many/install/compilers/ia64-linux-gnu/bin/ia64-glibc-linux-gnu-gcc
> >>>> CXX=/tmp/glibc-many/install/compilers/ia64-linux-gnu/bin/ia64-glibc-linux-gnu-g++
> >>>> make -j 50
> >>>
> >>> I didn't suggest the build would fail. What I said is that I don't
> >>> think the code is correct there.
> >>>
> >>>> As of the actual functionality, ugh, I cannot find ia64 in my Debian
> >>>> testing's qemu-user-static package:( So I cannot test.
> >>>>
> >>>> That said, gold and LLD don't support ia64.
> >>>> If we have a concern that ia64 may not work, the GNU ld maintainers
> >>>> can simply not add ia64 support:)
> >>>
> >>> But you realize that I took ia64 only as example, as that's where
> >>> I know ABI (pointer size) and ELF container size aren't connected.
> >>> As per my looking at merely EF_MIPS_* in context of reading
> >>> Joseph's reply, it might be that MIPS is another such example. But
> >>> I lack sufficient knowledge of MIPS ...
> >>>
> >>
> >> The new code should be tested and verified on all supported
> >> targets.  That is another reason to implement this in binutils
> >> ld first.
> >
> > --pack-dyn-relocs=relr is well tested on arm, aarch64, and x86, and
> > works on popular arches like ppc64 as well.
> > For mips, it is no harm to keep the DT_RELR code path. Its
> > elf_machine_rel_relative is empty and it has no relative relocation
> > anyway.
> > I wish that our reasonable design decisions are not restricted by the
> > few retrocomputing architectures, especially when the concern is still
> > at the theoretical stage.
>
> For ia64 it's not theoretical at all, as long as you leave aside the
> fact the deprecation state of that architecture. I also have to admit
> that I have trouble seeing why the design can't be adjusted to fit
> original ELF intentions rather than (as said, imo bad) decisions
> taken by a few (popular) architectures. Besides adjusting the wording
> accordingly, all it takes for your implementation is to parameterize
> word (pointer) size.
>
> Jan
>

I re-read your first reply. Did you mean 64-bit small code model code
in an ELFCLASS32 ET_EXEC/ET_DYN file?
OK. I considered it as a size optimization for a different task.
"AArch64 and x86-64 define ILP32 ABIs and use ELFCLASS32, but
technically they can use ELFCLASS32 for small code model with regular
ABIs, if the kernel allows." (from
https://maskray.me/blog/2021-01-31-metadata-sections-comdat-and-shf-link-order)

I assume that switching *((ElfW(Addr) *)(l_addr + entry)) += l_addr;
to `unsigned long` makes this use case work, though I doubt glibc's
ia64 implementation supports this as sysdeps/ia64/dl-machine.h uses
Elf64.
  
Fangrui Song Oct. 13, 2021, 6:18 a.m. UTC | #18
On Tue, Oct 12, 2021 at 11:13 PM Fāng-ruì Sòng <maskray@google.com> wrote:
>
> On Tue, Oct 12, 2021 at 11:00 PM Jan Beulich <jbeulich@suse.com> wrote:
> >
> > On 12.10.2021 18:07, Fāng-ruì Sòng wrote:
> > > On Tue, Oct 12, 2021 at 7:10 AM H.J. Lu <hjl.tools@gmail.com> wrote:
> > >>
> > >> On Tue, Oct 12, 2021 at 1:18 AM Jan Beulich via Libc-alpha
> > >> <libc-alpha@sourceware.org> wrote:
> > >>>
> > >>> On 11.10.2021 20:43, Fāng-ruì Sòng wrote:
> > >>>> On Mon, Oct 11, 2021 at 12:48 AM Jan Beulich <jbeulich@suse.com> wrote:
> > >>>>>
> > >>>>> On 08.10.2021 08:57, Fangrui Song via Binutils wrote:
> > >>>>>> --- a/elf/dynamic-link.h
> > >>>>>> +++ b/elf/dynamic-link.h
> > >>>>>> @@ -192,6 +192,33 @@ elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[],
> > >>>>>>  #  define ELF_DYNAMIC_DO_RELA(map, scope, lazy, skip_ifunc) /* Nothing to do.  */
> > >>>>>>  # endif
> > >>>>>>
> > >>>>>> +# define ELF_DYNAMIC_DO_RELR(map)                                          \
> > >>>>>> +  do {                                                                             \
> > >>>>>> +    ElfW(Addr) l_addr = (map)->l_addr, base = 0, start;                            \
> > >>>>>> +    const ElfW(Relr) *r = 0, *end = 0;                                             \
> > >>>>>> +    if (!(map)->l_info[DT_RELR])                                           \
> > >>>>>> +      break;                                                               \
> > >>>>>> +    start = D_PTR((map), l_info[DT_RELR]);                                 \
> > >>>>>> +    r = (const ElfW(Relr) *)start;                                         \
> > >>>>>> +    end = (const ElfW(Relr) *)(start + (map)->l_info[DT_RELRSZ]->d_un.d_val); \
> > >>>>>> +    for (; r < end; ++r) {                                                 \
> > >>>>>> +      ElfW(Relr) entry = *r;                                               \
> > >>>>>> +      if ((entry & 1) == 0) {                                                      \
> > >>>>>> +     *((ElfW(Addr) *)(l_addr + entry)) += l_addr;                          \
> > >>>>>> +     base = entry + sizeof(ElfW(Addr));                                    \
> > >>>>>> +     continue;                                                             \
> > >>>>>> +      }                                                                            \
> > >>>>>> +      ElfW(Addr) offset = base;                                                    \
> > >>>>>> +      do {                                                                 \
> > >>>>>> +     entry >>= 1;                                                          \
> > >>>>>> +     if ((entry & 1) != 0)                                                 \
> > >>>>>> +       *((ElfW(Addr) *)(l_addr + offset)) += l_addr;                       \
> > >>>>>> +     offset += sizeof(ElfW(Addr));                                         \
> > >>>>>> +      } while (entry != 0);                                                \
> > >>>>>> +      base += (8 * sizeof(ElfW(Relr)) - 1) * sizeof(ElfW(Addr));           \
> > >>>>>
> > >>>>> While in line with the proposed spec additions I'm afraid the uses of
> > >>>>> ElfW(Addr) here aren't universally correct: You assume that ELF
> > >>>>> container type (size) expresses an aspect of the ABI. While this is
> > >>>>> indeed the case for several arch-es, I think this has been a mistake.
> > >>>>> IA-64, while meanwhile mostly dead, is (was) an example where 64-bit
> > >>>>> code can validly live in a 32-bit ELF container (at least as far as
> > >>>>> the psABI is concerned; I have no idea whether glibc actually
> > >>>>> followed the spec). There's a separate ELF header flag indicating the
> > >>>>> ABI, and hence the size of a pointer.
> > >>>>
> > >>>> Thanks for chiming in.
> > >>>>
> > >>>> As of ia64 buildability, it works for me:
> > >>>>
> > >>>> scripts/build-many-glibcs.py /tmp/glibc-many compilers ia64-linux-gnu
> > >>>> mkdir -p out/ia64; cd out/ia64
> > >>>> ../../configure --prefix=/tmp/glibc/ia64 --host=ia64-linux-gnu
> > >>>> CC=/tmp/glibc-many/install/compilers/ia64-linux-gnu/bin/ia64-glibc-linux-gnu-gcc
> > >>>> CXX=/tmp/glibc-many/install/compilers/ia64-linux-gnu/bin/ia64-glibc-linux-gnu-g++
> > >>>> make -j 50
> > >>>
> > >>> I didn't suggest the build would fail. What I said is that I don't
> > >>> think the code is correct there.
> > >>>
> > >>>> As of the actual functionality, ugh, I cannot find ia64 in my Debian
> > >>>> testing's qemu-user-static package:( So I cannot test.
> > >>>>
> > >>>> That said, gold and LLD don't support ia64.
> > >>>> If we have a concern that ia64 may not work, the GNU ld maintainers
> > >>>> can simply not add ia64 support:)
> > >>>
> > >>> But you realize that I took ia64 only as example, as that's where
> > >>> I know ABI (pointer size) and ELF container size aren't connected.
> > >>> As per my looking at merely EF_MIPS_* in context of reading
> > >>> Joseph's reply, it might be that MIPS is another such example. But
> > >>> I lack sufficient knowledge of MIPS ...
> > >>>
> > >>
> > >> The new code should be tested and verified on all supported
> > >> targets.  That is another reason to implement this in binutils
> > >> ld first.
> > >
> > > --pack-dyn-relocs=relr is well tested on arm, aarch64, and x86, and
> > > works on popular arches like ppc64 as well.
> > > For mips, it is no harm to keep the DT_RELR code path. Its
> > > elf_machine_rel_relative is empty and it has no relative relocation
> > > anyway.
> > > I wish that our reasonable design decisions are not restricted by the
> > > few retrocomputing architectures, especially when the concern is still
> > > at the theoretical stage.
> >
> > For ia64 it's not theoretical at all, as long as you leave aside the
> > fact the deprecation state of that architecture. I also have to admit
> > that I have trouble seeing why the design can't be adjusted to fit
> > original ELF intentions rather than (as said, imo bad) decisions
> > taken by a few (popular) architectures. Besides adjusting the wording
> > accordingly, all it takes for your implementation is to parameterize
> > word (pointer) size.
> >
> > Jan
> >
>
> I re-read your first reply. Did you mean 64-bit small code model code
> in an ELFCLASS32 ET_EXEC/ET_DYN file?
> OK. I considered it as a size optimization for a different task.
> "AArch64 and x86-64 define ILP32 ABIs and use ELFCLASS32, but
> technically they can use ELFCLASS32 for small code model with regular
> ABIs, if the kernel allows." (from
> https://maskray.me/blog/2021-01-31-metadata-sections-comdat-and-shf-link-order)
>
> I assume that switching *((ElfW(Addr) *)(l_addr + entry)) += l_addr;
> to `unsigned long` makes this use case work, though I doubt glibc's
> ia64 implementation supports this as sysdeps/ia64/dl-machine.h uses
> Elf64.

It won't. sysdeps/ia64/dl-machine.h:elf_machine_rela_relative writes
to an Elf64_Addr.
l_addr and many other representations assume that ElfW can not be an
Elf32 container for a 64-bit architecture.
  
Fangrui Song Oct. 16, 2021, 8:22 p.m. UTC | #19
On 2021-10-11, Jan Beulich wrote:
>On 08.10.2021 08:57, Fangrui Song via Binutils wrote:
>> --- /dev/null
>> +++ b/elf/tst-relr.c
>> @@ -0,0 +1,20 @@
>> +static int o, x;
>> +void *arr[] = {
>> +  &o, &o, &o, &o, &o, &o, &o, &o, &o, &o, &o, &o, &o, &o, &o, &o,
>> +  0,
>> +  &x, &x, &x, &x, &x, &x, &x, &x, &x, &x, &x, &x, &x, &x, &x, &x,
>> +};
>
>Personally I consider this overly simplistic a testcase. The two
>sequences are fully identical, have no gaps except for the one in
>the middle, and there's also no interleaving / mixing of pointers.
>None of this should matter as the specific symbol the relocation
>is for is not supposed to be of interest, but in a testcase you
>want to make sure none of this has an effect.
>
>I also wonder whether it is a good idea to have a testcase with
>"fundamentally" different behavior on 32-bit vs 64-bit: The former
>will require two RELR entries afaict, while the latter will get
>away with one here.
>
>Jan

In
https://sourceware.org/git/?p=glibc.git;a=shortlog;h=refs/heads/maskray/relr

I enhanced the test to this form:

#define ELEMS O O O O O O O O X X X X X X X O O X O O X X X E X E E O X O E
#define E 0,

#define O &o,
#define X &x,
void *arr[] = { ELEMS };
#undef O
#undef X

#define O 1,
#define X 2,
static char val[] = { ELEMS };

static int
do_test (void)
{
   for (int i = 0; i < sizeof (arr) / sizeof (arr[0]); i++)
     if (!((arr[i] == 0 && val[i] == 0) ||
           (arr[i] == &o && val[i] == 1) ||
           (arr[i] == &x && val[i] == 2)))
       return 1;
   return 0;
}

>> +static int
>> +do_test (void)
>> +{
>> +  for (int i = 0; i < 16; i++)
>> +    if (arr[i] != &o)
>> +      return 1;
>> +  for (int i = 17; i < 33; i++)
>> +    if (arr[i] != &x)
>> +      return 1;
>> +  return 0;
>> +}
>> +
>> +#include <support/test-driver.c>
>>
>
  
Jan Beulich Oct. 18, 2021, 7:59 a.m. UTC | #20
On 13.10.2021 08:18, Fāng-ruì Sòng wrote:
> On Tue, Oct 12, 2021 at 11:13 PM Fāng-ruì Sòng <maskray@google.com> wrote:
>>
>> On Tue, Oct 12, 2021 at 11:00 PM Jan Beulich <jbeulich@suse.com> wrote:
>>>
>>> On 12.10.2021 18:07, Fāng-ruì Sòng wrote:
>>>> On Tue, Oct 12, 2021 at 7:10 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>>>>>
>>>>> On Tue, Oct 12, 2021 at 1:18 AM Jan Beulich via Libc-alpha
>>>>> <libc-alpha@sourceware.org> wrote:
>>>>>>
>>>>>> On 11.10.2021 20:43, Fāng-ruì Sòng wrote:
>>>>>>> On Mon, Oct 11, 2021 at 12:48 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>>>>>>
>>>>>>>> On 08.10.2021 08:57, Fangrui Song via Binutils wrote:
>>>>>>>>> --- a/elf/dynamic-link.h
>>>>>>>>> +++ b/elf/dynamic-link.h
>>>>>>>>> @@ -192,6 +192,33 @@ elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[],
>>>>>>>>>  #  define ELF_DYNAMIC_DO_RELA(map, scope, lazy, skip_ifunc) /* Nothing to do.  */
>>>>>>>>>  # endif
>>>>>>>>>
>>>>>>>>> +# define ELF_DYNAMIC_DO_RELR(map)                                          \
>>>>>>>>> +  do {                                                                             \
>>>>>>>>> +    ElfW(Addr) l_addr = (map)->l_addr, base = 0, start;                            \
>>>>>>>>> +    const ElfW(Relr) *r = 0, *end = 0;                                             \
>>>>>>>>> +    if (!(map)->l_info[DT_RELR])                                           \
>>>>>>>>> +      break;                                                               \
>>>>>>>>> +    start = D_PTR((map), l_info[DT_RELR]);                                 \
>>>>>>>>> +    r = (const ElfW(Relr) *)start;                                         \
>>>>>>>>> +    end = (const ElfW(Relr) *)(start + (map)->l_info[DT_RELRSZ]->d_un.d_val); \
>>>>>>>>> +    for (; r < end; ++r) {                                                 \
>>>>>>>>> +      ElfW(Relr) entry = *r;                                               \
>>>>>>>>> +      if ((entry & 1) == 0) {                                                      \
>>>>>>>>> +     *((ElfW(Addr) *)(l_addr + entry)) += l_addr;                          \
>>>>>>>>> +     base = entry + sizeof(ElfW(Addr));                                    \
>>>>>>>>> +     continue;                                                             \
>>>>>>>>> +      }                                                                            \
>>>>>>>>> +      ElfW(Addr) offset = base;                                                    \
>>>>>>>>> +      do {                                                                 \
>>>>>>>>> +     entry >>= 1;                                                          \
>>>>>>>>> +     if ((entry & 1) != 0)                                                 \
>>>>>>>>> +       *((ElfW(Addr) *)(l_addr + offset)) += l_addr;                       \
>>>>>>>>> +     offset += sizeof(ElfW(Addr));                                         \
>>>>>>>>> +      } while (entry != 0);                                                \
>>>>>>>>> +      base += (8 * sizeof(ElfW(Relr)) - 1) * sizeof(ElfW(Addr));           \
>>>>>>>>
>>>>>>>> While in line with the proposed spec additions I'm afraid the uses of
>>>>>>>> ElfW(Addr) here aren't universally correct: You assume that ELF
>>>>>>>> container type (size) expresses an aspect of the ABI. While this is
>>>>>>>> indeed the case for several arch-es, I think this has been a mistake.
>>>>>>>> IA-64, while meanwhile mostly dead, is (was) an example where 64-bit
>>>>>>>> code can validly live in a 32-bit ELF container (at least as far as
>>>>>>>> the psABI is concerned; I have no idea whether glibc actually
>>>>>>>> followed the spec). There's a separate ELF header flag indicating the
>>>>>>>> ABI, and hence the size of a pointer.
>>>>>>>
>>>>>>> Thanks for chiming in.
>>>>>>>
>>>>>>> As of ia64 buildability, it works for me:
>>>>>>>
>>>>>>> scripts/build-many-glibcs.py /tmp/glibc-many compilers ia64-linux-gnu
>>>>>>> mkdir -p out/ia64; cd out/ia64
>>>>>>> ../../configure --prefix=/tmp/glibc/ia64 --host=ia64-linux-gnu
>>>>>>> CC=/tmp/glibc-many/install/compilers/ia64-linux-gnu/bin/ia64-glibc-linux-gnu-gcc
>>>>>>> CXX=/tmp/glibc-many/install/compilers/ia64-linux-gnu/bin/ia64-glibc-linux-gnu-g++
>>>>>>> make -j 50
>>>>>>
>>>>>> I didn't suggest the build would fail. What I said is that I don't
>>>>>> think the code is correct there.
>>>>>>
>>>>>>> As of the actual functionality, ugh, I cannot find ia64 in my Debian
>>>>>>> testing's qemu-user-static package:( So I cannot test.
>>>>>>>
>>>>>>> That said, gold and LLD don't support ia64.
>>>>>>> If we have a concern that ia64 may not work, the GNU ld maintainers
>>>>>>> can simply not add ia64 support:)
>>>>>>
>>>>>> But you realize that I took ia64 only as example, as that's where
>>>>>> I know ABI (pointer size) and ELF container size aren't connected.
>>>>>> As per my looking at merely EF_MIPS_* in context of reading
>>>>>> Joseph's reply, it might be that MIPS is another such example. But
>>>>>> I lack sufficient knowledge of MIPS ...
>>>>>>
>>>>>
>>>>> The new code should be tested and verified on all supported
>>>>> targets.  That is another reason to implement this in binutils
>>>>> ld first.
>>>>
>>>> --pack-dyn-relocs=relr is well tested on arm, aarch64, and x86, and
>>>> works on popular arches like ppc64 as well.
>>>> For mips, it is no harm to keep the DT_RELR code path. Its
>>>> elf_machine_rel_relative is empty and it has no relative relocation
>>>> anyway.
>>>> I wish that our reasonable design decisions are not restricted by the
>>>> few retrocomputing architectures, especially when the concern is still
>>>> at the theoretical stage.
>>>
>>> For ia64 it's not theoretical at all, as long as you leave aside the
>>> fact the deprecation state of that architecture. I also have to admit
>>> that I have trouble seeing why the design can't be adjusted to fit
>>> original ELF intentions rather than (as said, imo bad) decisions
>>> taken by a few (popular) architectures. Besides adjusting the wording
>>> accordingly, all it takes for your implementation is to parameterize
>>> word (pointer) size.
>>
>> I re-read your first reply. Did you mean 64-bit small code model code
>> in an ELFCLASS32 ET_EXEC/ET_DYN file?
>> OK. I considered it as a size optimization for a different task.
>> "AArch64 and x86-64 define ILP32 ABIs and use ELFCLASS32, but
>> technically they can use ELFCLASS32 for small code model with regular
>> ABIs, if the kernel allows." (from
>> https://maskray.me/blog/2021-01-31-metadata-sections-comdat-and-shf-link-order)
>>
>> I assume that switching *((ElfW(Addr) *)(l_addr + entry)) += l_addr;
>> to `unsigned long` makes this use case work, though I doubt glibc's
>> ia64 implementation supports this as sysdeps/ia64/dl-machine.h uses
>> Elf64.
> 
> It won't. sysdeps/ia64/dl-machine.h:elf_machine_rela_relative writes
> to an Elf64_Addr.
> l_addr and many other representations assume that ElfW can not be an
> Elf32 container for a 64-bit architecture.

Okay, so maybe the code you have is fine within the constraints of glibc.
But I'd still hope for the text addition to the standard to have removed
the ABI implications from ELF container size, before it actually gets
merged.

Jan
  
Cary Coutant Oct. 26, 2021, 11:28 p.m. UTC | #21
> While in line with the proposed spec additions I'm afraid the uses of
> ElfW(Addr) here aren't universally correct: You assume that ELF
> container type (size) expresses an aspect of the ABI. While this is
> indeed the case for several arch-es, I think this has been a mistake.
> IA-64, while meanwhile mostly dead, is (was) an example where 64-bit
> code can validly live in a 32-bit ELF container (at least as far as
> the psABI is concerned; I have no idea whether glibc actually
> followed the spec). There's a separate ELF header flag indicating the
> ABI, and hence the size of a pointer.

The IA-64 ABI specifically requires the class of the container
(ELFCLASS32 vs ELFCLASS64) to match the runtime model for executable
objects. It's allowable to use ELFCLASS32 for 64-bit relocatable
objects, but not for executable objects. Thus, I think it's quite
reasonable to spec it this way, and it really isn't any different from
a non-RELR relocation. For executables, there really are strong ties
between the ELF container size and the runtime model.

One point that I don't think has been raised yet is that some
platforms, like IA-64, may have more than one RELATIVE relocation.
IA-64 has four -- R_IA64_REL{32|64}{MSB|LSB} -- only one of which
would be eligible (or expected) in a RELR context. I'm thinking that a
DT_RELRTYPE entry specifying the relocation type might be called for,
just to make it explicit. I suppose it could be optional for platforms
where there is only one choice of R_XXX_RELATIVE.

-cary
  

Patch

diff --git a/aclocal.m4 b/aclocal.m4
index c195c4db56..65a12df047 100644
--- a/aclocal.m4
+++ b/aclocal.m4
@@ -224,20 +224,17 @@  AC_DEFUN([LIBC_LINKER_FEATURE],
 [AC_MSG_CHECKING([for linker that supports $1])
 libc_linker_feature=no
 if test x"$gnu_ld" = x"yes"; then
-  libc_linker_check=`$LD -v --help 2>/dev/null | grep "\$1"`
-  if test -n "$libc_linker_check"; then
-    cat > conftest.c <<EOF
+  cat > conftest.c <<EOF
 int _start (void) { return 42; }
 EOF
-    if AC_TRY_COMMAND([${CC-cc} $CFLAGS $CPPFLAGS $LDFLAGS $no_ssp
-				$2 -nostdlib -nostartfiles
-				-fPIC -shared -o conftest.so conftest.c
-				1>&AS_MESSAGE_LOG_FD])
-    then
-      libc_linker_feature=yes
-    fi
-    rm -f conftest*
+  if AC_TRY_COMMAND([${CC-cc} $CFLAGS $CPPFLAGS $LDFLAGS $no_ssp
+		    $2 -nostdlib -nostartfiles
+		    -fPIC -shared -o conftest.so conftest.c
+		    1>&AS_MESSAGE_LOG_FD])
+  then
+    libc_linker_feature=yes
   fi
+  rm -f conftest*
 fi
 if test $libc_linker_feature = yes; then
   $3
diff --git a/configure b/configure
index 39d75eb4ed..fdab6a97ef 100755
--- a/configure
+++ b/configure
@@ -5979,25 +5979,22 @@  fi
 $as_echo_n "checking for linker that supports -z execstack... " >&6; }
 libc_linker_feature=no
 if test x"$gnu_ld" = x"yes"; then
-  libc_linker_check=`$LD -v --help 2>/dev/null | grep "\-z execstack"`
-  if test -n "$libc_linker_check"; then
-    cat > conftest.c <<EOF
+  cat > conftest.c <<EOF
 int _start (void) { return 42; }
 EOF
-    if { ac_try='${CC-cc} $CFLAGS $CPPFLAGS $LDFLAGS $no_ssp
-				-Wl,-z,execstack -nostdlib -nostartfiles
-				-fPIC -shared -o conftest.so conftest.c
-				1>&5'
+  if { ac_try='${CC-cc} $CFLAGS $CPPFLAGS $LDFLAGS $no_ssp
+		    -Wl,-z,execstack -nostdlib -nostartfiles
+		    -fPIC -shared -o conftest.so conftest.c
+		    1>&5'
   { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5
   (eval $ac_try) 2>&5
   ac_status=$?
   $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
   test $ac_status = 0; }; }
-    then
-      libc_linker_feature=yes
-    fi
-    rm -f conftest*
+  then
+    libc_linker_feature=yes
   fi
+  rm -f conftest*
 fi
 if test $libc_linker_feature = yes; then
   libc_cv_z_execstack=yes
@@ -6012,25 +6009,22 @@  $as_echo "$libc_linker_feature" >&6; }
 $as_echo_n "checking for linker that supports -z start-stop-gc... " >&6; }
 libc_linker_feature=no
 if test x"$gnu_ld" = x"yes"; then
-  libc_linker_check=`$LD -v --help 2>/dev/null | grep "\-z start-stop-gc"`
-  if test -n "$libc_linker_check"; then
-    cat > conftest.c <<EOF
+  cat > conftest.c <<EOF
 int _start (void) { return 42; }
 EOF
-    if { ac_try='${CC-cc} $CFLAGS $CPPFLAGS $LDFLAGS $no_ssp
-				-Wl,-z,start-stop-gc -nostdlib -nostartfiles
-				-fPIC -shared -o conftest.so conftest.c
-				1>&5'
+  if { ac_try='${CC-cc} $CFLAGS $CPPFLAGS $LDFLAGS $no_ssp
+		    -Wl,-z,start-stop-gc -nostdlib -nostartfiles
+		    -fPIC -shared -o conftest.so conftest.c
+		    1>&5'
   { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5
   (eval $ac_try) 2>&5
   ac_status=$?
   $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
   test $ac_status = 0; }; }
-    then
-      libc_linker_feature=yes
-    fi
-    rm -f conftest*
+  then
+    libc_linker_feature=yes
   fi
+  rm -f conftest*
 fi
 if test $libc_linker_feature = yes; then
   libc_cv_z_start_stop_gc=yes
@@ -6046,25 +6040,22 @@  have-z-start-stop-gc = $libc_cv_z_start_stop_gc"
 $as_echo_n "checking for linker that supports --depaudit... " >&6; }
 libc_linker_feature=no
 if test x"$gnu_ld" = x"yes"; then
-  libc_linker_check=`$LD -v --help 2>/dev/null | grep "\--depaudit"`
-  if test -n "$libc_linker_check"; then
-    cat > conftest.c <<EOF
+  cat > conftest.c <<EOF
 int _start (void) { return 42; }
 EOF
-    if { ac_try='${CC-cc} $CFLAGS $CPPFLAGS $LDFLAGS $no_ssp
-				-Wl,--depaudit,x -nostdlib -nostartfiles
-				-fPIC -shared -o conftest.so conftest.c
-				1>&5'
+  if { ac_try='${CC-cc} $CFLAGS $CPPFLAGS $LDFLAGS $no_ssp
+		    -Wl,--depaudit,x -nostdlib -nostartfiles
+		    -fPIC -shared -o conftest.so conftest.c
+		    1>&5'
   { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5
   (eval $ac_try) 2>&5
   ac_status=$?
   $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
   test $ac_status = 0; }; }
-    then
-      libc_linker_feature=yes
-    fi
-    rm -f conftest*
+  then
+    libc_linker_feature=yes
   fi
+  rm -f conftest*
 fi
 if test $libc_linker_feature = yes; then
   libc_cv_depaudit=yes
@@ -6076,29 +6067,57 @@  $as_echo "$libc_linker_feature" >&6; }
 config_vars="$config_vars
 have-depaudit = $libc_cv_depaudit"
 
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for linker that supports --pack-dyn-relocs=relr" >&5
+$as_echo_n "checking for linker that supports --pack-dyn-relocs=relr... " >&6; }
+libc_linker_feature=no
+if test x"$gnu_ld" = x"yes"; then
+  cat > conftest.c <<EOF
+int _start (void) { return 42; }
+EOF
+  if { ac_try='${CC-cc} $CFLAGS $CPPFLAGS $LDFLAGS $no_ssp
+		    -Wl,--pack-dyn-relocs=relr -nostdlib -nostartfiles
+		    -fPIC -shared -o conftest.so conftest.c
+		    1>&5'
+  { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5
+  (eval $ac_try) 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; }
+  then
+    libc_linker_feature=yes
+  fi
+  rm -f conftest*
+fi
+if test $libc_linker_feature = yes; then
+  libc_cv_relr=yes
+else
+  libc_cv_relr=no
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libc_linker_feature" >&5
+$as_echo "$libc_linker_feature" >&6; }
+config_vars="$config_vars
+have-relr = $libc_cv_relr"
+
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for linker that supports --no-dynamic-linker" >&5
 $as_echo_n "checking for linker that supports --no-dynamic-linker... " >&6; }
 libc_linker_feature=no
 if test x"$gnu_ld" = x"yes"; then
-  libc_linker_check=`$LD -v --help 2>/dev/null | grep "\--no-dynamic-linker"`
-  if test -n "$libc_linker_check"; then
-    cat > conftest.c <<EOF
+  cat > conftest.c <<EOF
 int _start (void) { return 42; }
 EOF
-    if { ac_try='${CC-cc} $CFLAGS $CPPFLAGS $LDFLAGS $no_ssp
-				-Wl,--no-dynamic-linker -nostdlib -nostartfiles
-				-fPIC -shared -o conftest.so conftest.c
-				1>&5'
+  if { ac_try='${CC-cc} $CFLAGS $CPPFLAGS $LDFLAGS $no_ssp
+		    -Wl,--no-dynamic-linker -nostdlib -nostartfiles
+		    -fPIC -shared -o conftest.so conftest.c
+		    1>&5'
   { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5
   (eval $ac_try) 2>&5
   ac_status=$?
   $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
   test $ac_status = 0; }; }
-    then
-      libc_linker_feature=yes
-    fi
-    rm -f conftest*
+  then
+    libc_linker_feature=yes
   fi
+  rm -f conftest*
 fi
 if test $libc_linker_feature = yes; then
   libc_cv_no_dynamic_linker=yes
diff --git a/configure.ac b/configure.ac
index 00f49f09f7..96110f9d7d 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1354,6 +1354,10 @@  LIBC_LINKER_FEATURE([--depaudit], [-Wl,--depaudit,x],
 		    [libc_cv_depaudit=yes], [libc_cv_depaudit=no])
 LIBC_CONFIG_VAR([have-depaudit], [$libc_cv_depaudit])
 
+LIBC_LINKER_FEATURE([--pack-dyn-relocs=relr], [-Wl,--pack-dyn-relocs=relr],
+		    [libc_cv_relr=yes], [libc_cv_relr=no])
+LIBC_CONFIG_VAR([have-relr], [$libc_cv_relr])
+
 LIBC_LINKER_FEATURE([--no-dynamic-linker],
 		    [-Wl,--no-dynamic-linker],
 		    [libc_cv_no_dynamic_linker=yes],
diff --git a/elf/Makefile b/elf/Makefile
index 26986c0692..e7c3f3404b 100644
--- a/elf/Makefile
+++ b/elf/Makefile
@@ -241,6 +241,10 @@  endif
 ifeq ($(have-depaudit),yes)
 tests += tst-audit14 tst-audit15 tst-audit16
 endif
+ifeq ($(have-relr),yes)
+tests += tst-relr
+LDFLAGS-tst-relr += -Wl,--pack-dyn-relocs=relr
+endif
 endif
 tests += $(tests-execstack-$(have-z-execstack))
 ifeq ($(run-built-tests),yes)
diff --git a/elf/dynamic-link.h b/elf/dynamic-link.h
index 7cc3021164..ce8d628ca1 100644
--- a/elf/dynamic-link.h
+++ b/elf/dynamic-link.h
@@ -192,6 +192,33 @@  elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[],
 #  define ELF_DYNAMIC_DO_RELA(map, scope, lazy, skip_ifunc) /* Nothing to do.  */
 # endif
 
+# define ELF_DYNAMIC_DO_RELR(map)					      \
+  do {									      \
+    ElfW(Addr) l_addr = (map)->l_addr, base = 0, start;			      \
+    const ElfW(Relr) *r = 0, *end = 0;					      \
+    if (!(map)->l_info[DT_RELR])					      \
+      break;								      \
+    start = D_PTR((map), l_info[DT_RELR]);				      \
+    r = (const ElfW(Relr) *)start;					      \
+    end = (const ElfW(Relr) *)(start + (map)->l_info[DT_RELRSZ]->d_un.d_val); \
+    for (; r < end; ++r) {						      \
+      ElfW(Relr) entry = *r;						      \
+      if ((entry & 1) == 0) {						      \
+	*((ElfW(Addr) *)(l_addr + entry)) += l_addr;			      \
+	base = entry + sizeof(ElfW(Addr));				      \
+	continue;							      \
+      }									      \
+      ElfW(Addr) offset = base;						      \
+      do {								      \
+	entry >>= 1;							      \
+	if ((entry & 1) != 0)						      \
+	  *((ElfW(Addr) *)(l_addr + offset)) += l_addr;			      \
+	offset += sizeof(ElfW(Addr));					      \
+      } while (entry != 0);						      \
+      base += (8 * sizeof(ElfW(Relr)) - 1) * sizeof(ElfW(Addr));	      \
+    }									      \
+  } while (0);
+
 /* This can't just be an inline function because GCC is too dumb
    to inline functions containing inlines themselves.  */
 # define ELF_DYNAMIC_RELOCATE(map, scope, lazy, consider_profile, skip_ifunc) \
@@ -200,6 +227,7 @@  elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[],
 					      (consider_profile));	      \
     ELF_DYNAMIC_DO_REL ((map), (scope), edr_lazy, skip_ifunc);		      \
     ELF_DYNAMIC_DO_RELA ((map), (scope), edr_lazy, skip_ifunc);		      \
+    ELF_DYNAMIC_DO_RELR ((map));					      \
   } while (0)
 
 #endif
diff --git a/elf/elf.h b/elf/elf.h
index 50f87baceb..093d69fd07 100644
--- a/elf/elf.h
+++ b/elf/elf.h
@@ -443,7 +443,8 @@  typedef struct
 #define SHT_PREINIT_ARRAY 16		/* Array of pre-constructors */
 #define SHT_GROUP	  17		/* Section group */
 #define SHT_SYMTAB_SHNDX  18		/* Extended section indices */
-#define	SHT_NUM		  19		/* Number of defined types.  */
+#define SHT_RELR	  19            /* RELR relative relocations */
+#define	SHT_NUM		  20		/* Number of defined types.  */
 #define SHT_LOOS	  0x60000000	/* Start OS-specific.  */
 #define SHT_GNU_ATTRIBUTES 0x6ffffff5	/* Object attributes.  */
 #define SHT_GNU_HASH	  0x6ffffff6	/* GNU-style hash table.  */
@@ -662,6 +663,11 @@  typedef struct
   Elf64_Sxword	r_addend;		/* Addend */
 } Elf64_Rela;
 
+/* RELR relocation table entry */
+
+typedef Elf32_Word	Elf32_Relr;
+typedef Elf64_Xword	Elf64_Relr;
+
 /* How to extract and insert information held in the r_info field.  */
 
 #define ELF32_R_SYM(val)		((val) >> 8)
@@ -887,7 +893,10 @@  typedef struct
 #define DT_PREINIT_ARRAY 32		/* Array with addresses of preinit fct*/
 #define DT_PREINIT_ARRAYSZ 33		/* size in bytes of DT_PREINIT_ARRAY */
 #define DT_SYMTAB_SHNDX	34		/* Address of SYMTAB_SHNDX section */
-#define	DT_NUM		35		/* Number used */
+#define DT_RELRSZ	35
+#define DT_RELR		36
+#define DT_RELRENT	37
+#define	DT_NUM		38		/* Number used */
 #define DT_LOOS		0x6000000d	/* Start of OS-specific */
 #define DT_HIOS		0x6ffff000	/* End of OS-specific */
 #define DT_LOPROC	0x70000000	/* Start of processor-specific */
diff --git a/elf/get-dynamic-info.h b/elf/get-dynamic-info.h
index 15c316b38c..08e3aae68b 100644
--- a/elf/get-dynamic-info.h
+++ b/elf/get-dynamic-info.h
@@ -87,6 +87,7 @@  elf_get_dynamic_info (struct link_map *l)
 # if ! ELF_MACHINE_NO_REL
       ADJUST_DYN_INFO (DT_REL);
 # endif
+      ADJUST_DYN_INFO (DT_RELR);
       ADJUST_DYN_INFO (DT_JMPREL);
       ADJUST_DYN_INFO (VERSYMIDX (DT_VERSYM));
       ADJUST_DYN_INFO (ADDRIDX (DT_GNU_HASH));
@@ -111,6 +112,8 @@  elf_get_dynamic_info (struct link_map *l)
   if (info[DT_REL] != NULL)
     assert (info[DT_RELENT]->d_un.d_val == sizeof (ElfW(Rel)));
 #endif
+  if (info[DT_RELR] != NULL)
+    assert (info[DT_RELRENT]->d_un.d_val == sizeof (ElfW(Relr)));
 #ifdef RTLD_BOOTSTRAP
   /* Only the bind now flags are allowed.  */
   assert (info[VERSYMIDX (DT_FLAGS_1)] == NULL
diff --git a/elf/tst-relr.c b/elf/tst-relr.c
new file mode 100644
index 0000000000..20a1409a6c
--- /dev/null
+++ b/elf/tst-relr.c
@@ -0,0 +1,20 @@ 
+static int o, x;
+void *arr[] = {
+  &o, &o, &o, &o, &o, &o, &o, &o, &o, &o, &o, &o, &o, &o, &o, &o,
+  0,
+  &x, &x, &x, &x, &x, &x, &x, &x, &x, &x, &x, &x, &x, &x, &x, &x,
+};
+
+static int
+do_test (void)
+{
+  for (int i = 0; i < 16; i++)
+    if (arr[i] != &o)
+      return 1;
+  for (int i = 17; i < 33; i++)
+    if (arr[i] != &x)
+      return 1;
+  return 0;
+}
+
+#include <support/test-driver.c>