elf: Defer all IRELATIVE relocations until after PLT setup (BZ 20673)

Message ID  20260428134130.2430638-1-adhemerval.zanella@linaro.org
State       Superseded
Series      elf: Defer all IRELATIVE relocations until after PLT setup (BZ 20673)

Checks

Context                           Check    Description
redhat-pt-bot/TryBot-apply_patch  success  Patch applied to master at the time it was sent
redhat-pt-bot/TryBot-32bit        success  Build for i686

Commit Message

Adhemerval Zanella April 28, 2026, 1:40 p.m. UTC
When a shared library is built with -z lazy and its IFUNC resolver calls
a PLT function, the dynamic linker can crash.  The resolver runs while
the PLT stubs still hold their raw ELF virtual addresses (l_addr has
not yet been added), so the call branches to an unmapped address.
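
As a rough illustration of the failure mode, the following toy model (not
glibc code) shows why a slot that has not had the load bias added points
outside the mapping.  The names toy_lazy_jmp_slot, toy_is_mapped,
link_time_value and load_bias are hypothetical; load_bias stands in for
l_addr, and the addresses used with it are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model, not glibc code: in a lazily bound object each PLT slot
   initially holds a link-time address.  The loader makes the slot
   usable by adding the load bias (l_addr in glibc).  */
static uintptr_t
toy_lazy_jmp_slot (uintptr_t link_time_value, uintptr_t load_bias)
{
  return link_time_value + load_bias;
}

/* Return nonzero if TARGET lies inside the mapping [base, base + size).  */
static int
toy_is_mapped (uintptr_t target, uintptr_t base, uintptr_t size)
{
  assert (size > 0);
  return target >= base && target - base < size;
}
```

With a hypothetical object mapped at 0x40000000 and a slot holding 0x1036,
toy_is_mapped is false for the raw value and true once toy_lazy_jmp_slot
has been applied, which is the state the resolver needs to see.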

The old code deferred IRELATIVE entries only to the end of the relocation
range currently being processed (via the r2/end2 scan-ahead mechanism in
elf_dynamic_do_Rel).  This was sufficient only when both IRELATIVE and the
JMP_SLOT entries for the PLT functions it needs are in the same section.
On x86-64, aarch64, arm, i386 and most other targets, a file-scope
initialiser of the form

  int (*fptr)(void) = some_ifunc;

causes the linker to place R_*_IRELATIVE in .rela.dyn, while JMP_SLOT
entries for any PLT calls made by the resolver live in .rela.plt.
Processing .rela.dyn before .rela.plt means the resolver fires before the
PLT is usable, regardless of where within .rela.dyn IRELATIVE appears.
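
The ordering problem can be sketched with a small simulation.  This is a
simplified model under stated assumptions, not the removed glibc code: the
toy_ names are hypothetical, range 0 stands for .rela.dyn and range 1 for
.rela.plt, and "applying" a relocation only records its kind.

```c
#include <assert.h>
#include <stddef.h>

enum toy_kind { TOY_RELATIVE, TOY_JMP_SLOT, TOY_IRELATIVE };

struct toy_range { const enum toy_kind *relocs; size_t count; };

/* Old behaviour (simplified): each range is processed on its own and
   IRELATIVE entries are deferred only to the end of that range.
   Writes the application order to ORDER and returns the count.  */
static size_t
toy_relocate_old (const struct toy_range ranges[2],
                  enum toy_kind *order, size_t max)
{
  size_t n = 0;
  for (int i = 0; i < 2; ++i)
    {
      /* First everything that is not IRELATIVE in this range...  */
      for (size_t j = 0; j < ranges[i].count; ++j)
        if (ranges[i].relocs[j] != TOY_IRELATIVE)
          {
            assert (n < max);
            order[n++] = ranges[i].relocs[j];
          }
      /* ...then the IRELATIVE entries of the same range.  */
      for (size_t j = 0; j < ranges[i].count; ++j)
        if (ranges[i].relocs[j] == TOY_IRELATIVE)
          {
            assert (n < max);
            order[n++] = ranges[i].relocs[j];
          }
    }
  return n;
}
```

With an IRELATIVE entry in range 0 and a JMP_SLOT entry in range 1, the
recorded order places IRELATIVE before JMP_SLOT, which is the situation
described above.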

Fix this by splitting IRELATIVE processing into a separate, explicitly
deferred pass.  In elf/do-rel.h:

 - Remove the r2/end2 variables and the post-loop IRELATIVE re-scan from
   elf_dynamic_do_Rel.  IRELATIVE entries are now always skipped in the
   non-bootstrap path.

 - Add a new elf_dynamic_do_Rel_irelative function that scans a
   relocation range and calls elf_machine_rel/elf_machine_lazy_rel for
   IRELATIVE and ifunc relocations.
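
The skip test shared by both functions can be sketched in isolation.  This
is a standalone approximation of the predicate added by the patch, with
simplified stand-in types: the toy_ structs and TOY_ macros are
hypothetical, the constant values mirror <elf.h>, and the x86-64
relocation number is used purely as an example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values mirrored from <elf.h>; the toy_ types are simplified stand-ins.  */
#define TOY_STT_GNU_IFUNC 10
#define TOY_SHN_UNDEF     0
#define TOY_R_IRELATIVE   37  /* R_X86_64_IRELATIVE, used as an example.  */

struct toy_sym  { unsigned char st_info; uint16_t st_shndx; };
struct toy_rela { uint64_t r_info; };

/* Nonzero if RELOC must wait for the deferred pass: either it is an
   IRELATIVE relocation, or it refers to an IFUNC symbol defined in the
   same object.  */
static int
toy_is_irelative (const struct toy_rela *reloc, const struct toy_sym *sym)
{
  assert (reloc != NULL);
  uint32_t r_type = (uint32_t) (reloc->r_info & 0xffffffffu);
  return ((sym != NULL
           && (sym->st_info & 0xf) == TOY_STT_GNU_IFUNC
           && sym->st_shndx != TOY_SHN_UNDEF)
          || r_type == TOY_R_IRELATIVE);
}
```

A relocation against an undefined IFUNC symbol is not deferred by this
test, since the symbol is resolved through the normal lookup instead.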

In elf/dynamic-link.h, update _ELF_DYNAMIC_DO_RELOC to use a two-pass
approach for non-bootstrap builds unconditionally (regardless of whether
ranges[1].size is zero).  Each pass walks both relocation ranges, giving
four phases:

 Phase 1+2: elf_dynamic_do_Rel over .rela.dyn then .rela.plt; processes
            everything except IRELATIVE/STT_GNU_IFUNC.
 Phase 3+4: elf_dynamic_do_Rel_irelative over .rela.dyn then .rela.plt;
            processes only IRELATIVE/STT_GNU_IFUNC, by which point all
            PLT stubs are valid.

This guarantees that IRELATIVE resolvers can call PLT stubs safely
regardless of which section the linker placed R_*_IRELATIVE in.
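
The resulting order can be checked with a small simulation.  This is a
sketch of the scheduling only, not of _ELF_DYNAMIC_DO_RELOC itself: the
toy_ names are hypothetical, range 0 stands for .rela.dyn and range 1 for
.rela.plt, and "applying" a relocation only appends its kind to a log.

```c
#include <assert.h>
#include <stddef.h>

enum toy_kind { TOY_RELATIVE, TOY_JMP_SLOT, TOY_IRELATIVE };

struct toy_range { const enum toy_kind *relocs; size_t count; };

/* Records the order in which relocations are applied.  */
struct toy_log { enum toy_kind applied[16]; size_t n; };

static void
toy_apply (struct toy_log *log, enum toy_kind kind)
{
  assert (log->n < sizeof log->applied / sizeof log->applied[0]);
  log->applied[log->n++] = kind;
}

/* Phases 1+2: everything except IRELATIVE, over both ranges.
   Phases 3+4: only IRELATIVE, over both ranges again.  */
static void
toy_relocate (const struct toy_range ranges[2], struct toy_log *log)
{
  for (int i = 0; i < 2; ++i)
    for (size_t j = 0; j < ranges[i].count; ++j)
      if (ranges[i].relocs[j] != TOY_IRELATIVE)
        toy_apply (log, ranges[i].relocs[j]);

  for (int i = 0; i < 2; ++i)
    for (size_t j = 0; j < ranges[i].count; ++j)
      if (ranges[i].relocs[j] == TOY_IRELATIVE)
        toy_apply (log, ranges[i].relocs[j]);
}
```

Even when the IRELATIVE entry is the first entry of range 0, the log shows
it applied after the JMP_SLOT entry of range 1.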

Add ELF_MACHINE_IRELATIVE to the architectures that were missing it so
the new skip logic in elf_dynamic_do_Rel is compiled for all targets.

I checked on all ABIs that support IFUNC (x86_64, i686, aarch64, arm,
loongarch, powerpc, riscv, s390, and sparc), some of them through
qemu-system (which should not matter for this case).

It also fixes the issue reported for mold [1], which shows an example
where IFUNC relocation placement and processing can work differently
across ABIs.

[1] https://github.com/rui314/mold/issues/1550
---
 elf/Makefile                           |  12 ++
 elf/do-rel.h                           | 179 +++++++++++++++----------
 elf/dynamic-link.h                     |  37 +++--
 elf/tst-ifunc-plt-dep.c                |  23 ++++
 elf/tst-ifunc-plt-dlopen.c             |  46 +++++++
 elf/tst-ifunc-plt-lib.c                |  56 ++++++++
 elf/tst-ifunc-plt.c                    |  38 ++++++
 sysdeps/aarch64/dl-machine.h           |   1 +
 sysdeps/arm/dl-machine.h               |   1 +
 sysdeps/i386/dl-machine.h              |   1 +
 sysdeps/powerpc/powerpc32/dl-machine.h |   1 +
 sysdeps/powerpc/powerpc64/dl-machine.h |   1 +
 sysdeps/riscv/dl-machine.h             |   1 +
 sysdeps/sparc/sparc32/dl-machine.h     |   1 +
 sysdeps/sparc/sparc64/dl-machine.h     |   1 +
 15 files changed, 320 insertions(+), 79 deletions(-)
 create mode 100644 elf/tst-ifunc-plt-dep.c
 create mode 100644 elf/tst-ifunc-plt-dlopen.c
 create mode 100644 elf/tst-ifunc-plt-lib.c
 create mode 100644 elf/tst-ifunc-plt.c
  

Comments

H.J. Lu May 8, 2026, 11:48 p.m. UTC | #1
On Tue, Apr 28, 2026 at 9:42 PM Adhemerval Zanella
<adhemerval.zanella@linaro.org> wrote:
>
> When a shared library is built with -z lazy and its IFUNC resolver calls
> a PLT function, the dynamic linker can crash.  The resolver runs while
> the PLT stubs still hold their raw ELF virtual addresses — l_addr has
> not yet been added — so the call branches to an unmapped address.
>
> The old code deferred IRELATIVE entries only to the end of the relocation
> range currently being processed (via the r2/end2 scan-ahead mechanism in
> elf_dynamic_do_Rel).  This was sufficient only when both IRELATIVE and the
> JMP_SLOT entries for the PLT functions it needs are in the same section.
> On x86-64, aarch64, arm, i386 and most other targets, a file-scope
> initialiser of the form
>
>   int (*fptr)(void) = some_ifunc;
>
> causes the linker to place R_*_IRELATIVE in .rela.dyn, while JMP_SLOT
> entries for any PLT calls made by the resolver live in .rela.plt.
> Processing .rela.dyn before .rela.plt means the resolver fires before the
> PLT is usable, regardless of where within .rela.dyn IRELATIVE appears.
>
> Fix this by splitting IRELATIVE processing into a separate, explicitly
> deferred pass.  In elf/do-rel.h:
>
>  - Remove the r2/end2 variables and the post-loop IRELATIVE re-scan from
>    elf_dynamic_do_Rel.  IRELATIVE entries are now always skipped in the
>    non-bootstrap path.
>
>  - Add a new elf_dynamic_do_Rel_irelative function that scans a
>    relocation range and calls elf_machine_rel/elf_machine_lazy_rel for
>    IRELATIVE and ifunc relocations.
>
> In elf/dynamic-link.h, update _ELF_DYNAMIC_DO_RELOC to use a two-phase
> approach for non-bootstrap builds unconditionally (regardless of whether
> ranges[1].size is zero):
>
>  Phase 1+2: elf_dynamic_do_Rel over .rela.dyn then .rela.plt — processes
>             everything except IRELATIVE/STT_GNU_IFUNC.
>  Phase 3+4: elf_dynamic_do_Rel_irelative over .rela.dyn then .rela.plt —
>             processes only IRELATIVE, by which point all PLT stubs are
>             valid.
>
> This guarantees that IRELATIVE resolvers can call PLT stubs safely
> regardless of which section the linker placed R_*_IRELATIVE in.
>
> Add ELF_MACHINE_IRELATIVE to the architectures that were missing it so
> the new skip logic in elf_dynamic_do_Rel is compiled for all targets.
>
> I checked on all ABIs that support IFUNC (x86_64, i686, aarch64, arm,
> loongarch, powerpc, riscv, s390, and sparc), some of them through
> qemu-system (which should not matter for this case).
>
> It also fixes the issue reported for mold [1], which shows an example
> where IFUNC relocation placement and processing can work differently
> across ABIs.

Just to be clear.  This patch doesn't fully fix:

https://sourceware.org/bugzilla/show_bug.cgi?id=20673

This fix only deals with IRELATIVE relocation order within an object.
Calling an external function from an IFUNC resolver may still crash.
Am I correct?

> [1] https://github.com/rui314/mold/issues/1550
> ---
>  elf/Makefile                           |  12 ++
>  elf/do-rel.h                           | 179 +++++++++++++++----------
>  elf/dynamic-link.h                     |  37 +++--
>  elf/tst-ifunc-plt-dep.c                |  23 ++++
>  elf/tst-ifunc-plt-dlopen.c             |  46 +++++++
>  elf/tst-ifunc-plt-lib.c                |  56 ++++++++
>  elf/tst-ifunc-plt.c                    |  38 ++++++
>  sysdeps/aarch64/dl-machine.h           |   1 +
>  sysdeps/arm/dl-machine.h               |   1 +
>  sysdeps/i386/dl-machine.h              |   1 +
>  sysdeps/powerpc/powerpc32/dl-machine.h |   1 +
>  sysdeps/powerpc/powerpc64/dl-machine.h |   1 +
>  sysdeps/riscv/dl-machine.h             |   1 +
>  sysdeps/sparc/sparc32/dl-machine.h     |   1 +
>  sysdeps/sparc/sparc64/dl-machine.h     |   1 +
>  15 files changed, 320 insertions(+), 79 deletions(-)
>  create mode 100644 elf/tst-ifunc-plt-dep.c
>  create mode 100644 elf/tst-ifunc-plt-dlopen.c
>  create mode 100644 elf/tst-ifunc-plt-lib.c
>  create mode 100644 elf/tst-ifunc-plt.c
>
> diff --git a/elf/Makefile b/elf/Makefile
> index c835eb8156..377ea2c0cc 100644
> --- a/elf/Makefile
> +++ b/elf/Makefile
> @@ -1252,6 +1252,8 @@ ifeq (yes,$(build-shared))
>  tests += \
>    tst-ifunc-fault-bindnow \
>    tst-ifunc-fault-lazy \
> +  tst-ifunc-plt \
> +  tst-ifunc-plt-dlopen \
>    # tests
>  # Note: sysdeps/x86_64/ifuncmain8.c uses ifuncmain8.
>  tests-internal += \
> @@ -1314,6 +1316,8 @@ modules-names += \
>    ifuncmod1 \
>    ifuncmod3 \
>    ifuncmod6 \
> +  tst-ifunc-plt-dep \
> +  tst-ifunc-plt-lib \
>    # modules-names
>  ifeq (no,$(with-lld))
>  modules-names += ifuncmod5
> @@ -1793,6 +1797,7 @@ unload4mod1.so-no-z-defs = yes
>  ifuncmod1.so-no-z-defs = yes
>  ifuncmod5.so-no-z-defs = yes
>  ifuncmod6.so-no-z-defs = yes
> +tst-ifunc-plt-lib.so-no-z-defs = yes
>  tst-auditmod9a.so-no-z-defs = yes
>  tst-auditmod9b.so-no-z-defs = yes
>  tst-nodelete-uniquemod.so-no-z-defs = yes
> @@ -2426,6 +2431,13 @@ $(objpfx)tst-ifunc-fault-bindnow.out: $(objpfx)tst-ifunc-fault-bindnow \
>     $(objpfx)ld.so
>         $(tst-ifunc-fault-script)
>
> +LDFLAGS-tst-ifunc-plt-lib.so = -Wl,-z,lazy
> +
> +$(objpfx)tst-ifunc-plt-lib.so: $(objpfx)tst-ifunc-plt-dep.so
> +$(objpfx)tst-ifunc-plt: $(objpfx)tst-ifunc-plt-lib.so
> +$(objpfx)tst-ifunc-plt-dlopen.out: \
> +  $(objpfx)tst-ifunc-plt-lib.so $(objpfx)tst-ifunc-plt-dep.so
> +
>  $(objpfx)tst-unique1.out: $(objpfx)tst-unique1mod1.so \
>                           $(objpfx)tst-unique1mod2.so
>
> diff --git a/elf/do-rel.h b/elf/do-rel.h
> index d00ffab7e9..e7912d4ed4 100644
> --- a/elf/do-rel.h
> +++ b/elf/do-rel.h
> @@ -23,6 +23,8 @@
>
>  #ifdef DO_RELA
>  # define elf_dynamic_do_Rel            elf_dynamic_do_Rela
> +# define elf_dynamic_do_Rel_irelative  elf_dynamic_do_Rela_irelative
> +# define elf_dynamic_is_Rel_irelative  elf_dynamic_is_Rela_irelative
>  # define Rel                           Rela
>  # define elf_machine_rel               elf_machine_rela
>  # define elf_machine_rel_relative      elf_machine_rela_relative
> @@ -34,16 +36,34 @@
>                             (void *) (l_addr + relative->r_offset))
>  #endif
>
> +static __always_inline bool
> +elf_dynamic_is_Rel_irelative (const ElfW(Rel) *reloc, const ElfW(Sym) *sym)
> +{
> +#ifdef ELF_MACHINE_IRELATIVE
> +  const unsigned int r_type = ELFW (R_TYPE) (reloc->r_info);
> +  return ((sym != NULL
> +          && ELFW(ST_TYPE) (sym->st_info) == STT_GNU_IFUNC
> +          && sym->st_shndx != SHN_UNDEF)
> +         || r_type == ELF_MACHINE_IRELATIVE);
> +#else
> +  return false;
> +#endif
> +}
> +
>  /* Perform the relocations in MAP on the running program image as specified
>     by RELTAG, SZTAG.  If LAZY is nonzero, this is the first pass on PLT
>     relocations; they should be set up to call _dl_runtime_resolve, rather
> -   than fully resolved now.  */
> +   than fully resolved now.
> +
> +   IRELATIVE entries are always skipped (non-bootstrap); they are handled
> +   separately by elf_dynamic_do_Rel_irelative after all other relocations
> +   for both .rel.dyn and .rel.plt have been processed.  */
>
>  static inline void __attribute__ ((always_inline))
>  elf_dynamic_do_Rel (struct link_map *map, struct r_scope_elem *scope[],
>                     ElfW(Addr) reladdr, ElfW(Addr) relsize,
>                     __typeof (((ElfW(Dyn) *) 0)->d_un.d_val) nrelative,
> -                   int lazy, int skip_ifunc)
> +                   int lazy)
>  {
>    const ElfW(Rel) *relative = (const void *) reladdr;
>    const ElfW(Rel) *r = relative + nrelative;
> @@ -65,14 +85,9 @@ elf_dynamic_do_Rel (struct link_map *map, struct r_scope_elem *scope[],
>        void *const r_addr_arg = (void *) (l_addr + r->r_offset);
>        const struct r_found_version *rversion = &map->l_versions[ndx];
>
> -      elf_machine_rel (map, scope, r, sym, rversion, r_addr_arg, skip_ifunc);
> +      elf_machine_rel (map, scope, r, sym, rversion, r_addr_arg, 0);
>      }
>  #else /* !RTLD_BOOTSTRAP */
> -# if defined ELF_MACHINE_IRELATIVE
> -  const ElfW(Rel) *r2 = NULL;
> -  const ElfW(Rel) *end2 = NULL;
> -# endif
> -
>  #if !defined DO_RELA || !defined ELF_MACHINE_PLT_REL
>    /* We never bind lazily during ld.so bootstrap.  Unfortunately gcc is
>       not clever enough to see through all the function calls to realize
> @@ -81,23 +96,12 @@ elf_dynamic_do_Rel (struct link_map *map, struct r_scope_elem *scope[],
>      {
>        /* Doing lazy PLT relocations; they need very little info.  */
>        for (; r < end; ++r)
> -# ifdef ELF_MACHINE_IRELATIVE
> -       if (ELFW(R_TYPE) (r->r_info) == ELF_MACHINE_IRELATIVE)
> -         {
> -           if (r2 == NULL)
> -             r2 = r;
> -           end2 = r;
> -         }
> -       else
> -# endif
> -         elf_machine_lazy_rel (map, scope, l_addr, r, skip_ifunc);
> -
> -# ifdef ELF_MACHINE_IRELATIVE
> -      if (r2 != NULL)
> -       for (; r2 <= end2; ++r2)
> -         if (ELFW(R_TYPE) (r2->r_info) == ELF_MACHINE_IRELATIVE)
> -           elf_machine_lazy_rel (map, scope, l_addr, r2, skip_ifunc);
> -# endif
> +       {
> +         const ElfW (Sym) *sym = &symtab[ELFW (R_SYM) (r->r_info)];
> +         if (elf_dynamic_is_Rel_irelative (r, sym))
> +           continue;
> +         elf_machine_lazy_rel (map, scope, l_addr, r, 0);
> +       }
>      }
>    else
>  #endif
> @@ -125,18 +129,10 @@ elf_dynamic_do_Rel (struct link_map *map, struct r_scope_elem *scope[],
>               const ElfW(Sym) *sym = &symtab[ELFW(R_SYM) (r->r_info)];
>               void *const r_addr_arg = (void *) (l_addr + r->r_offset);
>               const struct r_found_version *rversion = &map->l_versions[ndx];
> -#if defined ELF_MACHINE_IRELATIVE
> -             if (ELFW(R_TYPE) (r->r_info) == ELF_MACHINE_IRELATIVE)
> -               {
> -                 if (r2 == NULL)
> -                   r2 = r;
> -                 end2 = r;
> -                 continue;
> -               }
> -#endif
>
> -             elf_machine_rel (map, scope, r, sym, rversion, r_addr_arg,
> -                              skip_ifunc);
> +             if (elf_dynamic_is_Rel_irelative (r, sym))
> +               continue;
> +             elf_machine_rel (map, scope, r, sym, rversion, r_addr_arg, 0);
>  #if defined SHARED
>               if (ELFW(R_TYPE) (r->r_info) == ELF_MACHINE_JMP_SLOT
>                   && GLRO(dl_naudit) > 0)
> @@ -150,21 +146,6 @@ elf_dynamic_do_Rel (struct link_map *map, struct r_scope_elem *scope[],
>                 }
>  #endif
>             }
> -
> -#if defined ELF_MACHINE_IRELATIVE
> -         if (r2 != NULL)
> -           for (; r2 <= end2; ++r2)
> -             if (ELFW(R_TYPE) (r2->r_info) == ELF_MACHINE_IRELATIVE)
> -               {
> -                 ElfW(Half) ndx
> -                   = version[ELFW(R_SYM) (r2->r_info)] & 0x7fff;
> -                 elf_machine_rel (map, scope, r2,
> -                                  &symtab[ELFW(R_SYM) (r2->r_info)],
> -                                  &map->l_versions[ndx],
> -                                  (void *) (l_addr + r2->r_offset),
> -                                  skip_ifunc);
> -               }
> -#endif
>         }
>        else
>         {
> @@ -172,17 +153,10 @@ elf_dynamic_do_Rel (struct link_map *map, struct r_scope_elem *scope[],
>             {
>               const ElfW(Sym) *sym = &symtab[ELFW(R_SYM) (r->r_info)];
>               void *const r_addr_arg = (void *) (l_addr + r->r_offset);
> -# ifdef ELF_MACHINE_IRELATIVE
> -             if (ELFW(R_TYPE) (r->r_info) == ELF_MACHINE_IRELATIVE)
> -               {
> -                 if (r2 == NULL)
> -                   r2 = r;
> -                 end2 = r;
> -                 continue;
> -               }
> -# endif
> -             elf_machine_rel (map, scope, r, sym, NULL, r_addr_arg,
> -                              skip_ifunc);
> +
> +             if (elf_dynamic_is_Rel_irelative (r, sym))
> +               continue;
> +             elf_machine_rel (map, scope, r, sym, NULL, r_addr_arg, 1);
>  # if defined SHARED
>               if (ELFW(R_TYPE) (r->r_info) == ELF_MACHINE_JMP_SLOT
>                   && GLRO(dl_naudit) > 0)
> @@ -197,21 +171,84 @@ elf_dynamic_do_Rel (struct link_map *map, struct r_scope_elem *scope[],
>                 }
>  # endif
>             }
> -
> -# ifdef ELF_MACHINE_IRELATIVE
> -         if (r2 != NULL)
> -           for (; r2 <= end2; ++r2)
> -             if (ELFW(R_TYPE) (r2->r_info) == ELF_MACHINE_IRELATIVE)
> -               elf_machine_rel (map, scope, r2, &symtab[ELFW(R_SYM) (r2->r_info)],
> -                                NULL, (void *) (l_addr + r2->r_offset),
> -                                skip_ifunc);
> -# endif
>         }
>      }
>  #endif /* !RTLD_BOOTSTRAP */
>  }
>
> +/* Process only IRELATIVE entries in the relocation range
> +   [reladdr, reladdr+relsize).  When lazy is non-zero the PLT lazy-binding
> +   path (elf_machine_lazy_rel) is used, otherwise the full non-lazy path
> +   (elf_machine_rel) is used.
> +
> +   Called by _ELF_DYNAMIC_DO_RELOC after all non-IRELATIVE relocations have
> +   been processed for both .rela.dyn and .rela.plt, so that IRELATIVE
> +   resolvers may call PLT stubs safely regardless of which section the linker
> +   placed R_*_IRELATIVE in.  */
> +static __always_inline void
> +elf_dynamic_do_Rel_irelative (struct link_map *map,
> +                             struct r_scope_elem *scope[],
> +                             ElfW(Addr) reladdr, ElfW(Addr) relsize,
> +                             int lazy, int skip_ifunc)
> +{
> +# ifdef ELF_MACHINE_IRELATIVE
> +  const ElfW(Rel) *r = (const void *) reladdr;
> +  const ElfW(Rel) *end = (const void *) (reladdr + relsize);
> +  ElfW(Addr) l_addr = map->l_addr;
> +  const ElfW(Sym) *const symtab = (const void *) D_PTR (map, l_info[DT_SYMTAB]);
> +
> +  if (lazy)
> +    {
> +      for (; r < end; ++r)
> +       {
> +         const ElfW (Sym) *sym = &symtab[ELFW (R_SYM) (r->r_info)];
> +         if (!elf_dynamic_is_Rel_irelative (r, sym))
> +           continue;
> +         elf_machine_lazy_rel (map, scope, l_addr, r, skip_ifunc);
> +       }
> +    }
> +  else
> +    {
> +      if (map->l_info[VERSYMIDX (DT_VERSYM)])
> +       {
> +         const ElfW(Half) *const version =
> +           (const void *) D_PTR (map, l_info[VERSYMIDX (DT_VERSYM)]);
> +
> +         for (; r < end; ++r)
> +           {
> +             const ElfW (Sym) *sym = &symtab[ELFW (R_SYM) (r->r_info)];
> +             if (!elf_dynamic_is_Rel_irelative (r, sym))
> +               continue;
> +
> +             ElfW(Half) ndx = version[ELFW(R_SYM) (r->r_info)] & 0x7fff;
> +             elf_machine_rel (map, scope, r,
> +                              &symtab[ELFW(R_SYM) (r->r_info)],
> +                              &map->l_versions[ndx],
> +                              (void *) (l_addr + r->r_offset),
> +                              skip_ifunc);
> +           }
> +       }
> +      else
> +       {
> +         for (; r < end; ++r)
> +           {
> +             const ElfW (Sym) *sym = &symtab[ELFW (R_SYM) (r->r_info)];
> +             if (!elf_dynamic_is_Rel_irelative (r, sym))
> +               continue;
> +
> +             elf_machine_rel (map, scope, r,
> +                              &symtab[ELFW(R_SYM) (r->r_info)],
> +                              NULL,
> +                              (void *) (l_addr + r->r_offset),
> +                              skip_ifunc);
> +           }
> +       }
> +    }
> +# endif
> +}
> +
>  #undef elf_dynamic_do_Rel
> +#undef elf_dynamic_do_Rel_irelative
>  #undef Rel
>  #undef elf_machine_rel
>  #undef elf_machine_rel_relative
> diff --git a/elf/dynamic-link.h b/elf/dynamic-link.h
> index a46f36b8d4..2055d910c6 100644
> --- a/elf/dynamic-link.h
> +++ b/elf/dynamic-link.h
> @@ -78,7 +78,8 @@ elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[],
>     consumes precisely the very end of the DT_REL*, or DT_JMPREL and DT_REL*
>     are completely separate and there is a gap between them.  */
>
> -# define _ELF_DYNAMIC_DO_RELOC(RELOC, reloc, map, scope, do_lazy, skip_ifunc, test_rel) \
> +# define _ELF_DYNAMIC_DO_RELOC(RELOC, reloc, map, scope, do_lazy, skip_ifunc, \
> +                              test_rel)                                      \
>    do {                                                                       \
>      struct { ElfW(Addr) start, size;                                         \
>              __typeof (((ElfW(Dyn) *) 0)->d_un.d_val) nrelative; int lazy; }  \
> @@ -118,13 +119,33 @@ elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[],
>           }                                                                   \
>        }                                                                              \
>                                                                               \
> -      for (int ranges_index = 0; ranges_index < 2; ++ranges_index)           \
> -        elf_dynamic_do_##reloc ((map), scope,                                \
> -                               ranges[ranges_index].start,                   \
> -                               ranges[ranges_index].size,                    \
> -                               ranges[ranges_index].nrelative,               \
> -                               ranges[ranges_index].lazy,                    \
> -                               skip_ifunc);                                  \
> +      /* Defer all IRELATIVE relocations until after all non-IRELATIVE       \
> +        relocations (including PLT lazy-binding setup) have been processed   \
> +        for both sections.  This ensures IRELATIVE resolvers can call PLT    \
> +        stubs safely regardless of which section R_*_IRELATIVE was placed in \
> +        by the linker.  */                                                   \
> +      if (!DO_RTLD_BOOTSTRAP)                                                \
> +       {                                                                     \
> +         for (int ranges_index = 0; ranges_index < 2; ++ranges_index)        \
> +           elf_dynamic_do_##reloc ((map), scope,                             \
> +                                   ranges[ranges_index].start,               \
> +                                   ranges[ranges_index].size,                \
> +                                   ranges[ranges_index].nrelative,           \
> +                                   ranges[ranges_index].lazy);               \
> +         for (int ranges_index = 0; ranges_index < 2; ++ranges_index)        \
> +           elf_dynamic_do_##reloc##_irelative ((map), scope,                 \
> +                                               ranges[ranges_index].start,   \
> +                                               ranges[ranges_index].size,    \
> +                                               ranges[ranges_index].lazy,    \
> +                                               skip_ifunc);                  \
> +       }                                                                     \
> +      else                                                                   \
> +       for (int ranges_index = 0; ranges_index < 2; ++ranges_index)          \
> +         elf_dynamic_do_##reloc ((map), scope,                               \
> +                                 ranges[ranges_index].start,                 \
> +                                 ranges[ranges_index].size,                  \
> +                                 ranges[ranges_index].nrelative,             \
> +                                 ranges[ranges_index].lazy);                 \
>    } while (0)
>
>  # if ELF_MACHINE_NO_REL || ELF_MACHINE_NO_RELA
> diff --git a/elf/tst-ifunc-plt-dep.c b/elf/tst-ifunc-plt-dep.c
> new file mode 100644
> index 0000000000..9cf2b1b0b2
> --- /dev/null
> +++ b/elf/tst-ifunc-plt-dep.c
> @@ -0,0 +1,23 @@
> +/* Dependency library for tst-ifunc-plt.
> +   Copyright (C) 2026 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +int
> +get_value (void)
> +{
> +  return 42;
> +}
> diff --git a/elf/tst-ifunc-plt-dlopen.c b/elf/tst-ifunc-plt-dlopen.c
> new file mode 100644
> index 0000000000..ed9d69b69c
> --- /dev/null
> +++ b/elf/tst-ifunc-plt-dlopen.c
> @@ -0,0 +1,46 @@
> +/* Test for BZ #20673 via dlopen.
> +   Copyright (C) 2026 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +
> +/* dlopen tst-ifunc-plt-lib.so with RTLD_LAZY and verify that the IFUNC
> +   resolver (which calls get_value() via PLT) ran successfully.  This
> +   exercises the same _dl_relocate_object code path as startup loading
> +   but via the dlopen entry point.  */
> +
> +#include <support/xdlfcn.h>
> +#include <support/check.h>
> +
> +typedef int (*fn_t) (void);
> +
> +static int
> +do_test (void)
> +{
> +  void *handle = xdlopen ("tst-ifunc-plt-lib.so", RTLD_LAZY | RTLD_LOCAL);
> +
> +  fn_t compute_a = (fn_t) xdlsym (handle, "compute_a");
> +  TEST_COMPARE (compute_a (), 1);
> +
> +  fn_t compute_b = (fn_t) xdlsym (handle, "compute_b");
> +  TEST_COMPARE (compute_b (), 2);
> +
> +  xdlclose (handle);
> +
> +  return 0;
> +}
> +
> +#include <support/test-driver.c>
> diff --git a/elf/tst-ifunc-plt-lib.c b/elf/tst-ifunc-plt-lib.c
> new file mode 100644
> index 0000000000..650231ff28
> --- /dev/null
> +++ b/elf/tst-ifunc-plt-lib.c
> @@ -0,0 +1,56 @@
> +/* Shared library for tst-ifunc-plt (bug 20673).
> +   Two static IFUNCs whose resolvers both call get_value() via PLT.
> +   Copyright (C) 2026 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +
> +/* Both resolvers call get_value() via PLT (one JUMP_SLOT entry in
> +   .rel{a}.plt).  This verifies that every IRELATIVE entry is deferred
> +   until after .rela.plt has been processed, not just the first one.  */
> +
> +#include <stddef.h>
> +
> +extern int get_value (void);
> +
> +static int
> +impl_a (void)
> +{
> +  return 1;
> +}
> +
> +static int
> +impl_b (void)
> +{
> +  return 2;
> +}
> +
> +static int (*
> +resolve_a (void)) (void)
> +{
> +  return get_value () == 42 ? impl_a : NULL;
> +}
> +
> +static int (*
> +resolve_b (void)) (void)
> +{
> +  return get_value () == 42 ? impl_b : NULL;
> +}
> +
> +/* The test is only built for $(have-ifunc), so we can assume HAVE_GCC_IFUNC
> +   here.  */
> +int compute_a (void) __attribute__ ((ifunc ("resolve_a")));
> +int compute_b (void) __attribute__ ((ifunc ("resolve_b")));
> diff --git a/elf/tst-ifunc-plt.c b/elf/tst-ifunc-plt.c
> new file mode 100644
> index 0000000000..382f90d852
> --- /dev/null
> +++ b/elf/tst-ifunc-plt.c
> @@ -0,0 +1,38 @@
> +/* Test for BZ #20673.
> +   Copyright (C) 2026 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +
> +/* tst-ifunc-plt-lib.so defines two static IFUNCs (compute_a and
> +   compute_b), each producing an R_*_IRELATIVE in .rel{a}.dyn, with both
> +   resolvers calling get_value() via PLT.  The test verifies that both
> +   IRELATIVEs are deferred until after .rel{a}.plt is processed.  */
> +
> +#include <support/check.h>
> +
> +extern int compute_a (void);
> +extern int compute_b (void);
> +
> +static int
> +do_test (void)
> +{
> +  TEST_COMPARE (compute_a (), 1);
> +  TEST_COMPARE (compute_b (), 2);
> +  return 0;
> +}
> +
> +#include <support/test-driver.c>
> diff --git a/sysdeps/aarch64/dl-machine.h b/sysdeps/aarch64/dl-machine.h
> index 21af8bc56e..15651c62f3 100644
> --- a/sysdeps/aarch64/dl-machine.h
> +++ b/sysdeps/aarch64/dl-machine.h
> @@ -119,6 +119,7 @@ elf_machine_runtime_setup (struct link_map *l, struct r_scope_elem *scope[],
>     | (((type) == R_AARCH64_COPY) * ELF_RTYPE_CLASS_COPY))
>
>  #define ELF_MACHINE_JMP_SLOT   R_AARCH64_JUMP_SLOT
> +#define ELF_MACHINE_IRELATIVE  R_AARCH64_IRELATIVE
>
>  #define DL_PLATFORM_INIT dl_platform_init ()
>
> diff --git a/sysdeps/arm/dl-machine.h b/sysdeps/arm/dl-machine.h
> index e0065ce73c..15cced693e 100644
> --- a/sysdeps/arm/dl-machine.h
> +++ b/sysdeps/arm/dl-machine.h
> @@ -190,6 +190,7 @@ _dl_start_user:\n\
>
>  /* A reloc type used for ld.so cmdline arg lookups to reject PLT entries.  */
>  #define ELF_MACHINE_JMP_SLOT   R_ARM_JUMP_SLOT
> +#define ELF_MACHINE_IRELATIVE  R_ARM_IRELATIVE
>
>  /* We define an initialization functions.  This is called very early in
>     _dl_sysdep_start.  */
> diff --git a/sysdeps/i386/dl-machine.h b/sysdeps/i386/dl-machine.h
> index 6657f68791..dd49079d75 100644
> --- a/sysdeps/i386/dl-machine.h
> +++ b/sysdeps/i386/dl-machine.h
> @@ -190,6 +190,7 @@ _dl_start_user:\n\
>
>  /* A reloc type used for ld.so cmdline arg lookups to reject PLT entries.  */
>  #define ELF_MACHINE_JMP_SLOT   R_386_JMP_SLOT
> +#define ELF_MACHINE_IRELATIVE  R_386_IRELATIVE
>
>  /* We define an initialization functions.  This is called very early in
>     _dl_sysdep_start.  */
> diff --git a/sysdeps/powerpc/powerpc32/dl-machine.h b/sysdeps/powerpc/powerpc32/dl-machine.h
> index d787298636..e07a44a5a5 100644
> --- a/sysdeps/powerpc/powerpc32/dl-machine.h
> +++ b/sysdeps/powerpc/powerpc32/dl-machine.h
> @@ -146,6 +146,7 @@ __elf_preferred_address(struct link_map *loader, size_t maplength,
>
>  /* A reloc type used for ld.so cmdline arg lookups to reject PLT entries.  */
>  #define ELF_MACHINE_JMP_SLOT   R_PPC_JMP_SLOT
> +#define ELF_MACHINE_IRELATIVE  R_PPC_IRELATIVE
>
>  /* We define an initialization function to initialize HWCAP/HWCAP2 and
>     platform data so it can be copied into the TCB later.  This is called
> diff --git a/sysdeps/powerpc/powerpc64/dl-machine.h b/sysdeps/powerpc/powerpc64/dl-machine.h
> index 6e3771c91c..1f5d7a0170 100644
> --- a/sysdeps/powerpc/powerpc64/dl-machine.h
> +++ b/sysdeps/powerpc/powerpc64/dl-machine.h
> @@ -304,6 +304,7 @@ BODY_PREFIX "_dl_start_user:\n"                                             \
>
>  /* A reloc type used for ld.so cmdline arg lookups to reject PLT entries.  */
>  #define ELF_MACHINE_JMP_SLOT   R_PPC64_JMP_SLOT
> +#define ELF_MACHINE_IRELATIVE  R_PPC64_IRELATIVE
>
>  /* We define an initialization function to initialize HWCAP/HWCAP2 and
>     platform data so it can be copied into the TCB later.  This is called
> diff --git a/sysdeps/riscv/dl-machine.h b/sysdeps/riscv/dl-machine.h
> index 8c7312ad98..babb52af20 100644
> --- a/sysdeps/riscv/dl-machine.h
> +++ b/sysdeps/riscv/dl-machine.h
> @@ -42,6 +42,7 @@
>  #endif
>
>  #define ELF_MACHINE_JMP_SLOT R_RISCV_JUMP_SLOT
> +#define ELF_MACHINE_IRELATIVE R_RISCV_IRELATIVE
>
>  #define elf_machine_type_class(type)                           \
>    ((ELF_RTYPE_CLASS_PLT * ((type) == ELF_MACHINE_JMP_SLOT      \
> diff --git a/sysdeps/sparc/sparc32/dl-machine.h b/sysdeps/sparc/sparc32/dl-machine.h
> index 73b347bfdf..3771339a04 100644
> --- a/sysdeps/sparc/sparc32/dl-machine.h
> +++ b/sysdeps/sparc/sparc32/dl-machine.h
> @@ -156,6 +156,7 @@ elf_machine_runtime_setup (struct link_map *l, struct r_scope_elem *scope[],
>
>  /* A reloc type used for ld.so cmdline arg lookups to reject PLT entries.  */
>  #define ELF_MACHINE_JMP_SLOT   R_SPARC_JMP_SLOT
> +#define ELF_MACHINE_IRELATIVE  R_SPARC_IRELATIVE
>
>  /* Undo the sub %sp, 6*4, %sp; add %sp, 22*4, %o0 below to get at the
>     value we want in __libc_stack_end.  */
> diff --git a/sysdeps/sparc/sparc64/dl-machine.h b/sysdeps/sparc/sparc64/dl-machine.h
> index 856922680a..8a1a38d170 100644
> --- a/sysdeps/sparc/sparc64/dl-machine.h
> +++ b/sysdeps/sparc/sparc64/dl-machine.h
> @@ -119,6 +119,7 @@ elf_machine_plt_value (struct link_map *map, const Elf64_Rela *reloc,
>
>  /* A reloc type used for ld.so cmdline arg lookups to reject PLT entries.  */
>  #define ELF_MACHINE_JMP_SLOT   R_SPARC_JMP_SLOT
> +#define ELF_MACHINE_IRELATIVE  R_SPARC_IRELATIVE
>
>  /* Set up the loaded object described by L so its unrelocated PLT
>     entries will jump to the on-demand fixup code in dl-runtime.c.  */
> --
> 2.43.0
>
  
Adhemerval Zanella May 11, 2026, 2:41 p.m. UTC | #2
On 08/05/26 20:48, H.J. Lu wrote:
> On Tue, Apr 28, 2026 at 9:42 PM Adhemerval Zanella
> <adhemerval.zanella@linaro.org> wrote:
>>
>> When a shared library is built with -z lazy and its IFUNC resolver calls
>> a PLT function, the dynamic linker can crash.  The resolver runs while
>> the PLT stubs still hold their raw ELF virtual addresses — l_addr has
>> not yet been added — so the call branches to an unmapped address.
>>
>> The old code deferred IRELATIVE entries only to the end of the relocation
>> range currently being processed (via the r2/end2 scan-ahead mechanism in
>> elf_dynamic_do_Rel).  This was sufficient only when both IRELATIVE and the
>> JMP_SLOT entries for the PLT functions it needs are in the same section.
>> On x86-64, aarch64, arm, i386 and most other targets, a file-scope
>> initialiser of the form
>>
>>   int (*fptr)(void) = some_ifunc;
>>
>> causes the linker to place R_*_IRELATIVE in .rela.dyn, while JMP_SLOT
>> entries for any PLT calls made by the resolver live in .rela.plt.
>> Processing .rela.dyn before .rela.plt means the resolver fires before the
>> PLT is usable, regardless of where within .rela.dyn IRELATIVE appears.
>>
>> Fix this by splitting IRELATIVE processing into a separate, explicitly
>> deferred pass.  In elf/do-rel.h:
>>
>>  - Remove the r2/end2 variables and the post-loop IRELATIVE re-scan from
>>    elf_dynamic_do_Rel.  IRELATIVE entries are now always skipped in the
>>    non-bootstrap path.
>>
>>  - Add a new elf_dynamic_do_Rel_irelative function that scans a
>>    relocation range and calls elf_machine_rel/elf_machine_lazy_rel for
>>    IRELATIVE and ifunc relocations.
>>
>> In elf/dynamic-link.h, update _ELF_DYNAMIC_DO_RELOC to use a two-phase
>> approach for non-bootstrap builds unconditionally (regardless of whether
>> ranges[1].size is zero):
>>
>>  Phase 1+2: elf_dynamic_do_Rel over .rela.dyn then .rela.plt — processes
>>             everything except IRELATIVE/STT_GNU_IFUNC.
>>  Phase 3+4: elf_dynamic_do_Rel_irelative over .rela.dyn then .rela.plt —
>>             processes only IRELATIVE, by which point all PLT stubs are
>>             valid.
>>
>> This guarantees that IRELATIVE resolvers can call PLT stubs safely
>> regardless of which section the linker placed R_*_IRELATIVE in.
>>
>> Add ELF_MACHINE_IRELATIVE to the architectures that were missing it so
>> the new skip logic in elf_dynamic_do_Rel is compiled for all targets.
>>
>> I checked on all ABIs that support IFUNC (x86_64, i686, aarch64, arm,
>> loongarch, powerpc, riscv, s390, and sparc), although on some only through
>> qemu-system (which should not matter for this case).
>>
>> It also fixes the issues reported by mold [1], which show an example
>> where IFUNC relocation placement and processing can work differently
>> for different ABIs.
> 
> Just to be clear.  This patch doesn't fully fix:
> 
> https://sourceware.org/bugzilla/show_bug.cgi?id=20673
> 
> This fix only deals with IRELATIVE relocation order within an object.
> Calling an external function from an IFUNC resolver may still crash.
> Am I correct?
Yes, it does not solve all the IFUNC issues raised by Szabolcs [1], nor the
original BZ#21041 issues [2], for which Fangrui created an example [3].  For
those I think we will need something like what Florian did [4], and I am
exploring a similar solution.


[1] https://sourceware.org/legacy-ml/libc-alpha/2015-11/msg00108.html
[2] https://sourceware.org/bugzilla/show_bug.cgi?id=21041
[3] https://maskray.me/blog/2021-01-18-gnu-indirect-function
[4] https://sourceware.org/git/?p=glibc.git;a=shortlog;h=refs/heads/fw/bug21242
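
To make the scope concrete: the patch only changes the order in which one
object's relocation ranges are walked.  A toy model of that ordering is
sketched below.  It is not glibc code; every `toy_*` name is hypothetical, and
it assumes a resolver is unsafe whenever any JMP_SLOT entry of the same object
is still pending.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative relocation kinds, not real ELF constants.  */
enum toy_type { TOY_JMP_SLOT, TOY_IRELATIVE };

/* One relocation range, standing in for .rela.dyn or .rela.plt.  */
struct toy_range { const enum toy_type *rels; size_t count; };

struct toy_state
{
  size_t plt_total;    /* JMP_SLOT entries across all ranges.  */
  size_t plt_done;     /* JMP_SLOT entries processed so far.  */
  size_t early_calls;  /* Resolvers run while the PLT was incomplete.  */
};

static void
toy_apply (struct toy_state *st, enum toy_type t)
{
  if (t == TOY_JMP_SLOT)
    st->plt_done++;
  else if (st->plt_done < st->plt_total)
    st->early_calls++;
}

/* Process either only IRELATIVE entries (want_irelative == 1) or
   everything else (want_irelative == 0) in one range.  */
static void
toy_scan (struct toy_state *st, const struct toy_range *rg, int want_irelative)
{
  for (size_t i = 0; i < rg->count; i++)
    if ((rg->rels[i] == TOY_IRELATIVE) == want_irelative)
      toy_apply (st, rg->rels[i]);
}

static size_t
toy_count_plt (const struct toy_range *ranges, size_t n)
{
  size_t total = 0;
  for (size_t i = 0; i < n; i++)
    for (size_t j = 0; j < ranges[i].count; j++)
      total += ranges[i].rels[j] == TOY_JMP_SLOT;
  return total;
}

/* Old ordering: IRELATIVE deferred only to the end of each range.  */
size_t
toy_per_range (const struct toy_range *ranges, size_t n)
{
  struct toy_state st = { toy_count_plt (ranges, n), 0, 0 };
  for (size_t i = 0; i < n; i++)
    {
      toy_scan (&st, &ranges[i], 0);
      toy_scan (&st, &ranges[i], 1);
    }
  return st.early_calls;
}

/* New ordering: all ranges without IRELATIVE, then IRELATIVE only.  */
size_t
toy_two_phase (const struct toy_range *ranges, size_t n)
{
  struct toy_state st = { toy_count_plt (ranges, n), 0, 0 };
  for (size_t i = 0; i < n; i++)
    toy_scan (&st, &ranges[i], 0);
  for (size_t i = 0; i < n; i++)
    toy_scan (&st, &ranges[i], 1);
  return st.early_calls;
}

/* Layout from the commit message: IRELATIVE in the first range,
   JMP_SLOT entries in the second.  */
const enum toy_type toy_dyn[] = { TOY_IRELATIVE };
const enum toy_type toy_plt[] = { TOY_JMP_SLOT, TOY_JMP_SLOT };
const struct toy_range toy_ranges[] = { { toy_dyn, 1 }, { toy_plt, 2 } };
```

With this layout `toy_per_range (toy_ranges, 2)` counts one resolver that ran
too early, while `toy_two_phase (toy_ranges, 2)` counts none.  The model says
nothing about calls into other objects, which is the part that remains open.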
  
H.J. Lu May 11, 2026, 6:24 p.m. UTC | #3
On Mon, May 11, 2026 at 10:41 PM Adhemerval Zanella Netto
<adhemerval.zanella@linaro.org> wrote:
>
>
>
> Yes, it does not solve all the IFUNC issues raised by Szabolcs [1], nor the
> original BZ#21041 issues [2], for which Fangrui created an example [3].  For
> those I think we will need something like what Florian did [4], and I am
> exploring a similar solution.
>
>
> [1] https://sourceware.org/legacy-ml/libc-alpha/2015-11/msg00108.html
> [2] https://sourceware.org/bugzilla/show_bug.cgi?id=21041
> [3] https://maskray.me/blog/2021-01-18-gnu-indirect-function
> [4] https://sourceware.org/git/?p=glibc.git;a=shortlog;h=refs/heads/fw/bug21242

IFUNC resolvers have many limitations when they call another function.
I don't think we can make it work in all cases.  Can you remove the
reference to BZ 20673 in your commit message?  Your patch supports
any IRELATIVE relocation order.  It should be good enough on its own.
  
Adhemerval Zanella May 11, 2026, 6:39 p.m. UTC | #4
On 11/05/26 15:24, H.J. Lu wrote:
> 
> IFUNC resolvers have many limitations when they call another function.
> I don't think we can make it work in all cases.  Can you remove the
> reference to BZ 20673 in your commit message?  Your patch supports
> any IRELATIVE relocation order.  It should be good enough on its own.
Indeed, although I think there is still room for improvement.  Of the points
Szabolcs raised, I think we can at least make point 1 (calling extern functions
from an IFUNC resolver may not work) work more reliably.  I am working on
a patch that should fix it.

Point 2 (-fstack-protector-all may crash a resolver) could be resolved,
although I think it would be complex for ABIs that place the cookie in the TCB
(since we need to know the static TLS size before sizing the TCB header).  I
think a better approach would be for the compiler to warn and disable this.

Point 3 (libc functions are not safe to call from resolvers under static
linking) could also be resolved, but I haven't dug into it yet.  Most likely it
will require adjusting the static linking initialization order.

Point 4 (limited methods for an IFUNC resolver to learn about the machine)
is somewhat out of scope for the IFUNC implementation.  However, I think if we
can make point 1 work properly, the resolver can rely on calling an external
library to gather better information.

Point 5 (caching results in userspace is unsafe / causes dirty pages) is
an implementation detail of libgcc, so I think it is out of scope for glibc.

Point 6 (lazy binding requires the resolver to be async-signal-safe and
thread-safe) is also quite hard, and I think it falls into the same category as
atfork handlers, where glibc cannot really enforce a programming model
without some compiler help.

I will drop the bug report from the commit message.  Should I send another
version?
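
For context, the pattern that point 1 concerns can be sketched as below.  This
is an illustration, not code from the patch or its tests: `compute`,
`compute_generic`, `resolve_compute` and `compute_ptr` are made-up names, and
the `#if` guard is only an assumption about how to detect a toolchain and libc
with IFUNC support.  The resolver deliberately calls nothing outside its own
translation unit, which is the conservative style that should not depend on
relocation order at all.

```c
#include <assert.h>  /* On glibc this is assumed to define __GLIBC__ too.  */

#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* The only implementation in this sketch; a real resolver would
   choose among several.  */
static int
compute_generic (void)
{
  return 1;
}

/* Resolver: runs while relocations are being processed, so it makes
   no calls through the PLT and touches no other object.  */
static int (*resolve_compute (void)) (void)
{
  return compute_generic;
}

#if defined __GLIBC__ && defined __ELF__ && __has_attribute(ifunc)
/* Real IFUNC: the dynamic linker calls resolve_compute to bind compute.  */
int compute (void) __attribute__ ((ifunc ("resolve_compute")));
#else
/* Fallback for toolchains without IFUNC: dispatch on every call.  */
int
compute (void)
{
  return resolve_compute () ();
}
#endif

/* File-scope initialiser of the form discussed in the commit message.  */
int (*compute_ptr) (void) = compute;
```

Calling `compute ()` or going through `compute_ptr ()` should both return 1.
Replacing the body of `resolve_compute` with a call into another shared object
gives the case that may still crash.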
  
H.J. Lu May 11, 2026, 6:56 p.m. UTC | #5
On Tue, May 12, 2026 at 2:39 AM Adhemerval Zanella Netto
<adhemerval.zanella@linaro.org> wrote:
>
>
>
> Indeed, although I think there is still room for improvement.  Of the points
> Szabolcs raised, I think we can at least make point 1 (calling extern functions
> from an IFUNC resolver may not work) work more reliably.  I am working on
> a patch that should fix it.
>
> Point 2 (-fstack-protector-all may crash a resolver) could be resolved,
> although I think it would be complex for ABIs that place the cookie in the TCB
> (since we need to know the static TLS size before sizing the TCB header).  I
> think a better approach would be for the compiler to warn and disable this.
>
> Point 3 (libc functions are not safe to call from resolvers under static
> linking) could also be resolved, but I haven't dug into it yet.  Most likely it
> will require adjusting the static linking initialization order.
>
> Point 4 (limited methods for an IFUNC resolver to learn about the machine)
> is somewhat out of scope for the IFUNC implementation.  However, I think if we
> can make point 1 work properly, the resolver can rely on calling an external
> library to gather better information.
>
> Point 5 (caching results in userspace is unsafe / causes dirty pages) is
> an implementation detail of libgcc, so I think it is out of scope for glibc.
>
> Point 6 (lazy binding requires the resolver to be async-signal-safe and
> thread-safe) is also quite hard, and I think it falls into the same category as
> atfork handlers, where glibc cannot really enforce a programming model
> without some compiler help.
>
> I will drop the bug report from the commit message.  Should I send another
> version?
>
>

Yes, please.  If you want, you may reference

https://sourceware.org/bugzilla/show_bug.cgi?id=13302

Your patch addresses the same issue from the glibc side.
  

Patch

diff --git a/elf/Makefile b/elf/Makefile
index c835eb8156..377ea2c0cc 100644
--- a/elf/Makefile
+++ b/elf/Makefile
@@ -1252,6 +1252,8 @@  ifeq (yes,$(build-shared))
 tests += \
   tst-ifunc-fault-bindnow \
   tst-ifunc-fault-lazy \
+  tst-ifunc-plt \
+  tst-ifunc-plt-dlopen \
   # tests
 # Note: sysdeps/x86_64/ifuncmain8.c uses ifuncmain8.
 tests-internal += \
@@ -1314,6 +1316,8 @@  modules-names += \
   ifuncmod1 \
   ifuncmod3 \
   ifuncmod6 \
+  tst-ifunc-plt-dep \
+  tst-ifunc-plt-lib \
   # modules-names
 ifeq (no,$(with-lld))
 modules-names += ifuncmod5
@@ -1793,6 +1797,7 @@  unload4mod1.so-no-z-defs = yes
 ifuncmod1.so-no-z-defs = yes
 ifuncmod5.so-no-z-defs = yes
 ifuncmod6.so-no-z-defs = yes
+tst-ifunc-plt-lib.so-no-z-defs = yes
 tst-auditmod9a.so-no-z-defs = yes
 tst-auditmod9b.so-no-z-defs = yes
 tst-nodelete-uniquemod.so-no-z-defs = yes
@@ -2426,6 +2431,13 @@  $(objpfx)tst-ifunc-fault-bindnow.out: $(objpfx)tst-ifunc-fault-bindnow \
    $(objpfx)ld.so
 	$(tst-ifunc-fault-script)
 
+LDFLAGS-tst-ifunc-plt-lib.so = -Wl,-z,lazy
+
+$(objpfx)tst-ifunc-plt-lib.so: $(objpfx)tst-ifunc-plt-dep.so
+$(objpfx)tst-ifunc-plt: $(objpfx)tst-ifunc-plt-lib.so
+$(objpfx)tst-ifunc-plt-dlopen.out: \
+  $(objpfx)tst-ifunc-plt-lib.so $(objpfx)tst-ifunc-plt-dep.so
+
 $(objpfx)tst-unique1.out: $(objpfx)tst-unique1mod1.so \
 			  $(objpfx)tst-unique1mod2.so
 
diff --git a/elf/do-rel.h b/elf/do-rel.h
index d00ffab7e9..e7912d4ed4 100644
--- a/elf/do-rel.h
+++ b/elf/do-rel.h
@@ -23,6 +23,8 @@ 
 
 #ifdef DO_RELA
 # define elf_dynamic_do_Rel		elf_dynamic_do_Rela
+# define elf_dynamic_do_Rel_irelative	elf_dynamic_do_Rela_irelative
+# define elf_dynamic_is_Rel_irelative	elf_dynamic_is_Rela_irelative
 # define Rel				Rela
 # define elf_machine_rel		elf_machine_rela
 # define elf_machine_rel_relative	elf_machine_rela_relative
@@ -34,16 +36,34 @@ 
 			    (void *) (l_addr + relative->r_offset))
 #endif
 
+static __always_inline bool
+elf_dynamic_is_Rel_irelative (const ElfW(Rel) *reloc, const ElfW(Sym) *sym)
+{
+#ifdef ELF_MACHINE_IRELATIVE
+  const unsigned int r_type = ELFW (R_TYPE) (reloc->r_info);
+  return ((sym != NULL
+	   && ELFW(ST_TYPE) (sym->st_info) == STT_GNU_IFUNC
+	   && sym->st_shndx != SHN_UNDEF)
+	  || r_type == ELF_MACHINE_IRELATIVE);
+#else
+  return false;
+#endif
+}
+
 /* Perform the relocations in MAP on the running program image as specified
    by RELTAG, SZTAG.  If LAZY is nonzero, this is the first pass on PLT
    relocations; they should be set up to call _dl_runtime_resolve, rather
-   than fully resolved now.  */
+   than fully resolved now.
+
+   IRELATIVE entries are always skipped (non-bootstrap); they are handled
+   separately by elf_dynamic_do_Rel_irelative after all other relocations
+   for both .rel.dyn and .rel.plt have been processed.  */
 
 static inline void __attribute__ ((always_inline))
 elf_dynamic_do_Rel (struct link_map *map, struct r_scope_elem *scope[],
 		    ElfW(Addr) reladdr, ElfW(Addr) relsize,
 		    __typeof (((ElfW(Dyn) *) 0)->d_un.d_val) nrelative,
-		    int lazy, int skip_ifunc)
+		    int lazy)
 {
   const ElfW(Rel) *relative = (const void *) reladdr;
   const ElfW(Rel) *r = relative + nrelative;
@@ -65,14 +85,9 @@  elf_dynamic_do_Rel (struct link_map *map, struct r_scope_elem *scope[],
       void *const r_addr_arg = (void *) (l_addr + r->r_offset);
       const struct r_found_version *rversion = &map->l_versions[ndx];
 
-      elf_machine_rel (map, scope, r, sym, rversion, r_addr_arg, skip_ifunc);
+      elf_machine_rel (map, scope, r, sym, rversion, r_addr_arg, 0);
     }
 #else /* !RTLD_BOOTSTRAP */
-# if defined ELF_MACHINE_IRELATIVE
-  const ElfW(Rel) *r2 = NULL;
-  const ElfW(Rel) *end2 = NULL;
-# endif
-
 #if !defined DO_RELA || !defined ELF_MACHINE_PLT_REL
   /* We never bind lazily during ld.so bootstrap.  Unfortunately gcc is
      not clever enough to see through all the function calls to realize
@@ -81,23 +96,12 @@  elf_dynamic_do_Rel (struct link_map *map, struct r_scope_elem *scope[],
     {
       /* Doing lazy PLT relocations; they need very little info.  */
       for (; r < end; ++r)
-# ifdef ELF_MACHINE_IRELATIVE
-	if (ELFW(R_TYPE) (r->r_info) == ELF_MACHINE_IRELATIVE)
-	  {
-	    if (r2 == NULL)
-	      r2 = r;
-	    end2 = r;
-	  }
-	else
-# endif
-	  elf_machine_lazy_rel (map, scope, l_addr, r, skip_ifunc);
-
-# ifdef ELF_MACHINE_IRELATIVE
-      if (r2 != NULL)
-	for (; r2 <= end2; ++r2)
-	  if (ELFW(R_TYPE) (r2->r_info) == ELF_MACHINE_IRELATIVE)
-	    elf_machine_lazy_rel (map, scope, l_addr, r2, skip_ifunc);
-# endif
+	{
+	  const ElfW (Sym) *sym = &symtab[ELFW (R_SYM) (r->r_info)];
+	  if (elf_dynamic_is_Rel_irelative (r, sym))
+	    continue;
+	  elf_machine_lazy_rel (map, scope, l_addr, r, 0);
+	}
     }
   else
 #endif
@@ -125,18 +129,10 @@  elf_dynamic_do_Rel (struct link_map *map, struct r_scope_elem *scope[],
 	      const ElfW(Sym) *sym = &symtab[ELFW(R_SYM) (r->r_info)];
 	      void *const r_addr_arg = (void *) (l_addr + r->r_offset);
 	      const struct r_found_version *rversion = &map->l_versions[ndx];
-#if defined ELF_MACHINE_IRELATIVE
-	      if (ELFW(R_TYPE) (r->r_info) == ELF_MACHINE_IRELATIVE)
-		{
-		  if (r2 == NULL)
-		    r2 = r;
-		  end2 = r;
-		  continue;
-		}
-#endif
 
-	      elf_machine_rel (map, scope, r, sym, rversion, r_addr_arg,
-			       skip_ifunc);
+	      if (elf_dynamic_is_Rel_irelative (r, sym))
+		continue;
+	      elf_machine_rel (map, scope, r, sym, rversion, r_addr_arg, 0);
 #if defined SHARED
 	      if (ELFW(R_TYPE) (r->r_info) == ELF_MACHINE_JMP_SLOT
 		  && GLRO(dl_naudit) > 0)
@@ -150,21 +146,6 @@  elf_dynamic_do_Rel (struct link_map *map, struct r_scope_elem *scope[],
 		}
 #endif
 	    }
-
-#if defined ELF_MACHINE_IRELATIVE
-	  if (r2 != NULL)
-	    for (; r2 <= end2; ++r2)
-	      if (ELFW(R_TYPE) (r2->r_info) == ELF_MACHINE_IRELATIVE)
-		{
-		  ElfW(Half) ndx
-		    = version[ELFW(R_SYM) (r2->r_info)] & 0x7fff;
-		  elf_machine_rel (map, scope, r2,
-				   &symtab[ELFW(R_SYM) (r2->r_info)],
-				   &map->l_versions[ndx],
-				   (void *) (l_addr + r2->r_offset),
-				   skip_ifunc);
-		}
-#endif
 	}
       else
 	{
@@ -172,17 +153,10 @@  elf_dynamic_do_Rel (struct link_map *map, struct r_scope_elem *scope[],
 	    {
 	      const ElfW(Sym) *sym = &symtab[ELFW(R_SYM) (r->r_info)];
 	      void *const r_addr_arg = (void *) (l_addr + r->r_offset);
-# ifdef ELF_MACHINE_IRELATIVE
-	      if (ELFW(R_TYPE) (r->r_info) == ELF_MACHINE_IRELATIVE)
-		{
-		  if (r2 == NULL)
-		    r2 = r;
-		  end2 = r;
-		  continue;
-		}
-# endif
-	      elf_machine_rel (map, scope, r, sym, NULL, r_addr_arg,
-			       skip_ifunc);
+
+	      if (elf_dynamic_is_Rel_irelative (r, sym))
+		continue;
+	      elf_machine_rel (map, scope, r, sym, NULL, r_addr_arg, 1);
 # if defined SHARED
 	      if (ELFW(R_TYPE) (r->r_info) == ELF_MACHINE_JMP_SLOT
 		  && GLRO(dl_naudit) > 0)
@@ -197,21 +171,84 @@  elf_dynamic_do_Rel (struct link_map *map, struct r_scope_elem *scope[],
 		}
 # endif
 	    }
-
-# ifdef ELF_MACHINE_IRELATIVE
-	  if (r2 != NULL)
-	    for (; r2 <= end2; ++r2)
-	      if (ELFW(R_TYPE) (r2->r_info) == ELF_MACHINE_IRELATIVE)
-		elf_machine_rel (map, scope, r2, &symtab[ELFW(R_SYM) (r2->r_info)],
-				 NULL, (void *) (l_addr + r2->r_offset),
-				 skip_ifunc);
-# endif
 	}
     }
 #endif /* !RTLD_BOOTSTRAP */
 }
 
+/* Process only IRELATIVE entries in the relocation range
+   [reladdr, reladdr+relsize).  When lazy is non-zero the PLT lazy-binding
+   path (elf_machine_lazy_rel) is used, otherwise the full non-lazy path
+   (elf_machine_rel) is used.
+
+   Called by _ELF_DYNAMIC_DO_RELOC after all non-IRELATIVE relocations have
+   been processed for both .rela.dyn and .rela.plt, so that IRELATIVE
+   resolvers may call PLT stubs safely regardless of which section the linker
+   placed R_*_IRELATIVE in.  */
+static __always_inline void
+elf_dynamic_do_Rel_irelative (struct link_map *map,
+			      struct r_scope_elem *scope[],
+			      ElfW(Addr) reladdr, ElfW(Addr) relsize,
+			      int lazy, int skip_ifunc)
+{
+# ifdef ELF_MACHINE_IRELATIVE
+  const ElfW(Rel) *r = (const void *) reladdr;
+  const ElfW(Rel) *end = (const void *) (reladdr + relsize);
+  ElfW(Addr) l_addr = map->l_addr;
+  const ElfW(Sym) *const symtab = (const void *) D_PTR (map, l_info[DT_SYMTAB]);
+
+  if (lazy)
+    {
+      for (; r < end; ++r)
+	{
+	  const ElfW (Sym) *sym = &symtab[ELFW (R_SYM) (r->r_info)];
+	  if (!elf_dynamic_is_Rel_irelative (r, sym))
+	    continue;
+	  elf_machine_lazy_rel (map, scope, l_addr, r, skip_ifunc);
+	}
+    }
+  else
+    {
+      if (map->l_info[VERSYMIDX (DT_VERSYM)])
+	{
+	  const ElfW(Half) *const version =
+	    (const void *) D_PTR (map, l_info[VERSYMIDX (DT_VERSYM)]);
+
+	  for (; r < end; ++r)
+	    {
+	      const ElfW (Sym) *sym = &symtab[ELFW (R_SYM) (r->r_info)];
+	      if (!elf_dynamic_is_Rel_irelative (r, sym))
+		continue;
+
+	      ElfW(Half) ndx = version[ELFW(R_SYM) (r->r_info)] & 0x7fff;
+	      elf_machine_rel (map, scope, r,
+			       &symtab[ELFW(R_SYM) (r->r_info)],
+			       &map->l_versions[ndx],
+			       (void *) (l_addr + r->r_offset),
+			       skip_ifunc);
+	    }
+	}
+      else
+	{
+	  for (; r < end; ++r)
+	    {
+	      const ElfW (Sym) *sym = &symtab[ELFW (R_SYM) (r->r_info)];
+	      if (!elf_dynamic_is_Rel_irelative (r, sym))
+		continue;
+
+	      elf_machine_rel (map, scope, r,
+			       &symtab[ELFW(R_SYM) (r->r_info)],
+			       NULL,
+			       (void *) (l_addr + r->r_offset),
+			       skip_ifunc);
+	    }
+	}
+    }
+# endif
+}
+
 #undef elf_dynamic_do_Rel
+#undef elf_dynamic_do_Rel_irelative
 #undef Rel
 #undef elf_machine_rel
 #undef elf_machine_rel_relative
diff --git a/elf/dynamic-link.h b/elf/dynamic-link.h
index a46f36b8d4..2055d910c6 100644
--- a/elf/dynamic-link.h
+++ b/elf/dynamic-link.h
@@ -78,7 +78,8 @@  elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[],
    consumes precisely the very end of the DT_REL*, or DT_JMPREL and DT_REL*
    are completely separate and there is a gap between them.  */
 
-# define _ELF_DYNAMIC_DO_RELOC(RELOC, reloc, map, scope, do_lazy, skip_ifunc, test_rel) \
+# define _ELF_DYNAMIC_DO_RELOC(RELOC, reloc, map, scope, do_lazy, skip_ifunc, \
+			       test_rel)				      \
   do {									      \
     struct { ElfW(Addr) start, size;					      \
 	     __typeof (((ElfW(Dyn) *) 0)->d_un.d_val) nrelative; int lazy; }  \
@@ -118,13 +119,33 @@  elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[],
 	  }								      \
       }									      \
 									      \
-      for (int ranges_index = 0; ranges_index < 2; ++ranges_index)	      \
-        elf_dynamic_do_##reloc ((map), scope,				      \
-				ranges[ranges_index].start,		      \
-				ranges[ranges_index].size,		      \
-				ranges[ranges_index].nrelative,		      \
-				ranges[ranges_index].lazy,		      \
-				skip_ifunc);				      \
+      /* Defer all IRELATIVE relocations until after all non-IRELATIVE	      \
+	 relocations (including PLT lazy-binding setup) have been processed   \
+	 for both sections.  This ensures IRELATIVE resolvers can call PLT    \
+	 stubs safely regardless of which section R_*_IRELATIVE was placed in \
+	 by the linker.  */						      \
+      if (!DO_RTLD_BOOTSTRAP)						      \
+	{								      \
+	  for (int ranges_index = 0; ranges_index < 2; ++ranges_index)	      \
+	    elf_dynamic_do_##reloc ((map), scope,			      \
+				    ranges[ranges_index].start,		      \
+				    ranges[ranges_index].size,		      \
+				    ranges[ranges_index].nrelative,	      \
+				    ranges[ranges_index].lazy);		      \
+	  for (int ranges_index = 0; ranges_index < 2; ++ranges_index)	      \
+	    elf_dynamic_do_##reloc##_irelative ((map), scope,		      \
+						ranges[ranges_index].start,   \
+						ranges[ranges_index].size,    \
+						ranges[ranges_index].lazy,    \
+						skip_ifunc);		      \
+	}								      \
+      else								      \
+	for (int ranges_index = 0; ranges_index < 2; ++ranges_index)	      \
+	  elf_dynamic_do_##reloc ((map), scope,				      \
+				  ranges[ranges_index].start,		      \
+				  ranges[ranges_index].size,		      \
+				  ranges[ranges_index].nrelative,	      \
+				  ranges[ranges_index].lazy);		      \
   } while (0)
 
 # if ELF_MACHINE_NO_REL || ELF_MACHINE_NO_RELA
diff --git a/elf/tst-ifunc-plt-dep.c b/elf/tst-ifunc-plt-dep.c
new file mode 100644
index 0000000000..9cf2b1b0b2
--- /dev/null
+++ b/elf/tst-ifunc-plt-dep.c
@@ -0,0 +1,23 @@ 
+/* Dependency library for tst-ifunc-plt.
+   Copyright (C) 2026 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+int
+get_value (void)
+{
+  return 42;
+}
diff --git a/elf/tst-ifunc-plt-dlopen.c b/elf/tst-ifunc-plt-dlopen.c
new file mode 100644
index 0000000000..ed9d69b69c
--- /dev/null
+++ b/elf/tst-ifunc-plt-dlopen.c
@@ -0,0 +1,46 @@ 
+/* Test for BZ #20673 via dlopen.
+   Copyright (C) 2026 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+
+/* dlopen tst-ifunc-plt-lib.so with RTLD_LAZY and verify that the IFUNC
+   resolver (which calls get_value() via PLT) ran successfully.  This
+   exercises the same _dl_relocate_object code path as startup loading
+   but via the dlopen entry point.  */
+
+#include <support/xdlfcn.h>
+#include <support/check.h>
+
+typedef int (*fn_t) (void);
+
+static int
+do_test (void)
+{
+  void *handle = xdlopen ("tst-ifunc-plt-lib.so", RTLD_LAZY | RTLD_LOCAL);
+
+  fn_t compute_a = (fn_t) xdlsym (handle, "compute_a");
+  TEST_COMPARE (compute_a (), 1);
+
+  fn_t compute_b = (fn_t) xdlsym (handle, "compute_b");
+  TEST_COMPARE (compute_b (), 2);
+
+  xdlclose (handle);
+
+  return 0;
+}
+
+#include <support/test-driver.c>
diff --git a/elf/tst-ifunc-plt-lib.c b/elf/tst-ifunc-plt-lib.c
new file mode 100644
index 0000000000..650231ff28
--- /dev/null
+++ b/elf/tst-ifunc-plt-lib.c
@@ -0,0 +1,56 @@ 
+/* Shared library for tst-ifunc-plt and tst-ifunc-plt-dlopen (bug 20673).
+   Two static IFUNCs whose resolvers both call get_value() via PLT.
+   Copyright (C) 2026 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+
+/* Both resolvers call get_value() via PLT (one JUMP_SLOT entry in
+   .rel{a}.plt).  This verifies that every IRELATIVE entry is deferred
+   until after .rel{a}.plt has been processed, not just the first one.  */
+
+#include <stddef.h>
+
+extern int get_value (void);
+
+static int
+impl_a (void)
+{
+  return 1;
+}
+
+static int
+impl_b (void)
+{
+  return 2;
+}
+
+static int (*
+resolve_a (void)) (void)
+{
+  return get_value () == 42 ? impl_a : NULL;
+}
+
+static int (*
+resolve_b (void)) (void)
+{
+  return get_value () == 42 ? impl_b : NULL;
+}
+
+/* The test is only built for $(have-ifunc), so we can assume HAVE_GCC_IFUNC
+   here.  */
+int compute_a (void) __attribute__ ((ifunc ("resolve_a")));
+int compute_b (void) __attribute__ ((ifunc ("resolve_b")));
diff --git a/elf/tst-ifunc-plt.c b/elf/tst-ifunc-plt.c
new file mode 100644
index 0000000000..382f90d852
--- /dev/null
+++ b/elf/tst-ifunc-plt.c
@@ -0,0 +1,38 @@ 
+/* Test for BZ #20673.
+   Copyright (C) 2026 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+
+/* tst-ifunc-plt-lib.so defines two static IFUNCs (compute_a and
+   compute_b), each producing an R_*_IRELATIVE in .rel{a}.dyn, with both
+   resolvers calling get_value() via PLT.  The test verifies that both
+   IRELATIVEs are deferred until after .rel{a}.plt is processed.  */
+
+#include <support/check.h>
+
+extern int compute_a (void);
+extern int compute_b (void);
+
+static int
+do_test (void)
+{
+  TEST_COMPARE (compute_a (), 1);
+  TEST_COMPARE (compute_b (), 2);
+  return 0;
+}
+
+#include <support/test-driver.c>
diff --git a/sysdeps/aarch64/dl-machine.h b/sysdeps/aarch64/dl-machine.h
index 21af8bc56e..15651c62f3 100644
--- a/sysdeps/aarch64/dl-machine.h
+++ b/sysdeps/aarch64/dl-machine.h
@@ -119,6 +119,7 @@  elf_machine_runtime_setup (struct link_map *l, struct r_scope_elem *scope[],
    | (((type) == R_AARCH64_COPY) * ELF_RTYPE_CLASS_COPY))
 
 #define ELF_MACHINE_JMP_SLOT	R_AARCH64_JUMP_SLOT
+#define ELF_MACHINE_IRELATIVE	R_AARCH64_IRELATIVE
 
 #define DL_PLATFORM_INIT dl_platform_init ()
 
diff --git a/sysdeps/arm/dl-machine.h b/sysdeps/arm/dl-machine.h
index e0065ce73c..15cced693e 100644
--- a/sysdeps/arm/dl-machine.h
+++ b/sysdeps/arm/dl-machine.h
@@ -190,6 +190,7 @@  _dl_start_user:\n\
 
 /* A reloc type used for ld.so cmdline arg lookups to reject PLT entries.  */
 #define ELF_MACHINE_JMP_SLOT	R_ARM_JUMP_SLOT
+#define ELF_MACHINE_IRELATIVE	R_ARM_IRELATIVE
 
 /* We define an initialization functions.  This is called very early in
    _dl_sysdep_start.  */
diff --git a/sysdeps/i386/dl-machine.h b/sysdeps/i386/dl-machine.h
index 6657f68791..dd49079d75 100644
--- a/sysdeps/i386/dl-machine.h
+++ b/sysdeps/i386/dl-machine.h
@@ -190,6 +190,7 @@  _dl_start_user:\n\
 
 /* A reloc type used for ld.so cmdline arg lookups to reject PLT entries.  */
 #define ELF_MACHINE_JMP_SLOT	R_386_JMP_SLOT
+#define ELF_MACHINE_IRELATIVE	R_386_IRELATIVE
 
 /* We define an initialization functions.  This is called very early in
    _dl_sysdep_start.  */
diff --git a/sysdeps/powerpc/powerpc32/dl-machine.h b/sysdeps/powerpc/powerpc32/dl-machine.h
index d787298636..e07a44a5a5 100644
--- a/sysdeps/powerpc/powerpc32/dl-machine.h
+++ b/sysdeps/powerpc/powerpc32/dl-machine.h
@@ -146,6 +146,7 @@  __elf_preferred_address(struct link_map *loader, size_t maplength,
 
 /* A reloc type used for ld.so cmdline arg lookups to reject PLT entries.  */
 #define ELF_MACHINE_JMP_SLOT	R_PPC_JMP_SLOT
+#define ELF_MACHINE_IRELATIVE	R_PPC_IRELATIVE
 
 /* We define an initialization function to initialize HWCAP/HWCAP2 and
    platform data so it can be copied into the TCB later.  This is called
diff --git a/sysdeps/powerpc/powerpc64/dl-machine.h b/sysdeps/powerpc/powerpc64/dl-machine.h
index 6e3771c91c..1f5d7a0170 100644
--- a/sysdeps/powerpc/powerpc64/dl-machine.h
+++ b/sysdeps/powerpc/powerpc64/dl-machine.h
@@ -304,6 +304,7 @@  BODY_PREFIX "_dl_start_user:\n"						\
 
 /* A reloc type used for ld.so cmdline arg lookups to reject PLT entries.  */
 #define ELF_MACHINE_JMP_SLOT	R_PPC64_JMP_SLOT
+#define ELF_MACHINE_IRELATIVE	R_PPC64_IRELATIVE
 
 /* We define an initialization function to initialize HWCAP/HWCAP2 and
    platform data so it can be copied into the TCB later.  This is called
diff --git a/sysdeps/riscv/dl-machine.h b/sysdeps/riscv/dl-machine.h
index 8c7312ad98..babb52af20 100644
--- a/sysdeps/riscv/dl-machine.h
+++ b/sysdeps/riscv/dl-machine.h
@@ -42,6 +42,7 @@ 
 #endif
 
 #define ELF_MACHINE_JMP_SLOT R_RISCV_JUMP_SLOT
+#define ELF_MACHINE_IRELATIVE R_RISCV_IRELATIVE
 
 #define elf_machine_type_class(type)				\
   ((ELF_RTYPE_CLASS_PLT * ((type) == ELF_MACHINE_JMP_SLOT	\
diff --git a/sysdeps/sparc/sparc32/dl-machine.h b/sysdeps/sparc/sparc32/dl-machine.h
index 73b347bfdf..3771339a04 100644
--- a/sysdeps/sparc/sparc32/dl-machine.h
+++ b/sysdeps/sparc/sparc32/dl-machine.h
@@ -156,6 +156,7 @@  elf_machine_runtime_setup (struct link_map *l, struct r_scope_elem *scope[],
 
 /* A reloc type used for ld.so cmdline arg lookups to reject PLT entries.  */
 #define ELF_MACHINE_JMP_SLOT	R_SPARC_JMP_SLOT
+#define ELF_MACHINE_IRELATIVE	R_SPARC_IRELATIVE
 
 /* Undo the sub %sp, 6*4, %sp; add %sp, 22*4, %o0 below to get at the
    value we want in __libc_stack_end.  */
diff --git a/sysdeps/sparc/sparc64/dl-machine.h b/sysdeps/sparc/sparc64/dl-machine.h
index 856922680a..8a1a38d170 100644
--- a/sysdeps/sparc/sparc64/dl-machine.h
+++ b/sysdeps/sparc/sparc64/dl-machine.h
@@ -119,6 +119,7 @@  elf_machine_plt_value (struct link_map *map, const Elf64_Rela *reloc,
 
 /* A reloc type used for ld.so cmdline arg lookups to reject PLT entries.  */
 #define ELF_MACHINE_JMP_SLOT	R_SPARC_JMP_SLOT
+#define ELF_MACHINE_IRELATIVE	R_SPARC_IRELATIVE
 
 /* Set up the loaded object described by L so its unrelocated PLT
    entries will jump to the on-demand fixup code in dl-runtime.c.  */