elf: Avoid nested functions in the loader (x86-64) [BZ #27220]

Message ID 20210820080513.3004013-1-maskray@google.com
State Superseded
Series elf: Avoid nested functions in the loader (x86-64) [BZ #27220]

Checks

Context                 Check    Description
dj/TryBot-apply_patch   success  Patch applied to master at the time it was sent
dj/TryBot-32bit         success  Build for i686

Commit Message

Fangrui Song Aug. 20, 2021, 8:05 a.m. UTC
dynamic-link.h is included more than once in some elf/ files (rtld.c,
dl-conflict.c, dl-reloc.c, dl-reloc-static-pie.c) and uses GCC nested
functions. This harms readability, and the use of nested functions is
the biggest obstacle to building with CC=clang (which doesn't support
the feature).

To un-nest elf_machine_rela, the key idea is to pass the variable in
the containing scope as an extra argument.
Stan Shebs implemented this in the google/grte/v5-2.27/master branch.
This patch squashes and cleans up the commits.
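
As a minimal illustration of that idea (not code from this patch), the
sketch below un-nests a helper by turning the variable it used to read
from the enclosing function into an explicit parameter. The names
`apply_one`, `apply_all` and `base` are hypothetical.

```c
#include <stddef.h>

/* With GCC nested functions, the helper could be defined inside
   apply_all and read `base' from the enclosing frame:

     void apply_all (long *slots, size_t n, long base)
     {
       void apply_one (long *slot) { *slot += base; }
       for (size_t i = 0; i < n; ++i)
         apply_one (&slots[i]);
     }

   Un-nested, the helper is an ordinary static function and the former
   enclosing-scope variable arrives as an extra argument.  */
static inline void
apply_one (long *slot, long base)
{
  *slot += base;
}

/* Add BASE to each of the N entries in SLOTS.  */
void
apply_all (long *slots, size_t n, long base)
{
  for (size_t i = 0; i < n; ++i)
    apply_one (&slots[i], base);
}
```

The un-nested form is plain ISO C, so it does not depend on the GCC
extension.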

This patch just fixes x86-64: x86-64 uses the `#ifndef NESTING` branch
to avoid nested functions. The `#ifdef NESTING` branch is used by all
other ports whose dl-machine.h files haven't been migrated.

For the time being, there is some duplicated code. `#ifdef NESTING`
dispatches can be removed in the future when all arches are migrated.
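
The dispatch pattern can be sketched in isolation as follows. This is a
simplified stand-in, not the glibc code: `struct map`, `relocate_one`
and `relocate_all` are hypothetical names, and only the shape of the
conditional extra parameter mirrors the patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct link_map.  */
struct map
{
  int id;
};

/* Without NESTING the function takes an extra trailing parameter;
   with NESTING it keeps the legacy signature.  */
static int
relocate_one (struct map *m, int skip
#ifndef NESTING
	      , struct map *boot_map
#endif
	      )
{
  (void) skip;
#ifndef NESTING
  /* Prefer the explicitly passed map when there is one.  */
  if (boot_map != NULL)
    return boot_map->id;
#endif
  return m->id;
}

/* Every call site has to carry the same conditional.  */
static int
relocate_all (struct map *m)
{
  return relocate_one (m, 0
#ifndef NESTING
		       , NULL
#endif
		       );
}
```

Each migrated call site repeats the conditional, which is where the
duplication comes from until all ports take the extra argument.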

I can fix aarch64, powerpc64, riscv, and other arches subsequently.
Migrating all ports at once is just too risky. Help from arch
maintainers would also be appreciated.

Tested on x86_64-linux-gnu (!NESTING) and aarch64-linux-gnu (NESTING).
---
 elf/dl-conflict.c           | 33 ++++++++++++-
 elf/dl-reloc-static-pie.c   | 15 +++++-
 elf/dl-reloc.c              | 51 ++++++++++++++++++-
 elf/do-rel.h                | 30 ++++++++++--
 elf/dynamic-link.h          | 97 ++++++++++++++++++++++++++++++++++++-
 elf/get-dynamic-info.h      | 10 ++++
 elf/rtld.c                  | 30 +++++++++++-
 sysdeps/x86_64/dl-machine.h | 16 +++++-
 8 files changed, 269 insertions(+), 13 deletions(-)
  

Comments

Joseph Myers Aug. 20, 2021, 4:52 p.m. UTC | #1
On Fri, 20 Aug 2021, Fangrui Song via Libc-alpha wrote:

> This patch just fixes x86-64: x86-64 uses the `#ifndef NESTING` branch
> to avoid nested functions. The `#ifdef NESTING` branch is used by all
> other ports whose dl-machine.h files haven't been migrated.
> 
> For the time being, there is some duplicated code. `#ifdef NESTING`
> dispatches can be removed in the future when all arches are migrated.

The code certainly looks a lot messier after this patch.  Could you please 
send a patch (working for x86_64 only, I suppose) showing what things 
would look like without those conditionals?  There are some key things a 
nested function removal patch ought to satisfy:

* The source code looks at least as clean, to human readers, after the 
patch as before.  If anything is factored out to a new function or macro, 
there should be the usual API comments on that function or macro 
definition explaining its semantics; if any existing function or macro 
gets new parameters to avoid the implicit passing of data to nested 
functions, the API comments on that function or macro need updating to 
describe the semantics of those parameters.

* The generated object code is of similar or better quality (have you 
compared it before and after the patch?).

Also, when proposing a patch that only updates some architectures for an 
issue where all architectures ought to be updated, please give the 
proposed text you would add to 
<https://sourceware.org/glibc/wiki/PortStatus> if the patch is accepted, 
with instructions for port maintainers on what would be involved in 
updating their ports.

> +#ifndef NESTING
> +
> +    /* String table object symbols.  */
> +
> +static struct link_map *glob_l;
> +static struct r_scope_elem **glob_scope;
> +static const char *glob_strtab;

Writable static variables generally look suspect; there should be as 
little writable global (as opposed to thread-local) data in glibc as 
possible.  At least I'd expect comments on each variable describing its 
semantics, and some kind of comment explaining why it's thread-safe to use 
such variables (e.g. saying what lock, acquired where, is used to prevent 
multiple threads from accessing them at once, or what other mechanism is 
used for thread safety).
  
Fangrui Song Aug. 21, 2021, 4:11 a.m. UTC | #2
Thanks for the reply!

On 2021-08-20, Joseph Myers wrote:
>On Fri, 20 Aug 2021, Fangrui Song via Libc-alpha wrote:
>
>> This patch just fixes x86-64: x86-64 uses the `#ifndef NESTING` branch
>> to avoid nested functions. The `#ifdef NESTING` branch is used by all
>> other ports whose dl-machine.h files haven't been migrated.
>>
>> For the time being, there is some duplicated code. `#ifdef NESTING`
>> dispatches can be removed in the future when all arches are migrated.
>
>The code certainly looks a lot messier after this patch.  Could you please
>send a patch (working for x86_64 only, I suppose) showing what things
>would look like without those conditionals?  There are some key things a
>nested function removal patch ought to satisfy:
>
>* The source code looks at least as clean, to human readers, after the
>patch as before.  If anything is factored out to a new function or macro,
>there should be the usual API comments on that function or macro
>definition explaining its semantics; if any existing function or macro
>gets new parameters to avoid the implicit passing of data to nested
>functions, the API comments on that function or macro need updating to
>describe the semantics of those parameters.

I pushed the clean form (without the NESTING dispatches) to
https://sourceware.org/git/?p=glibc.git;a=shortlog;h=refs/heads/maskray/unnest

Its diffstat: 8 files changed, 101 insertions(+), 85 deletions(-)

>* The generated object code is of similar or better quality (have you
>compared it before and after the patch?).

The object file is slightly larger.

% readelf -WS /tmp/c/ld.so.old | grep text
   [12] .text             PROGBITS        0000000000001050 001050 023bae 00  AX  0   0 16
% readelf -WS /tmp/c/ld.so.new | grep text
   [12] .text             PROGBITS        0000000000001050 001050 023cce 00  AX  0   0 16

In dl-reloc.c, accessing the three internal-linkage global variables takes
more instructions: the file gains 28 instructions. The cost looks negligible
next to the function call to _dl_lookup_symbol_x.
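
For reference, the two .text sizes in the readelf output above differ by
0x120 bytes, i.e. 288 bytes or roughly 0.2%. The small sketch below just
redoes that arithmetic; the constants are copied from the size column,
and the function names are made up for illustration.

```c
/* .text section sizes copied from the readelf output above.  */
enum
{
  TEXT_OLD = 0x23bae,
  TEXT_NEW = 0x23cce
};

/* Absolute growth of .text in bytes.  */
static long
text_growth_bytes (void)
{
  return (long) TEXT_NEW - (long) TEXT_OLD;
}

/* Growth of .text relative to the old size, in percent.  */
static double
text_growth_percent (void)
{
  return 100.0 * (double) (TEXT_NEW - TEXT_OLD) / (double) TEXT_OLD;
}
```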

>Also, when proposing a patch that only updates some architectures for an
>issue where all architectures ought to be updated, please give the
>proposed text you would add to
><https://sourceware.org/glibc/wiki/PortStatus> if the patch is accepted,
>with instructions for port maintainers on what would be involved in
>updating their ports.

Thanks for the instructions.

>> +#ifndef NESTING
>> +
>> +    /* String table object symbols.  */
>> +
>> +static struct link_map *glob_l;
>> +static struct r_scope_elem **glob_scope;
>> +static const char *glob_strtab;
>
>Writable static variables generally look suspect; there should be as
>little writable global (as opposed to thread-local) data in glibc as
>possible.  At least I'd expect comments on each variable describing its
>semantics, and some kind of comment explaining why it's thread-safe to use
>such variables (e.g. saying what lock, acquired where, is used to prevent
>multiple threads from accessing them at once, or what other mechanism is
>used for thread safety).

I renamed the variables and added comments.

+/* Used by RESOLVE_MAP. _dl_relocate_object is either called at init time or
+ * by dlopen with a global lock, so the variables cannot be accessed
+ * concurrently.  */
+static struct link_map *cur_l;
+static struct r_scope_elem **cur_scope;
+static const char *cur_strtab;

The new commit is on https://sourceware.org/git/?p=glibc.git;a=shortlog;h=refs/heads/maskray/nesting
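
The thread-safety argument in that comment can be illustrated with a
generic sketch: file-scope state is only written and read while one lock
is held, so no two threads can observe it at the same time. This is not
the loader code. A POSIX mutex stands in for the loader's own locking,
and `struct object`, `load_lock`, `cur_obj`, `lookup_name` and
`relocate_object` are hypothetical names.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the object being relocated.  */
struct object
{
  const char *strtab;
};

/* Stand-in for the global lock that serializes loading.  */
static pthread_mutex_t load_lock = PTHREAD_MUTEX_INITIALIZER;

/* Object currently being relocated.  Only valid while load_lock is
   held by the thread running relocate_object.  */
static const struct object *cur_obj;

/* Helper that reads the file-scope state instead of a captured local.  */
static const char *
lookup_name (size_t offset)
{
  return cur_obj->strtab + offset;
}

/* Publish OBJ in cur_obj, run the helper, and clear the state again,
   all under load_lock.  */
const char *
relocate_object (const struct object *obj, size_t offset)
{
  pthread_mutex_lock (&load_lock);
  cur_obj = obj;
  const char *name = lookup_name (offset);
  cur_obj = NULL;
  pthread_mutex_unlock (&load_lock);
  return name;
}
```

The sketch only covers callers that go through `relocate_object`; any
code that reaches `cur_obj` without the lock would break the argument.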
  

Patch

diff --git a/elf/dl-conflict.c b/elf/dl-conflict.c
index 31a2f90770..832cba65c1 100644
--- a/elf/dl-conflict.c
+++ b/elf/dl-conflict.c
@@ -27,6 +27,25 @@ 
 #include <sys/types.h>
 #include "dynamic-link.h"
 
+#ifndef NESTING
+
+
+    /* This macro is used as a callback from the ELF_DYNAMIC_RELOCATE code.  */
+#define RESOLVE_MAP(ref, version, flags) (*ref = NULL, NULL)
+#define RESOLVE(ref, version, flags) (*ref = NULL, 0)
+#define RESOLVE_CONFLICT_FIND_MAP(map, r_offset) \
+  do {									      \
+    while ((resolve_conflict_map->l_map_end < (ElfW(Addr)) (r_offset))	      \
+	   || (resolve_conflict_map->l_map_start > (ElfW(Addr)) (r_offset)))  \
+      resolve_conflict_map = resolve_conflict_map->l_next;		      \
+									      \
+    (map) = resolve_conflict_map;					      \
+  } while (0)
+
+#include "dynamic-link.h"
+
+#endif /* NESTING */
+
 void
 _dl_resolve_conflicts (struct link_map *l, ElfW(Rela) *conflict,
 		       ElfW(Rela) *conflictend)
@@ -39,6 +58,8 @@  _dl_resolve_conflicts (struct link_map *l, ElfW(Rela) *conflict,
     /* Do the conflict relocation of the object and library GOT and other
        data.  */
 
+#ifdef NESTING
+
     /* This macro is used as a callback from the ELF_DYNAMIC_RELOCATE code.  */
 #define RESOLVE_MAP(ref, version, flags) (*ref = NULL, NULL)
 #define RESOLVE(ref, version, flags) (*ref = NULL, 0)
@@ -51,13 +72,19 @@  _dl_resolve_conflicts (struct link_map *l, ElfW(Rela) *conflict,
     (map) = resolve_conflict_map;					      \
   } while (0)
 
+#endif /* NESTING */
+
     /* Prelinking makes no sense for anything but the main namespace.  */
     assert (l->l_ns == LM_ID_BASE);
     struct link_map *resolve_conflict_map __attribute__ ((__unused__))
       = GL(dl_ns)[LM_ID_BASE]._ns_loaded;
 
+#ifdef NESTING
+
 #include "dynamic-link.h"
 
+#endif /* NESTING */
+
     /* Override these, defined in dynamic-link.h.  */
 #undef CHECK_STATIC_TLS
 #define CHECK_STATIC_TLS(ref_map, sym_map) ((void) 0)
@@ -68,7 +95,11 @@  _dl_resolve_conflicts (struct link_map *l, ElfW(Rela) *conflict,
 
     for (; conflict < conflictend; ++conflict)
       elf_machine_rela (l, conflict, NULL, NULL, (void *) conflict->r_offset,
-			0);
+			0
+#ifndef NESTING
+			, NULL
+#endif
+			);
   }
 #endif
 }
diff --git a/elf/dl-reloc-static-pie.c b/elf/dl-reloc-static-pie.c
index d5bd2f31e9..0ab4613021 100644
--- a/elf/dl-reloc-static-pie.c
+++ b/elf/dl-reloc-static-pie.c
@@ -23,6 +23,13 @@ 
 #include <ldsodefs.h>
 #include "dynamic-link.h"
 
+#ifndef NESTING
+# define STATIC_PIE_BOOTSTRAP
+# define BOOTSTRAP_MAP (main_map)
+# define RESOLVE_MAP(sym, version, flags) BOOTSTRAP_MAP
+# include "dynamic-link.h"
+#endif /* n NESTING */
+
 /* Relocate static executable with PIE.  */
 
 void
@@ -30,10 +37,12 @@  _dl_relocate_static_pie (void)
 {
   struct link_map *main_map = _dl_get_dl_main_map ();
 
+#ifdef NESTING
 # define STATIC_PIE_BOOTSTRAP
 # define BOOTSTRAP_MAP (main_map)
 # define RESOLVE_MAP(sym, version, flags) BOOTSTRAP_MAP
 # include "dynamic-link.h"
+#endif /* NESTING */
 
   /* Figure out the run-time load address of static PIE.  */
   main_map->l_addr = elf_machine_load_address ();
@@ -48,7 +57,11 @@  _dl_relocate_static_pie (void)
 
   /* Relocate ourselves so we can do normal function calls and
      data access using the global offset table.  */
-  ELF_DYNAMIC_RELOCATE (main_map, 0, 0, 0);
+  ELF_DYNAMIC_RELOCATE (main_map, 0, 0, 0
+#ifndef NESTING
+                        , main_map
+#endif
+                        );
   main_map->l_relocated = 1;
 
   /* Initialize _r_debug.  */
diff --git a/elf/dl-reloc.c b/elf/dl-reloc.c
index e13a672ade..6d8b64ecb9 100644
--- a/elf/dl-reloc.c
+++ b/elf/dl-reloc.c
@@ -162,6 +162,41 @@  _dl_nothread_init_static_tls (struct link_map *map)
 }
 #endif /* !THREAD_GSCOPE_IN_TCB */
 
+#ifndef NESTING
+
+    /* String table object symbols.  */
+
+static struct link_map *glob_l;
+static struct r_scope_elem **glob_scope;
+static const char *glob_strtab;
+
+/* This macro is used as a callback from the ELF_DYNAMIC_RELOCATE code.  */
+#define RESOLVE_MAP(ref, version, r_type) \
+    ((ELFW(ST_BIND) ((*ref)->st_info) != STB_LOCAL			      \
+      && __glibc_likely (!dl_symbol_visibility_binds_local_p (*ref)))	      \
+     ? ((__builtin_expect ((*ref) == glob_l->l_lookup_cache.sym, 0)		      \
+	 && elf_machine_type_class (r_type) == glob_l->l_lookup_cache.type_class) \
+	? (bump_num_cache_relocations (),				      \
+	   (*ref) = glob_l->l_lookup_cache.ret,				      \
+	   glob_l->l_lookup_cache.value)					      \
+	: ({ lookup_t _lr;						      \
+	     int _tc = elf_machine_type_class (r_type);			      \
+	     glob_l->l_lookup_cache.type_class = _tc;			      \
+	     glob_l->l_lookup_cache.sym = (*ref);				      \
+	     const struct r_found_version *v = NULL;			      \
+	     if ((version) != NULL && (version)->hash != 0)		      \
+	       v = (version);						      \
+	     _lr = _dl_lookup_symbol_x (glob_strtab + (*ref)->st_name, glob_l, (ref),   \
+					glob_scope, v, _tc,			      \
+					DL_LOOKUP_ADD_DEPENDENCY, NULL);      \
+	     glob_l->l_lookup_cache.ret = (*ref);				      \
+	     glob_l->l_lookup_cache.value = _lr; }))				      \
+     : glob_l)
+
+#include "dynamic-link.h"
+
+#endif /* n NESTING */
+
 void
 _dl_relocate_object (struct link_map *l, struct r_scope_elem *scope[],
 		     int reloc_mode, int consider_profiling)
@@ -243,6 +278,8 @@  _dl_relocate_object (struct link_map *l, struct r_scope_elem *scope[],
   {
     /* Do the actual relocation of the object's GOT and other data.  */
 
+#ifdef NESTING
+
     /* String table object symbols.  */
     const char *strtab = (const void *) D_PTR (l, l_info[DT_STRTAB]);
 
@@ -272,7 +309,19 @@  _dl_relocate_object (struct link_map *l, struct r_scope_elem *scope[],
 
 #include "dynamic-link.h"
 
-    ELF_DYNAMIC_RELOCATE (l, lazy, consider_profiling, skip_ifunc);
+#else /* n NESTING */
+
+    glob_l = l;
+    glob_scope = scope;
+    glob_strtab = (const void *) D_PTR (glob_l, l_info[DT_STRTAB]);
+
+#endif /* n NESTING */
+
+    ELF_DYNAMIC_RELOCATE (l, lazy, consider_profiling, skip_ifunc
+#ifndef NESTING
+			  , NULL
+#endif
+			  );
 
 #ifndef PROF
     if (__glibc_unlikely (consider_profiling)
diff --git a/elf/do-rel.h b/elf/do-rel.h
index 321ac2b359..401a6340c8 100644
--- a/elf/do-rel.h
+++ b/elf/do-rel.h
@@ -41,7 +41,11 @@  auto inline void __attribute__ ((always_inline))
 elf_dynamic_do_Rel (struct link_map *map,
 		    ElfW(Addr) reladdr, ElfW(Addr) relsize,
 		    __typeof (((ElfW(Dyn) *) 0)->d_un.d_val) nrelative,
-		    int lazy, int skip_ifunc)
+		    int lazy, int skip_ifunc
+#ifndef NESTING
+		    , struct link_map *boot_map
+#endif
+		    )
 {
   const ElfW(Rel) *r = (const void *) reladdr;
   const ElfW(Rel) *end = (const void *) (reladdr + relsize);
@@ -136,7 +140,11 @@  elf_dynamic_do_Rel (struct link_map *map,
 	      ElfW(Half) ndx = version[ELFW(R_SYM) (r->r_info)] & 0x7fff;
 	      elf_machine_rel (map, r, &symtab[ELFW(R_SYM) (r->r_info)],
 			       &map->l_versions[ndx],
-			       (void *) (l_addr + r->r_offset), skip_ifunc);
+			       (void *) (l_addr + r->r_offset), skip_ifunc
+#ifndef NESTING
+			       , boot_map
+#endif
+			       );
 	    }
 
 #if defined ELF_MACHINE_IRELATIVE && !defined RTLD_BOOTSTRAP
@@ -150,7 +158,11 @@  elf_dynamic_do_Rel (struct link_map *map,
 				   &symtab[ELFW(R_SYM) (r2->r_info)],
 				   &map->l_versions[ndx],
 				   (void *) (l_addr + r2->r_offset),
-				   skip_ifunc);
+				   skip_ifunc
+#ifndef NESTING
+				   , boot_map
+#endif
+				   );
 		}
 #endif
 	}
@@ -168,7 +180,11 @@  elf_dynamic_do_Rel (struct link_map *map,
 	    else
 # endif
 	      elf_machine_rel (map, r, &symtab[ELFW(R_SYM) (r->r_info)], NULL,
-			       (void *) (l_addr + r->r_offset), skip_ifunc);
+			       (void *) (l_addr + r->r_offset), skip_ifunc
+#ifndef NESTING
+			       , boot_map
+#endif
+			       );
 
 # ifdef ELF_MACHINE_IRELATIVE
 	  if (r2 != NULL)
@@ -176,7 +192,11 @@  elf_dynamic_do_Rel (struct link_map *map,
 	      if (ELFW(R_TYPE) (r2->r_info) == ELF_MACHINE_IRELATIVE)
 		elf_machine_rel (map, r2, &symtab[ELFW(R_SYM) (r2->r_info)],
 				 NULL, (void *) (l_addr + r2->r_offset),
-				 skip_ifunc);
+				 skip_ifunc
+#ifndef NESTING
+				 , boot_map
+#endif
+				 );
 # endif
 	}
 #endif
diff --git a/elf/dynamic-link.h b/elf/dynamic-link.h
index 3eb24ba3a6..3493940a1e 100644
--- a/elf/dynamic-link.h
+++ b/elf/dynamic-link.h
@@ -16,6 +16,13 @@ 
    License along with the GNU C Library; if not, see
    <https://www.gnu.org/licenses/>.  */
 
+#if defined __x86_64__
+# define auto static
+#else
+/* Other arches use nested functions.  */
+# define NESTING
+#endif
+
 /* This macro is used as a callback from elf_machine_rel{a,} when a
    static TLS reloc is about to be performed.  Since (in dl-load.c) we
    permit dynamic loading of objects that might use such relocs, we
@@ -71,7 +78,11 @@  elf_machine_rel_relative (ElfW(Addr) l_addr, const ElfW(Rel) *reloc,
 auto inline void __attribute__((always_inline))
 elf_machine_rela (struct link_map *map, const ElfW(Rela) *reloc,
 		  const ElfW(Sym) *sym, const struct r_found_version *version,
-		  void *const reloc_addr, int skip_ifunc);
+		  void *const reloc_addr, int skip_ifunc
+#ifndef NESTING
+		  , struct link_map *boot_map
+#endif
+		  );
 auto inline void __attribute__((always_inline))
 elf_machine_rela_relative (ElfW(Addr) l_addr, const ElfW(Rela) *reloc,
 			   void *const reloc_addr);
@@ -114,6 +125,60 @@  elf_machine_lazy_rel (struct link_map *map,
    consumes precisely the very end of the DT_REL*, or DT_JMPREL and DT_REL*
    are completely separate and there is a gap between them.  */
 
+#ifndef NESTING
+# define _ELF_DYNAMIC_DO_RELOC(RELOC, reloc, map, do_lazy, skip_ifunc, test_rel, boot_map) \
+  do {									      \
+    struct { ElfW(Addr) start, size;					      \
+	     __typeof (((ElfW(Dyn) *) 0)->d_un.d_val) nrelative; int lazy; }  \
+      ranges[2] = { { 0, 0, 0, 0 }, { 0, 0, 0, 0 } };			      \
+									      \
+    if ((map)->l_info[DT_##RELOC])					      \
+      {									      \
+	ranges[0].start = D_PTR ((map), l_info[DT_##RELOC]);		      \
+	ranges[0].size = (map)->l_info[DT_##RELOC##SZ]->d_un.d_val;	      \
+	if (map->l_info[VERSYMIDX (DT_##RELOC##COUNT)] != NULL)		      \
+	  ranges[0].nrelative						      \
+	    = map->l_info[VERSYMIDX (DT_##RELOC##COUNT)]->d_un.d_val;	      \
+      }									      \
+    if ((map)->l_info[DT_PLTREL]					      \
+	&& (!test_rel || (map)->l_info[DT_PLTREL]->d_un.d_val == DT_##RELOC)) \
+      {									      \
+	ElfW(Addr) start = D_PTR ((map), l_info[DT_JMPREL]);		      \
+	ElfW(Addr) size = (map)->l_info[DT_PLTRELSZ]->d_un.d_val;	      \
+									      \
+	if (ranges[0].start + ranges[0].size == (start + size))		      \
+	  ranges[0].size -= size;					      \
+	if (ELF_DURING_STARTUP						      \
+	    || (!(do_lazy)						      \
+		&& (ranges[0].start + ranges[0].size) == start))	      \
+	  {								      \
+	    /* Combine processing the sections.  */			      \
+	    ranges[0].size += size;					      \
+	  }								      \
+	else								      \
+	  {								      \
+	    ranges[1].start = start;					      \
+	    ranges[1].size = size;					      \
+	    ranges[1].lazy = (do_lazy);					      \
+	  }								      \
+      }									      \
+									      \
+    if (ELF_DURING_STARTUP)						      \
+      elf_dynamic_do_##reloc ((map), ranges[0].start, ranges[0].size,	      \
+			      ranges[0].nrelative, 0, skip_ifunc, boot_map);  \
+    else								      \
+      {									      \
+	int ranges_index;						      \
+	for (ranges_index = 0; ranges_index < 2; ++ranges_index)	      \
+	  elf_dynamic_do_##reloc ((map),				      \
+				  ranges[ranges_index].start,		      \
+				  ranges[ranges_index].size,		      \
+				  ranges[ranges_index].nrelative,	      \
+				  ranges[ranges_index].lazy,		      \
+				  skip_ifunc, boot_map);		      \
+      }									      \
+  } while (0)
+#else /* NESTING */
 # define _ELF_DYNAMIC_DO_RELOC(RELOC, reloc, map, do_lazy, skip_ifunc, test_rel) \
   do {									      \
     struct { ElfW(Addr) start, size;					      \
@@ -166,6 +231,7 @@  elf_machine_lazy_rel (struct link_map *map,
 				  skip_ifunc);				      \
       }									      \
   } while (0)
+#endif /* NESTING */
 
 # if ELF_MACHINE_NO_REL || ELF_MACHINE_NO_RELA
 #  define _ELF_CHECK_REL 0
@@ -173,6 +239,34 @@  elf_machine_lazy_rel (struct link_map *map,
 #  define _ELF_CHECK_REL 1
 # endif
 
+#ifndef NESTING
+# if ! ELF_MACHINE_NO_REL
+#  include "do-rel.h"
+#  define ELF_DYNAMIC_DO_REL(map, lazy, skip_ifunc, boot_map)			\
+  _ELF_DYNAMIC_DO_RELOC (REL, Rel, map, lazy, skip_ifunc, _ELF_CHECK_REL, boot_map)
+# else
+#  define ELF_DYNAMIC_DO_REL(map, lazy, skip_ifunc, boot_map) /* Nothing to do.  */
+# endif
+
+# if ! ELF_MACHINE_NO_RELA
+#  define DO_RELA
+#  include "do-rel.h"
+#  define ELF_DYNAMIC_DO_RELA(map, lazy, skip_ifunc, boot_map)			\
+  _ELF_DYNAMIC_DO_RELOC (RELA, Rela, map, lazy, skip_ifunc, _ELF_CHECK_REL, boot_map)
+# else
+#  define ELF_DYNAMIC_DO_RELA(map, lazy, skip_ifunc, boot_map) /* Nothing to do.  */
+# endif
+
+/* This can't just be an inline function because GCC is too dumb
+   to inline functions containing inlines themselves.  */
+# define ELF_DYNAMIC_RELOCATE(map, lazy, consider_profile, skip_ifunc, boot_map) \
+  do {									      \
+    int edr_lazy = elf_machine_runtime_setup ((map), (lazy),		      \
+					      (consider_profile));	      \
+    ELF_DYNAMIC_DO_REL ((map), edr_lazy, skip_ifunc, boot_map);			\
+    ELF_DYNAMIC_DO_RELA ((map), edr_lazy, skip_ifunc, boot_map);			\
+  } while (0)
+#else /* NESTING */
 # if ! ELF_MACHINE_NO_REL
 #  include "do-rel.h"
 #  define ELF_DYNAMIC_DO_REL(map, lazy, skip_ifunc) \
@@ -199,5 +293,6 @@  elf_machine_lazy_rel (struct link_map *map,
     ELF_DYNAMIC_DO_REL ((map), edr_lazy, skip_ifunc);			      \
     ELF_DYNAMIC_DO_RELA ((map), edr_lazy, skip_ifunc);			      \
   } while (0)
+#endif /* NESTING */
 
 #endif
diff --git a/elf/get-dynamic-info.h b/elf/get-dynamic-info.h
index d8ec32377d..3bb6ab1ce4 100644
--- a/elf/get-dynamic-info.h
+++ b/elf/get-dynamic-info.h
@@ -22,6 +22,8 @@ 
 #include <assert.h>
 #include <libc-diag.h>
 
+#if defined NESTING || !defined SAW_EGDI
+
 #ifndef RESOLVE_MAP
 static
 #else
@@ -180,3 +182,11 @@  elf_get_dynamic_info (struct link_map *l, ElfW(Dyn) *temp)
     info[DT_RPATH] = NULL;
 #endif
 }
+
+#endif
+
+#ifndef NESTING
+#ifndef SAW_EGDI
+# define SAW_EGDI
+#endif
+#endif /* n NESTING */
diff --git a/elf/rtld.c b/elf/rtld.c
index 878e6480f4..167ee66cb1 100644
--- a/elf/rtld.c
+++ b/elf/rtld.c
@@ -499,15 +499,36 @@  _dl_start_final (void *arg, struct dl_start_final_info *info)
   return start_addr;
 }
 
+#ifndef NESTING
+#ifdef DONT_USE_BOOTSTRAP_MAP
+# define bootstrap_map GL(dl_rtld_map)
+#else
+# define bootstrap_map info.l
+#endif
+
+  /* This #define produces dynamic linking inline functions for
+     bootstrap relocation instead of general-purpose relocation.
+     Since ld.so must not have any undefined symbols the result
+     is trivial: always the map of ld.so itself.  */
+#define RTLD_BOOTSTRAP
+#define RESOLVE_MAP(sym, version, flags) (&bootstrap_map)
+#include "dynamic-link.h"
+#endif /* n NESTING */
+
 static ElfW(Addr) __attribute_used__
 _dl_start (void *arg)
 {
+#ifndef NESTING
+#ifndef DONT_USE_BOOTSTRAP_MAP
+ struct dl_start_final_info info;
+#endif
+#else /* NESTING */
 #ifdef DONT_USE_BOOTSTRAP_MAP
 # define bootstrap_map GL(dl_rtld_map)
 #else
   struct dl_start_final_info info;
 # define bootstrap_map info.l
-#endif
+#endif /* NESTING */
 
   /* This #define produces dynamic linking inline functions for
      bootstrap relocation instead of general-purpose relocation.
@@ -517,6 +538,7 @@  _dl_start (void *arg)
 #define BOOTSTRAP_MAP (&bootstrap_map)
 #define RESOLVE_MAP(sym, version, flags) BOOTSTRAP_MAP
 #include "dynamic-link.h"
+#endif /* NESTING */
 
 #ifdef DONT_USE_BOOTSTRAP_MAP
   rtld_timer_start (&start_time);
@@ -561,7 +583,11 @@  _dl_start (void *arg)
       /* Relocate ourselves so we can do normal function calls and
 	 data access using the global offset table.  */
 
-      ELF_DYNAMIC_RELOCATE (&bootstrap_map, 0, 0, 0);
+      ELF_DYNAMIC_RELOCATE (&bootstrap_map, 0, 0, 0
+#ifndef NESTING
+			    , &bootstrap_map
+#endif
+			    );
     }
   bootstrap_map.l_relocated = 1;
 
diff --git a/sysdeps/x86_64/dl-machine.h b/sysdeps/x86_64/dl-machine.h
index ceee50734e..1161f03771 100644
--- a/sysdeps/x86_64/dl-machine.h
+++ b/sysdeps/x86_64/dl-machine.h
@@ -255,7 +255,11 @@  auto inline void
 __attribute__ ((always_inline))
 elf_machine_rela (struct link_map *map, const ElfW(Rela) *reloc,
 		  const ElfW(Sym) *sym, const struct r_found_version *version,
-		  void *const reloc_addr_arg, int skip_ifunc)
+		  void *const reloc_addr_arg, int skip_ifunc
+#ifndef NESTING
+		  , struct link_map *boot_map
+#endif
+		  )
 {
   ElfW(Addr) *const reloc_addr = reloc_addr_arg;
   const unsigned long int r_type = ELFW(R_TYPE) (reloc->r_info);
@@ -293,7 +297,11 @@  elf_machine_rela (struct link_map *map, const ElfW(Rela) *reloc,
 # ifndef RTLD_BOOTSTRAP
       const ElfW(Sym) *const refsym = sym;
 # endif
+#if !defined NESTING && (defined RTLD_BOOTSTRAP || defined STATIC_PIE_BOOTSTRAP)
+      struct link_map *sym_map = boot_map;
+#else
       struct link_map *sym_map = RESOLVE_MAP (&sym, version, r_type);
+#endif
       ElfW(Addr) value = SYMBOL_ADDRESS (sym_map, sym, true);
 
       if (sym != NULL
@@ -573,7 +581,11 @@  elf_machine_lazy_rel (struct link_map *map,
 
       /* Always initialize TLS descriptors completely at load time, in
 	 case static TLS is allocated for it that requires locking.  */
-      elf_machine_rela (map, reloc, sym, version, reloc_addr, skip_ifunc);
+      elf_machine_rela (map, reloc, sym, version, reloc_addr, skip_ifunc
+#ifndef NESTING
+                        , 0
+#endif
+                        );
     }
   else if (__glibc_unlikely (r_type == R_X86_64_IRELATIVE))
     {