[v6,1/4] rtld: Use generic argv adjustment in ld.so [BZ #23293]

Message ID a7d9779fde219ba7d64b7a540f994a86914f8d9d.1651757640.git.szabolcs.nagy@arm.com
State Committed
Headers
Series Args adjustment with ./ld.so exe [BZ #23293] |

Checks

Context Check Description
dj/TryBot-apply_patch success Patch applied to master at the time it was sent

Commit Message

Szabolcs Nagy May 5, 2022, 1:58 p.m. UTC
  When an executable is invoked as

  ./ld.so [ld.so-args] ./exe [exe-args]

then the argv is adujusted in ld.so before calling the entry point of
the executable so ld.so args are not visible to it.  On most targets
this requires moving argv, env and auxv on the stack to ensure correct
stack alignment at the entry point.  This had several issues:

- The code for this adjustment on the stack is written in asm as part
  of the target specific ld.so _start code which is hard to maintain.

- The adjustment is done after _dl_start returns, where it's too late
  to update GLRO(dl_auxv), as it is already readonly, so it points to
  memory that was clobbered by the adjustment. This is bug 23293.

- _environ is also wrong in ld.so after the adjustment, but it is
  likely not used after _dl_start returns so this is not user visible.

- _dl_argv was updated, but for this it was moved out of relro, which
  changes security properties across targets unnecessarily.

This patch introduces a generic _dl_start_args_adjust function that
handles the argument adjustments after ld.so processed its own args
and before relro protection is applied.

The same algorithm is used on all targets, _dl_skip_args is now 0, so
existing target specific adjustment code is no longer used.  The bug
affects aarch64, alpha, arc, arm, csky, ia64, nios2, s390-32 and sparc,
other targets don't need the change in principle, only for consistency.

The GNU Hurd start code relied on _dl_skip_args after dl_main returned,
now it checks directly if args were adjusted and fixes the Hurd startup
data accordingly.

Follow up patches can remove _dl_skip_args and DL_ARGV_NOT_RELRO.

Tested on aarch64-linux-gnu and cross tested on i686-gnu.
---
v6:
- don't pass start_argptr to _dl_main, just use _dl_argv-1.
- add comment for _dl_start_args_adjust.
- add assert checks to _dl_start_args_adjust and simplify it a bit.
v5:
- Hurd specific changes.
v4:
- New code is unconditionally used on all targets.
- Hide auxv adjustments behind HAVE_AUX_VECTOR.
- DL_NEED_START_ARGS_ADJUST macro is removed.
- _dl_skip_args is no longer unused.
- start_argptr is passed down to dl_main instead of using a global.
- moved aarch64 DL_ARGV_NOT_RELRO removal to separate patch.
v2:
- use p != NULL, and a_type != AT_NULL
- remove the confusing paragraph from the commit message.
---
 elf/rtld.c                    | 73 ++++++++++++++++++++++++++++-------
 sysdeps/mach/hurd/dl-sysdep.c | 30 +++++++-------
 2 files changed, 73 insertions(+), 30 deletions(-)
  

Comments

Adhemerval Zanella Netto May 13, 2022, 7:56 p.m. UTC | #1
On 05/05/2022 10:58, Szabolcs Nagy via Libc-alpha wrote:
> When an executable is invoked as
> 
>   ./ld.so [ld.so-args] ./exe [exe-args]
> 
> then the argv is adujusted in ld.so before calling the entry point of
> the executable so ld.so args are not visible to it.  On most targets
> this requires moving argv, env and auxv on the stack to ensure correct
> stack alignment at the entry point.  This had several issues:
> 
> - The code for this adjustment on the stack is written in asm as part
>   of the target specific ld.so _start code which is hard to maintain.
> 
> - The adjustment is done after _dl_start returns, where it's too late
>   to update GLRO(dl_auxv), as it is already readonly, so it points to
>   memory that was clobbered by the adjustment. This is bug 23293.
> 
> - _environ is also wrong in ld.so after the adjustment, but it is
>   likely not used after _dl_start returns so this is not user visible.
> 
> - _dl_argv was updated, but for this it was moved out of relro, which
>   changes security properties across targets unnecessarily.
> 
> This patch introduces a generic _dl_start_args_adjust function that
> handles the argument adjustments after ld.so processed its own args
> and before relro protection is applied.
> 
> The same algorithm is used on all targets, _dl_skip_args is now 0, so
> existing target specific adjustment code is no longer used.  The bug
> affects aarch64, alpha, arc, arm, csky, ia64, nios2, s390-32 and sparc,
> other targets don't need the change in principle, only for consistency.
> 
> The GNU Hurd start code relied on _dl_skip_args after dl_main returned,
> now it checks directly if args were adjusted and fixes the Hurd startup
> data accordingly.
> 
> Follow up patches can remove _dl_skip_args and DL_ARGV_NOT_RELRO.
> 
> Tested on aarch64-linux-gnu and cross tested on i686-gnu.
> ---
> v6:
> - don't pass start_argptr to _dl_main, just use _dl_argv-1.
> - add comment for _dl_start_args_adjust.
> - add assert checks to _dl_start_args_adjust and simplify it a bit.
> v5:
> - Hurd specific changes.
> v4:
> - New code is unconditionally used on all targets.
> - Hide auxv adjustments behind HAVE_AUX_VECTOR.
> - DL_NEED_START_ARGS_ADJUST macro is removed.
> - _dl_skip_args is no longer unused.
> - start_argptr is passed down to dl_main instead of using a global.
> - moved aarch64 DL_ARGV_NOT_RELRO removal to separate patch.
> v2:
> - use p != NULL, and a_type != AT_NULL
> - remove the confusing paragraph from the commit message.

Looks ok, just a minor suggestion below.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>

> ---
>  elf/rtld.c                    | 73 ++++++++++++++++++++++++++++-------
>  sysdeps/mach/hurd/dl-sysdep.c | 30 +++++++-------
>  2 files changed, 73 insertions(+), 30 deletions(-)
> 
> diff --git a/elf/rtld.c b/elf/rtld.c
> index 3b2e05bf4c..b5070d453f 100644
> --- a/elf/rtld.c
> +++ b/elf/rtld.c
> @@ -1306,6 +1306,62 @@ rtld_setup_main_map (struct link_map *main_map)
>    return has_interp;
>  }
>  
> +/* Adjusts the contents of the stack and related globals for the user
> +   entry point.  The ld.so processed skip_args arguments and bumped
> +   _dl_argv and _dl_argc accordingly.  Those arguments are removed from
> +   argv here.  */
> +static void
> +_dl_start_args_adjust (int skip_args)
> +{
> +  void **sp = (void **) (_dl_argv - skip_args - 1);

Is it fully correctly to materialize the address for 'skip_args' equal to 0?
I don't think it would matter anyway.

> +  void **p = sp + skip_args;
> +
> +  if (skip_args == 0)
> +    return;
> +
> +  /* Sanity check.  */
> +  intptr_t argc = (intptr_t) sp[0] - skip_args;
> +  assert (argc == _dl_argc);
> +
> +  /* Adjust argc on stack.  */
> +  sp[0] = (void *) (intptr_t) _dl_argc;
> +
> +  /* Update globals in rtld.  */
> +  _dl_argv -= skip_args;
> +  _environ -= skip_args;
> +
> +  /* Shuffle argv down.  */
> +  do
> +    *++sp = *++p;
> +  while (*p != NULL);
> +
> +  assert (_environ == (char **) (sp + 1));
> +
> +  /* Shuffle envp down.  */
> +  do
> +    *++sp = *++p;
> +  while (*p != NULL);
> +
> +#ifdef HAVE_AUX_VECTOR
> +  void **auxv = (void **) GLRO(dl_auxv) - skip_args;
> +  GLRO(dl_auxv) = (ElfW(auxv_t *)) auxv; /* Aliasing violation.  */
> +  assert (auxv == sp + 1);
> +
> +  /* Shuffle auxv down. */
> +  void *a, *b; /* Use a pair of pointers for an auxv entry.  */
> +  unsigned long a_type;
> +  do
> +    {
> +      a_type = ((ElfW(auxv_t) *) (p + 1))->a_type;
> +      a = *++p;
> +      b = *++p;
> +      *++sp = a;
> +      *++sp = b;
> +    }> +  while (a_type != AT_NULL);
> +#endif

Maybe:

  ElfW(auxv_t) ax;
  do
    { 
      p = (void**) ((uintptr_t) p + sizeof (ax));
      sp = (void**) ((uintptr_t) sp + sizeof (ax));
      memcpy (&ax, p, sizeof (ax));
      memcpy (sp, &ax, sizeof (ax));
    }
  while (ax.a_type != AT_NULL);

Most targets will inline memcpy and if they do not we don't enable IFUNC 
for ld.so.

> +}
> +
>  static void
>  dl_main (const ElfW(Phdr) *phdr,
>  	 ElfW(Word) phnum,
> @@ -1359,6 +1415,7 @@ dl_main (const ElfW(Phdr) *phdr,
>        rtld_is_main = true;
>  
>        char *argv0 = NULL;
> +      char **orig_argv = _dl_argv;
>  
>        /* Note the place where the dynamic linker actually came from.  */
>        GL(dl_rtld_map).l_name = rtld_progname;
> @@ -1373,7 +1430,6 @@ dl_main (const ElfW(Phdr) *phdr,
>  		GLRO(dl_lazy) = -1;
>  	      }
>  
> -	    ++_dl_skip_args;
>  	    --_dl_argc;
>  	    ++_dl_argv;
>  	  }
> @@ -1382,14 +1438,12 @@ dl_main (const ElfW(Phdr) *phdr,
>  	    if (state.mode != rtld_mode_help)
>  	      state.mode = rtld_mode_verify;
>  
> -	    ++_dl_skip_args;
>  	    --_dl_argc;
>  	    ++_dl_argv;
>  	  }
>  	else if (! strcmp (_dl_argv[1], "--inhibit-cache"))
>  	  {
>  	    GLRO(dl_inhibit_cache) = 1;
> -	    ++_dl_skip_args;
>  	    --_dl_argc;
>  	    ++_dl_argv;
>  	  }
> @@ -1399,7 +1453,6 @@ dl_main (const ElfW(Phdr) *phdr,
>  	    state.library_path = _dl_argv[2];
>  	    state.library_path_source = "--library-path";
>  
> -	    _dl_skip_args += 2;
>  	    _dl_argc -= 2;
>  	    _dl_argv += 2;
>  	  }
> @@ -1408,7 +1461,6 @@ dl_main (const ElfW(Phdr) *phdr,
>  	  {
>  	    GLRO(dl_inhibit_rpath) = _dl_argv[2];
>  
> -	    _dl_skip_args += 2;
>  	    _dl_argc -= 2;
>  	    _dl_argv += 2;
>  	  }
> @@ -1416,14 +1468,12 @@ dl_main (const ElfW(Phdr) *phdr,
>  	  {
>  	    audit_list_add_string (&state.audit_list, _dl_argv[2]);
>  
> -	    _dl_skip_args += 2;
>  	    _dl_argc -= 2;
>  	    _dl_argv += 2;
>  	  }
>  	else if (! strcmp (_dl_argv[1], "--preload") && _dl_argc > 2)
>  	  {
>  	    state.preloadarg = _dl_argv[2];
> -	    _dl_skip_args += 2;
>  	    _dl_argc -= 2;
>  	    _dl_argv += 2;
>  	  }
> @@ -1431,7 +1481,6 @@ dl_main (const ElfW(Phdr) *phdr,
>  	  {
>  	    argv0 = _dl_argv[2];
>  
> -	    _dl_skip_args += 2;
>  	    _dl_argc -= 2;
>  	    _dl_argv += 2;
>  	  }
> @@ -1439,7 +1488,6 @@ dl_main (const ElfW(Phdr) *phdr,
>  		 && _dl_argc > 2)
>  	  {
>  	    state.glibc_hwcaps_prepend = _dl_argv[2];
> -	    _dl_skip_args += 2;
>  	    _dl_argc -= 2;
>  	    _dl_argv += 2;
>  	  }
> @@ -1447,7 +1495,6 @@ dl_main (const ElfW(Phdr) *phdr,
>  		 && _dl_argc > 2)
>  	  {
>  	    state.glibc_hwcaps_mask = _dl_argv[2];
> -	    _dl_skip_args += 2;
>  	    _dl_argc -= 2;
>  	    _dl_argv += 2;
>  	  }
> @@ -1456,7 +1503,6 @@ dl_main (const ElfW(Phdr) *phdr,
>  	  {
>  	    state.mode = rtld_mode_list_tunables;
>  
> -	    ++_dl_skip_args;
>  	    --_dl_argc;
>  	    ++_dl_argv;
>  	  }
> @@ -1465,7 +1511,6 @@ dl_main (const ElfW(Phdr) *phdr,
>  	  {
>  	    state.mode = rtld_mode_list_diagnostics;
>  
> -	    ++_dl_skip_args;
>  	    --_dl_argc;
>  	    ++_dl_argv;
>  	  }
> @@ -1511,7 +1556,6 @@ dl_main (const ElfW(Phdr) *phdr,
>  	    _dl_usage (ld_so_name, NULL);
>  	}
>  
> -      ++_dl_skip_args;
>        --_dl_argc;
>        ++_dl_argv;
>  
> @@ -1610,6 +1654,9 @@ dl_main (const ElfW(Phdr) *phdr,
>        /* Set the argv[0] string now that we've processed the executable.  */
>        if (argv0 != NULL)
>          _dl_argv[0] = argv0;
> +
> +      /* Adjust arguments for the application entry point.  */
> +      _dl_start_args_adjust (_dl_argv - orig_argv);
>      }
>    else
>      {

Ok.

> diff --git a/sysdeps/mach/hurd/dl-sysdep.c b/sysdeps/mach/hurd/dl-sysdep.c
> index 3cbe075615..8373962e62 100644
> --- a/sysdeps/mach/hurd/dl-sysdep.c
> +++ b/sysdeps/mach/hurd/dl-sysdep.c
> @@ -76,6 +76,7 @@ _dl_sysdep_start (void **start_argptr,
>  {
>    void go (intptr_t *argdata)
>      {
> +      char *orig_argv0;
>        char **p;
>  
>        /* Cache the information in various global variables.  */
> @@ -84,6 +85,8 @@ _dl_sysdep_start (void **start_argptr,
>        _environ = &_dl_argv[_dl_argc + 1];
>        for (p = _environ; *p++;); /* Skip environ pointers and terminator.  */
>  
> +      orig_argv0 = _dl_argv[0];
> +
>        if ((void *) p == _dl_argv[0])
>  	{
>  	  static struct hurd_startup_data nodata;
> @@ -173,30 +176,23 @@ _dl_sysdep_start (void **start_argptr,
>  
>        /* The call above might screw a few things up.
>  
> -	 First of all, if _dl_skip_args is nonzero, we are ignoring
> -	 the first few arguments.  However, if we have no Hurd startup
> -	 data, it is the magical convention that ARGV[0] == P.  The
> +	 P is the location after the terminating NULL of the list of
> +	 environment variables.  It has to point to the Hurd startup
> +	 data or if that's missing then P == ARGV[0] must hold. The
>  	 startup code in init-first.c will get confused if this is not
>  	 the case, so we must rearrange things to make it so.  We'll
> -	 overwrite the origional ARGV[0] at P with ARGV[_dl_skip_args].
> +	 recompute P and move the Hurd data or the new ARGV[0] there.
>  
> -	 Secondly, if we need to be secure, it removes some dangerous
> -	 environment variables.  If we have no Hurd startup date this
> -	 changes P (since that's the location after the terminating
> -	 NULL in the list of environment variables).  We do the same
> -	 thing as in the first case but make sure we recalculate P.
> -	 If we do have Hurd startup data, we have to move the data
> -	 such that it starts just after the terminating NULL in the
> -	 environment list.
> +	 Note: directly invoked ld.so can move arguments and env vars.
>  
>  	 We use memmove, since the locations might overlap.  */
> -      if (__libc_enable_secure || _dl_skip_args)
> -	{
> -	  char **newp;
>  
> -	  for (newp = _environ; *newp++;);
> +      char **newp;
> +      for (newp = _environ; *newp++;);
>  
> -	  if (_dl_argv[-_dl_skip_args] == (char *) p)
> +      if (newp != p || _dl_argv[0] != orig_argv0)
> +	{
> +	  if (orig_argv0 == (char *) p)
>  	    {
>  	      if ((char *) newp != _dl_argv[0])
>  		{

Looks ok, but I can't really voucher for hurd code.
  
Szabolcs Nagy May 17, 2022, 9:27 a.m. UTC | #2
The 05/13/2022 16:56, Adhemerval Zanella wrote:
> On 05/05/2022 10:58, Szabolcs Nagy via Libc-alpha wrote:
> > +_dl_start_args_adjust (int skip_args)
> > +{
> > +  void **sp = (void **) (_dl_argv - skip_args - 1);
> 
> Is it fully correctly to materialize the address for 'skip_args' equal to 0?
> I don't think it would matter anyway.

skip_args == 0 should be fine (any skip_args makes sp point to argc).

> > +  /* Shuffle auxv down. */
> > +  void *a, *b; /* Use a pair of pointers for an auxv entry.  */
> > +  unsigned long a_type;
> > +  do
> > +    {
> > +      a_type = ((ElfW(auxv_t) *) (p + 1))->a_type;
> > +      a = *++p;
> > +      b = *++p;
> > +      *++sp = a;
> > +      *++sp = b;
> > +    }> +  while (a_type != AT_NULL);
> > +#endif
> 
> Maybe:
> 
>   ElfW(auxv_t) ax;
>   do
>     { 
>       p = (void**) ((uintptr_t) p + sizeof (ax));
>       sp = (void**) ((uintptr_t) sp + sizeof (ax));
>       memcpy (&ax, p, sizeof (ax));
>       memcpy (sp, &ax, sizeof (ax));
>     }
>   while (ax.a_type != AT_NULL);
> 
> Most targets will inline memcpy and if they do not we don't enable IFUNC 
> for ld.so.

i did this differently, but same in spirit.

attaching the patch i committed.

> > -      if (__libc_enable_secure || _dl_skip_args)
> > -	{
> > -	  char **newp;
> >  
> > -	  for (newp = _environ; *newp++;);
> > +      char **newp;
> > +      for (newp = _environ; *newp++;);
> >  
> > -	  if (_dl_argv[-_dl_skip_args] == (char *) p)
> > +      if (newp != p || _dl_argv[0] != orig_argv0)
> > +	{
> > +	  if (orig_argv0 == (char *) p)
> >  	    {
> >  	      if ((char *) newp != _dl_argv[0])
> >  		{
> 
> Looks ok, but I can't really voucher for hurd code.

committed the patch, if it breaks hurd we can fix/revert later.
From ad43cac44a6860eaefcadadfb2acb349921e96bf Mon Sep 17 00:00:00 2001
From: Szabolcs Nagy <szabolcs.nagy@arm.com>
Date: Fri, 15 Jun 2018 16:14:58 +0100
Subject: [COMMITTED v7] rtld: Use generic argv adjustment in ld.so [BZ #23293]

When an executable is invoked as

  ./ld.so [ld.so-args] ./exe [exe-args]

then the argv is adujusted in ld.so before calling the entry point of
the executable so ld.so args are not visible to it.  On most targets
this requires moving argv, env and auxv on the stack to ensure correct
stack alignment at the entry point.  This had several issues:

- The code for this adjustment on the stack is written in asm as part
  of the target specific ld.so _start code which is hard to maintain.

- The adjustment is done after _dl_start returns, where it's too late
  to update GLRO(dl_auxv), as it is already readonly, so it points to
  memory that was clobbered by the adjustment. This is bug 23293.

- _environ is also wrong in ld.so after the adjustment, but it is
  likely not used after _dl_start returns so this is not user visible.

- _dl_argv was updated, but for this it was moved out of relro, which
  changes security properties across targets unnecessarily.

This patch introduces a generic _dl_start_args_adjust function that
handles the argument adjustments after ld.so processed its own args
and before relro protection is applied.

The same algorithm is used on all targets, _dl_skip_args is now 0, so
existing target specific adjustment code is no longer used.  The bug
affects aarch64, alpha, arc, arm, csky, ia64, nios2, s390-32 and sparc,
other targets don't need the change in principle, only for consistency.

The GNU Hurd start code relied on _dl_skip_args after dl_main returned,
now it checks directly if args were adjusted and fixes the Hurd startup
data accordingly.

Follow up patches can remove _dl_skip_args and DL_ARGV_NOT_RELRO.

Tested on aarch64-linux-gnu and cross tested on i686-gnu.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
---
 elf/rtld.c                    | 73 ++++++++++++++++++++++++++++-------
 sysdeps/mach/hurd/dl-sysdep.c | 30 +++++++-------
 2 files changed, 73 insertions(+), 30 deletions(-)

diff --git a/elf/rtld.c b/elf/rtld.c
index 578fc14cdb..6e8ed430e2 100644
--- a/elf/rtld.c
+++ b/elf/rtld.c
@@ -1301,6 +1301,62 @@ rtld_setup_main_map (struct link_map *main_map)
   return has_interp;
 }
 
+/* Adjusts the contents of the stack and related globals for the user
+   entry point.  The ld.so processed skip_args arguments and bumped
+   _dl_argv and _dl_argc accordingly.  Those arguments are removed from
+   argv here.  */
+static void
+_dl_start_args_adjust (int skip_args)
+{
+  void **sp = (void **) (_dl_argv - skip_args - 1);
+  void **p = sp + skip_args;
+
+  if (skip_args == 0)
+    return;
+
+  /* Sanity check.  */
+  intptr_t argc = (intptr_t) sp[0] - skip_args;
+  assert (argc == _dl_argc);
+
+  /* Adjust argc on stack.  */
+  sp[0] = (void *) (intptr_t) _dl_argc;
+
+  /* Update globals in rtld.  */
+  _dl_argv -= skip_args;
+  _environ -= skip_args;
+
+  /* Shuffle argv down.  */
+  do
+    *++sp = *++p;
+  while (*p != NULL);
+
+  assert (_environ == (char **) (sp + 1));
+
+  /* Shuffle envp down.  */
+  do
+    *++sp = *++p;
+  while (*p != NULL);
+
+#ifdef HAVE_AUX_VECTOR
+  void **auxv = (void **) GLRO(dl_auxv) - skip_args;
+  GLRO(dl_auxv) = (ElfW(auxv_t) *) auxv; /* Aliasing violation.  */
+  assert (auxv == sp + 1);
+
+  /* Shuffle auxv down. */
+  ElfW(auxv_t) ax;
+  char *oldp = (char *) (p + 1);
+  char *newp = (char *) (sp + 1);
+  do
+    {
+      memcpy (&ax, oldp, sizeof (ax));
+      memcpy (newp, &ax, sizeof (ax));
+      oldp += sizeof (ax);
+      newp += sizeof (ax);
+    }
+  while (ax.a_type != AT_NULL);
+#endif
+}
+
 static void
 dl_main (const ElfW(Phdr) *phdr,
 	 ElfW(Word) phnum,
@@ -1354,6 +1410,7 @@ dl_main (const ElfW(Phdr) *phdr,
       rtld_is_main = true;
 
       char *argv0 = NULL;
+      char **orig_argv = _dl_argv;
 
       /* Note the place where the dynamic linker actually came from.  */
       GL(dl_rtld_map).l_name = rtld_progname;
@@ -1368,7 +1425,6 @@ dl_main (const ElfW(Phdr) *phdr,
 		GLRO(dl_lazy) = -1;
 	      }
 
-	    ++_dl_skip_args;
 	    --_dl_argc;
 	    ++_dl_argv;
 	  }
@@ -1377,14 +1433,12 @@ dl_main (const ElfW(Phdr) *phdr,
 	    if (state.mode != rtld_mode_help)
 	      state.mode = rtld_mode_verify;
 
-	    ++_dl_skip_args;
 	    --_dl_argc;
 	    ++_dl_argv;
 	  }
 	else if (! strcmp (_dl_argv[1], "--inhibit-cache"))
 	  {
 	    GLRO(dl_inhibit_cache) = 1;
-	    ++_dl_skip_args;
 	    --_dl_argc;
 	    ++_dl_argv;
 	  }
@@ -1394,7 +1448,6 @@ dl_main (const ElfW(Phdr) *phdr,
 	    state.library_path = _dl_argv[2];
 	    state.library_path_source = "--library-path";
 
-	    _dl_skip_args += 2;
 	    _dl_argc -= 2;
 	    _dl_argv += 2;
 	  }
@@ -1403,7 +1456,6 @@ dl_main (const ElfW(Phdr) *phdr,
 	  {
 	    GLRO(dl_inhibit_rpath) = _dl_argv[2];
 
-	    _dl_skip_args += 2;
 	    _dl_argc -= 2;
 	    _dl_argv += 2;
 	  }
@@ -1411,14 +1463,12 @@ dl_main (const ElfW(Phdr) *phdr,
 	  {
 	    audit_list_add_string (&state.audit_list, _dl_argv[2]);
 
-	    _dl_skip_args += 2;
 	    _dl_argc -= 2;
 	    _dl_argv += 2;
 	  }
 	else if (! strcmp (_dl_argv[1], "--preload") && _dl_argc > 2)
 	  {
 	    state.preloadarg = _dl_argv[2];
-	    _dl_skip_args += 2;
 	    _dl_argc -= 2;
 	    _dl_argv += 2;
 	  }
@@ -1426,7 +1476,6 @@ dl_main (const ElfW(Phdr) *phdr,
 	  {
 	    argv0 = _dl_argv[2];
 
-	    _dl_skip_args += 2;
 	    _dl_argc -= 2;
 	    _dl_argv += 2;
 	  }
@@ -1434,7 +1483,6 @@ dl_main (const ElfW(Phdr) *phdr,
 		 && _dl_argc > 2)
 	  {
 	    state.glibc_hwcaps_prepend = _dl_argv[2];
-	    _dl_skip_args += 2;
 	    _dl_argc -= 2;
 	    _dl_argv += 2;
 	  }
@@ -1442,7 +1490,6 @@ dl_main (const ElfW(Phdr) *phdr,
 		 && _dl_argc > 2)
 	  {
 	    state.glibc_hwcaps_mask = _dl_argv[2];
-	    _dl_skip_args += 2;
 	    _dl_argc -= 2;
 	    _dl_argv += 2;
 	  }
@@ -1451,7 +1498,6 @@ dl_main (const ElfW(Phdr) *phdr,
 	  {
 	    state.mode = rtld_mode_list_tunables;
 
-	    ++_dl_skip_args;
 	    --_dl_argc;
 	    ++_dl_argv;
 	  }
@@ -1460,7 +1506,6 @@ dl_main (const ElfW(Phdr) *phdr,
 	  {
 	    state.mode = rtld_mode_list_diagnostics;
 
-	    ++_dl_skip_args;
 	    --_dl_argc;
 	    ++_dl_argv;
 	  }
@@ -1506,7 +1551,6 @@ dl_main (const ElfW(Phdr) *phdr,
 	    _dl_usage (ld_so_name, NULL);
 	}
 
-      ++_dl_skip_args;
       --_dl_argc;
       ++_dl_argv;
 
@@ -1605,6 +1649,9 @@ dl_main (const ElfW(Phdr) *phdr,
       /* Set the argv[0] string now that we've processed the executable.  */
       if (argv0 != NULL)
         _dl_argv[0] = argv0;
+
+      /* Adjust arguments for the application entry point.  */
+      _dl_start_args_adjust (_dl_argv - orig_argv);
     }
   else
     {
diff --git a/sysdeps/mach/hurd/dl-sysdep.c b/sysdeps/mach/hurd/dl-sysdep.c
index 3cbe075615..8373962e62 100644
--- a/sysdeps/mach/hurd/dl-sysdep.c
+++ b/sysdeps/mach/hurd/dl-sysdep.c
@@ -76,6 +76,7 @@ _dl_sysdep_start (void **start_argptr,
 {
   void go (intptr_t *argdata)
     {
+      char *orig_argv0;
       char **p;
 
       /* Cache the information in various global variables.  */
@@ -84,6 +85,8 @@ _dl_sysdep_start (void **start_argptr,
       _environ = &_dl_argv[_dl_argc + 1];
       for (p = _environ; *p++;); /* Skip environ pointers and terminator.  */
 
+      orig_argv0 = _dl_argv[0];
+
       if ((void *) p == _dl_argv[0])
 	{
 	  static struct hurd_startup_data nodata;
@@ -173,30 +176,23 @@ _dl_sysdep_start (void **start_argptr,
 
       /* The call above might screw a few things up.
 
-	 First of all, if _dl_skip_args is nonzero, we are ignoring
-	 the first few arguments.  However, if we have no Hurd startup
-	 data, it is the magical convention that ARGV[0] == P.  The
+	 P is the location after the terminating NULL of the list of
+	 environment variables.  It has to point to the Hurd startup
+	 data or if that's missing then P == ARGV[0] must hold. The
 	 startup code in init-first.c will get confused if this is not
 	 the case, so we must rearrange things to make it so.  We'll
-	 overwrite the origional ARGV[0] at P with ARGV[_dl_skip_args].
+	 recompute P and move the Hurd data or the new ARGV[0] there.
 
-	 Secondly, if we need to be secure, it removes some dangerous
-	 environment variables.  If we have no Hurd startup date this
-	 changes P (since that's the location after the terminating
-	 NULL in the list of environment variables).  We do the same
-	 thing as in the first case but make sure we recalculate P.
-	 If we do have Hurd startup data, we have to move the data
-	 such that it starts just after the terminating NULL in the
-	 environment list.
+	 Note: directly invoked ld.so can move arguments and env vars.
 
 	 We use memmove, since the locations might overlap.  */
-      if (__libc_enable_secure || _dl_skip_args)
-	{
-	  char **newp;
 
-	  for (newp = _environ; *newp++;);
+      char **newp;
+      for (newp = _environ; *newp++;);
 
-	  if (_dl_argv[-_dl_skip_args] == (char *) p)
+      if (newp != p || _dl_argv[0] != orig_argv0)
+	{
+	  if (orig_argv0 == (char *) p)
 	    {
 	      if ((char *) newp != _dl_argv[0])
 		{
  

Patch

diff --git a/elf/rtld.c b/elf/rtld.c
index 3b2e05bf4c..b5070d453f 100644
--- a/elf/rtld.c
+++ b/elf/rtld.c
@@ -1306,6 +1306,62 @@  rtld_setup_main_map (struct link_map *main_map)
   return has_interp;
 }
 
+/* Adjusts the contents of the stack and related globals for the user
+   entry point.  The ld.so processed skip_args arguments and bumped
+   _dl_argv and _dl_argc accordingly.  Those arguments are removed from
+   argv here.  */
+static void
+_dl_start_args_adjust (int skip_args)
+{
+  void **sp = (void **) (_dl_argv - skip_args - 1);
+  void **p = sp + skip_args;
+
+  if (skip_args == 0)
+    return;
+
+  /* Sanity check.  */
+  intptr_t argc = (intptr_t) sp[0] - skip_args;
+  assert (argc == _dl_argc);
+
+  /* Adjust argc on stack.  */
+  sp[0] = (void *) (intptr_t) _dl_argc;
+
+  /* Update globals in rtld.  */
+  _dl_argv -= skip_args;
+  _environ -= skip_args;
+
+  /* Shuffle argv down.  */
+  do
+    *++sp = *++p;
+  while (*p != NULL);
+
+  assert (_environ == (char **) (sp + 1));
+
+  /* Shuffle envp down.  */
+  do
+    *++sp = *++p;
+  while (*p != NULL);
+
+#ifdef HAVE_AUX_VECTOR
+  void **auxv = (void **) GLRO(dl_auxv) - skip_args;
+  GLRO(dl_auxv) = (ElfW(auxv_t *)) auxv; /* Aliasing violation.  */
+  assert (auxv == sp + 1);
+
+  /* Shuffle auxv down. */
+  void *a, *b; /* Use a pair of pointers for an auxv entry.  */
+  unsigned long a_type;
+  do
+    {
+      a_type = ((ElfW(auxv_t) *) (p + 1))->a_type;
+      a = *++p;
+      b = *++p;
+      *++sp = a;
+      *++sp = b;
+    }
+  while (a_type != AT_NULL);
+#endif
+}
+
 static void
 dl_main (const ElfW(Phdr) *phdr,
 	 ElfW(Word) phnum,
@@ -1359,6 +1415,7 @@  dl_main (const ElfW(Phdr) *phdr,
       rtld_is_main = true;
 
       char *argv0 = NULL;
+      char **orig_argv = _dl_argv;
 
       /* Note the place where the dynamic linker actually came from.  */
       GL(dl_rtld_map).l_name = rtld_progname;
@@ -1373,7 +1430,6 @@  dl_main (const ElfW(Phdr) *phdr,
 		GLRO(dl_lazy) = -1;
 	      }
 
-	    ++_dl_skip_args;
 	    --_dl_argc;
 	    ++_dl_argv;
 	  }
@@ -1382,14 +1438,12 @@  dl_main (const ElfW(Phdr) *phdr,
 	    if (state.mode != rtld_mode_help)
 	      state.mode = rtld_mode_verify;
 
-	    ++_dl_skip_args;
 	    --_dl_argc;
 	    ++_dl_argv;
 	  }
 	else if (! strcmp (_dl_argv[1], "--inhibit-cache"))
 	  {
 	    GLRO(dl_inhibit_cache) = 1;
-	    ++_dl_skip_args;
 	    --_dl_argc;
 	    ++_dl_argv;
 	  }
@@ -1399,7 +1453,6 @@  dl_main (const ElfW(Phdr) *phdr,
 	    state.library_path = _dl_argv[2];
 	    state.library_path_source = "--library-path";
 
-	    _dl_skip_args += 2;
 	    _dl_argc -= 2;
 	    _dl_argv += 2;
 	  }
@@ -1408,7 +1461,6 @@  dl_main (const ElfW(Phdr) *phdr,
 	  {
 	    GLRO(dl_inhibit_rpath) = _dl_argv[2];
 
-	    _dl_skip_args += 2;
 	    _dl_argc -= 2;
 	    _dl_argv += 2;
 	  }
@@ -1416,14 +1468,12 @@  dl_main (const ElfW(Phdr) *phdr,
 	  {
 	    audit_list_add_string (&state.audit_list, _dl_argv[2]);
 
-	    _dl_skip_args += 2;
 	    _dl_argc -= 2;
 	    _dl_argv += 2;
 	  }
 	else if (! strcmp (_dl_argv[1], "--preload") && _dl_argc > 2)
 	  {
 	    state.preloadarg = _dl_argv[2];
-	    _dl_skip_args += 2;
 	    _dl_argc -= 2;
 	    _dl_argv += 2;
 	  }
@@ -1431,7 +1481,6 @@  dl_main (const ElfW(Phdr) *phdr,
 	  {
 	    argv0 = _dl_argv[2];
 
-	    _dl_skip_args += 2;
 	    _dl_argc -= 2;
 	    _dl_argv += 2;
 	  }
@@ -1439,7 +1488,6 @@  dl_main (const ElfW(Phdr) *phdr,
 		 && _dl_argc > 2)
 	  {
 	    state.glibc_hwcaps_prepend = _dl_argv[2];
-	    _dl_skip_args += 2;
 	    _dl_argc -= 2;
 	    _dl_argv += 2;
 	  }
@@ -1447,7 +1495,6 @@  dl_main (const ElfW(Phdr) *phdr,
 		 && _dl_argc > 2)
 	  {
 	    state.glibc_hwcaps_mask = _dl_argv[2];
-	    _dl_skip_args += 2;
 	    _dl_argc -= 2;
 	    _dl_argv += 2;
 	  }
@@ -1456,7 +1503,6 @@  dl_main (const ElfW(Phdr) *phdr,
 	  {
 	    state.mode = rtld_mode_list_tunables;
 
-	    ++_dl_skip_args;
 	    --_dl_argc;
 	    ++_dl_argv;
 	  }
@@ -1465,7 +1511,6 @@  dl_main (const ElfW(Phdr) *phdr,
 	  {
 	    state.mode = rtld_mode_list_diagnostics;
 
-	    ++_dl_skip_args;
 	    --_dl_argc;
 	    ++_dl_argv;
 	  }
@@ -1511,7 +1556,6 @@  dl_main (const ElfW(Phdr) *phdr,
 	    _dl_usage (ld_so_name, NULL);
 	}
 
-      ++_dl_skip_args;
       --_dl_argc;
       ++_dl_argv;
 
@@ -1610,6 +1654,9 @@  dl_main (const ElfW(Phdr) *phdr,
       /* Set the argv[0] string now that we've processed the executable.  */
       if (argv0 != NULL)
         _dl_argv[0] = argv0;
+
+      /* Adjust arguments for the application entry point.  */
+      _dl_start_args_adjust (_dl_argv - orig_argv);
     }
   else
     {
diff --git a/sysdeps/mach/hurd/dl-sysdep.c b/sysdeps/mach/hurd/dl-sysdep.c
index 3cbe075615..8373962e62 100644
--- a/sysdeps/mach/hurd/dl-sysdep.c
+++ b/sysdeps/mach/hurd/dl-sysdep.c
@@ -76,6 +76,7 @@  _dl_sysdep_start (void **start_argptr,
 {
   void go (intptr_t *argdata)
     {
+      char *orig_argv0;
       char **p;
 
       /* Cache the information in various global variables.  */
@@ -84,6 +85,8 @@  _dl_sysdep_start (void **start_argptr,
       _environ = &_dl_argv[_dl_argc + 1];
       for (p = _environ; *p++;); /* Skip environ pointers and terminator.  */
 
+      orig_argv0 = _dl_argv[0];
+
       if ((void *) p == _dl_argv[0])
 	{
 	  static struct hurd_startup_data nodata;
@@ -173,30 +176,23 @@  _dl_sysdep_start (void **start_argptr,
 
       /* The call above might screw a few things up.
 
-	 First of all, if _dl_skip_args is nonzero, we are ignoring
-	 the first few arguments.  However, if we have no Hurd startup
-	 data, it is the magical convention that ARGV[0] == P.  The
+	 P is the location after the terminating NULL of the list of
+	 environment variables.  It has to point to the Hurd startup
+	 data or if that's missing then P == ARGV[0] must hold. The
 	 startup code in init-first.c will get confused if this is not
 	 the case, so we must rearrange things to make it so.  We'll
-	 overwrite the origional ARGV[0] at P with ARGV[_dl_skip_args].
+	 recompute P and move the Hurd data or the new ARGV[0] there.
 
-	 Secondly, if we need to be secure, it removes some dangerous
-	 environment variables.  If we have no Hurd startup date this
-	 changes P (since that's the location after the terminating
-	 NULL in the list of environment variables).  We do the same
-	 thing as in the first case but make sure we recalculate P.
-	 If we do have Hurd startup data, we have to move the data
-	 such that it starts just after the terminating NULL in the
-	 environment list.
+	 Note: directly invoked ld.so can move arguments and env vars.
 
 	 We use memmove, since the locations might overlap.  */
-      if (__libc_enable_secure || _dl_skip_args)
-	{
-	  char **newp;
 
-	  for (newp = _environ; *newp++;);
+      char **newp;
+      for (newp = _environ; *newp++;);
 
-	  if (_dl_argv[-_dl_skip_args] == (char *) p)
+      if (newp != p || _dl_argv[0] != orig_argv0)
+	{
+	  if (orig_argv0 == (char *) p)
 	    {
 	      if ((char *) newp != _dl_argv[0])
 		{