ld.so: add an --argv0 option

Message ID CALjvMRXfEQA7j8LKQYJqCTdO+qcUuAoSbBXSe9FBa9zPgE82Vw@mail.gmail.com
State Superseded
Series ld.so: add an --argv0 option

Commit Message

Vincent Mihalkovic July 22, 2020, 8:33 a.m. UTC
Hi,
a few years ago there was an effort to add an --argv0 option:
https://sourceware.org/legacy-ml/libc-alpha/2016-04/msg00576.html.
I updated the old patch for version 2.31.9000 and added a test case and a
changelog entry.
2020-07-22  Vincent Mihalkovic  <vmihalko@redhat.com>

	* elf/Makefile (tests): Add argv0test.
	(tests-special): Add $(objpfx)argv0test.out.
	($(objpfx)argv0test.out): New rule.
	* elf/rtld.c (dl_main): Add --argv0 option.
	* elf/argv0test.c: New file.
	* elf/tst-rtld-argv0.sh: New file.
  

Comments

Florian Weimer July 22, 2020, 9 a.m. UTC | #1
* Vincent Mihalkovic via Libc-alpha:

> a few years ago there was an effort to add an --argv0 option:
> https://sourceware.org/legacy-ml/libc-alpha/2016-04/msg00576.html.
> I updated the old patch for version 2.31.9000 and added a test case and a
> changelog entry.

Many shells support “exec -a” to get a similar effect.  The advantage is
that this does not perturb the argument vector layout.  Given that, I
think promoting --argv0 would merely introduce further compatibility
issues.

Thanks,
Florian
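
For reference, the effect of “exec -a” can be sketched in C: the caller of
execv chooses argv[0] independently of the path being executed.  This is
only an illustration; build_argv and exec_with_argv0 are hypothetical
helper names, not glibc interfaces.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Build a NULL-terminated argument vector whose first element is ARGV0,
   followed by the N strings in ARGS.  The strings are borrowed, so the
   caller frees only the returned array.  Hypothetical helper.  */
char **
build_argv (const char *argv0, char *const args[], size_t n)
{
  assert (argv0 != NULL);
  char **v = malloc ((n + 2) * sizeof *v);
  if (v == NULL)
    return NULL;
  v[0] = (char *) argv0;
  for (size_t i = 0; i < n; i++)
    v[i + 1] = args[i];
  v[n + 1] = NULL;
  return v;
}

/* Replace the current process image with PATH while reporting ARGV0 as
   the program name.  Returns -1 if the exec could not be performed.  */
int
exec_with_argv0 (const char *path, const char *argv0,
                 char *const args[], size_t n)
{
  char **v = build_argv (argv0, args, n);
  if (v == NULL)
    return -1;
  execv (path, v);
  /* execv returns only on failure.  */
  free (v);
  return -1;
}
```

The name set this way is the argv[0] of whatever PATH names.  If PATH is
ld.so itself, it is the loader that receives that name.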
  
Vincent Mihalkovic July 26, 2020, 8 p.m. UTC | #2
Hi,

Sorry, I forgot to CC the libc-alpha mailing-list on my reply.

We do not need the option for shell scripting, where it is indeed not
needed, as you say.  We are developing a tool for running dynamic
analysis tools on RPM packages fully automatically.  The tool appends
custom linker flags while executing the %build section of RPMs to make
the binaries use our custom ELF interpreter.  While running the %check
section, the interpreter takes care of running the binaries built in
%build through the selected dynamic analyzer in a way that does not
break the testing framework.  For this to work, we need to run dynamic
linker explicitly.  There is currently no way to preserve the original
argv[0], which some programs are sensitive to.  This unnecessarily
breaks the tests running in %check of some RPM packages.

An experimental implementation of the custom ELF interpreter is available
here:

    https://github.com/kdudka/cswrap/pull/2

thanks for considering this idea,
vincent mihalkovic
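
As an illustration of a program that is sensitive to argv[0], here is a
minimal sketch of a multi-call binary that selects its mode from the
invocation name.  The applet names "pack" and "unpack" and both function
names are made up for this example.

```c
#include <assert.h>
#include <string.h>

/* Return the final path component of ARGV0.  Hypothetical helper.  */
const char *
applet_name (const char *argv0)
{
  assert (argv0 != NULL);
  const char *slash = strrchr (argv0, '/');
  return slash != NULL ? slash + 1 : argv0;
}

/* Pick a mode from the invocation name, as a multi-call binary might.
   Returns 1 for "unpack", 0 for "pack", and -1 for any other name.  */
int
mode_from_argv0 (const char *argv0)
{
  const char *name = applet_name (argv0);
  if (strcmp (name, "unpack") == 0)
    return 1;
  if (strcmp (name, "pack") == 0)
    return 0;
  return -1;
}
```

When such a program is started through an explicit ld.so invocation, it
sees the path that was passed to the loader as argv[0].  If that path
differs from the name the test suite used, the mode lookup fails.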
  
Florian Weimer July 27, 2020, 5:54 a.m. UTC | #3
* Vincent Mihalkovic:

> Sorry, I forgot to CC the libc-alpha mailing-list on my reply.

And I think this won't make it to the list due to the HTML filters, so
here's a full quote:

> We do not need the option for shell scripting, where it is indeed not
> needed, as you say.  We are developing a tool for running dynamic
> analysis tools on RPM packages fully automatically.  The tool appends
> custom linker flags while executing the %build section of RPMs to make
> the binaries use our custom ELF interpreter.  While running the %check
> section, the interpreter takes care of running the binaries built in
> %build through the selected dynamic analyzer in a way that does not
> break the testing framework.  For this to work, we need to run dynamic
> linker explicitly.  There is currently no way to preserve the original
> argv[0], which some programs are sensitive to.  This unnecessarily
> breaks the tests running in %check of some RPM packages.
>
> An experimental implementation of the custom ELF interpreter is available
> here:
>
>     https://github.com/kdudka/cswrap/pull/2
>
> thanks for considering this idea,
> vincent mihalkovic

I was wrong about this, and it is not possible to get the desired
behavior using “exec -a”.  The question remains if just updating the
pointer is sufficient in this context, or if a more elaborate copying
operation is needed to preserve the expected semantics of the argument
vector.

I guess we could add --argv0 now (well, after the 2.32 release), and if
it's incompatible with some applications, we can perhaps tweak it later.

Thanks,
Florian
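
The "just updating the pointer" approach can be modelled outside the
loader as follows.  This is a simplified sketch, not the code in
elf/rtld.c: model_rtld_args is a hypothetical name, and only --argv0 is
handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified model of the loader's option handling.  On entry *ARGV is
   { "ld.so", [--argv0 STRING], PROGRAM, ARGS..., NULL }.  On success
   *ARGV points at the program's own vector and 0 is returned; -1 is
   returned if no program was named.  */
int
model_rtld_args (int *argc, char ***argv)
{
  char *argv0 = NULL;

  assert (*argc >= 1);

  /* Same guard as the patch: the option needs a value after it.  */
  while (*argc > 2 && strcmp ((*argv)[1], "--argv0") == 0)
    {
      argv0 = (*argv)[2];
      *argc -= 2;
      *argv += 2;
    }

  if (*argc < 2)
    return -1;

  /* Skip the loader's own slot so the program path becomes argv[0].  */
  --*argc;
  ++*argv;

  /* Pointer update only: no string is copied or moved.  */
  if (argv0 != NULL)
    (*argv)[0] = argv0;
  return 0;
}
```

Only slot 0 of the vector changes.  The string it now points to stays
where the loader's original arguments were stored, which is why the
question of a more elaborate copy comes up.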
  

Patch

diff --git a/elf/Makefile b/elf/Makefile
index 0b78721848..f38904d831 100644
--- a/elf/Makefile
+++ b/elf/Makefile
@@ -210,7 +210,8 @@  tests += restest1 preloadtest loadfail multiload origtest resolvfail \
 	 tst-filterobj tst-filterobj-dlopen tst-auxobj tst-auxobj-dlopen \
 	 tst-audit14 tst-audit15 tst-audit16 \
 	 tst-single_threaded tst-single_threaded-pthread \
-	 tst-tls-ie tst-tls-ie-dlmopen
+	 tst-tls-ie tst-tls-ie-dlmopen \
+	 argv0test
 #	 reldep9
 tests-internal += loadtest unload unload2 circleload1 \
 	 neededtest neededtest2 neededtest3 neededtest4 \
@@ -414,7 +415,7 @@  endif
 ifeq (yes,$(build-shared))
 ifeq ($(run-built-tests),yes)
 tests-special += $(objpfx)tst-pathopt.out $(objpfx)tst-rtld-load-self.out \
-		 $(objpfx)tst-rtld-preload.out
+		 $(objpfx)tst-rtld-preload.out $(objpfx)argv0test.out
 endif
 tests-special += $(objpfx)check-textrel.out $(objpfx)check-execstack.out \
 		 $(objpfx)check-wx-segment.out \
@@ -1796,3 +1797,11 @@  $(objpfx)tst-tls-ie-dlmopen.out: \
   $(objpfx)tst-tls-ie-mod6.so
 
 $(objpfx)tst-tls-surplus: $(libdl)
+
+ARGV0 = test-argv0
+$(objpfx)argv0test.out: tst-rtld-argv0.sh $(objpfx)ld.so \
+			$(objpfx)argv0test
+	$(SHELL) $< $(objpfx)ld.so $(objpfx)argv0test \
+            '$(test-wrapper-env)' '$(run_program_env)' \
+            '$(rpath-link)' '$(ARGV0)' > $@; \
+    $(evaluate-test)
diff --git a/elf/argv0test.c b/elf/argv0test.c
new file mode 100644
index 0000000000..4c79bebf23
--- /dev/null
+++ b/elf/argv0test.c
@@ -0,0 +1,13 @@ 
+#include <stdio.h>   // for printf
+#include <string.h>  // for strcmp
+
+
+int main( int argc, char *argv[] ) {
+	int result = strcmp( argv[0], "test-argv0");
+
+	printf ("argv[0] = %s, strcmp( argv[0], \"test-argv0\" ) = %d, %s\n",
+					argv[0], result, !result ? "ok" : "wrong");
+
+	return result;
+
+}
diff --git a/elf/rtld.c b/elf/rtld.c
index 5b882163fa..cafa4f9bd3 100644
--- a/elf/rtld.c
+++ b/elf/rtld.c
@@ -1202,6 +1202,8 @@  dl_main (const ElfW(Phdr) *phdr,
 	 installing it.  */
       rtld_is_main = true;
 
+      char *argv0 = NULL;
+
       /* Note the place where the dynamic linker actually came from.  */
       GL(dl_rtld_map).l_name = rtld_progname;
 
@@ -1259,6 +1261,14 @@  dl_main (const ElfW(Phdr) *phdr,
 	else if (! strcmp (_dl_argv[1], "--preload") && _dl_argc > 2)
 	  {
 	    preloadarg = _dl_argv[2];
+	    _dl_skip_args += 2;
+	    _dl_argc -= 2;
+	    _dl_argv += 2;
+	  }
+	else if (! strcmp (_dl_argv[1], "--argv0") && _dl_argc > 2)
+	  {
+	    argv0 = _dl_argv[2];
+
 	    _dl_skip_args += 2;
 	    _dl_argc -= 2;
 	    _dl_argv += 2;
@@ -1292,7 +1302,8 @@  of this helper program; chances are you did not intend to run this program.\n\
   --inhibit-rpath LIST  ignore RUNPATH and RPATH information in object names\n\
 			in LIST\n\
   --audit LIST          use objects named in LIST as auditors\n\
-  --preload LIST        preload objects named in LIST\n");
+  --preload LIST        preload objects named in LIST\n\
+  --argv0 STRING        set argv[0] to STRING before running\n");
 
       ++_dl_skip_args;
       --_dl_argc;
@@ -1384,6 +1395,11 @@  of this helper program; chances are you did not intend to run this program.\n\
 	    break;
 	  }
 #endif
+
+      /* Set the argv[0] string now that we've processed the executable.  */
+      if (argv0 != NULL) {
+		_dl_argv[0] = argv0;
+	  }
     }
   else
     {
diff --git a/elf/tst-rtld-argv0.sh b/elf/tst-rtld-argv0.sh
new file mode 100755
index 0000000000..5f873b6c5c
--- /dev/null
+++ b/elf/tst-rtld-argv0.sh
@@ -0,0 +1,37 @@ 
+#!/bin/sh
+# Test the --argv0 option of ld.so.
+# Copyright (C) 2019-2020 Free Software Foundation, Inc.
+# This file is part of the GNU C Library.
+#
+# The GNU C Library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2.1 of the License, or (at your option) any later version.
+#
+# The GNU C Library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with the GNU C Library; if not, see
+# <https://www.gnu.org/licenses/>.
+
+set -e
+
+rtld=$1
+test_program=$2
+test_wrapper_env=$3
+run_program_env=$4
+library_path=$5
+argv0=$6
+
+echo "# [${test_wrapper_env}] [${run_program_env}] [$rtld] [--library-path]" \
+     "[$library_path] [--argv0] [$argv0] [$test_program]"
+${test_wrapper_env} \
+${run_program_env} \
+$rtld --library-path "$library_path" \
+  --argv0 "$argv0" $test_program 2>&1 && rc=0 || rc=$?
+echo "# exit status $rc"
+
+exit $rc