[RFC,7/9] hurd: Generalize init-first.c to support x86_64
Checks
Context |
Check |
Description |
dj/TryBot-apply_patch |
success
|
Patch applied to master at the time it was sent
|
Commit Message
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
---
sysdeps/mach/hurd/{i386 => x86}/init-first.c | 17 ++++++++++++++---
1 file changed, 14 insertions(+), 3 deletions(-)
rename sysdeps/mach/hurd/{i386 => x86}/init-first.c (96%)
Comments
Sergey Bugaev, le sam. 18 févr. 2023 23:37:15 +0300, a ecrit:
> Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
> ---
> sysdeps/mach/hurd/{i386 => x86}/init-first.c | 17 ++++++++++++++---
> 1 file changed, 14 insertions(+), 3 deletions(-)
> rename sysdeps/mach/hurd/{i386 => x86}/init-first.c (96%)
>
> diff --git a/sysdeps/mach/hurd/i386/init-first.c b/sysdeps/mach/hurd/x86/init-first.c
> similarity index 96%
> rename from sysdeps/mach/hurd/i386/init-first.c
> rename to sysdeps/mach/hurd/x86/init-first.c
> index a558da16..75ac1ff2 100644
> --- a/sysdeps/mach/hurd/i386/init-first.c
> +++ b/sysdeps/mach/hurd/x86/init-first.c
> @@ -227,10 +227,15 @@ init (int *data)
> values we set just above. We have stashed in %eax the user code
> return address. Push it on the top of the stack so it acts as
> init1's return address, and then jump there. */
> +#ifdef __x86_64__
> +asm ("call_init1:\n"
> + " push %rax\n"
> + " jmp *%rcx\n");
> +#else
> asm ("call_init1:\n"
> " push %eax\n"
> " jmp *%ecx\n");
> -
> +#endif
>
> /* Do the first essential initializations that must precede all else. */
> static inline void
> @@ -242,7 +247,7 @@ first_init (void)
> #ifndef SHARED
> /* In the static case, we need to set up TLS early so that the stack
> protection guard can be read at gs:0x14 by the gcc-generated snippets. */
> - _hurd_tls_init(&__init1_tcbhead);
> + _hurd_tls_init (&__init1_tcbhead);
> asm ("movw %%gs,%w0" : "=m" (__init1_desc));
> #endif
>
> @@ -300,7 +305,7 @@ _hurd_stack_setup (void)
> {
> /* If we use ``__builtin_frame_address (0) + 2'' here, GCC gets
> confused. */
> - init ((int *) &argc);
> + init (&argc);
That won't work on x86_64: there, arguments are passed mostly through
registers, so &argc won't actually give you the address of arguments on
the stack. This can probably be checked on linux x86_64 to make sure how
arguments are passed.
> }
>
> /* Push the user return address after the argument data, and then
> @@ -308,9 +313,15 @@ _hurd_stack_setup (void)
> caller had called `doinit1' with the argument data already on the
> stack. */
> *--data = caller;
> +# ifdef __x86_64__
> + asm volatile ("movq %0, %%rsp\n" /* Switch to new outermost stack. */
> + "movq $0, %%rbp\n" /* Clear outermost frame pointer. */
> + "jmp *%1" : : "r" (data), "r" (&doinit1));
> +# else
> asm volatile ("movl %0, %%esp\n" /* Switch to new outermost stack. */
> "movl $0, %%ebp\n" /* Clear outermost frame pointer. */
> "jmp *%1" : : "r" (data), "r" (&doinit1));
> +# endif
> /* NOTREACHED */
> }
>
> --
> 2.39.2
>
>
On Mon, Feb 20, 2023 at 3:01 AM Samuel Thibault <samuel.thibault@gnu.org> wrote:
> That won't work on x86_64: there, arguments are passed mostly through
> registers, so &argc won't actually give you the address of arguments on
> the stack.
Right, good point.
I wish I had a better understanding of just what's going on in this
file. Maybe a lot of the hacks can be rewritten in a nicer way. For
instance, do we really need to hijack the return addresses and jump to
init1 in this weird way, only to enable it to access argc/arg0? Since
we know where they are on our stack (__builtin_frame_address (0) + 2
or something like that), can't we just pass it a pointer?
Let me actually try just that...
Sergey
> I wish I had a better understanding of just what's going on in this
> file. Maybe a lot of the hacks can be rewritten in a nicer way. For
> instance, do we really need to hijack the return addresses and jump to
> init1 in this weird way, only to enable it to access argc/arg0? Since
> we know where they are on our stack (__builtin_frame_address (0) + 2
> or something like that), can't we just pass it a pointer?
>
> Let me actually try just that...
Eh, no, let's start even earlier than that. Please correct me if I
(inevitably) get things wrong.
_dl_init_first is Hurd-specific. It is called from the assembly code
in dl-machine.h, specifically there's a RTLD_START_SPECIAL_INIT macro
that's defined to call _dl_init_first on the Hurd, and to nothing
otherwise. This RTLD_START_SPECIAL_INIT is used ("invoked") in
i386/dl-machine.h, s390/s390-{32,64}/dl-machine.h, ia64/dl-machine.h,
and alpha/dl-machine.h (but notably not in x86_64/dl-machine.h). In
all cases, it's emphasized that "The special initializer gets called
with the stack just as the application's entry point will see it; it
can switch stacks if it moves these contents over."
But I conclude that:
- s390-gnu, ia64-gnu, and alpha-gnu ports are nonexistent, nor are
they realistically ever going to happen, so we can ignore them
completely
- the implementation does not seem to actually switch stacks (in fact,
I have removed the unused switch_stacks function in the last commit)
-- so the "gets called with the stack just as the application's entry
point will see it" property may not be important anymore?
The only thing it really needs, it seems, is a pointer to
argc/argv/envp & Hurd data block *somewhere*. It does not have to be
on the stack (though where else would it be), or immediately preceding
its call frame -- all that really matters is that there's a pointer.
So my thinking goes, why don't we just hook into _dl_start, which
already has this very pointer? And in fact, _dl_start calls
_dl_sysdep_start, for which there is already a Hurd version. Can't we
just call our logic from there, and not worry about the stack layout
and overwriting return addresses?
That would work for the SHARED case; we also need to do something for
the static case. In that case, we are invoked by static-start.S; do I
understand it right that the argc/argv/whatever is still located
on-stack even on x86_64 in this case, and not passed in registers
according to the calling convention? Then again, we should be able to
just use __builtin_frame_address (0) + 2 and avoid most of the hacks?
Please tell me if this makes any sense.
Sergey
On Sat, Feb 18, 2023 at 2:40 PM Sergey Bugaev via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>
> Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
> ---
> sysdeps/mach/hurd/{i386 => x86}/init-first.c | 17 ++++++++++++++---
> 1 file changed, 14 insertions(+), 3 deletions(-)
> rename sysdeps/mach/hurd/{i386 => x86}/init-first.c (96%)
>
> diff --git a/sysdeps/mach/hurd/i386/init-first.c b/sysdeps/mach/hurd/x86/init-first.c
> similarity index 96%
> rename from sysdeps/mach/hurd/i386/init-first.c
> rename to sysdeps/mach/hurd/x86/init-first.c
> index a558da16..75ac1ff2 100644
> --- a/sysdeps/mach/hurd/i386/init-first.c
> +++ b/sysdeps/mach/hurd/x86/init-first.c
> @@ -227,10 +227,15 @@ init (int *data)
> values we set just above. We have stashed in %eax the user code
> return address. Push it on the top of the stack so it acts as
> init1's return address, and then jump there. */
> +#ifdef __x86_64__
> +asm ("call_init1:\n"
> + " push %rax\n"
> + " jmp *%rcx\n");
> +#else
> asm ("call_init1:\n"
> " push %eax\n"
> " jmp *%ecx\n");
> -
> +#endif
>
> /* Do the first essential initializations that must precede all else. */
> static inline void
> @@ -242,7 +247,7 @@ first_init (void)
> #ifndef SHARED
> /* In the static case, we need to set up TLS early so that the stack
> protection guard can be read at gs:0x14 by the gcc-generated snippets. */
> - _hurd_tls_init(&__init1_tcbhead);
> + _hurd_tls_init (&__init1_tcbhead);
> asm ("movw %%gs,%w0" : "=m" (__init1_desc));
> #endif
>
> @@ -300,7 +305,7 @@ _hurd_stack_setup (void)
> {
> /* If we use ``__builtin_frame_address (0) + 2'' here, GCC gets
> confused. */
> - init ((int *) &argc);
> + init (&argc);
> }
>
> /* Push the user return address after the argument data, and then
> @@ -308,9 +313,15 @@ _hurd_stack_setup (void)
> caller had called `doinit1' with the argument data already on the
> stack. */
> *--data = caller;
> +# ifdef __x86_64__
> + asm volatile ("movq %0, %%rsp\n" /* Switch to new outermost stack. */
> + "movq $0, %%rbp\n" /* Clear outermost frame pointer. */
Unless you are intentionally preserving flags, `xorl %%ebp, %%ebp` is
better way to
zero a register.
> + "jmp *%1" : : "r" (data), "r" (&doinit1));
> +# else
> asm volatile ("movl %0, %%esp\n" /* Switch to new outermost stack. */
> "movl $0, %%ebp\n" /* Clear outermost frame pointer. */
> "jmp *%1" : : "r" (data), "r" (&doinit1));
> +# endif
> /* NOTREACHED */
> }
>
> --
> 2.39.2
>
Sergey Bugaev, le lun. 20 févr. 2023 20:10:29 +0300, a ecrit:
> - the implementation does not seem to actually switch stacks (in fact,
> I have removed the unused switch_stacks function in the last commit)
> -- so the "gets called with the stack just as the application's entry
> point will see it" property may not be important anymore?
That's possible, yes.
> So my thinking goes, why don't we just hook into _dl_start, which
> already has this very pointer? And in fact, _dl_start calls
> _dl_sysdep_start, for which there is already a Hurd version. Can't we
> just call our logic from there, and not worry about the stack layout
> and overwriting return addresses?
Possibly, yes.
> That would work for the SHARED case; we also need to do something for
> the static case. In that case, we are invoked by static-start.S; do I
> understand it right that the argc/argv/whatever is still located
> on-stack even on x86_64 in this case, and not passed in registers
> according to the calling convention?
I don't know. But you can check on Linux, we'll probably want to use the
same ABI.
Samuel
similarity index 96%
rename from sysdeps/mach/hurd/i386/init-first.c
rename to sysdeps/mach/hurd/x86/init-first.c
@@ -227,10 +227,15 @@ init (int *data)
values we set just above. We have stashed in %eax the user code
return address. Push it on the top of the stack so it acts as
init1's return address, and then jump there. */
+#ifdef __x86_64__
+asm ("call_init1:\n"
+ " push %rax\n"
+ " jmp *%rcx\n");
+#else
asm ("call_init1:\n"
" push %eax\n"
" jmp *%ecx\n");
-
+#endif
/* Do the first essential initializations that must precede all else. */
static inline void
@@ -242,7 +247,7 @@ first_init (void)
#ifndef SHARED
/* In the static case, we need to set up TLS early so that the stack
protection guard can be read at gs:0x14 by the gcc-generated snippets. */
- _hurd_tls_init(&__init1_tcbhead);
+ _hurd_tls_init (&__init1_tcbhead);
asm ("movw %%gs,%w0" : "=m" (__init1_desc));
#endif
@@ -300,7 +305,7 @@ _hurd_stack_setup (void)
{
/* If we use ``__builtin_frame_address (0) + 2'' here, GCC gets
confused. */
- init ((int *) &argc);
+ init (&argc);
}
/* Push the user return address after the argument data, and then
@@ -308,9 +313,15 @@ _hurd_stack_setup (void)
caller had called `doinit1' with the argument data already on the
stack. */
*--data = caller;
+# ifdef __x86_64__
+ asm volatile ("movq %0, %%rsp\n" /* Switch to new outermost stack. */
+ "movq $0, %%rbp\n" /* Clear outermost frame pointer. */
+ "jmp *%1" : : "r" (data), "r" (&doinit1));
+# else
asm volatile ("movl %0, %%esp\n" /* Switch to new outermost stack. */
"movl $0, %%ebp\n" /* Clear outermost frame pointer. */
"jmp *%1" : : "r" (data), "r" (&doinit1));
+# endif
/* NOTREACHED */
}