[RFC,09/11] x86-64: add nop instruction after syscall instruction

Message ID 48e2b8f06d60dbc03b58d44e02af36baa003b952.1568219400.git.isaku.yamahata@gmail.com
State Dropped

Commit Message

Isaku Yamahata Sept. 11, 2019, 9:04 p.m. UTC
This patch replaces each syscall instruction with a syscall followed by
nops, and records every such site in an annotation section.
The impact on a traditional runtime is the extra nops and the list of
syscall instruction sites.

A LibOS hooks system calls and redirects control to itself so that it
can handle the system call instead of the kernel.
The fallback is trap-and-emulate (e.g. via SIGSYS or SIGILL), but it is
slow.
As an optimization, the syscall instruction is rewritten in place. The
approach can vary among LibOSes. The common challenges are
  a) identifying the syscall instructions and
  b) replacing each syscall with an instruction sequence.
x86-64 instructions have variable length and the syscall instruction is
2 bytes long. A call/jump with a 4-byte displacement, on the other hand,
requires 5 bytes, which makes in-place replacement difficult. (If a
jump to an 8-byte absolute address is wanted, even more space is needed.)
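
The size mismatch above can be made concrete with a minimal C sketch.
It is not part of this patch; encode_jmp_rel32 and the two length
constants are hypothetical names used only for illustration. It encodes
a `jmp rel32` (opcode 0xE9) and shows why 5 bytes do not fit over a
2-byte syscall (0x0F 0x05) without extra room.

```c
#include <stdint.h>

/* Instruction lengths in bytes: syscall is 0F 05, jmp rel32 is E9 + imm32. */
enum { SYSCALL_LEN = 2, JMP_REL32_LEN = 5 };

/* Hypothetical helper: encode `jmp rel32` as if placed at address `site`
   and jumping to `target`.  Returns 0 on success, or -1 when the target
   is not reachable with a signed 32-bit displacement. */
static int encode_jmp_rel32(uint8_t out[JMP_REL32_LEN],
                            uint64_t site, uint64_t target)
{
    /* The displacement is relative to the end of the jump instruction. */
    int64_t rel = (int64_t)(target - (site + JMP_REL32_LEN));
    if (rel < INT32_MIN || rel > INT32_MAX)
        return -1;

    uint32_t bits = (uint32_t)(int32_t)rel;
    out[0] = 0xE9;
    for (int i = 0; i < 4; i++)          /* little-endian immediate */
        out[1 + i] = (uint8_t)(bits >> (8 * i));
    return 0;
}
```

A jump alone overruns the original syscall by
JMP_REL32_LEN - SYSCALL_LEN = 3 bytes, and a target outside the
+/-2 GiB range cannot be encoded this way at all.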

This patch creates a list of syscall instructions and adds nops
after each syscall instruction to leave enough room for binary editing
without fragile, complex tricks.
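
As a rough illustration of how a LibOS might consume that list, here is
a minimal C sketch. It is not part of this patch. It assumes the record
layout emitted by the SYSCALL_INST macro in the diff (`.balign 8`, then
an 8-byte address, then a 1-byte length), assumes the buffer starts on
an 8-byte boundary of the section, and treats the stored value as the
final runtime address. struct syscall_site and parse_syscall_sites are
hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* One decoded record of .libos.instructions.syscall (hypothetical type). */
struct syscall_site {
    uint64_t addr;  /* address of the syscall instruction (label 551) */
    uint8_t  len;   /* bytes from syscall through the nops (552 - 551) */
};

/* Decode up to `max` records from a raw copy of the section.
   Each record is 9 bytes and starts on an 8-byte boundary, so the
   padding after the length byte is skipped before the next record.
   Returns the number of records decoded. */
static size_t parse_syscall_sites(const uint8_t *sec, size_t size,
                                  struct syscall_site *out, size_t max)
{
    size_t n = 0, off = 0;

    while (n < max && off + 9 <= size) {
        uint64_t addr = 0;
        for (int i = 0; i < 8; i++)      /* .quad is little-endian */
            addr |= (uint64_t)sec[off + i] << (8 * i);

        out[n].addr = addr;
        out[n].len = sec[off + 8];
        n++;

        off = (off + 9 + 7) & ~(size_t)7;  /* round up to next .balign 8 */
    }
    return n;
}
```

With the macro in this patch, every record would carry a length of 12
(2 bytes of syscall plus 10 nops), which is the room available for the
replacement sequence.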

The instruction sequence assumed to replace the syscall instruction
is as follows, but LibOSes can do whatever they want.
Notice that the Linux x86-64 syscall ABI is stricter than the normal
function call convention
(%rcx and %r11 are clobbered, %rflags is preserved, and the redzone
can't be used).
If we can relax it, these snippets can be optimized/shortened.
In fact, almost all callers of the syscall instruction tolerate
use of the redzone and a clobbered %rflags.
For now this sequence is chosen to minimize the impact on glibc.

syscall sequence:
> syscall
> nop; nop; ... (add enough room for binary editing)

replacing sequence:
>  leaq 1f(%rip), %rcx
>  jmp syscall_func
>  1:
>
> 48 8d 0d 06 00 00 00         leaq   0x6(%rip),%rcx
> e9 00 00 00 00               jmpq   0x79
>          R_X86_64_PC32       syscall_func-0x4

The callee function looks something like this: it saves %rflags,
reserves the redzone, calls the LibOS entry point, then releases the
redzone, restores %rflags, and jumps back to the caller.
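> syscall_func:
>         xchgq %r11, -8(%rsp)    # %r11 = top redzone slot
>         pushfq                  # %rflags overwrites that slot
>         xchgq %r11, (%rsp)      # slot restored; %r11 = %rflags
>         subq $120, %rsp         # skip the rest of the 128-byte redzone
>         pushq %r11              # save %rflags
>         pushq %rcx              # save return address
>         callq <LibOS entry point>
>         popq %rcx
>         popq %r11
>         addq $120, %rsp
>         xchgq %r11, (%rsp)      # slot = %rflags; %r11 = redzone slot
>         popfq                   # restore %rflags
>         xchgq %r11, -8(%rsp)    # restore the redzone slot
>         jmpq *%rcx              # return to the patched site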

Signed-off-by: Isaku Yamahata <isaku.yamahata@gmail.com>
---
 sysdeps/unix/sysv/linux/x86_64/sysdep.h | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)
  

Patch

diff --git a/sysdeps/unix/sysv/linux/x86_64/sysdep.h b/sysdeps/unix/sysv/linux/x86_64/sysdep.h
index 4f1aab7209..d958c1ca7a 100644
--- a/sysdeps/unix/sysv/linux/x86_64/sysdep.h
+++ b/sysdeps/unix/sysv/linux/x86_64/sysdep.h
@@ -27,9 +27,28 @@ 
 #include <dl-sysdep.h>
 
 #ifdef __ASSEMBLER__
-# define SYSCALL_INST syscall
+.macro SYSCALL_INST
+    551:
+    syscall
+    nop;nop;nop;nop;nop;nop;nop;nop;nop;nop
+    552:
+    .pushsection .libos.instructions.syscall, "a"
+    .balign 8
+    .quad 551b
+    .byte 552b - 551b
+    .popsection
+.endm
 #else
-# define SYSCALL_INST "syscall\n\t"
+#define SYSCALL_INST                                        \
+    "551:\n\t"                                              \
+    "syscall\n\t"                                           \
+    "nop;nop;nop;nop;nop;nop;nop;nop;nop;nop\n\t"           \
+    "552:\n\t"                                              \
+    ".pushsection .libos.instructions.syscall, \"a\"\n\t"   \
+    ".balign 8\n\t"                                         \
+    ".quad 551b\n\t"                                        \
+    ".byte 552b-551b\n\t"                                   \
+    ".popsection\n\t"
 #endif
 
 /* For Linux we can use the system call table in the header file