From patchwork Wed Nov 18 14:47:02 2020
X-Patchwork-Submitter: Matheus Castanho
X-Patchwork-Id: 41096
To: libc-alpha@sourceware.org
Subject: [PATCH 3/4] powerpc: Runtime selection between sc and scv for syscalls
Date: Wed, 18 Nov 2020 11:47:02 -0300
Message-Id: <20201118144703.75569-4-msc@linux.ibm.com>
In-Reply-To: <20201118144703.75569-1-msc@linux.ibm.com>
References: <20201118144703.75569-1-msc@linux.ibm.com>
From: Matheus Castanho
Reply-To: Matheus Castanho
Cc: tuliom@linux.ibm.com

Linux kernel v5.9 added support for system calls using the scv
instruction for POWER9 and later.  The new codepath provides better
performance (see below) than sc.  For the foreseeable future, both sc and
scv mechanisms will co-exist, so this patch enables glibc to do a runtime
check and always use scv when it is available.

Before issuing the system call to the kernel, we check hwcap2 in the TCB
for PPC_FEATURE2_SCV to see if scv is supported by the kernel.  If not, we
fall back to sc and keep the old behavior.

The kernel implements a different error return convention for scv, so
when returning from a system call we need to handle the return value
differently depending on the instruction we used to enter the kernel.

For syscalls implemented in ASM, entry and exit are implemented by
different macros (PSEUDO and PSEUDO_RET, resp.), which may be used in
sequence (e.g. for templated syscalls) or with other instructions in
between (e.g. clone).  To avoid accessing the TCB a second time on
PSEUDO_RET to check which instruction we used, the value read from hwcap2
is cached in a non-volatile register.

This is not needed when using the INTERNAL_SYSCALL macro, since entry and
exit are bundled into the same inline asm directive.

Since system calls may be called before the TCB has been set up (e.g.
inside the dynamic loader), we also check the value of the thread pointer
before actually accessing the TCB.
In situations where the availability of scv cannot be determined, sc is
always used.

Support for scv in syscalls implemented in their own ASM file (clone and
vfork) will be added later.  For now simply use sc as before.

Average performance over 1M calls for each syscall "type":
  - stat: C wrapper calling INTERNAL_SYSCALL
  - getpid: templated ASM syscall
  - syscall: call to gettid using syscall function

  Standard:
     stat : 1.573445 us / ~3619 cycles
   getpid : 0.164986 us / ~379 cycles
  syscall : 0.162743 us / ~374 cycles

  With scv:
     stat : 1.537049 us / ~3535 cycles <~  -84 cycles /  -2.32%
   getpid : 0.109923 us / ~253 cycles  <~ -126 cycles / -33.25%
  syscall : 0.116410 us / ~268 cycles  <~ -106 cycles / -28.34%

Tested on powerpc, powerpc64, powerpc64le (with and without scv)
---
 sysdeps/powerpc/powerpc32/sysdep.h            | 19 ++--
 sysdeps/powerpc/powerpc64/sysdep.h            | 90 ++++++++++++++++++-
 .../unix/sysv/linux/powerpc/powerpc64/clone.S |  9 +-
 .../unix/sysv/linux/powerpc/powerpc64/vfork.S |  6 +-
 sysdeps/unix/sysv/linux/powerpc/syscall.S     | 11 ++-
 sysdeps/unix/sysv/linux/powerpc/sysdep.h      | 78 +++++++++++-----
 6 files changed, 174 insertions(+), 39 deletions(-)

diff --git a/sysdeps/powerpc/powerpc32/sysdep.h b/sysdeps/powerpc/powerpc32/sysdep.h
index 829eec266a..bff18bdc8b 100644
--- a/sysdeps/powerpc/powerpc32/sysdep.h
+++ b/sysdeps/powerpc/powerpc32/sysdep.h
@@ -90,9 +90,12 @@ GOT_LABEL: ; \
   cfi_endproc; \
   ASM_SIZE_DIRECTIVE(name)
 
-#define DO_CALL(syscall) \
-    li 0,syscall; \
-    sc
+#define DO_CALL(syscall) \
+    li 0,syscall; \
+    DO_CALL_SC
+
+#define DO_CALL_SC \
+    sc
 
 #undef JUMPTARGET
 #ifdef PIC
@@ -106,14 +109,20 @@ GOT_LABEL: ; \
 # define HIDDEN_JUMPTARGET(name) __GI_##name##@local
 #endif
 
+#define TAIL_CALL_SYSCALL_ERROR \
+    b __syscall_error@local
+
 #define PSEUDO(name, syscall_name, args) \
   .section ".text"; \
   ENTRY (name) \
     DO_CALL (SYS_ify (syscall_name));
 
+#define RET_SC \
+    bnslr+;
+
 #define PSEUDO_RET \
-    bnslr+; \
-    b __syscall_error@local
+    RET_SC; \
+    TAIL_CALL_SYSCALL_ERROR
 #define ret PSEUDO_RET
 
 #undef PSEUDO_END
diff --git a/sysdeps/powerpc/powerpc64/sysdep.h b/sysdeps/powerpc/powerpc64/sysdep.h
index d557098898..2d7dde64da 100644
--- a/sysdeps/powerpc/powerpc64/sysdep.h
+++ b/sysdeps/powerpc/powerpc64/sysdep.h
@@ -17,6 +17,7 @@
    .  */
 
 #include
+#include
 
 #ifdef __ASSEMBLER__
 
@@ -263,10 +264,72 @@ LT_LABELSUFFIX(name,_name_end): ; \
   TRACEBACK_MASK(name,mask); \
   END_2(name)
 
+/* We will allocate a new frame to save LR and the non-volatile register used to
+   read the TCB when checking for scv support on syscall code.  We actually just
+   need the minimum frame size plus room for 1 reg (64 bits).  But the ABI
+   mandates stack frames should be aligned at 16 Bytes, so we end up allocating
+   a bit more space than what will actually be used.  */
+#define SCV_FRAME_SIZE (FRAME_MIN_SIZE+16)
+#define SCV_FRAME_NVOLREG_SAVE FRAME_MIN_SIZE
+
+/* Allocate frame and save register */
+#define NVOLREG_SAVE \
+    stdu r1,-SCV_FRAME_SIZE(r1); \
+    std r31,SCV_FRAME_NVOLREG_SAVE(r1); \
+    cfi_adjust_cfa_offset(SCV_FRAME_SIZE);
+
+/* Restore register and destroy frame */
+#define NVOLREG_RESTORE \
+    ld r31,SCV_FRAME_NVOLREG_SAVE(r1); \
+    addi r1,r1,SCV_FRAME_SIZE; \
+    cfi_adjust_cfa_offset(-SCV_FRAME_SIZE);
+
+/* Check PPC_FEATURE2_SCV bit from hwcap2 in the TCB and update CR0
+ * accordingly.  First, we check if the thread pointer != 0, so we don't try to
+ * access the TCB before it has been initialized, e.g. inside the dynamic
+ * loader.  If it is already initialized, check if scv is available.  On both
+ * negative cases, go to JUMPFALSE (label given by the macro's caller).  We
+ * save the value we read from the TCB in a non-volatile register so we can
+ * reuse it later when exiting from the syscall in PSEUDO_RET.  */
+    .macro CHECK_SCV_SUPPORT REG JUMPFALSE
+
+    /* Check if thread pointer has already been set up */
+    cmpdi r13,0
+    beq \JUMPFALSE
+
+    /* Read PPC_FEATURE2_SCV from TCB and store it in REG */
+    ld \REG,TCB_HWCAP(PT_THREAD_POINTER)
+    andis. \REG,\REG,PPC_FEATURE2_SCV>>16
+
+    beq \JUMPFALSE
+    .endm
+
+/* Before doing the syscall, check if we can use scv.  scv is supported by P9
+ * and later with Linux v5.9 and later.  If so, use it.  Otherwise, fall back to
+ * sc.  We use a non-volatile register to save hwcap2 from the TCB, so we need
+ * to save its content beforehand.  */
 #define DO_CALL(syscall) \
-    li 0,syscall; \
+    li r0,syscall; \
+    NVOLREG_SAVE; \
+    CHECK_SCV_SUPPORT r31 0f; \
+    DO_CALL_SCV; \
+    b 1f; \
+0:  DO_CALL_SC; \
+1:
+
+/* DO_CALL_SC and DO_CALL_SCV expect the syscall number to be loaded on r0.  */
+#define DO_CALL_SC \
     sc
 
+#define DO_CALL_SCV \
+    mflr r9; \
+    std r9,FRAME_LR_SAVE(r1); \
+    cfi_offset(lr,FRAME_LR_SAVE); \
+    scv 0; \
+    ld r9,FRAME_LR_SAVE(r1); \
+    mtlr r9; \
+    cfi_restore(lr);
+
 /* ppc64 is always PIC */
 #undef JUMPTARGET
 #define JUMPTARGET(name) FUNC_LABEL(name)
@@ -304,9 +367,26 @@ LT_LABELSUFFIX(name,_name_end): ; \
     .endif
 #endif
 
+/* This should only be called after a DO_CALL.  In such cases, r31 contains the
+ * value of PPC_FEATURE2_SCV read from hwcap2 by CHECK_SCV_SUPPORT.  If it is
+ * set, we know we have entered the kernel using scv, so handle the return code
+ * accordingly.  */
 #define PSEUDO_RET \
-    bnslr+; \
-    TAIL_CALL_SYSCALL_ERROR
+    cmpdi cr5,r31,0; \
+    NVOLREG_RESTORE; \
+    beq cr5,0f; \
+    RET_SCV; \
+    b 1f; \
+0:  RET_SC; \
+1:  TAIL_CALL_SYSCALL_ERROR
+
+#define RET_SCV \
+    cmpdi r3,0; \
+    bgelr+; \
+    neg r3,r3;
+
+#define RET_SC \
+    bnslr+;
 
 #define ret PSEUDO_RET
 
@@ -319,7 +399,9 @@ LT_LABELSUFFIX(name,_name_end): ; \
   ENTRY (name); \
   DO_CALL (SYS_ify (syscall_name))
 
+/* This should only be called after a DO_CALL.  */
 #define PSEUDO_RET_NOERRNO \
+    NVOLREG_RESTORE; \
     blr
 
 #define ret_NOERRNO PSEUDO_RET_NOERRNO
@@ -333,7 +415,9 @@ LT_LABELSUFFIX(name,_name_end): ; \
   ENTRY (name); \
   DO_CALL (SYS_ify (syscall_name))
 
+/* This should only be called after a DO_CALL.  */
 #define PSEUDO_RET_ERRVAL \
+    NVOLREG_RESTORE; \
     blr
 
 #define ret_ERRVAL PSEUDO_RET_ERRVAL
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S b/sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S
index b30641c805..fc496fa671 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S
@@ -68,7 +68,8 @@ ENTRY (__clone)
     cfi_endproc
 
     /* Do the call.  */
-    DO_CALL(SYS_ify(clone))
+    li r0,SYS_ify(clone)
+    DO_CALL_SC
 
     /* Check for child process.  */
     cmpdi cr1,r3,0
@@ -82,7 +83,8 @@ ENTRY (__clone)
     bctrl
     ld r2,FRAME_TOC_SAVE(r1)
 
-    DO_CALL(SYS_ify(exit))
+    li r0,(SYS_ify(exit))
+    DO_CALL_SC
     /* We won't ever get here but provide a nop so that the linker
        will insert a toc adjusting stub if necessary.  */
     nop
@@ -104,7 +106,8 @@ L(parent):
     cfi_restore(r30)
     cfi_restore(r31)
 
-    PSEUDO_RET
+    RET_SC
+    TAIL_CALL_SYSCALL_ERROR
 
 END (__clone)
 
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/vfork.S b/sysdeps/unix/sysv/linux/powerpc/powerpc64/vfork.S
index 17199fb14a..a71f69e929 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/vfork.S
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/vfork.S
@@ -28,9 +28,11 @@
 ENTRY (__vfork)
     CALL_MCOUNT 0
 
-    DO_CALL (SYS_ify (vfork))
+    li r0,SYS_ify (vfork)
+    DO_CALL_SC
 
-    PSEUDO_RET
+    RET_SC
+    TAIL_CALL_SYSCALL_ERROR
 
 PSEUDO_END (__vfork)
 libc_hidden_def (__vfork)
diff --git a/sysdeps/unix/sysv/linux/powerpc/syscall.S b/sysdeps/unix/sysv/linux/powerpc/syscall.S
index 48dade4642..23ce2f69c9 100644
--- a/sysdeps/unix/sysv/linux/powerpc/syscall.S
+++ b/sysdeps/unix/sysv/linux/powerpc/syscall.S
@@ -25,6 +25,13 @@ ENTRY (syscall)
     mr r6,r7
     mr r7,r8
     mr r8,r9
-    sc
-    PSEUDO_RET
+#if defined(__PPC64__) || defined(__powerpc64__)
+    CHECK_SCV_SUPPORT r9 0f
+    DO_CALL_SCV
+    RET_SCV
+    b 1f
+#endif
+0:  DO_CALL_SC
+    RET_SC
+1:  TAIL_CALL_SYSCALL_ERROR
 PSEUDO_END (syscall)
diff --git a/sysdeps/unix/sysv/linux/powerpc/sysdep.h b/sysdeps/unix/sysv/linux/powerpc/sysdep.h
index b2bca598b9..19f4321c6b 100644
--- a/sysdeps/unix/sysv/linux/powerpc/sysdep.h
+++ b/sysdeps/unix/sysv/linux/powerpc/sysdep.h
@@ -64,39 +64,69 @@
 #define INTERNAL_VSYSCALL_CALL(funcptr, nr, args...) \
   INTERNAL_VSYSCALL_CALL_TYPE(funcptr, long int, nr, args)
 
+#define DECLARE_REGS \
+  register long int r0 __asm__ ("r0"); \
+  register long int r3 __asm__ ("r3"); \
+  register long int r4 __asm__ ("r4"); \
+  register long int r5 __asm__ ("r5"); \
+  register long int r6 __asm__ ("r6"); \
+  register long int r7 __asm__ ("r7"); \
+  register long int r8 __asm__ ("r8");
+
+#define SYSCALL_SCV(nr) \
+  ({ \
+    __asm__ __volatile__ \
+      ("scv 0\n\t" \
+       "0:" \
+       : "=&r" (r0), \
+         "=&r" (r3), "=&r" (r4), "=&r" (r5), \
+         "=&r" (r6), "=&r" (r7), "=&r" (r8) \
+       : ASM_INPUT_##nr \
+       : "r9", "r10", "r11", "r12", \
+         "lr", "ctr", "memory"); \
+    r3; \
+  })
 
-#undef INTERNAL_SYSCALL
-#define INTERNAL_SYSCALL_NCS(name, nr, args...) \
-  ({ \
-    register long int r0 __asm__ ("r0"); \
-    register long int r3 __asm__ ("r3"); \
-    register long int r4 __asm__ ("r4"); \
-    register long int r5 __asm__ ("r5"); \
-    register long int r6 __asm__ ("r6"); \
-    register long int r7 __asm__ ("r7"); \
-    register long int r8 __asm__ ("r8"); \
-    LOADARGS_##nr (name, ##args); \
-    __asm__ __volatile__ \
-      ("sc\n\t" \
-       "mfcr %0\n\t" \
-       "0:" \
-       : "=&r" (r0), \
-         "=&r" (r3), "=&r" (r4), "=&r" (r5), \
-         "=&r" (r6), "=&r" (r7), "=&r" (r8) \
-       : ASM_INPUT_##nr \
-       : "r9", "r10", "r11", "r12", \
-         "cr0", "ctr", "memory"); \
-    r0 & (1 << 28) ? -r3 : r3; \
+#define SYSCALL_SC(nr) \
+  ({ \
+    __asm__ __volatile__ \
+      ("sc\n\t" \
+       "mfcr %0\n\t" \
+       "0:" \
+       : "=&r" (r0), \
+         "=&r" (r3), "=&r" (r4), "=&r" (r5), \
+         "=&r" (r6), "=&r" (r7), "=&r" (r8) \
+       : ASM_INPUT_##nr \
+       : "r9", "r10", "r11", "r12", \
+         "cr0", "ctr", "memory"); \
+    r0 & (1 << 28) ? -r3 : r3; \
   })
-#define INTERNAL_SYSCALL(name, nr, args...) \
-  INTERNAL_SYSCALL_NCS (__NR_##name, nr, args)
 
 #if defined(__PPC64__) || defined(__powerpc64__)
 # define SYSCALL_ARG_SIZE 8
+
+# define INTERNAL_SYSCALL_NCS(name, nr, args...) \
+  ({ \
+    DECLARE_REGS; \
+    LOADARGS_##nr (name, ##args); \
+    __thread_register != 0 && THREAD_GET_HWCAP() & PPC_FEATURE2_SCV ? \
+      SYSCALL_SCV(nr) : SYSCALL_SC(nr); \
+  })
 #else
 # define SYSCALL_ARG_SIZE 4
+
+# define INTERNAL_SYSCALL_NCS(name, nr, args...) \
+  ({ \
+    DECLARE_REGS; \
+    LOADARGS_##nr (name, ##args); \
+    SYSCALL_SC(nr); \
+  })
 #endif
 
+#undef INTERNAL_SYSCALL
+#define INTERNAL_SYSCALL(name, nr, args...) \
+  INTERNAL_SYSCALL_NCS (__NR_##name, nr, args)
+
 #define LOADARGS_0(name, dummy) \
     r0 = name
 #define LOADARGS_1(name, __arg1) \