From patchwork Thu Mar 10 01:00:57 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nix
X-Patchwork-Id: 11297
Received: (qmail 115857 invoked by alias); 10 Mar 2016 01:01:22 -0000
Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
Precedence: bulk
Sender: libc-alpha-owner@sourceware.org
Delivered-To: mailing list libc-alpha@sourceware.org
Received: (qmail 115843 invoked by uid 89); 10 Mar 2016 01:01:22 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-0.9 required=5.0 tests=BAYES_00,
 KAM_LAZY_DOMAIN_SECURITY, UNPARSEABLE_RELAY autolearn=no version=3.3.2
 spammy=remainder, Either, hows, dst
X-HELO: aserp1040.oracle.com
From: Nix
To: Mike Frysinger
Cc: libc-alpha@sourceware.org
Subject: Re: [PATCH 05/18] Open-code the memcpy() at static TLS
 initialization time.
References: <1457445064-7107-1-git-send-email-nix@esperi.org.uk>
 <1457445064-7107-6-git-send-email-nix@esperi.org.uk>
 <20160309224330.GE6588@vapier.lan>
Date: Thu, 10 Mar 2016 01:00:57 +0000
In-Reply-To: <20160309224330.GE6588@vapier.lan> (Mike Frysinger's message of
 "Wed, 9 Mar 2016 17:43:30 -0500")
Message-ID: <87y49rkuye.fsf@esperi.org.uk>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.0.50 (gnu/linux)
MIME-Version: 1.0

On 9 Mar 2016, Mike Frysinger uttered the following:

> On 08 Mar 2016 13:50, Nix wrote:
>> This one is a bit nasty. Now that we are initializing TLS earlier for
>> the stack canary's sake, existing memcpy() implementations become
>> problematic. We can use the multiarch implementations, but they might
>> not always be present, and even if they are present they might not always
>> be in assembler, so might be compiled with stack-protection.
>> We cannot use posix/memcpy.c without marking both it and */wordcopy.c
>> as non-stack-protected, which for memcpy() of all things seems like a
>> seriously bad idea: if any function in glibc should be stack-protected,
>> it's memcpy() (though stack-protecting the many optimized assembly
>> versions is not done in this patch series).
>>
>> So we have two real options: hack up the guts of posix/memcpy.c and
>> */wordcopy.c so that they can be #included (renamed and declared static)
>> inside libc-tls.c, or simply open-code the memcpy(). For simplicity's
>> sake, this patch open-codes it, on the grounds that static binaries are
>> relatively rare and quasi-deprecated anyway, and static binaries with
>> large TLS sections are yet rarer, and not worth the complexity of hacking
>> up all the arch-dependent wordcopy files.
>
> can we utilize _HAVE_STRING_ARCH_memcpy here ? the string includes will
> sometimes provide inlined asm versions ...

I think so, *if* we can be sure that arches that define
_HAVE_STRING_ARCH_memcpy will never expand memcpy() to macros that
sometimes call functions, but only to pure inline asm stuff. Since there
is only one definer of that (x86), I'm not sure that's guaranteed...

(This does explain why my original implementation, which didn't do any
of this, also didn't have any problems on x86!)

How's this instead? (Seems to work just as well on x86, which is after
all the only arch this change will affect at all.)

From 3427487ee08586f76f711758795dba31b62c4238 Mon Sep 17 00:00:00 2001
From: Nick Alcock
Date: Tue, 23 Feb 2016 11:08:38 +0000
Subject: [PATCH] Open-code the memcpy() at static TLS initialization time.

This one is a bit nasty. Now that we are initializing TLS earlier for
the stack canary's sake, existing memcpy() implementations become
problematic.
We can use the multiarch implementations, but they might not always be
present, and even if they are present they might not always be in
assembler, so might be compiled with stack-protection. We cannot use
posix/memcpy.c without marking both it and */wordcopy.c as
non-stack-protected, which for memcpy() of all things seems like a
seriously bad idea: if any function in glibc should be stack-protected,
it's memcpy() (though stack-protecting the many optimized assembly
versions is not done in this patch series).

So we have two real options: hack up the guts of posix/memcpy.c and
*/wordcopy.c so that they can be #included (renamed and declared static)
inside libc-tls.c, or simply open-code the memcpy(). For simplicity's
sake, this patch open-codes it, on the grounds that static binaries are
relatively rare and quasi-deprecated anyway, and static binaries with
large TLS sections are yet rarer, and not worth the complexity of hacking
up all the arch-dependent wordcopy files.

There is one exception: if the arch provides an inline assembler
memcpy() implementation, we can use that in preference. (This was not
revealed when testing on x86 because on that platform GCC was open-coding
the memcpy() for us.)

v2: New, lets us remove the memcpy() -fno-stack-protection, which wasn't
    enough in any case.
v4: Add an inhibit_loop_to_libcall to prevent GCC from turning the loop
    back into a memcpy() again. Wrap long lines.
v6: Use the inline assembler ARCH_memcpy if available.

	* csu/libc-tls.c (__libc_setup_tls): Add inhibit_loop_to_libcall
	to avoid calls to potentially ifunced or stack-protected memcpy.
	Open-code the TLS-initialization memcpy.
---
 csu/libc-tls.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/csu/libc-tls.c b/csu/libc-tls.c
index 3d67a64..af0d3e6 100644
--- a/csu/libc-tls.c
+++ b/csu/libc-tls.c
@@ -102,6 +102,7 @@ init_static_tls (size_t memsz, size_t align)
 }
 
 void
+inhibit_loop_to_libcall
 __libc_setup_tls (size_t tcbsize, size_t tcbalign)
 {
   void *tlsblock;
@@ -176,8 +177,22 @@ __libc_setup_tls (size_t tcbsize, size_t tcbalign)
 # error "Either TLS_TCB_AT_TP or TLS_DTV_AT_TP must be defined"
 #endif
   _dl_static_dtv[2].pointer.is_static = true;
-  /* sbrk gives us zero'd memory, so we don't need to clear the remainder.  */
+
+  /* sbrk gives us zero'd memory, so we don't need to clear the remainder.
+
+     Use inlined asm implementation if available: otherwise, copy by hand,
+     because memcpy() is stack-protected and is often multiarch too.  */
+
+#if defined _HAVE_STRING_ARCH_memcpy
   memcpy (_dl_static_dtv[2].pointer.val, initimage, filesz);
+#else
+  char *dst = (char *) _dl_static_dtv[2].pointer.val;
+  char *src = (char *) initimage;
+  size_t i;
+
+  for (i = 0; i < filesz; dst++, src++, i++)
+    *dst = *src;
+#endif
 
   /* Install the pointer to the dtv.  */
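
For anyone who wants to try the idea outside the glibc tree, the following
is a minimal standalone sketch of the same technique, not part of the patch.
It assumes GCC; the macro INHIBIT_LOOP_TO_LIBCALL and the helper
copy_bytes_open_coded() are hypothetical names made up for this
illustration, and the attribute string is modeled on what glibc's
inhibit_loop_to_libcall is understood to expand to, so check
include/libc-symbols.h before relying on it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for glibc's inhibit_loop_to_libcall (assumption:
   the real macro uses the same optimize attribute).  With GCC it turns off
   the pass that recognizes copy loops and replaces them with calls to
   memcpy()/memset(); with other compilers it expands to nothing.  */
#if defined __GNUC__ && !defined __clang__
# define INHIBIT_LOOP_TO_LIBCALL \
  __attribute__ ((__optimize__ ("-fno-tree-loop-distribute-patterns")))
#else
# define INHIBIT_LOOP_TO_LIBCALL
#endif

/* Hypothetical helper: copy N bytes from SRC to DST one byte at a time
   without calling memcpy().  The regions must not overlap.  Returns DST.  */
INHIBIT_LOOP_TO_LIBCALL
void *
copy_bytes_open_coded (void *dst, const void *src, size_t n)
{
  char *d = dst;
  const char *s = src;
  size_t i;

  /* A zero-length copy is allowed with any pointers.  */
  assert (n == 0 || (d != NULL && s != NULL));

  for (i = 0; i < n; i++)
    d[i] = s[i];

  return dst;
}
```

Building this with optimization enabled and inspecting the generated
assembly for the helper should show a plain byte loop rather than a call to
memcpy; without the attribute, GCC may well emit the call.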