Message ID | 20150814121907.GF28610@gmail.com |
---|---|
State | Dropped |
Headers |
Received: (qmail 28859 invoked by alias); 14 Aug 2015 12:19:15 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <libc-alpha.sourceware.org> List-Unsubscribe: <mailto:libc-alpha-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/libc-alpha/> List-Post: <mailto:libc-alpha@sourceware.org> List-Help: <mailto:libc-alpha-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 28849 invoked by uid 89); 14 Aug 2015 12:19:15 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=0.2 required=5.0 tests=AWL, BAYES_50, FREEMAIL_FROM, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=ham version=3.3.2 X-HELO: mail-pd0-f169.google.com X-Received: by 10.70.98.163 with SMTP id ej3mr80189143pdb.61.1439554748613; Fri, 14 Aug 2015 05:19:08 -0700 (PDT) Date: Fri, 14 Aug 2015 05:19:07 -0700 From: "H.J. Lu" <hjl.tools@gmail.com> To: GNU C Library <libc-alpha@sourceware.org> Subject: [PATCH 6/6] Compile sched_getaffinity.c with -fno-builtin-memset Message-ID: <20150814121907.GF28610@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.23 (2014-03-12) |
Commit Message
H.J. Lu
Aug. 14, 2015, 12:19 p.m. UTC
Since sched_getaffinity.c calls memset which may not be inlined, we should compile it with -fno-builtin-memset so that the internal hidden memset is called without PLT. OK for master? * sysdeps/unix/sysv/linux/Makefile (CFLAGS-sched_getaffinity.c): New. Set to -fno-builtin-memset. --- sysdeps/unix/sysv/linux/Makefile | 2 ++ 1 file changed, 2 insertions(+)
Comments
On Fri, 14 Aug 2015, H.J. Lu wrote: > Since sched_getaffinity.c calls memset which may not be inlined, we > should compile it with -fno-builtin-memset so that the internal hidden > memset is called without PLT. > > OK for master? memset is called all over the place. I don't like hardcoding special options for particular files based on calls to memset, and especially not based on what some particular compiler version happens to do with a particular call. Why doesn't the call to libc_hidden_builtin_proto (memset) in include/string.h suffice? Can you devise some set of declarations / toplevel asms to put in a header that will ensure memset calls go via a hidden alias (ideally with the compiler knowing it's hidden, not just the assembler / linker, so the compiler can do any optimizations based on it being an intra-library call)?
On Fri, Aug 14, 2015 at 8:27 AM, Joseph Myers <joseph@codesourcery.com> wrote: > On Fri, 14 Aug 2015, H.J. Lu wrote: > >> Since sched_getaffinity.c calls memset which may not be inlined, we >> should compile it with -fno-builtin-memset so that the internal hidden >> memset is called without PLT. >> >> OK for master? > > memset is called all over the place. I don't like hardcoding special > options for particular files based on calls to memset, and especially not > based on what some particular compiler version happens to do with a > particular call. Why doesn't the call to libc_hidden_builtin_proto > (memset) in include/string.h suffice? Can you devise some set of > declarations / toplevel asms to put in a header that will ensure memset > calls go via a hidden alias (ideally with the compiler knowing it's > hidden, not just the assembler / linker, so the compiler can do any > optimizations based on it being an intra-library call)? libc_hidden_builtin_proto (memset) has no impact on memset inlined by compiler.
On Fri, 14 Aug 2015, H.J. Lu wrote: > On Fri, Aug 14, 2015 at 8:27 AM, Joseph Myers <joseph@codesourcery.com> wrote: > > On Fri, 14 Aug 2015, H.J. Lu wrote: > > > >> Since sched_getaffinity.c calls memset which may not be inlined, we > >> should compile it with -fno-builtin-memset so that the internal hidden > >> memset is called without PLT. > >> > >> OK for master? > > > > memset is called all over the place. I don't like hardcoding special > > options for particular files based on calls to memset, and especially not > > based on what some particular compiler version happens to do with a > > particular call. Why doesn't the call to libc_hidden_builtin_proto > > (memset) in include/string.h suffice? Can you devise some set of > > declarations / toplevel asms to put in a header that will ensure memset > > calls go via a hidden alias (ideally with the compiler knowing it's > > hidden, not just the assembler / linker, so the compiler can do any > > optimizations based on it being an intra-library call)? > > libc_hidden_builtin_proto (memset) > > has no impact on memset inlined by compiler. Well, I see references to __GI_memset, not memset, in sched_getaffinity.os for both x86 and x86_64 (building with GCC 5). So it seems to be working OK for me. It's true some of those references use PLT-generating relocations, but those relocations shouldn't actually generate PLT entries when they are relocations against symbols not exported from the library - the linker should convert them to direct calls. If they do generate PLT entries, it sounds like a linker optimization is missing. Apart from any missing linker optimizations, the compiler should know when calls are to hidden functions as calls to those may be more efficient than calls that only the linker determines don't need a PLT entry. On 32-bit x86, that means the compiler knowing it doesn't need to set %ebx up to point to the GOT as it would for a call through the PLT. I don't know whether this gets optimized for built-in function calls using libc_hidden_builtin_proto, but if it doesn't, that seems like a missed optimization that should be addressed in the compiler not in glibc.
On Fri, Aug 14, 2015 at 8:46 AM, Joseph Myers <joseph@codesourcery.com> wrote: > On Fri, 14 Aug 2015, H.J. Lu wrote: > >> On Fri, Aug 14, 2015 at 8:27 AM, Joseph Myers <joseph@codesourcery.com> wrote: >> > On Fri, 14 Aug 2015, H.J. Lu wrote: >> > >> >> Since sched_getaffinity.c calls memset which may not be inlined, we >> >> should compile it with -fno-builtin-memset so that the internal hidden >> >> memset is called without PLT. >> >> >> >> OK for master? >> > >> > memset is called all over the place. I don't like hardcoding special >> > options for particular files based on calls to memset, and especially not >> > based on what some particular compiler version happens to do with a >> > particular call. Why doesn't the call to libc_hidden_builtin_proto >> > (memset) in include/string.h suffice? Can you devise some set of >> > declarations / toplevel asms to put in a header that will ensure memset >> > calls go via a hidden alias (ideally with the compiler knowing it's >> > hidden, not just the assembler / linker, so the compiler can do any >> > optimizations based on it being an intra-library call)? >> >> libc_hidden_builtin_proto (memset) >> >> has no impact on memset inlined by compiler. > > Well, I see references to __GI_memset, not memset, in sched_getaffinity.os > for both x86 and x86_64 (building with GCC 5). So it seems to be working > OK for me. It's true some of those references use PLT-generating Yes, it does reference __GI_memset, but via PLT, since GCC doesn't know the libcall for the builtin memset is hidden. > relocations, but those relocations shouldn't actually generate PLT entries > when they are relocations against symbols not exported from the library - > the linker should convert them to direct calls. If they do generate PLT > entries, it sounds like a linker optimization is missing. > > Apart from any missing linker optimizations, the compiler should know when Linker isn't an issue. > calls are to hidden functions as calls to those may be more efficient than > calls that only the linker determines don't need a PLT entry. On 32-bit > x86, that means the compiler knowing it doesn't need to set %ebx up to %ebx is the issue. > point to the GOT as it would for a call through the PLT. I don't know > whether this gets optimized for built-in function calls using > libc_hidden_builtin_proto, but if it doesn't, that seems like a missed > optimization that should be addressed in the compiler not in glibc. > It is hard to say it is a GCC issue when we are kind of doing this behind the back of GCC unless there is a way to tell GCC that the libcall of a builtin function is hidden.
On Fri, 14 Aug 2015, H.J. Lu wrote: > > point to the GOT as it would for a call through the PLT. I don't know > > whether this gets optimized for built-in function calls using > > libc_hidden_builtin_proto, but if it doesn't, that seems like a missed > > optimization that should be addressed in the compiler not in glibc. > > > > It is hard to say it is a GCC issue when we are kind of doing this behind > the back of GCC unless there is a way to tell GCC that the libcall of > a builtin function is hidden. We're declaring (essentially, though a load of macros): extern __typeof (memset) memset __asm__ ("__GI_memset") __attribute__ ((visibility ("hidden"))); GCC is using the __asm__ information to compile calls to __builtin_memset to use the assembler name __GI_memset - that is, in that regard our declaration is being taken as applying to the libcall. If it's not also using the visibility information to know that those are calls within the library and don't need to set up %ebx, that's a clear inconsistency; all pieces of information from an explicit declaration of a built-in function should be treated the same unless there is a clear reason not to do so.
On Fri, Aug 14, 2015 at 9:02 AM, Joseph Myers <joseph@codesourcery.com> wrote: > On Fri, 14 Aug 2015, H.J. Lu wrote: > >> > point to the GOT as it would for a call through the PLT. I don't know >> > whether this gets optimized for built-in function calls using >> > libc_hidden_builtin_proto, but if it doesn't, that seems like a missed >> > optimization that should be addressed in the compiler not in glibc. >> > >> >> It is hard to say it is a GCC issue when we are kind of doing this behind >> the back of GCC unless there is a way to tell GCC that the libcall of >> a builtin function is hidden. > > We're declaring (essentially, though a load of macros): > > extern __typeof (memset) memset __asm__ ("__GI_memset") __attribute__ ((visibility ("hidden"))); > > GCC is using the __asm__ information to compile calls to __builtin_memset > to use the assembler name __GI_memset - that is, in that regard our > declaration is being taken as applying to the libcall. If it's not also > using the visibility information to know that those are calls within the > library and don't need to set up %ebx, that's a clear inconsistency; all > pieces of information from an explicit declaration of a built-in function > should be treated the same unless there is a clear reason not to do so. > Should I open a GCC bug?
On Fri, 14 Aug 2015, H.J. Lu wrote: > > We're declaring (essentially, though a load of macros): > > > > extern __typeof (memset) memset __asm__ ("__GI_memset") __attribute__ ((visibility ("hidden"))); > > > > GCC is using the __asm__ information to compile calls to __builtin_memset > > to use the assembler name __GI_memset - that is, in that regard our > > declaration is being taken as applying to the libcall. If it's not also > > using the visibility information to know that those are calls within the > > library and don't need to set up %ebx, that's a clear inconsistency; all > > pieces of information from an explicit declaration of a built-in function > > should be treated the same unless there is a clear reason not to do so. > > > > Should I open a GCC bug? If you wish (and have verified that %ebx is being unnecessarily set up). It's a missed optimization rather than an actual bug.
On Fri, Aug 14, 2015 at 9:11 AM, Joseph Myers <joseph@codesourcery.com> wrote: > On Fri, 14 Aug 2015, H.J. Lu wrote: > >> > We're declaring (essentially, though a load of macros): >> > >> > extern __typeof (memset) memset __asm__ ("__GI_memset") __attribute__ ((visibility ("hidden"))); >> > >> > GCC is using the __asm__ information to compile calls to __builtin_memset >> > to use the assembler name __GI_memset - that is, in that regard our >> > declaration is being taken as applying to the libcall. If it's not also >> > using the visibility information to know that those are calls within the >> > library and don't need to set up %ebx, that's a clear inconsistency; all >> > pieces of information from an explicit declaration of a built-in function >> > should be treated the same unless there is a clear reason not to do so. >> > >> >> Should I open a GCC bug? > > If you wish (and have verified that %ebx is being unnecessarily set up). > It's a missed optimization rather than an actual bug. I withdrew my patch and opened: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67220
diff --git a/sysdeps/unix/sysv/linux/Makefile b/sysdeps/unix/sysv/linux/Makefile index bfbabd4..368f57b 100644 --- a/sysdeps/unix/sysv/linux/Makefile +++ b/sysdeps/unix/sysv/linux/Makefile @@ -144,6 +144,8 @@ tests += tst-getcpu CFLAGS-fork.c = $(libio-mtsafe) CFLAGS-getpid.o = -fomit-frame-pointer CFLAGS-getpid.os = -fomit-frame-pointer +# Use the internal hidden memset. +CFLAGS-sched_getaffinity.c = -fno-builtin-memset endif ifeq ($(subdir),inet)