Do not transform strchr into rawmemchr

Message ID AM5PR0802MB26101582FEA5EBF10B7CE8F583BE0@AM5PR0802MB2610.eurprd08.prod.outlook.com
State Superseded
Headers

Commit Message

Wilco Dijkstra Nov. 16, 2016, 6:59 p.m. UTC
  GLIBC uses strchr (s, '\0') as an idiom to find the end of a string.
This is transformed into rawmemchr by the bits/string2.h header.
However this is generally slower than strlen on most targets, even when
an optimized rawmemchr implementation exists.  Since GCC7 optimizes
strchr (s, '\0') to strlen (s) + s, the GLIBC headers should not
transform this to rawmemchr.

GLIBC tests pass, OK for commit?

ChangeLog:
2015-11-16  Wilco Dijkstra  <wdijkstr@arm.com>

        * string/bits/string2.h (strchr): Use __builtin_strchr.
--
  

Comments

Zack Weinberg Nov. 16, 2016, 7:02 p.m. UTC | #1
On Wed, Nov 16, 2016 at 1:59 PM, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
> GLIBC uses strchr (s, '\0') as an idiom to find the end of a string.
> This is transformed into rawmemchr by the bits/string2.h header.
> However this is generally slower than strlen on most targets, even when
> an optimized rawmemchr implementation exists.  Since GCC7 optimizes
> strchr (s, '\0') to strlen (s) + s, the GLIBC headers should not
> transform this to rawmemchr.

I endorse the removal of the non-optimization, but ...

>  #ifndef _HAVE_STRING_ARCH_strchr
> -extern void *__rawmemchr (const void *__s, int __c);
> -#  define strchr(s, c) \
> -  (__extension__ (__builtin_constant_p (c) && !__builtin_constant_p (s)              \
> -                 && (c) == '\0'                                       \
> -                 ? (char *) __rawmemchr (s, c)                                \
> -                 : __builtin_strchr (s, c)))
> +# define strchr(s, c) __builtin_strchr (s, c)
>  #endif

... wouldn't it be just as effective to remove this block entirely?
That is, don't #define strchr at all.

zw
  
Wilco Dijkstra Nov. 17, 2016, 11:22 a.m. UTC | #2
Zack Weinberg wrote:

> I endorse the removal of the non-optimization, but ...

> ... wouldn't it be just as effective to remove this block entirely?
> That is, don't #define strchr at all.

Well that's a good question. Most string functions directly use the GCC
builtin either via a define or via an inline function, so I simply follow that
convention. 

However is there any benefit in doing so? If there is no benefit then we could
remove a lot of code from the GLIBC headers, particularly string.h (and your
covariance patch could become even simpler without the inline functions).

Wilco
  
Zack Weinberg Nov. 17, 2016, 1:54 p.m. UTC | #3
On 11/17/2016 06:22 AM, Wilco Dijkstra wrote:
> Zack Weinberg wrote:
> 
>> I endorse the removal of the non-optimization, but ...
> 
>> ... wouldn't it be just as effective to remove this block entirely?
>> That is, don't #define strchr at all.
> 
> Well that's a good question. Most string functions directly use the GCC
> builtin either via a define or via an inline function, so I simply follow that
> convention. 
> 
> However is there any benefit in doing so? If there is no benefit then we could
> remove a lot of code from the GLIBC headers, particularly string.h (and your
> covariance patch could become even simpler without the inline functions).

Funny you should mention that; I'm right now working on a branch that
completely removes both bits/string.h and bits/string2.h -- not because
I want to actually do that (though I'm hoping we can do _most_ of it),
but because I think it will be handy to have around for experiments to
find out whether the existing string inlines are actually any use.

Regarding __builtin_strfoo versus strfoo, I _think_ an inline forward to
__builtin_strfoo does something useful when either the name or the type
signature of 'strfoo' doesn't match what the compiler expects.  So, for
instance, we will want them in the covariance case as long as the C++
compiler hasn't had its builtins updated to know that there are now two
overloads of strwhatever -- but then they become unnecessary.  On the
other hand, in the strchr/rawmemchr case, the basic prototype for strchr
should be enough to clue the compiler that it knows what this function
does, with no further prompting.  I could be wrong about this.

cc: Joseph: do you know whether my understanding is accurate?

zw
  
Joseph Myers Nov. 17, 2016, 2:38 p.m. UTC | #4
On Thu, 17 Nov 2016, Zack Weinberg wrote:

> Regarding __builtin_strfoo versus strfoo, I _think_ an inline forward to
> __builtin_strfoo does something useful when either the name or the type
> signature of 'strfoo' doesn't match what the compiler expects.  So, for
> instance, we will want them in the covariance case as long as the C++
> compiler hasn't had its builtins updated to know that there are now two
> overloads of strwhatever -- but then they become unnecessary.  On the
> other hand, in the strchr/rawmemchr case, the basic prototype for strchr
> should be enough to clue the compiler that it knows what this function
> does, with no further prompting.  I could be wrong about this.
> 
> cc: Joseph: do you know whether my understanding is accurate?

Mapping strfoo to __builtin_strfoo may also be useful in the case where 
strfoo is not a C90 function (so the user may have used -std=c90 
-D_GNU_SOURCE, getting the declaration of strfoo but without it being 
built-in in the compiler).

In the strchr case, it's a C90 function so aside from inlines for C++ 
overloads mapping to __builtin_strchr should not be of use (if someone 
uses -fno-builtin or -ffreestanding there should be no expectation that 
headers need to try to optimize things anyway, and even the case where 
-std with feature test macros means the function is declared but not 
built-in is pretty marginal).
  
Wilco Dijkstra Nov. 18, 2016, 1:40 p.m. UTC | #5
Joseph Myers wrote:
> On Thu, 17 Nov 2016, Zack Weinberg wrote:
> > Regarding __builtin_strfoo versus strfoo, I _think_ an inline forward to
> > __builtin_strfoo does something useful when either the name or the type
> > signature of 'strfoo' doesn't match what the compiler expects.  So, for
> > instance, we will want them in the covariance case as long as the C++
> > compiler hasn't had its builtins updated to know that there are now two
> > overloads of strwhatever -- but then they become unnecessary.  On the
> > other hand, in the strchr/rawmemchr case, the basic prototype for strchr
> > should be enough to clue the compiler that it knows what this function
> > does, with no further prompting.  I could be wrong about this.

> Mapping strfoo to __builtin_strfoo may also be useful in the case where 
> strfoo is not a C90 function (so the user may have used -std=c90 
> -D_GNU_SOURCE, getting the declaration of strfoo but without it being 
> built-in in the compiler).

Also I believe this remapping is needed for non-C90 functions that are used
inside GLIBC itself. Eg. stpcpy is used a lot, and so is mapped to __stpcpy.
Because __stpcpy is not a GCC builtin, it is mapped to __builtin_stpcpy.

Oddly enough all this happens in string/bits/string2.h, which is only included 
for some optimization settings. If GLIBC is for example built for -Os it won't do
this remapping, which means the namespace is polluted with non-C90 symbols.
So should these internal remappings be moved to string.h and made conditional,
so we only do this when building GLIBC?

> In the strchr case, it's a C90 function so aside from inlines for C++ 
> overloads mapping to __builtin_strchr should not be of use (if someone 
> uses -fno-builtin or -ffreestanding there should be no expectation that 
> headers need to try to optimize things anyway, and even the case where 
> -std with feature test macros means the function is declared but not 
> built-in is pretty marginal).

I'll remove the strchr, strncpy, strcspn, strspn and strpbrk cases from string2.h then.

Wilco
  
Joseph Myers Nov. 18, 2016, 6:05 p.m. UTC | #6
On Fri, 18 Nov 2016, Wilco Dijkstra wrote:

> Also I believe this remapping is needed for non-C90 functions that are used
> inside GLIBC itself. Eg. stpcpy is used a lot, and so is mapped to __stpcpy.
> Because __stpcpy is not a GCC builtin, it is mapped to __builtin_stpcpy.
> 
> Oddly enough all this happens in string/bits/string2.h, which is only included 
> for some optimization settings. If GLIBC is for example built for -Os it won't do
> this remapping, which means the namespace is polluted with non-C90 symbols.
> So should these internal remappings be moved to string.h and made conditional,
> so we only do this when building GLIBC?

(Bug 19463 is linknamespace failures with -Os and bug 15105 is localplt 
failures, but even if stpcpy appears in one or both sets of failures, 
there are a lot more as well, so fixing things for stpcpy might be 
relevant to such bugs but wouldn't fix them.)

Since the built-in expansions of stpcpy may not be expanded inline, we 
also have

libc_hidden_builtin_proto (stpcpy)

#if (!IS_IN (libc) || !defined SHARED) \
  && !defined NO_MEMPCPY_STPCPY_REDIRECT
/* Redirect calls to __builtin_mempcpy and __builtin_stpcpy to call
   __mempcpy and __stpcpy if not inlined.  */
extern __typeof (mempcpy) mempcpy __asm__ ("__mempcpy");
extern __typeof (stpcpy) stpcpy __asm__ ("__stpcpy");
#endif

in include/string.h - the first to cover the case of references to stpcpy 
inside shared libc, the rest to cover references elsewhere.

Thus, it should be possible just to call stpcpy directly everywhere in 
glibc rather than __stpcpy, and rely on those redirections to ensure that 
the calls actually end up being namespace-clean (but can be inlined if 
appropriate as GCC knows about stpcpy as a built-in function in gnu11 
mode).  Then the definition of __stpcpy in bits/string2.h should not be 
needed for building glibc.
  

Patch

diff --git a/string/bits/string2.h b/string/bits/string2.h
index 80987602f34ded483854bcea86dabd5b81e42a18..f0e2ce9cd87033698236e7878c63a2e5f9bb1887 100644
--- a/string/bits/string2.h
+++ b/string/bits/string2.h
@@ -59,12 +59,7 @@ 
 
 
 #ifndef _HAVE_STRING_ARCH_strchr
-extern void *__rawmemchr (const void *__s, int __c);
-#  define strchr(s, c) \
-  (__extension__ (__builtin_constant_p (c) && !__builtin_constant_p (s)              \
-                 && (c) == '\0'                                       \
-                 ? (char *) __rawmemchr (s, c)                                \
-                 : __builtin_strchr (s, c)))
+# define strchr(s, c) __builtin_strchr (s, c)
 #endif