[1/4] Improve generic strcspn performance
Commit Message
From: Wilco Dijkstra <wdijkstr@arm.com>
Improve strcspn performance using a much faster algorithm. It is kept simple
so it works well on most targets. It is generally at least 10 times faster
than the existing implementation on bench-strcspn on a few AArch64
implementations, and for some tests 100 times as fast (repeatedly calling
strchr on a small string is extremely slow...).
In fact the string/bits/string2.h inlines make no longer sense, as GCC
already uses strlen if reject is an empty string, strchrnul is 5 times as
fast as __strcspn_c1, while __strcspn_c2 and __strcspn_c3 are slower than
the strcspn main loop for large strings (though reject length 2-4 could be
special cased in the future to gain even more performance).
Tested on x86_64, i686, and aarch64.
* string/strcspn.c (strcspn): Rewrite function.
* string/bits/string2.h (strcspn): Use __builtin_strcspn.
---
ChangeLog | 6 ++++++
string/bits/string2.h | 41 ++++++-----------------------------------
string/strcspn.c | 44 ++++++++++++++++++++++++++++++++++++--------
3 files changed, 48 insertions(+), 43 deletions(-)
Comments
Adhemerval Zanella wrote:
> + if (accept[0] == '\0')
> + return 0;
> + if (accept[1] == '\0')
> + {
GCC doesn't get the static branch prediction correct for the 2nd if,
so it would be useful to add __glibc_unlikely given that single-character
accepts are rare.
> + s = (unsigned char *) ((size_t)(s) & ~3);
> + unsigned int c0, c1, c2, c3;
> + do {
> + s += 4;
> + c0 = p[s[0]];
> + c1 = p[s[1]];
> + c2 = p[s[2]];
> + c3 = p[s[3]];
> + } while ((c0 && c1 && c2 && c3) == 1);
That should use '&' rather than '&&' and '!= 0' similar to how I did it in strcspn.
This will use 3 AND(S) instructions and a single branch.
> +
> + size_t count = s - (unsigned char *) str;
> + return (c0 && c1) == 0 ? count - !c0 + 1 : count - !c2 + 3;
Again, c0 & c1 is better and allows CSE with the while expression above.
Also -!c0 +1 is equivalent to c0, -!c2 + 3 is equivalent to c2 + 2 - this is simpler
and faster.
Otherwise it looks good, and thanks for doing this one too!
Cheers,
Wilco
On 03/28/2016 08:19 AM, Adhemerval Zanella wrote:
> + /* The inline functions are not used from GLIBC 2.24 and forward, however
> + they are required to provide the symbols through string-inlines.c
> + (if inlining is not possible for compatibility reasons). */
> __STRING_INLINE size_t __strcspn_c1 (const char *__s, int __reject);
> __STRING_INLINE size_t
> __strcspn_c1 (const char *__s, int __reject)
They could, however, be moved out of the installed header file and be given
compatibility symbol versions.
r~
Richard Henderson wrote:
> On 03/28/2016 08:19 AM, Adhemerval Zanella wrote:
> > + /* The inline functions are not used from GLIBC 2.24 and forward, however
> > + they are required to provide the symbols through string-inlines.c
> > + (if inlining is not possible for compatibility reasons). */
> > __STRING_INLINE size_t __strcspn_c1 (const char *__s, int __reject);
> > __STRING_INLINE size_t
> > __strcspn_c1 (const char *__s, int __reject)
>
> They could, however, be moved out of the installed header file and be given
> compatibility symbol versions.
I have several patches out for review that move all the inlines from string2.h
to string/string-inlines.c for backwards compatibility and discontinue the complex
inlining of some code from string2.h so one no longer needs to worry about
accidental ABI changes.
See http://patchwork.sourceware.org/patch/10936/ for the strcspn move (this builds on
http://patchwork.sourceware.org/patch/10933/ and http://patchwork.sourceware.org/patch/10934/).
What do you mean with "compatibility symbol versions"?
Wilco
On 30-03-2016 15:00, Wilco Dijkstra wrote:
>
> Richard Henderson wrote:
>> On 03/28/2016 08:19 AM, Adhemerval Zanella wrote:
>>> + /* The inline functions are not used from GLIBC 2.24 and forward, however
>>> + they are required to provide the symbols through string-inlines.c
>>> + (if inlining is not possible for compatibility reasons). */
>>> __STRING_INLINE size_t __strcspn_c1 (const char *__s, int __reject);
>>> __STRING_INLINE size_t
>>> __strcspn_c1 (const char *__s, int __reject)
>>
>> They could, however, be moved out of the installed header file and be given
>> compatibility symbol versions.
>
> I have several patches out for review that move all the inlines from string2.h
> to string/string-inlines.c for backwards compatibility and discontinue the complex
> inlining of some code from string2.h so one no longer needs to worry about
> accidental ABI changes.
>
> See http://patchwork.sourceware.org/patch/10936/ for the strcspn move (this builds on
> http://patchwork.sourceware.org/patch/10933/ and http://patchwork.sourceware.org/patch/10934/).
>
> What do you mean with "compatibility symbol versions"?
>
> Wilco
>
My understanding is since GLIBC will not emit the __str{spn,cspn,brk}
anymore the symbols can now be build as compatibility mode only.
I will change it for this patch series and check out the ones you
noted.
On 03/30/2016 11:00 AM, Wilco Dijkstra wrote:
>
> Richard Henderson wrote:
>> On 03/28/2016 08:19 AM, Adhemerval Zanella wrote:
>>> + /* The inline functions are not used from GLIBC 2.24 and forward, however
>>> + they are required to provide the symbols through string-inlines.c
>>> + (if inlining is not possible for compatibility reasons). */
>>> __STRING_INLINE size_t __strcspn_c1 (const char *__s, int __reject);
>>> __STRING_INLINE size_t
>>> __strcspn_c1 (const char *__s, int __reject)
>>
>> They could, however, be moved out of the installed header file and be given
>> compatibility symbol versions.
>
> I have several patches out for review that move all the inlines from string2.h
> to string/string-inlines.c for backwards compatibility and discontinue the complex
> inlining of some code from string2.h so one no longer needs to worry about
> accidental ABI changes.
>
> See http://patchwork.sourceware.org/patch/10936/ for the strcspn move (this builds on
> http://patchwork.sourceware.org/patch/10933/ and http://patchwork.sourceware.org/patch/10934/).
>
> What do you mean with "compatibility symbol versions"?
It means that the symbols are not available with a default version, and so new
programs cannot link against the symbol. However, old programs that are
already linked will continue to use the old symbol version.
See e.g. nptl/pt-fork.c, wherein "fork" and "__fork" were removed from the
default symbols for libpthread, and so now are exported only as compatibility
symbols.
r~
On 03/28/2016 08:19 AM, Adhemerval Zanella wrote:
> + s = (unsigned char *) ((size_t)s & ~3);
Nit: s/size_t/uintptr_t/.
It's the same type for all supported targets,
but the spelling says what you mean.
r~
> On 03/28/2016 08:19 AM, Adhemerval Zanella wrote:
> > + s = (unsigned char *) ((size_t)s & ~3);
>
> Nit: s/size_t/uintptr_t/.
>
> It's the same type for all supported targets,
> but the spelling says what you mean.
libc-internal.h has PTR_ALIGN_DOWN for this.
@@ -905,43 +905,14 @@ __stpcpy_small (char *__dest,
/* Return the length of the initial segment of S which
consists entirely of characters not in REJECT. */
-#if !defined _HAVE_STRING_ARCH_strcspn || defined _FORCE_INLINES
-# ifndef _HAVE_STRING_ARCH_strcspn
-# if __GNUC_PREREQ (3, 2)
-# define strcspn(s, reject) \
- __extension__ \
- ({ char __r0, __r1, __r2; \
- (__builtin_constant_p (reject) && __string2_1bptr_p (reject) \
- ? ((__builtin_constant_p (s) && __string2_1bptr_p (s)) \
- ? __builtin_strcspn (s, reject) \
- : ((__r0 = ((const char *) (reject))[0], __r0 == '\0') \
- ? strlen (s) \
- : ((__r1 = ((const char *) (reject))[1], __r1 == '\0') \
- ? __strcspn_c1 (s, __r0) \
- : ((__r2 = ((const char *) (reject))[2], __r2 == '\0') \
- ? __strcspn_c2 (s, __r0, __r1) \
- : (((const char *) (reject))[3] == '\0' \
- ? __strcspn_c3 (s, __r0, __r1, __r2) \
- : __builtin_strcspn (s, reject)))))) \
- : __builtin_strcspn (s, reject)); })
-# else
-# define strcspn(s, reject) \
- __extension__ \
- ({ char __r0, __r1, __r2; \
- (__builtin_constant_p (reject) && __string2_1bptr_p (reject) \
- ? ((__r0 = ((const char *) (reject))[0], __r0 == '\0') \
- ? strlen (s) \
- : ((__r1 = ((const char *) (reject))[1], __r1 == '\0') \
- ? __strcspn_c1 (s, __r0) \
- : ((__r2 = ((const char *) (reject))[2], __r2 == '\0') \
- ? __strcspn_c2 (s, __r0, __r1) \
- : (((const char *) (reject))[3] == '\0' \
- ? __strcspn_c3 (s, __r0, __r1, __r2) \
- : strcspn (s, reject))))) \
- : strcspn (s, reject)); })
-# endif
+#ifndef _HAVE_STRING_ARCH_strcspn
+# if __GNUC_PREREQ (3, 2)
+# define strcspn(s, reject) __builtin_strcspn (s, reject)
# endif
+ /* The inline functions are not used from GLIBC 2.24 and forward, however
+ they are required to provide the symbols through string-inlines.c
+ (if inlining is not possible for compatibility reasons). */
__STRING_INLINE size_t __strcspn_c1 (const char *__s, int __reject);
__STRING_INLINE size_t
__strcspn_c1 (const char *__s, int __reject)
@@ -26,16 +26,44 @@
/* Return the length of the maximum initial segment of S
which contains no characters from REJECT. */
size_t
-STRCSPN (const char *s, const char *reject)
+STRCSPN (const char *str, const char *reject)
{
- size_t count = 0;
+ if (reject[0] == '\0' || reject[1] == '\0')
+ return __strchrnul (str, reject [0]) - str;
- while (*s != '\0')
- if (strchr (reject, *s++) == NULL)
- ++count;
- else
- return count;
+ /* Use multiple small memsets to enable inlining on most targets. */
+ unsigned char table[256];
+ unsigned char *p = memset (table, 0, 64);
+ memset (p + 64, 0, 64);
+ memset (p + 128, 0, 64);
+ memset (p + 192, 0, 64);
- return count;
+ unsigned char *s = (unsigned char*) reject;
+ unsigned char tmp;
+ do
+ p[tmp = *s++] = 1;
+ while (tmp);
+
+ s = (unsigned char*) str;
+ if (p[s[0]]) return 0;
+ if (p[s[1]]) return 1;
+ if (p[s[2]]) return 2;
+ if (p[s[3]]) return 3;
+
+ s = (unsigned char *) ((size_t)s & ~3);
+
+ unsigned int c0, c1, c2, c3;
+ do
+ {
+ s += 4;
+ c0 = p[s[0]];
+ c1 = p[s[1]];
+ c2 = p[s[2]];
+ c3 = p[s[3]];
+ }
+ while ((c0 | c1 | c2 | c3) == 0);
+
+ size_t count = s - (unsigned char *) str;
+ return (c0 | c1) != 0 ? count - c0 + 1 : count - c2 + 3;
}
libc_hidden_builtin_def (strcspn)