x86-64: Update strlen.S to support wcslen/wcsnlen
Commit Message
On Mon, Jun 5, 2017 at 10:37 PM, Markus Trippelsdorf
<markus@trippelsdorf.de> wrote:
> On 2017.06.05 at 07:24 -0700, H.J. Lu wrote:
>> On Tue, May 30, 2017 at 3:14 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> > On Sat, May 20, 2017 at 6:56 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> >> The difference between strlen and wcslen is byte vs int. We can
>> >> replace pminub and pcmpeqb with pminud and pcmpeqd to turn strlen
>> >> into wcslen. Tested on Ivy Bridge with benchtests/bench-wcslen.c,
>> >> the new strlen based wcslen is as fast as the old wcslen.
>> >>
>> >> OK for master?
>> >>
>> >> H.J.
>> >> ---
>> >> * sysdeps/x86_64/strlen.S (PMINU): New.
>> >> (PCMPEQ): Likewise.
>> >> (SHIFT_RETURN): Likewise.
>> >> (FIND_ZERO): Replace pcmpeqb with PCMPEQ.
>> >> (strlen): Add SHIFT_RETURN before ret. Replace pcmpeqb and
>> >> pminub with PCMPEQ and PMINU.
>> >> * sysdeps/x86_64/wcslen.S: Define AS_WCSLEN and strlen.
>> >> Include "strlen.S".
>> >> * sysdeps/x86_64/wcsnlen.S: New file.
>> >
>> > Here is the updated patch only with SSE2 wcsnlen.S. Any
>> > comments?
>> >
>>
>> I will check it in today.
>
> It doesn't work on old machines without SSE4.1:
>
> FAIL: stdio-common/tstdiomisc
> FAIL: wcsmbs/test-wcpncpy
> FAIL: wcsmbs/test-wcsncmp
> FAIL: wcsmbs/test-wcsncpy
> FAIL: wcsmbs/test-wcsnlen
> FAIL: wcsmbs/wcsatcliff
>
> markus@x4 tmp % gdb --args /var/tmp/glibc-build/elf/ld-linux-x86-64.so.2 --library-path /var/tmp/glibc-build:/var/tmp/glibc-build/math:/var/tmp/glibc-build/elf:/var/tmp/gl[31/150]d/dlfcn:/var/tmp/glibc-build/nss:/var/tmp/glibc-build/nis:/var/tmp/glibc-build/rt:/var/tmp/glibc-build/resolv:/var/tmp/glibc-build/crypt:/var/tmp/glibc-build/mathvec:/var/tmp/glib
> c-build/support:/var/tmp/glibc-build/nptl /var/tmp/glibc-build/stdio-common/tstdiomisc
> Reading symbols from /var/tmp/glibc-build/elf/ld-linux-x86-64.so.2...done.
> (gdb) run
> Starting program: /home/markus/tmp/glibc-build/elf/ld-linux-x86-64.so.2 --library-path /var/tmp/glibc-build:/var/tmp/glibc-build/math:/var/tmp/glibc-build/elf:/var/tmp/glibc-build/dlfcn:/var/tmp/glibc-build/nss:/var/tmp/glibc-build/nis:/var/tmp/glibc-build/rt:/var/tmp/glibc-build/resolv:/var/tmp/glibc-build/crypt:/var/tmp/glibc-build/mathvec:/var/tmp/glibc-build/support:/var/tmp/glibc-build/nptl /var/tmp/glibc-build/stdio-common/tstdiomisc
> t1: count=5
> sscanf ("12345", "%ld", &x) => 1, x = 12345
> sscanf ("12345", "%llllld", &x) => 0, x = -1
> sscanf ("12345", "%LLLLLd", &x) => 0, x = -1
> sscanf ("test ", "%*s%n", &x) => 0, x = 4
> sscanf ("test ", "%2*s%n", &x) => 0, x = -1
> sscanf ("12 ", "%l2d", &x) => 0, x = -1
> sscanf ("12 ", "%2ld", &x) => 1, x = 12
> sscanf ("1 1", "%d %Z", &n, &N) => 1, n = 1, N = -1
> expected "nan NAN nan NAN nan NAN nan NAN", got "nan NAN nan NAN nan NAN nan NAN"
> expected "-nan -NAN -nan -NAN -nan -NAN -nan -NAN", got "-nan -NAN -nan -NAN -nan -NAN -nan -NAN"
> expected "nan NAN nan NAN nan NAN nan NAN", got "nan NAN nan NAN nan NAN nan NAN"
> expected "-nan -NAN -nan -NAN -nan -NAN -nan -NAN", got "-nan -NAN -nan -NAN -nan -NAN -nan -NAN" expected "inf INF inf INF inf INF inf INF", got "inf INF inf INF inf INF inf INF"
> expected "-inf -INF -inf -INF -inf -INF -inf -INF", got "-inf -INF -inf -INF -inf -INF -inf -INF"
> Program received signal SIGILL, Illegal instruction.
> wcsnlen () at ../sysdeps/x86_64/strlen.S:180
> 180 PMINU 16(%rax), %xmm0
>
Please try this. Sorry for the breakage.
Comments
On 2017.06.06 at 05:37 -0700, H.J. Lu wrote:
> On Mon, Jun 5, 2017 at 10:37 PM, Markus Trippelsdorf
> <markus@trippelsdorf.de> wrote:
> >
> > It doesn't work on old machines without SSE4.1:
> >
> > FAIL: stdio-common/tstdiomisc
> > FAIL: wcsmbs/test-wcpncpy
> > FAIL: wcsmbs/test-wcsncmp
> > FAIL: wcsmbs/test-wcsncpy
> > FAIL: wcsmbs/test-wcsnlen
> > FAIL: wcsmbs/wcsatcliff
>
> Please try this. Sorry for the breakage.
It works fine now. Thanks for the quick patch.
On Tue, Jun 6, 2017 at 5:59 AM, Markus Trippelsdorf
<markus@trippelsdorf.de> wrote:
> On 2017.06.06 at 05:37 -0700, H.J. Lu wrote:
>> On Mon, Jun 5, 2017 at 10:37 PM, Markus Trippelsdorf
>> <markus@trippelsdorf.de> wrote:
>> >
>> > It doesn't work on old machines without SSE4.1:
>> >
>> > FAIL: stdio-common/tstdiomisc
>> > FAIL: wcsmbs/test-wcpncpy
>> > FAIL: wcsmbs/test-wcsncmp
>> > FAIL: wcsmbs/test-wcsncpy
>> > FAIL: wcsmbs/test-wcsnlen
>> > FAIL: wcsmbs/wcsatcliff
>>
>> Please try this. Sorry for the breakage.
>
> It works fine now. Thanks for the quick patch.
>
I checked it in.
From b5b8fce19221891afba4907e9dd7e05b9e797f53 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Tue, 6 Jun 2017 05:31:48 -0700
Subject: [PATCH] x86-64: Move wcsnlen.S to multiarch/wcsnlen-sse4_1.S
Since wcsnlen.S uses pminud which is the part of SSE4.1, move wcsnlen.S
to multiarch/wcsnlen-sse4_1.S.
* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
wcsnlen-sse4_1 and wcsnlen-c.
* sysdeps/x86_64/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Test __wcsnlen_sse4_1 and
__wcsnlen_sse2.
* sysdeps/x86_64/multiarch/ifunc-sse4_1.h: New file.
* sysdeps/x86_64/multiarch/wcsnlen-c.c: Likewise.
* sysdeps/x86_64/multiarch/wcsnlen-sse4_1.S: Likewise.
* sysdeps/x86_64/multiarch/wcsnlen.c: Likewise.
* sysdeps/x86_64/wcsnlen.S: Removed.
---
sysdeps/x86_64/multiarch/Makefile | 3 ++-
sysdeps/x86_64/multiarch/ifunc-impl-list.c | 7 ++++++
sysdeps/x86_64/multiarch/ifunc-sse4_1.h | 34 ++++++++++++++++++++++++++++++
sysdeps/x86_64/multiarch/wcsnlen-c.c | 9 ++++++++
sysdeps/x86_64/multiarch/wcsnlen-sse4_1.S | 5 +++++
sysdeps/x86_64/multiarch/wcsnlen.c | 31 +++++++++++++++++++++++++++
sysdeps/x86_64/wcsnlen.S | 7 ------
7 files changed, 88 insertions(+), 8 deletions(-)
create mode 100644 sysdeps/x86_64/multiarch/ifunc-sse4_1.h
create mode 100644 sysdeps/x86_64/multiarch/wcsnlen-c.c
create mode 100644 sysdeps/x86_64/multiarch/wcsnlen-sse4_1.S
create mode 100644 sysdeps/x86_64/multiarch/wcsnlen.c
delete mode 100644 sysdeps/x86_64/wcsnlen.S
@@ -33,7 +33,8 @@ endif
ifeq ($(subdir),wcsmbs)
sysdep_routines += wmemcmp-sse4 wmemcmp-ssse3 wmemcmp-c \
wmemcmp-avx2-movbe \
- wcscpy-ssse3 wcscpy-c
+ wcscpy-ssse3 wcscpy-c \
+ wcsnlen-sse4_1 wcsnlen-c
endif
ifeq ($(subdir),debug)
@@ -296,6 +296,13 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
__wcscpy_ssse3)
IFUNC_IMPL_ADD (array, i, wcscpy, 1, __wcscpy_sse2))
+ /* Support sysdeps/x86_64/multiarch/wcsnlen.c. */
+ IFUNC_IMPL (i, name, wcsnlen,
+ IFUNC_IMPL_ADD (array, i, wcsnlen,
+ HAS_CPU_FEATURE (SSE4_1),
+ __wcsnlen_sse4_1)
+ IFUNC_IMPL_ADD (array, i, wcsnlen, 1, __wcsnlen_sse2))
+
/* Support sysdeps/x86_64/multiarch/wmemcmp.S. */
IFUNC_IMPL (i, name, wmemcmp,
IFUNC_IMPL_ADD (array, i, wmemcmp,
new file mode 100644
@@ -0,0 +1,34 @@
+/* Common definition for ifunc selections optimized with SSE2 and SSE4.1.
+ All versions must be listed in ifunc-impl-list.c.
+ Copyright (C) 2017 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <http://www.gnu.org/licenses/>. */
+
+#include <init-arch.h>
+
+extern __typeof (REDIRECT_NAME) OPTIMIZE (sse2) attribute_hidden;
+extern __typeof (REDIRECT_NAME) OPTIMIZE (sse4_1) attribute_hidden;
+
+static inline void *
+IFUNC_SELECTOR (void)
+{
+ const struct cpu_features* cpu_features = __get_cpu_features ();
+
+ if (CPU_FEATURES_CPU_P (cpu_features, SSE4_1))
+ return OPTIMIZE (sse4_1);
+
+ return OPTIMIZE (sse2);
+}
new file mode 100644
@@ -0,0 +1,9 @@
+#if IS_IN (libc)
+# include <wchar.h>
+
+# define WCSNLEN __wcsnlen_sse2
+
+extern __typeof (wcsnlen) __wcsnlen_sse2;
+#endif
+
+#include "wcsmbs/wcsnlen.c"
new file mode 100644
@@ -0,0 +1,5 @@
+#define AS_WCSLEN
+#define AS_STRNLEN
+#define strlen __wcsnlen_sse4_1
+
+#include "../strlen.S"
new file mode 100644
@@ -0,0 +1,31 @@
+/* Multiple versions of wcsnlen.
+ All versions must be listed in ifunc-impl-list.c.
+ Copyright (C) 2017 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <http://www.gnu.org/licenses/>. */
+
+/* Define multiple versions only for the definition in libc. */
+#if IS_IN (libc)
+# define __wcsnlen __redirect_wcsnlen
+# include <wchar.h>
+# undef __wcsnlen
+
+# define SYMBOL_NAME wcsnlen
+# include "ifunc-sse4_1.h"
+
+libc_ifunc_redirected (__redirect_wcsnlen, __wcsnlen, IFUNC_SELECTOR ());
+weak_alias (__wcsnlen, wcsnlen);
+#endif
deleted file mode 100644
@@ -1,7 +0,0 @@
-#define AS_WCSLEN
-#define AS_STRNLEN
-#define strlen __wcsnlen
-
-#include "strlen.S"
-
-weak_alias(__wcsnlen, wcsnlen)
--
2.9.4