From patchwork Tue Jun 6 12:37:32 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "H.J. Lu"
X-Patchwork-Id: 20810
Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
Sender: libc-alpha-owner@sourceware.org
Delivered-To: mailing list libc-alpha@sourceware.org
In-Reply-To: <20170606053710.GA283@x4>
References: <20170520135619.GA17481@gmail.com> <20170606053710.GA283@x4>
From: "H.J. Lu"
Date: Tue, 6 Jun 2017 05:37:32 -0700
Subject: Re: [PATCH] x86-64: Update strlen.S to support wcslen/wcsnlen
To: Markus Trippelsdorf
Cc: GNU C Library

On Mon, Jun 5, 2017 at 10:37 PM, Markus Trippelsdorf wrote:
> On 2017.06.05 at 07:24 -0700, H.J. Lu wrote:
>> On Tue, May 30, 2017 at 3:14 PM, H.J. Lu wrote:
>> > On Sat, May 20, 2017 at 6:56 AM, H.J. Lu wrote:
>> >> The difference between strlen and wcslen is byte vs int. We can
>> >> replace pminub and pcmpeqb with pminud and pcmpeqd to turn strlen
>> >> into wcslen. Tested on Ivy Bridge with benchtests/bench-wcslen.c,
>> >> the new strlen based wcslen is as fast as the old wcslen.
>> >>
>> >> OK for master?
>> >>
>> >> H.J.
>> >> ---
>> >>     * sysdeps/x86_64/strlen.S (PMINU): New.
>> >>     (PCMPEQ): Likewise.
>> >>     (SHIFT_RETURN): Likewise.
>> >>     (FIND_ZERO): Replace pcmpeqb with PCMPEQ.
>> >>     (strlen): Add SHIFT_RETURN before ret. Replace pcmpeqb and
>> >>     pminub with PCMPEQ and PMINU.
>> >>     * sysdeps/x86_64/wcslen.S: Define AS_WCSLEN and strlen.
>> >>     Include "strlen.S".
>> >>     * sysdeps/x86_64/wcsnlen.S: New file.
>> >
>> > Here is the updated patch only with SSE2 wcsnlen.S. Any
>> > comments?
>> >
>>
>> I will check it in today.
>
> It doesn't work on old machines without SSE4.1:
>
> FAIL: stdio-common/tstdiomisc
> FAIL: wcsmbs/test-wcpncpy
> FAIL: wcsmbs/test-wcsncmp
> FAIL: wcsmbs/test-wcsncpy
> FAIL: wcsmbs/test-wcsnlen
> FAIL: wcsmbs/wcsatcliff
>
> markus@x4 tmp % gdb --args /var/tmp/glibc-build/elf/ld-linux-x86-64.so.2 --library-path /var/tmp/glibc-build:/var/tmp/glibc-build/math:/var/tmp/glibc-build/elf:/var/tmp/glibc-build/dlfcn:/var/tmp/glibc-build/nss:/var/tmp/glibc-build/nis:/var/tmp/glibc-build/rt:/var/tmp/glibc-build/resolv:/var/tmp/glibc-build/crypt:/var/tmp/glibc-build/mathvec:/var/tmp/glibc-build/support:/var/tmp/glibc-build/nptl /var/tmp/glibc-build/stdio-common/tstdiomisc
> Reading symbols from /var/tmp/glibc-build/elf/ld-linux-x86-64.so.2...done.
> (gdb) run
> Starting program: /home/markus/tmp/glibc-build/elf/ld-linux-x86-64.so.2 --library-path /var/tmp/glibc-build:/var/tmp/glibc-build/math:/var/tmp/glibc-build/elf:/var/tmp/glibc-build/dlfcn:/var/tmp/glibc-build/nss:/var/tmp/glibc-build/nis:/var/tmp/glibc-build/rt:/var/tmp/glibc-build/resolv:/var/tmp/glibc-build/crypt:/var/tmp/glibc-build/mathvec:/var/tmp/glibc-build/support:/var/tmp/glibc-build/nptl /var/tmp/glibc-build/stdio-common/tstdiomisc
> t1: count=5
> sscanf ("12345", "%ld", &x) => 1, x = 12345
> sscanf ("12345", "%llllld", &x) => 0, x = -1
> sscanf ("12345", "%LLLLLd", &x) => 0, x = -1
> sscanf ("test ", "%*s%n", &x) => 0, x = 4
> sscanf ("test ", "%2*s%n", &x) => 0, x = -1
> sscanf ("12 ", "%l2d", &x) => 0, x = -1
> sscanf ("12 ", "%2ld", &x) => 1, x = 12
> sscanf ("1 1", "%d %Z", &n, &N) => 1, n = 1, N = -1
> expected "nan NAN nan NAN nan NAN nan NAN", got "nan NAN nan NAN nan NAN nan NAN"
> expected "-nan -NAN -nan -NAN -nan -NAN -nan -NAN", got "-nan -NAN -nan -NAN -nan -NAN -nan -NAN"
> expected "nan NAN nan NAN nan NAN nan NAN", got "nan NAN nan NAN nan NAN nan NAN"
> expected "-nan -NAN -nan -NAN -nan -NAN -nan -NAN", got "-nan -NAN -nan -NAN -nan -NAN -nan -NAN"
> expected "inf INF inf INF inf INF inf INF", got "inf INF inf INF inf INF inf INF"
> expected "-inf -INF -inf -INF -inf -INF -inf -INF", got "-inf -INF -inf -INF -inf -INF -inf -INF"
> Program received signal SIGILL, Illegal instruction.
> wcsnlen () at ../sysdeps/x86_64/strlen.S:180
> 180             PMINU   16(%rax), %xmm0
>

Please try this. Sorry for the breakage.

From b5b8fce19221891afba4907e9dd7e05b9e797f53 Mon Sep 17 00:00:00 2001
From: "H.J. Lu"
Date: Tue, 6 Jun 2017 05:31:48 -0700
Subject: [PATCH] x86-64: Move wcsnlen.S to multiarch/wcsnlen-sse4_1.S

Since wcsnlen.S uses pminud, which is part of SSE4.1, move wcsnlen.S
to multiarch/wcsnlen-sse4_1.S.

    * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
    wcsnlen-sse4_1 and wcsnlen-c.
    * sysdeps/x86_64/multiarch/ifunc-impl-list.c
    (__libc_ifunc_impl_list): Test __wcsnlen_sse4_1 and
    __wcsnlen_sse2.
    * sysdeps/x86_64/multiarch/ifunc-sse4_1.h: New file.
    * sysdeps/x86_64/multiarch/wcsnlen-c.c: Likewise.
    * sysdeps/x86_64/multiarch/wcsnlen-sse4_1.S: Likewise.
    * sysdeps/x86_64/multiarch/wcsnlen.c: Likewise.
    * sysdeps/x86_64/wcsnlen.S: Removed.
---
 sysdeps/x86_64/multiarch/Makefile          |  3 ++-
 sysdeps/x86_64/multiarch/ifunc-impl-list.c |  7 ++++++
 sysdeps/x86_64/multiarch/ifunc-sse4_1.h    | 34 ++++++++++++++++++++++++++++++
 sysdeps/x86_64/multiarch/wcsnlen-c.c       |  9 ++++++++
 sysdeps/x86_64/multiarch/wcsnlen-sse4_1.S  |  5 +++++
 sysdeps/x86_64/multiarch/wcsnlen.c         | 31 +++++++++++++++++++++++++++
 sysdeps/x86_64/wcsnlen.S                   |  7 ------
 7 files changed, 88 insertions(+), 8 deletions(-)
 create mode 100644 sysdeps/x86_64/multiarch/ifunc-sse4_1.h
 create mode 100644 sysdeps/x86_64/multiarch/wcsnlen-c.c
 create mode 100644 sysdeps/x86_64/multiarch/wcsnlen-sse4_1.S
 create mode 100644 sysdeps/x86_64/multiarch/wcsnlen.c
 delete mode 100644 sysdeps/x86_64/wcsnlen.S

diff --git a/sysdeps/x86_64/multiarch/Makefile b/sysdeps/x86_64/multiarch/Makefile
index b040288..310a3a4 100644
--- a/sysdeps/x86_64/multiarch/Makefile
+++ b/sysdeps/x86_64/multiarch/Makefile
@@ -33,7 +33,8 @@ endif
 ifeq ($(subdir),wcsmbs)
 sysdep_routines += wmemcmp-sse4 wmemcmp-ssse3 wmemcmp-c \
                    wmemcmp-avx2-movbe \
-                   wcscpy-ssse3 wcscpy-c
+                   wcscpy-ssse3 wcscpy-c \
+                   wcsnlen-sse4_1 wcsnlen-c
 endif
 
 ifeq ($(subdir),debug)
diff --git a/sysdeps/x86_64/multiarch/ifunc-impl-list.c b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
index b61bc9f..ee4243a 100644
--- a/sysdeps/x86_64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
@@ -296,6 +296,13 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
                               __wcscpy_ssse3)
               IFUNC_IMPL_ADD (array, i, wcscpy, 1, __wcscpy_sse2))
 
+  /* Support sysdeps/x86_64/multiarch/wcsnlen.c. */
+  IFUNC_IMPL (i, name, wcsnlen,
+              IFUNC_IMPL_ADD (array, i, wcsnlen,
+                              HAS_CPU_FEATURE (SSE4_1),
+                              __wcsnlen_sse4_1)
+              IFUNC_IMPL_ADD (array, i, wcsnlen, 1, __wcsnlen_sse2))
+
   /* Support sysdeps/x86_64/multiarch/wmemcmp.S. */
   IFUNC_IMPL (i, name, wmemcmp,
               IFUNC_IMPL_ADD (array, i, wmemcmp,
diff --git a/sysdeps/x86_64/multiarch/ifunc-sse4_1.h b/sysdeps/x86_64/multiarch/ifunc-sse4_1.h
new file mode 100644
index 0000000..2b89231
--- /dev/null
+++ b/sysdeps/x86_64/multiarch/ifunc-sse4_1.h
@@ -0,0 +1,34 @@
+/* Common definition for ifunc selections optimized with SSE2 and SSE4.1.
+   All versions must be listed in ifunc-impl-list.c.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#include
+
+extern __typeof (REDIRECT_NAME) OPTIMIZE (sse2) attribute_hidden;
+extern __typeof (REDIRECT_NAME) OPTIMIZE (sse4_1) attribute_hidden;
+
+static inline void *
+IFUNC_SELECTOR (void)
+{
+  const struct cpu_features* cpu_features = __get_cpu_features ();
+
+  if (CPU_FEATURES_CPU_P (cpu_features, SSE4_1))
+    return OPTIMIZE (sse4_1);
+
+  return OPTIMIZE (sse2);
+}
diff --git a/sysdeps/x86_64/multiarch/wcsnlen-c.c b/sysdeps/x86_64/multiarch/wcsnlen-c.c
new file mode 100644
index 0000000..e1ec7cf
--- /dev/null
+++ b/sysdeps/x86_64/multiarch/wcsnlen-c.c
@@ -0,0 +1,9 @@
+#if IS_IN (libc)
+# include
+
+# define WCSNLEN __wcsnlen_sse2
+
+extern __typeof (wcsnlen) __wcsnlen_sse2;
+#endif
+
+#include "wcsmbs/wcsnlen.c"
diff --git a/sysdeps/x86_64/multiarch/wcsnlen-sse4_1.S b/sysdeps/x86_64/multiarch/wcsnlen-sse4_1.S
new file mode 100644
index 0000000..a8cab0c
--- /dev/null
+++ b/sysdeps/x86_64/multiarch/wcsnlen-sse4_1.S
@@ -0,0 +1,5 @@
+#define AS_WCSLEN
+#define AS_STRNLEN
+#define strlen __wcsnlen_sse4_1
+
+#include "../strlen.S"
diff --git a/sysdeps/x86_64/multiarch/wcsnlen.c b/sysdeps/x86_64/multiarch/wcsnlen.c
new file mode 100644
index 0000000..5f74d2c
--- /dev/null
+++ b/sysdeps/x86_64/multiarch/wcsnlen.c
@@ -0,0 +1,31 @@
+/* Multiple versions of wcsnlen.
+   All versions must be listed in ifunc-impl-list.c.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+/* Define multiple versions only for the definition in libc. */
+#if IS_IN (libc)
+# define __wcsnlen __redirect_wcsnlen
+# include
+# undef __wcsnlen
+
+# define SYMBOL_NAME wcsnlen
+# include "ifunc-sse4_1.h"
+
+libc_ifunc_redirected (__redirect_wcsnlen, __wcsnlen, IFUNC_SELECTOR ());
+weak_alias (__wcsnlen, wcsnlen);
+#endif
diff --git a/sysdeps/x86_64/wcsnlen.S b/sysdeps/x86_64/wcsnlen.S
deleted file mode 100644
index 968bb69..0000000
--- a/sysdeps/x86_64/wcsnlen.S
+++ /dev/null
@@ -1,7 +0,0 @@
-#define AS_WCSLEN
-#define AS_STRNLEN
-#define strlen __wcsnlen
-
-#include "strlen.S"
-
-weak_alias(__wcsnlen, wcsnlen)
-- 
2.9.4