From patchwork Thu Mar 12 15:30:27 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Joseph Yoo
X-Patchwork-Id: 38533
From: Joseph Yoo
Subject: [PATCH] Optimize generic strtok(_r) function.
To: libc-alpha@sourceware.org
Message-ID: <9d4a1ba5-544c-f5d9-261a-4ead8da3f6eb@brown.edu>
Date: Thu, 12 Mar 2020 11:30:27 -0400

The current generic strtok_r implementation calls strspn to find the
beginning of the token and strcspn to find the end.  Each of those calls
builds its own 256-byte look-up table to compare the chars of S against
DELIM, so the same table is set up twice per token.  This patch manually
inlines most of both calls so that the table is built once, and adds
further optimizations (commented in the code).

strtok(_r) still benefits from the existing str(c)spn vector
implementations, so any sub-arch with a __str(c)spn_(vx/sse42/etc) should
also have a __strtok_r_(vx/sse42/etc) that calls them explicitly.  I've
added these for the affected targets: x86_64, powerpc64/power8, s390(x),
and i386/i686.

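
As a rough illustration of the idea described above (one look-up table shared by the "find start" and "find end" scans), here is a minimal sketch.  It is not the code from the patch: sketch_strtok_r is a hypothetical name, and the sketch leaves out the 4-byte unrolling and the word-at-a-time scan of DELIM that the patch adds.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical illustration only; not the function from the patch.  */
char *
sketch_strtok_r (char *s, const char *delim, char **save_ptr)
{
  unsigned char table[256];
  const unsigned char *d;
  unsigned char *u;
  char *token;

  /* Build the delimiter membership table once per call.  */
  memset (table, 0, sizeof table);
  for (d = (const unsigned char *) delim; *d != '\0'; ++d)
    table[*d] = 1;

  u = (unsigned char *) (s != NULL ? s : *save_ptr);

  /* Skip leading delimiters (the strspn step).  table[0] is still 0
     here, so the loop stops at the terminating NUL.  */
  while (table[*u])
    ++u;

  if (*u == '\0')
    {
      *save_ptr = (char *) u;
      return NULL;
    }

  token = (char *) u;

  /* Mark NUL as a delimiter so the next scan (the strcspn step) also
     stops at the end of the string, reusing the same table.  */
  table[0] = 1;
  while (!table[*u])
    ++u;

  /* Terminate the token and step past the delimiter, if there is one.  */
  if (*u != '\0')
    *u++ = '\0';
  *save_ptr = (char *) u;
  return token;
}
```

Calling it on "--abc-=def=" with DELIM "-=" should yield "abc", then "def", then NULL, the same sequence strtok_r gives.
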
I've tested on my own x86_64 machine, but I'd like help with the others
(there seem to be different conventions for how versioning is done, and
I'm not familiar with any of them).

I CC'ed Adhemerval Zanella because I was working from their branch,
generic-strings, and I have only just noticed that this patch goes against
their goal of promoting portability; it seems it would be preferred to
keep the str(c)spn calls in this function.  Another solution would be to
'externally' inline the two calls (strspn + strcspn), which allows an
optimizing compiler to avoid allocating the table twice.  With
optimization at -Os, -O2, or -O3 (tested with x86_64 gcc 9.2) the compiler
does so.  However, the instructions that zero the table, search through
DELIM, and set the table are still repeated.  So manual inlining seems to
be 'optimal', but I understand that it means more work, since more
architecture-specific implementations have to be written.

I hope this patch/email/ChangeLog format is somewhat correct!

ChangeLog:

2020-03-12  Joseph Yoo

	* benchtests/bench-string.h (STRTOK_R): Define symbol for strtok_r.
	* benchtests/bench-strtok.c: Change output to JSON and change IMPL
	calls to strtok_r instead of strtok so multiple versions of
	strtok_r can be compared.  Add cases where the size of DELIM is
	around 16 (the size of a 128-bit register, and also the limit at
	which str(c)spn hands the call off to __str(c)spn_sse2, which is
	just the generic version).
	* string/strtok_r.c: Optimize function.
	* sysdeps/i386/i686/multiarch/Makefile: Add strtok_r-c and
	strtok_r-ia32 to sysdep_routines.
	* sysdeps/i386/i686/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Add ia32 (generic) and SSE4.2 versions.
	* sysdeps/i386/i686/multiarch/strtok_r-c.c: SSE4.2 version that just
	includes the x86_64 SSE4.2 impl.
	* sysdeps/i386/i686/multiarch/strtok_r-ia32.c: Default version that
	includes the generic impl.
	* sysdeps/i386/i686/multiarch/strtok_r.c: Define the versions.
	* sysdeps/powerpc/powerpc64/multiarch/Makefile: Add strtok_r-power8
	and strtok_r-ppc64 to sysdep_routines.
	* sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Add ppc64 (generic) and power8 versions.
	* sysdeps/powerpc/powerpc64/multiarch/strtok_r-power8.c: Call the
	power8 vector implementations of strspn and strcspn (as the current
	generic version does).
	* sysdeps/powerpc/powerpc64/multiarch/strtok_r-ppc64.c: Default
	version that includes the generic impl.
	* sysdeps/powerpc/powerpc64/multiarch/strtok_r.c: Define the
	versions.
	* sysdeps/s390/Makefile: Add strtok_r, strtok_r-vx, and strtok_r-c
	to sysdep_routines.
	* sysdeps/s390/ifunc-strtok_r.h: New ifunc header for strtok_r (to
	define macros).
	* sysdeps/s390/multiarch/ifunc-impl-list.c: Add STRTOK_R_C (generic)
	and STRTOK_R_Z13 versions based on macros in header.
	* sysdeps/s390/strtok_r-c.c: Default/generic version.
	* sysdeps/s390/strtok_r-vx.c: Call the str(c)spn_vx implementations
	as the current generic version does.
	* sysdeps/s390/strtok_r.c: Define versions.
	* sysdeps/x86_64/multiarch/Makefile: Add strtok_r-sse2 and
	strtok_r-c to sysdep_routines.
	* sysdeps/x86_64/multiarch/ifunc-impl-list.c: Add sse2 (generic) and
	sse4.2 versions.
	* sysdeps/x86_64/multiarch/strtok_r-c.c: Call the str(c)spn_sse42
	versions as the current generic version does.
	* sysdeps/x86_64/multiarch/strtok_r-sse2.c: Default/generic version.
	* sysdeps/x86_64/multiarch/strtok_r.c: Define versions.

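
For readers less familiar with it, the string/strtok_r.c change listed above scans DELIM four bytes at a time using the classic zero-byte test on a 32-bit word.  The sketch below shows that test in isolation.  The helper names zero_byte_mask and lowest_zero_byte are hypothetical and do not appear in the patch; the sketch works on plain integer values, so it says nothing about how the word is loaded from memory.

```c
#include <stdint.h>

/* Nonzero if and only if at least one of the four bytes of WORD is
   zero.  Hypothetical helper name, for illustration only.  */
static uint32_t
zero_byte_mask (uint32_t word)
{
  return (word - UINT32_C (0x01010101)) & ~word & UINT32_C (0x80808080);
}

/* Index of the lowest zero byte of WORD (0 = least significant byte),
   or -1 if no byte is zero.  Only the lowest flagged byte is reliable:
   the subtraction borrows upward, so a byte holding 0x01 directly
   above a zero byte is flagged as well.  */
static int
lowest_zero_byte (uint32_t word)
{
  uint32_t mask = zero_byte_mask (word);
  int i;

  for (i = 0; i < 4; ++i)
    if (mask & (UINT32_C (0x80) << (8 * i)))
      return i;
  return -1;
}
```

Because of the borrow, the mask is exact only up to and including the lowest zero byte.  That is enough when the byte of interest is the least significant one, but it may be worth double-checking wherever a higher-order byte is read directly from the mask, as can happen with big-endian loads.
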
--- benchtests/bench-string.h | 1 + benchtests/bench-strtok.c | 213 +++++++++-------- string/strtok_r.c | 225 +++++++++++++++--- sysdeps/i386/i686/multiarch/Makefile | 4 +- sysdeps/i386/i686/multiarch/ifunc-impl-list.c | 6 + sysdeps/i386/i686/multiarch/strtok_r-c.c | 1 + sysdeps/i386/i686/multiarch/strtok_r-ia32.c | 23 ++ sysdeps/i386/i686/multiarch/strtok_r.c | 33 +++ sysdeps/powerpc/powerpc64/multiarch/Makefile | 2 +- .../powerpc64/multiarch/ifunc-impl-list.c | 7 + .../powerpc64/multiarch/strtok_r-power8.c | 58 +++++ .../powerpc64/multiarch/strtok_r-ppc64.c | 28 +++ .../powerpc/powerpc64/multiarch/strtok_r.c | 35 +++ sysdeps/s390/Makefile | 1 + sysdeps/s390/ifunc-strtok_r.h | 52 ++++ sysdeps/s390/multiarch/ifunc-impl-list.c | 13 + sysdeps/s390/strtok_r-c.c | 30 +++ sysdeps/s390/strtok_r-vx.c | 69 ++++++ sysdeps/s390/strtok_r.c | 40 ++++ sysdeps/x86_64/multiarch/Makefile | 3 +- sysdeps/x86_64/multiarch/ifunc-impl-list.c | 6 + sysdeps/x86_64/multiarch/strtok_r-c.c | 63 +++++ sysdeps/x86_64/multiarch/strtok_r-sse2.c | 23 ++ sysdeps/x86_64/multiarch/strtok_r.c | 42 ++++ 24 files changed, 845 insertions(+), 133 deletions(-) create mode 100644 sysdeps/i386/i686/multiarch/strtok_r-c.c create mode 100644 sysdeps/i386/i686/multiarch/strtok_r-ia32.c create mode 100644 sysdeps/i386/i686/multiarch/strtok_r.c create mode 100644 sysdeps/powerpc/powerpc64/multiarch/strtok_r-power8.c create mode 100644 sysdeps/powerpc/powerpc64/multiarch/strtok_r-ppc64.c create mode 100644 sysdeps/powerpc/powerpc64/multiarch/strtok_r.c create mode 100644 sysdeps/s390/ifunc-strtok_r.h create mode 100644 sysdeps/s390/strtok_r-c.c create mode 100644 sysdeps/s390/strtok_r-vx.c create mode 100644 sysdeps/s390/strtok_r.c create mode 100644 sysdeps/x86_64/multiarch/strtok_r-c.c create mode 100644 sysdeps/x86_64/multiarch/strtok_r-sse2.c create mode 100644 sysdeps/x86_64/multiarch/strtok_r.c diff --git a/benchtests/bench-string.h b/benchtests/bench-string.h index 841a66a9d8..20c74bbec3 100644 --- 
a/benchtests/bench-string.h +++ b/benchtests/bench-string.h @@ -89,6 +89,7 @@ extern impl_t __start_impls[], __stop_impls[]; # define STRSPN strspn # define STPCPY stpcpy # define STPNCPY stpncpy +# define STRTOK_R strtok_r # else # include # define CHAR wchar_t diff --git a/benchtests/bench-strtok.c b/benchtests/bench-strtok.c index 7012fb9265..2fcfa5c45d 100644 --- a/benchtests/bench-strtok.c +++ b/benchtests/bench-strtok.c @@ -19,69 +19,43 @@ #define TEST_MAIN #define TEST_NAME "strtok" #include "bench-string.h" +#include "json-lib.h" -char * -oldstrtok (char *s, const char *delim) -{ - static char *olds; - char *token; - - if (s == NULL) - s = olds; - - /* Scan leading delimiters. */ - s += strspn (s, delim); - if (*s == '\0') - { - olds = s; - return NULL; - } +typedef char *(*proto_t) (char *, const char *, char **); - /* Find the end of the token. */ - token = s; - s = strpbrk (token, delim); - if (s == NULL) - /* This token finishes the string. */ - olds = rawmemchr (token, '\0'); - else - { - /* Terminate the token and make OLDS point past it. 
*/ - *s = '\0'; - olds = s + 1; - } - return token; -} +char *generic_strtok_r (char *, const char *, char **); -typedef char *(*proto_t) (const char *, const char *); - -IMPL (oldstrtok, 0) -IMPL (strtok, 1) +IMPL (generic_strtok_r, 0) +IMPL (STRTOK_R, 1) static void -do_one_test (impl_t * impl, const char *s1, const char *s2) +do_one_test (impl_t *impl, char *str, const char *delim, json_ctx_t *json_ctx) { size_t i, iters = INNER_LOOP_ITERS_SMALL; timing_t start, stop, cur; + + char *savep; TIMING_NOW (start); for (i = 0; i < iters; ++i) { - CALL (impl, s1, s2); - CALL (impl, NULL, s2); - CALL (impl, NULL, s2); + + char *_ = CALL (impl, str, delim, &savep); + + while ((_ = CALL (impl, NULL, delim, &savep))) + ; } TIMING_NOW (stop); TIMING_DIFF (cur, start, stop); - TIMING_PRINT_MEAN ((double) cur, (double) iters); - + json_element_double (json_ctx, (double)cur / (double)iters); } - static void -do_test (size_t align1, size_t align2, size_t len1, size_t len2, int fail) +do_test (size_t align1, size_t align2, size_t len1, size_t len2, int fail, + json_ctx_t *json_ctx) { - char *s2 = (char *) (buf2 + align2); + char *s2 = (char *)(buf2 + align2); static const char d[] = "1234567890abcdef"; #define dl (sizeof (d) - 1) char *ss2 = s2; @@ -92,89 +66,128 @@ do_test (size_t align1, size_t align2, size_t len1, size_t len2, int fail) } s2[len2] = '\0'; - printf ("Length %4zd/%zd, alignment %2zd/%2zd, %s:", - len1, len2, align1, align2, fail ? "fail" : "found"); + json_element_object_begin (json_ctx); + json_array_begin (json_ctx, "length"); + json_element_uint (json_ctx, len1); + json_element_uint (json_ctx, len2); + json_array_end (json_ctx); + json_array_begin (json_ctx, "alignment"); + json_element_uint (json_ctx, align1); + json_element_uint (json_ctx, align2); + json_array_end (json_ctx); + json_array_begin (json_ctx, fail ? 
"fail" : "found"); FOR_EACH_IMPL (impl, 0) { - char *s1 = (char *) (buf1 + align1); + char *s1 = (char *)(buf1 + align1); if (fail) { - char *ss1 = s1; - for (size_t l = len1; l > 0; l = l > dl ? l - dl : 0) - { - size_t t = l > dl ? dl : l; - memcpy (ss1, d, t); - ++ss1[len2 > 7 ? 7 : len2 - 1]; - ss1 += t; - } + char *ss1 = s1; + for (size_t l = len1; l > 0; l = l > dl ? l - dl : 0) + { + size_t t = l > dl ? dl : l; + memcpy (ss1, d, t); + ++ss1[len2 > 7 ? 7 : len2 - 1]; + ss1 += t; + } } else { - memset (s1, '0', len1); - memcpy (s1 + (len1 - len2) - 2, s2, len2); - if ((len1 / len2) > 4) - memcpy (s1 + (len1 - len2) - (3 * len2), s2, len2); + memset (s1, '0', len1); + memcpy (s1 + (len1 - len2) - 2, s2, len2); + if ((len1 / len2) > 4) + memcpy (s1 + (len1 - len2) - (3 * len2), s2, len2); } s1[len1] = '\0'; - do_one_test (impl, s1, s2); + do_one_test (impl, s1, s2, json_ctx); } - putchar ('\n'); + json_array_end (json_ctx); + json_element_object_end (json_ctx); } static int test_main (void) { + json_ctx_t json_ctx; test_init (); - printf ("%23s", ""); + json_init (&json_ctx, 0, stdout); + + json_document_begin (&json_ctx); + json_attr_string (&json_ctx, "timing_type", TIMING_TYPE); + + json_attr_object_begin (&json_ctx, "functions"); + json_attr_object_begin (&json_ctx, TEST_NAME); + json_attr_string (&json_ctx, "bench-variant", ""); + + json_array_begin (&json_ctx, "ifuncs"); FOR_EACH_IMPL (impl, 0) - printf ("\t%s", impl->name); - putchar ('\n'); + json_element_string (&json_ctx, impl->name); + json_array_end (&json_ctx); + + json_array_begin (&json_ctx, "results"); for (size_t klen = 2; klen < 32; ++klen) for (size_t hlen = 2 * klen; hlen < 16 * klen; hlen += klen) { - do_test (0, 0, hlen, klen, 0); - do_test (0, 0, hlen, klen, 1); - do_test (0, 3, hlen, klen, 0); - do_test (0, 3, hlen, klen, 1); - do_test (0, 9, hlen, klen, 0); - do_test (0, 9, hlen, klen, 1); - do_test (0, 15, hlen, klen, 0); - do_test (0, 15, hlen, klen, 1); - - do_test (3, 0, hlen, klen, 
0); - do_test (3, 0, hlen, klen, 1); - do_test (3, 3, hlen, klen, 0); - do_test (3, 3, hlen, klen, 1); - do_test (3, 9, hlen, klen, 0); - do_test (3, 9, hlen, klen, 1); - do_test (3, 15, hlen, klen, 0); - do_test (3, 15, hlen, klen, 1); - - do_test (9, 0, hlen, klen, 0); - do_test (9, 0, hlen, klen, 1); - do_test (9, 3, hlen, klen, 0); - do_test (9, 3, hlen, klen, 1); - do_test (9, 9, hlen, klen, 0); - do_test (9, 9, hlen, klen, 1); - do_test (9, 15, hlen, klen, 0); - do_test (9, 15, hlen, klen, 1); - - do_test (15, 0, hlen, klen, 0); - do_test (15, 0, hlen, klen, 1); - do_test (15, 3, hlen, klen, 0); - do_test (15, 3, hlen, klen, 1); - do_test (15, 9, hlen, klen, 0); - do_test (15, 9, hlen, klen, 1); - do_test (15, 15, hlen, klen, 0); - do_test (15, 15, hlen, klen, 1); + do_test (0, 0, hlen, klen, 0, &json_ctx); + do_test (0, 0, hlen, klen, 1, &json_ctx); + do_test (0, 3, hlen, klen, 0, &json_ctx); + do_test (0, 3, hlen, klen, 1, &json_ctx); + do_test (0, 9, hlen, klen, 0, &json_ctx); + do_test (0, 9, hlen, klen, 1, &json_ctx); + do_test (0, 15, hlen, klen, 0, &json_ctx); + do_test (0, 15, hlen, klen, 1, &json_ctx); + + do_test (3, 0, hlen, klen, 0, &json_ctx); + do_test (3, 0, hlen, klen, 1, &json_ctx); + do_test (3, 3, hlen, klen, 0, &json_ctx); + do_test (3, 3, hlen, klen, 1, &json_ctx); + do_test (3, 9, hlen, klen, 0, &json_ctx); + do_test (3, 9, hlen, klen, 1, &json_ctx); + do_test (3, 15, hlen, klen, 0, &json_ctx); + do_test (3, 15, hlen, klen, 1, &json_ctx); + + do_test (9, 0, hlen, klen, 0, &json_ctx); + do_test (9, 0, hlen, klen, 1, &json_ctx); + do_test (9, 3, hlen, klen, 0, &json_ctx); + do_test (9, 3, hlen, klen, 1, &json_ctx); + do_test (9, 9, hlen, klen, 0, &json_ctx); + do_test (9, 9, hlen, klen, 1, &json_ctx); + do_test (9, 15, hlen, klen, 0, &json_ctx); + do_test (9, 15, hlen, klen, 1, &json_ctx); + + do_test (15, 0, hlen, klen, 0, &json_ctx); + do_test (15, 0, hlen, klen, 1, &json_ctx); + do_test (15, 3, hlen, klen, 0, &json_ctx); + do_test (15, 
3, hlen, klen, 1, &json_ctx); + do_test (15, 9, hlen, klen, 0, &json_ctx); + do_test (15, 9, hlen, klen, 1, &json_ctx); + do_test (15, 15, hlen, klen, 0, &json_ctx); + do_test (15, 15, hlen, klen, 1, &json_ctx); } - do_test (0, 0, page_size - 1, 16, 0); - do_test (0, 0, page_size - 1, 16, 1); + + do_test (0, 0, page_size - 1, 3, 0, &json_ctx); + do_test (0, 0, page_size - 1, 3, 1, &json_ctx); + do_test (0, 0, page_size - 1, 7, 0, &json_ctx); + do_test (0, 0, page_size - 1, 7, 1, &json_ctx); + do_test (0, 0, page_size - 1, 15, 0, &json_ctx); + do_test (0, 0, page_size - 1, 15, 1, &json_ctx); + do_test (9, 0, page_size / 2, 17, 0, &json_ctx); + do_test (9, 0, page_size / 2, 17, 1, &json_ctx); + do_test (0, 0, page_size - 1, 100, 0, &json_ctx); + do_test (0, 0, page_size - 1, 100, 1, &json_ctx); + + json_array_end (&json_ctx); + json_attr_object_end (&json_ctx); + json_attr_object_end (&json_ctx); + json_document_end (&json_ctx); return ret; } #include + +#undef STRTOK_R +#define STRTOK_R generic_strtok_r +#include \ No newline at end of file diff --git a/string/strtok_r.c b/string/strtok_r.c index b8359c8653..784ecba8f7 100644 --- a/string/strtok_r.c +++ b/string/strtok_r.c @@ -16,16 +16,19 @@ License along with the GNU C Library; if not, see . */ -#ifdef HAVE_CONFIG_H -# include -#endif - +#include #include -#ifndef _LIBC -/* Get specification. */ -# include "strtok_r.h" -# define __strtok_r strtok_r +#undef __strtok_r +#undef strtok_r + +#ifndef STRTOK_R +# ifdef weak_alias +# define STRTOK_R __strtok_r + weak_alias (__strtok_r, strtok_r) +# else +# define STRTOK_R strtok_r +# endif #endif /* Parse S into tokens separated by characters in DELIM. 
@@ -37,43 +40,207 @@ x = strtok_r(NULL, "-=", &sp); // x = "def", sp = NULL x = strtok_r(NULL, "=", &sp); // x = NULL // s = "abc\0-def\0" + + This generic implementation can be thought of as 3 (sub-)routines: + [0] Process delims - + Set up a look-up table with the delimiting characters for + the input string to compare against + [1] Find start - Iterate through input string until non-delimiting + character is reached-- basically strspn. + [2] Find end - Iterate through input string until delimiting + character is reached-- basically strcspn. */ char * -__strtok_r (char *s, const char *delim, char **save_ptr) +STRTOK_R (char *start, const char *delim, char **save_ptr) { - char *end; + /* General pointer used to cast START, DELIM, and *SAVE_PTR + as unsigned char pointers */ + unsigned char *u; - if (s == NULL) - s = *save_ptr; + /** BEGIN ROUTINE 0 **/ + /* Zero-initialize a character-indexed look-up table. The offsets + corresponding to char values in DELIM store 1; otherwise, remain 0. + See str(c)spn implementations for original reference. */ + unsigned char dset[256]; + memset (dset, 0, 64); + memset (dset + 64, 0, 64); + memset (dset + 128, 0, 64); + memset (dset + 192, 0, 64); - if (*s == '\0') + /* To fill the table, search for the NUL byte in DELIM by checking 4 bytes, + and then aligning down (if, at all) to the closest lower + 4-byte boundary. 
Proceed to check 4 bytes at a time by loading into a + 4-byte integer 'word' */ + u = (unsigned char *)delim; + + if (__glibc_unlikely (u[0] == '\0')) ; + else if (u[1] == '\0') dset[u[0]] = 1; + else if (u[2] == '\0') dset[u[0]] = 1, dset[u[1]] = 1; + else if (u[3] == '\0') dset[u[0]] = 1, dset[u[1]] = 1, dset[u[2]] = 1; + else { - *save_ptr = s; - return NULL; + dset[u[0]] = 1, dset[u[1]] = 1, dset[u[2]] = 1, dset[u[3]] = 1; + + /* Align down to 4-byte boundary (+ the 4 bytes already checked) */ + u = PTR_ALIGN_DOWN (u, 4) + 4; + +#if __INT_LEAST32_WIDTH__ == 32 && __CHAR_BIT__ == 8 + + uint_fast32_t zmask, word = *(uint_least32_t *)u; + /* The classic bit-twiddling check for 0-byte in a word, in which + the resulting 'mask', ZMASK, sets 0x80 where WORD contains the zero + byte and 0x00 for nonzero bytes. Thus, break the loop if ZMASK + isn't all zeros. */ + while ((zmask = ~word & (word - 0x01010101UL) & 0x80808080UL) == 0) + { + dset[u[0]] = 1; + dset[u[1]] = 1; + dset[u[2]] = 1; + dset[u[3]] = 1; + word = *(uint_least32_t *)(u += 4); + } + +/* macro to handle the remaining bytes using zmask */ +# define handle_zmask(shft0, fst2_nonzero, shft2) \ + { \ + /* Move MSB to LSB and XOR to get (in bits): + 0..1 from 0..0 + 0..0 from 1..0, + effectively doing (!!) w/o flag dependency */ \ + dset[u[0]] = (unsigned char) (zmask >> shft0) ^ 1; \ + /* fst2_nonzero is true if the first 2 bytes of u are nonzero + (i.e. 
zmask[0/1] = 0/0) */ \ + if (fst2_nonzero) \ + dset[u[1]] = 1, \ + dset[u[2]] = (unsigned char) (zmask >> shft2) ^ 1; \ + } + +/* handle the remaining bytes */ +# if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ + handle_zmask (7, (uint_least16_t)zmask == 0x0000, 23); +# elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ + handle_zmask (31, zmask <= 0x8080, 15); +# elif __BYTE_ORDER__ == __ORDER_PDP_ENDIAN__ + handle_zmask (23, zmask <= 0x8080, 7); +# endif /* END __BYTE_ORDER__ == X */ + +#else /* Portably do it in bytes */ + int c0 = (_Bool)u[0], + c1 = (_Bool)u[1], + c2 = (_Bool)u[2], + c3 = (_Bool)u[3]; + + while (c0 & c1 & c2 & c3) + { + dset[u[0]] = 1; + dset[u[1]] = 1; + dset[u[2]] = 1; + dset[u[3]] = 1; + + u += 4; + + c0 = (_Bool)u[0]; + c1 = (_Bool)u[1]; + c2 = (_Bool)u[2]; + c3 = (_Bool)u[3]; + } + + dset[u[0]] = c0; + if (c0 & c1) + dset[u[1]] = 1, + dset[u[2]] = c2; +#endif /* END __INT_LEAST32_WIDTH__ == 32 && __CHAR_BIT__ == 8 */ } + /** END ROUTINE 0 **/ - /* Scan leading delimiters. */ - s += strspn (s, delim); - if (*s == '\0') + /* From this point on, U refers to the input string, conditionally + START or *SAVE_P. */ + u = (unsigned char *)*save_ptr; + if (__glibc_unlikely (start != NULL)) + u = (unsigned char *)start; + + /** BEGIN ROUTINE 1 **/ + /* Find first character in U that is not in DSET */ + if (!dset[u[0]]) + ; + else if (!dset[u[1]]) + u += 1; + else if (!dset[u[2]]) + u += 2; + else if (!dset[u[3]]) + u += 3; + else + { + /* If there were a 'fast type' for an implicit int (i.e. one without + a specified minimum width), it should be that of the int with the + minimum possible width, 16) */ + int_fast16_t s0, s2, det; + + u = PTR_ALIGN_DOWN (u, 4); + do + { + u += 4; + + s0 = dset[u[0]]; + det = dset[u[1]] & s0; + s2 = dset[u[2]]; + } + while (det & s2 & dset[u[3]]); + + u += !det ? 
s0 : s2 + 2; + } + /** END ROUTINE 1 **/ + + /* End of string is reached */ + if (__glibc_unlikely (*u == '\0')) { - *save_ptr = s; + *save_ptr = (char *)u; return NULL; } - /* Find the end of the token. */ - end = s + strcspn (s, delim); - if (*end == '\0') + /* End of string is not yet reached, so set START of return token */ + start = (char *)u; + /* For NUL to continue causing a break in ROUTINE 2, set DSET[NUL] to 1 */ + dset['\0'] = 1; + + /** BEGIN ROUTINE 2 **/ + /* Find first character in start that is in DSET */ + if (dset[u[0]]) + ; + else if (dset[u[1]]) + u += 1; + else if (dset[u[2]]) + u += 2; + else if (dset[u[3]]) + u += 3; + else + { + int_fast16_t s0, s2, det; + + u = PTR_ALIGN_DOWN (u, 4); + do + { + u += 4; + + s0 = dset[u[0]]; + det = dset[u[1]] | s0; + s2 = dset[u[2]]; + } + while ((det | s2 | dset[u[3]]) == 0); + + u += det ? 1 - s0 : 3 - s2; + } + /** END ROUTINE 2 **/ + + *save_ptr = (char *)u; + if (__glibc_likely (*u != 0)) { - *save_ptr = end; - return s; + *u = 0; + (*save_ptr)++; } - /* Terminate the token and make *SAVE_PTR point past it. 
*/ - *end = '\0'; - *save_ptr = end + 1; - return s; + return start; } #ifdef weak_alias libc_hidden_def (__strtok_r) -weak_alias (__strtok_r, strtok_r) #endif diff --git a/sysdeps/i386/i686/multiarch/Makefile b/sysdeps/i386/i686/multiarch/Makefile index bf75a9947f..83540b2d8b 100644 --- a/sysdeps/i386/i686/multiarch/Makefile +++ b/sysdeps/i386/i686/multiarch/Makefile @@ -24,13 +24,13 @@ sysdep_routines += bzero-sse2 memset-sse2 memcpy-ssse3 mempcpy-ssse3 \ strcasecmp_l-sse4 strncase_l-sse4 \ bcopy-sse2-unaligned memcpy-sse2-unaligned \ mempcpy-sse2-unaligned memmove-sse2-unaligned \ - strcspn-c strpbrk-c strspn-c \ + strcspn-c strpbrk-c strspn-c strtok_r-c \ bcopy-ia32 bzero-ia32 rawmemchr-ia32 \ memchr-ia32 memcmp-ia32 memcpy-ia32 memmove-ia32 \ mempcpy-ia32 memset-ia32 strcat-ia32 strchr-ia32 \ strrchr-ia32 strcpy-ia32 strcmp-ia32 strcspn-ia32 \ strpbrk-ia32 strspn-ia32 strlen-ia32 stpcpy-ia32 \ - stpncpy-ia32 + stpncpy-ia32 strtok_r-ia32 CFLAGS-varshift.c += -msse4 CFLAGS-strcspn-c.c += -msse4 CFLAGS-strpbrk-c.c += -msse4 diff --git a/sysdeps/i386/i686/multiarch/ifunc-impl-list.c b/sysdeps/i386/i686/multiarch/ifunc-impl-list.c index 23774fbe8a..022725e40c 100644 --- a/sysdeps/i386/i686/multiarch/ifunc-impl-list.c +++ b/sysdeps/i386/i686/multiarch/ifunc-impl-list.c @@ -272,6 +272,12 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array, __strspn_sse42) IFUNC_IMPL_ADD (array, i, strspn, 1, __strspn_ia32)) + /* Support sysdeps/i386/i686/multiarch/strtok_r.c. */ + IFUNC_IMPL (i, name, strtok_r, + IFUNC_IMPL_ADD (array, i, strtok_r, HAS_CPU_FEATURE (SSE4_2), + __strtok_r_sse42) + IFUNC_IMPL_ADD (array, i, strtok_r, 1, __strtok_r_ia32)) + /* Support sysdeps/i386/i686/multiarch/wcschr.S. 
*/ IFUNC_IMPL (i, name, wcschr, IFUNC_IMPL_ADD (array, i, wcschr, HAS_CPU_FEATURE (SSE2), diff --git a/sysdeps/i386/i686/multiarch/strtok_r-c.c b/sysdeps/i386/i686/multiarch/strtok_r-c.c new file mode 100644 index 0000000000..0e9b303c0d --- /dev/null +++ b/sysdeps/i386/i686/multiarch/strtok_r-c.c @@ -0,0 +1 @@ +#include \ No newline at end of file diff --git a/sysdeps/i386/i686/multiarch/strtok_r-ia32.c b/sysdeps/i386/i686/multiarch/strtok_r-ia32.c new file mode 100644 index 0000000000..2fa48abbb2 --- /dev/null +++ b/sysdeps/i386/i686/multiarch/strtok_r-ia32.c @@ -0,0 +1,23 @@ +/* strtok_r optimized for i686 (generic). + Copyright (C) 2020 Free Software Foundation, Inc. + This file is part of the GNU C Library. + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#define STRTOK_R __strtok_r_ia32 + +#undef weak_alias +#define weak_alias(ignored1, ignored2) +#undef libc_hidden_def +#define libc_hidden_def(strtok_r) + +#include \ No newline at end of file diff --git a/sysdeps/i386/i686/multiarch/strtok_r.c b/sysdeps/i386/i686/multiarch/strtok_r.c new file mode 100644 index 0000000000..2662ad78db --- /dev/null +++ b/sysdeps/i386/i686/multiarch/strtok_r.c @@ -0,0 +1,33 @@ +/* Multiple versions of strtok_r. + All versions must be listed in ifunc-impl-list.c. + Copyright (C) 2017-2020 Free Software Foundation, Inc. + This file is part of the GNU C Library. 
+ + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +/* Define multiple versions only for the definition in libc. */ +#if IS_IN (libc) +# define strtok_r __redirect_strtok_r +# define __strtok_r __redirect___strtok_r +# include +# undef strtok_r +# undef __strtok_r + +# define SYMBOL_NAME strtok_r +# include "ifunc-sse4_2.h" + +libc_ifunc_redirected (__redirect_strtok_r, __strtok_r, IFUNC_SELECTOR ()); +weak_alias (__strtok_r, strtok_r) +#endif diff --git a/sysdeps/powerpc/powerpc64/multiarch/Makefile b/sysdeps/powerpc/powerpc64/multiarch/Makefile index ea936bf9ed..f22fd825bb 100644 --- a/sysdeps/powerpc/powerpc64/multiarch/Makefile +++ b/sysdeps/powerpc/powerpc64/multiarch/Makefile @@ -29,7 +29,7 @@ sysdep_routines += memcpy-power8-cached memcpy-power7 memcpy-a2 memcpy-power6 \ strspn-power8 strspn-ppc64 strcspn-power8 strcspn-ppc64 \ strlen-power8 strcasestr-power8 strcasestr-ppc64 \ strcasecmp-ppc64 strcasecmp-power8 strncase-ppc64 \ - strncase-power8 + strncase-power8 strtok_r-power8 strtok_r-ppc64 ifneq (,$(filter %le,$(config-machine))) sysdep_routines += strcmp-power9 strncmp-power9 diff --git a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c index b9fef3f43c..cd131ec70b 100644 --- a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c +++ b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c @@ -355,6 
+355,13 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array, IFUNC_IMPL_ADD (array, i, strstr, 1, __strstr_ppc)) + /* Support sysdeps/powerpc/powerpc64/multiarch/strtok_r.c. */ + IFUNC_IMPL (i, name, strtok_r, + IFUNC_IMPL_ADD (array, i, strtok_r, + hwcap2 & PPC_FEATURE2_ARCH_2_07, + __strtok_r_power8) + IFUNC_IMPL_ADD (array, i, strtok_r, 1, + __strtok_r_ppc)) /* Support sysdeps/powerpc/powerpc64/multiarch/strcasestr.c. */ IFUNC_IMPL (i, name, strcasestr, diff --git a/sysdeps/powerpc/powerpc64/multiarch/strtok_r-power8.c b/sysdeps/powerpc/powerpc64/multiarch/strtok_r-power8.c new file mode 100644 index 0000000000..d2ce7b1429 --- /dev/null +++ b/sysdeps/powerpc/powerpc64/multiarch/strtok_r-power8.c @@ -0,0 +1,58 @@ +/* Optimized strtok_r implementation for POWER8. + Copyright (C) 2016-2020 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include + +extern size_t __strspn_power8 (const char *, const char *) attribute_hidden; +extern size_t __strcspn_power8 (const char *, const char *) attribute_hidden; + +/* __asm__ (".machine power8"); */ +char * +__strtok_r_power8 (char * s, const char *delim, char **save_p) +{ + char *end; + + if (s == NULL) s = *save_p; + + if (*s == '\0') + { + *save_p = s; + return NULL; + } + + /* Scan leading delimiters. 
*/ + s += __strspn_power8 (s, delim); + if (*s == '\0') + { + *save_p = s; + return NULL; + } + + /* Find the end of the token. */ + end = s + __strcspn_power8 (s, delim); + if (*end == '\0') + { + *save_p = end; + return s; + } + + /* Terminate the token and make *save_p point past it. */ + *end = '\0'; + *save_p = end + 1; + return s; +} \ No newline at end of file diff --git a/sysdeps/powerpc/powerpc64/multiarch/strtok_r-ppc64.c b/sysdeps/powerpc/powerpc64/multiarch/strtok_r-ppc64.c new file mode 100644 index 0000000000..27b73fdb37 --- /dev/null +++ b/sysdeps/powerpc/powerpc64/multiarch/strtok_r-ppc64.c @@ -0,0 +1,28 @@ +/* PowerPC64 default implementation of strtok_r. + Copyright (C) 2013-2020 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include + +#define STRTOK_R __strtok_r_ppc + +#undef weak_alias +#define weak_alias(a,b ) + +extern __typeof (strtok_r) __strtok_r_ppc attribute_hidden; + +#include diff --git a/sysdeps/powerpc/powerpc64/multiarch/strtok_r.c b/sysdeps/powerpc/powerpc64/multiarch/strtok_r.c new file mode 100644 index 0000000000..fc54b994d7 --- /dev/null +++ b/sysdeps/powerpc/powerpc64/multiarch/strtok_r.c @@ -0,0 +1,35 @@ +/* Multiple versions of strtok_r. PowerPC64 version. + Copyright (C) 2016-2020 Free Software Foundation, Inc. + This file is part of the GNU C Library. 
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#if IS_IN (libc)
+# include
+# include
+# include "init-arch.h"
+
+extern __typeof (__strtok_r) __strtok_r_ppc attribute_hidden;
+extern __typeof (__strtok_r) __strtok_r_power8 attribute_hidden;
+
+libc_ifunc (__strtok_r,
+            (hwcap2 & PPC_FEATURE2_ARCH_2_07)
+            ? __strtok_r_power8
+            : __strtok_r_ppc);
+
+weak_alias (__strtok_r, strtok_r)
+#else
+#include
+#endif
\ No newline at end of file
diff --git a/sysdeps/s390/Makefile b/sysdeps/s390/Makefile
index a8c49c928f..eb95105889 100644
--- a/sysdeps/s390/Makefile
+++ b/sysdeps/s390/Makefile
@@ -59,6 +59,7 @@ sysdep_routines += bzero memset memset-z900 \
                    mempcpy memcpy memcpy-z900 \
                    memmove memmove-c \
                    strstr strstr-arch13 strstr-vx strstr-c \
+                   strtok_r strtok_r-vx strtok_r-c \
                    memmem memmem-arch13 memmem-vx memmem-c \
                    strlen strlen-vx strlen-c \
                    strnlen strnlen-vx strnlen-c \
diff --git a/sysdeps/s390/ifunc-strtok_r.h b/sysdeps/s390/ifunc-strtok_r.h
new file mode 100644
index 0000000000..836e2f36b2
--- /dev/null
+++ b/sysdeps/s390/ifunc-strtok_r.h
@@ -0,0 +1,52 @@
+/* strtok_r variant information on S/390 version.
+   Copyright (C) 2018-2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#if defined USE_MULTIARCH && IS_IN (libc) \
+  && ! defined HAVE_S390_MIN_Z13_ZARCH_ASM_SUPPORT
+# define HAVE_STRTOK_R_IFUNC 1
+#else
+# define HAVE_STRTOK_R_IFUNC 0
+#endif
+
+#ifdef HAVE_S390_VX_ASM_SUPPORT
+# define HAVE_STRTOK_R_IFUNC_AND_VX_SUPPORT HAVE_STRTOK_R_IFUNC
+#else
+# define HAVE_STRTOK_R_IFUNC_AND_VX_SUPPORT 0
+#endif
+
+#if defined HAVE_S390_MIN_Z13_ZARCH_ASM_SUPPORT
+# define STRTOK_R_DEFAULT STRTOK_R_Z13
+# define HAVE_STRTOK_R_C 0
+# define HAVE_STRTOK_R_Z13 1
+#else
+# define STRTOK_R_DEFAULT STRTOK_R_C
+# define HAVE_STRTOK_R_C 1
+# define HAVE_STRTOK_R_Z13 HAVE_STRTOK_R_IFUNC_AND_VX_SUPPORT
+#endif
+
+#if HAVE_STRTOK_R_C
+# define STRTOK_R_C __strtok_r_c
+#else
+# define STRTOK_R_C NULL
+#endif
+
+#if HAVE_STRTOK_R_Z13
+# define STRTOK_R_Z13 __strtok_r_vx
+#else
+# define STRTOK_R_Z13 NULL
+#endif
diff --git a/sysdeps/s390/multiarch/ifunc-impl-list.c b/sysdeps/s390/multiarch/ifunc-impl-list.c
index e6195c6e26..4c86eac432 100644
--- a/sysdeps/s390/multiarch/ifunc-impl-list.c
+++ b/sysdeps/s390/multiarch/ifunc-impl-list.c
@@ -25,6 +25,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -200,6 +201,18 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
                 )
 #endif /* HAVE_STRSTR_IFUNC */
+#if HAVE_STRTOK_R_IFUNC
+    IFUNC_IMPL (i, name, strtok_r,
+# if HAVE_STRTOK_R_Z13
+                IFUNC_IMPL_ADD (array, i, strtok_r,
+                                dl_hwcap & HWCAP_S390_VX, STRTOK_R_Z13)
+# endif
+# if HAVE_STRTOK_R_C
+                IFUNC_IMPL_ADD (array, i, strtok_r, 1, STRTOK_R_C)
+# endif
+                )
+#endif /* HAVE_STRTOK_R_IFUNC */
+
 #if HAVE_MEMMEM_IFUNC
     IFUNC_IMPL (i, name, memmem,
 # if HAVE_MEMMEM_ARCH13
diff --git a/sysdeps/s390/strtok_r-c.c b/sysdeps/s390/strtok_r-c.c
new file mode 100644
index 0000000000..aa5547e29e
--- /dev/null
+++ b/sysdeps/s390/strtok_r-c.c
@@ -0,0 +1,30 @@
+/* Default strtok_r implementation for S/390.
+   Copyright (C) 2015-2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#include
+
+#if HAVE_STRTOK_R_C
+# if HAVE_STRTOK_R_IFUNC
+# define STRTOK_R STRTOK_R_C
+# define __strtok_r STRTOK_R
+# undef weak_alias
+# define weak_alias(name, alias)
+# endif
+
+# include
+#endif
diff --git a/sysdeps/s390/strtok_r-vx.c b/sysdeps/s390/strtok_r-vx.c
new file mode 100644
index 0000000000..49b09e1fc3
--- /dev/null
+++ b/sysdeps/s390/strtok_r-vx.c
@@ -0,0 +1,69 @@
+/* Default strtok_r implementation with vector string functions for S/390.
+   Copyright (C) 2018-2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#include
+
+#if HAVE_STRTOK_R_Z13
+
+# include
+
+# ifdef USE_MULTIARCH
+extern __typeof (strspn) __strspn_vx attribute_hidden;
+extern __typeof (strcspn) __strcspn_vx attribute_hidden;
+# endif
+
+/* __asm__ (".machine \"z13\"\n\t.machinemode \"zarch_nohighprs\""); */
+
+char *
+STRTOK_R_Z13 (char * s, const char *delim, char **save_p)
+{
+  char *end;
+
+  if (s == NULL) s = *save_p;
+
+  if (*s == '\0')
+    {
+      *save_p = s;
+      return NULL;
+    }
+
+  /* Scan leading delimiters. */
+  s += __strspn_vx (s, delim);
+  if (*s == '\0')
+    {
+      *save_p = s;
+      return NULL;
+    }
+
+  /* Find the end of the token. */
+  end = s + __strcspn_vx (s, delim);
+  if (*end == '\0')
+    {
+      *save_p = end;
+      return s;
+    }
+
+  /* Terminate the token and make *save_p point past it. */
+  *end = '\0';
+  *save_p = end + 1;
+  return s;
+}
+
+strong_alias (STRTOK_R_Z13, __strtok_r)
+weak_alias (__strtok_r, strtok_r)
+#endif
diff --git a/sysdeps/s390/strtok_r.c b/sysdeps/s390/strtok_r.c
new file mode 100644
index 0000000000..026c136490
--- /dev/null
+++ b/sysdeps/s390/strtok_r.c
@@ -0,0 +1,40 @@
+/* Multiple versions of strtok_r.
+   Copyright (C) 2015-2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#include
+
+#if HAVE_STRTOK_R_IFUNC
+# include
+# include
+
+# if HAVE_STRTOK_R_C
+extern __typeof (__strtok_r) STRTOK_R_C attribute_hidden;
+# endif
+
+# if HAVE_STRTOK_R_Z13
+extern __typeof (__strtok_r) STRTOK_R_Z13 attribute_hidden;
+# endif
+
+s390_libc_ifunc_expr (__strtok_r, __strtok_r,
+                      (HAVE_STRTOK_R_Z13 && (hwcap & HWCAP_S390_VX))
+                      ? STRTOK_R_Z13
+                      : STRTOK_R_DEFAULT
+                      )
+weak_alias (__strtok_r, strtok_r)
+#endif
+
diff --git a/sysdeps/x86_64/multiarch/Makefile b/sysdeps/x86_64/multiarch/Makefile
index 395e432c09..969fe405d7 100644
--- a/sysdeps/x86_64/multiarch/Makefile
+++ b/sysdeps/x86_64/multiarch/Makefile
@@ -43,7 +43,8 @@ sysdep_routines += strncat-c stpncpy-c strncpy-c \
                    memmove-avx512-unaligned-erms \
                    memset-sse2-unaligned-erms \
                    memset-avx2-unaligned-erms \
-                   memset-avx512-unaligned-erms
+                   memset-avx512-unaligned-erms \
+                   strtok_r-sse2 strtok_r-c
 CFLAGS-varshift.c += -msse4
 CFLAGS-strcspn-c.c += -msse4
 CFLAGS-strpbrk-c.c += -msse4
diff --git a/sysdeps/x86_64/multiarch/ifunc-impl-list.c b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
index ce7eb1eecf..53da1ab784 100644
--- a/sysdeps/x86_64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
@@ -365,6 +365,12 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
               IFUNC_IMPL_ADD (array, i, strstr, 1,
__strstr_sse2_unaligned)
               IFUNC_IMPL_ADD (array, i, strstr, 1, __strstr_sse2))
+/* Support sysdeps/x86_64/multiarch/strtok_r.c. */
+  IFUNC_IMPL (i, name, strtok_r,
+              IFUNC_IMPL_ADD (array, i, strtok_r, HAS_CPU_FEATURE (SSE4_2),
+                              __strtok_r_sse42)
+              IFUNC_IMPL_ADD (array, i, strtok_r, 1, __strtok_r_sse2))
+
   /* Support sysdeps/x86_64/multiarch/wcschr.c. */
   IFUNC_IMPL (i, name, wcschr,
               IFUNC_IMPL_ADD (array, i, wcschr,
diff --git a/sysdeps/x86_64/multiarch/strtok_r-c.c b/sysdeps/x86_64/multiarch/strtok_r-c.c
new file mode 100644
index 0000000000..a7a95fc73c
--- /dev/null
+++ b/sysdeps/x86_64/multiarch/strtok_r-c.c
@@ -0,0 +1,63 @@
+/* strtok_r with SSE4.2 intrinsics
+   Copyright (C) 2009-2020 Free Software Foundation, Inc.
+   Contributed by Intel Corporation.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#include
+
+extern size_t __strspn_sse42 (const char *, const char *) attribute_hidden;
+extern size_t __strcspn_sse42 (const char *, const char *) attribute_hidden;
+
+/* Uses the SSE4.2 implementations of strspn and strcspn.
+ * This used to be the generic strtok_r.
+ */
+char *
+__attribute__ ((section (".text.sse4.2")))
+__strtok_r_sse42 (char *s, const char *delim, char **save_p)
+{
+  char *end;
+
+  if (s == NULL)
+    s = *save_p;
+
+  if (*s == '\0')
+    {
+      *save_p = s;
+      return NULL;
+    }
+
+  /* Scan leading delimiters.
*/
+  s += __strspn_sse42 (s, delim);
+  if (*s == '\0')
+    {
+      *save_p = s;
+      return NULL;
+    }
+
+  /* Find the end of the token. */
+  end = s + __strcspn_sse42 (s, delim);
+  if (*end == '\0')
+    {
+      *save_p = end;
+      return s;
+    }
+
+  /* Terminate the token and make *save_p point past it. */
+  *end = '\0';
+  *save_p = end + 1;
+  return s;
+}
diff --git a/sysdeps/x86_64/multiarch/strtok_r-sse2.c b/sysdeps/x86_64/multiarch/strtok_r-sse2.c
new file mode 100644
index 0000000000..19059d5a9a
--- /dev/null
+++ b/sysdeps/x86_64/multiarch/strtok_r-sse2.c
@@ -0,0 +1,23 @@
+/* strtok_r optimized with SSE2.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#define STRTOK_R __strtok_r_sse2
+
+#undef weak_alias
+#define weak_alias(ignored1, ignored2)
+#undef libc_hidden_def
+#define libc_hidden_def(strtok_r)
+
+#include
\ No newline at end of file
diff --git a/sysdeps/x86_64/multiarch/strtok_r.c b/sysdeps/x86_64/multiarch/strtok_r.c
new file mode 100644
index 0000000000..a3b35b9a02
--- /dev/null
+++ b/sysdeps/x86_64/multiarch/strtok_r.c
@@ -0,0 +1,42 @@
+/* Multiple versions of strtok_r.
+   All versions must be listed in ifunc-impl-list.c.
+   Copyright (C) 2017-2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+/* Define multiple versions only for the definition in libc. */
+#if IS_IN(libc)
+# define strtok_r __redirect_strtok_r
+# define __strtok_r __redirect___strtok_r
+
+# include
+
+# undef __strtok_r
+# undef strtok_r
+
+# define SYMBOL_NAME strtok_r
+
+# include "ifunc-sse4_2.h"
+
+libc_ifunc_redirected (__redirect_strtok_r, __strtok_r, IFUNC_SELECTOR ());
+
+weak_alias (__strtok_r, strtok_r)
+
+# ifdef SHARED
+  __hidden_ver1 (__strtok_r, __GI___strtok_r, __redirect___strtok_r)
+    __attribute__ ((visibility ("hidden")));
+# endif
+#endif
\ No newline at end of file
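
Not part of the patch: the POWER8, S/390 VX and SSE4.2 wrappers above all share one control flow (skip leading delimiters with strspn, find the token end with strcspn, terminate the token, store the resume point). A minimal standalone sketch of that flow follows, for trying it outside the glibc tree. It assumes the public strspn and strcspn in place of the hidden __strspn_* and __strcspn_* symbols the patch calls, and the name sketch_strtok_r is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical standalone illustration, not the patch's code: same
   control flow as the arch-specific wrappers, but calling the public
   strspn/strcspn instead of the hidden multiarch symbols.  */
char *
sketch_strtok_r (char *s, const char *delim, char **save_p)
{
  char *end;

  /* A NULL S means "continue where the previous call stopped".  */
  if (s == NULL)
    s = *save_p;

  if (*s == '\0')
    {
      *save_p = s;
      return NULL;
    }

  /* Skip leading delimiters.  */
  s += strspn (s, delim);
  if (*s == '\0')
    {
      *save_p = s;
      return NULL;
    }

  /* Find the end of the token.  */
  end = s + strcspn (s, delim);
  if (*end == '\0')
    {
      /* Last token: the next call sees the terminating NUL.  */
      *save_p = end;
      return s;
    }

  /* Terminate the token and make *save_p point past it.  */
  *end = '\0';
  *save_p = end + 1;
  return s;
}
```

As with strtok_r itself, the input buffer is modified in place, so it must be writable (an array, not a string literal), and the first call must pass a non-NULL string so that *save_p is initialized before it is read.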