From patchwork Wed Oct 24 12:38:03 2018
X-Patchwork-Submitter: "H.J. Lu"
X-Patchwork-Id: 29870
In-Reply-To: <87ftwv61uk.fsf@oldenburg.str.redhat.com>
References: <20180927194327.7683-1-hjl.tools@gmail.com> <20180927194327.7683-3-hjl.tools@gmail.com> <878t2n908g.fsf@oldenburg.str.redhat.com> <87ftwv61uk.fsf@oldenburg.str.redhat.com>
From: "H.J. Lu"
Date: Wed, 24 Oct 2018 05:38:03 -0700
Subject: Re: V5 [PATCH 2/2] x86: Add a LD_PRELOAD IFUNC resolver test for CPU_FEATURE_USABLE
To: Florian Weimer
Cc: libc-alpha@sourceware.org

On 10/24/18, Florian Weimer wrote:
> * H. J.
Lu:
>
>> I guess you knew that this issue was independent of my new functions.
>> You will get the same error regardless of what the get_free body has.
>
> Yes, the check is certainly overly conservative. I thought we want to
> remove it. Don't we trigger it in glibc in a few places? If the check
> is gone, then I think we will see incorrect results from the new
> interface.
>
> I think we are very consistent right now when it comes to relocations in
> IFUNC handlers. I want to see this settled before adding something that
> requires a relocation which is (among other things) targeted at IFUNC
> resolvers.
>

isn't targeted for IFUNC. My first use is to add
x86_tsc_to_ns and x86_ns_to_tsc. I am enclosing 2 patches here.

From d596a8620e40be7052329acdc1c4dc505f68e017 Mon Sep 17 00:00:00 2001
From: "H.J. Lu"
Date: Thu, 26 Jul 2018 13:46:05 -0700
Subject: [PATCH 2/2] sys/platform/x86.h: Add x86_tsc_to_ns/x86_ns_to_tsc

If "CPU_FEATURE_USABLE (TSC_To_NS)" evaluates true,

extern unsigned long long x86_tsc_to_ns (unsigned long long);
extern unsigned long long x86_ns_to_tsc (unsigned long long);

can be used to convert between time stamp counters (TSCs) and nanoseconds.

	* manual/platform.texi: Document x86_tsc_to_ns, x86_ns_to_tsc
	and TSC_To_NS.
	* sysdeps/mach/hurd/i386/libc.abilist: Add x86_ns_to_tsc and
	x86_tsc_to_ns.
	* sysdeps/unix/sysv/linux/i386/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/x86_64/64/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist: Likewise.
	* sysdeps/x86/Makefile (sysdep_routines): Add tsc-ns.
	(tests): Add tst-tsc-ns and tst-tsc-ns-static.
	(tests-static): Add tst-tsc-ns-static.
	* sysdeps/x86/Versions (libc::GLIBC_2.29): Add x86_tsc_to_ns and
	x86_ns_to_tsc.
	* sysdeps/x86/cpu-features.c (init_cpu_features): Update
	tsc_nccc_data and set TSC_To_NS_Usable if supported.
	* sysdeps/x86/cpu-features.h (tsc_nccc_info): New.
	(cpu_features): Add tsc_nccc_data.
	* sysdeps/x86/cpu-tunables.c (TUNABLE_CALLBACK (set_hwcaps)):
	Support TSC_To_NS_Usable.
	* sysdeps/x86/sys/platform/x86.h (x86_tsc_to_ns): New.
	(x86_ns_to_tsc): Likewise.
	(bit_arch_TSC_To_NS_Usable): Likewise.
	(index_arch_TSC_To_NS_Usable): Likewise.
	(bit_cpu_TSC_To_NS): Likewise.
	(index_cpu_TSC_To_NS): Likewise.
	(reg_TSC_To_NS): Likewise.
	(need_arch_feature_TSC_To_NS): Likewise.
	* sysdeps/x86/tsc-ns.c: New file.
	* sysdeps/x86/tst-tsc-ns-static.c: Likewise.
	* sysdeps/x86/tst-tsc-ns.c: Likewise.
	* sysdeps/x86/tst-x86-platform-1.c (do_test): Also check
	TSC_To_NS.
---
 manual/platform.texi                          | 14 +++-
 sysdeps/mach/hurd/i386/libc.abilist           |  2 +
 sysdeps/unix/sysv/linux/i386/libc.abilist     |  2 +
 .../unix/sysv/linux/x86_64/64/libc.abilist    |  2 +
 .../unix/sysv/linux/x86_64/x32/libc.abilist   |  2 +
 sysdeps/x86/Makefile                          |  5 ++
 sysdeps/x86/Versions                          |  6 ++
 sysdeps/x86/cpu-features.c                    | 41 ++++++++++++
 sysdeps/x86/cpu-features.h                    | 13 ++++
 sysdeps/x86/cpu-tunables.c                    |  9 ++-
 sysdeps/x86/sys/platform/x86.h                | 17 +++++
 sysdeps/x86/tsc-ns.c                          | 64 +++++++++++++++++++
 sysdeps/x86/tst-tsc-ns-static.c               |  1 +
 sysdeps/x86/tst-tsc-ns.c                      | 51 +++++++++++++++
 sysdeps/x86/tst-x86-platform-1.c              |  2 +
 15 files changed, 227 insertions(+), 4 deletions(-)
 create mode 100644 sysdeps/x86/tsc-ns.c
 create mode 100644 sysdeps/x86/tst-tsc-ns-static.c
 create mode 100644 sysdeps/x86/tst-tsc-ns.c

diff --git a/manual/platform.texi b/manual/platform.texi
index b0faa61a67..a3f43a1a1f 100644
--- a/manual/platform.texi
+++ b/manual/platform.texi
@@ -151,6 +151,18 @@ Internal function used by @code{HAS_CPU_FEATURE}.
 Internal function used by @code{CPU_FEATURE_USABLE}.
 @end deftypefun
 
+@deftypefun {unsigned long long} x86_tsc_to_ns (unsigned long long)
+
+Convert time stamp counter (TSC) to nanosecond if
+@code{CPU_FEATURE_USABLE (TSC_To_NS)} evaluates to true.
+@end deftypefun + +@deftypefun {unsigned long long} x86_ns_to_tsc (unsigned long long) + +Convert nanosecond to time stamp counter (TSC) if +@code{CPU_FEATURE_USABLE (TSC_To_NS)} evaluates to true. +@end deftypefun + @defmac HAS_CPU_FEATURE(name) Evaluate to true if the CPU feature @code{name} is supported as indicated @@ -606,7 +618,7 @@ The supported features are: @item @code{TBM} @tab @code{TSC} @tab @code{VAES} @tab @code{VPCLMULQDQ} @item -@code{XOP} @tab @code{XSAVE} @tab @code{XSAVEC} +@code{XOP} @tab @code{XSAVE} @tab @code{XSAVEC} @tab @code{TSC_To_NS} @end multitable @end defmac diff --git a/sysdeps/mach/hurd/i386/libc.abilist b/sysdeps/mach/hurd/i386/libc.abilist index e3fc05137b..deadfda6a9 100644 --- a/sysdeps/mach/hurd/i386/libc.abilist +++ b/sysdeps/mach/hurd/i386/libc.abilist @@ -2038,6 +2038,8 @@ GLIBC_2.27 wcstof64x_l F GLIBC_2.28 fcntl64 F GLIBC_2.28 renameat2 F GLIBC_2.28 statx F +GLIBC_2.29 x86_ns_to_tsc F +GLIBC_2.29 x86_tsc_to_ns F GLIBC_2.3 __ctype_b_loc F GLIBC_2.3 __ctype_tolower_loc F GLIBC_2.3 __ctype_toupper_loc F diff --git a/sysdeps/unix/sysv/linux/i386/libc.abilist b/sysdeps/unix/sysv/linux/i386/libc.abilist index 9762c81365..f499b74acc 100644 --- a/sysdeps/unix/sysv/linux/i386/libc.abilist +++ b/sysdeps/unix/sysv/linux/i386/libc.abilist @@ -2045,6 +2045,8 @@ GLIBC_2.28 thrd_current F GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F +GLIBC_2.29 x86_ns_to_tsc F +GLIBC_2.29 x86_tsc_to_ns F GLIBC_2.3 __ctype_b_loc F GLIBC_2.3 __ctype_tolower_loc F GLIBC_2.3 __ctype_toupper_loc F diff --git a/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist b/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist index 816e4a7426..436c18162b 100644 --- a/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist +++ b/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist @@ -1895,6 +1895,8 @@ GLIBC_2.28 thrd_current F GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F +GLIBC_2.29 x86_ns_to_tsc F +GLIBC_2.29 x86_tsc_to_ns F GLIBC_2.3 
__ctype_b_loc F GLIBC_2.3 __ctype_tolower_loc F GLIBC_2.3 __ctype_toupper_loc F diff --git a/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist b/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist index 6fee16a850..61a23fb686 100644 --- a/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist +++ b/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist @@ -2146,3 +2146,5 @@ GLIBC_2.28 thrd_current F GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F +GLIBC_2.29 x86_ns_to_tsc F +GLIBC_2.29 x86_tsc_to_ns F diff --git a/sysdeps/x86/Makefile b/sysdeps/x86/Makefile index 1e759d3efc..b2433c4317 100644 --- a/sysdeps/x86/Makefile +++ b/sysdeps/x86/Makefile @@ -26,6 +26,11 @@ endif ifeq ($(subdir),misc) sysdep_headers += sys/platform/x86.h + +sysdep_routines += tsc-ns + +tests += tst-tsc-ns tst-tsc-ns-static +tests-static += tst-tsc-ns-static endif ifeq ($(subdir),setjmp) diff --git a/sysdeps/x86/Versions b/sysdeps/x86/Versions index 92ab4d93a3..7481c6ec56 100644 --- a/sysdeps/x86/Versions +++ b/sysdeps/x86/Versions @@ -1,3 +1,9 @@ +libc { + GLIBC_2.29 { + x86_tsc_to_ns; x86_ns_to_tsc; + } +} + ld { GLIBC_2.29 { x86_get_cpuid_registers; x86_get_arch_feature; diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c index f812f406fd..31e3474e38 100644 --- a/sysdeps/x86/cpu-features.c +++ b/sysdeps/x86/cpu-features.c @@ -423,6 +423,47 @@ init_cpu_features (struct cpu_features *cpu_features) else cpu_features->feature[index_arch_Prefer_No_AVX512] |= bit_arch_Prefer_No_AVX512; + + if (cpu_features->max_cpuid >= 0x15 + && CPU_FEATURES_CPU_P (cpu_features, INVARIANT_TSC)) + { + unsigned int frequency = 0; + unsigned int nominator = 0; + + __cpuid (0x15, + cpu_features->tsc_nccc_data.denominator, + nominator, frequency, edx); + if (nominator != 0 && frequency == 0 && family == 6) + switch (model) + { + case 0x55: + /* Skylake server is 25 MHz. */ + frequency = 25 * 1000 * 1000; + break; + case 0x5c: + /* Goldmont is 19.2 MHz. 
*/ + frequency = 19.2 * 1000 * 1000; + break; + default: + if (CPU_FEATURES_CPU_P (cpu_features, AVX2) + && CPU_FEATURES_CPU_P (cpu_features, XSAVEC) + && CPU_FEATURES_CPU_P (cpu_features, + XGETBV_ECX_1)) + { + /* Skylake client is 24 MHz. */ + frequency = 24 * 1000 * 1000; + break; + } + } + if (frequency != 0) + { + /* Store frequency as kHz. */ + cpu_features->tsc_nccc_data.frequency = frequency / 1000; + cpu_features->tsc_nccc_data.nominator = nominator; + cpu_features->feature[index_arch_TSC_To_NS_Usable] + |= bit_arch_TSC_To_NS_Usable; + } + } } /* This spells out "AuthenticAMD". */ else if (ebx == 0x68747541 && ecx == 0x444d4163 && edx == 0x69746e65) diff --git a/sysdeps/x86/cpu-features.h b/sysdeps/x86/cpu-features.h index 9ffa77e040..2db5249018 100644 --- a/sysdeps/x86/cpu-features.h +++ b/sysdeps/x86/cpu-features.h @@ -70,6 +70,17 @@ enum cpu_features_kind arch_kind_other }; +/* Time Stamp Counter and Nominal Core Crystal Clock Information. */ +struct tsc_nccc_info +{ + /* The denominator of the TSC/core crystal clock ratio. */ + unsigned int denominator; + /* The numerator of the TSC/core crystal clock ratio. */ + unsigned int nominator; + /* The nominal frequency of the core crystal clock in kHz. */ + unsigned int frequency; +}; + struct cpu_features { enum cpu_features_kind kind; @@ -98,6 +109,8 @@ struct cpu_features unsigned long int shared_cache_size; /* Threshold to use non temporal store. */ unsigned long int non_temporal_threshold; + /* Time Stamp Counter and Nominal Core Crystal Clock Information.
*/ + struct tsc_nccc_info tsc_nccc_data; }; /* Used from outside of glibc to get access to the CPU features diff --git a/sysdeps/x86/cpu-tunables.c b/sysdeps/x86/cpu-tunables.c index 8e92358c67..89df38c870 100644 --- a/sysdeps/x86/cpu-tunables.c +++ b/sysdeps/x86/cpu-tunables.c @@ -271,11 +271,14 @@ TUNABLE_CALLBACK (set_hwcaps) (tunable_val_t *valp) disable, 15); break; case 16: + if (disable) { - CHECK_GLIBC_IFUNC_ARCH_NEED_ARCH_BOTH - (n, cpu_features, Prefer_No_AVX512, AVX512F_Usable, - disable, 16); + CHECK_GLIBC_IFUNC_ARCH_OFF (n, cpu_features, + TSC_To_NS_Usable, 16); } + CHECK_GLIBC_IFUNC_ARCH_NEED_ARCH_BOTH + (n, cpu_features, Prefer_No_AVX512, AVX512F_Usable, + disable, 16); break; case 18: { diff --git a/sysdeps/x86/sys/platform/x86.h b/sysdeps/x86/sys/platform/x86.h index a2c5c7074d..168081e08a 100644 --- a/sysdeps/x86/sys/platform/x86.h +++ b/sysdeps/x86/sys/platform/x86.h @@ -59,6 +59,9 @@ extern const struct cpuid_registers *x86_get_cpuid_registers extern unsigned int x86_get_arch_feature (unsigned int) __attribute__ ((const)); +extern unsigned long long x86_tsc_to_ns (unsigned long long); +extern unsigned long long x86_ns_to_tsc (unsigned long long); + /* HAS_CPU_FEATURE evaluates to true if CPU supports the feature. */ #define HAS_CPU_FEATURE(name) \ ((x86_get_cpuid_registers (index_cpu_##name)->reg_##name \ @@ -96,6 +99,7 @@ extern unsigned int x86_get_arch_feature (unsigned int) #define bit_arch_VPCLMULQDQ_Usable (1u << 20) #define bit_arch_XOP_Usable (1u << 21) #define bit_arch_XSAVEC_Usable (1u << 22) +#define bit_arch_TSC_To_NS_Usable (1u << 23) #define index_arch_AVX_Usable FEATURE_INDEX_1 #define index_arch_AVX2_Usable FEATURE_INDEX_1 @@ -120,6 +124,7 @@ extern unsigned int x86_get_arch_feature (unsigned int) #define index_arch_VPCLMULQDQ_Usable FEATURE_INDEX_1 #define index_arch_XOP_Usable FEATURE_INDEX_1 #define index_arch_XSAVEC_Usable FEATURE_INDEX_1 +#define index_arch_TSC_To_NS_Usable FEATURE_INDEX_1 /* Unused. 
Compiler will optimize them out. */ #define bit_arch_SSE3_Usable (1u << 0) @@ -234,6 +239,18 @@ extern unsigned int x86_get_arch_feature (unsigned int) #define index_arch_INVARIANT_TSC_Usable FEATURE_INDEX_1 #define index_arch_WBNOINVD_Usable FEATURE_INDEX_1 +/* Unused. Compiler will optimize them out. */ +#define bit_cpu_TSC_To_NS (1u << 0) + +/* Unused. Compiler will optimize them out. */ +#define index_cpu_TSC_To_NS COMMON_CPUID_INDEX_1 + +/* Unused. Compiler will optimize them out. */ +#define reg_TSC_To_NS eax + +/* There is no CPUID bit. */ +#define need_arch_feature_TSC_To_NS 1 + /* COMMON_CPUID_INDEX_1. */ /* ECX. */ diff --git a/sysdeps/x86/tsc-ns.c b/sysdeps/x86/tsc-ns.c new file mode 100644 index 0000000000..ba1e677382 --- /dev/null +++ b/sysdeps/x86/tsc-ns.c @@ -0,0 +1,64 @@ +/* Conversion functions between TSCs and nanoseconds. + Copyright (C) 2018 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . 
*/ + +#include + +static struct tsc_nccc_info tsc_nccc_data; + +static void +__attribute__((constructor)) +init_tsc_nccc_data (void) +{ + const struct cpu_features* cpu_features = __get_cpu_features (); + if (CPU_FEATURES_ARCH_P (cpu_features, TSC_To_NS_Usable)) + tsc_nccc_data = cpu_features->tsc_nccc_data; +} + +unsigned long long +x86_tsc_to_ns (unsigned long long tsc) +{ + if (__glibc_unlikely (tsc_nccc_data.frequency == 0)) + return 0; + + /* Use double to avoid integer overflow. */ + double tmp = tsc; + tmp *= tsc_nccc_data.denominator * 1000000; + tmp /= tsc_nccc_data.frequency; + unsigned long long ns = tmp; + /* Round to the closest integer. */ + ns += tsc_nccc_data.nominator / 2; + return ns / tsc_nccc_data.nominator; + +} + +unsigned long long +x86_ns_to_tsc (unsigned long long ns) +{ + if (__glibc_unlikely (tsc_nccc_data.frequency == 0)) + return 0; + + /* Use double to avoid integer overflow. */ + double tmp = ns; + tmp *= tsc_nccc_data.frequency; + tmp *= tsc_nccc_data.nominator; + tmp /= tsc_nccc_data.denominator; + unsigned long long tsc = tmp; + /* Round to the closest integer. */ + tsc += 1000000 / 2; + return tsc / 1000000; +} diff --git a/sysdeps/x86/tst-tsc-ns-static.c b/sysdeps/x86/tst-tsc-ns-static.c new file mode 100644 index 0000000000..ceb5ebc682 --- /dev/null +++ b/sysdeps/x86/tst-tsc-ns-static.c @@ -0,0 +1 @@ +#include "tst-tsc-ns.c" diff --git a/sysdeps/x86/tst-tsc-ns.c b/sysdeps/x86/tst-tsc-ns.c new file mode 100644 index 0000000000..b3a11f311c --- /dev/null +++ b/sysdeps/x86/tst-tsc-ns.c @@ -0,0 +1,51 @@ +/* Test cases for x86 conversion functions between TSCs and nanoseconds. + Copyright (C) 2015-2018 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version.
+ + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include +#include +#include +#include + +static int do_test (void); + +#include + +static int +do_test (void) +{ + if (!CPU_FEATURE_USABLE (TSC_To_NS)) + return EXIT_UNSUPPORTED; + + unsigned long long start_tscs, end_tscs, diff_tscs; + unsigned long long diff_nanoseconds; + start_tscs = _rdtsc (); + end_tscs = _rdtsc (); + diff_tscs = end_tscs - start_tscs; + diff_nanoseconds = x86_tsc_to_ns (diff_tscs); + + printf ("Diff: %llu (TSCs) -> %llu (nanoseconds)\n", + diff_tscs, diff_nanoseconds); + + diff_tscs = x86_ns_to_tsc (diff_nanoseconds); + + printf ("Diff: %llu (nanoseconds) -> %llu (TSCs)\n", + diff_nanoseconds, diff_tscs); + + return EXIT_SUCCESS; +} diff --git a/sysdeps/x86/tst-x86-platform-1.c b/sysdeps/x86/tst-x86-platform-1.c index 56c2c8c8c1..6c572f25c6 100644 --- a/sysdeps/x86/tst-x86-platform-1.c +++ b/sysdeps/x86/tst-x86-platform-1.c @@ -258,6 +258,8 @@ do_test (void) CHECK_CPU_FEATURE_USABLE (INVARIANT_TSC); CHECK_CPU_FEATURE_USABLE (WBNOINVD); + CHECK_CPU_FEATURE_USABLE (TSC_To_NS); + return 0; } -- 2.17.2