From patchwork Sat May 27 00:31:23 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Erich Elsen
X-Patchwork-Id: 20613
Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
Sender: libc-alpha-owner@sourceware.org
Delivered-To: mailing list libc-alpha@sourceware.org
References: <9c563a4b-424b-242f-b82f-4650ab2637f7@redhat.com>
 <28e34264-e8c5-5570-c48c-9125893808b2@redhat.com>
From: Erich Elsen
Date: Fri, 26 May 2017 17:31:23 -0700
Subject: Re: memcpy performance regressions 2.19 -> 2.24(5)
To: "H.J. Lu"
Cc: "Carlos O'Donell", GNU C Library

Sorry for misinterpreting.  Here is the full patch.

On Thu, May 25, 2017 at 3:03 PM, H.J. Lu wrote:
> On Thu, May 25, 2017 at 2:57 PM, Erich Elsen wrote:
>> It looks like you already added the non_temporal_threshold as part of
>> the cpu_features tunables?  Here's a small patch that allows the
>
> No, I didn't.  I only added cache info to CPU features.
>
>> cpu_features struct to be passed in.  This is useful if you need to be
>> able to call init_cacheinfo with cpu_features other than the global
>> ones.
>
> I need to see the complete working patch.
>
>>
>>
>> On Thu, May 25, 2017 at 2:23 PM, Erich Elsen wrote:
>>> Ok, will do.
>>>
>>> On Wed, May 24, 2017 at 2:36 PM, H.J. Lu wrote:
>>>> On Mon, May 22, 2017 at 8:19 PM, Erich Elsen wrote:
>>>>> Here is the patch that slightly refactors how init_cacheinfo is called.
>>>>>
>>>>
>>>> Please take a look at hjl/tunables/master branch.  You can add
>>>> non_temporal_threshold support on top of it.
>>>>
>>>>
>>>> --
>>>> H.J.
>
>
>
> --
> H.J.

From bdbc243d9da3f5d59dc495970ef9572e7c446e94 Mon Sep 17 00:00:00 2001
From: Erich Elsen
Date: Fri, 26 May 2017 17:28:06 -0700
Subject: [PATCH 1/1] add tunables for x86 cache info

---
 sysdeps/x86/cacheinfo.c      | 60 +++++++++++++++++++++++++++++++++++++++++---
 sysdeps/x86/dl-tunables.list | 15 +++++++++++
 2 files changed, 71 insertions(+), 4 deletions(-)

diff --git a/sysdeps/x86/cacheinfo.c b/sysdeps/x86/cacheinfo.c
index a46dd4dc30..ac98a951b0 100644
--- a/sysdeps/x86/cacheinfo.c
+++ b/sysdeps/x86/cacheinfo.c
@@ -25,6 +25,15 @@
 #include
 #include
 
+#if HAVE_TUNABLES
+# define TUNABLE_NAMESPACE x86
+# include
+#else
+# include
+# include
+extern char **_environ;
+#endif
+
 static const struct intel_02_cache_info
 {
   unsigned char idx;
@@ -482,9 +491,9 @@ int __x86_prefetchw attribute_hidden;
 #endif
 
 
-static void
-__attribute__((constructor))
-init_cacheinfo (void)
+void
+attribute_hidden
+init_cacheinfo_impl (const struct cpu_features* cpu_features)
 {
   /* Find out what brand of processor.  */
   unsigned int eax;
@@ -496,7 +505,6 @@ init_cacheinfo (void)
   long int shared = -1;
   unsigned int level;
   unsigned int threads = 0;
-  const struct cpu_features *cpu_features = __get_cpu_features ();
   int max_cpuid = cpu_features->max_cpuid;
 
   if (cpu_features->kind == arch_kind_intel)
@@ -787,4 +795,48 @@ intel_bug_no_cache_info:
 			      : __x86_shared_cache_size * 6);
 }
 
+static void
+update_cpufeature_cache_info(struct cpu_features* cpu_features)
+{
+#if HAVE_TUNABLES
+  TUNABLE_SET_VAL (non_temporal_threshold,
+                   &(cpu_features->cache.non_temporal_threshold));
+  TUNABLE_SET_VAL (data_size,
+                   &(cpu_features->cache.data_size));
+  TUNABLE_SET_VAL (shared_size,
+                   &(cpu_features->cache.shared_size));
+#else
+  if (__glibc_likely (_environ != NULL)
+      && !__builtin_expect (__libc_enable_secure, 0))
+    {
+      char **runp = _environ;
+      char *envline;
+
+      while (*runp != NULL)
+        {
+          envline = *runp;
+          if (!DEFAULT_MEMCMP (envline, "GLIBC_NON_TEMPORAL_THRESHOLD=", 29))
+            cpu_features->cache.non_temporal_threshold = atoi (&envline[29]);
+          else if (!DEFAULT_MEMCMP (envline, "GLIBC_DATA_SIZE=", 16))
+            cpu_features->cache.data_size = atoi (&envline[16]);
+          else if (!DEFAULT_MEMCMP (envline, "GLIBC_SHARED_SIZE=", 18))
+            cpu_features->cache.shared_size = atoi (&envline[18]);
+
+          runp++;
+        }
+    }
+#endif
+}
+
+static void
+__attribute__((constructor))
+init_cacheinfo (void)
+{
+  const struct cpu_features *cpu_features_const = __get_cpu_features ();
+  struct cpu_features cpu_features = *cpu_features_const;
+
+  update_cpufeature_cache_info (&cpu_features);
+
+  init_cacheinfo_impl (&cpu_features);
+}
 #endif
diff --git a/sysdeps/x86/dl-tunables.list b/sysdeps/x86/dl-tunables.list
index 0c9acc085c..136e455bcf 100644
--- a/sysdeps/x86/dl-tunables.list
+++ b/sysdeps/x86/dl-tunables.list
@@ -5,5 +5,20 @@ glibc {
       env_alias: GLIBC_IFUNC
       security_level: SXID_IGNORE
     }
+    non_temporal_threshold {
+      type: SIZE_T
+      env_alias: GLIBC_NON_TEMPORAL_THRESHOLD
+      security_level: SXID_IGNORE
+    }
+    data_size {
+      type: SIZE_T
+      env_alias: GLIBC_DATA_SIZE
+      security_level: SXID_IGNORE
+    }
+    shared_size {
+      type: SIZE_T
+      env_alias: GLIBC_SHARED_SIZE
+      security_level: SXID_IGNORE
+    }
   }
 }
-- 
2.13.0.219.gdb65acc882-goog
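
For readers following the non-tunables fallback in the patch, here is a
standalone sketch of the same idea: scan an environment array for the three
GLIBC_* names and overwrite a private copy of the cache settings.  It is not
glibc code.  struct cache_overrides, parse_size and apply_env_overrides are
hypothetical names made up for illustration, and the sketch swaps atoi for
strtoull on the assumption that a SIZE_T tunable should reject negative or
non-numeric input rather than silently convert it.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the cache fields of struct cpu_features.  */
struct cache_overrides
{
  size_t non_temporal_threshold;
  size_t data_size;
  size_t shared_size;
};

/* Parse a plain decimal size.  Return 0 on success, -1 if S is empty,
   signed, non-numeric, has trailing characters, or overflows.  */
static int
parse_size (const char *s, size_t *out)
{
  char *end;
  unsigned long long v;

  if (*s < '0' || *s > '9')
    return -1;
  errno = 0;
  v = strtoull (s, &end, 10);
  if (errno != 0 || *end != '\0' || v > (size_t) -1)
    return -1;
  *out = (size_t) v;
  return 0;
}

/* If LINE starts with NAME, store the parsed value in *FIELD.
   Values that fail to parse leave *FIELD untouched.  */
static void
try_override (const char *line, const char *name, size_t *field)
{
  size_t len = strlen (name);
  size_t v;

  if (strncmp (line, name, len) == 0 && parse_size (line + len, &v) == 0)
    *field = v;
}

/* Apply NAME=VALUE overrides found in ENVP to the caller's copy C.  */
static void
apply_env_overrides (struct cache_overrides *c, char *const *envp)
{
  if (envp == NULL)
    return;
  for (; *envp != NULL; envp++)
    {
      try_override (*envp, "GLIBC_NON_TEMPORAL_THRESHOLD=",
                    &c->non_temporal_threshold);
      try_override (*envp, "GLIBC_DATA_SIZE=", &c->data_size);
      try_override (*envp, "GLIBC_SHARED_SIZE=", &c->shared_size);
    }
}
```

As in the patch's init_cacheinfo, the caller would copy the read-only
defaults into a local struct, run the overrides on the copy, and hand the copy
to the code that derives the thresholds, so the global state is never
modified.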