From patchwork Thu Jun 30 16:40:32 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Lu, Hongjiu"
X-Patchwork-Id: 13526
Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
Delivered-To: mailing list libc-alpha@sourceware.org
Date: Thu, 30 Jun 2016 09:40:32 -0700
From: "H.J. Lu"
To: GNU C Library
Subject: [PATCH] Check -non_temporal_store in GLIBC_IFUNC
Message-ID: <20160630164032.GA10835@intel.com>
Reply-To: "H.J. Lu"

The x86 non-temporal threshold is an approximate value.  This patch
checks -non_temporal_store in GLIBC_IFUNC to disable non-temporal
store.

	* sysdeps/x86/cacheinfo.c (init_cacheinfo): Set
	__x86_shared_non_temporal_threshold only if it is not set.
	* sysdeps/x86/cpu-features.c (__x86_shared_non_temporal_threshold):
	New.
	(init_cpu_features): Check -non_temporal_store in GLIBC_IFUNC to
	disable non-temporal store.
---
 sysdeps/x86/cacheinfo.c    | 13 +++++++++----
 sysdeps/x86/cpu-features.c | 14 ++++++++++++++
 2 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/sysdeps/x86/cacheinfo.c b/sysdeps/x86/cacheinfo.c
index cf4f64b..2291ad4 100644
--- a/sysdeps/x86/cacheinfo.c
+++ b/sysdeps/x86/cacheinfo.c
@@ -762,8 +762,13 @@ intel_bug_no_cache_info:
       __x86_shared_cache_size = shared;
     }
 
-  /* The large memcpy micro benchmark in glibc shows that 6 times of
-     shared cache size is the approximate value above which non-temporal
-     store becomes faster.  */
-  __x86_shared_non_temporal_threshold = __x86_shared_cache_size * 6;
+  /* Set non-temporal threshold to an approximate value if it hasn't
+     been set.  */
+  if (__x86_shared_non_temporal_threshold == 0)
+    {
+      /* The large memcpy microbenchmark in glibc shows that 6 times
+         of shared cache size is the approximate value above which
+         non-temporal store becomes faster.  */
+      __x86_shared_non_temporal_threshold = __x86_shared_cache_size * 6;
+    }
 }
diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
index 525b262..e53a302 100644
--- a/sysdeps/x86/cpu-features.c
+++ b/sysdeps/x86/cpu-features.c
@@ -220,6 +220,8 @@ equal (const char *a, const char *b, size_t len)
       break; \
     }
 
+extern long int __x86_shared_non_temporal_threshold attribute_hidden;
+
 static inline void
 init_cpu_features (struct cpu_features *cpu_features, char **env)
 {
@@ -540,6 +542,18 @@ no_cpuid:
                   CHECK_GLIBC_IFUNC_ARCH_BOTH (Fast_Rep_String, disable);
                   break;
                 case 18:
+                  if (disable)
+                    {
+                      if (equal (n, "non_temporal_store",
+                                 sizeof ("non_temporal_store") - 1))
+                        {
+                          /* Disable non-temporal store with
+                             "-non_temporal_store".  */
+                          __x86_shared_non_temporal_threshold
+                            = (long int) -1;
+                          break;
+                        }
+                    }
                   CHECK_GLIBC_IFUNC_ARCH_BOTH (Fast_Copy_Backward, disable);
                   break;
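
For reviewers, the interaction between the two hunks above can be modelled in isolation.  The following is only a sketch, not glibc code: non_temporal_threshold, parse_option, init_threshold and uses_non_temporal_store are hypothetical names, the GLIBC_IFUNC token is matched with strcmp rather than the equal helper, and it assumes that option parsing runs before the cache-info initialization and that the copy routines compare the size against the threshold as an unsigned value.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for __x86_shared_non_temporal_threshold.
   0 means "not set yet"; -1 means "disabled by the user".  */
static long int non_temporal_threshold;

/* Hypothetical model of the option check.  TOKEN is one entry from
   the GLIBC_IFUNC value, e.g. "-non_temporal_store".  */
static void
parse_option (const char *token)
{
  if (strcmp (token, "-non_temporal_store") == 0)
    non_temporal_threshold = (long int) -1;
}

/* Model of the cacheinfo step: apply the 6x default only when no
   earlier step has set a value.  */
static void
init_threshold (long int shared_cache_size)
{
  if (non_temporal_threshold == 0)
    non_temporal_threshold = shared_cache_size * 6;
}

/* Assumption: the size is compared as an unsigned value, so a
   threshold of -1 becomes the largest size_t and is not reached by
   any realistic copy size.  */
static int
uses_non_temporal_store (size_t n)
{
  return n >= (size_t) non_temporal_threshold;
}
```

Under these assumptions, calling parse_option ("-non_temporal_store") before init_threshold leaves the threshold at -1, so the 6x default is skipped; without the option, init_threshold (shared) yields shared * 6 as before the patch.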