From patchwork Thu Jun 30 16:21:35 2016
X-Patchwork-Submitter: "H.J. Lu"
X-Patchwork-Id: 13524
Delivered-To: mailing list libc-alpha@sourceware.org
References: <20160630013716.GY4685@vapier.lan> <57752F1B.7010809@arm.com>
From: "H.J. Lu"
Date: Thu, 30 Jun 2016 09:21:35 -0700
Subject: Re: PATCH: Check GLIBC_IFUNC to enable/disable ifunc features
To: Szabolcs Nagy
Cc: "Carlos O'Donell", GNU C Library, Siddhesh Poyarekar, nd

On Thu, Jun 30, 2016 at 9:06 AM, H.J. Lu wrote:
> On Thu, Jun 30, 2016 at 7:39 AM, Szabolcs Nagy wrote:
>> On 30/06/16 03:56, H.J. Lu wrote:
>>> The environment
>>> variable, GLIBC_IFUNC=xxx=0:yyy=1:zzz=0...., can be used to enable
>>> CPU/ARCH feature yyy, disable CPU/ARCH feature xxx and zzz, where the
>>> feature name is case-sensitive and has to match the ones in
>>> cpu-features.h.  It can be used by glibc developers to override the
>>> IFUNC selection to improve performance for a particular workload or
>>> tune for a new processor.
>>>
>>
>> since it is for glibc devs only is it expected to be
>> stable across glibc versions?
>
> Since the IFUNC implementation changes over time, it won't be
> stable.
>
>> maybe the env var name could reflect that it is not
>> a public api.
>
> GLIBC_IFUNC is private.
>
>>> Since all CPU/ARCH features are hardware optimizations without security
>>> implication, except for Prefer_MAP_32BIT_EXEC, which can only be disabled,
>>> we check GLIBC_IFUNC for programs, including set*id ones.
>>>
>>
>> i think it can have security implications, but probably not significant.
>
> If there are security implications in the IFUNC implementation,
> it is a bug in the IFUNC implementation.
>
>>> NOTE: the IFUNC selection may change over time.  Please check all
>>> multiarch implementations when experimenting.
>>>
>>
>>> +init_cpu_features (struct cpu_features *cpu_features, char **env)
>> ...
>>> +  while (*env != NULL)
>>> +    {
>>> +      const char *p, *end;
>>> +      size_t len = sizeof ("GLIBC_IFUNC=");
>>> +
>>> +      end = *env;
>>> +      for (p = end; *p != '\0'; p++)
>>> +        if (--len == 0 && equal (end, "GLIBC_IFUNC=", 12))
>>> +          {
>>> +            len = strlen (p);
>>
>> is this x86_64 only?
>
> No.  It is in sysdeps/x86/cpu-features.c, which is shared by
> i386 and x86_64.
>
>> how can strlen work before ifunc is done?
>> (i think strlen is ifunc resolved on i386)
>>
>> i know ld.so is careful about this, but i think
>> with static linking ifunc resolved functions
>> should not be called before apply_irel is done
>>
>
> I will remove strlen to be on the safe side.
>

Here is the updated patch with strlen removed.  We don't need to check
if env != NULL since it will never be NULL.

From 9161d1b5f4b184e44fa1066f9c11d679ec8197be Mon Sep 17 00:00:00 2001
From: "H.J. Lu"
Date: Mon, 27 Jun 2016 15:13:50 -0700
Subject: [PATCH] Add GLIBC_IFUNC to control IFUNC selection

The current IFUNC selection is based on microbenchmarks in glibc.  It
should give the best performance for most workloads.  But other choices
may have better performance for a particular workload or on the hardware
which wasn't available when the selection was made.  The environment
variable, GLIBC_IFUNC=-xxx,yyy,-zzz...., can be used to enable CPU/ARCH
feature yyy, disable CPU/ARCH feature xxx and zzz, where the feature name
is case-sensitive and has to match the ones in cpu-features.h.  It can be
used by glibc developers to override the IFUNC selection to tune for a new
processor or improve performance for a particular workload.  It isn't
intended for normal end users.

Since all CPU/ARCH features are hardware optimizations without security
implication, except for Prefer_MAP_32BIT_EXEC, which can only be disabled,
we check GLIBC_IFUNC for programs, including set*id ones.

NOTE: the IFUNC selection may change over time.  Please check all
multiarch implementations when experimenting.

        * sysdeps/i386/dl-machine.h (dl_platform_init): Pass the array
        of environment strings to init_cpu_features.
        * sysdeps/x86_64/dl-machine.h (dl_platform_init): Likewise.
        * sysdeps/x86/libc-start.c (__libc_start_main): Likewise.
        * sysdeps/x86/cpu-features.c (equal): New function.
        (CHECK_GLIBC_IFUNC_CPU_OFF): New macro.
        (CHECK_GLIBC_IFUNC_ARCH_OFF): Likewise.
        (CHECK_GLIBC_IFUNC_ARCH_BOTH): Likewise.
        (CHECK_GLIBC_IFUNC_ARCH_NEED_ARCH_BOTH): Likewise.
        (CHECK_GLIBC_IFUNC_ARCH_NEED_CPU_BOTH): Likewise.
        (init_cpu_features): Updated to take the array of environment
        strings.  Process GLIBC_IFUNC environment variable.
---
 sysdeps/i386/dl-machine.h   |   3 +-
 sysdeps/x86/cpu-features.c  | 305 +++++++++++++++++++++++++++++++++++++++++++-
 sysdeps/x86/libc-start.c    |   2 +-
 sysdeps/x86_64/dl-machine.h |   3 +-
 4 files changed, 309 insertions(+), 4 deletions(-)

diff --git a/sysdeps/i386/dl-machine.h b/sysdeps/i386/dl-machine.h
index 4e3968a..7584931 100644
--- a/sysdeps/i386/dl-machine.h
+++ b/sysdeps/i386/dl-machine.h
@@ -240,7 +240,8 @@ dl_platform_init (void)
 #ifdef SHARED
   /* init_cpu_features has been called early from __libc_start_main in
      static executable.  */
-  init_cpu_features (&GLRO(dl_x86_cpu_features));
+  init_cpu_features (&GLRO(dl_x86_cpu_features),
+                     &_dl_argv[_dl_argc + 1]);
 #endif
 }
 
diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
index 9ce4b49..525b262 100644
--- a/sysdeps/x86/cpu-features.c
+++ b/sysdeps/x86/cpu-features.c
@@ -91,8 +91,137 @@ get_common_indeces (struct cpu_features *cpu_features,
     }
 }
 
+#ifdef __x86_64__
+typedef long long op_t;
+#else
+typedef int op_t;
+#endif
+
+/* Return true if the first LEN bytes of strings A and B are the same
+   where LEN != 0.  We can't use string/memory functions because they
+   trigger an ifunc resolve loop.  */
+
+static bool
+equal (const char *a, const char *b, size_t len)
+{
+  size_t op_len = len % sizeof (op_t);
+  if (op_len)
+    {
+      switch (op_len)
+        {
+        case 1:
+          if (*(char *) a != *(char *) b)
+            return false;
+          break;
+        case 2:
+          if (*(short *) a != *(short *) b)
+            return false;
+          break;
+        case 3:
+          if (*(short *) a != *(short *) b
+              || *(char *) (a + 2) != *(char *) (b + 2))
+            return false;
+          break;
+#ifdef __x86_64__
+        case 4:
+          if (*(int *) a != *(int *) b)
+            return false;
+          break;
+        default:
+          if (*(int *) a != *(int *) b
+              || *(int *) (a + op_len - 4) != *(int *) (b + op_len - 4))
+            return false;
+          break;
+#else
+        default:
+          break;
+#endif
+        }
+      /* Align length to size of op_t.  */
+      len -= op_len;
+      if (len == 0)
+        return true;
+      a += op_len;
+      b += op_len;
+    }
+
+  /* Compare one op_t at a time.  */
+  do
+    {
+      if (*(op_t *) a != *(op_t *) b)
+        return false;
+      len -= sizeof (op_t);
+      if (len == 0)
+        return true;
+      a += sizeof (op_t);
+      b += sizeof (op_t);
+    }
+  while (1);
+}
+
+/* Disable a CPU feature NAME.  We don't enable a CPU feature which isn't
+   available.  */
+#define CHECK_GLIBC_IFUNC_CPU_OFF(name) \
+  if (equal (n, #name, sizeof (#name) - 1)) \
+    { \
+      cpu_features->cpuid[index_cpu_##name].reg_##name \
+        &= ~bit_cpu_##name; \
+      break; \
+    }
+
+/* Disable an ARCH feature NAME.  We don't enable an ARCH feature which
+   isn't available or has security implication.  */
+#define CHECK_GLIBC_IFUNC_ARCH_OFF(name) \
+  if (equal (n, #name, sizeof (#name) - 1)) \
+    { \
+      cpu_features->feature[index_arch_##name] \
+        &= ~bit_arch_##name; \
+      break; \
+    }
+
+/* Enable/disable an ARCH feature NAME.  */
+#define CHECK_GLIBC_IFUNC_ARCH_BOTH(name, disable) \
+  if (equal (n, #name, sizeof (#name) - 1)) \
+    { \
+      if (disable) \
+        cpu_features->feature[index_arch_##name] \
+          &= ~bit_arch_##name; \
+      else \
+        cpu_features->feature[index_arch_##name] \
+          |= bit_arch_##name; \
+      break; \
+    }
+
+/* Enable/disable an ARCH feature NAME.  Enable an ARCH feature only
+   if the ARCH feature NEED is also enabled.  */
+#define CHECK_GLIBC_IFUNC_ARCH_NEED_ARCH_BOTH(name, need, disable) \
+  if (equal (n, #name, sizeof (#name) - 1)) \
+    { \
+      if (disable) \
+        cpu_features->feature[index_arch_##name] \
+          &= ~bit_arch_##name; \
+      else if (CPU_FEATURES_ARCH_P (cpu_features, need)) \
+        cpu_features->feature[index_arch_##name] \
+          |= bit_arch_##name; \
+      break; \
+    }
+
+/* Enable/disable an ARCH feature NAME.  Enable an ARCH feature only
+   if the CPU feature NEED is also enabled.  */
+#define CHECK_GLIBC_IFUNC_ARCH_NEED_CPU_BOTH(name, need, disable) \
+  if (equal (n, #name, sizeof (#name) - 1)) \
+    { \
+      if (disable) \
+        cpu_features->feature[index_arch_##name] \
+          &= ~bit_arch_##name; \
+      else if (CPU_FEATURES_CPU_P (cpu_features, need)) \
+        cpu_features->feature[index_arch_##name] \
+          |= bit_arch_##name; \
+      break; \
+    }
+
 static inline void
-init_cpu_features (struct cpu_features *cpu_features)
+init_cpu_features (struct cpu_features *cpu_features, char **env)
 {
   unsigned int ebx, ecx, edx;
   unsigned int family = 0;
@@ -268,4 +397,178 @@ no_cpuid:
   cpu_features->family = family;
   cpu_features->model = model;
   cpu_features->kind = kind;
+
+  /* The current IFUNC selection is based on microbenchmarks in glibc.
+     It should give the best performance for most workloads.  But other
+     choices may have better performance for a particular workload or on
+     the hardware which wasn't available when the selection was made.
+     The environment variable, GLIBC_IFUNC=-xxx,yyy,-zzz...., can be
+     used to enable CPU/ARCH feature yyy, disable CPU/ARCH feature xxx
+     and zzz, where the feature name is case-sensitive and has to match
+     the ones in cpu-features.h.  It can be used by glibc developers to
+     tune for a new processor or override the IFUNC selection to improve
+     performance for a particular workload.
+
+     Since all CPU/ARCH features are hardware optimizations without
+     security implication, except for Prefer_MAP_32BIT_EXEC, which can
+     only be disabled, we check GLIBC_IFUNC for programs, including
+     set*id ones.
+
+     NOTE: the IFUNC selection may change over time.  Please check all
+     multiarch implementations when experimenting.  */
+
+  while (*env != NULL)
+    {
+      const char *p, *end;
+      size_t len = sizeof ("GLIBC_IFUNC=");
+
+      end = *env;
+      for (p = end; *p != '\0'; p++)
+        if (--len == 0 && equal (end, "GLIBC_IFUNC=",
+                                 sizeof ("GLIBC_IFUNC=") - 1))
+          {
+            /* Can't use strlen because it may trigger an ifunc resolve
+               loop.  */
+            for (; *end != '\0'; end++);
+            do
+              {
+                const char *c, *n;
+                bool disable;
+                size_t nl;
+
+                for (c = p; *c != ','; c++)
+                  if (c >= end)
+                    break;
+
+                len = c - p;
+                disable = *p == '-';
+                if (disable)
+                  {
+                    n = p + 1;
+                    nl = len - 1;
+                  }
+                else
+                  {
+                    n = p;
+                    nl = len;
+                  }
+                switch (nl)
+                  {
+                  default:
+                    break;
+                  case 3:
+                    if (disable)
+                      {
+                        CHECK_GLIBC_IFUNC_CPU_OFF (AVX);
+                        CHECK_GLIBC_IFUNC_CPU_OFF (CX8);
+                        CHECK_GLIBC_IFUNC_CPU_OFF (FMA);
+                        CHECK_GLIBC_IFUNC_CPU_OFF (HTT);
+                        CHECK_GLIBC_IFUNC_CPU_OFF (RTM);
+                      }
+                    break;
+                  case 4:
+                    if (disable)
+                      {
+                        CHECK_GLIBC_IFUNC_CPU_OFF (AVX2);
+                        CHECK_GLIBC_IFUNC_CPU_OFF (CMOV);
+                        CHECK_GLIBC_IFUNC_CPU_OFF (ERMS);
+                        CHECK_GLIBC_IFUNC_CPU_OFF (FMA4);
+                        CHECK_GLIBC_IFUNC_CPU_OFF (SSE2);
+                        CHECK_GLIBC_IFUNC_ARCH_OFF (I586);
+                        CHECK_GLIBC_IFUNC_ARCH_OFF (I686);
+                      }
+                    break;
+                  case 5:
+                    if (disable)
+                      {
+                        CHECK_GLIBC_IFUNC_CPU_OFF (SSSE3);
+                      }
+                    break;
+                  case 6:
+                    if (disable)
+                      {
+                        CHECK_GLIBC_IFUNC_CPU_OFF (SSE4_1);
+                        CHECK_GLIBC_IFUNC_CPU_OFF (SSE4_2);
+                      }
+                    break;
+                  case 7:
+                    if (disable)
+                      {
+                        CHECK_GLIBC_IFUNC_CPU_OFF (AVX512F);
+                        CHECK_GLIBC_IFUNC_CPU_OFF (OSXSAVE);
+                      }
+                    break;
+                  case 8:
+                    if (disable)
+                      {
+                        CHECK_GLIBC_IFUNC_CPU_OFF (AVX512DQ);
+                        CHECK_GLIBC_IFUNC_CPU_OFF (POPCOUNT);
+                      }
+                    CHECK_GLIBC_IFUNC_ARCH_BOTH (Slow_BSF, disable);
+                    break;
+                  case 10:
+                    if (disable)
+                      {
+                        CHECK_GLIBC_IFUNC_ARCH_OFF (AVX_Usable);
+                        CHECK_GLIBC_IFUNC_ARCH_OFF (FMA_Usable);
+                      }
+                    break;
+                  case 11:
+                    if (disable)
+                      {
+                        CHECK_GLIBC_IFUNC_ARCH_OFF (AVX2_Usable);
+                        CHECK_GLIBC_IFUNC_ARCH_OFF (FMA4_Usable);
+                      }
+                    CHECK_GLIBC_IFUNC_ARCH_BOTH (Prefer_ERMS, disable);
+                    CHECK_GLIBC_IFUNC_ARCH_NEED_CPU_BOTH (Slow_SSE4_2,
+                                                          SSE4_2,
+                                                          disable);
+                    break;
+                  case 13:
+                    if (disable)
+                      {
+                        CHECK_GLIBC_IFUNC_ARCH_OFF (AVX512F_Usable);
+                      }
+                    CHECK_GLIBC_IFUNC_ARCH_NEED_ARCH_BOTH
+                      (AVX_Fast_Unaligned_Load, AVX_Usable, disable);
+                    break;
+                  case 15:
+                    if (disable)
+                      {
+                        CHECK_GLIBC_IFUNC_ARCH_OFF (AVX512DQ_Usable);
+                      }
+                    CHECK_GLIBC_IFUNC_ARCH_BOTH (Fast_Rep_String, disable);
+                    break;
+                  case 18:
+                    CHECK_GLIBC_IFUNC_ARCH_BOTH (Fast_Copy_Backward,
+                                                 disable);
+                    break;
+                  case 19:
+                    CHECK_GLIBC_IFUNC_ARCH_BOTH (Fast_Unaligned_Load,
+                                                 disable);
+                    CHECK_GLIBC_IFUNC_ARCH_BOTH (Fast_Unaligned_Copy,
+                                                 disable);
+                    break;
+                  case 20:
+                    CHECK_GLIBC_IFUNC_ARCH_NEED_ARCH_BOTH
+                      (Prefer_No_VZEROUPPER, AVX_Usable, disable);
+                    break;
+                  case 21:
+                    if (disable)
+                      {
+                        CHECK_GLIBC_IFUNC_ARCH_OFF (Prefer_MAP_32BIT_EXEC);
+                      }
+                    break;
+                  case 26:
+                    CHECK_GLIBC_IFUNC_ARCH_NEED_CPU_BOTH
+                      (Prefer_PMINUB_for_stringop, SSE2, disable);
+                    break;
+                  }
+                p += len + 1;
+              }
+            while (p < end);
+            return;
+          }
+      env++;
+    }
 }
diff --git a/sysdeps/x86/libc-start.c b/sysdeps/x86/libc-start.c
index 3b5ea6e..7dec1ca 100644
--- a/sysdeps/x86/libc-start.c
+++ b/sysdeps/x86/libc-start.c
@@ -34,7 +34,7 @@ __libc_start_main (int (*main) (int, char **, char ** MAIN_AUXVEC_DECL),
                    void (*fini) (void),
                    void (*rtld_fini) (void), void *stack_end)
 {
-  init_cpu_features (&_dl_x86_cpu_features);
+  init_cpu_features (&_dl_x86_cpu_features, &argv[argc + 1]);
   return generic_start_main (main, argc, argv, init, fini, rtld_fini,
                              stack_end);
 }
diff --git a/sysdeps/x86_64/dl-machine.h b/sysdeps/x86_64/dl-machine.h
index ed0c1a8..071a2e1 100644
--- a/sysdeps/x86_64/dl-machine.h
+++ b/sysdeps/x86_64/dl-machine.h
@@ -227,7 +227,8 @@ dl_platform_init (void)
 #ifdef SHARED
   /* init_cpu_features has been called early from __libc_start_main in
      static executable.  */
-  init_cpu_features (&GLRO(dl_x86_cpu_features));
+  init_cpu_features (&GLRO(dl_x86_cpu_features),
+                     &_dl_argv[_dl_argc + 1]);
 #endif
 }
 
-- 
2.7.4
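
For anyone who wants to experiment with the GLIBC_IFUNC=-xxx,yyy,-zzz
syntax outside of glibc, below is a minimal standalone sketch of the
same parsing idea.  It is not part of the patch.  The names bytes_equal,
struct toy_features and apply_spec are hypothetical, and the two flags
are only stand-ins for the real cpu_features bits.  It assumes the same
rules as the patch: a leading '-' disables a feature, the AVX2 CPU
feature can only be disabled, Prefer_ERMS can be switched both ways,
and no libc string function is called while scanning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not from the patch: compare the first LEN bytes
   of A and B one byte at a time, without calling any libc string
   function.  */
static bool
bytes_equal (const char *a, const char *b, size_t len)
{
  for (size_t i = 0; i < len; i++)
    if (a[i] != b[i])
      return false;
  return true;
}

/* Hypothetical stand-in for the cpu_features bits.  */
struct toy_features
{
  bool avx2;         /* CPU feature: may only be turned off.  */
  bool prefer_erms;  /* ARCH feature: may be turned on or off.  */
};

/* Apply a "-xxx,yyy,-zzz" style SPEC to F.  A leading '-' means
   disable.  Unknown names and empty entries are ignored.  */
static void
apply_spec (struct toy_features *f, const char *spec)
{
  assert (f != NULL && spec != NULL);

  const char *p = spec;
  while (*p != '\0')
    {
      /* Find the end of the current entry by hand.  */
      const char *c = p;
      while (*c != '\0' && *c != ',')
        c++;

      bool disable = *p == '-';
      const char *n = disable ? p + 1 : p;
      size_t nl = (size_t) (c - n);

      /* Dispatch on the name length first, then compare the bytes.  */
      if (nl == 4 && disable && bytes_equal (n, "AVX2", 4))
        f->avx2 = false;
      else if (nl == 11 && bytes_equal (n, "Prefer_ERMS", 11))
        f->prefer_erms = !disable;

      /* Skip the separator, if there is one.  */
      p = *c == ',' ? c + 1 : c;
    }
}
```

Calling apply_spec (&f, "-AVX2,Prefer_ERMS") on a struct with avx2 set
and prefer_erms clear should leave avx2 clear and prefer_erms set.  A
later "AVX2" entry without the '-' is ignored, which mirrors the rule
that CPU features can only be disabled.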