From patchwork Fri Mar 5 16:53:09 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "H.J. Lu"
X-Patchwork-Id: 42271
To: libc-alpha@sourceware.org
Subject: [PATCH 1/8] x86: Set Prefer_No_VZEROUPPER and add Prefer_AVX2_STRCMP
Date: Fri, 5 Mar 2021 08:53:09 -0800
Message-Id: <20210305165316.323467-2-hjl.tools@gmail.com>
In-Reply-To: <20210305165316.323467-1-hjl.tools@gmail.com>
References: <20210305165316.323467-1-hjl.tools@gmail.com>
List-Id: Libc-alpha mailing list
X-Patchwork-Original-From: "H.J. Lu via Libc-alpha"
From: "H.J. Lu"
Reply-To: "H.J. Lu"

1. Set Prefer_No_VZEROUPPER if RTM is usable to avoid RTM abort triggered
by VZEROUPPER inside a transactionally executing RTM region.
2. To compare two 32-byte strings, 256-bit EVEX strcmp requires 2 loads,
3 VPCMPs and 2 KORDs, while AVX2 strcmp requires 1 load, 2 VPCMPEQs,
1 VPMINU and 1 VPMOVMSKB, so AVX2 strcmp is faster than EVEX strcmp.  Add
Prefer_AVX2_STRCMP to prefer AVX2 strcmp family functions.
---
 sysdeps/x86/cpu-features.c                    | 20 +++++++++++++++++--
 sysdeps/x86/cpu-tunables.c                    |  2 ++
 ...cpu-features-preferred_feature_index_1.def |  1 +
 3 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
index d7248cbb45..d7808acb33 100644
--- a/sysdeps/x86/cpu-features.c
+++ b/sysdeps/x86/cpu-features.c
@@ -531,8 +531,24 @@ init_cpu_features (struct cpu_features *cpu_features)
             cpu_features->preferred[index_arch_Prefer_No_VZEROUPPER]
               |= bit_arch_Prefer_No_VZEROUPPER;
           else
-            cpu_features->preferred[index_arch_Prefer_No_AVX512]
-              |= bit_arch_Prefer_No_AVX512;
+            {
+              cpu_features->preferred[index_arch_Prefer_No_AVX512]
+                |= bit_arch_Prefer_No_AVX512;
+
+              /* Avoid RTM abort triggered by VZEROUPPER inside a
+                 transactionally executing RTM region.  */
+              if (CPU_FEATURE_USABLE_P (cpu_features, RTM))
+                cpu_features->preferred[index_arch_Prefer_No_VZEROUPPER]
+                  |= bit_arch_Prefer_No_VZEROUPPER;
+
+              /* Since to compare 2 32-byte strings, 256-bit EVEX strcmp
+                 requires 2 loads, 3 VPCMPs and 2 KORDs while AVX2 strcmp
+                 requires 1 load, 2 VPCMPEQs, 1 VPMINU and 1 VPMOVMSKB,
+                 AVX2 strcmp is faster than EVEX strcmp.  */
+              if (CPU_FEATURE_USABLE_P (cpu_features, AVX2))
+                cpu_features->preferred[index_arch_Prefer_AVX2_STRCMP]
+                  |= bit_arch_Prefer_AVX2_STRCMP;
+            }
         }
       /* This spells out "AuthenticAMD" or "HygonGenuine".  */
       else if ((ebx == 0x68747541 && ecx == 0x444d4163 && edx == 0x69746e65)
diff --git a/sysdeps/x86/cpu-tunables.c b/sysdeps/x86/cpu-tunables.c
index 126896f41b..a90df39b78 100644
--- a/sysdeps/x86/cpu-tunables.c
+++ b/sysdeps/x86/cpu-tunables.c
@@ -238,6 +238,8 @@ TUNABLE_CALLBACK (set_hwcaps) (tunable_val_t *valp)
               CHECK_GLIBC_IFUNC_PREFERRED_BOTH (n, cpu_features,
                                                 Fast_Copy_Backward,
                                                 disable, 18);
+              CHECK_GLIBC_IFUNC_PREFERRED_NEED_BOTH
+                (n, cpu_features, Prefer_AVX2_STRCMP, AVX2, disable, 18);
             }
           break;
         case 19:
diff --git a/sysdeps/x86/include/cpu-features-preferred_feature_index_1.def b/sysdeps/x86/include/cpu-features-preferred_feature_index_1.def
index 06af1a8dd5..133aab19f1 100644
--- a/sysdeps/x86/include/cpu-features-preferred_feature_index_1.def
+++ b/sysdeps/x86/include/cpu-features-preferred_feature_index_1.def
@@ -32,3 +32,4 @@ BIT (Prefer_ERMS)
 BIT (Prefer_No_AVX512)
 BIT (MathVec_Prefer_No_AVX512)
 BIT (Prefer_FSRM)
+BIT (Prefer_AVX2_STRCMP)
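
For review purposes, the branch logic in the cpu-features.c hunk can be modelled as a small standalone C function. This is only a sketch under stated assumptions: `struct model_features`, the `PREFER_*` constants and `select_preferred` are hypothetical names invented here, not glibc's `cpu_features` layout or its generated `bit_arch_*` macros. The condition guarding the first branch is not visible in the hunk, so it is represented by an opaque `first_branch` flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values for illustration only; glibc derives the real
   ones from cpu-features-preferred_feature_index_1.def.  */
#define PREFER_NO_VZEROUPPER (1 << 0)
#define PREFER_NO_AVX512     (1 << 1)
#define PREFER_AVX2_STRCMP   (1 << 2)

/* Hypothetical stand-in for the feature queries used in the patch.  */
struct model_features
{
  bool first_branch; /* Condition of the enclosing `if', not shown in the hunk.  */
  bool rtm_usable;   /* Models CPU_FEATURE_USABLE_P (cpu_features, RTM).  */
  bool avx2_usable;  /* Models CPU_FEATURE_USABLE_P (cpu_features, AVX2).  */
};

/* Return the preferred bits the patched branch would set.  */
static int
select_preferred (const struct model_features *f)
{
  int preferred = 0;

  if (f->first_branch)
    preferred |= PREFER_NO_VZEROUPPER;
  else
    {
      preferred |= PREFER_NO_AVX512;

      /* New in this patch: avoid VZEROUPPER when RTM is usable.  */
      if (f->rtm_usable)
        preferred |= PREFER_NO_VZEROUPPER;

      /* New in this patch: prefer AVX2 strcmp when AVX2 is usable.  */
      if (f->avx2_usable)
        preferred |= PREFER_AVX2_STRCMP;
    }

  return preferred;
}
```

In this model, both new bits are set only on the `else` path, so a CPU taking the first branch gets neither Prefer_No_AVX512 nor Prefer_AVX2_STRCMP, which matches the placement of the added block in the hunk.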