From patchwork Fri Feb 24 17:53:17 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Lu, Hongjiu"
X-Patchwork-Id: 19376
Received: (qmail 5578 invoked by alias); 24 Feb 2017 17:53:21 -0000
Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
Precedence: bulk
Sender: libc-alpha-owner@sourceware.org
Delivered-To: mailing list libc-alpha@sourceware.org
Received: (qmail 5569 invoked by uid 89); 24 Feb 2017 17:53:21 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-23.4 required=5.0 tests=AWL, BAYES_00,
 GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3,
 KAM_LAZY_DOMAIN_SECURITY, NO_DNS_FOR_FROM, RP_MATCHES_RCVD
 autolearn=ham version=3.3.2 spammy=2016-11-28, 20161128
X-HELO: mga01.intel.com
X-ExtLoop1: 1
Date: Fri, 24 Feb 2017 09:53:17 -0800
From: "H.J. Lu"
To: GNU C Library
Subject: [PATCH] [2.24] Add VZEROUPPER to memset-vec-unaligned-erms.S [BZ
 #21081]
Message-ID: <20170224175317.GA15039@intel.com>
Reply-To: "H.J. Lu"
Content-Disposition: inline
User-Agent: Mutt/1.7.1 (2016-10-04)

I am checking this into 2.24 branch.

H.J.
---
Since memset-vec-unaligned-erms.S has VDUP_TO_VEC0_AND_SET_RETURN at
function entry, memset optimized for AVX2 and AVX512 will always use
ymm/zmm register.  VZEROUPPER should be placed before ret in

L(stosb):
	movq	%rdx, %rcx
	movzbl	%sil, %eax
	movq	%rdi, %rdx
	rep stosb
	movq	%rdx, %rax
	ret

since it can be reached from

L(stosb_more_2x_vec):
	cmpq	$REP_STOSB_THRESHOLD, %rdx
	ja	L(stosb)

	[BZ #21081]
	* sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
	(L(stosb)): Add VZEROUPPER before ret.

(cherry picked from commit 02b78ff749f0c88771713368dbb2a09b1979814f)
---
 ChangeLog                                            | 6 ++++++
 sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S | 2 ++
 2 files changed, 8 insertions(+)

diff --git a/ChangeLog b/ChangeLog
index a9b7540..1b7d40a 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2017-01-30  H.J. Lu
+
+	[BZ #21081]
+	* sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
+	(L(stosb)): Add VZEROUPPER before ret.
+
 2016-11-28  H.J. Lu
 
 	[BZ #20750]
diff --git a/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S b/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
index 28e71fd..acf448c 100644
--- a/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
+++ b/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
@@ -110,6 +110,8 @@ ENTRY (__memset_erms)
 ENTRY (MEMSET_SYMBOL (__memset, erms))
 # endif
 L(stosb):
+	/* Issue vzeroupper before rep stosb.  */
+	VZEROUPPER
 	movq	%rdx, %rcx
 	movzbl	%sil, %eax
 	movq	%rdi, %rdx