From patchwork Sat Oct 23 05:26:47 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Noah Goldstein
X-Patchwork-Id: 46563
To: libc-alpha@sourceware.org
Subject: [PATCH v1] x86: Replace sse2 instructions with avx in memcmp-evex-movbe.S
Date: Sat, 23 Oct 2021 01:26:47 -0400
Message-Id: <20211023052647.535991-1-goldstein.w.n@gmail.com>
X-Mailer: git-send-email 2.29.2
List-Id: Libc-alpha mailing list
X-Patchwork-Original-From: Noah Goldstein via Libc-alpha
From: Noah Goldstein
Reply-To: Noah Goldstein

This commit replaces two usages of SSE2 'movups' with AVX 'vmovdqu'.

It could potentially be dangerous to use SSE2 if this function is ever
called without using 'vzeroupper' beforehand. While compilers appear
to use 'vzeroupper' before function calls if AVX2 has been used, using
SSE2 here is more brittle. Since it is not absolutely necessary, it
should be avoided.

It costs 2 extra bytes, but the extra bytes should only eat into
alignment padding.
Reviewed-by: H.J. Lu
---
 sysdeps/x86_64/multiarch/memcmp-evex-movbe.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/sysdeps/x86_64/multiarch/memcmp-evex-movbe.S b/sysdeps/x86_64/multiarch/memcmp-evex-movbe.S
index 2761b54f2e..640f6757fa 100644
--- a/sysdeps/x86_64/multiarch/memcmp-evex-movbe.S
+++ b/sysdeps/x86_64/multiarch/memcmp-evex-movbe.S
@@ -561,13 +561,13 @@ L(between_16_31):
 	/* From 16 to 31 bytes. No branch when size == 16. */
 
 	/* Use movups to save code size. */
-	movups	(%rsi), %xmm2
+	vmovdqu	(%rsi), %xmm2
 	VPCMP	$4, (%rdi), %xmm2, %k1
 	kmovd	%k1, %eax
 	testl	%eax, %eax
 	jnz	L(return_vec_0_lv)
 	/* Use overlapping loads to avoid branches. */
-	movups	-16(%rsi, %rdx, CHAR_SIZE), %xmm2
+	vmovdqu	-16(%rsi, %rdx, CHAR_SIZE), %xmm2
 	VPCMP	$4, -16(%rdi, %rdx, CHAR_SIZE), %xmm2, %k1
 	addl	$(CHAR_PER_VEC - (16 / CHAR_SIZE)), %edx
 	kmovd	%k1, %eax
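
Note on the hunk above: the "overlapping loads" idea used by
L(between_16_31) can be sketched in scalar C. This is only an
illustration under stated assumptions, not the glibc implementation.
The names first_diff16 and memcmp_16_31 are hypothetical, a byte loop
stands in for the 16-byte VPCMP/kmovd compare, and only byte-sized
characters are modelled (CHAR_SIZE == 1). Any size from 16 to 31 is
covered by two 16-byte windows, [0, 16) and [n - 16, n), so no branch
on the exact size is needed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one 16-byte vector compare: returns the
   index of the first differing byte, or 16 if all 16 bytes match.  */
static size_t
first_diff16 (const unsigned char *a, const unsigned char *b)
{
  size_t i;
  for (i = 0; i < 16; i++)
    if (a[i] != b[i])
      break;
  return i;
}

/* Hypothetical helper, not part of glibc.  Requires 16 <= n <= 31.
   Returns <0, 0 or >0 with the same meaning as memcmp.  */
static int
memcmp_16_31 (const void *s1, const void *s2, size_t n)
{
  const unsigned char *a = s1;
  const unsigned char *b = s2;
  size_t i;

  assert (n >= 16 && n <= 31);

  /* First window: bytes [0, 16).  */
  i = first_diff16 (a, b);
  if (i < 16)
    return (int) a[i] - (int) b[i];

  /* Second window: bytes [n - 16, n).  It overlaps the first window
     whenever n < 32; the overlapped bytes are already known to be
     equal, so the first mismatch found here is the first overall.  */
  i = first_diff16 (a + n - 16, b + n - 16);
  if (i < 16)
    return (int) a[n - 16 + i] - (int) b[n - 16 + i];

  return 0;
}
```

When n == 16 the two windows coincide, so the second compare simply
repeats the first, which matches the "No branch when size == 16"
comment in the assembly.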