From patchwork Mon Jun 7 08:30:11 2021
X-Patchwork-Submitter: Noah Goldstein
X-Patchwork-Id: 43734
To: libc-alpha@sourceware.org
Subject: [PATCH v1] x86: memcmp-avx2-movbe.S and memcmp-evex-movbe.S fix
 overflow bug.
Date: Mon, 7 Jun 2021 04:30:11 -0400
Message-Id: <20210607083011.855616-1-goldstein.w.n@gmail.com>
From: Noah Goldstein <goldstein.w.n@gmail.com>

Fix bugs introduced in commits:

commit 4ad473e97acdc5f6d811755b67c09f2128a644ce
Author: Noah Goldstein
Date:   Mon, 17 May 2021 17:57:24 +0000 (13:57 -0400)

and

commit 16d12015c57701b08d7bbed6ec536641bcafb428
Author: Noah Goldstein
Date:   Mon, 17 May 2021 17:56:52 +0000 (13:56 -0400)

These commits introduced a bug where a pointer + length computation
that overflows leads to an early return instead of a segmentation
fault.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
---
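Note (not part of the commit message): the fix replaces the end-pointer
comparison in L(loop_4x_vec) with an explicit iteration count.  A
minimal C sketch of the new loop-bound arithmetic, with illustrative
names (VEC_SIZE and LOG_VEC_SIZE match the patch; loop_count is
hypothetical):

    #include <stdint.h>

    #define VEC_SIZE     32
    #define LOG_VEC_SIZE 7   /* log2 (4 * VEC_SIZE) */

    /* Mirrors the leaq/subq/sarq sequence added to L(more_8x_vec).
       The old loop ran `while (s1_aligned < end)`; if s1 + len wrapped
       around the top of the address space, `end` wrapped below
       `s1_aligned` and the loop exited immediately.  Deriving a count
       instead means a wrapped `end` yields a huge value, so the loop
       keeps reading until it faults.  (The patch uses sarq, an
       arithmetic shift; the unsigned shift here likewise gives a huge
       count in the wrapped case.)  */
    uint64_t
    loop_count (uint64_t s1, uint64_t len)
    {
      uint64_t end = s1 + len - 4 * VEC_SIZE;          /* May wrap.  */
      uint64_t s1_aligned = s1 & -(uint64_t) VEC_SIZE;
      return ((end - 1) - s1_aligned) >> LOG_VEC_SIZE;
    }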
 sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S | 13 ++++++++-----
 sysdeps/x86_64/multiarch/memcmp-evex-movbe.S | 15 ++++++++-------
 2 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S b/sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S
index 2621ec907a..4a9414ff61 100644
--- a/sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S
+++ b/sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S
@@ -60,6 +60,7 @@
 # endif
 
 # define VEC_SIZE 32
+# define LOG_VEC_SIZE 7
 # define PAGE_SIZE 4096
 
 /* Warning!
@@ -200,8 +201,7 @@ L(return_vec_2):
 # endif
 	VZEROUPPER_RETURN
 
-	/* NB: p2align 5 here to ensure 4x loop is 32 byte aligned.  */
-	.p2align 5
+	.p2align 4
 L(8x_return_vec_0_1_2_3):
 	/* Returning from L(more_8x_vec) requires restoring rsi.  */
 	addq	%rdi, %rsi
@@ -232,7 +232,7 @@ L(return_vec_3):
 # endif
 	VZEROUPPER_RETURN
 
-	.p2align 4
+	.p2align 5
 L(more_8x_vec):
 	/* Set end of s1 in rdx.  */
 	leaq	-(VEC_SIZE * 4)(%rdi, %rdx), %rdx
@@ -241,8 +241,11 @@ L(more_8x_vec):
 	subq	%rdi, %rsi
 	/* Align s1 pointer.  */
 	andq	$-VEC_SIZE, %rdi
+	leaq	-1(%rdx), %rax
+	subq	%rdi, %rax
 	/* Adjust because first 4x vec where check already.  */
 	subq	$-(VEC_SIZE * 4), %rdi
+	sarq	$LOG_VEC_SIZE, %rax
 	.p2align 4
 L(loop_4x_vec):
 	/* rsi has s2 - s1 so get correct address by adding s1 (in rdi).
@@ -267,8 +270,8 @@ L(loop_4x_vec):
 	jnz	L(8x_return_vec_0_1_2_3)
 	subq	$-(VEC_SIZE * 4), %rdi
 	/* Check if s1 pointer at end.  */
-	cmpq	%rdx, %rdi
-	jb	L(loop_4x_vec)
+	decq	%rax
+	jne	L(loop_4x_vec)
 
 	subq	%rdx, %rdi
 	/* rdi has 4 * VEC_SIZE - remaining length.  */
diff --git a/sysdeps/x86_64/multiarch/memcmp-evex-movbe.S b/sysdeps/x86_64/multiarch/memcmp-evex-movbe.S
index 654dc7ac8c..60be3f43e7 100644
--- a/sysdeps/x86_64/multiarch/memcmp-evex-movbe.S
+++ b/sysdeps/x86_64/multiarch/memcmp-evex-movbe.S
@@ -53,6 +53,7 @@
 # endif
 
 # define VEC_SIZE 32
+# define LOG_VEC_SIZE 7
 # define PAGE_SIZE 4096
 
 # define CHAR_PER_VEC (VEC_SIZE / CHAR_SIZE)
@@ -163,10 +164,7 @@ ENTRY (MEMCMP)
 	/* NB: eax must be zero to reach here.  */
 	ret
 
-	/* NB: aligning 32 here allows for the rest of the jump targets
-	   to be tuned for 32 byte alignment.  Most important this ensures
-	   the L(more_8x_vec) loop is 32 byte aligned.  */
-	.p2align 5
+	.p2align 4
 L(less_vec):
 	/* Check if one or less CHAR.  This is necessary for size = 0 but
 	   is also faster for size = CHAR_SIZE.  */
@@ -277,7 +275,7 @@ L(return_vec_3):
 # endif
 	ret
 
-	.p2align 4
+	.p2align 5
 L(more_8x_vec):
 	/* Set end of s1 in rdx.  */
 	leaq	-(VEC_SIZE * 4)(%rdi, %rdx, CHAR_SIZE), %rdx
@@ -286,8 +284,11 @@ L(more_8x_vec):
 	subq	%rdi, %rsi
 	/* Align s1 pointer.  */
 	andq	$-VEC_SIZE, %rdi
+	leaq	-1(%rdx), %rax
+	subq	%rdi, %rax
 	/* Adjust because first 4x vec where check already.  */
 	subq	$-(VEC_SIZE * 4), %rdi
+	sarq	$LOG_VEC_SIZE, %rax
 	.p2align 4
 L(loop_4x_vec):
 	VMOVU	(%rsi, %rdi), %YMM1
@@ -307,8 +308,8 @@ L(loop_4x_vec):
 	testl	%ecx, %ecx
 	jnz	L(8x_return_vec_0_1_2_3)
 	subq	$-(VEC_SIZE * 4), %rdi
-	cmpq	%rdx, %rdi
-	jb	L(loop_4x_vec)
+	decq	%rax
+	jnz	L(loop_4x_vec)
 
 	subq	%rdx, %rdi
 	/* rdi has 4 * VEC_SIZE - remaining length.  */
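
A hypothetical reproducer for the failure mode (illustrative only, not
a test case from this patch; whether the out-of-bounds reads fault
depends on the surrounding mappings, and passing such a length is
undefined behavior in any case):

    #include <string.h>
    #include <stdint.h>

    int
    main (void)
    {
      char buf[1] = { 0 };
      /* s1 + SIZE_MAX wraps, so the buggy loop bound ended up below
         the aligned s1 and the 4x-vector loop was skipped entirely:
         memcmp could return without ever faulting.  Comparing a buffer
         with itself keeps the loop from exiting on a byte difference,
         so with the fix the loop should eventually walk into unmapped
         memory and crash.  */
      return memcmp (buf, buf, SIZE_MAX);
    }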