From patchwork Mon Apr 19 19:35:01 2021
X-Patchwork-Submitter: Noah Goldstein
X-Patchwork-Id: 43049
To: libc-alpha@sourceware.org
Subject: [PATCH v3 1/2] x86: Optimize less_vec evex and avx512 memset-vec-unaligned-erms.S
Date: Mon, 19 Apr 2021 15:35:01 -0400
Message-Id: <20210419193502.586936-1-goldstein.w.n@gmail.com>
From: Noah Goldstein
Reply-To: Noah Goldstein

No bug.
This commit adds an optimized case for the less_vec memset path that uses
the avx512vl/avx512bw mask store, avoiding the excessive branches.
test-memset and test-wmemset are passing.

Signed-off-by: Noah Goldstein
---
 sysdeps/x86_64/multiarch/ifunc-impl-list.c   | 40 +++++++++-----
 sysdeps/x86_64/multiarch/ifunc-memset.h      |  6 ++-
 .../multiarch/memset-avx512-unaligned-erms.S |  2 +-
 .../multiarch/memset-evex-unaligned-erms.S   |  2 +-
 .../multiarch/memset-vec-unaligned-erms.S    | 52 +++++++++++++++----
 5 files changed, 75 insertions(+), 27 deletions(-)

diff --git a/sysdeps/x86_64/multiarch/ifunc-impl-list.c b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
index 0b0927b124..c377cab629 100644
--- a/sysdeps/x86_64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
@@ -204,19 +204,23 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
       __memset_chk_avx2_unaligned_erms_rtm)
   IFUNC_IMPL_ADD (array, i, __memset_chk,
       (CPU_FEATURE_USABLE (AVX512VL)
-       && CPU_FEATURE_USABLE (AVX512BW)),
+       && CPU_FEATURE_USABLE (AVX512BW)
+       && CPU_FEATURE_USABLE (BMI2)),
       __memset_chk_evex_unaligned)
   IFUNC_IMPL_ADD (array, i, __memset_chk,
       (CPU_FEATURE_USABLE (AVX512VL)
-       && CPU_FEATURE_USABLE (AVX512BW)),
+       && CPU_FEATURE_USABLE (AVX512BW)
+       && CPU_FEATURE_USABLE (BMI2)),
       __memset_chk_evex_unaligned_erms)
   IFUNC_IMPL_ADD (array, i, __memset_chk,
       (CPU_FEATURE_USABLE (AVX512VL)
-       && CPU_FEATURE_USABLE (AVX512BW)),
+       && CPU_FEATURE_USABLE (AVX512BW)
+       && CPU_FEATURE_USABLE (BMI2)),
       __memset_chk_avx512_unaligned_erms)
   IFUNC_IMPL_ADD (array, i, __memset_chk,
       (CPU_FEATURE_USABLE (AVX512VL)
-       && CPU_FEATURE_USABLE (AVX512BW)),
+       && CPU_FEATURE_USABLE (AVX512BW)
+       && CPU_FEATURE_USABLE (BMI2)),
       __memset_chk_avx512_unaligned)
   IFUNC_IMPL_ADD (array, i, __memset_chk,
       CPU_FEATURE_USABLE (AVX512F),
@@ -247,19 +251,23 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
       __memset_avx2_unaligned_erms_rtm)
   IFUNC_IMPL_ADD (array, i, memset,
       (CPU_FEATURE_USABLE (AVX512VL)
-       && CPU_FEATURE_USABLE (AVX512BW)),
+       && CPU_FEATURE_USABLE (AVX512BW)
+       && CPU_FEATURE_USABLE (BMI2)),
       __memset_evex_unaligned)
   IFUNC_IMPL_ADD (array, i, memset,
       (CPU_FEATURE_USABLE (AVX512VL)
-       && CPU_FEATURE_USABLE (AVX512BW)),
+       && CPU_FEATURE_USABLE (AVX512BW)
+       && CPU_FEATURE_USABLE (BMI2)),
       __memset_evex_unaligned_erms)
   IFUNC_IMPL_ADD (array, i, memset,
       (CPU_FEATURE_USABLE (AVX512VL)
-       && CPU_FEATURE_USABLE (AVX512BW)),
+       && CPU_FEATURE_USABLE (AVX512BW)
+       && CPU_FEATURE_USABLE (BMI2)),
       __memset_avx512_unaligned_erms)
   IFUNC_IMPL_ADD (array, i, memset,
       (CPU_FEATURE_USABLE (AVX512VL)
-       && CPU_FEATURE_USABLE (AVX512BW)),
+       && CPU_FEATURE_USABLE (AVX512BW)
+       && CPU_FEATURE_USABLE (BMI2)),
       __memset_avx512_unaligned)
   IFUNC_IMPL_ADD (array, i, memset,
       CPU_FEATURE_USABLE (AVX512F),
@@ -728,10 +736,14 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
       && CPU_FEATURE_USABLE (RTM)),
       __wmemset_avx2_unaligned_rtm)
   IFUNC_IMPL_ADD (array, i, wmemset,
-      CPU_FEATURE_USABLE (AVX512VL),
+      (CPU_FEATURE_USABLE (AVX512VL)
+       && CPU_FEATURE_USABLE (AVX512BW)
+       && CPU_FEATURE_USABLE (BMI2)),
      __wmemset_evex_unaligned)
   IFUNC_IMPL_ADD (array, i, wmemset,
-      CPU_FEATURE_USABLE (AVX512VL),
+      (CPU_FEATURE_USABLE (AVX512VL)
+       && CPU_FEATURE_USABLE (AVX512BW)
+       && CPU_FEATURE_USABLE (BMI2)),
      __wmemset_avx512_unaligned))

 #ifdef SHARED
@@ -935,10 +947,14 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
       CPU_FEATURE_USABLE (AVX2),
       __wmemset_chk_avx2_unaligned)
   IFUNC_IMPL_ADD (array, i, __wmemset_chk,
-      CPU_FEATURE_USABLE (AVX512VL),
+      (CPU_FEATURE_USABLE (AVX512VL)
+       && CPU_FEATURE_USABLE (AVX512BW)
+       && CPU_FEATURE_USABLE (BMI2)),
      __wmemset_chk_evex_unaligned)
   IFUNC_IMPL_ADD (array, i, __wmemset_chk,
-      CPU_FEATURE_USABLE (AVX512F),
+      (CPU_FEATURE_USABLE (AVX512VL)
+       && CPU_FEATURE_USABLE (AVX512BW)
+       && CPU_FEATURE_USABLE (BMI2)),
      __wmemset_chk_avx512_unaligned))

 #endif

diff --git a/sysdeps/x86_64/multiarch/ifunc-memset.h b/sysdeps/x86_64/multiarch/ifunc-memset.h
index 502f946a84..eda5640541 100644
--- a/sysdeps/x86_64/multiarch/ifunc-memset.h
+++ b/sysdeps/x86_64/multiarch/ifunc-memset.h
@@ -54,7 +54,8 @@ IFUNC_SELECTOR (void)
       && !CPU_FEATURES_ARCH_P (cpu_features, Prefer_No_AVX512))
     {
       if (CPU_FEATURE_USABLE_P (cpu_features, AVX512VL)
-          && CPU_FEATURE_USABLE_P (cpu_features, AVX512BW))
+          && CPU_FEATURE_USABLE_P (cpu_features, AVX512BW)
+          && CPU_FEATURE_USABLE_P (cpu_features, BMI2))
         {
           if (CPU_FEATURE_USABLE_P (cpu_features, ERMS))
             return OPTIMIZE (avx512_unaligned_erms);
@@ -68,7 +69,8 @@ IFUNC_SELECTOR (void)
   if (CPU_FEATURE_USABLE_P (cpu_features, AVX2))
     {
       if (CPU_FEATURE_USABLE_P (cpu_features, AVX512VL)
-          && CPU_FEATURE_USABLE_P (cpu_features, AVX512BW))
+          && CPU_FEATURE_USABLE_P (cpu_features, AVX512BW)
+          && CPU_FEATURE_USABLE_P (cpu_features, BMI2))
         {
           if (CPU_FEATURE_USABLE_P (cpu_features, ERMS))
             return OPTIMIZE (evex_unaligned_erms);

diff --git a/sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S b/sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S
index 22e7b187c8..8ad842fc2f 100644
--- a/sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S
+++ b/sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S
@@ -19,6 +19,6 @@
 # define SECTION(p)		p##.evex512
 # define MEMSET_SYMBOL(p,s)	p##_avx512_##s
 # define WMEMSET_SYMBOL(p,s)	p##_avx512_##s
-
+# define USE_LESS_VEC_MASK_STORE	1
 # include "memset-vec-unaligned-erms.S"
 #endif

diff --git a/sysdeps/x86_64/multiarch/memset-evex-unaligned-erms.S b/sysdeps/x86_64/multiarch/memset-evex-unaligned-erms.S
index ae0a4d6e46..640f092903 100644
--- a/sysdeps/x86_64/multiarch/memset-evex-unaligned-erms.S
+++ b/sysdeps/x86_64/multiarch/memset-evex-unaligned-erms.S
@@ -19,6 +19,6 @@
 # define SECTION(p)		p##.evex
 # define MEMSET_SYMBOL(p,s)	p##_evex_##s
 # define WMEMSET_SYMBOL(p,s)	p##_evex_##s
-
+# define USE_LESS_VEC_MASK_STORE	1
 # include "memset-vec-unaligned-erms.S"
 #endif

diff --git a/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S b/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
index 584747f1a1..3a59d39267 100644
--- a/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
+++ b/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
@@ -63,6 +63,9 @@
 # endif
 #endif

+#define PAGE_SIZE 4096
+#define LOG_PAGE_SIZE 12
+
 #ifndef SECTION
 # error SECTION is not defined!
 #endif
@@ -213,11 +216,38 @@ L(loop):
        cmpq    %rcx, %rdx
        jne     L(loop)
        VZEROUPPER_SHORT_RETURN
+
+       .p2align 4
 L(less_vec):
        /* Less than 1 VEC.  */
 # if VEC_SIZE != 16 && VEC_SIZE != 32 && VEC_SIZE != 64
 #  error Unsupported VEC_SIZE!
 # endif
+# ifdef USE_LESS_VEC_MASK_STORE
+       /* Clear high bits from edi. Only keeping bits relevant to page
+          cross check. Using sall instead of andl saves 3 bytes. Note
+          that we are using rax which is set in
+          MEMSET_VDUP_TO_VEC0_AND_SET_RETURN as ptr from here on out.  */
+       sall    $(32 - LOG_PAGE_SIZE), %edi
+       /* Check if VEC_SIZE load cross page. Mask loads suffer serious
+          performance degradation when it has to fault suppress.  */
+       cmpl    $((PAGE_SIZE - VEC_SIZE) << (32 - LOG_PAGE_SIZE)), %edi
+       ja      L(cross_page)
+# if VEC_SIZE > 32
+       movq    $-1, %rcx
+       bzhiq   %rdx, %rcx, %rcx
+       kmovq   %rcx, %k1
+# else
+       movl    $-1, %ecx
+       bzhil   %edx, %ecx, %ecx
+       kmovd   %ecx, %k1
+# endif
+       vmovdqu8 %VEC(0), (%rax) {%k1}
+       VZEROUPPER_RETURN
+
+       .p2align 4
+L(cross_page):
+# endif
 # if VEC_SIZE > 32
        cmpb    $32, %dl
        jae     L(between_32_63)
@@ -234,36 +264,36 @@ L(less_vec):
        cmpb    $1, %dl
        ja      L(between_2_3)
        jb      1f
-       movb    %cl, (%rdi)
+       movb    %cl, (%rax)
 1:
        VZEROUPPER_RETURN
 # if VEC_SIZE > 32
        /* From 32 to 63.  No branch when size == 32.  */
 L(between_32_63):
-       VMOVU   %YMM0, -32(%rdi,%rdx)
-       VMOVU   %YMM0, (%rdi)
+       VMOVU   %YMM0, -32(%rax,%rdx)
+       VMOVU   %YMM0, (%rax)
        VZEROUPPER_RETURN
 # endif
 # if VEC_SIZE > 16
        /* From 16 to 31.  No branch when size == 16.  */
 L(between_16_31):
-       VMOVU   %XMM0, -16(%rdi,%rdx)
-       VMOVU   %XMM0, (%rdi)
+       VMOVU   %XMM0, -16(%rax,%rdx)
+       VMOVU   %XMM0, (%rax)
        VZEROUPPER_RETURN
 # endif
        /* From 8 to 15.  No branch when size == 8.  */
 L(between_8_15):
-       movq    %rcx, -8(%rdi,%rdx)
-       movq    %rcx, (%rdi)
+       movq    %rcx, -8(%rax,%rdx)
+       movq    %rcx, (%rax)
        VZEROUPPER_RETURN
 L(between_4_7):
        /* From 4 to 7.  No branch when size == 4.  */
-       movl    %ecx, -4(%rdi,%rdx)
-       movl    %ecx, (%rdi)
+       movl    %ecx, -4(%rax,%rdx)
+       movl    %ecx, (%rax)
        VZEROUPPER_RETURN
 L(between_2_3):
        /* From 2 to 3.  No branch when size == 2.  */
-       movw    %cx, -2(%rdi,%rdx)
-       movw    %cx, (%rdi)
+       movw    %cx, -2(%rax,%rdx)
+       movw    %cx, (%rax)
        VZEROUPPER_RETURN
 END (MEMSET_SYMBOL (__memset, unaligned_erms))

From patchwork Mon Apr 19 19:35:02 2021
X-Patchwork-Submitter: Noah Goldstein
X-Patchwork-Id: 43050
To: libc-alpha@sourceware.org
Subject: [PATCH v3 2/2] x86: Expand test-memset.c and bench-memset.c
Date: Mon, 19 Apr 2021 15:35:02 -0400
Message-Id: <20210419193502.586936-2-goldstein.w.n@gmail.com>
In-Reply-To: <20210419193502.586936-1-goldstein.w.n@gmail.com>
References: <20210419193502.586936-1-goldstein.w.n@gmail.com>
From: Noah Goldstein
Reply-To: Noah Goldstein

No bug.
This commit adds test cases and benchmarks for page cross and for memset
to the end of the page without crossing.
In test-memset.c this commit also adds sentinels at the start and end of
tstbuf to check for overwrites.

Signed-off-by: Noah Goldstein
---
 benchtests/bench-memset.c |  6 ++++--
 string/test-memset.c      | 20 +++++++++++++++-----
 2 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/benchtests/bench-memset.c b/benchtests/bench-memset.c
index 1174900e88..d6619b4836 100644
--- a/benchtests/bench-memset.c
+++ b/benchtests/bench-memset.c
@@ -61,7 +61,7 @@ do_one_test (json_ctx_t *json_ctx, impl_t *impl, CHAR *s,
 static void
 do_test (json_ctx_t *json_ctx, size_t align, int c, size_t len)
 {
-  align &= 63;
+  align &= 4095;
   if ((align + len) * sizeof (CHAR) > page_size)
     return;

@@ -110,9 +110,11 @@ test_main (void)
     {
       for (i = 0; i < 18; ++i)
        do_test (&json_ctx, 0, c, 1 << i);
-      for (i = 1; i < 32; ++i)
+      for (i = 1; i < 64; ++i)
        {
          do_test (&json_ctx, i, c, i);
+         do_test (&json_ctx, 4096 - i, c, i);
+         do_test (&json_ctx, 4095, c, i);
          if (i & (i - 1))
            do_test (&json_ctx, 0, c, i);
        }

diff --git a/string/test-memset.c b/string/test-memset.c
index eb71517390..82bfcd6ad4 100644
--- a/string/test-memset.c
+++ b/string/test-memset.c
@@ -109,16 +109,24 @@ SIMPLE_MEMSET (CHAR *s, int c, size_t n)
 static void
 do_one_test (impl_t *impl, CHAR *s, int c __attribute ((unused)), size_t n)
 {
-  CHAR tstbuf[n];
+  CHAR buf[n + 2];
+  CHAR *tstbuf = buf + 1;
+  CHAR sentinel = c - 1;
+  buf[0] = sentinel;
+  buf[n + 1] = sentinel;
 #ifdef TEST_BZERO
   simple_bzero (tstbuf, n);
   CALL (impl, s, n);
-  if (memcmp (s, tstbuf, n) != 0)
+  if (memcmp (s, tstbuf, n) != 0
+      || buf[0] != sentinel
+      || buf[n + 1] != sentinel)
 #else
   CHAR *res = CALL (impl, s, c, n);
   if (res != s
       || SIMPLE_MEMSET (tstbuf, c, n) != tstbuf
-      || MEMCMP (s, tstbuf, n) != 0)
+      || MEMCMP (s, tstbuf, n) != 0
+      || buf[0] != sentinel
+      || buf[n + 1] != sentinel)
 #endif /* !TEST_BZERO */
     {
       error (0, 0, "Wrong result in function %s", impl->name);
@@ -130,7 +138,7 @@ do_one_test (impl_t *impl, CHAR *s, int c __attribute ((unused)), size_t n)
 static void
 do_test (size_t align, int c, size_t len)
 {
-  align &= 7;
+  align &= 4095;
   if ((align + len) * sizeof (CHAR) > page_size)
     return;

@@ -245,9 +253,11 @@ test_main (void)
     {
       for (i = 0; i < 18; ++i)
        do_test (0, c, 1 << i);
-      for (i = 1; i < 32; ++i)
+      for (i = 1; i < 64; ++i)
        {
          do_test (i, c, i);
+         do_test (4096 - i, c, i);
+         do_test (4095, c, i);
          if (i & (i - 1))
            do_test (0, c, i);
        }
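
As a rough illustration of the masked-store idea behind the new L(less_vec)
path in patch 1/2, here is a minimal C sketch (assuming AVX512F/AVX512BW and
BMI2, compiled with -mavx512f -mavx512bw -mbmi2; the helper name
small_memset_masked is made up for this sketch and is not part of the patch):

#include <immintrin.h>
#include <stddef.h>

/* Branchless memset for n < 64 bytes: bzhi (BMI2) builds a mask with
   exactly the low n bits set, and a single byte-masked 64-byte store
   (AVX512BW) writes only those n lanes.  No size-based branching is
   needed; n == 0 stores nothing.  */
static inline void
small_memset_masked (void *dst, int c, size_t n)
{
  __mmask64 k = (__mmask64) _bzhi_u64 (~0ULL, (unsigned int) n);
  _mm512_mask_storeu_epi8 (dst, k, _mm512_set1_epi8 ((char) c));
}

The sketch assumes the 64-byte window does not cross into an unmapped page;
the patch keeps the branchy L(cross_page) fallback for exactly the
fault-suppression cost its comments mention.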
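
Similarly, the page-cross guard that gates that path (the sall/cmpl pair
against PAGE_SIZE - VEC_SIZE) amounts to the following check, shown here as a
hypothetical C helper for clarity, assuming 4 KiB pages as in the patch's
PAGE_SIZE/LOG_PAGE_SIZE definitions:

#include <stdint.h>

/* A vec_size-byte store at ptr stays within one 4 KiB page iff the
   page offset of ptr is at most 4096 - vec_size.  The patch tests the
   same condition by shifting the offset bits to the top of the
   register (sall $(32 - LOG_PAGE_SIZE)) and comparing against
   (PAGE_SIZE - VEC_SIZE) shifted the same way, which is slightly
   smaller code than masking with andl.  */
static inline int
crosses_page (const void *ptr, unsigned int vec_size)
{
  uint32_t off = (uint32_t) (uintptr_t) ptr & (4096 - 1);
  return off > 4096u - vec_size;
}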