[v1,2/2] x86: Remove unnecessary overflow check from wcsnlen-sse4_1.S
Checks
Context |
Check |
Description |
dj/TryBot-apply_patch |
success
|
Patch applied to master at the time it was sent
|
dj/TryBot-32bit |
success
|
Build for i686
|
Commit Message
No bug. The way wcsnlen will check if near the end of maxlen
is the following macro:
mov %r11, %rsi; \
subq %rax, %rsi; \
andq $-64, %rax; \
testq $-64, %rsi; \
je L(strnlen_ret)
Which words independently of s + maxlen overflowing. So the
second overflow check is unnecissary for correctness and
just extra overhead in the common no overflow case.
test-strlen.c, test-wcslen.c, test-strnlen.c and test-wcsnlen.c are
all passing
Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
---
Sorry I didn't notice this earlier before my last commit. As
of submitting this patch
a775a7a3eb1e85b54af0b4ee5ff4dcf66772a1fb
Is HEAD of master to maybe rebase so commit history isnt messy?
sysdeps/x86_64/multiarch/strlen-vec.S | 7 -------
1 file changed, 7 deletions(-)
Comments
On Wed, Jun 23, 2021 at 5:00 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
>
> No bug. The way wcsnlen will check if near the end of maxlen
> is the following macro:
>
> mov %r11, %rsi; \
> subq %rax, %rsi; \
> andq $-64, %rax; \
> testq $-64, %rsi; \
> je L(strnlen_ret)
>
> Which words independently of s + maxlen overflowing. So the
> second overflow check is unnecissary for correctness and
> just extra overhead in the common no overflow case.
>
> test-strlen.c, test-wcslen.c, test-strnlen.c and test-wcsnlen.c are
> all passing
>
> Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
> ---
> Sorry I didn't notice this earlier before my last commit. As
> of submitting this patch
>
> a775a7a3eb1e85b54af0b4ee5ff4dcf66772a1fb
>
> Is HEAD of master to maybe rebase so commit history isnt messy?
>
>
> sysdeps/x86_64/multiarch/strlen-vec.S | 7 -------
> 1 file changed, 7 deletions(-)
>
> diff --git a/sysdeps/x86_64/multiarch/strlen-vec.S b/sysdeps/x86_64/multiarch/strlen-vec.S
> index 439e486a43..b7657282bd 100644
> --- a/sysdeps/x86_64/multiarch/strlen-vec.S
> +++ b/sysdeps/x86_64/multiarch/strlen-vec.S
> @@ -71,19 +71,12 @@ L(n_nonzero):
> suffice. */
> mov %RSI_LP, %R10_LP
> sar $62, %R10_LP
> - test %R10_LP, %R10_LP
> jnz __wcslen_sse4_1
> sal $2, %RSI_LP
> # endif
>
> -
> /* Initialize long lived registers. */
> -
> add %RDI_LP, %RSI_LP
> -# ifdef AS_WCSLEN
> -/* Check for overflow again from s + maxlen * sizeof(wchar_t). */
> - jbe __wcslen_sse4_1
> -# endif
> mov %RSI_LP, %R10_LP
> and $-64, %R10_LP
> mov %RSI_LP, %R11_LP
> --
> 2.25.1
>
LGTM.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Thanks.
On Wed, Jun 23, 2021 at 5:42 PM H.J. Lu via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>
> On Wed, Jun 23, 2021 at 5:00 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> >
> > No bug. The way wcsnlen will check if near the end of maxlen
> > is the following macro:
> >
> > mov %r11, %rsi; \
> > subq %rax, %rsi; \
> > andq $-64, %rax; \
> > testq $-64, %rsi; \
> > je L(strnlen_ret)
> >
> > Which words independently of s + maxlen overflowing. So the
> > second overflow check is unnecissary for correctness and
> > just extra overhead in the common no overflow case.
> >
> > test-strlen.c, test-wcslen.c, test-strnlen.c and test-wcsnlen.c are
> > all passing
> >
> > Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
> > ---
> > Sorry I didn't notice this earlier before my last commit. As
> > of submitting this patch
> >
> > a775a7a3eb1e85b54af0b4ee5ff4dcf66772a1fb
> >
> > Is HEAD of master to maybe rebase so commit history isnt messy?
> >
> >
> > sysdeps/x86_64/multiarch/strlen-vec.S | 7 -------
> > 1 file changed, 7 deletions(-)
> >
> > diff --git a/sysdeps/x86_64/multiarch/strlen-vec.S b/sysdeps/x86_64/multiarch/strlen-vec.S
> > index 439e486a43..b7657282bd 100644
> > --- a/sysdeps/x86_64/multiarch/strlen-vec.S
> > +++ b/sysdeps/x86_64/multiarch/strlen-vec.S
> > @@ -71,19 +71,12 @@ L(n_nonzero):
> > suffice. */
> > mov %RSI_LP, %R10_LP
> > sar $62, %R10_LP
> > - test %R10_LP, %R10_LP
> > jnz __wcslen_sse4_1
> > sal $2, %RSI_LP
> > # endif
> >
> > -
> > /* Initialize long lived registers. */
> > -
> > add %RDI_LP, %RSI_LP
> > -# ifdef AS_WCSLEN
> > -/* Check for overflow again from s + maxlen * sizeof(wchar_t). */
> > - jbe __wcslen_sse4_1
> > -# endif
> > mov %RSI_LP, %R10_LP
> > and $-64, %R10_LP
> > mov %RSI_LP, %R11_LP
> > --
> > 2.25.1
> >
>
> LGTM.
>
> Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
>
> Thanks.
>
> --
> H.J.
I would like to backport this patch to release branches.
Any comments or objections?
--Sunil
@@ -71,19 +71,12 @@ L(n_nonzero):
suffice. */
mov %RSI_LP, %R10_LP
sar $62, %R10_LP
- test %R10_LP, %R10_LP
jnz __wcslen_sse4_1
sal $2, %RSI_LP
# endif
-
/* Initialize long lived registers. */
-
add %RDI_LP, %RSI_LP
-# ifdef AS_WCSLEN
-/* Check for overflow again from s + maxlen * sizeof(wchar_t). */
- jbe __wcslen_sse4_1
-# endif
mov %RSI_LP, %R10_LP
and $-64, %R10_LP
mov %RSI_LP, %R11_LP