[v2] x86: Align entry for memrchr to 64-bytes.

Message ID 20220624164216.2129400-1-goldstein.w.n@gmail.com
State Committed
Commit 227afaa67213efcdce6a870ef5086200f1076438
Series [v2] x86: Align entry for memrchr to 64-bytes.

Checks

Context                Check    Description
dj/TryBot-apply_patch  success  Patch applied to master at the time it was sent
dj/TryBot-32bit        success  Build for i686

Commit Message

Noah Goldstein June 24, 2022, 4:42 p.m. UTC
  The function was tuned around 64-byte entry alignment and performs
better for all sizes with it.

As well, the different code paths were explicitly written to touch the
minimum number of cache lines, i.e. sizes <= 32 touch only the entry
cache line.
---
 sysdeps/x86_64/multiarch/memrchr-avx2.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
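
For reference, here is a minimal, illustrative sketch (not the literal
expansion of glibc's ENTRY_P2ALIGN macro, and with a made-up symbol name)
of what the change amounts to at the assembler level: the second argument
of ENTRY_P2ALIGN is a power-of-two exponent, so 6 requests a 2^6 = 64-byte
boundary for the function entry.

	.text
	.p2align 6		/* Pad so the entry lands on a 64-byte
				   (cache-line) boundary.  */
	.globl	aligned_entry_demo
	.type	aligned_entry_demo, @function
aligned_entry_demo:
	/* With the entry cache-line aligned, the code handling short
	   lengths (sizes <= 32, i.e. a single 32-byte VEC) can be laid
	   out to fit within these first 64 bytes, so it is fetched
	   from a single instruction-cache line.  */
	ret
	.size	aligned_entry_demo, .-aligned_entry_demo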
  

Comments

H.J. Lu June 24, 2022, 5:15 p.m. UTC | #1
On Fri, Jun 24, 2022 at 9:42 AM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
>
> The function was tuned around 64-byte entry alignment and performs
> better for all sizes with it.
>
> As well, the different code paths were explicitly written to touch the
> minimum number of cache lines, i.e. sizes <= 32 touch only the entry
> cache line.
> ---
>  sysdeps/x86_64/multiarch/memrchr-avx2.S | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/sysdeps/x86_64/multiarch/memrchr-avx2.S b/sysdeps/x86_64/multiarch/memrchr-avx2.S
> index 9c83c76d3c..f300d7daf4 100644
> --- a/sysdeps/x86_64/multiarch/memrchr-avx2.S
> +++ b/sysdeps/x86_64/multiarch/memrchr-avx2.S
> @@ -35,7 +35,7 @@
>  # define VEC_SIZE                      32
>  # define PAGE_SIZE                     4096
>         .section SECTION(.text), "ax", @progbits
> -ENTRY(MEMRCHR)
> +ENTRY_P2ALIGN(MEMRCHR, 6)
>  # ifdef __ILP32__
>         /* Clear upper bits.  */
>         and     %RDX_LP, %RDX_LP
> --
> 2.34.1
>

LGTM.

Thanks.
  
Sunil Pandey July 14, 2022, 2:59 a.m. UTC | #2
On Fri, Jun 24, 2022 at 10:16 AM H.J. Lu via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>
> On Fri, Jun 24, 2022 at 9:42 AM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> >
> > The function was tuned around 64-byte entry alignment and performs
> > better for all sizes with it.
> >
> > As well, the different code paths were explicitly written to touch the
> > minimum number of cache lines, i.e. sizes <= 32 touch only the entry
> > cache line.
> > ---
> >  sysdeps/x86_64/multiarch/memrchr-avx2.S | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/sysdeps/x86_64/multiarch/memrchr-avx2.S b/sysdeps/x86_64/multiarch/memrchr-avx2.S
> > index 9c83c76d3c..f300d7daf4 100644
> > --- a/sysdeps/x86_64/multiarch/memrchr-avx2.S
> > +++ b/sysdeps/x86_64/multiarch/memrchr-avx2.S
> > @@ -35,7 +35,7 @@
> >  # define VEC_SIZE                      32
> >  # define PAGE_SIZE                     4096
> >         .section SECTION(.text), "ax", @progbits
> > -ENTRY(MEMRCHR)
> > +ENTRY_P2ALIGN(MEMRCHR, 6)
> >  # ifdef __ILP32__
> >         /* Clear upper bits.  */
> >         and     %RDX_LP, %RDX_LP
> > --
> > 2.34.1
> >
>
> LGTM.
>
> Thanks.
>
> --
> H.J.

I would like to backport this patch to release branches.
Any comments or objections?

--Sunil
  

Patch

diff --git a/sysdeps/x86_64/multiarch/memrchr-avx2.S b/sysdeps/x86_64/multiarch/memrchr-avx2.S
index 9c83c76d3c..f300d7daf4 100644
--- a/sysdeps/x86_64/multiarch/memrchr-avx2.S
+++ b/sysdeps/x86_64/multiarch/memrchr-avx2.S
@@ -35,7 +35,7 @@ 
 # define VEC_SIZE			32
 # define PAGE_SIZE			4096
 	.section SECTION(.text), "ax", @progbits
-ENTRY(MEMRCHR)
+ENTRY_P2ALIGN(MEMRCHR, 6)
 # ifdef __ILP32__
 	/* Clear upper bits.  */
 	and	%RDX_LP, %RDX_LP