[v1,1/2] x86: Add comment explaining no Slow_SSE4_2 check in ifunc-sse4_2

Message ID 20220628152717.17838-1-goldstein.w.n@gmail.com
State Superseded
Headers
Series [v1,1/2] x86: Add comment explaining no Slow_SSE4_2 check in ifunc-sse4_2 |

Checks

Context Check Description
dj/TryBot-apply_patch success Patch applied to master at the time it was sent

Commit Message

Noah Goldstein June 28, 2022, 3:27 p.m. UTC
  Just for clarities sake and so that if a future implementation is
added we remember to add the check.
---
 sysdeps/x86_64/multiarch/ifunc-sse4_2.h | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
  

Comments

H.J. Lu June 29, 2022, 6:53 p.m. UTC | #1
On Tue, Jun 28, 2022 at 8:27 AM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
>
> Just for clarities sake and so that if a future implementation is
> added we remember to add the check.
> ---
>  sysdeps/x86_64/multiarch/ifunc-sse4_2.h | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/sysdeps/x86_64/multiarch/ifunc-sse4_2.h b/sysdeps/x86_64/multiarch/ifunc-sse4_2.h
> index ee36525bcf..204c4b5406 100644
> --- a/sysdeps/x86_64/multiarch/ifunc-sse4_2.h
> +++ b/sysdeps/x86_64/multiarch/ifunc-sse4_2.h
> @@ -27,7 +27,11 @@ IFUNC_SELECTOR (void)
>  {
>    const struct cpu_features* cpu_features = __get_cpu_features ();
>
> -  if (CPU_FEATURE_USABLE_P (cpu_features, SSE4_2))
> +  /* This function uses slow sse4.2 instructions (pcmpstri) but since
> +     there is no other optimized implementation keep using.  If an
> +     optimized fallback is added add a X86_ISA_CPU_FEATURE_USABLE_P
> +     (cpu_features, SSE4_2) check.  */

Did you mean Slow_SSE4_2?

> +  if (ISA_CPU_FEATURE_USABLE_P (cpu_features, SSE4_2))
>      return OPTIMIZE (sse42);
>
>    return OPTIMIZE (generic);
> --
> 2.34.1
>
  
Noah Goldstein June 29, 2022, 7:19 p.m. UTC | #2
On Wed, Jun 29, 2022 at 11:54 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Tue, Jun 28, 2022 at 8:27 AM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> >
> > Just for clarities sake and so that if a future implementation is
> > added we remember to add the check.
> > ---
> >  sysdeps/x86_64/multiarch/ifunc-sse4_2.h | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/sysdeps/x86_64/multiarch/ifunc-sse4_2.h b/sysdeps/x86_64/multiarch/ifunc-sse4_2.h
> > index ee36525bcf..204c4b5406 100644
> > --- a/sysdeps/x86_64/multiarch/ifunc-sse4_2.h
> > +++ b/sysdeps/x86_64/multiarch/ifunc-sse4_2.h
> > @@ -27,7 +27,11 @@ IFUNC_SELECTOR (void)
> >  {
> >    const struct cpu_features* cpu_features = __get_cpu_features ();
> >
> > -  if (CPU_FEATURE_USABLE_P (cpu_features, SSE4_2))
> > +  /* This function uses slow sse4.2 instructions (pcmpstri) but since
> > +     there is no other optimized implementation keep using.  If an
> > +     optimized fallback is added add a X86_ISA_CPU_FEATURE_USABLE_P
> > +     (cpu_features, SSE4_2) check.  */
>
> Did you mean Slow_SSE4_2?

Yes. Will fix for V2.
>
> > +  if (ISA_CPU_FEATURE_USABLE_P (cpu_features, SSE4_2))
> >      return OPTIMIZE (sse42);
> >
> >    return OPTIMIZE (generic);
> > --
> > 2.34.1
> >
>
>
> --
> H.J.
  
Noah Goldstein June 29, 2022, 10:06 p.m. UTC | #3
On Wed, Jun 29, 2022 at 12:19 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
>
> On Wed, Jun 29, 2022 at 11:54 AM H.J. Lu <hjl.tools@gmail.com> wrote:
> >
> > On Tue, Jun 28, 2022 at 8:27 AM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> > >
> > > Just for clarities sake and so that if a future implementation is
> > > added we remember to add the check.
> > > ---
> > >  sysdeps/x86_64/multiarch/ifunc-sse4_2.h | 6 +++++-
> > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/sysdeps/x86_64/multiarch/ifunc-sse4_2.h b/sysdeps/x86_64/multiarch/ifunc-sse4_2.h
> > > index ee36525bcf..204c4b5406 100644
> > > --- a/sysdeps/x86_64/multiarch/ifunc-sse4_2.h
> > > +++ b/sysdeps/x86_64/multiarch/ifunc-sse4_2.h
> > > @@ -27,7 +27,11 @@ IFUNC_SELECTOR (void)
> > >  {
> > >    const struct cpu_features* cpu_features = __get_cpu_features ();
> > >
> > > -  if (CPU_FEATURE_USABLE_P (cpu_features, SSE4_2))
> > > +  /* This function uses slow sse4.2 instructions (pcmpstri) but since
> > > +     there is no other optimized implementation keep using.  If an
> > > +     optimized fallback is added add a X86_ISA_CPU_FEATURE_USABLE_P
> > > +     (cpu_features, SSE4_2) check.  */
> >
> > Did you mean Slow_SSE4_2?
>
> Yes. Will fix for V2.

Fixed in V2.
> >
> > > +  if (ISA_CPU_FEATURE_USABLE_P (cpu_features, SSE4_2))
> > >      return OPTIMIZE (sse42);
> > >
> > >    return OPTIMIZE (generic);
> > > --
> > > 2.34.1
> > >
> >
> >
> > --
> > H.J.
  

Patch

diff --git a/sysdeps/x86_64/multiarch/ifunc-sse4_2.h b/sysdeps/x86_64/multiarch/ifunc-sse4_2.h
index ee36525bcf..204c4b5406 100644
--- a/sysdeps/x86_64/multiarch/ifunc-sse4_2.h
+++ b/sysdeps/x86_64/multiarch/ifunc-sse4_2.h
@@ -27,7 +27,11 @@  IFUNC_SELECTOR (void)
 {
   const struct cpu_features* cpu_features = __get_cpu_features ();
 
-  if (CPU_FEATURE_USABLE_P (cpu_features, SSE4_2))
+  /* This function uses slow sse4.2 instructions (pcmpstri) but since
+     there is no other optimized implementation keep using.  If an
+     optimized fallback is added add a X86_ISA_CPU_FEATURE_USABLE_P
+     (cpu_features, SSE4_2) check.  */
+  if (ISA_CPU_FEATURE_USABLE_P (cpu_features, SSE4_2))
     return OPTIMIZE (sse42);
 
   return OPTIMIZE (generic);