[v2,0/6] x86: Fix AVX2 string functions requiring BMI1, BMI2 or LZCNT (BZ #29611)

Message ID 20221002123424.3079805-1-aurelien@aurel32.net
Headers
Series x86: Fix AVX2 string functions requiring BMI1, BMI2 or LZCNT (BZ #29611) |

Message

Aurelien Jarno Oct. 2, 2022, 12:34 p.m. UTC
  Some early Intel Haswell CPU have AVX2 instructions, but do not have
BMI1 and BMI2 instructions. Some AVX2 string functions only check for
AVX2, but use BMI1, BMI2 or LZCNT instructions. This patchset tries to
fix that.

While most fixes only change ifunc-impl-list.c, and thus only concerns
the testsuite, the strn(case)cmp is a real issue affecting early Intel
Haswell CPU, reported to affect Debian Sid and Fedora Rawhide.

On the other hand, the check for LZCNT in memrchr is purely for
correctness, I am not aware of a CPU implementing AVX2 without LZCNT.

This has been tested by remplacing all BMI1 and BMI2 instructions in the
source code by the "ud2" instruction and disabling the BMI1, BMI2
feature detection, and running the testsuite.

Resolves: BZ #29611

Change v1 -> v2:
- Better scan for BMI2 instructions (shlx and shrx) and BMI1
  instructions (blsmsk) instructions following the feedback from Noah
  Goldstein

Aurelien Jarno (6):
  x86: include BMI1 and BMI2 in x86-64-v3 level
  x86-64: Require BMI2 for AVX2 str*cmp and wcs(n)cmp implementations
  x86-64: Require BMI2 for AVX2 (raw|w)memchr implementations
  x86-64: Require LZCNT for AVX2 memrchr implementation
  x86-64: Require BMI1/BMI2 for AVX2 strrchr and wcsrchr implementations
  x86-64: Require BMI2 for AVX2 memrchr implementation

 sysdeps/x86/get-isa-level.h                 |  2 +
 sysdeps/x86/isa-level.h                     |  2 +
 sysdeps/x86_64/multiarch/ifunc-avx2.h       |  2 +
 sysdeps/x86_64/multiarch/ifunc-impl-list.c  | 86 ++++++++++++++++-----
 sysdeps/x86_64/multiarch/ifunc-strcasecmp.h |  1 +
 sysdeps/x86_64/multiarch/strcmp.c           |  4 +-
 sysdeps/x86_64/multiarch/strncmp.c          |  4 +-
 7 files changed, 76 insertions(+), 25 deletions(-)
  

Comments

Noah Goldstein Oct. 2, 2022, 4:21 p.m. UTC | #1
On Sun, Oct 2, 2022 at 5:34 AM Aurelien Jarno <aurelien@aurel32.net> wrote:
>
> Some early Intel Haswell CPU have AVX2 instructions, but do not have
> BMI1 and BMI2 instructions. Some AVX2 string functions only check for
> AVX2, but use BMI1, BMI2 or LZCNT instructions. This patchset tries to
> fix that.
>
> While most fixes only change ifunc-impl-list.c, and thus only concerns
> the testsuite, the strn(case)cmp is a real issue affecting early Intel

str(case)cmp as well, correct?

> Haswell CPU, reported to affect Debian Sid and Fedora Rawhide.
>
> On the other hand, the check for LZCNT in memrchr is purely for
> correctness, I am not aware of a CPU implementing AVX2 without LZCNT.
>
> This has been tested by remplacing all BMI1 and BMI2 instructions in the
> source code by the "ud2" instruction and disabling the BMI1, BMI2
> feature detection, and running the testsuite.
>
> Resolves: BZ #29611
>
> Change v1 -> v2:
> - Better scan for BMI2 instructions (shlx and shrx) and BMI1
>   instructions (blsmsk) instructions following the feedback from Noah
>   Goldstein
>
> Aurelien Jarno (6):
>   x86: include BMI1 and BMI2 in x86-64-v3 level
>   x86-64: Require BMI2 for AVX2 str*cmp and wcs(n)cmp implementations
>   x86-64: Require BMI2 for AVX2 (raw|w)memchr implementations
>   x86-64: Require LZCNT for AVX2 memrchr implementation
>   x86-64: Require BMI1/BMI2 for AVX2 strrchr and wcsrchr implementations
>   x86-64: Require BMI2 for AVX2 memrchr implementation
>
>  sysdeps/x86/get-isa-level.h                 |  2 +
>  sysdeps/x86/isa-level.h                     |  2 +
>  sysdeps/x86_64/multiarch/ifunc-avx2.h       |  2 +
>  sysdeps/x86_64/multiarch/ifunc-impl-list.c  | 86 ++++++++++++++++-----
>  sysdeps/x86_64/multiarch/ifunc-strcasecmp.h |  1 +
>  sysdeps/x86_64/multiarch/strcmp.c           |  4 +-
>  sysdeps/x86_64/multiarch/strncmp.c          |  4 +-
>  7 files changed, 76 insertions(+), 25 deletions(-)
>
> --
> 2.35.1
>
  
Aurelien Jarno Oct. 2, 2022, 6:09 p.m. UTC | #2
On 2022-10-02 09:21, Noah Goldstein wrote:
> On Sun, Oct 2, 2022 at 5:34 AM Aurelien Jarno <aurelien@aurel32.net> wrote:
> >
> > Some early Intel Haswell CPU have AVX2 instructions, but do not have
> > BMI1 and BMI2 instructions. Some AVX2 string functions only check for
> > AVX2, but use BMI1, BMI2 or LZCNT instructions. This patchset tries to
> > fix that.
> >
> > While most fixes only change ifunc-impl-list.c, and thus only concerns
> > the testsuite, the strn(case)cmp is a real issue affecting early Intel
> 
> str(case)cmp as well, correct?

Oops, yes forgot to update the cover letter on that aspect.

> > Haswell CPU, reported to affect Debian Sid and Fedora Rawhide.
> >
> > On the other hand, the check for LZCNT in memrchr is purely for
> > correctness, I am not aware of a CPU implementing AVX2 without LZCNT.
> >
> > This has been tested by remplacing all BMI1 and BMI2 instructions in the
> > source code by the "ud2" instruction and disabling the BMI1, BMI2
> > feature detection, and running the testsuite.
> >
> > Resolves: BZ #29611
> >
> > Change v1 -> v2:
> > - Better scan for BMI2 instructions (shlx and shrx) and BMI1
> >   instructions (blsmsk) instructions following the feedback from Noah
> >   Goldstein
> >
> > Aurelien Jarno (6):
> >   x86: include BMI1 and BMI2 in x86-64-v3 level
> >   x86-64: Require BMI2 for AVX2 str*cmp and wcs(n)cmp implementations
> >   x86-64: Require BMI2 for AVX2 (raw|w)memchr implementations
> >   x86-64: Require LZCNT for AVX2 memrchr implementation
> >   x86-64: Require BMI1/BMI2 for AVX2 strrchr and wcsrchr implementations
> >   x86-64: Require BMI2 for AVX2 memrchr implementation
> >
> >  sysdeps/x86/get-isa-level.h                 |  2 +
> >  sysdeps/x86/isa-level.h                     |  2 +
> >  sysdeps/x86_64/multiarch/ifunc-avx2.h       |  2 +
> >  sysdeps/x86_64/multiarch/ifunc-impl-list.c  | 86 ++++++++++++++++-----
> >  sysdeps/x86_64/multiarch/ifunc-strcasecmp.h |  1 +
> >  sysdeps/x86_64/multiarch/strcmp.c           |  4 +-
> >  sysdeps/x86_64/multiarch/strncmp.c          |  4 +-
> >  7 files changed, 76 insertions(+), 25 deletions(-)
> >
> > --
> > 2.35.1
> >
>
  
Noah Goldstein Oct. 2, 2022, 9:11 p.m. UTC | #3
On Sun, Oct 2, 2022 at 2:09 PM Aurelien Jarno <aurelien@aurel32.net> wrote:
>
> On 2022-10-02 09:21, Noah Goldstein wrote:
> > On Sun, Oct 2, 2022 at 5:34 AM Aurelien Jarno <aurelien@aurel32.net> wrote:
> > >
> > > Some early Intel Haswell CPU have AVX2 instructions, but do not have
> > > BMI1 and BMI2 instructions. Some AVX2 string functions only check for
> > > AVX2, but use BMI1, BMI2 or LZCNT instructions. This patchset tries to
> > > fix that.
> > >
> > > While most fixes only change ifunc-impl-list.c, and thus only concerns
> > > the testsuite, the strn(case)cmp is a real issue affecting early Intel
> >
> > str(case)cmp as well, correct?
>
> Oops, yes forgot to update the cover letter on that aspect.
>
> > > Haswell CPU, reported to affect Debian Sid and Fedora Rawhide.
> > >
> > > On the other hand, the check for LZCNT in memrchr is purely for
> > > correctness, I am not aware of a CPU implementing AVX2 without LZCNT.
> > >
> > > This has been tested by remplacing all BMI1 and BMI2 instructions in the
> > > source code by the "ud2" instruction and disabling the BMI1, BMI2
> > > feature detection, and running the testsuite.
> > >
> > > Resolves: BZ #29611
> > >
> > > Change v1 -> v2:
> > > - Better scan for BMI2 instructions (shlx and shrx) and BMI1
> > >   instructions (blsmsk) instructions following the feedback from Noah
> > >   Goldstein
> > >
> > > Aurelien Jarno (6):
> > >   x86: include BMI1 and BMI2 in x86-64-v3 level
> > >   x86-64: Require BMI2 for AVX2 str*cmp and wcs(n)cmp implementations
> > >   x86-64: Require BMI2 for AVX2 (raw|w)memchr implementations
> > >   x86-64: Require LZCNT for AVX2 memrchr implementation
> > >   x86-64: Require BMI1/BMI2 for AVX2 strrchr and wcsrchr implementations
> > >   x86-64: Require BMI2 for AVX2 memrchr implementation
> > >
> > >  sysdeps/x86/get-isa-level.h                 |  2 +
> > >  sysdeps/x86/isa-level.h                     |  2 +
> > >  sysdeps/x86_64/multiarch/ifunc-avx2.h       |  2 +
> > >  sysdeps/x86_64/multiarch/ifunc-impl-list.c  | 86 ++++++++++++++++-----
> > >  sysdeps/x86_64/multiarch/ifunc-strcasecmp.h |  1 +
> > >  sysdeps/x86_64/multiarch/strcmp.c           |  4 +-
> > >  sysdeps/x86_64/multiarch/strncmp.c          |  4 +-
> > >  7 files changed, 76 insertions(+), 25 deletions(-)
> > >
> > > --
> > > 2.35.1
> > >
> >
>
> --
> Aurelien Jarno                          GPG: 4096R/1DDD8C9B
> aurelien@aurel32.net                 http://www.aurel32.net

Patchset looks good.

Do you have commit permissions? If not I can push them for you.

Thanks for the bugfix!

NB:
the str*(case)cmp, wcs(n)cmp bug affects 2.36, 2.35, 2.34, 2.33.
  
Aurelien Jarno Oct. 3, 2022, 5:36 p.m. UTC | #4
On 2022-10-02 17:11, Noah Goldstein via Libc-alpha wrote:
> On Sun, Oct 2, 2022 at 2:09 PM Aurelien Jarno <aurelien@aurel32.net> wrote:
> >
> > On 2022-10-02 09:21, Noah Goldstein wrote:
> > > On Sun, Oct 2, 2022 at 5:34 AM Aurelien Jarno <aurelien@aurel32.net> wrote:
> > > >
> > > > Some early Intel Haswell CPU have AVX2 instructions, but do not have
> > > > BMI1 and BMI2 instructions. Some AVX2 string functions only check for
> > > > AVX2, but use BMI1, BMI2 or LZCNT instructions. This patchset tries to
> > > > fix that.
> > > >
> > > > While most fixes only change ifunc-impl-list.c, and thus only concerns
> > > > the testsuite, the strn(case)cmp is a real issue affecting early Intel
> > >
> > > str(case)cmp as well, correct?
> >
> > Oops, yes forgot to update the cover letter on that aspect.
> >
> > > > Haswell CPU, reported to affect Debian Sid and Fedora Rawhide.
> > > >
> > > > On the other hand, the check for LZCNT in memrchr is purely for
> > > > correctness, I am not aware of a CPU implementing AVX2 without LZCNT.
> > > >
> > > > This has been tested by remplacing all BMI1 and BMI2 instructions in the
> > > > source code by the "ud2" instruction and disabling the BMI1, BMI2
> > > > feature detection, and running the testsuite.
> > > >
> > > > Resolves: BZ #29611
> > > >
> > > > Change v1 -> v2:
> > > > - Better scan for BMI2 instructions (shlx and shrx) and BMI1
> > > >   instructions (blsmsk) instructions following the feedback from Noah
> > > >   Goldstein
> > > >
> > > > Aurelien Jarno (6):
> > > >   x86: include BMI1 and BMI2 in x86-64-v3 level
> > > >   x86-64: Require BMI2 for AVX2 str*cmp and wcs(n)cmp implementations
> > > >   x86-64: Require BMI2 for AVX2 (raw|w)memchr implementations
> > > >   x86-64: Require LZCNT for AVX2 memrchr implementation
> > > >   x86-64: Require BMI1/BMI2 for AVX2 strrchr and wcsrchr implementations
> > > >   x86-64: Require BMI2 for AVX2 memrchr implementation
> > > >
> > > >  sysdeps/x86/get-isa-level.h                 |  2 +
> > > >  sysdeps/x86/isa-level.h                     |  2 +
> > > >  sysdeps/x86_64/multiarch/ifunc-avx2.h       |  2 +
> > > >  sysdeps/x86_64/multiarch/ifunc-impl-list.c  | 86 ++++++++++++++++-----
> > > >  sysdeps/x86_64/multiarch/ifunc-strcasecmp.h |  1 +
> > > >  sysdeps/x86_64/multiarch/strcmp.c           |  4 +-
> > > >  sysdeps/x86_64/multiarch/strncmp.c          |  4 +-
> > > >  7 files changed, 76 insertions(+), 25 deletions(-)
> > > >
> > > > --
> > > > 2.35.1
> > > >
> > >
> >
> > --
> > Aurelien Jarno                          GPG: 4096R/1DDD8C9B
> > aurelien@aurel32.net                 http://www.aurel32.net
> 
> Patchset looks good.
> 
> Do you have commit permissions? If not I can push them for you.

Yes, I can commit them, though I'll wait for v3 to be reviewed.

> Thanks for the bugfix!
> 
> NB:
> the str*(case)cmp, wcs(n)cmp bug affects 2.36, 2.35, 2.34, 2.33.

Yep, I have already prepared a backport of the whole patchset to those
branches.
  
Noah Goldstein Oct. 3, 2022, 5:51 p.m. UTC | #5
On Mon, Oct 3, 2022 at 10:36 AM Aurelien Jarno <aurelien@aurel32.net> wrote:
>
> On 2022-10-02 17:11, Noah Goldstein via Libc-alpha wrote:
> > On Sun, Oct 2, 2022 at 2:09 PM Aurelien Jarno <aurelien@aurel32.net> wrote:
> > >
> > > On 2022-10-02 09:21, Noah Goldstein wrote:
> > > > On Sun, Oct 2, 2022 at 5:34 AM Aurelien Jarno <aurelien@aurel32.net> wrote:
> > > > >
> > > > > Some early Intel Haswell CPU have AVX2 instructions, but do not have
> > > > > BMI1 and BMI2 instructions. Some AVX2 string functions only check for
> > > > > AVX2, but use BMI1, BMI2 or LZCNT instructions. This patchset tries to
> > > > > fix that.
> > > > >
> > > > > While most fixes only change ifunc-impl-list.c, and thus only concerns
> > > > > the testsuite, the strn(case)cmp is a real issue affecting early Intel
> > > >
> > > > str(case)cmp as well, correct?
> > >
> > > Oops, yes forgot to update the cover letter on that aspect.
> > >
> > > > > Haswell CPU, reported to affect Debian Sid and Fedora Rawhide.
> > > > >
> > > > > On the other hand, the check for LZCNT in memrchr is purely for
> > > > > correctness, I am not aware of a CPU implementing AVX2 without LZCNT.
> > > > >
> > > > > This has been tested by remplacing all BMI1 and BMI2 instructions in the
> > > > > source code by the "ud2" instruction and disabling the BMI1, BMI2
> > > > > feature detection, and running the testsuite.
> > > > >
> > > > > Resolves: BZ #29611
> > > > >
> > > > > Change v1 -> v2:
> > > > > - Better scan for BMI2 instructions (shlx and shrx) and BMI1
> > > > >   instructions (blsmsk) instructions following the feedback from Noah
> > > > >   Goldstein
> > > > >
> > > > > Aurelien Jarno (6):
> > > > >   x86: include BMI1 and BMI2 in x86-64-v3 level
> > > > >   x86-64: Require BMI2 for AVX2 str*cmp and wcs(n)cmp implementations
> > > > >   x86-64: Require BMI2 for AVX2 (raw|w)memchr implementations
> > > > >   x86-64: Require LZCNT for AVX2 memrchr implementation
> > > > >   x86-64: Require BMI1/BMI2 for AVX2 strrchr and wcsrchr implementations
> > > > >   x86-64: Require BMI2 for AVX2 memrchr implementation
> > > > >
> > > > >  sysdeps/x86/get-isa-level.h                 |  2 +
> > > > >  sysdeps/x86/isa-level.h                     |  2 +
> > > > >  sysdeps/x86_64/multiarch/ifunc-avx2.h       |  2 +
> > > > >  sysdeps/x86_64/multiarch/ifunc-impl-list.c  | 86 ++++++++++++++++-----
> > > > >  sysdeps/x86_64/multiarch/ifunc-strcasecmp.h |  1 +
> > > > >  sysdeps/x86_64/multiarch/strcmp.c           |  4 +-
> > > > >  sysdeps/x86_64/multiarch/strncmp.c          |  4 +-
> > > > >  7 files changed, 76 insertions(+), 25 deletions(-)
> > > > >
> > > > > --
> > > > > 2.35.1
> > > > >
> > > >
> > >
> > > --
> > > Aurelien Jarno                          GPG: 4096R/1DDD8C9B
> > > aurelien@aurel32.net                 http://www.aurel32.net
> >
> > Patchset looks good.
> >
> > Do you have commit permissions? If not I can push them for you.
>
> Yes, I can commit them, though I'll wait for v3 to be reviewed.
>
> > Thanks for the bugfix!
> >
> > NB:
> > the str*(case)cmp, wcs(n)cmp bug affects 2.36, 2.35, 2.34, 2.33.
>
> Yep, I have already prepared a backport of the whole patchset to those
> branches.
>

If thats the case you may not need V3 because AFAIK thats the only
reason to split the strcmp patches up.

Sunil can you comment?
> --
> Aurelien Jarno                          GPG: 4096R/1DDD8C9B
> aurelien@aurel32.net                 http://www.aurel32.net