[v5,2/3] x86: Add macros for GPRs / mask insn based on VEC_SIZE

Message ID 20221014211501.524094-2-goldstein.w.n@gmail.com
State Superseded
Series [v5,1/3] x86: Update evex256/512 vec macros

Checks

Context | Check | Description
dj/TryBot-apply_patch | success | Patch applied to master at the time it was sent

Commit Message

Noah Goldstein Oct. 14, 2022, 9:15 p.m. UTC
This is to make it easier to do things like:
```
vpcmpb %VEC(0), %VEC(1), %k0
kmov{d|q} %k0, %{eax|rax}
test %{eax|rax}
```

It adds macros such that any GPR can get the proper width with:
    `V{upper_case_GPR_name}`

and any mask insn can get the proper width with:
    `{upper_case_mask_insn_without_postfix}`

This commit does not change libc.so

Tested build on x86-64
---
 sysdeps/x86_64/multiarch/reg-macros.h         | 166 ++++++++++++++++++
 .../multiarch/scripts/gen-reg-macros.py       | 123 +++++++++++++
 2 files changed, 289 insertions(+)
 create mode 100644 sysdeps/x86_64/multiarch/reg-macros.h
 create mode 100644 sysdeps/x86_64/multiarch/scripts/gen-reg-macros.py
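
As a rough illustration of what the two macro families resolve to, the following Python sketch models the lookup outside the preprocessor. It is not part of the patch: the helper names `vgpr`, `vkinsn`, and `expand` are hypothetical, only two registers are listed for brevity, and `kunpack` (which takes paired suffixes such as `bw`) is left out.

```python
# Hypothetical model of the reg-macros.h lookup; not code from the patch.
GPR_NAMES = {
    "rax": {8: "al", 16: "ax", 32: "eax", 64: "rax"},
    "rcx": {8: "cl", 16: "cx", 32: "ecx", 64: "rcx"},
}
MASK_SUFFIX = {8: "b", 16: "w", 32: "d", 64: "q"}


def vgpr(reg_name, reg_width):
    """Mirror VGPR_SZ(reg_name, reg_width): pick the sub-register name."""
    return GPR_NAMES[reg_name][reg_width]


def vkinsn(mask_insn, reg_width):
    """Mirror VKINSN_SZ(insn, reg_width): append the width suffix."""
    return mask_insn + MASK_SUFFIX[reg_width]


def expand(reg_width):
    """Render the mask-to-GPR move from the commit message for one width."""
    return "{} %k0, %{}".format(vkinsn("kmov", reg_width),
                                vgpr("rax", reg_width))


for width in (32, 64):
    print(width, "->", expand(width))
```

With a width of 32 this yields the `kmovd`/`eax` pair and with 64 the `kmovq`/`rax` pair, which is the choice the commit message writes as `kmov{d|q}` and `%{eax|rax}`.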
  

Comments

H.J. Lu Oct. 14, 2022, 9:28 p.m. UTC | #1
On Fri, Oct 14, 2022 at 2:15 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
>
> diff --git a/sysdeps/x86_64/multiarch/reg-macros.h b/sysdeps/x86_64/multiarch/reg-macros.h
> new file mode 100644
> index 0000000000..16168b6fda
> --- /dev/null
> +++ b/sysdeps/x86_64/multiarch/reg-macros.h
> @@ -0,0 +1,166 @@
> +/* This file was generated by: gen-reg-macros.py.
> +
> +   Copyright (C) 2022 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#ifndef _REG_MACROS_H
> +#define _REG_MACROS_H  1
> +
> +#define rax_8  al
> +#define rax_16 ax
> +#define rax_32 eax
> +#define rax_64 rax
> +#define rbx_8  bl
> +#define rbx_16 bx
> +#define rbx_32 ebx
> +#define rbx_64 rbx
> +#define rcx_8  cl
> +#define rcx_16 cx
> +#define rcx_32 ecx
> +#define rcx_64 rcx
> +#define rdx_8  dl
> +#define rdx_16 dx
> +#define rdx_32 edx
> +#define rdx_64 rdx
> +#define rbp_8  bpl
> +#define rbp_16 bp
> +#define rbp_32 ebp
> +#define rbp_64 rbp
> +#define rsp_8  spl
> +#define rsp_16 sp
> +#define rsp_32 esp
> +#define rsp_64 rsp
> +#define rsi_8  sil
> +#define rsi_16 si
> +#define rsi_32 esi
> +#define rsi_64 rsi
> +#define rdi_8  dil
> +#define rdi_16 di
> +#define rdi_32 edi
> +#define rdi_64 rdi
> +#define r8_8   r8b
> +#define r8_16  r8w
> +#define r8_32  r8d
> +#define r8_64  r8
> +#define r9_8   r9b
> +#define r9_16  r9w
> +#define r9_32  r9d
> +#define r9_64  r9
> +#define r10_8  r10b
> +#define r10_16 r10w
> +#define r10_32 r10d
> +#define r10_64 r10
> +#define r11_8  r11b
> +#define r11_16 r11w
> +#define r11_32 r11d
> +#define r11_64 r11
> +#define r12_8  r12b
> +#define r12_16 r12w
> +#define r12_32 r12d
> +#define r12_64 r12
> +#define r13_8  r13b
> +#define r13_16 r13w
> +#define r13_32 r13d
> +#define r13_64 r13
> +#define r14_8  r14b
> +#define r14_16 r14w
> +#define r14_32 r14d
> +#define r14_64 r14
> +#define r15_8  r15b
> +#define r15_16 r15w
> +#define r15_32 r15d
> +#define r15_64 r15
> +
> +#define kmov_8 kmovb
> +#define kmov_16        kmovw
> +#define kmov_32        kmovd
> +#define kmov_64        kmovq
> +#define kortest_8      kortestb
> +#define kortest_16     kortestw
> +#define kortest_32     kortestd
> +#define kortest_64     kortestq
> +#define kor_8  korb
> +#define kor_16 korw
> +#define kor_32 kord
> +#define kor_64 korq
> +#define ktest_8        ktestb
> +#define ktest_16       ktestw
> +#define ktest_32       ktestd
> +#define ktest_64       ktestq
> +#define kand_8 kandb
> +#define kand_16        kandw
> +#define kand_32        kandd
> +#define kand_64        kandq
> +#define kxor_8 kxorb
> +#define kxor_16        kxorw
> +#define kxor_32        kxord
> +#define kxor_64        kxorq
> +#define knot_8 knotb
> +#define knot_16        knotw
> +#define knot_32        knotd
> +#define knot_64        knotq
> +#define kxnor_8        kxnorb
> +#define kxnor_16       kxnorw
> +#define kxnor_32       kxnord
> +#define kxnor_64       kxnorq
> +#define kunpack_8      kunpackbw
> +#define kunpack_16     kunpackwd
> +#define kunpack_32     kunpackdq
> +
> +/* Common API for accessing proper width GPR is V{upcase_GPR_name}.  */
> +#define VRAX   VGPR(rax)
> +#define VRBX   VGPR(rbx)
> +#define VRCX   VGPR(rcx)
> +#define VRDX   VGPR(rdx)
> +#define VRBP   VGPR(rbp)
> +#define VRSP   VGPR(rsp)
> +#define VRSI   VGPR(rsi)
> +#define VRDI   VGPR(rdi)
> +#define VR8    VGPR(r8)
> +#define VR9    VGPR(r9)
> +#define VR10   VGPR(r10)
> +#define VR11   VGPR(r11)
> +#define VR12   VGPR(r12)
> +#define VR13   VGPR(r13)
> +#define VR14   VGPR(r14)
> +#define VR15   VGPR(r15)
> +
> +/* Common API for accessing proper width mask insn is {upcase_mask_insn}.  */
> +#define KMOV   VKINSN(kmov)
> +#define KORTEST        VKINSN(kortest)
> +#define KOR    VKINSN(kor)
> +#define KTEST  VKINSN(ktest)
> +#define KAND   VKINSN(kand)
> +#define KXOR   VKINSN(kxor)
> +#define KNOT   VKINSN(knot)
> +#define KXNOR  VKINSN(kxnor)
> +#define KUNPACK        VKINSN(kunpack)
> +
> +#ifndef REG_WIDTH
> +# define REG_WIDTH VEC_SIZE
> +#endif

Which files will define REG_WIDTH?  What values will it be for
YMM and ZMM vectors?

> +#define VPASTER(x, y)  x##_##y
> +#define VEVALUATOR(x, y)       VPASTER(x, y)
> +
> +#define VGPR_SZ(reg_name, reg_size)    VEVALUATOR(reg_name, reg_size)
> +#define VKINSN_SZ(insn, reg_size)      VEVALUATOR(insn, reg_size)
> +
> +#define VGPR(reg_name) VGPR_SZ(reg_name, REG_WIDTH)
> +#define VKINSN(mask_insn)      VKINSN_SZ(mask_insn, REG_WIDTH)
> +
> +#endif
> diff --git a/sysdeps/x86_64/multiarch/scripts/gen-reg-macros.py b/sysdeps/x86_64/multiarch/scripts/gen-reg-macros.py
> new file mode 100644
> index 0000000000..c7296a8104
> --- /dev/null
> +++ b/sysdeps/x86_64/multiarch/scripts/gen-reg-macros.py
> @@ -0,0 +1,123 @@
> +#!/usr/bin/python3
> +# Copyright (C) 2022 Free Software Foundation, Inc.
> +# This file is part of the GNU C Library.
> +#
> +# The GNU C Library is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU Lesser General Public
> +# License as published by the Free Software Foundation; either
> +# version 2.1 of the License, or (at your option) any later version.
> +#
> +# The GNU C Library is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +# Lesser General Public License for more details.
> +#
> +# You should have received a copy of the GNU Lesser General Public
> +# License along with the GNU C Library; if not, see
> +# <https://www.gnu.org/licenses/>.
> +"""Generate macros for getting GPR name of a certain size
> +
> +Inputs: None
> +Output: Prints header file to stdout
> +
> +API:
> +    VGPR(reg_name)
> +        - Get register name VEC_SIZE component of `reg_name`
> +    VGPR_SZ(reg_name, reg_size)
> +        - Get register name `reg_size` component of `reg_name`
> +"""
> +
> +import sys
> +import os
> +from datetime import datetime
> +
> +registers = [["rax", "eax", "ax", "al"], ["rbx", "ebx", "bx", "bl"],
> +             ["rcx", "ecx", "cx", "cl"], ["rdx", "edx", "dx", "dl"],
> +             ["rbp", "ebp", "bp", "bpl"], ["rsp", "esp", "sp", "spl"],
> +             ["rsi", "esi", "si", "sil"], ["rdi", "edi", "di", "dil"],
> +             ["r8", "r8d", "r8w", "r8b"], ["r9", "r9d", "r9w", "r9b"],
> +             ["r10", "r10d", "r10w", "r10b"], ["r11", "r11d", "r11w", "r11b"],
> +             ["r12", "r12d", "r12w", "r12b"], ["r13", "r13d", "r13w", "r13b"],
> +             ["r14", "r14d", "r14w", "r14b"], ["r15", "r15d", "r15w", "r15b"]]
> +
> +mask_insns = [
> +    "kmov",
> +    "kortest",
> +    "kor",
> +    "ktest",
> +    "kand",
> +    "kxor",
> +    "knot",
> +    "kxnor",
> +]
> +mask_insns_ext = ["b", "w", "d", "q"]
> +
> +cr = """
> +   Copyright (C) {} Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +"""
> +
> +print("/* This file was generated by: {}.".format(os.path.basename(
> +    sys.argv[0])))
> +print(cr.format(datetime.today().year))
> +
> +print("#ifndef _REG_MACROS_H")
> +print("#define _REG_MACROS_H\t1")
> +print("")
> +for reg in registers:
> +    for i in range(0, 4):
> +        print("#define {}_{}\t{}".format(reg[0], 8 << i, reg[3 - i]))
> +
> +print("")
> +for mask_insn in mask_insns:
> +    for i in range(0, 4):
> +        print("#define {}_{}\t{}{}".format(mask_insn, 8 << i, mask_insn,
> +                                           mask_insns_ext[i]))
> +for i in range(0, 3):
> +    print("#define kunpack_{}\tkunpack{}{}".format(8 << i, mask_insns_ext[i],
> +                                                   mask_insns_ext[i + 1]))
> +mask_insns.append("kunpack")
> +
> +print("")
> +print(
> +    "/* Common API for accessing proper width GPR is V{upcase_GPR_name}.  */")
> +for reg in registers:
> +    print("#define V{}\tVGPR({})".format(reg[0].upper(), reg[0]))
> +
> +print("")
> +
> +print(
> +    "/* Common API for accessing proper width mask insn is {upcase_mask_insn}.  */"
> +)
> +for mask_insn in mask_insns:
> +    print("#define {} \tVKINSN({})".format(mask_insn.upper(), mask_insn))
> +print("")
> +
> +print("#ifndef REG_WIDTH")
> +print("# define REG_WIDTH VEC_SIZE")
> +print("#endif")
> +print("")
> +print("#define VPASTER(x, y)\tx##_##y")
> +print("#define VEVALUATOR(x, y)\tVPASTER(x, y)")
> +print("")
> +print("#define VGPR_SZ(reg_name, reg_size)\tVEVALUATOR(reg_name, reg_size)")
> +print("#define VKINSN_SZ(insn, reg_size)\tVEVALUATOR(insn, reg_size)")
> +print("")
> +print("#define VGPR(reg_name)\tVGPR_SZ(reg_name, REG_WIDTH)")
> +print("#define VKINSN(mask_insn)\tVKINSN_SZ(mask_insn, REG_WIDTH)")
> +
> +print("\n#endif")
> --
> 2.34.1
>
  
Noah Goldstein Oct. 14, 2022, 10:01 p.m. UTC | #2
On Fri, Oct 14, 2022 at 4:28 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Fri, Oct 14, 2022 at 2:15 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> > +#ifndef REG_WIDTH
> > +# define REG_WIDTH VEC_SIZE
> > +#endif
>
> Which files will define REG_WIDTH?  What values will it be for
> YMM and ZMM vectors?

For non-wide-char evex or avx2/sse2 implementations, REG_WIDTH = VEC_SIZE,
so for YMM REG_WIDTH = 32 and for ZMM REG_WIDTH = 64.

For wchar implementations, REG_WIDTH will often be 32 irrespective of YMM/ZMM.
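
One way to see why a wide-character implementation can stay at 32 is to count mask bits: a compare produces one mask bit per element, so the number of bits is the vector size divided by the element size. The sketch below works through that arithmetic. It is an illustration rather than anything from the patch; the function name `mask_bits` is hypothetical, and a 4-byte `wchar_t` is assumed.

```python
def mask_bits(vec_size, char_size):
    """Number of mask bits one vector compare produces (one bit per element)."""
    return vec_size // char_size


# (vector register, VEC_SIZE in bytes)
VECTORS = (("YMM", 32), ("ZMM", 64))

for name, vec_size in VECTORS:
    byte_bits = mask_bits(vec_size, 1)  # byte elements, e.g. vpcmpb
    wide_bits = mask_bits(vec_size, 4)  # assumed 4-byte wchar_t elements
    print("{}: {} mask bits for bytes, {} for wide chars".format(
        name, byte_bits, wide_bits))
```

For byte elements the mask needs 32 or 64 bits, matching REG_WIDTH = VEC_SIZE, while the wide-character masks (8 or 16 bits) fit in a 32-bit register for either vector size.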
  
H.J. Lu Oct. 14, 2022, 10:05 p.m. UTC | #3
On Fri, Oct 14, 2022 at 3:01 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
>
> On Fri, Oct 14, 2022 at 4:28 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> >
> > On Fri, Oct 14, 2022 at 2:15 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> > >
> > > +#ifndef REG_WIDTH
> > > +# define REG_WIDTH VEC_SIZE
> > > +#endif
> >
> > Which files will define REG_WIDTH?  What values will it be for
> > YMM and ZMM vectors?
>
> For non-wide-char evex or avx2/sse2 implementations, REG_WIDTH = VEC_SIZE,
> so for YMM REG_WIDTH = 32 and for ZMM REG_WIDTH = 64.
>
> For wchar implementations, REG_WIDTH will often be 32 irrespective of YMM/ZMM.

Then we should have

#ifdef USE_WIDE_CHAR
# define REG_WIDTH 32
#else
# define REG_WIDTH VEC_SIZE
#endif
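
Read as plain logic, the conditional above selects the width as follows. This is a small Python model for illustration only; `reg_width` is a hypothetical helper, and `use_wide_char` stands in for whether `USE_WIDE_CHAR` is defined.

```python
def reg_width(vec_size, use_wide_char):
    """Model of the suggested #ifdef: 32 for wide-char builds, else VEC_SIZE."""
    return 32 if use_wide_char else vec_size


for vec_size in (32, 64):
    for use_wide_char in (False, True):
        print("VEC_SIZE={} wide={} -> REG_WIDTH={}".format(
            vec_size, use_wide_char, reg_width(vec_size, use_wide_char)))
```

The only case where the two branches differ is a ZMM wide-character build, where the width drops from 64 to 32.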

> >
> > > +#define VPASTER(x, y)  x##_##y
> > > +#define VEVALUATOR(x, y)       VPASTER(x, y)
> > > +
> > > +#define VGPR_SZ(reg_name, reg_size)    VEVALUATOR(reg_name, reg_size)
> > > +#define VKINSN_SZ(insn, reg_size)      VEVALUATOR(insn, reg_size)
> > > +
> > > +#define VGPR(reg_name) VGPR_SZ(reg_name, REG_WIDTH)
> > > +#define VKINSN(mask_insn)      VKINSN_SZ(mask_insn, REG_WIDTH)
> > > +
> > > +#endif
> > > diff --git a/sysdeps/x86_64/multiarch/scripts/gen-reg-macros.py b/sysdeps/x86_64/multiarch/scripts/gen-reg-macros.py
> > > new file mode 100644
> > > index 0000000000..c7296a8104
> > > --- /dev/null
> > > +++ b/sysdeps/x86_64/multiarch/scripts/gen-reg-macros.py
> > > @@ -0,0 +1,123 @@
> > > +#!/usr/bin/python3
> > > +# Copyright (C) 2022 Free Software Foundation, Inc.
> > > +# This file is part of the GNU C Library.
> > > +#
> > > +# The GNU C Library is free software; you can redistribute it and/or
> > > +# modify it under the terms of the GNU Lesser General Public
> > > +# License as published by the Free Software Foundation; either
> > > +# version 2.1 of the License, or (at your option) any later version.
> > > +#
> > > +# The GNU C Library is distributed in the hope that it will be useful,
> > > +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > > +# Lesser General Public License for more details.
> > > +#
> > > +# You should have received a copy of the GNU Lesser General Public
> > > +# License along with the GNU C Library; if not, see
> > > +# <https://www.gnu.org/licenses/>.
> > > +"""Generate macros for getting GPR name of a certain size
> > > +
> > > +Inputs: None
> > > +Output: Prints header file to stdout
> > > +
> > > +API:
> > > +    VGPR(reg_name)
> > > +        - Get register name VEC_SIZE component of `reg_name`
> > > +    VGPR_SZ(reg_name, reg_size)
> > > +        - Get register name `reg_size` component of `reg_name`
> > > +"""
> > > +
> > > +import sys
> > > +import os
> > > +from datetime import datetime
> > > +
> > > +registers = [["rax", "eax", "ax", "al"], ["rbx", "ebx", "bx", "bl"],
> > > +             ["rcx", "ecx", "cx", "cl"], ["rdx", "edx", "dx", "dl"],
> > > +             ["rbp", "ebp", "bp", "bpl"], ["rsp", "esp", "sp", "spl"],
> > > +             ["rsi", "esi", "si", "sil"], ["rdi", "edi", "di", "dil"],
> > > +             ["r8", "r8d", "r8w", "r8b"], ["r9", "r9d", "r9w", "r9b"],
> > > +             ["r10", "r10d", "r10w", "r10b"], ["r11", "r11d", "r11w", "r11b"],
> > > +             ["r12", "r12d", "r12w", "r12b"], ["r13", "r13d", "r13w", "r13b"],
> > > +             ["r14", "r14d", "r14w", "r14b"], ["r15", "r15d", "r15w", "r15b"]]
> > > +
> > > +mask_insns = [
> > > +    "kmov",
> > > +    "kortest",
> > > +    "kor",
> > > +    "ktest",
> > > +    "kand",
> > > +    "kxor",
> > > +    "knot",
> > > +    "kxnor",
> > > +]
> > > +mask_insns_ext = ["b", "w", "d", "q"]
> > > +
> > > +cr = """
> > > +   Copyright (C) {} Free Software Foundation, Inc.
> > > +   This file is part of the GNU C Library.
> > > +
> > > +   The GNU C Library is free software; you can redistribute it and/or
> > > +   modify it under the terms of the GNU Lesser General Public
> > > +   License as published by the Free Software Foundation; either
> > > +   version 2.1 of the License, or (at your option) any later version.
> > > +
> > > +   The GNU C Library is distributed in the hope that it will be useful,
> > > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > > +   Lesser General Public License for more details.
> > > +
> > > +   You should have received a copy of the GNU Lesser General Public
> > > +   License along with the GNU C Library; if not, see
> > > +   <https://www.gnu.org/licenses/>.  */
> > > +"""
> > > +
> > > +print("/* This file was generated by: {}.".format(os.path.basename(
> > > +    sys.argv[0])))
> > > +print(cr.format(datetime.today().year))
> > > +
> > > +print("#ifndef _REG_MACROS_H")
> > > +print("#define _REG_MACROS_H\t1")
> > > +print("")
> > > +for reg in registers:
> > > +    for i in range(0, 4):
> > > +        print("#define {}_{}\t{}".format(reg[0], 8 << i, reg[3 - i]))
> > > +
> > > +print("")
> > > +for mask_insn in mask_insns:
> > > +    for i in range(0, 4):
> > > +        print("#define {}_{}\t{}{}".format(mask_insn, 8 << i, mask_insn,
> > > +                                           mask_insns_ext[i]))
> > > +for i in range(0, 3):
> > > +    print("#define kunpack_{}\tkunpack{}{}".format(8 << i, mask_insns_ext[i],
> > > +                                                   mask_insns_ext[i + 1]))
> > > +mask_insns.append("kunpack")
> > > +
> > > +print("")
> > > +print(
> > > +    "/* Common API for accessing proper width GPR is V{upcase_GPR_name}.  */")
> > > +for reg in registers:
> > > +    print("#define V{}\tVGPR({})".format(reg[0].upper(), reg[0]))
> > > +
> > > +print("")
> > > +
> > > +print(
> > > +    "/* Common API for accessing proper width mask insn is {upcase_mask_insn}.  */"
> > > +)
> > > +for mask_insn in mask_insns:
> > > +    print("#define {} \tVKINSN({})".format(mask_insn.upper(), mask_insn))
> > > +print("")
> > > +
> > > +print("#ifndef REG_WIDTH")
> > > +print("# define REG_WIDTH VEC_SIZE")
> > > +print("#endif")
> > > +print("")
> > > +print("#define VPASTER(x, y)\tx##_##y")
> > > +print("#define VEVALUATOR(x, y)\tVPASTER(x, y)")
> > > +print("")
> > > +print("#define VGPR_SZ(reg_name, reg_size)\tVEVALUATOR(reg_name, reg_size)")
> > > +print("#define VKINSN_SZ(insn, reg_size)\tVEVALUATOR(insn, reg_size)")
> > > +print("")
> > > +print("#define VGPR(reg_name)\tVGPR_SZ(reg_name, REG_WIDTH)")
> > > +print("#define VKINSN(mask_insn)\tVKINSN_SZ(mask_insn, REG_WIDTH)")
> > > +
> > > +print("\n#endif")
> > > --
> > > 2.34.1
> > >
> >
> >
> > --
> > H.J.
  
Noah Goldstein Oct. 14, 2022, 10:27 p.m. UTC | #4
On Fri, Oct 14, 2022 at 5:06 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
>  On Fri, Oct 14, 2022 at 3:01 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> >
> > On Fri, Oct 14, 2022 at 4:28 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> > >
> > > On Fri, Oct 14, 2022 at 2:15 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> > > >
> > > > This is to make it easier to do things like:
> > > > ```
> > > > vpcmpb %VEC(0), %VEC(1), %k0
> > > > kmov{d|q} %k0, %{eax|rax}
> > > > test %{eax|rax}
> > > > ```
> > > >
> > > > It adds macro s.t any GPR can get the proper width with:
> > > >     `V{upper_case_GPR_name}`
> > > >
> > > > and any mask insn can get the proper width with:
> > > >     `{mask_insn_without_postfix}V`
> > > >
> > > > This commit does not change libc.so
> > > >
> > > > Tested build on x86-64
> > > > ---
> > > >  sysdeps/x86_64/multiarch/reg-macros.h         | 166 ++++++++++++++++++
> > > >  .../multiarch/scripts/gen-reg-macros.py       | 123 +++++++++++++
> > > >  2 files changed, 289 insertions(+)
> > > >  create mode 100644 sysdeps/x86_64/multiarch/reg-macros.h
> > > >  create mode 100644 sysdeps/x86_64/multiarch/scripts/gen-reg-macros.py
> > > >
> > > > diff --git a/sysdeps/x86_64/multiarch/reg-macros.h b/sysdeps/x86_64/multiarch/reg-macros.h
> > > > new file mode 100644
> > > > index 0000000000..16168b6fda
> > > > --- /dev/null
> > > > +++ b/sysdeps/x86_64/multiarch/reg-macros.h
> > > > @@ -0,0 +1,166 @@
> > > > +/* This file was generated by: gen-reg-macros.py.
> > > > +
> > > > +   Copyright (C) 2022 Free Software Foundation, Inc.
> > > > +   This file is part of the GNU C Library.
> > > > +
> > > > +   The GNU C Library is free software; you can redistribute it and/or
> > > > +   modify it under the terms of the GNU Lesser General Public
> > > > +   License as published by the Free Software Foundation; either
> > > > +   version 2.1 of the License, or (at your option) any later version.
> > > > +
> > > > +   The GNU C Library is distributed in the hope that it will be useful,
> > > > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > > > +   Lesser General Public License for more details.
> > > > +
> > > > +   You should have received a copy of the GNU Lesser General Public
> > > > +   License along with the GNU C Library; if not, see
> > > > +   <https://www.gnu.org/licenses/>.  */
> > > > +
> > > > +#ifndef _REG_MACROS_H
> > > > +#define _REG_MACROS_H  1
> > > > +
> > > > +#define rax_8  al
> > > > +#define rax_16 ax
> > > > +#define rax_32 eax
> > > > +#define rax_64 rax
> > > > +#define rbx_8  bl
> > > > +#define rbx_16 bx
> > > > +#define rbx_32 ebx
> > > > +#define rbx_64 rbx
> > > > +#define rcx_8  cl
> > > > +#define rcx_16 cx
> > > > +#define rcx_32 ecx
> > > > +#define rcx_64 rcx
> > > > +#define rdx_8  dl
> > > > +#define rdx_16 dx
> > > > +#define rdx_32 edx
> > > > +#define rdx_64 rdx
> > > > +#define rbp_8  bpl
> > > > +#define rbp_16 bp
> > > > +#define rbp_32 ebp
> > > > +#define rbp_64 rbp
> > > > +#define rsp_8  spl
> > > > +#define rsp_16 sp
> > > > +#define rsp_32 esp
> > > > +#define rsp_64 rsp
> > > > +#define rsi_8  sil
> > > > +#define rsi_16 si
> > > > +#define rsi_32 esi
> > > > +#define rsi_64 rsi
> > > > +#define rdi_8  dil
> > > > +#define rdi_16 di
> > > > +#define rdi_32 edi
> > > > +#define rdi_64 rdi
> > > > +#define r8_8   r8b
> > > > +#define r8_16  r8w
> > > > +#define r8_32  r8d
> > > > +#define r8_64  r8
> > > > +#define r9_8   r9b
> > > > +#define r9_16  r9w
> > > > +#define r9_32  r9d
> > > > +#define r9_64  r9
> > > > +#define r10_8  r10b
> > > > +#define r10_16 r10w
> > > > +#define r10_32 r10d
> > > > +#define r10_64 r10
> > > > +#define r11_8  r11b
> > > > +#define r11_16 r11w
> > > > +#define r11_32 r11d
> > > > +#define r11_64 r11
> > > > +#define r12_8  r12b
> > > > +#define r12_16 r12w
> > > > +#define r12_32 r12d
> > > > +#define r12_64 r12
> > > > +#define r13_8  r13b
> > > > +#define r13_16 r13w
> > > > +#define r13_32 r13d
> > > > +#define r13_64 r13
> > > > +#define r14_8  r14b
> > > > +#define r14_16 r14w
> > > > +#define r14_32 r14d
> > > > +#define r14_64 r14
> > > > +#define r15_8  r15b
> > > > +#define r15_16 r15w
> > > > +#define r15_32 r15d
> > > > +#define r15_64 r15
> > > > +
> > > > +#define kmov_8 kmovb
> > > > +#define kmov_16        kmovw
> > > > +#define kmov_32        kmovd
> > > > +#define kmov_64        kmovq
> > > > +#define kortest_8      kortestb
> > > > +#define kortest_16     kortestw
> > > > +#define kortest_32     kortestd
> > > > +#define kortest_64     kortestq
> > > > +#define kor_8  korb
> > > > +#define kor_16 korw
> > > > +#define kor_32 kord
> > > > +#define kor_64 korq
> > > > +#define ktest_8        ktestb
> > > > +#define ktest_16       ktestw
> > > > +#define ktest_32       ktestd
> > > > +#define ktest_64       ktestq
> > > > +#define kand_8 kandb
> > > > +#define kand_16        kandw
> > > > +#define kand_32        kandd
> > > > +#define kand_64        kandq
> > > > +#define kxor_8 kxorb
> > > > +#define kxor_16        kxorw
> > > > +#define kxor_32        kxord
> > > > +#define kxor_64        kxorq
> > > > +#define knot_8 knotb
> > > > +#define knot_16        knotw
> > > > +#define knot_32        knotd
> > > > +#define knot_64        knotq
> > > > +#define kxnor_8        kxnorb
> > > > +#define kxnor_16       kxnorw
> > > > +#define kxnor_32       kxnord
> > > > +#define kxnor_64       kxnorq
> > > > +#define kunpack_8      kunpackbw
> > > > +#define kunpack_16     kunpackwd
> > > > +#define kunpack_32     kunpackdq
> > > > +
> > > > +/* Common API for accessing proper width GPR is V{upcase_GPR_name}.  */
> > > > +#define VRAX   VGPR(rax)
> > > > +#define VRBX   VGPR(rbx)
> > > > +#define VRCX   VGPR(rcx)
> > > > +#define VRDX   VGPR(rdx)
> > > > +#define VRBP   VGPR(rbp)
> > > > +#define VRSP   VGPR(rsp)
> > > > +#define VRSI   VGPR(rsi)
> > > > +#define VRDI   VGPR(rdi)
> > > > +#define VR8    VGPR(r8)
> > > > +#define VR9    VGPR(r9)
> > > > +#define VR10   VGPR(r10)
> > > > +#define VR11   VGPR(r11)
> > > > +#define VR12   VGPR(r12)
> > > > +#define VR13   VGPR(r13)
> > > > +#define VR14   VGPR(r14)
> > > > +#define VR15   VGPR(r15)
> > > > +
> > > > +/* Common API for accessing proper width mask insn is {upcase_mask_insn}.  */
> > > > +#define KMOV   VKINSN(kmov)
> > > > +#define KORTEST        VKINSN(kortest)
> > > > +#define KOR    VKINSN(kor)
> > > > +#define KTEST  VKINSN(ktest)
> > > > +#define KAND   VKINSN(kand)
> > > > +#define KXOR   VKINSN(kxor)
> > > > +#define KNOT   VKINSN(knot)
> > > > +#define KXNOR  VKINSN(kxnor)
> > > > +#define KUNPACK        VKINSN(kunpack)
> > > > +
> > > > +#ifndef REG_WIDTH
> > > > +# define REG_WIDTH VEC_SIZE
> > > > +#endif
> > >
> > > Which files will define REG_WIDTH?  What values will it be for
> > > YMM and ZMM vectors?
> >
> > for non-wide char evex or avx2/sse2 REG_WIDTH = VEC_SIZE
> > so for YMM REG_WIDTH = 32, for ZMM REG_WIDTH = 64.
> >
> > For wchar impls REG_WIDTH will often be 32 regardless of YMM/ZMM.
>
> Then we should have
>
> #ifdef USE_WIDE_CHAR
> # define REG_WIDTH 32
> #else
> # define REG_WIDTH VEC_SIZE
> #endif
>

It may not be universal. Some wide-char impls may want
REG_WIDTH == 8/16 if they rely heavily on `inc` to do the zero test, or
for some reason or another use the full VEC_SIZE (as wcslen-evex512
currently does).

Also, I don't really see what we gain by giving up the granularity.
Either way, to specify a separate reg width the wchar impl will
need to define something else. It seems reasonable for that
something else to just be REG_WIDTH directly, as opposed to
USE_WIDE_CHAR.

What do you think?
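
To make the granularity point concrete, here is a small Python model of
the `#ifndef REG_WIDTH` default. It is a sketch only: `resolve_reg_width`
and `expand` are hypothetical names, and just two GPR rows from the
generator's table are included.

```python
# Model of the `#ifndef REG_WIDTH` default in reg-macros.h.
# `resolve_reg_width` and `expand` are hypothetical names for illustration.
REGISTERS = {"rax": ["rax", "eax", "ax", "al"],
             "rcx": ["rcx", "ecx", "cx", "cl"]}
MASK_EXT = ["b", "w", "d", "q"]


def resolve_reg_width(vec_size, reg_width=None):
    """Use the file's own REG_WIDTH if it defined one, else VEC_SIZE."""
    return vec_size if reg_width is None else reg_width


def expand(name, width):
    """Expand a GPR or mask-insn base name at `width` (8/16/32/64)."""
    i = {8: 0, 16: 1, 32: 2, 64: 3}[width]
    if name in REGISTERS:
        # Register lists are ordered widest first, as in the generator.
        return REGISTERS[name][3 - i]
    return name + MASK_EXT[i]


# A ZMM implementation that defines nothing gets the 64-bit names.
w = resolve_reg_width(64)
print(expand("kmov", w), expand("rax", w))
# A wide-char implementation that asks for 16 gets the 16-bit names.
w = resolve_reg_width(64, reg_width=16)
print(expand("kmov", w), expand("rax", w))
```

The point is that each file picks its width directly, without a second
macro deciding on its behalf.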
> > >
> > > > +#define VPASTER(x, y)  x##_##y
> > > > +#define VEVALUATOR(x, y)       VPASTER(x, y)
> > > > +
> > > > +#define VGPR_SZ(reg_name, reg_size)    VEVALUATOR(reg_name, reg_size)
> > > > +#define VKINSN_SZ(insn, reg_size)      VEVALUATOR(insn, reg_size)
> > > > +
> > > > +#define VGPR(reg_name) VGPR_SZ(reg_name, REG_WIDTH)
> > > > +#define VKINSN(mask_insn)      VKINSN_SZ(mask_insn, REG_WIDTH)
> > > > +
> > > > +#endif
> > > > diff --git a/sysdeps/x86_64/multiarch/scripts/gen-reg-macros.py b/sysdeps/x86_64/multiarch/scripts/gen-reg-macros.py
> > > > new file mode 100644
> > > > index 0000000000..c7296a8104
> > > > --- /dev/null
> > > > +++ b/sysdeps/x86_64/multiarch/scripts/gen-reg-macros.py
> > > > @@ -0,0 +1,123 @@
> > > > +#!/usr/bin/python3
> > > > +# Copyright (C) 2022 Free Software Foundation, Inc.
> > > > +# This file is part of the GNU C Library.
> > > > +#
> > > > +# The GNU C Library is free software; you can redistribute it and/or
> > > > +# modify it under the terms of the GNU Lesser General Public
> > > > +# License as published by the Free Software Foundation; either
> > > > +# version 2.1 of the License, or (at your option) any later version.
> > > > +#
> > > > +# The GNU C Library is distributed in the hope that it will be useful,
> > > > +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > > > +# Lesser General Public License for more details.
> > > > +#
> > > > +# You should have received a copy of the GNU Lesser General Public
> > > > +# License along with the GNU C Library; if not, see
> > > > +# <https://www.gnu.org/licenses/>.
> > > > +"""Generate macros for getting GPR name of a certain size
> > > > +
> > > > +Inputs: None
> > > > +Output: Prints header file to stdout
> > > > +
> > > > +API:
> > > > +    VGPR(reg_name)
> > > > +        - Get register name VEC_SIZE component of `reg_name`
> > > > +    VGPR_SZ(reg_name, reg_size)
> > > > +        - Get register name `reg_size` component of `reg_name`
> > > > +"""
> > > > +
> > > > +import sys
> > > > +import os
> > > > +from datetime import datetime
> > > > +
> > > > +registers = [["rax", "eax", "ax", "al"], ["rbx", "ebx", "bx", "bl"],
> > > > +             ["rcx", "ecx", "cx", "cl"], ["rdx", "edx", "dx", "dl"],
> > > > +             ["rbp", "ebp", "bp", "bpl"], ["rsp", "esp", "sp", "spl"],
> > > > +             ["rsi", "esi", "si", "sil"], ["rdi", "edi", "di", "dil"],
> > > > +             ["r8", "r8d", "r8w", "r8b"], ["r9", "r9d", "r9w", "r9b"],
> > > > +             ["r10", "r10d", "r10w", "r10b"], ["r11", "r11d", "r11w", "r11b"],
> > > > +             ["r12", "r12d", "r12w", "r12b"], ["r13", "r13d", "r13w", "r13b"],
> > > > +             ["r14", "r14d", "r14w", "r14b"], ["r15", "r15d", "r15w", "r15b"]]
> > > > +
> > > > +mask_insns = [
> > > > +    "kmov",
> > > > +    "kortest",
> > > > +    "kor",
> > > > +    "ktest",
> > > > +    "kand",
> > > > +    "kxor",
> > > > +    "knot",
> > > > +    "kxnor",
> > > > +]
> > > > +mask_insns_ext = ["b", "w", "d", "q"]
> > > > +
> > > > +cr = """
> > > > +   Copyright (C) {} Free Software Foundation, Inc.
> > > > +   This file is part of the GNU C Library.
> > > > +
> > > > +   The GNU C Library is free software; you can redistribute it and/or
> > > > +   modify it under the terms of the GNU Lesser General Public
> > > > +   License as published by the Free Software Foundation; either
> > > > +   version 2.1 of the License, or (at your option) any later version.
> > > > +
> > > > +   The GNU C Library is distributed in the hope that it will be useful,
> > > > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > > > +   Lesser General Public License for more details.
> > > > +
> > > > +   You should have received a copy of the GNU Lesser General Public
> > > > +   License along with the GNU C Library; if not, see
> > > > +   <https://www.gnu.org/licenses/>.  */
> > > > +"""
> > > > +
> > > > +print("/* This file was generated by: {}.".format(os.path.basename(
> > > > +    sys.argv[0])))
> > > > +print(cr.format(datetime.today().year))
> > > > +
> > > > +print("#ifndef _REG_MACROS_H")
> > > > +print("#define _REG_MACROS_H\t1")
> > > > +print("")
> > > > +for reg in registers:
> > > > +    for i in range(0, 4):
> > > > +        print("#define {}_{}\t{}".format(reg[0], 8 << i, reg[3 - i]))
> > > > +
> > > > +print("")
> > > > +for mask_insn in mask_insns:
> > > > +    for i in range(0, 4):
> > > > +        print("#define {}_{}\t{}{}".format(mask_insn, 8 << i, mask_insn,
> > > > +                                           mask_insns_ext[i]))
> > > > +for i in range(0, 3):
> > > > +    print("#define kunpack_{}\tkunpack{}{}".format(8 << i, mask_insns_ext[i],
> > > > +                                                   mask_insns_ext[i + 1]))
> > > > +mask_insns.append("kunpack")
> > > > +
> > > > +print("")
> > > > +print(
> > > > +    "/* Common API for accessing proper width GPR is V{upcase_GPR_name}.  */")
> > > > +for reg in registers:
> > > > +    print("#define V{}\tVGPR({})".format(reg[0].upper(), reg[0]))
> > > > +
> > > > +print("")
> > > > +
> > > > +print(
> > > > +    "/* Common API for accessing proper width mask insn is {upcase_mask_insn}.  */"
> > > > +)
> > > > +for mask_insn in mask_insns:
> > > > +    print("#define {} \tVKINSN({})".format(mask_insn.upper(), mask_insn))
> > > > +print("")
> > > > +
> > > > +print("#ifndef REG_WIDTH")
> > > > +print("# define REG_WIDTH VEC_SIZE")
> > > > +print("#endif")
> > > > +print("")
> > > > +print("#define VPASTER(x, y)\tx##_##y")
> > > > +print("#define VEVALUATOR(x, y)\tVPASTER(x, y)")
> > > > +print("")
> > > > +print("#define VGPR_SZ(reg_name, reg_size)\tVEVALUATOR(reg_name, reg_size)")
> > > > +print("#define VKINSN_SZ(insn, reg_size)\tVEVALUATOR(insn, reg_size)")
> > > > +print("")
> > > > +print("#define VGPR(reg_name)\tVGPR_SZ(reg_name, REG_WIDTH)")
> > > > +print("#define VKINSN(mask_insn)\tVKINSN_SZ(mask_insn, REG_WIDTH)")
> > > > +
> > > > +print("\n#endif")
> > > > --
> > > > 2.34.1
> > > >
> > >
> > >
> > > --
> > > H.J.
>
>
>
> --
> H.J.
  
H.J. Lu Oct. 14, 2022, 10:41 p.m. UTC | #5
On Fri, Oct 14, 2022 at 3:27 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
>
> On Fri, Oct 14, 2022 at 5:06 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> >
> >  On Fri, Oct 14, 2022 at 3:01 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> > >
> > > On Fri, Oct 14, 2022 at 4:28 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> > > >
> > > > On Fri, Oct 14, 2022 at 2:15 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> > > > >
> > > > > This is to make it easier to do things like:
> > > > > ```
> > > > > vpcmpb %VEC(0), %VEC(1), %k0
> > > > > kmov{d|q} %k0, %{eax|rax}
> > > > > test %{eax|rax}
> > > > > ```
> > > > >
> > > > > It adds macro s.t any GPR can get the proper width with:
> > > > >     `V{upper_case_GPR_name}`
> > > > >
> > > > > and any mask insn can get the proper width with:
> > > > >     `{mask_insn_without_postfix}V`
> > > > >
> > > > > This commit does not change libc.so
> > > > >
> > > > > Tested build on x86-64
> > > > > ---
> > > > >  sysdeps/x86_64/multiarch/reg-macros.h         | 166 ++++++++++++++++++
> > > > >  .../multiarch/scripts/gen-reg-macros.py       | 123 +++++++++++++
> > > > >  2 files changed, 289 insertions(+)
> > > > >  create mode 100644 sysdeps/x86_64/multiarch/reg-macros.h
> > > > >  create mode 100644 sysdeps/x86_64/multiarch/scripts/gen-reg-macros.py
> > > > >
> > > > > diff --git a/sysdeps/x86_64/multiarch/reg-macros.h b/sysdeps/x86_64/multiarch/reg-macros.h
> > > > > new file mode 100644
> > > > > index 0000000000..16168b6fda
> > > > > --- /dev/null
> > > > > +++ b/sysdeps/x86_64/multiarch/reg-macros.h
> > > > > @@ -0,0 +1,166 @@
> > > > > +/* This file was generated by: gen-reg-macros.py.
> > > > > +
> > > > > +   Copyright (C) 2022 Free Software Foundation, Inc.
> > > > > +   This file is part of the GNU C Library.
> > > > > +
> > > > > +   The GNU C Library is free software; you can redistribute it and/or
> > > > > +   modify it under the terms of the GNU Lesser General Public
> > > > > +   License as published by the Free Software Foundation; either
> > > > > +   version 2.1 of the License, or (at your option) any later version.
> > > > > +
> > > > > +   The GNU C Library is distributed in the hope that it will be useful,
> > > > > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > > > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > > > > +   Lesser General Public License for more details.
> > > > > +
> > > > > +   You should have received a copy of the GNU Lesser General Public
> > > > > +   License along with the GNU C Library; if not, see
> > > > > +   <https://www.gnu.org/licenses/>.  */
> > > > > +
> > > > > +#ifndef _REG_MACROS_H
> > > > > +#define _REG_MACROS_H  1
> > > > > +
> > > > > +#define rax_8  al
> > > > > +#define rax_16 ax
> > > > > +#define rax_32 eax
> > > > > +#define rax_64 rax
> > > > > +#define rbx_8  bl
> > > > > +#define rbx_16 bx
> > > > > +#define rbx_32 ebx
> > > > > +#define rbx_64 rbx
> > > > > +#define rcx_8  cl
> > > > > +#define rcx_16 cx
> > > > > +#define rcx_32 ecx
> > > > > +#define rcx_64 rcx
> > > > > +#define rdx_8  dl
> > > > > +#define rdx_16 dx
> > > > > +#define rdx_32 edx
> > > > > +#define rdx_64 rdx
> > > > > +#define rbp_8  bpl
> > > > > +#define rbp_16 bp
> > > > > +#define rbp_32 ebp
> > > > > +#define rbp_64 rbp
> > > > > +#define rsp_8  spl
> > > > > +#define rsp_16 sp
> > > > > +#define rsp_32 esp
> > > > > +#define rsp_64 rsp
> > > > > +#define rsi_8  sil
> > > > > +#define rsi_16 si
> > > > > +#define rsi_32 esi
> > > > > +#define rsi_64 rsi
> > > > > +#define rdi_8  dil
> > > > > +#define rdi_16 di
> > > > > +#define rdi_32 edi
> > > > > +#define rdi_64 rdi
> > > > > +#define r8_8   r8b
> > > > > +#define r8_16  r8w
> > > > > +#define r8_32  r8d
> > > > > +#define r8_64  r8
> > > > > +#define r9_8   r9b
> > > > > +#define r9_16  r9w
> > > > > +#define r9_32  r9d
> > > > > +#define r9_64  r9
> > > > > +#define r10_8  r10b
> > > > > +#define r10_16 r10w
> > > > > +#define r10_32 r10d
> > > > > +#define r10_64 r10
> > > > > +#define r11_8  r11b
> > > > > +#define r11_16 r11w
> > > > > +#define r11_32 r11d
> > > > > +#define r11_64 r11
> > > > > +#define r12_8  r12b
> > > > > +#define r12_16 r12w
> > > > > +#define r12_32 r12d
> > > > > +#define r12_64 r12
> > > > > +#define r13_8  r13b
> > > > > +#define r13_16 r13w
> > > > > +#define r13_32 r13d
> > > > > +#define r13_64 r13
> > > > > +#define r14_8  r14b
> > > > > +#define r14_16 r14w
> > > > > +#define r14_32 r14d
> > > > > +#define r14_64 r14
> > > > > +#define r15_8  r15b
> > > > > +#define r15_16 r15w
> > > > > +#define r15_32 r15d
> > > > > +#define r15_64 r15
> > > > > +
> > > > > +#define kmov_8 kmovb
> > > > > +#define kmov_16        kmovw
> > > > > +#define kmov_32        kmovd
> > > > > +#define kmov_64        kmovq
> > > > > +#define kortest_8      kortestb
> > > > > +#define kortest_16     kortestw
> > > > > +#define kortest_32     kortestd
> > > > > +#define kortest_64     kortestq
> > > > > +#define kor_8  korb
> > > > > +#define kor_16 korw
> > > > > +#define kor_32 kord
> > > > > +#define kor_64 korq
> > > > > +#define ktest_8        ktestb
> > > > > +#define ktest_16       ktestw
> > > > > +#define ktest_32       ktestd
> > > > > +#define ktest_64       ktestq
> > > > > +#define kand_8 kandb
> > > > > +#define kand_16        kandw
> > > > > +#define kand_32        kandd
> > > > > +#define kand_64        kandq
> > > > > +#define kxor_8 kxorb
> > > > > +#define kxor_16        kxorw
> > > > > +#define kxor_32        kxord
> > > > > +#define kxor_64        kxorq
> > > > > +#define knot_8 knotb
> > > > > +#define knot_16        knotw
> > > > > +#define knot_32        knotd
> > > > > +#define knot_64        knotq
> > > > > +#define kxnor_8        kxnorb
> > > > > +#define kxnor_16       kxnorw
> > > > > +#define kxnor_32       kxnord
> > > > > +#define kxnor_64       kxnorq
> > > > > +#define kunpack_8      kunpackbw
> > > > > +#define kunpack_16     kunpackwd
> > > > > +#define kunpack_32     kunpackdq
> > > > > +
> > > > > +/* Common API for accessing proper width GPR is V{upcase_GPR_name}.  */
> > > > > +#define VRAX   VGPR(rax)
> > > > > +#define VRBX   VGPR(rbx)
> > > > > +#define VRCX   VGPR(rcx)
> > > > > +#define VRDX   VGPR(rdx)
> > > > > +#define VRBP   VGPR(rbp)
> > > > > +#define VRSP   VGPR(rsp)
> > > > > +#define VRSI   VGPR(rsi)
> > > > > +#define VRDI   VGPR(rdi)
> > > > > +#define VR8    VGPR(r8)
> > > > > +#define VR9    VGPR(r9)
> > > > > +#define VR10   VGPR(r10)
> > > > > +#define VR11   VGPR(r11)
> > > > > +#define VR12   VGPR(r12)
> > > > > +#define VR13   VGPR(r13)
> > > > > +#define VR14   VGPR(r14)
> > > > > +#define VR15   VGPR(r15)
> > > > > +
> > > > > +/* Common API for accessing proper width mask insn is {upcase_mask_insn}.  */
> > > > > +#define KMOV   VKINSN(kmov)
> > > > > +#define KORTEST        VKINSN(kortest)
> > > > > +#define KOR    VKINSN(kor)
> > > > > +#define KTEST  VKINSN(ktest)
> > > > > +#define KAND   VKINSN(kand)
> > > > > +#define KXOR   VKINSN(kxor)
> > > > > +#define KNOT   VKINSN(knot)
> > > > > +#define KXNOR  VKINSN(kxnor)
> > > > > +#define KUNPACK        VKINSN(kunpack)
> > > > > +
> > > > > +#ifndef REG_WIDTH
> > > > > +# define REG_WIDTH VEC_SIZE
> > > > > +#endif
> > > >
> > > > Which files will define REG_WIDTH?  What values will it be for
> > > > YMM and ZMM vectors?
> > >
> > > for non-wide char evex or avx2/sse2 REG_WIDTH = VEC_SIZE
> > > so for YMM REG_WIDTH = 32, for ZMM REG_WIDTH = 64.
> > >
> > > For wchar impls REG_WIDTH will often be 32 regardless of YMM/ZMM.
> >
> > Then we should have
> >
> > #ifdef USE_WIDE_CHAR
> > # define REG_WIDTH 32
> > #else
> > # define REG_WIDTH VEC_SIZE
> > #endif
> >
>
> It may not be universal. Some wide-char impls may want
> REG_WIDTH == 8/16 if they rely heavily on `inc` to do the zero test, or

I think we can define a macro for it if needed.

> for some reason or another use the full VEC_SIZE (as wcslen-evex512
> currently does).

Will REG_WIDTH == 32 work for wcslen-evex512?
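
As a rough check of the arithmetic behind this question: a vector compare
produces one mask bit per element, so a 64-byte vector gives 64 mask bits
with byte elements and 16 with 4-byte wchar_t elements. The Python sketch
below only does that arithmetic. `mask_bits` and `min_reg_width` are
hypothetical helpers, and it says nothing about what the actual
wcslen-evex512 code does with the mask afterwards.

```python
# One mask bit per compared element; names here are illustrative only.
def mask_bits(vec_size, char_size):
    """Number of mask bits produced by comparing one full vector."""
    return vec_size // char_size


def min_reg_width(bits):
    """Smallest GPR width (8/16/32/64) that holds `bits` mask bits."""
    for width in (8, 16, 32, 64):
        if bits <= width:
            return width
    raise ValueError("mask does not fit in a GPR")


for vec_size, char_size in ((32, 1), (64, 1), (32, 4), (64, 4)):
    bits = mask_bits(vec_size, char_size)
    print("VEC_SIZE={} CHAR_SIZE={}: {} mask bits, needs >= {}-bit GPR".format(
        vec_size, char_size, bits, min_reg_width(bits)))
```

Under the one-bit-per-element assumption a single wide-char mask fits in
a 32-bit register. An implementation that compares bytes, or that
combines several masks into one register, would need more.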

> Also, I don't really see what we gain by giving up the granularity.
> Either way, to specify a separate reg width the wchar impl will
> need to define something else. It seems reasonable for that
> something else to just be REG_WIDTH directly, as opposed to
> USE_WIDE_CHAR.
>
> What do you think?
> > > >
> > > > > +#define VPASTER(x, y)  x##_##y
> > > > > +#define VEVALUATOR(x, y)       VPASTER(x, y)
> > > > > +
> > > > > +#define VGPR_SZ(reg_name, reg_size)    VEVALUATOR(reg_name, reg_size)
> > > > > +#define VKINSN_SZ(insn, reg_size)      VEVALUATOR(insn, reg_size)
> > > > > +
> > > > > +#define VGPR(reg_name) VGPR_SZ(reg_name, REG_WIDTH)
> > > > > +#define VKINSN(mask_insn)      VKINSN_SZ(mask_insn, REG_WIDTH)
> > > > > +
> > > > > +#endif
> > > > > diff --git a/sysdeps/x86_64/multiarch/scripts/gen-reg-macros.py b/sysdeps/x86_64/multiarch/scripts/gen-reg-macros.py
> > > > > new file mode 100644
> > > > > index 0000000000..c7296a8104
> > > > > --- /dev/null
> > > > > +++ b/sysdeps/x86_64/multiarch/scripts/gen-reg-macros.py
> > > > > @@ -0,0 +1,123 @@
> > > > > +#!/usr/bin/python3
> > > > > +# Copyright (C) 2022 Free Software Foundation, Inc.
> > > > > +# This file is part of the GNU C Library.
> > > > > +#
> > > > > +# The GNU C Library is free software; you can redistribute it and/or
> > > > > +# modify it under the terms of the GNU Lesser General Public
> > > > > +# License as published by the Free Software Foundation; either
> > > > > +# version 2.1 of the License, or (at your option) any later version.
> > > > > +#
> > > > > +# The GNU C Library is distributed in the hope that it will be useful,
> > > > > +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > > > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > > > > +# Lesser General Public License for more details.
> > > > > +#
> > > > > +# You should have received a copy of the GNU Lesser General Public
> > > > > +# License along with the GNU C Library; if not, see
> > > > > +# <https://www.gnu.org/licenses/>.
> > > > > +"""Generate macros for getting GPR name of a certain size
> > > > > +
> > > > > +Inputs: None
> > > > > +Output: Prints header file to stdout
> > > > > +
> > > > > +API:
> > > > > +    VGPR(reg_name)
> > > > > +        - Get register name VEC_SIZE component of `reg_name`
> > > > > +    VGPR_SZ(reg_name, reg_size)
> > > > > +        - Get register name `reg_size` component of `reg_name`
> > > > > +"""
> > > > > +
> > > > > +import sys
> > > > > +import os
> > > > > +from datetime import datetime
> > > > > +
> > > > > +registers = [["rax", "eax", "ax", "al"], ["rbx", "ebx", "bx", "bl"],
> > > > > +             ["rcx", "ecx", "cx", "cl"], ["rdx", "edx", "dx", "dl"],
> > > > > +             ["rbp", "ebp", "bp", "bpl"], ["rsp", "esp", "sp", "spl"],
> > > > > +             ["rsi", "esi", "si", "sil"], ["rdi", "edi", "di", "dil"],
> > > > > +             ["r8", "r8d", "r8w", "r8b"], ["r9", "r9d", "r9w", "r9b"],
> > > > > +             ["r10", "r10d", "r10w", "r10b"], ["r11", "r11d", "r11w", "r11b"],
> > > > > +             ["r12", "r12d", "r12w", "r12b"], ["r13", "r13d", "r13w", "r13b"],
> > > > > +             ["r14", "r14d", "r14w", "r14b"], ["r15", "r15d", "r15w", "r15b"]]
> > > > > +
> > > > > +mask_insns = [
> > > > > +    "kmov",
> > > > > +    "kortest",
> > > > > +    "kor",
> > > > > +    "ktest",
> > > > > +    "kand",
> > > > > +    "kxor",
> > > > > +    "knot",
> > > > > +    "kxnor",
> > > > > +]
> > > > > +mask_insns_ext = ["b", "w", "d", "q"]
> > > > > +
> > > > > +cr = """
> > > > > +   Copyright (C) {} Free Software Foundation, Inc.
> > > > > +   This file is part of the GNU C Library.
> > > > > +
> > > > > +   The GNU C Library is free software; you can redistribute it and/or
> > > > > +   modify it under the terms of the GNU Lesser General Public
> > > > > +   License as published by the Free Software Foundation; either
> > > > > +   version 2.1 of the License, or (at your option) any later version.
> > > > > +
> > > > > +   The GNU C Library is distributed in the hope that it will be useful,
> > > > > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > > > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > > > > +   Lesser General Public License for more details.
> > > > > +
> > > > > +   You should have received a copy of the GNU Lesser General Public
> > > > > +   License along with the GNU C Library; if not, see
> > > > > +   <https://www.gnu.org/licenses/>.  */
> > > > > +"""
> > > > > +
> > > > > +print("/* This file was generated by: {}.".format(os.path.basename(
> > > > > +    sys.argv[0])))
> > > > > +print(cr.format(datetime.today().year))
> > > > > +
> > > > > +print("#ifndef _REG_MACROS_H")
> > > > > +print("#define _REG_MACROS_H\t1")
> > > > > +print("")
> > > > > +for reg in registers:
> > > > > +    for i in range(0, 4):
> > > > > +        print("#define {}_{}\t{}".format(reg[0], 8 << i, reg[3 - i]))
> > > > > +
> > > > > +print("")
> > > > > +for mask_insn in mask_insns:
> > > > > +    for i in range(0, 4):
> > > > > +        print("#define {}_{}\t{}{}".format(mask_insn, 8 << i, mask_insn,
> > > > > +                                           mask_insns_ext[i]))
> > > > > +for i in range(0, 3):
> > > > > +    print("#define kunpack_{}\tkunpack{}{}".format(8 << i, mask_insns_ext[i],
> > > > > +                                                   mask_insns_ext[i + 1]))
> > > > > +mask_insns.append("kunpack")
> > > > > +
> > > > > +print("")
> > > > > +print(
> > > > > +    "/* Common API for accessing proper width GPR is V{upcase_GPR_name}.  */")
> > > > > +for reg in registers:
> > > > > +    print("#define V{}\tVGPR({})".format(reg[0].upper(), reg[0]))
> > > > > +
> > > > > +print("")
> > > > > +
> > > > > +print(
> > > > > +    "/* Common API for accessing proper width mask insn is {upcase_mask_insn}.  */"
> > > > > +)
> > > > > +for mask_insn in mask_insns:
> > > > > +    print("#define {} \tVKINSN({})".format(mask_insn.upper(), mask_insn))
> > > > > +print("")
> > > > > +
> > > > > +print("#ifndef REG_WIDTH")
> > > > > +print("# define REG_WIDTH VEC_SIZE")
> > > > > +print("#endif")
> > > > > +print("")
> > > > > +print("#define VPASTER(x, y)\tx##_##y")
> > > > > +print("#define VEVALUATOR(x, y)\tVPASTER(x, y)")
> > > > > +print("")
> > > > > +print("#define VGPR_SZ(reg_name, reg_size)\tVEVALUATOR(reg_name, reg_size)")
> > > > > +print("#define VKINSN_SZ(insn, reg_size)\tVEVALUATOR(insn, reg_size)")
> > > > > +print("")
> > > > > +print("#define VGPR(reg_name)\tVGPR_SZ(reg_name, REG_WIDTH)")
> > > > > +print("#define VKINSN(mask_insn)\tVKINSN_SZ(mask_insn, REG_WIDTH)")
> > > > > +
> > > > > +print("\n#endif")
> > > > > --
> > > > > 2.34.1
> > > > >
> > > >
> > > >
> > > > --
> > > > H.J.
> >
> >
> >
> > --
> > H.J.
  
Noah Goldstein Oct. 14, 2022, 11:15 p.m. UTC | #6
On Fri, Oct 14, 2022 at 5:41 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Fri, Oct 14, 2022 at 3:27 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> >
> > On Fri, Oct 14, 2022 at 5:06 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> > >
> > >  On Fri, Oct 14, 2022 at 3:01 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> > > >
> > > > On Fri, Oct 14, 2022 at 4:28 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> > > > >
> > > > > On Fri, Oct 14, 2022 at 2:15 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> > > > > >
> > > > > > > This is to make it easier to do things like:
> > > > > > ```
> > > > > > vpcmpb %VEC(0), %VEC(1), %k0
> > > > > > kmov{d|q} %k0, %{eax|rax}
> > > > > > test %{eax|rax}
> > > > > > ```
> > > > > >
> > > > > > It adds macro s.t any GPR can get the proper width with:
> > > > > >     `V{upper_case_GPR_name}`
> > > > > >
> > > > > > and any mask insn can get the proper width with:
> > > > > >     `{mask_insn_without_postfix}V`
> > > > > >
> > > > > > This commit does not change libc.so
> > > > > >
> > > > > > Tested build on x86-64
> > > > > > ---
> > > > > >  sysdeps/x86_64/multiarch/reg-macros.h         | 166 ++++++++++++++++++
> > > > > >  .../multiarch/scripts/gen-reg-macros.py       | 123 +++++++++++++
> > > > > >  2 files changed, 289 insertions(+)
> > > > > >  create mode 100644 sysdeps/x86_64/multiarch/reg-macros.h
> > > > > >  create mode 100644 sysdeps/x86_64/multiarch/scripts/gen-reg-macros.py
> > > > > >
> > > > > > diff --git a/sysdeps/x86_64/multiarch/reg-macros.h b/sysdeps/x86_64/multiarch/reg-macros.h
> > > > > > new file mode 100644
> > > > > > index 0000000000..16168b6fda
> > > > > > --- /dev/null
> > > > > > +++ b/sysdeps/x86_64/multiarch/reg-macros.h
> > > > > > @@ -0,0 +1,166 @@
> > > > > > +/* This file was generated by: gen-reg-macros.py.
> > > > > > +
> > > > > > +   Copyright (C) 2022 Free Software Foundation, Inc.
> > > > > > +   This file is part of the GNU C Library.
> > > > > > +
> > > > > > +   The GNU C Library is free software; you can redistribute it and/or
> > > > > > +   modify it under the terms of the GNU Lesser General Public
> > > > > > +   License as published by the Free Software Foundation; either
> > > > > > +   version 2.1 of the License, or (at your option) any later version.
> > > > > > +
> > > > > > +   The GNU C Library is distributed in the hope that it will be useful,
> > > > > > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > > > > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > > > > > +   Lesser General Public License for more details.
> > > > > > +
> > > > > > +   You should have received a copy of the GNU Lesser General Public
> > > > > > +   License along with the GNU C Library; if not, see
> > > > > > +   <https://www.gnu.org/licenses/>.  */
> > > > > > +
> > > > > > +#ifndef _REG_MACROS_H
> > > > > > +#define _REG_MACROS_H  1
> > > > > > +
> > > > > > +#define rax_8  al
> > > > > > +#define rax_16 ax
> > > > > > +#define rax_32 eax
> > > > > > +#define rax_64 rax
> > > > > > +#define rbx_8  bl
> > > > > > +#define rbx_16 bx
> > > > > > +#define rbx_32 ebx
> > > > > > +#define rbx_64 rbx
> > > > > > +#define rcx_8  cl
> > > > > > +#define rcx_16 cx
> > > > > > +#define rcx_32 ecx
> > > > > > +#define rcx_64 rcx
> > > > > > +#define rdx_8  dl
> > > > > > +#define rdx_16 dx
> > > > > > +#define rdx_32 edx
> > > > > > +#define rdx_64 rdx
> > > > > > +#define rbp_8  bpl
> > > > > > +#define rbp_16 bp
> > > > > > +#define rbp_32 ebp
> > > > > > +#define rbp_64 rbp
> > > > > > +#define rsp_8  spl
> > > > > > +#define rsp_16 sp
> > > > > > +#define rsp_32 esp
> > > > > > +#define rsp_64 rsp
> > > > > > +#define rsi_8  sil
> > > > > > +#define rsi_16 si
> > > > > > +#define rsi_32 esi
> > > > > > +#define rsi_64 rsi
> > > > > > +#define rdi_8  dil
> > > > > > +#define rdi_16 di
> > > > > > +#define rdi_32 edi
> > > > > > +#define rdi_64 rdi
> > > > > > +#define r8_8   r8b
> > > > > > +#define r8_16  r8w
> > > > > > +#define r8_32  r8d
> > > > > > +#define r8_64  r8
> > > > > > +#define r9_8   r9b
> > > > > > +#define r9_16  r9w
> > > > > > +#define r9_32  r9d
> > > > > > +#define r9_64  r9
> > > > > > +#define r10_8  r10b
> > > > > > +#define r10_16 r10w
> > > > > > +#define r10_32 r10d
> > > > > > +#define r10_64 r10
> > > > > > +#define r11_8  r11b
> > > > > > +#define r11_16 r11w
> > > > > > +#define r11_32 r11d
> > > > > > +#define r11_64 r11
> > > > > > +#define r12_8  r12b
> > > > > > +#define r12_16 r12w
> > > > > > +#define r12_32 r12d
> > > > > > +#define r12_64 r12
> > > > > > +#define r13_8  r13b
> > > > > > +#define r13_16 r13w
> > > > > > +#define r13_32 r13d
> > > > > > +#define r13_64 r13
> > > > > > +#define r14_8  r14b
> > > > > > +#define r14_16 r14w
> > > > > > +#define r14_32 r14d
> > > > > > +#define r14_64 r14
> > > > > > +#define r15_8  r15b
> > > > > > +#define r15_16 r15w
> > > > > > +#define r15_32 r15d
> > > > > > +#define r15_64 r15
> > > > > > +
> > > > > > +#define kmov_8 kmovb
> > > > > > +#define kmov_16        kmovw
> > > > > > +#define kmov_32        kmovd
> > > > > > +#define kmov_64        kmovq
> > > > > > +#define kortest_8      kortestb
> > > > > > +#define kortest_16     kortestw
> > > > > > +#define kortest_32     kortestd
> > > > > > +#define kortest_64     kortestq
> > > > > > +#define kor_8  korb
> > > > > > +#define kor_16 korw
> > > > > > +#define kor_32 kord
> > > > > > +#define kor_64 korq
> > > > > > +#define ktest_8        ktestb
> > > > > > +#define ktest_16       ktestw
> > > > > > +#define ktest_32       ktestd
> > > > > > +#define ktest_64       ktestq
> > > > > > +#define kand_8 kandb
> > > > > > +#define kand_16        kandw
> > > > > > +#define kand_32        kandd
> > > > > > +#define kand_64        kandq
> > > > > > +#define kxor_8 kxorb
> > > > > > +#define kxor_16        kxorw
> > > > > > +#define kxor_32        kxord
> > > > > > +#define kxor_64        kxorq
> > > > > > +#define knot_8 knotb
> > > > > > +#define knot_16        knotw
> > > > > > +#define knot_32        knotd
> > > > > > +#define knot_64        knotq
> > > > > > +#define kxnor_8        kxnorb
> > > > > > +#define kxnor_16       kxnorw
> > > > > > +#define kxnor_32       kxnord
> > > > > > +#define kxnor_64       kxnorq
> > > > > > +#define kunpack_8      kunpackbw
> > > > > > +#define kunpack_16     kunpackwd
> > > > > > +#define kunpack_32     kunpackdq
> > > > > > +
> > > > > > +/* Common API for accessing proper width GPR is V{upcase_GPR_name}.  */
> > > > > > +#define VRAX   VGPR(rax)
> > > > > > +#define VRBX   VGPR(rbx)
> > > > > > +#define VRCX   VGPR(rcx)
> > > > > > +#define VRDX   VGPR(rdx)
> > > > > > +#define VRBP   VGPR(rbp)
> > > > > > +#define VRSP   VGPR(rsp)
> > > > > > +#define VRSI   VGPR(rsi)
> > > > > > +#define VRDI   VGPR(rdi)
> > > > > > +#define VR8    VGPR(r8)
> > > > > > +#define VR9    VGPR(r9)
> > > > > > +#define VR10   VGPR(r10)
> > > > > > +#define VR11   VGPR(r11)
> > > > > > +#define VR12   VGPR(r12)
> > > > > > +#define VR13   VGPR(r13)
> > > > > > +#define VR14   VGPR(r14)
> > > > > > +#define VR15   VGPR(r15)
> > > > > > +
> > > > > > +/* Common API for accessing proper width mask insn is {upcase_mask_insn}.  */
> > > > > > +#define KMOV   VKINSN(kmov)
> > > > > > +#define KORTEST        VKINSN(kortest)
> > > > > > +#define KOR    VKINSN(kor)
> > > > > > +#define KTEST  VKINSN(ktest)
> > > > > > +#define KAND   VKINSN(kand)
> > > > > > +#define KXOR   VKINSN(kxor)
> > > > > > +#define KNOT   VKINSN(knot)
> > > > > > +#define KXNOR  VKINSN(kxnor)
> > > > > > +#define KUNPACK        VKINSN(kunpack)
> > > > > > +
> > > > > > +#ifndef REG_WIDTH
> > > > > > +# define REG_WIDTH VEC_SIZE
> > > > > > +#endif
> > > > >
> > > > > Which files will define REG_WIDTH?  What values will it be for
> > > > > YMM and ZMM vectors?
> > > >
> > > > for non-wide char evex or avx2/sse2 REG_WIDTH = VEC_SIZE
> > > > so for YMM REG_WIDTH = 32, for ZMM REG_WIDTH = 64.
> > > >
> > > > For wchar impls REG_WIDTH will often be 32 irrespective of YMM/ZMM.
> > >
> > > Then we should have
> > >
> > > #ifdef USE_WIDE_CHAR
> > > # define REG_WIDTH 32
> > > #else
> > > # define REG_WIDTH VEC_SIZE
> > > #endif
> > >
> >
> > It may not be universal. It may be that some wide-char impls will want
> > REG_WIDTH == 8/16 if they rely heavily on `inc` to do zero test or
>
> I think we can define a macro for it if needed.

We can, but don't you think just using REG_WIDTH is more direct?
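
To make concrete what a given REG_WIDTH selects, here is a minimal Python sketch that models the token pasting done by VGPR_SZ and VKINSN_SZ in the generated header. It is an illustration only and not part of the patch: the function names `vgpr_sz` and `vkinsn_sz` and the trimmed register table are hypothetical stand-ins for the C macros and the full `registers` list.

```python
# Hypothetical model of the reg-macros.h expansion; not part of the patch.
# Each register lists its aliases in the same order as gen-reg-macros.py:
# 64-bit, 32-bit, 16-bit, 8-bit.
REGISTERS = {
    "rax": ["rax", "eax", "ax", "al"],
    "rcx": ["rcx", "ecx", "cx", "cl"],
    "r8": ["r8", "r8d", "r8w", "r8b"],
}
MASK_INSN_EXT = ["b", "w", "d", "q"]
WIDTHS = [8, 16, 32, 64]


def vgpr_sz(reg_name, reg_size):
    """Model VGPR_SZ(reg_name, reg_size): pick the reg_size-bit alias."""
    aliases = REGISTERS[reg_name]
    return aliases[3 - WIDTHS.index(reg_size)]


def vkinsn_sz(insn, reg_size):
    """Model VKINSN_SZ(insn, reg_size): append the width suffix."""
    return insn + MASK_INSN_EXT[WIDTHS.index(reg_size)]


if __name__ == "__main__":
    # REG_WIDTH values mentioned in the thread: 32 (YMM or wchar) and 64 (ZMM).
    for reg_width in (32, 64):
        print(reg_width, vkinsn_sz("kmov", reg_width), vgpr_sz("rax", reg_width))
```

With REG_WIDTH at 32 this yields the `kmovd`/`eax` pair from the commit message, and at 64 the `kmovq`/`rax` pair, which is the whole effect of overriding the value in a given file.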

>
> > for some reason or another uses the full VEC_SIZE (as wcslen-evex512
> > currently does).
>
> Will REG_WIDTH == 32 work for wcslen-evex512?
>

I believe so, but I am trying to make these patches zero-effect. I think a
separate patch to actually make substantive changes makes more sense.
> > Also don't really see what it saves to give up the granularity.
> > Either way to specify a separate reg width the wchar impl will
> > need to define something else. Seems reasonable for that
> > something else to just be REG_WIDTH directly as opposed to
> > USE_WIDE_CHAR.
> >
> > What do you think?
> > > > >
> > > > > > +#define VPASTER(x, y)  x##_##y
> > > > > > +#define VEVALUATOR(x, y)       VPASTER(x, y)
> > > > > > +
> > > > > > +#define VGPR_SZ(reg_name, reg_size)    VEVALUATOR(reg_name, reg_size)
> > > > > > +#define VKINSN_SZ(insn, reg_size)      VEVALUATOR(insn, reg_size)
> > > > > > +
> > > > > > +#define VGPR(reg_name) VGPR_SZ(reg_name, REG_WIDTH)
> > > > > > +#define VKINSN(mask_insn)      VKINSN_SZ(mask_insn, REG_WIDTH)
> > > > > > +
> > > > > > +#endif
> > > > > > diff --git a/sysdeps/x86_64/multiarch/scripts/gen-reg-macros.py b/sysdeps/x86_64/multiarch/scripts/gen-reg-macros.py
> > > > > > new file mode 100644
> > > > > > index 0000000000..c7296a8104
> > > > > > --- /dev/null
> > > > > > +++ b/sysdeps/x86_64/multiarch/scripts/gen-reg-macros.py
> > > > > > @@ -0,0 +1,123 @@
> > > > > > +#!/usr/bin/python3
> > > > > > +# Copyright (C) 2022 Free Software Foundation, Inc.
> > > > > > +# This file is part of the GNU C Library.
> > > > > > +#
> > > > > > +# The GNU C Library is free software; you can redistribute it and/or
> > > > > > +# modify it under the terms of the GNU Lesser General Public
> > > > > > +# License as published by the Free Software Foundation; either
> > > > > > +# version 2.1 of the License, or (at your option) any later version.
> > > > > > +#
> > > > > > +# The GNU C Library is distributed in the hope that it will be useful,
> > > > > > +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > > > > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > > > > > +# Lesser General Public License for more details.
> > > > > > +#
> > > > > > +# You should have received a copy of the GNU Lesser General Public
> > > > > > +# License along with the GNU C Library; if not, see
> > > > > > +# <https://www.gnu.org/licenses/>.
> > > > > > +"""Generate macros for getting GPR name of a certain size
> > > > > > +
> > > > > > +Inputs: None
> > > > > > +Output: Prints header fill to stdout
> > > > > > +
> > > > > > +API:
> > > > > > +    VGPR(reg_name)
> > > > > > +        - Get register name VEC_SIZE component of `reg_name`
> > > > > > +    VGPR_SZ(reg_name, reg_size)
> > > > > > +        - Get register name `reg_size` component of `reg_name`
> > > > > > +"""
> > > > > > +
> > > > > > +import sys
> > > > > > +import os
> > > > > > +from datetime import datetime
> > > > > > +
> > > > > > +registers = [["rax", "eax", "ax", "al"], ["rbx", "ebx", "bx", "bl"],
> > > > > > +             ["rcx", "ecx", "cx", "cl"], ["rdx", "edx", "dx", "dl"],
> > > > > > +             ["rbp", "ebp", "bp", "bpl"], ["rsp", "esp", "sp", "spl"],
> > > > > > +             ["rsi", "esi", "si", "sil"], ["rdi", "edi", "di", "dil"],
> > > > > > +             ["r8", "r8d", "r8w", "r8b"], ["r9", "r9d", "r9w", "r9b"],
> > > > > > +             ["r10", "r10d", "r10w", "r10b"], ["r11", "r11d", "r11w", "r11b"],
> > > > > > +             ["r12", "r12d", "r12w", "r12b"], ["r13", "r13d", "r13w", "r13b"],
> > > > > > +             ["r14", "r14d", "r14w", "r14b"], ["r15", "r15d", "r15w", "r15b"]]
> > > > > > +
> > > > > > +mask_insns = [
> > > > > > +    "kmov",
> > > > > > +    "kortest",
> > > > > > +    "kor",
> > > > > > +    "ktest",
> > > > > > +    "kand",
> > > > > > +    "kxor",
> > > > > > +    "knot",
> > > > > > +    "kxnor",
> > > > > > +]
> > > > > > +mask_insns_ext = ["b", "w", "d", "q"]
> > > > > > +
> > > > > > +cr = """
> > > > > > +   Copyright (C) {} Free Software Foundation, Inc.
> > > > > > +   This file is part of the GNU C Library.
> > > > > > +
> > > > > > +   The GNU C Library is free software; you can redistribute it and/or
> > > > > > +   modify it under the terms of the GNU Lesser General Public
> > > > > > +   License as published by the Free Software Foundation; either
> > > > > > +   version 2.1 of the License, or (at your option) any later version.
> > > > > > +
> > > > > > +   The GNU C Library is distributed in the hope that it will be useful,
> > > > > > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > > > > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > > > > > +   Lesser General Public License for more details.
> > > > > > +
> > > > > > +   You should have received a copy of the GNU Lesser General Public
> > > > > > +   License along with the GNU C Library; if not, see
> > > > > > +   <https://www.gnu.org/licenses/>.  */
> > > > > > +"""
> > > > > > +
> > > > > > +print("/* This file was generated by: {}.".format(os.path.basename(
> > > > > > +    sys.argv[0])))
> > > > > > +print(cr.format(datetime.today().year))
> > > > > > +
> > > > > > +print("#ifndef _REG_MACROS_H")
> > > > > > +print("#define _REG_MACROS_H\t1")
> > > > > > +print("")
> > > > > > +for reg in registers:
> > > > > > +    for i in range(0, 4):
> > > > > > +        print("#define {}_{}\t{}".format(reg[0], 8 << i, reg[3 - i]))
> > > > > > +
> > > > > > +print("")
> > > > > > +for mask_insn in mask_insns:
> > > > > > +    for i in range(0, 4):
> > > > > > +        print("#define {}_{}\t{}{}".format(mask_insn, 8 << i, mask_insn,
> > > > > > +                                           mask_insns_ext[i]))
> > > > > > +for i in range(0, 3):
> > > > > > +    print("#define kunpack_{}\tkunpack{}{}".format(8 << i, mask_insns_ext[i],
> > > > > > +                                                   mask_insns_ext[i + 1]))
> > > > > > +mask_insns.append("kunpack")
> > > > > > +
> > > > > > +print("")
> > > > > > +print(
> > > > > > +    "/* Common API for accessing proper width GPR is V{upcase_GPR_name}.  */")
> > > > > > +for reg in registers:
> > > > > > +    print("#define V{}\tVGPR({})".format(reg[0].upper(), reg[0]))
> > > > > > +
> > > > > > +print("")
> > > > > > +
> > > > > > +print(
> > > > > > +    "/* Common API for accessing proper width mask insn is {upcase_mask_insn}.  */"
> > > > > > +)
> > > > > > +for mask_insn in mask_insns:
> > > > > > +    print("#define {} \tVKINSN({})".format(mask_insn.upper(), mask_insn))
> > > > > > +print("")
> > > > > > +
> > > > > > +print("#ifndef REG_WIDTH")
> > > > > > +print("# define REG_WIDTH VEC_SIZE")
> > > > > > +print("#endif")
> > > > > > +print("")
> > > > > > +print("#define VPASTER(x, y)\tx##_##y")
> > > > > > +print("#define VEVALUATOR(x, y)\tVPASTER(x, y)")
> > > > > > +print("")
> > > > > > +print("#define VGPR_SZ(reg_name, reg_size)\tVEVALUATOR(reg_name, reg_size)")
> > > > > > +print("#define VKINSN_SZ(insn, reg_size)\tVEVALUATOR(insn, reg_size)")
> > > > > > +print("")
> > > > > > +print("#define VGPR(reg_name)\tVGPR_SZ(reg_name, REG_WIDTH)")
> > > > > > +print("#define VKINSN(mask_insn)\tVKINSN_SZ(mask_insn, REG_WIDTH)")
> > > > > > +
> > > > > > +print("\n#endif")
> > > > > > --
> > > > > > 2.34.1
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > H.J.
> > >
> > >
> > >
> > > --
> > > H.J.
>
>
>
> --
> H.J.
  
H.J. Lu Oct. 14, 2022, 11:22 p.m. UTC | #7
On Fri, Oct 14, 2022 at 4:15 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
>
> On Fri, Oct 14, 2022 at 5:41 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> >
> > On Fri, Oct 14, 2022 at 3:27 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> > >
> > > On Fri, Oct 14, 2022 at 5:06 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> > > >
> > > >  On Fri, Oct 14, 2022 at 3:01 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> > > > >
> > > > > On Fri, Oct 14, 2022 at 4:28 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> > > > > >
> > > > > > On Fri, Oct 14, 2022 at 2:15 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> > > > > > >
> > > > > > > > This is to make it easier to do things like:
> > > > > > > ```
> > > > > > > vpcmpb %VEC(0), %VEC(1), %k0
> > > > > > > kmov{d|q} %k0, %{eax|rax}
> > > > > > > test %{eax|rax}
> > > > > > > ```
> > > > > > >
> > > > > > > It adds macro s.t any GPR can get the proper width with:
> > > > > > >     `V{upper_case_GPR_name}`
> > > > > > >
> > > > > > > and any mask insn can get the proper width with:
> > > > > > >     `{mask_insn_without_postfix}V`
> > > > > > >
> > > > > > > This commit does not change libc.so
> > > > > > >
> > > > > > > Tested build on x86-64
> > > > > > > ---
> > > > > > >  sysdeps/x86_64/multiarch/reg-macros.h         | 166 ++++++++++++++++++
> > > > > > >  .../multiarch/scripts/gen-reg-macros.py       | 123 +++++++++++++
> > > > > > >  2 files changed, 289 insertions(+)
> > > > > > >  create mode 100644 sysdeps/x86_64/multiarch/reg-macros.h
> > > > > > >  create mode 100644 sysdeps/x86_64/multiarch/scripts/gen-reg-macros.py
> > > > > > >
> > > > > > > diff --git a/sysdeps/x86_64/multiarch/reg-macros.h b/sysdeps/x86_64/multiarch/reg-macros.h
> > > > > > > new file mode 100644
> > > > > > > index 0000000000..16168b6fda
> > > > > > > --- /dev/null
> > > > > > > +++ b/sysdeps/x86_64/multiarch/reg-macros.h
> > > > > > > @@ -0,0 +1,166 @@
> > > > > > > +/* This file was generated by: gen-reg-macros.py.
> > > > > > > +
> > > > > > > +   Copyright (C) 2022 Free Software Foundation, Inc.
> > > > > > > +   This file is part of the GNU C Library.
> > > > > > > +
> > > > > > > +   The GNU C Library is free software; you can redistribute it and/or
> > > > > > > +   modify it under the terms of the GNU Lesser General Public
> > > > > > > +   License as published by the Free Software Foundation; either
> > > > > > > +   version 2.1 of the License, or (at your option) any later version.
> > > > > > > +
> > > > > > > +   The GNU C Library is distributed in the hope that it will be useful,
> > > > > > > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > > > > > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > > > > > > +   Lesser General Public License for more details.
> > > > > > > +
> > > > > > > +   You should have received a copy of the GNU Lesser General Public
> > > > > > > +   License along with the GNU C Library; if not, see
> > > > > > > +   <https://www.gnu.org/licenses/>.  */
> > > > > > > +
> > > > > > > +#ifndef _REG_MACROS_H
> > > > > > > +#define _REG_MACROS_H  1
> > > > > > > +
> > > > > > > +#define rax_8  al
> > > > > > > +#define rax_16 ax
> > > > > > > +#define rax_32 eax
> > > > > > > +#define rax_64 rax
> > > > > > > +#define rbx_8  bl
> > > > > > > +#define rbx_16 bx
> > > > > > > +#define rbx_32 ebx
> > > > > > > +#define rbx_64 rbx
> > > > > > > +#define rcx_8  cl
> > > > > > > +#define rcx_16 cx
> > > > > > > +#define rcx_32 ecx
> > > > > > > +#define rcx_64 rcx
> > > > > > > +#define rdx_8  dl
> > > > > > > +#define rdx_16 dx
> > > > > > > +#define rdx_32 edx
> > > > > > > +#define rdx_64 rdx
> > > > > > > +#define rbp_8  bpl
> > > > > > > +#define rbp_16 bp
> > > > > > > +#define rbp_32 ebp
> > > > > > > +#define rbp_64 rbp
> > > > > > > +#define rsp_8  spl
> > > > > > > +#define rsp_16 sp
> > > > > > > +#define rsp_32 esp
> > > > > > > +#define rsp_64 rsp
> > > > > > > +#define rsi_8  sil
> > > > > > > +#define rsi_16 si
> > > > > > > +#define rsi_32 esi
> > > > > > > +#define rsi_64 rsi
> > > > > > > +#define rdi_8  dil
> > > > > > > +#define rdi_16 di
> > > > > > > +#define rdi_32 edi
> > > > > > > +#define rdi_64 rdi
> > > > > > > +#define r8_8   r8b
> > > > > > > +#define r8_16  r8w
> > > > > > > +#define r8_32  r8d
> > > > > > > +#define r8_64  r8
> > > > > > > +#define r9_8   r9b
> > > > > > > +#define r9_16  r9w
> > > > > > > +#define r9_32  r9d
> > > > > > > +#define r9_64  r9
> > > > > > > +#define r10_8  r10b
> > > > > > > +#define r10_16 r10w
> > > > > > > +#define r10_32 r10d
> > > > > > > +#define r10_64 r10
> > > > > > > +#define r11_8  r11b
> > > > > > > +#define r11_16 r11w
> > > > > > > +#define r11_32 r11d
> > > > > > > +#define r11_64 r11
> > > > > > > +#define r12_8  r12b
> > > > > > > +#define r12_16 r12w
> > > > > > > +#define r12_32 r12d
> > > > > > > +#define r12_64 r12
> > > > > > > +#define r13_8  r13b
> > > > > > > +#define r13_16 r13w
> > > > > > > +#define r13_32 r13d
> > > > > > > +#define r13_64 r13
> > > > > > > +#define r14_8  r14b
> > > > > > > +#define r14_16 r14w
> > > > > > > +#define r14_32 r14d
> > > > > > > +#define r14_64 r14
> > > > > > > +#define r15_8  r15b
> > > > > > > +#define r15_16 r15w
> > > > > > > +#define r15_32 r15d
> > > > > > > +#define r15_64 r15
> > > > > > > +
> > > > > > > +#define kmov_8 kmovb
> > > > > > > +#define kmov_16        kmovw
> > > > > > > +#define kmov_32        kmovd
> > > > > > > +#define kmov_64        kmovq
> > > > > > > +#define kortest_8      kortestb
> > > > > > > +#define kortest_16     kortestw
> > > > > > > +#define kortest_32     kortestd
> > > > > > > +#define kortest_64     kortestq
> > > > > > > +#define kor_8  korb
> > > > > > > +#define kor_16 korw
> > > > > > > +#define kor_32 kord
> > > > > > > +#define kor_64 korq
> > > > > > > +#define ktest_8        ktestb
> > > > > > > +#define ktest_16       ktestw
> > > > > > > +#define ktest_32       ktestd
> > > > > > > +#define ktest_64       ktestq
> > > > > > > +#define kand_8 kandb
> > > > > > > +#define kand_16        kandw
> > > > > > > +#define kand_32        kandd
> > > > > > > +#define kand_64        kandq
> > > > > > > +#define kxor_8 kxorb
> > > > > > > +#define kxor_16        kxorw
> > > > > > > +#define kxor_32        kxord
> > > > > > > +#define kxor_64        kxorq
> > > > > > > +#define knot_8 knotb
> > > > > > > +#define knot_16        knotw
> > > > > > > +#define knot_32        knotd
> > > > > > > +#define knot_64        knotq
> > > > > > > +#define kxnor_8        kxnorb
> > > > > > > +#define kxnor_16       kxnorw
> > > > > > > +#define kxnor_32       kxnord
> > > > > > > +#define kxnor_64       kxnorq
> > > > > > > +#define kunpack_8      kunpackbw
> > > > > > > +#define kunpack_16     kunpackwd
> > > > > > > +#define kunpack_32     kunpackdq
> > > > > > > +
> > > > > > > +/* Common API for accessing proper width GPR is V{upcase_GPR_name}.  */
> > > > > > > +#define VRAX   VGPR(rax)
> > > > > > > +#define VRBX   VGPR(rbx)
> > > > > > > +#define VRCX   VGPR(rcx)
> > > > > > > +#define VRDX   VGPR(rdx)
> > > > > > > +#define VRBP   VGPR(rbp)
> > > > > > > +#define VRSP   VGPR(rsp)
> > > > > > > +#define VRSI   VGPR(rsi)
> > > > > > > +#define VRDI   VGPR(rdi)
> > > > > > > +#define VR8    VGPR(r8)
> > > > > > > +#define VR9    VGPR(r9)
> > > > > > > +#define VR10   VGPR(r10)
> > > > > > > +#define VR11   VGPR(r11)
> > > > > > > +#define VR12   VGPR(r12)
> > > > > > > +#define VR13   VGPR(r13)
> > > > > > > +#define VR14   VGPR(r14)
> > > > > > > +#define VR15   VGPR(r15)
> > > > > > > +
> > > > > > > +/* Common API for accessing proper width mask insn is {upcase_mask_insn}.  */
> > > > > > > +#define KMOV   VKINSN(kmov)
> > > > > > > +#define KORTEST        VKINSN(kortest)
> > > > > > > +#define KOR    VKINSN(kor)
> > > > > > > +#define KTEST  VKINSN(ktest)
> > > > > > > +#define KAND   VKINSN(kand)
> > > > > > > +#define KXOR   VKINSN(kxor)
> > > > > > > +#define KNOT   VKINSN(knot)
> > > > > > > +#define KXNOR  VKINSN(kxnor)
> > > > > > > +#define KUNPACK        VKINSN(kunpack)
> > > > > > > +
> > > > > > > +#ifndef REG_WIDTH
> > > > > > > +# define REG_WIDTH VEC_SIZE
> > > > > > > +#endif
> > > > > >
> > > > > > Which files will define REG_WIDTH?  What values will it be for
> > > > > > YMM and ZMM vectors?
> > > > >
> > > > > for non-wide char evex or avx2/sse2 REG_WIDTH = VEC_SIZE
> > > > > so for YMM REG_WIDTH = 32, for ZMM REG_WIDTH = 64.
> > > > >
> > > > > For wchar impls REG_WIDTH will often be 32 irrespective of YMM/ZMM.
> > > >
> > > > Then we should have
> > > >
> > > > #ifdef USE_WIDE_CHAR
> > > > # define REG_WIDTH 32
> > > > #else
> > > > # define REG_WIDTH VEC_SIZE
> > > > #endif
> > > >
> > >
> > > It may not be universal. It may be that some wide-char impls will want
> > > REG_WIDTH == 8/16 if they rely heavily on `inc` to do zero test or
> >
> > I think we can define a macro for it if needed.
>
> We can, but don't you think just using REG_WIDTH is more direct?

It is very likely that 8-bit/16-bit registers will be used only for specific
operations.  The majority of operations will be 32-bit.  Things like

#ifndef REG_WIDTH
# define REG_WIDTH VEC_SIZE
#endif

may lead to questions.
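
The two defaulting schemes under discussion can be compared side by side. The sketch below is a hypothetical Python model, not code from the patch: `reg_width_ifndef` stands in for the `#ifndef REG_WIDTH` fallback and `reg_width_wide_char` for the `#ifdef USE_WIDE_CHAR` proposal, and the case labels are made up for illustration.

```python
# Hypothetical model of the two REG_WIDTH defaulting schemes; not from the patch.

def reg_width_ifndef(vec_size, reg_width=None):
    """`#ifndef REG_WIDTH` scheme: a file may pre-define REG_WIDTH,
    otherwise it falls back to VEC_SIZE."""
    return vec_size if reg_width is None else reg_width


def reg_width_wide_char(vec_size, use_wide_char=False):
    """`#ifdef USE_WIDE_CHAR` scheme: wide-char files always get 32."""
    return 32 if use_wide_char else vec_size


# (label, VEC_SIZE, REG_WIDTH override or None, USE_WIDE_CHAR defined?)
CASES = [
    ("byte impl, YMM", 32, None, False),
    ("byte impl, ZMM", 64, None, False),
    ("wchar impl, ZMM, 32-bit GPRs", 64, 32, True),
    ("wchar impl, ZMM, full VEC_SIZE", 64, None, True),
]


def compare():
    """Return (label, ifndef result, wide-char result) for every case."""
    return [
        (label, reg_width_ifndef(vec, override), reg_width_wide_char(vec, wide))
        for label, vec, override, wide in CASES
    ]


if __name__ == "__main__":
    for label, a, b in compare():
        print("{:32} ifndef={:2} wide_char={:2}".format(label, a, b))
```

In this model the schemes agree whenever USE_WIDE_CHAR is undefined. They differ only in the last case, a wide-char file that keeps the full VEC_SIZE width, which is the situation the thread attributes to wcslen-evex512.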

> >
> > > for some reason or another uses the full VEC_SIZE (as wcslen-evex512
> > > currently does).
> >
> > Will REG_WIDTH == 32 work for wcslen-evex512?
> >
>
> I believe so, but I am trying to make these patches zero-effect. I think a
> separate patch to actually make substantive changes makes more sense.

USE_WIDE_CHAR is currently undefined, so there is no impact.

> > > Also don't really see what it saves to give up the granularity.
> > > Either way to specify a separate reg width the wchar impl will
> > > need to define something else. Seems reasonable for that
> > > something else to just be REG_WIDTH directly as opposed to
> > > USE_WIDE_CHAR.
> > >
> > > What do you think?
> > > > > >
> > > > > > > +#define VPASTER(x, y)  x##_##y
> > > > > > > +#define VEVALUATOR(x, y)       VPASTER(x, y)
> > > > > > > +
> > > > > > > +#define VGPR_SZ(reg_name, reg_size)    VEVALUATOR(reg_name, reg_size)
> > > > > > > +#define VKINSN_SZ(insn, reg_size)      VEVALUATOR(insn, reg_size)
> > > > > > > +
> > > > > > > +#define VGPR(reg_name) VGPR_SZ(reg_name, REG_WIDTH)
> > > > > > > +#define VKINSN(mask_insn)      VKINSN_SZ(mask_insn, REG_WIDTH)
> > > > > > > +
> > > > > > > +#endif
> > > > > > > diff --git a/sysdeps/x86_64/multiarch/scripts/gen-reg-macros.py b/sysdeps/x86_64/multiarch/scripts/gen-reg-macros.py
> > > > > > > new file mode 100644
> > > > > > > index 0000000000..c7296a8104
> > > > > > > --- /dev/null
> > > > > > > +++ b/sysdeps/x86_64/multiarch/scripts/gen-reg-macros.py
> > > > > > > @@ -0,0 +1,123 @@
> > > > > > > +#!/usr/bin/python3
> > > > > > > +# Copyright (C) 2022 Free Software Foundation, Inc.
> > > > > > > +# This file is part of the GNU C Library.
> > > > > > > +#
> > > > > > > +# The GNU C Library is free software; you can redistribute it and/or
> > > > > > > +# modify it under the terms of the GNU Lesser General Public
> > > > > > > +# License as published by the Free Software Foundation; either
> > > > > > > +# version 2.1 of the License, or (at your option) any later version.
> > > > > > > +#
> > > > > > > +# The GNU C Library is distributed in the hope that it will be useful,
> > > > > > > +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > > > > > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > > > > > > +# Lesser General Public License for more details.
> > > > > > > +#
> > > > > > > +# You should have received a copy of the GNU Lesser General Public
> > > > > > > +# License along with the GNU C Library; if not, see
> > > > > > > +# <https://www.gnu.org/licenses/>.
> > > > > > > +"""Generate macros for getting GPR name of a certain size
> > > > > > > +
> > > > > > > +Inputs: None
> > > > > > > +Output: Prints header fill to stdout
> > > > > > > +
> > > > > > > +API:
> > > > > > > +    VGPR(reg_name)
> > > > > > > +        - Get register name VEC_SIZE component of `reg_name`
> > > > > > > +    VGPR_SZ(reg_name, reg_size)
> > > > > > > +        - Get register name `reg_size` component of `reg_name`
> > > > > > > +"""
> > > > > > > +
> > > > > > > +import sys
> > > > > > > +import os
> > > > > > > +from datetime import datetime
> > > > > > > +
> > > > > > > +registers = [["rax", "eax", "ax", "al"], ["rbx", "ebx", "bx", "bl"],
> > > > > > > +             ["rcx", "ecx", "cx", "cl"], ["rdx", "edx", "dx", "dl"],
> > > > > > > +             ["rbp", "ebp", "bp", "bpl"], ["rsp", "esp", "sp", "spl"],
> > > > > > > +             ["rsi", "esi", "si", "sil"], ["rdi", "edi", "di", "dil"],
> > > > > > > +             ["r8", "r8d", "r8w", "r8b"], ["r9", "r9d", "r9w", "r9b"],
> > > > > > > +             ["r10", "r10d", "r10w", "r10b"], ["r11", "r11d", "r11w", "r11b"],
> > > > > > > +             ["r12", "r12d", "r12w", "r12b"], ["r13", "r13d", "r13w", "r13b"],
> > > > > > > +             ["r14", "r14d", "r14w", "r14b"], ["r15", "r15d", "r15w", "r15b"]]
> > > > > > > +
> > > > > > > +mask_insns = [
> > > > > > > +    "kmov",
> > > > > > > +    "kortest",
> > > > > > > +    "kor",
> > > > > > > +    "ktest",
> > > > > > > +    "kand",
> > > > > > > +    "kxor",
> > > > > > > +    "knot",
> > > > > > > +    "kxnor",
> > > > > > > +]
> > > > > > > +mask_insns_ext = ["b", "w", "d", "q"]
> > > > > > > +
> > > > > > > +cr = """
> > > > > > > +   Copyright (C) {} Free Software Foundation, Inc.
> > > > > > > +   This file is part of the GNU C Library.
> > > > > > > +
> > > > > > > +   The GNU C Library is free software; you can redistribute it and/or
> > > > > > > +   modify it under the terms of the GNU Lesser General Public
> > > > > > > +   License as published by the Free Software Foundation; either
> > > > > > > +   version 2.1 of the License, or (at your option) any later version.
> > > > > > > +
> > > > > > > +   The GNU C Library is distributed in the hope that it will be useful,
> > > > > > > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > > > > > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > > > > > > +   Lesser General Public License for more details.
> > > > > > > +
> > > > > > > +   You should have received a copy of the GNU Lesser General Public
> > > > > > > +   License along with the GNU C Library; if not, see
> > > > > > > +   <https://www.gnu.org/licenses/>.  */
> > > > > > > +"""
> > > > > > > +
> > > > > > > +print("/* This file was generated by: {}.".format(os.path.basename(
> > > > > > > +    sys.argv[0])))
> > > > > > > +print(cr.format(datetime.today().year))
> > > > > > > +
> > > > > > > +print("#ifndef _REG_MACROS_H")
> > > > > > > +print("#define _REG_MACROS_H\t1")
> > > > > > > +print("")
> > > > > > > +for reg in registers:
> > > > > > > +    for i in range(0, 4):
> > > > > > > +        print("#define {}_{}\t{}".format(reg[0], 8 << i, reg[3 - i]))
> > > > > > > +
> > > > > > > +print("")
> > > > > > > +for mask_insn in mask_insns:
> > > > > > > +    for i in range(0, 4):
> > > > > > > +        print("#define {}_{}\t{}{}".format(mask_insn, 8 << i, mask_insn,
> > > > > > > +                                           mask_insns_ext[i]))
> > > > > > > +for i in range(0, 3):
> > > > > > > +    print("#define kunpack_{}\tkunpack{}{}".format(8 << i, mask_insns_ext[i],
> > > > > > > +                                                   mask_insns_ext[i + 1]))
> > > > > > > +mask_insns.append("kunpack")
> > > > > > > +
> > > > > > > +print("")
> > > > > > > +print(
> > > > > > > +    "/* Common API for accessing proper width GPR is V{upcase_GPR_name}.  */")
> > > > > > > +for reg in registers:
> > > > > > > +    print("#define V{}\tVGPR({})".format(reg[0].upper(), reg[0]))
> > > > > > > +
> > > > > > > +print("")
> > > > > > > +
> > > > > > > +print(
> > > > > > > +    "/* Common API for accessing proper width mask insn is {upcase_mask_insn}.  */"
> > > > > > > +)
> > > > > > > +for mask_insn in mask_insns:
> > > > > > > +    print("#define {} \tVKINSN({})".format(mask_insn.upper(), mask_insn))
> > > > > > > +print("")
> > > > > > > +
> > > > > > > +print("#ifndef REG_WIDTH")
> > > > > > > +print("# define REG_WIDTH VEC_SIZE")
> > > > > > > +print("#endif")
> > > > > > > +print("")
> > > > > > > +print("#define VPASTER(x, y)\tx##_##y")
> > > > > > > +print("#define VEVALUATOR(x, y)\tVPASTER(x, y)")
> > > > > > > +print("")
> > > > > > > +print("#define VGPR_SZ(reg_name, reg_size)\tVEVALUATOR(reg_name, reg_size)")
> > > > > > > +print("#define VKINSN_SZ(insn, reg_size)\tVEVALUATOR(insn, reg_size)")
> > > > > > > +print("")
> > > > > > > +print("#define VGPR(reg_name)\tVGPR_SZ(reg_name, REG_WIDTH)")
> > > > > > > +print("#define VKINSN(mask_insn)\tVKINSN_SZ(mask_insn, REG_WIDTH)")
> > > > > > > +
> > > > > > > +print("\n#endif")
> > > > > > > --
> > > > > > > 2.34.1
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > H.J.
> > > >
> > > >
> > > >
> > > > --
> > > > H.J.
> >
> >
> >
> > --
> > H.J.
  
Noah Goldstein Oct. 14, 2022, 11:25 p.m. UTC | #8
On Fri, Oct 14, 2022 at 6:23 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
>  On Fri, Oct 14, 2022 at 4:15 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> >
> > On Fri, Oct 14, 2022 at 5:41 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> > >
> > > On Fri, Oct 14, 2022 at 3:27 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> > > >
> > > > On Fri, Oct 14, 2022 at 5:06 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> > > > >
> > > > >  On Fri, Oct 14, 2022 at 3:01 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> > > > > >
> > > > > > On Fri, Oct 14, 2022 at 4:28 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> > > > > > >
> > > > > > > On Fri, Oct 14, 2022 at 2:15 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > This is to make it easier to do things like:
> > > > > > > > ```
> > > > > > > > vpcmpb %VEC(0), %VEC(1), %k0
> > > > > > > > kmov{d|q} %k0, %{eax|rax}
> > > > > > > > test %{eax|rax}
> > > > > > > > ```
> > > > > > > >
> > > > > > > > > It adds macros so that any GPR can get the proper width with:
> > > > > > > > >     `V{upper_case_GPR_name}`
> > > > > > > > >
> > > > > > > > > and any mask insn can get the proper width with:
> > > > > > > > >     `{upper_case_mask_insn_without_postfix}`
> > > > > > > >
> > > > > > > > This commit does not change libc.so
> > > > > > > >
> > > > > > > > Tested build on x86-64
> > > > > > > > ---
> > > > > > > >  sysdeps/x86_64/multiarch/reg-macros.h         | 166 ++++++++++++++++++
> > > > > > > >  .../multiarch/scripts/gen-reg-macros.py       | 123 +++++++++++++
> > > > > > > >  2 files changed, 289 insertions(+)
> > > > > > > >  create mode 100644 sysdeps/x86_64/multiarch/reg-macros.h
> > > > > > > >  create mode 100644 sysdeps/x86_64/multiarch/scripts/gen-reg-macros.py
> > > > > > > >
> > > > > > > > diff --git a/sysdeps/x86_64/multiarch/reg-macros.h b/sysdeps/x86_64/multiarch/reg-macros.h
> > > > > > > > new file mode 100644
> > > > > > > > index 0000000000..16168b6fda
> > > > > > > > --- /dev/null
> > > > > > > > +++ b/sysdeps/x86_64/multiarch/reg-macros.h
> > > > > > > > @@ -0,0 +1,166 @@
> > > > > > > > +/* This file was generated by: gen-reg-macros.py.
> > > > > > > > +
> > > > > > > > +   Copyright (C) 2022 Free Software Foundation, Inc.
> > > > > > > > +   This file is part of the GNU C Library.
> > > > > > > > +
> > > > > > > > +   The GNU C Library is free software; you can redistribute it and/or
> > > > > > > > +   modify it under the terms of the GNU Lesser General Public
> > > > > > > > +   License as published by the Free Software Foundation; either
> > > > > > > > +   version 2.1 of the License, or (at your option) any later version.
> > > > > > > > +
> > > > > > > > +   The GNU C Library is distributed in the hope that it will be useful,
> > > > > > > > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > > > > > > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > > > > > > > +   Lesser General Public License for more details.
> > > > > > > > +
> > > > > > > > +   You should have received a copy of the GNU Lesser General Public
> > > > > > > > +   License along with the GNU C Library; if not, see
> > > > > > > > +   <https://www.gnu.org/licenses/>.  */
> > > > > > > > +
> > > > > > > > +#ifndef _REG_MACROS_H
> > > > > > > > +#define _REG_MACROS_H  1
> > > > > > > > +
> > > > > > > > +#define rax_8  al
> > > > > > > > +#define rax_16 ax
> > > > > > > > +#define rax_32 eax
> > > > > > > > +#define rax_64 rax
> > > > > > > > +#define rbx_8  bl
> > > > > > > > +#define rbx_16 bx
> > > > > > > > +#define rbx_32 ebx
> > > > > > > > +#define rbx_64 rbx
> > > > > > > > +#define rcx_8  cl
> > > > > > > > +#define rcx_16 cx
> > > > > > > > +#define rcx_32 ecx
> > > > > > > > +#define rcx_64 rcx
> > > > > > > > +#define rdx_8  dl
> > > > > > > > +#define rdx_16 dx
> > > > > > > > +#define rdx_32 edx
> > > > > > > > +#define rdx_64 rdx
> > > > > > > > +#define rbp_8  bpl
> > > > > > > > +#define rbp_16 bp
> > > > > > > > +#define rbp_32 ebp
> > > > > > > > +#define rbp_64 rbp
> > > > > > > > +#define rsp_8  spl
> > > > > > > > +#define rsp_16 sp
> > > > > > > > +#define rsp_32 esp
> > > > > > > > +#define rsp_64 rsp
> > > > > > > > +#define rsi_8  sil
> > > > > > > > +#define rsi_16 si
> > > > > > > > +#define rsi_32 esi
> > > > > > > > +#define rsi_64 rsi
> > > > > > > > +#define rdi_8  dil
> > > > > > > > +#define rdi_16 di
> > > > > > > > +#define rdi_32 edi
> > > > > > > > +#define rdi_64 rdi
> > > > > > > > +#define r8_8   r8b
> > > > > > > > +#define r8_16  r8w
> > > > > > > > +#define r8_32  r8d
> > > > > > > > +#define r8_64  r8
> > > > > > > > +#define r9_8   r9b
> > > > > > > > +#define r9_16  r9w
> > > > > > > > +#define r9_32  r9d
> > > > > > > > +#define r9_64  r9
> > > > > > > > +#define r10_8  r10b
> > > > > > > > +#define r10_16 r10w
> > > > > > > > +#define r10_32 r10d
> > > > > > > > +#define r10_64 r10
> > > > > > > > +#define r11_8  r11b
> > > > > > > > +#define r11_16 r11w
> > > > > > > > +#define r11_32 r11d
> > > > > > > > +#define r11_64 r11
> > > > > > > > +#define r12_8  r12b
> > > > > > > > +#define r12_16 r12w
> > > > > > > > +#define r12_32 r12d
> > > > > > > > +#define r12_64 r12
> > > > > > > > +#define r13_8  r13b
> > > > > > > > +#define r13_16 r13w
> > > > > > > > +#define r13_32 r13d
> > > > > > > > +#define r13_64 r13
> > > > > > > > +#define r14_8  r14b
> > > > > > > > +#define r14_16 r14w
> > > > > > > > +#define r14_32 r14d
> > > > > > > > +#define r14_64 r14
> > > > > > > > +#define r15_8  r15b
> > > > > > > > +#define r15_16 r15w
> > > > > > > > +#define r15_32 r15d
> > > > > > > > +#define r15_64 r15
> > > > > > > > +
> > > > > > > > +#define kmov_8 kmovb
> > > > > > > > +#define kmov_16        kmovw
> > > > > > > > +#define kmov_32        kmovd
> > > > > > > > +#define kmov_64        kmovq
> > > > > > > > +#define kortest_8      kortestb
> > > > > > > > +#define kortest_16     kortestw
> > > > > > > > +#define kortest_32     kortestd
> > > > > > > > +#define kortest_64     kortestq
> > > > > > > > +#define kor_8  korb
> > > > > > > > +#define kor_16 korw
> > > > > > > > +#define kor_32 kord
> > > > > > > > +#define kor_64 korq
> > > > > > > > +#define ktest_8        ktestb
> > > > > > > > +#define ktest_16       ktestw
> > > > > > > > +#define ktest_32       ktestd
> > > > > > > > +#define ktest_64       ktestq
> > > > > > > > +#define kand_8 kandb
> > > > > > > > +#define kand_16        kandw
> > > > > > > > +#define kand_32        kandd
> > > > > > > > +#define kand_64        kandq
> > > > > > > > +#define kxor_8 kxorb
> > > > > > > > +#define kxor_16        kxorw
> > > > > > > > +#define kxor_32        kxord
> > > > > > > > +#define kxor_64        kxorq
> > > > > > > > +#define knot_8 knotb
> > > > > > > > +#define knot_16        knotw
> > > > > > > > +#define knot_32        knotd
> > > > > > > > +#define knot_64        knotq
> > > > > > > > +#define kxnor_8        kxnorb
> > > > > > > > +#define kxnor_16       kxnorw
> > > > > > > > +#define kxnor_32       kxnord
> > > > > > > > +#define kxnor_64       kxnorq
> > > > > > > > +#define kunpack_8      kunpackbw
> > > > > > > > +#define kunpack_16     kunpackwd
> > > > > > > > +#define kunpack_32     kunpackdq
> > > > > > > > +
> > > > > > > > +/* Common API for accessing proper width GPR is V{upcase_GPR_name}.  */
> > > > > > > > +#define VRAX   VGPR(rax)
> > > > > > > > +#define VRBX   VGPR(rbx)
> > > > > > > > +#define VRCX   VGPR(rcx)
> > > > > > > > +#define VRDX   VGPR(rdx)
> > > > > > > > +#define VRBP   VGPR(rbp)
> > > > > > > > +#define VRSP   VGPR(rsp)
> > > > > > > > +#define VRSI   VGPR(rsi)
> > > > > > > > +#define VRDI   VGPR(rdi)
> > > > > > > > +#define VR8    VGPR(r8)
> > > > > > > > +#define VR9    VGPR(r9)
> > > > > > > > +#define VR10   VGPR(r10)
> > > > > > > > +#define VR11   VGPR(r11)
> > > > > > > > +#define VR12   VGPR(r12)
> > > > > > > > +#define VR13   VGPR(r13)
> > > > > > > > +#define VR14   VGPR(r14)
> > > > > > > > +#define VR15   VGPR(r15)
> > > > > > > > +
> > > > > > > > +/* Common API for accessing proper width mask insn is {upcase_mask_insn}.  */
> > > > > > > > +#define KMOV   VKINSN(kmov)
> > > > > > > > +#define KORTEST        VKINSN(kortest)
> > > > > > > > +#define KOR    VKINSN(kor)
> > > > > > > > +#define KTEST  VKINSN(ktest)
> > > > > > > > +#define KAND   VKINSN(kand)
> > > > > > > > +#define KXOR   VKINSN(kxor)
> > > > > > > > +#define KNOT   VKINSN(knot)
> > > > > > > > +#define KXNOR  VKINSN(kxnor)
> > > > > > > > +#define KUNPACK        VKINSN(kunpack)
> > > > > > > > +
> > > > > > > > +#ifndef REG_WIDTH
> > > > > > > > +# define REG_WIDTH VEC_SIZE
> > > > > > > > +#endif
> > > > > > >
> > > > > > > Which files will define REG_WIDTH?  What values will it be for
> > > > > > > YMM and ZMM vectors?
> > > > > >
> > > > > > > For non-wide-char evex or avx2/sse2 implementations, REG_WIDTH = VEC_SIZE,
> > > > > > > so for YMM REG_WIDTH = 32 and for ZMM REG_WIDTH = 64.
> > > > > > >
> > > > > > > For wchar impls REG_WIDTH will often be 32 irrespective of YMM/ZMM.
> > > > >
> > > > > Then we should have
> > > > >
> > > > > #ifdef USE_WIDE_CHAR
> > > > > # define REG_WIDTH 32
> > > > > #else
> > > > > # define REG_WIDTH VEC_SIZE
> > > > > #endif
> > > > >
> > > >
> > > > It may not be universal. It may be that some wide-char impls will want
> > > > REG_WIDTH == 8/16 if they rely heavily on `inc` to do zero test or
> > >
> > > I think we can define a macro for it if needed.
> >
> > We can but don't you think just REG_WIDTH is more direct?
>
> It is very likely that 8-bit/16-bit registers will be used only for specific
> operations.  The majority of operations will be 32-bit.  Things like
>
> #ifndef REG_WIDTH
> # define REG_WIDTH VEC_SIZE
> #endif
>
> may lead to questions.
>
> > >
> > > > for some reason or another uses the full VEC_SIZE (as wcslen-evex512
> > > > currently does).
> > >
> > > Will REG_WIDTH == 32 work for wcslen-evex512?
> > >
> >
> > I believe so, but I am trying to keep these patches zero-effect. I think a
> > separate patch that actually makes substantive changes makes more sense.
>
> USE_WIDE_CHAR is undefined currently.   There is no impact.
>
> > > > Also, I don't really see what we gain by giving up the granularity.
> > > > Either way, to specify a separate reg width the wchar impl will
> > > > need to define something else. It seems reasonable for that
> > > > something else to just be REG_WIDTH directly as opposed to
> > > > USE_WIDE_CHAR.
> > > >
> > > > What do you think?
> > > > > > >
> > > > > > > > +#define VPASTER(x, y)  x##_##y
> > > > > > > > +#define VEVALUATOR(x, y)       VPASTER(x, y)
> > > > > > > > +
> > > > > > > > +#define VGPR_SZ(reg_name, reg_size)    VEVALUATOR(reg_name, reg_size)
> > > > > > > > +#define VKINSN_SZ(insn, reg_size)      VEVALUATOR(insn, reg_size)
> > > > > > > > +
> > > > > > > > +#define VGPR(reg_name) VGPR_SZ(reg_name, REG_WIDTH)
> > > > > > > > +#define VKINSN(mask_insn)      VKINSN_SZ(mask_insn, REG_WIDTH)
> > > > > > > > +
> > > > > > > > +#endif

kk
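
To make the REG_WIDTH discussion concrete, here is a minimal Python sketch of what the two-level paste in reg-macros.h resolves to. It is only a model of the preprocessor logic: the function names (`reg_width`, `vgpr`, `vkinsn`) are hypothetical and do not exist in glibc, and only the name tables are copied from the patch. It assumes VEC_SIZE is the vector size in bytes (32 for YMM, 64 for ZMM), which matches the number of mask bits produced by a byte-granular compare such as `vpcmpb`.

```python
# Illustrative model of the cpp logic in reg-macros.h; the function names
# below are hypothetical and do not exist in glibc.

# Widest-first name lists, copied from the `registers` table in the patch
# (only two registers shown).
REGISTERS = {
    "rax": ["rax", "eax", "ax", "al"],
    "rcx": ["rcx", "ecx", "cx", "cl"],
}
MASK_SUFFIX = {8: "b", 16: "w", 32: "d", 64: "q"}


def reg_width(vec_size, override=None):
    """Model `#ifndef REG_WIDTH` / `# define REG_WIDTH VEC_SIZE`."""
    return vec_size if override is None else override


def vgpr(reg_name, width):
    """Model VGPR(reg_name): the `width`-bit name of a 64-bit GPR."""
    names = REGISTERS[reg_name]
    return names[{64: 0, 32: 1, 16: 2, 8: 3}[width]]


def vkinsn(insn, width):
    """Model VKINSN(insn): the mask insn with its width suffix."""
    return insn + MASK_SUFFIX[width]


for label, width in [("ymm", reg_width(32)),
                     ("zmm", reg_width(64)),
                     ("zmm, REG_WIDTH=32", reg_width(64, override=32))]:
    print("{}: {} %k0, %{}".format(label, vkinsn("kmov", width),
                                   vgpr("rax", width)))
```

The last case corresponds to a wide-char implementation that defines REG_WIDTH itself before including the header, which is the alternative to a USE_WIDE_CHAR switch discussed above.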
  

Patch

diff --git a/sysdeps/x86_64/multiarch/reg-macros.h b/sysdeps/x86_64/multiarch/reg-macros.h
new file mode 100644
index 0000000000..16168b6fda
--- /dev/null
+++ b/sysdeps/x86_64/multiarch/reg-macros.h
@@ -0,0 +1,166 @@ 
+/* This file was generated by: gen-reg-macros.py.
+
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _REG_MACROS_H
+#define _REG_MACROS_H	1
+
+#define rax_8	al
+#define rax_16	ax
+#define rax_32	eax
+#define rax_64	rax
+#define rbx_8	bl
+#define rbx_16	bx
+#define rbx_32	ebx
+#define rbx_64	rbx
+#define rcx_8	cl
+#define rcx_16	cx
+#define rcx_32	ecx
+#define rcx_64	rcx
+#define rdx_8	dl
+#define rdx_16	dx
+#define rdx_32	edx
+#define rdx_64	rdx
+#define rbp_8	bpl
+#define rbp_16	bp
+#define rbp_32	ebp
+#define rbp_64	rbp
+#define rsp_8	spl
+#define rsp_16	sp
+#define rsp_32	esp
+#define rsp_64	rsp
+#define rsi_8	sil
+#define rsi_16	si
+#define rsi_32	esi
+#define rsi_64	rsi
+#define rdi_8	dil
+#define rdi_16	di
+#define rdi_32	edi
+#define rdi_64	rdi
+#define r8_8	r8b
+#define r8_16	r8w
+#define r8_32	r8d
+#define r8_64	r8
+#define r9_8	r9b
+#define r9_16	r9w
+#define r9_32	r9d
+#define r9_64	r9
+#define r10_8	r10b
+#define r10_16	r10w
+#define r10_32	r10d
+#define r10_64	r10
+#define r11_8	r11b
+#define r11_16	r11w
+#define r11_32	r11d
+#define r11_64	r11
+#define r12_8	r12b
+#define r12_16	r12w
+#define r12_32	r12d
+#define r12_64	r12
+#define r13_8	r13b
+#define r13_16	r13w
+#define r13_32	r13d
+#define r13_64	r13
+#define r14_8	r14b
+#define r14_16	r14w
+#define r14_32	r14d
+#define r14_64	r14
+#define r15_8	r15b
+#define r15_16	r15w
+#define r15_32	r15d
+#define r15_64	r15
+
+#define kmov_8	kmovb
+#define kmov_16	kmovw
+#define kmov_32	kmovd
+#define kmov_64	kmovq
+#define kortest_8	kortestb
+#define kortest_16	kortestw
+#define kortest_32	kortestd
+#define kortest_64	kortestq
+#define kor_8	korb
+#define kor_16	korw
+#define kor_32	kord
+#define kor_64	korq
+#define ktest_8	ktestb
+#define ktest_16	ktestw
+#define ktest_32	ktestd
+#define ktest_64	ktestq
+#define kand_8	kandb
+#define kand_16	kandw
+#define kand_32	kandd
+#define kand_64	kandq
+#define kxor_8	kxorb
+#define kxor_16	kxorw
+#define kxor_32	kxord
+#define kxor_64	kxorq
+#define knot_8	knotb
+#define knot_16	knotw
+#define knot_32	knotd
+#define knot_64	knotq
+#define kxnor_8	kxnorb
+#define kxnor_16	kxnorw
+#define kxnor_32	kxnord
+#define kxnor_64	kxnorq
+#define kunpack_8	kunpackbw
+#define kunpack_16	kunpackwd
+#define kunpack_32	kunpackdq
+
+/* Common API for accessing proper width GPR is V{upcase_GPR_name}.  */
+#define VRAX	VGPR(rax)
+#define VRBX	VGPR(rbx)
+#define VRCX	VGPR(rcx)
+#define VRDX	VGPR(rdx)
+#define VRBP	VGPR(rbp)
+#define VRSP	VGPR(rsp)
+#define VRSI	VGPR(rsi)
+#define VRDI	VGPR(rdi)
+#define VR8	VGPR(r8)
+#define VR9	VGPR(r9)
+#define VR10	VGPR(r10)
+#define VR11	VGPR(r11)
+#define VR12	VGPR(r12)
+#define VR13	VGPR(r13)
+#define VR14	VGPR(r14)
+#define VR15	VGPR(r15)
+
+/* Common API for accessing proper width mask insn is {upcase_mask_insn}.  */
+#define KMOV 	VKINSN(kmov)
+#define KORTEST 	VKINSN(kortest)
+#define KOR 	VKINSN(kor)
+#define KTEST 	VKINSN(ktest)
+#define KAND 	VKINSN(kand)
+#define KXOR 	VKINSN(kxor)
+#define KNOT 	VKINSN(knot)
+#define KXNOR 	VKINSN(kxnor)
+#define KUNPACK 	VKINSN(kunpack)
+
+#ifndef REG_WIDTH
+# define REG_WIDTH VEC_SIZE
+#endif
+
+#define VPASTER(x, y)	x##_##y
+#define VEVALUATOR(x, y)	VPASTER(x, y)
+
+#define VGPR_SZ(reg_name, reg_size)	VEVALUATOR(reg_name, reg_size)
+#define VKINSN_SZ(insn, reg_size)	VEVALUATOR(insn, reg_size)
+
+#define VGPR(reg_name)	VGPR_SZ(reg_name, REG_WIDTH)
+#define VKINSN(mask_insn)	VKINSN_SZ(mask_insn, REG_WIDTH)
+
+#endif
diff --git a/sysdeps/x86_64/multiarch/scripts/gen-reg-macros.py b/sysdeps/x86_64/multiarch/scripts/gen-reg-macros.py
new file mode 100644
index 0000000000..c7296a8104
--- /dev/null
+++ b/sysdeps/x86_64/multiarch/scripts/gen-reg-macros.py
@@ -0,0 +1,123 @@ 
+#!/usr/bin/python3
+# Copyright (C) 2022 Free Software Foundation, Inc.
+# This file is part of the GNU C Library.
+#
+# The GNU C Library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2.1 of the License, or (at your option) any later version.
+#
+# The GNU C Library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with the GNU C Library; if not, see
+# <https://www.gnu.org/licenses/>.
+"""Generate macros for getting GPR name of a certain size
+
+Inputs: None
+Output: Prints header file to stdout
+
+API:
+    VGPR(reg_name)
+        - Get register name REG_WIDTH component of `reg_name`
+    VGPR_SZ(reg_name, reg_size)
+        - Get register name `reg_size` component of `reg_name`
+"""
+
+import sys
+import os
+from datetime import datetime
+
+registers = [["rax", "eax", "ax", "al"], ["rbx", "ebx", "bx", "bl"],
+             ["rcx", "ecx", "cx", "cl"], ["rdx", "edx", "dx", "dl"],
+             ["rbp", "ebp", "bp", "bpl"], ["rsp", "esp", "sp", "spl"],
+             ["rsi", "esi", "si", "sil"], ["rdi", "edi", "di", "dil"],
+             ["r8", "r8d", "r8w", "r8b"], ["r9", "r9d", "r9w", "r9b"],
+             ["r10", "r10d", "r10w", "r10b"], ["r11", "r11d", "r11w", "r11b"],
+             ["r12", "r12d", "r12w", "r12b"], ["r13", "r13d", "r13w", "r13b"],
+             ["r14", "r14d", "r14w", "r14b"], ["r15", "r15d", "r15w", "r15b"]]
+
+mask_insns = [
+    "kmov",
+    "kortest",
+    "kor",
+    "ktest",
+    "kand",
+    "kxor",
+    "knot",
+    "kxnor",
+]
+mask_insns_ext = ["b", "w", "d", "q"]
+
+cr = """
+   Copyright (C) {} Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+"""
+
+print("/* This file was generated by: {}.".format(os.path.basename(
+    sys.argv[0])))
+print(cr.format(datetime.today().year))
+
+print("#ifndef _REG_MACROS_H")
+print("#define _REG_MACROS_H\t1")
+print("")
+for reg in registers:
+    for i in range(0, 4):
+        print("#define {}_{}\t{}".format(reg[0], 8 << i, reg[3 - i]))
+
+print("")
+for mask_insn in mask_insns:
+    for i in range(0, 4):
+        print("#define {}_{}\t{}{}".format(mask_insn, 8 << i, mask_insn,
+                                           mask_insns_ext[i]))
+for i in range(0, 3):
+    print("#define kunpack_{}\tkunpack{}{}".format(8 << i, mask_insns_ext[i],
+                                                   mask_insns_ext[i + 1]))
+mask_insns.append("kunpack")
+
+print("")
+print(
+    "/* Common API for accessing proper width GPR is V{upcase_GPR_name}.  */")
+for reg in registers:
+    print("#define V{}\tVGPR({})".format(reg[0].upper(), reg[0]))
+
+print("")
+
+print(
+    "/* Common API for accessing proper width mask insn is {upcase_mask_insn}.  */"
+)
+for mask_insn in mask_insns:
+    print("#define {} \tVKINSN({})".format(mask_insn.upper(), mask_insn))
+print("")
+
+print("#ifndef REG_WIDTH")
+print("# define REG_WIDTH VEC_SIZE")
+print("#endif")
+print("")
+print("#define VPASTER(x, y)\tx##_##y")
+print("#define VEVALUATOR(x, y)\tVPASTER(x, y)")
+print("")
+print("#define VGPR_SZ(reg_name, reg_size)\tVEVALUATOR(reg_name, reg_size)")
+print("#define VKINSN_SZ(insn, reg_size)\tVEVALUATOR(insn, reg_size)")
+print("")
+print("#define VGPR(reg_name)\tVGPR_SZ(reg_name, REG_WIDTH)")
+print("#define VKINSN(mask_insn)\tVKINSN_SZ(mask_insn, REG_WIDTH)")
+
+print("\n#endif")
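
The index arithmetic in the generator is easy to misread, so here is a minimal sketch that returns the `#define` lines as lists instead of printing them. The function names (`width_defines`, `kunpack_defines`) are hypothetical; the format strings and the suffix table are taken from gen-reg-macros.py above.

```python
MASK_INSNS_EXT = ["b", "w", "d", "q"]


def width_defines(reg):
    """Return the four per-width defines for one GPR.

    `reg` lists names widest first (64, 32, 16, 8 bits), so width
    8 << i maps to index 3 - i.
    """
    return ["#define {}_{}\t{}".format(reg[0], 8 << i, reg[3 - i])
            for i in range(0, 4)]


def kunpack_defines():
    """Return the kunpack defines; each name joins two adjacent suffixes."""
    return ["#define kunpack_{}\tkunpack{}{}".format(
                8 << i, MASK_INSNS_EXT[i], MASK_INSNS_EXT[i + 1])
            for i in range(0, 3)]


for line in width_defines(["r8", "r8d", "r8w", "r8b"]) + kunpack_defines():
    print(line)
```

Two things stand out from this view. Only three kunpack variants are generated, so with REG_WIDTH == 64 `KUNPACK` would paste to `kunpack_64`, which the header shown here does not define. Also, the sketch reproduces the patch's spelling `kunpack...`, whereas the Intel SDM spells the mnemonics `kunpckbw`, `kunpckwd`, and `kunpckdq`, so that may be worth double-checking before `KUNPACK` is used.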