[v2] aarch64: add support or hwcap3,4

Message ID 20250404141511.1767584-1-yury.khrustalev@arm.com (mailing list archive)
State Changes Requested
Headers
Series [v2] aarch64: add support or hwcap3,4 |

Checks

Context Check Description
redhat-pt-bot/TryBot-apply_patch success Patch applied to master at the time it was sent
linaro-tcwg-bot/tcwg_glibc_build--master-arm success Build passed
linaro-tcwg-bot/tcwg_glibc_build--master-aarch64 success Build passed
linaro-tcwg-bot/tcwg_glibc_check--master-arm success Test passed
linaro-tcwg-bot/tcwg_glibc_check--master-aarch64 success Test passed
redhat-pt-bot/TryBot-32bit success Build for i686

Commit Message

Yury Khrustalev April 4, 2025, 2:15 p.m. UTC
  This is just to ensure both hwcap3,4 are usable on AArch64 targets.
---
Corresponding ABI spec update:
https://github.com/ARM-software/abi-aa/pull/320

Regression tested on AArch64 and x86 and no regressions has been found.

OK for trunk?

base commit: aaf94ec804

Changes in v2:
 - Expanded comment in <sys/ifunc.h> to explain how to use the _size
   field to check the version of the contents of the __ifunc_arg_t
   struct.
Link to v1:
https://inbox.sourceware.org/libc-alpha/20250401141851.129625-1-yury.khrustalev@arm.com/
---
 sysdeps/aarch64/dl-irel.h         |  2 ++
 sysdeps/aarch64/sys/ifunc.h       | 10 +++++++++-
 sysdeps/aarch64/tst-ifunc-arg-1.c |  2 ++
 sysdeps/aarch64/tst-ifunc-arg-2.c |  2 ++
 4 files changed, 15 insertions(+), 1 deletion(-)
  

Comments

Florian Weimer April 4, 2025, 4:24 p.m. UTC | #1
* Yury Khrustalev:

> diff --git a/sysdeps/aarch64/sys/ifunc.h b/sysdeps/aarch64/sys/ifunc.h
> index 7781b37a29..eaabb6aaae 100644
> --- a/sysdeps/aarch64/sys/ifunc.h
> +++ b/sysdeps/aarch64/sys/ifunc.h
> @@ -27,7 +27,13 @@
>       ElfW(Addr) ifunc_resolver (uint64_t, const __ifunc_arg_t *);
>  
>     the first argument should have the _IFUNC_ARG_HWCAP bit set and
> -   the remaining bits should match the AT_HWCAP settings.  */
> +   the remaining bits should match the AT_HWCAP settings.
> +
> +   When a pointer to this struct is passed to an indirect function
> +   resolver as a second parameter, the resolver can use its _size
> +   field to determine if additional _hwcap fields are available.
> +
> +   The value in the _size fields is set to sizeof(__ifunc_arg_t).  */
>  
>  /* Second argument to an ifunc resolver.  */
>  struct __ifunc_arg_t

This comment glosses over the fact that in reality, the glibc the
application but be running on has a smaller __ifunc_arg_t.

Thanks,
Florian
  
Yury Khrustalev April 8, 2025, 11:02 a.m. UTC | #2
Should we consider cases like below?

After adding the new fields `_hwcap3` and `_hwcap4`, there will be
two versions of the `__ifunc_arg_t` type depending on which header
is used during compilation of the code for a resolver function with
no compile-time way to decern between these versions.

Those resolvers that want to use the new fields (naturally, guarded
by the runtime checks based on the `_size` field), will no longer
compile with older headers because they are missing the `_hwcap3,4`
fields.

Ideally, __ifunc_arg_t should've been defined as

struct __ifunc_arg_t
{
  unsigned long _size;
  unsigned long _hwcap_array[N];
}

where N could be increased over time as needed.

To retain backward compatibility with existing code, we could use a
union:


struct __ifunc_arg_t
{
  unsigned long _size;
  union {
    struct {
      unsigned long _hwcap;
      unsigned long _hwcap2;
    };
    unsigned long _hwcap_array[4];
  };
};

and deprecate `_hwcap` and `_hwcap2` in favour of `_hwcap_array`.


However, using `_hwcap` and `_hwcap2` may become UB when strict aliasing
rules are enforced.

Another way could be introducing macros like this:

struct __ifunc_arg_t
{
  unsigned long _size;
  unsigned long _hwcap;
  unsigned long _hwcap2;
  unsigned long _hwcap3;
  unsigned long _hwcap4;
}

#define __IFUNC_ARG_T_HWCAP3
#define __IFUNC_ARG_T_HWCAP4


In this case code that want to use the new fields when they are available,
would be able to rely on these macros.

What would be the best way forward here?

Kind regards,
Yury
  
Adhemerval Zanella Netto April 8, 2025, 12:09 p.m. UTC | #3
On 08/04/25 08:02, Yury Khrustalev wrote:
> Should we consider cases like below?
> 
> After adding the new fields `_hwcap3` and `_hwcap4`, there will be
> two versions of the `__ifunc_arg_t` type depending on which header
> is used during compilation of the code for a resolver function with
> no compile-time way to decern between these versions.
> 
> Those resolvers that want to use the new fields (naturally, guarded
> by the runtime checks based on the `_size` field), will no longer
> compile with older headers because they are missing the `_hwcap3,4`
> fields.
> 
> Ideally, __ifunc_arg_t should've been defined as
> 
> struct __ifunc_arg_t
> {
>   unsigned long _size;
>   unsigned long _hwcap_array[N];
> }
> 
> where N could be increased over time as needed.

Indeed, the N would allow to compile-time assert the struct size.  This
is how kernel does for variable struct arguments for syscall.

I think that regardless, it would be good to add a compile-time constant
to check for the current supported __ifunc_arg_t version.

> 
> To retain backward compatibility with existing code, we could use a
> union:
> 
> 
> struct __ifunc_arg_t
> {
>   unsigned long _size;
>   union {
>     struct {
>       unsigned long _hwcap;
>       unsigned long _hwcap2;
>     };
>     unsigned long _hwcap_array[4];
>   };
> };
> 
> and deprecate `_hwcap` and `_hwcap2` in favour of `_hwcap_array`.
> 
> 
> However, using `_hwcap` and `_hwcap2` may become UB when strict aliasing
> rules are enforced.
> 
> Another way could be introducing macros like this:
> 
> struct __ifunc_arg_t
> {
>   unsigned long _size;
>   unsigned long _hwcap;
>   unsigned long _hwcap2;
>   unsigned long _hwcap3;
>   unsigned long _hwcap4;
> }
> 
> #define __IFUNC_ARG_T_HWCAP3
> #define __IFUNC_ARG_T_HWCAP4
> 
> 
> In this case code that want to use the new fields when they are available,
> would be able to rely on these macros.
> 
> What would be the best way forward here?

I am not sure to properly accomplish backward *source* compatibility with
current ABI constraints.  The best way I have is to have a configure time
to check for the size of __ifunc_arg_t and redefine it with the expected 
size if system headers does not fit the requirements.

> 
> Kind regards,
> Yury
> 
> 
>
  
enh April 8, 2025, 1:22 p.m. UTC | #4
On Tue, Apr 8, 2025 at 8:11 AM Adhemerval Zanella Netto
<adhemerval.zanella@linaro.org> wrote:
>
>
>
> On 08/04/25 08:02, Yury Khrustalev wrote:
> > Should we consider cases like below?
> >
> > After adding the new fields `_hwcap3` and `_hwcap4`, there will be
> > two versions of the `__ifunc_arg_t` type depending on which header
> > is used during compilation of the code for a resolver function with
> > no compile-time way to decern between these versions.
> >
> > Those resolvers that want to use the new fields (naturally, guarded
> > by the runtime checks based on the `_size` field), will no longer
> > compile with older headers because they are missing the `_hwcap3,4`
> > fields.
> >
> > Ideally, __ifunc_arg_t should've been defined as
> >
> > struct __ifunc_arg_t
> > {
> >   unsigned long _size;
> >   unsigned long _hwcap_array[N];
> > }
> >
> > where N could be increased over time as needed.
>
> Indeed, the N would allow to compile-time assert the struct size.  This
> is how kernel does for variable struct arguments for syscall.
>
> I think that regardless, it would be good to add a compile-time constant
> to check for the current supported __ifunc_arg_t version.
>
> >
> > To retain backward compatibility with existing code, we could use a
> > union:
> >
> >
> > struct __ifunc_arg_t
> > {
> >   unsigned long _size;
> >   union {
> >     struct {
> >       unsigned long _hwcap;
> >       unsigned long _hwcap2;
> >     };
> >     unsigned long _hwcap_array[4];
> >   };
> > };
> >
> > and deprecate `_hwcap` and `_hwcap2` in favour of `_hwcap_array`.
> >
> >
> > However, using `_hwcap` and `_hwcap2` may become UB when strict aliasing
> > rules are enforced.
> >
> > Another way could be introducing macros like this:
> >
> > struct __ifunc_arg_t
> > {
> >   unsigned long _size;
> >   unsigned long _hwcap;
> >   unsigned long _hwcap2;
> >   unsigned long _hwcap3;
> >   unsigned long _hwcap4;
> > }
> >
> > #define __IFUNC_ARG_T_HWCAP3
> > #define __IFUNC_ARG_T_HWCAP4
> >
> >
> > In this case code that want to use the new fields when they are available,
> > would be able to rely on these macros.
> >
> > What would be the best way forward here?
>
> I am not sure to properly accomplish backward *source* compatibility with
> current ABI constraints.  The best way I have is to have a configure time
> to check for the size of __ifunc_arg_t and redefine it with the expected
> size if system headers does not fit the requirements.

the kernel has things like

#define _hwcap2 union_name._hwcap_array[1]

for similar situations. the _size field is probably the biggest problem.

there's always another bit to spare:

#define _IFUNC_ARG_HWCAP (1ULL << 62)

with an _IFUNC_ARG_HWCAP2 or whatever, you could have full
source/binary compatibility and free reign to try again with the type:
(uint64_t hwcap, __ifunc_arg_t* arg, __ifunc_arg2_t* arg2)

alternatively, since this is only for arm64 and we know the packing,
you could just have a macro that converts _size into "i can haz
hwcap<N>?". i think dealing with byte sizes directly in ifunc
resolvers is the unpleasantness we're trying to avoid? making that our
problem in a macro might be the least worst option, given that we
already have an unfortunate interface?

(but, yeah, any future inventions along these lines might want to be
more in the style of statx() than this.)

> > Kind regards,
> > Yury
> >
> >
> >
>
  
Adhemerval Zanella Netto April 8, 2025, 1:57 p.m. UTC | #5
On 08/04/25 10:22, enh wrote:
> On Tue, Apr 8, 2025 at 8:11 AM Adhemerval Zanella Netto
> <adhemerval.zanella@linaro.org> wrote:
>>
>>
>>
>> On 08/04/25 08:02, Yury Khrustalev wrote:
>>> Should we consider cases like below?
>>>
>>> After adding the new fields `_hwcap3` and `_hwcap4`, there will be
>>> two versions of the `__ifunc_arg_t` type depending on which header
>>> is used during compilation of the code for a resolver function with
>>> no compile-time way to decern between these versions.
>>>
>>> Those resolvers that want to use the new fields (naturally, guarded
>>> by the runtime checks based on the `_size` field), will no longer
>>> compile with older headers because they are missing the `_hwcap3,4`
>>> fields.
>>>
>>> Ideally, __ifunc_arg_t should've been defined as
>>>
>>> struct __ifunc_arg_t
>>> {
>>>   unsigned long _size;
>>>   unsigned long _hwcap_array[N];
>>> }
>>>
>>> where N could be increased over time as needed.
>>
>> Indeed, the N would allow to compile-time assert the struct size.  This
>> is how kernel does for variable struct arguments for syscall.
>>
>> I think that regardless, it would be good to add a compile-time constant
>> to check for the current supported __ifunc_arg_t version.
>>
>>>
>>> To retain backward compatibility with existing code, we could use a
>>> union:
>>>
>>>
>>> struct __ifunc_arg_t
>>> {
>>>   unsigned long _size;
>>>   union {
>>>     struct {
>>>       unsigned long _hwcap;
>>>       unsigned long _hwcap2;
>>>     };
>>>     unsigned long _hwcap_array[4];
>>>   };
>>> };
>>>
>>> and deprecate `_hwcap` and `_hwcap2` in favour of `_hwcap_array`.
>>>
>>>
>>> However, using `_hwcap` and `_hwcap2` may become UB when strict aliasing
>>> rules are enforced.
>>>
>>> Another way could be introducing macros like this:
>>>
>>> struct __ifunc_arg_t
>>> {
>>>   unsigned long _size;
>>>   unsigned long _hwcap;
>>>   unsigned long _hwcap2;
>>>   unsigned long _hwcap3;
>>>   unsigned long _hwcap4;
>>> }
>>>
>>> #define __IFUNC_ARG_T_HWCAP3
>>> #define __IFUNC_ARG_T_HWCAP4
>>>
>>>
>>> In this case code that want to use the new fields when they are available,
>>> would be able to rely on these macros.
>>>
>>> What would be the best way forward here?
>>
>> I am not sure to properly accomplish backward *source* compatibility with
>> current ABI constraints.  The best way I have is to have a configure time
>> to check for the size of __ifunc_arg_t and redefine it with the expected
>> size if system headers does not fit the requirements.
> 
> the kernel has things like
> 
> #define _hwcap2 union_name._hwcap_array[1]
> 
> for similar situations. the _size field is probably the biggest problem.
> 
> there's always another bit to spare:
> 
> #define _IFUNC_ARG_HWCAP (1ULL << 62)
> 
> with an _IFUNC_ARG_HWCAP2 or whatever, you could have full
> source/binary compatibility and free reign to try again with the type:
> (uint64_t hwcap, __ifunc_arg_t* arg, __ifunc_arg2_t* arg2)
> 
> alternatively, since this is only for arm64 and we know the packing,
> you could just have a macro that converts _size into "i can haz
> hwcap<N>?". i think dealing with byte sizes directly in ifunc
> resolvers is the unpleasantness we're trying to avoid? making that our
> problem in a macro might be the least worst option, given that we
> already have an unfortunate interface?
> 
> (but, yeah, any future inventions along these lines might want to be
> more in the style of statx() than this.)

If source compatibility is really required, I would prefer the union
type instead of a new bit flags instead of a new function prototype.  
Something like:

  #define _IFUNC_ARG_HWCAP        (1ULL << 62)
  #define _IFUNC_HWCAP_MAX        4

  struct __ifunc_arg_t
  {
    unsigned long _size;
    union {
      struct {
        unsigned long _hwcap;
        unsigned long _hwcap2;
      } _hwcap_struct;
     unsigned long _hwcap_array[_IFUNC_HWCAP_MAX];
    } _hwcap_union;
  #define _hwcap  _hwcap_union._hwcap_array[0]
  #define _hwcap2 _hwcap_union._hwcap_array[1]
  #define _hwcap3 _hwcap_union._hwcap_array[2]
  #define _hwcap4 _hwcap_union._hwcap_array[3]
  };

  #define _IFUNC_ARG_VER1_SIZE 24
  #define _IFUNC_ARG_VER2_SIZE 40
  #define _IFUNC_ARG_SIZE _IFUNC_ARG_VER2_SIZE

I don't think strict aliasing would be a problem on ifunc resolvers.
  

Patch

diff --git a/sysdeps/aarch64/dl-irel.h b/sysdeps/aarch64/dl-irel.h
index ae402bc367..1879a83000 100644
--- a/sysdeps/aarch64/dl-irel.h
+++ b/sysdeps/aarch64/dl-irel.h
@@ -37,6 +37,8 @@  elf_ifunc_invoke (ElfW(Addr) addr)
   arg._size = sizeof (arg);
   arg._hwcap = GLRO(dl_hwcap);
   arg._hwcap2 = GLRO(dl_hwcap2);
+  arg._hwcap3 = GLRO(dl_hwcap3);
+  arg._hwcap4 = GLRO(dl_hwcap4);
   return ((ElfW(Addr) (*) (uint64_t, const __ifunc_arg_t *)) (addr))
 	 (GLRO(dl_hwcap) | _IFUNC_ARG_HWCAP, &arg);
 }
diff --git a/sysdeps/aarch64/sys/ifunc.h b/sysdeps/aarch64/sys/ifunc.h
index 7781b37a29..eaabb6aaae 100644
--- a/sysdeps/aarch64/sys/ifunc.h
+++ b/sysdeps/aarch64/sys/ifunc.h
@@ -27,7 +27,13 @@ 
      ElfW(Addr) ifunc_resolver (uint64_t, const __ifunc_arg_t *);
 
    the first argument should have the _IFUNC_ARG_HWCAP bit set and
-   the remaining bits should match the AT_HWCAP settings.  */
+   the remaining bits should match the AT_HWCAP settings.
+
+   When a pointer to this struct is passed to an indirect function
+   resolver as a second parameter, the resolver can use its _size
+   field to determine if additional _hwcap fields are available.
+
+   The value in the _size fields is set to sizeof(__ifunc_arg_t).  */
 
 /* Second argument to an ifunc resolver.  */
 struct __ifunc_arg_t
@@ -35,6 +41,8 @@  struct __ifunc_arg_t
   unsigned long _size; /* Size of the struct, so it can grow.  */
   unsigned long _hwcap;
   unsigned long _hwcap2;
+  unsigned long _hwcap3;
+  unsigned long _hwcap4;
 };
 
 typedef struct __ifunc_arg_t __ifunc_arg_t;
diff --git a/sysdeps/aarch64/tst-ifunc-arg-1.c b/sysdeps/aarch64/tst-ifunc-arg-1.c
index b90c836000..77634f8169 100644
--- a/sysdeps/aarch64/tst-ifunc-arg-1.c
+++ b/sysdeps/aarch64/tst-ifunc-arg-1.c
@@ -57,6 +57,8 @@  do_test (void)
   TEST_COMPARE (saved_arg2._size, sizeof (__ifunc_arg_t));
   TEST_COMPARE (saved_arg2._hwcap, getauxval (AT_HWCAP));
   TEST_COMPARE (saved_arg2._hwcap2, getauxval (AT_HWCAP2));
+  TEST_COMPARE (saved_arg2._hwcap3, getauxval (AT_HWCAP3));
+  TEST_COMPARE (saved_arg2._hwcap4, getauxval (AT_HWCAP4));
   return 0;
 }
 
diff --git a/sysdeps/aarch64/tst-ifunc-arg-2.c b/sysdeps/aarch64/tst-ifunc-arg-2.c
index dac144d937..02c103a120 100644
--- a/sysdeps/aarch64/tst-ifunc-arg-2.c
+++ b/sysdeps/aarch64/tst-ifunc-arg-2.c
@@ -60,6 +60,8 @@  do_test (void)
   TEST_COMPARE (saved_arg2._size, sizeof (__ifunc_arg_t));
   TEST_COMPARE (saved_arg2._hwcap, getauxval (AT_HWCAP));
   TEST_COMPARE (saved_arg2._hwcap2, getauxval (AT_HWCAP2));
+  TEST_COMPARE (saved_arg2._hwcap3, getauxval (AT_HWCAP3));
+  TEST_COMPARE (saved_arg2._hwcap4, getauxval (AT_HWCAP4));
   return 0;
 }