[18/28] powerpc64le: Add glibc-hwcaps support

Message ID 01faff4932d02c7e3224b50a1cdb5956354b1fc2.1601569371.git.fweimer@redhat.com
State Superseded
Headers
Series glibc-hwcaps support |

Commit Message

Florian Weimer Oct. 1, 2020, 4:33 p.m. UTC
  The "power10" and "power9" subdirectories are selected.
---
 .../powerpc/powerpc64/le/dl-hwcaps-subdirs.c  | 34 +++++++++++++++++++
 1 file changed, 34 insertions(+)
 create mode 100644 sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c
  

Comments

Paul A. Clarke Oct. 1, 2020, 6:56 p.m. UTC | #1
On Thu, Oct 01, 2020 at 06:33:29PM +0200, Florian Weimer via Libc-alpha wrote:
> The "power10" and "power9" subdirectories are selected.
> ---
>  .../powerpc/powerpc64/le/dl-hwcaps-subdirs.c  | 34 +++++++++++++++++++
>  1 file changed, 34 insertions(+)
>  create mode 100644 sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c
> 
> diff --git a/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c b/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c
> new file mode 100644
> index 0000000000..496daf0fa0
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c
> @@ -0,0 +1,34 @@
> +/* Architecture-specific glibc-hwcaps subdirectories.  powerpc64le version.
> +   Copyright (C) 2020 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#include <dl-hwcaps.h>
> +#include <ldsodefs.h>
> +
> +const char _dl_hwcaps_subdirs[] = "power10:power9";
> +
> +int32_t

Is 32 bits enough?  Any reason not to make this 64 bits?

> +_dl_hwcaps_subdirs_active (void)
> +{
> +  if (GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_1)
> +    return 3;
> +
> +  if (GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_00)
> +    return 1;

Is there some way to tie these magic numbers closer to their meaning?
The bits are also inverse the the relative offset in the string,
which could be confusing (index 0 is the 2nd bit and index 1 is the 1st).

Perhaps something like (not tested):
--
const char * const _dl_hwcaps_subdirs[] = {
#define _DL_HWCAPS_SUBDIR_POWER10_BIT 0x2 /* or 1 to preserve same order.  */
	"power10",
#define _DL_HWCAPS_SUBDIR_POWER9_BIT 0x1 /* or 2.  */
	"power9"
};

int32_t
_dl_hwcaps_subdirs_active (void)
{
  int32_t result = 0;

  if (GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_1)
    result |= _DL_HWCAPS_SUBDIR_POWER10_BIT;

  if (GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_00)
    result |= _DL_HWCAPS_SUBDIR_POWER9_BIT;

  return result;
}
--

Of course, that would require changes to the code that parses
_dl_hwcaps_subdirs.

> +
> +  return 0;
> +}

PC
  
Florian Weimer Oct. 5, 2020, 9:47 a.m. UTC | #2
* Paul A. Clarke:

>> diff --git a/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c b/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c
>> new file mode 100644
>> index 0000000000..496daf0fa0
>> --- /dev/null
>> +++ b/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c
>> @@ -0,0 +1,34 @@
>> +/* Architecture-specific glibc-hwcaps subdirectories.  powerpc64le version.
>> +   Copyright (C) 2020 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <https://www.gnu.org/licenses/>.  */
>> +
>> +#include <dl-hwcaps.h>
>> +#include <ldsodefs.h>
>> +
>> +const char _dl_hwcaps_subdirs[] = "power10:power9";
>> +
>> +int32_t
>
> Is 32 bits enough?  Any reason not to make this 64 bits?

It's enough for now for all the targets that have glibc-hwcaps support
so far.  It's a strictly internal interface, so it's easy to change
later.

We also need to consider the impact on library path processing.  Even
just 20 additional open & stat calls for each LD_LIBRARY_PATH entry
might be quite visible in process startup benchmarks.  So I'm not sure
if we will ever use those 32 bits.

>> +_dl_hwcaps_subdirs_active (void)
>> +{
>> +  if (GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_1)
>> +    return 3;
>> +
>> +  if (GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_00)
>> +    return 1;
>
> Is there some way to tie these magic numbers closer to their meaning?

Would you please have a look at the the x86-64 implementation and check
if it is closer to your liking?

> Perhaps something like (not tested):
> --
> const char * const _dl_hwcaps_subdirs[] = {
> #define _DL_HWCAPS_SUBDIR_POWER10_BIT 0x2 /* or 1 to preserve same order.  */
> 	"power10",
> #define _DL_HWCAPS_SUBDIR_POWER9_BIT 0x1 /* or 2.  */
> 	"power9"
> };
>
> int32_t
> _dl_hwcaps_subdirs_active (void)
> {
>   int32_t result = 0;
>
>   if (GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_1)
>     result |= _DL_HWCAPS_SUBDIR_POWER10_BIT;
>
>   if (GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_00)
>     result |= _DL_HWCAPS_SUBDIR_POWER9_BIT;
>
>   return result;
> }
> --
>
> Of course, that would require changes to the code that parses
> _dl_hwcaps_subdirs.

I chose the current approach to avoid relocations and memory allocations
for processing hwcaps settings (e.g. from the ld.so command line).

Thanks,
Florian
  
Paul A. Clarke Oct. 5, 2020, 7:15 p.m. UTC | #3
On Mon, Oct 05, 2020 at 11:47:32AM +0200, Florian Weimer wrote:
> * Paul A. Clarke:
> >> diff --git a/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c b/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c
> >> new file mode 100644
> >> index 0000000000..496daf0fa0
> >> --- /dev/null
> >> +++ b/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c
[...snip...]
> >> +_dl_hwcaps_subdirs_active (void)
> >> +{
> >> +  if (GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_1)
> >> +    return 3;
> >> +
> >> +  if (GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_00)
> >> +    return 1;
> >
> > Is there some way to tie these magic numbers closer to their meaning?
> 
> Would you please have a look at the the x86-64 implementation and check
> if it is closer to your liking?

A bit. I am concerned about the opaqueness of the interface.
The x86 code basically does:

>  int32_t result = 0;
>  int32_t bit = 1 << 2;
>  if (...)
>    return result;
>  result |= bit;
>  bit >>= 1;
>  if (...)
>    return result;
>  result |= bit;
>  bit >>= 1;
>  if (...)
>    return result;
>  result |= bit;
>  bit >>= 1;
>  return result;

...which still creates magic numbers and really doesn't explain what the
result represents.
The entire API is documented in a 2-line comment before count_hwcaps().
For someone to update the implementation for a new hwcap, which will
need to be done fairly often, seems to me to be a bit challenging for
the uninitiated (like me ;-).

> > Perhaps something like (not tested):
> > --
> > const char * const _dl_hwcaps_subdirs[] = {
> > #define _DL_HWCAPS_SUBDIR_POWER10_BIT 0x2 /* or 1 to preserve same order.  */
> > 	"power10",
> > #define _DL_HWCAPS_SUBDIR_POWER9_BIT 0x1 /* or 2.  */
> > 	"power9"
> > };
> >
> > int32_t
> > _dl_hwcaps_subdirs_active (void)
> > {
> >   int32_t result = 0;
> >
> >   if (GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_1)
> >     result |= _DL_HWCAPS_SUBDIR_POWER10_BIT;
> >
> >   if (GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_00)
> >     result |= _DL_HWCAPS_SUBDIR_POWER9_BIT;
> >
> >   return result;
> > }
> > --
> >
> > Of course, that would require changes to the code that parses
> > _dl_hwcaps_subdirs.
> 
> I chose the current approach to avoid relocations and memory allocations
> for processing hwcaps settings (e.g. from the ld.so command line).

Does the "array of strings" approach introduce new/additional relocations
and memory allocations?  It would seem to avoid the need for the splitting
code, at least.

PC
  
Florian Weimer Oct. 6, 2020, 12:20 p.m. UTC | #4
* Paul A. Clarke:

> On Mon, Oct 05, 2020 at 11:47:32AM +0200, Florian Weimer wrote:
>> * Paul A. Clarke:
>> >> diff --git a/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c b/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c
>> >> new file mode 100644
>> >> index 0000000000..496daf0fa0
>> >> --- /dev/null
>> >> +++ b/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c
> [...snip...]
>> >> +_dl_hwcaps_subdirs_active (void)
>> >> +{
>> >> +  if (GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_1)
>> >> +    return 3;
>> >> +
>> >> +  if (GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_00)
>> >> +    return 1;

Meh, the second test should do: return 2; 8-(

Maybe we should write it this way:

  if ((GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_00)) == 0)
    return 0;  /* No subdirectories active.  */

  if ((GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_1)) == 0)
    return 2;  /* Only the second directory (power9) is active.  */

  return 3;  /* Both directories (power10, power9) are active.  */

> ...which still creates magic numbers and really doesn't explain what the
> result represents.
> The entire API is documented in a 2-line comment before count_hwcaps().
> For someone to update the implementation for a new hwcap, which will
> need to be done fairly often, seems to me to be a bit challenging for
> the uninitiated (like me ;-).

The comments are in elf/dl-hwcaps.h:

/* Colon-separated string of glibc-hwcaps subdirectories, without the
   "glibc-hwcaps/" prefix.  The most preferred subdirectory needs to
   be listed first.  */
extern const char _dl_hwcaps_subdirs[] attribute_hidden;

/* Returns a bitmap of active subdirectories in _dl_hwcaps_subdirs.
   Bit 0 (the LSB) corresponds to the first substring in
   _dl_hwcaps_subdirs, bit 1 to the second substring, and so on.
   There is no direct correspondence between HWCAP bitmasks and this
   bitmask.  */
int32_t _dl_hwcaps_subdirs_active (void) attribute_hidden;

>> > Perhaps something like (not tested):
>> > --
>> > const char * const _dl_hwcaps_subdirs[] = {
>> > #define _DL_HWCAPS_SUBDIR_POWER10_BIT 0x2 /* or 1 to preserve same order.  */
>> > 	"power10",
>> > #define _DL_HWCAPS_SUBDIR_POWER9_BIT 0x1 /* or 2.  */
>> > 	"power9"
>> > };
>> >
>> > int32_t
>> > _dl_hwcaps_subdirs_active (void)
>> > {
>> >   int32_t result = 0;
>> >
>> >   if (GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_1)
>> >     result |= _DL_HWCAPS_SUBDIR_POWER10_BIT;
>> >
>> >   if (GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_00)
>> >     result |= _DL_HWCAPS_SUBDIR_POWER9_BIT;
>> >
>> >   return result;
>> > }
>> > --
>> >
>> > Of course, that would require changes to the code that parses
>> > _dl_hwcaps_subdirs.
>> 
>> I chose the current approach to avoid relocations and memory allocations
>> for processing hwcaps settings (e.g. from the ld.so command line).
>
> Does the "array of strings" approach introduce new/additional relocations
> and memory allocations?  It would seem to avoid the need for the splitting
> code, at least.

The string pointers need runtime relocations (although they probably do
not matter in the grand scheme of things).  We still need the splitting
code for the ld.so command line argument.

Thanks,
Florian
  
Paul A. Clarke Oct. 6, 2020, 5:45 p.m. UTC | #5
On Tue, Oct 06, 2020 at 02:20:21PM +0200, Florian Weimer via Libc-alpha wrote:
> * Paul A. Clarke:
> > On Mon, Oct 05, 2020 at 11:47:32AM +0200, Florian Weimer wrote:
> >> * Paul A. Clarke:
> >> >> diff --git a/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c b/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c
> >> >> new file mode 100644
> >> >> index 0000000000..496daf0fa0
> >> >> --- /dev/null
> >> >> +++ b/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c
> > [...snip...]
> >> >> +_dl_hwcaps_subdirs_active (void)
> >> >> +{
> >> >> +  if (GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_1)
> >> >> +    return 3;
> >> >> +
> >> >> +  if (GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_00)
> >> >> +    return 1;
> 
> Meh, the second test should do: return 2; 8-(
> 
> Maybe we should write it this way:
> 
>   if ((GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_00)) == 0)
>     return 0;  /* No subdirectories active.  */
> 
>   if ((GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_1)) == 0)
>     return 2;  /* Only the second directory (power9) is active.  */
> 
>   return 3;  /* Both directories (power10, power9) are active.  */
> 
> > ...which still creates magic numbers and really doesn't explain what the
> > result represents.

Even with that rewrite, the "magic numbers" are still there.

> > The entire API is documented in a 2-line comment before count_hwcaps().
> > For someone to update the implementation for a new hwcap, which will
> > need to be done fairly often, seems to me to be a bit challenging for
> > the uninitiated (like me ;-).
> 
> The comments are in elf/dl-hwcaps.h:
> 
> /* Colon-separated string of glibc-hwcaps subdirectories, without the
>    "glibc-hwcaps/" prefix.  The most preferred subdirectory needs to
>    be listed first.  */
> extern const char _dl_hwcaps_subdirs[] attribute_hidden;
> 
> /* Returns a bitmap of active subdirectories in _dl_hwcaps_subdirs.
>    Bit 0 (the LSB) corresponds to the first substring in
>    _dl_hwcaps_subdirs, bit 1 to the second substring, and so on.
>    There is no direct correspondence between HWCAP bitmasks and this
>    bitmask.  */
> int32_t _dl_hwcaps_subdirs_active (void) attribute_hidden;
> 
> >> > Perhaps something like (not tested):
> >> > --
> >> > const char * const _dl_hwcaps_subdirs[] = {
> >> > #define _DL_HWCAPS_SUBDIR_POWER10_BIT 0x2 /* or 1 to preserve same order.  */
> >> > 	"power10",
> >> > #define _DL_HWCAPS_SUBDIR_POWER9_BIT 0x1 /* or 2.  */
> >> > 	"power9"
> >> > };
> >> >
> >> > int32_t
> >> > _dl_hwcaps_subdirs_active (void)
> >> > {
> >> >   int32_t result = 0;
> >> >
> >> >   if (GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_1)
> >> >     result |= _DL_HWCAPS_SUBDIR_POWER10_BIT;
> >> >
> >> >   if (GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_00)
> >> >     result |= _DL_HWCAPS_SUBDIR_POWER9_BIT;
> >> >
> >> >   return result;
> >> > }
> >> > --
> >> >
> >> > Of course, that would require changes to the code that parses
> >> > _dl_hwcaps_subdirs.
> >> 
> >> I chose the current approach to avoid relocations and memory allocations
> >> for processing hwcaps settings (e.g. from the ld.so command line).
> >
> > Does the "array of strings" approach introduce new/additional relocations
> > and memory allocations?  It would seem to avoid the need for the splitting
> > code, at least.
> 
> The string pointers need runtime relocations (although they probably do
> not matter in the grand scheme of things).  We still need the splitting
> code for the ld.so command line argument.

Ah, so you are trying to model the "hwcaps subdirs" on the command-line
argument.  The latter has to be split to be used, why not allow the
subdirs to be "pre-split", then both could be handled as an array
of strings?  The array for the subdirs could be pre-allocated (if it will
always be less than 32 entries) and filled in directly
during "active" processing, obviating the need for the mysterious
and ephemeral bits indicating "use this"/"don't use this".

PC
  
Florian Weimer Oct. 9, 2020, 9:06 a.m. UTC | #6
* Paul A. Clarke via Libc-alpha:

> Ah, so you are trying to model the "hwcaps subdirs" on the command-line
> argument.  The latter has to be split to be used, why not allow the
> subdirs to be "pre-split", then both could be handled as an array
> of strings?  The array for the subdirs could be pre-allocated (if it will
> always be less than 32 entries) and filled in directly
> during "active" processing, obviating the need for the mysterious
> and ephemeral bits indicating "use this"/"don't use this".

Splitting the strings will need memory allocation and all that.  It may
look more familiar, but I don't think it's going to be simpler.

For the --help diagnostic, we also need information on all potentially
supported (but inactive) subdirectory strings.  So something that
computes a single array would not be sufficient.

I'll see if I can cut down on the magic numbers by other means.

Thanks,
Florian
  

Patch

diff --git a/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c b/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c
new file mode 100644
index 0000000000..496daf0fa0
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c
@@ -0,0 +1,34 @@ 
+/* Architecture-specific glibc-hwcaps subdirectories.  powerpc64le version.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <dl-hwcaps.h>
+#include <ldsodefs.h>
+
+const char _dl_hwcaps_subdirs[] = "power10:power9";
+
+int32_t
+_dl_hwcaps_subdirs_active (void)
+{
+  if (GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_1)
+    return 3;
+
+  if (GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_00)
+    return 1;
+
+  return 0;
+}