diff mbox series

riscv: Resolve symbols directly for symbols with STO_RISCV_VARIANT_CC.

Message ID 1628480827-366232-1-git-send-email-kai.wang@sifive.com
State New
Headers show
Series riscv: Resolve symbols directly for symbols with STO_RISCV_VARIANT_CC. | expand

Checks

Context Check Description
dj/TryBot-apply_patch success Patch applied to master at the time it was sent
dj/TryBot-32bit success Build for i686

Commit Message

Hsiangkai Wang Aug. 9, 2021, 3:47 a.m. UTC
In some cases, we do not want to go through the resolver for function
calls. For example, functions with vector arguments will use vector
registers to pass arguments. In the resolver, we do not save/restore the
vector argument registers for lazy binding efficiency. To avoid ruining
the vector arguments, functions with vector arguments will not go
through the resolver.

To achieve the goal, we will annotate the function symbols with
STO_RISCV_VARIANT_CC flag and add DT_RISCV_VARIANT_CC tag in the dynamic
section. In the first pass on PLT relocations, we do not set up to call
_dl_runtime_resolve. Instead, we resolve the functions directly.

The related discussion could be found in
https://github.com/riscv/riscv-elf-psabi-doc/pull/190.
---
 elf/elf.h                    |  7 +++++++
 sysdeps/riscv/dl-dtprocnum.h | 21 +++++++++++++++++++++
 sysdeps/riscv/dl-machine.h   | 26 ++++++++++++++++++++++++++
 3 files changed, 54 insertions(+)
 create mode 100644 sysdeps/riscv/dl-dtprocnum.h

Comments

Palmer Dabbelt Aug. 11, 2021, 10:11 p.m. UTC | #1
On Sun, 08 Aug 2021 20:47:07 PDT (-0700), kai.wang@sifive.com wrote:
> In some cases, we do not want to go through the resolver for function
> calls. For example, functions with vector arguments will use vector
> registers to pass arguments. In the resolver, we do not save/restore the
> vector argument registers for lazy binding efficiency. To avoid ruining
> the vector arguments, functions with vector arguments will not go
> through the resolver.
>
> To achieve the goal, we will annotate the function symbols with
> STO_RISCV_VARIANT_CC flag and add DT_RISCV_VARIANT_CC tag in the dynamic
> section. In the first pass on PLT relocations, we do not set up to call
> _dl_runtime_resolve. Instead, we resolve the functions directly.
>
> The related discussion could be found in
> https://github.com/riscv/riscv-elf-psabi-doc/pull/190.

I'm trying to stay away from the foundation stuff these days, but from 
reading that spec it doesn't look like the variant CC has anything to do 
specifically with the V extension but instead is about allowing for 
arbitrary calling conventions for X and F registers as well.  Handling 
non-standard X-register calling conventions is really a whole different 
beast than handling non-standard F or V register calling conventions.  
It's not crazy to allow for these, but lumping them all into a single 
bit just makes this unnecessarily difficult.

In theory we can handle arbitrary F or V register calling conventions 
with the standard resolver, we just need to ensure we deal with 
save/restore of those registers.  IMO the right way to go is to just ban 
F and V register use from the resolver, as there's not much of a reason 
for them to be in there (none for F, and you probably don't want to spin 
up the V registers).  That just requires sorting out the build system, 
and would still allow us to have lazy binding for the variants.  At a 
bare minimum we can support arbitrary V register calling conventions for 
free until we start using the V registers in glibc, which is likely 
quite a way off.

Handling arbitrary X-register calling conventions is a lot tricker: for 
the fully arbitrary case we can't even have PLT entries, for example.  
Is there actually a use case for these?  If so we'll have to support it, 
it just seems odd that one would care enough about the X ABI to want a 
different one while still being able to deal with the overhead of a 
dynamic call.

So I'm not opposed to doing something like this, we just need some way 
to make sure it's actually solving a problem.

> ---
>  elf/elf.h                    |  7 +++++++
>  sysdeps/riscv/dl-dtprocnum.h | 21 +++++++++++++++++++++
>  sysdeps/riscv/dl-machine.h   | 26 ++++++++++++++++++++++++++
>  3 files changed, 54 insertions(+)
>  create mode 100644 sysdeps/riscv/dl-dtprocnum.h
>
> diff --git a/elf/elf.h b/elf/elf.h
> index 4738dfa..0de29bf 100644
> --- a/elf/elf.h
> +++ b/elf/elf.h
> @@ -3889,6 +3889,13 @@ enum
>
>  #define R_TILEGX_NUM		130
>
> +/* RISC-V specific values for the Dyn d_tag field.  */
> +#define DT_RISCV_VARIANT_CC	(DT_LOPROC + 1)
> +#define DT_RISCV_NUM		2
> +
> +/* RISC-V specific values for the st_other field.  */
> +#define STO_RISCV_VARIANT_CC	0x80
> +
>  /* RISC-V ELF Flags */
>  #define EF_RISCV_RVC 			0x0001
>  #define EF_RISCV_FLOAT_ABI 		0x0006
> diff --git a/sysdeps/riscv/dl-dtprocnum.h b/sysdeps/riscv/dl-dtprocnum.h
> new file mode 100644
> index 0000000..f189fd7
> --- /dev/null
> +++ b/sysdeps/riscv/dl-dtprocnum.h
> @@ -0,0 +1,21 @@
> +/* Configuration of lookup functions.  RISC-V version.
> +   Copyright (C) 2019-2021 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library.  If not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +/* Number of extra dynamic section entries for this architecture.  By
> +   default there are none.  */
> +#define DT_THISPROCNUM	DT_RISCV_NUM
> diff --git a/sysdeps/riscv/dl-machine.h b/sysdeps/riscv/dl-machine.h
> index aedf69f..6488483 100644
> --- a/sysdeps/riscv/dl-machine.h
> +++ b/sysdeps/riscv/dl-machine.h
> @@ -54,6 +54,9 @@
>  #define ELF_MACHINE_NO_REL 1
>  #define ELF_MACHINE_NO_RELA 0
>
> +/* Translate a processor specific dynamic tag to the index in l_info array.  */
> +#define DT_RISCV(x) (DT_RISCV_##x - DT_LOPROC + DT_NUM)
> +
>  /* Return nonzero iff ELF header is compatible with the running host.  */
>  static inline int __attribute_used__
>  elf_machine_matches_host (const ElfW(Ehdr) *ehdr)
> @@ -299,6 +302,29 @@ elf_machine_lazy_rel (struct link_map *map, ElfW(Addr) l_addr,
>    /* Check for unexpected PLT reloc type.  */
>    if (__glibc_likely (r_type == R_RISCV_JUMP_SLOT))
>      {
> +      if (__glibc_unlikely (map->l_info[DT_RISCV (VARIANT_CC)] != NULL))
> +	{
> +	  /* Check the symbol table for variant CC symbols.  */
> +	  const Elf_Symndx symndx = ELFW(R_SYM) (reloc->r_info);
> +	  const ElfW(Sym) *symtab =
> +	    (const void *)D_PTR (map, l_info[DT_SYMTAB]);
> +	  const ElfW(Sym) *sym = &symtab[symndx];
> +	  if (__glibc_unlikely (sym->st_other & STO_RISCV_VARIANT_CC))
> +	    {
> +	      /* Avoid lazy resolution of variant CC symbols.  */
> +	      const struct r_found_version *version = NULL;
> +	      if (map->l_info[VERSYMIDX (DT_VERSYM)] != NULL)
> +		{
> +		  const ElfW(Half) *vernum =
> +		    (const void *)D_PTR (map, l_info[VERSYMIDX (DT_VERSYM)]);
> +		  version = &map->l_versions[vernum[symndx] & 0x7fff];
> +		}
> +	      elf_machine_rela (map, reloc, sym, version, reloc_addr,
> +				skip_ifunc);
> +	      return;
> +	    }
> +	}
> +
>        if (__glibc_unlikely (map->l_mach.plt == 0))
>  	{
>  	  if (l_addr)
Jessica Clarke Aug. 11, 2021, 10:51 p.m. UTC | #2
On 11 Aug 2021, at 23:11, Palmer Dabbelt <palmer@dabbelt.com> wrote:
> 
> On Sun, 08 Aug 2021 20:47:07 PDT (-0700), kai.wang@sifive.com wrote:
>> In some cases, we do not want to go through the resolver for function
>> calls. For example, functions with vector arguments will use vector
>> registers to pass arguments. In the resolver, we do not save/restore the
>> vector argument registers for lazy binding efficiency. To avoid ruining
>> the vector arguments, functions with vector arguments will not go
>> through the resolver.
>> 
>> To achieve the goal, we will annotate the function symbols with
>> STO_RISCV_VARIANT_CC flag and add DT_RISCV_VARIANT_CC tag in the dynamic
>> section. In the first pass on PLT relocations, we do not set up to call
>> _dl_runtime_resolve. Instead, we resolve the functions directly.
>> 
>> The related discussion could be found in
>> https://github.com/riscv/riscv-elf-psabi-doc/pull/190.
> 
> I'm trying to stay away from the foundation stuff these days, but from reading that spec it doesn't look like the variant CC has anything to do specifically with the V extension but instead is about allowing for arbitrary calling conventions for X and F registers as well.  Handling non-standard X-register calling conventions is really a whole different beast than handling non-standard F or V register calling conventions.  It's not crazy to allow for these, but lumping them all into a single bit just makes this unnecessarily difficult.

Whilst increased granularity would be nice to have, the reality is that st_other is an 8-bit field, 2 of which are currently allocated by the gABI, with proposals to grow that to 3, so there are only 5 bits to play with, thus anything more than just a single bit for this needs an extremely good justification beyond “it would be less difficult to use”, unfortunately. This bit is sufficient to address the issue with the V calling convention, and is also sufficient for any other non-standard calling convention, whether it involves X registers, F registers or something that hasn’t even been thought up yet.

The second reason why it’s done this way is I have a strong belief that RISC-V should not reinvent the wheel except for problems unique to RISC-V. AArch64 faced exactly the same problem with SVE a few years ago and this is the exact same solution they use and is deployed everywhere, so by following their lead we can be confident it works and reuse existing implementation code for the various tools and runtimes affected.

> In theory we can handle arbitrary F or V register calling conventions with the standard resolver, we just need to ensure we deal with save/restore of those registers.  IMO the right way to go is to just ban F and V register use from the resolver, as there's not much of a reason for them to be in there (none for F, and you probably don't want to spin up the V registers).  That just requires sorting out the build system, and would still allow us to have lazy binding for the variants.  At a bare minimum we can support arbitrary V register calling conventions for free until we start using the V registers in glibc, which is likely quite a way off.

It’s a nice idea in theory, but in practice it doesn’t work, the resolver path is far too complicated for that. Run-time linkers like to call into libc for various things (e.g. on FreeBSD, if you are in a multi-threaded program, all the locking and unlocking guarding the shared data structures ends up calling into libthr via hooks, and in glibc similarly it uses libpthread’s real mutex functions), so you end up needing to make sure those parts of your (extended) libc are also free from vectors and floats (including any auto-vectorisation). But the real kicker is IFUNCs; with lazy binding, the IFUNC resolver isn’t called until you first call the function, and you have no clue what the resolver is going to do other than that it’ll follow the ABI (we should specify that, but IFUNCs as a whole are underspecified in the psABI, we don’t currently document what the arguments are). That means the resolver is free to use F and V registers. Yes, all of these can be individually addressed via various extremely-targeted means, but experience suggests that it will break sooner or later. As a simple example, none of the libc functions called by the run-time linker could ever use memcpy, be it explicit or implicit (compiler-inserted instead of a large inline struct assignment), since that will eventually be vectorised, even if you compiled the directly-called libc functions themselves without V.

> Handling arbitrary X-register calling conventions is a lot tricker: for the fully arbitrary case we can't even have PLT entries, for example.  Is there actually a use case for these?  If so we'll have to support it, it just seems odd that one would care enough about the X ABI to want a different one while still being able to deal with the overhead of a dynamic call.

You’re correct that as it stands PLT entries are technically broken by this. We should fix that so even VARAINT_CC functions can’t use those registers as argument registers (AArch64 has that implicitly as it has IP0 and IP1 as reserved registers for PLTs and veneers, but since we do linker relaxation rather than range-extending with veneers we haven’t yet needed to explicitly reserve the PLT registers).

Jess

> So I'm not opposed to doing something like this, we just need some way to make sure it's actually solving a problem.
> 
>> ---
>> elf/elf.h                    |  7 +++++++
>> sysdeps/riscv/dl-dtprocnum.h | 21 +++++++++++++++++++++
>> sysdeps/riscv/dl-machine.h   | 26 ++++++++++++++++++++++++++
>> 3 files changed, 54 insertions(+)
>> create mode 100644 sysdeps/riscv/dl-dtprocnum.h
>> 
>> diff --git a/elf/elf.h b/elf/elf.h
>> index 4738dfa..0de29bf 100644
>> --- a/elf/elf.h
>> +++ b/elf/elf.h
>> @@ -3889,6 +3889,13 @@ enum
>> 
>> #define R_TILEGX_NUM		130
>> 
>> +/* RISC-V specific values for the Dyn d_tag field.  */
>> +#define DT_RISCV_VARIANT_CC	(DT_LOPROC + 1)
>> +#define DT_RISCV_NUM		2
>> +
>> +/* RISC-V specific values for the st_other field.  */
>> +#define STO_RISCV_VARIANT_CC	0x80
>> +
>> /* RISC-V ELF Flags */
>> #define EF_RISCV_RVC 			0x0001
>> #define EF_RISCV_FLOAT_ABI 		0x0006
>> diff --git a/sysdeps/riscv/dl-dtprocnum.h b/sysdeps/riscv/dl-dtprocnum.h
>> new file mode 100644
>> index 0000000..f189fd7
>> --- /dev/null
>> +++ b/sysdeps/riscv/dl-dtprocnum.h
>> @@ -0,0 +1,21 @@
>> +/* Configuration of lookup functions.  RISC-V version.
>> +   Copyright (C) 2019-2021 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library.  If not, see
>> +   <https://www.gnu.org/licenses/>.  */
>> +
>> +/* Number of extra dynamic section entries for this architecture.  By
>> +   default there are none.  */
>> +#define DT_THISPROCNUM	DT_RISCV_NUM
>> diff --git a/sysdeps/riscv/dl-machine.h b/sysdeps/riscv/dl-machine.h
>> index aedf69f..6488483 100644
>> --- a/sysdeps/riscv/dl-machine.h
>> +++ b/sysdeps/riscv/dl-machine.h
>> @@ -54,6 +54,9 @@
>> #define ELF_MACHINE_NO_REL 1
>> #define ELF_MACHINE_NO_RELA 0
>> 
>> +/* Translate a processor specific dynamic tag to the index in l_info array.  */
>> +#define DT_RISCV(x) (DT_RISCV_##x - DT_LOPROC + DT_NUM)
>> +
>> /* Return nonzero iff ELF header is compatible with the running host.  */
>> static inline int __attribute_used__
>> elf_machine_matches_host (const ElfW(Ehdr) *ehdr)
>> @@ -299,6 +302,29 @@ elf_machine_lazy_rel (struct link_map *map, ElfW(Addr) l_addr,
>>   /* Check for unexpected PLT reloc type.  */
>>   if (__glibc_likely (r_type == R_RISCV_JUMP_SLOT))
>>     {
>> +      if (__glibc_unlikely (map->l_info[DT_RISCV (VARIANT_CC)] != NULL))
>> +	{
>> +	  /* Check the symbol table for variant CC symbols.  */
>> +	  const Elf_Symndx symndx = ELFW(R_SYM) (reloc->r_info);
>> +	  const ElfW(Sym) *symtab =
>> +	    (const void *)D_PTR (map, l_info[DT_SYMTAB]);
>> +	  const ElfW(Sym) *sym = &symtab[symndx];
>> +	  if (__glibc_unlikely (sym->st_other & STO_RISCV_VARIANT_CC))
>> +	    {
>> +	      /* Avoid lazy resolution of variant CC symbols.  */
>> +	      const struct r_found_version *version = NULL;
>> +	      if (map->l_info[VERSYMIDX (DT_VERSYM)] != NULL)
>> +		{
>> +		  const ElfW(Half) *vernum =
>> +		    (const void *)D_PTR (map, l_info[VERSYMIDX (DT_VERSYM)]);
>> +		  version = &map->l_versions[vernum[symndx] & 0x7fff];
>> +		}
>> +	      elf_machine_rela (map, reloc, sym, version, reloc_addr,
>> +				skip_ifunc);
>> +	      return;
>> +	    }
>> +	}
>> +
>>       if (__glibc_unlikely (map->l_mach.plt == 0))
>> 	{
>> 	  if (l_addr)
Jim Wilson Aug. 11, 2021, 11 p.m. UTC | #3
On Wed, Aug 11, 2021 at 3:13 PM Palmer Dabbelt <palmer@dabbelt.com> wrote:

> I'm trying to stay away from the foundation stuff these days, but from
> reading that spec it doesn't look like the variant CC has anything to do
> specifically with the V extension but instead is about allowing for
> arbitrary calling conventions for X and F registers as well.  Handling
> non-standard X-register calling conventions is really a whole different
> beast than handling non-standard F or V register calling conventions.
> It's not crazy to allow for these, but lumping them all into a single
> bit just makes this unnecessarily difficult.
>

This is the exact same solution that the aarch64 port uses.  We aren't
inventing anything new here.

The intent is not to support arbitrary calling conventions.  The intent is
that we will have two official calling conventions once we add vector
support, one that uses vector registers and one
that doesn't.  Most code will use the non-vector register ABI, and we
expect glibc will be compiled this way for maximum portability.  But we
will need to support the second ABI that does have vector registers, and
hence we only need a single bit for that at this time.

If you aren't willing to participate in the official ABI committee, then
you should at least be willing to accept their decisions.

and would still allow us to have lazy binding for the variants.  At a
> bare minimum we can support arbitrary V register calling conventions for
> free until we start using the V registers in glibc, which is likely
> quite a way off.
>

We already have patches to use V registers in glibc, and we are already
using them.  We already upstreamed the ifunc support for binutils and glibc
required to use vector instructions.  We use them for mem* and str*
routines.  Adopting the position that the resolver isn't allowed to call
any mem* or str* functions is unreasonable.  Hence the need for this patch.

Jim
Palmer Dabbelt Aug. 11, 2021, 11:21 p.m. UTC | #4
On Wed, 11 Aug 2021 16:00:31 PDT (-0700), Jim Wilson wrote:
> On Wed, Aug 11, 2021 at 3:13 PM Palmer Dabbelt <palmer@dabbelt.com> wrote:
>
>> I'm trying to stay away from the foundation stuff these days, but from
>> reading that spec it doesn't look like the variant CC has anything to do
>> specifically with the V extension but instead is about allowing for
>> arbitrary calling conventions for X and F registers as well.  Handling
>> non-standard X-register calling conventions is really a whole different
>> beast than handling non-standard F or V register calling conventions.
>> It's not crazy to allow for these, but lumping them all into a single
>> bit just makes this unnecessarily difficult.
>>
>
> This is the exact same solution that the aarch64 port uses.  We aren't
> inventing anything new here.
>
> The intent is not to support arbitrary calling conventions.  The intent is
> that we will have two official calling conventions once we add vector
> support, one that uses vector registers and one
> that doesn't.  Most code will use the non-vector register ABI, and we
> expect glibc will be compiled this way for maximum portability.  But we
> will need to support the second ABI that does have vector registers, and
> hence we only need a single bit for that at this time.
>
> If you aren't willing to participate in the official ABI committee, then
> you should at least be willing to accept their decisions.

All I'm trying to do here is point out that the ABI you've defined has 
some much broader costs to implement than the features you're 
advertising you want to use it for.  If this is the ABI you want that's 
fine, I don't really care, just go make it part of the official psABI 
spec and we'll get on with our lives.

> and would still allow us to have lazy binding for the variants.  At a
>> bare minimum we can support arbitrary V register calling conventions for
>> free until we start using the V registers in glibc, which is likely
>> quite a way off.
>>
>
> We already have patches to use V registers in glibc, and we are already
> using them.  We already upstreamed the ifunc support for binutils and glibc
> required to use vector instructions.  We use them for mem* and str*
> routines.  Adopting the position that the resolver isn't allowed to call
> any mem* or str* functions is unreasonable.  Hence the need for this patch.
>
> Jim
Jessica Clarke Aug. 11, 2021, 11:32 p.m. UTC | #5
On 12 Aug 2021, at 00:21, Palmer Dabbelt <palmer@dabbelt.com> wrote:
> On Wed, 11 Aug 2021 16:00:31 PDT (-0700), Jim Wilson wrote:
>> On Wed, Aug 11, 2021 at 3:13 PM Palmer Dabbelt <palmer@dabbelt.com> wrote:
>> 
>>> I'm trying to stay away from the foundation stuff these days, but from
>>> reading that spec it doesn't look like the variant CC has anything to do
>>> specifically with the V extension but instead is about allowing for
>>> arbitrary calling conventions for X and F registers as well.  Handling
>>> non-standard X-register calling conventions is really a whole different
>>> beast than handling non-standard F or V register calling conventions.
>>> It's not crazy to allow for these, but lumping them all into a single
>>> bit just makes this unnecessarily difficult.
>>> 
>> 
>> This is the exact same solution that the aarch64 port uses.  We aren't
>> inventing anything new here.
>> 
>> The intent is not to support arbitrary calling conventions.  The intent is
>> that we will have two official calling conventions once we add vector
>> support, one that uses vector registers and one
>> that doesn't.  Most code will use the non-vector register ABI, and we
>> expect glibc will be compiled this way for maximum portability.  But we
>> will need to support the second ABI that does have vector registers, and
>> hence we only need a single bit for that at this time.
>> 
>> If you aren't willing to participate in the official ABI committee, then
>> you should at least be willing to accept their decisions.
> 
> All I'm trying to do here is point out that the ABI you've defined has some much broader costs to implement than the features you're advertising you want to use it for.  If this is the ABI you want that's fine, I don't really care, just go make it part of the official psABI spec and we'll get on with our lives.

As I explained in my reply, in practice the solution for the vector calling convention specifically ends up being the solution for the more general problem, and so there are no such broader costs beyond the edge-case you identified that’s just an underspecification issue.

Regarding making it a part of the psABI spec, it has already been approved by all relevant parties, it has just been waiting on a proof-of-concept, or better, patch for either glibc or FreeBSD, which this provides, since for important features like this our policy is to not merge additions to the psABI spec until patches exist to prove its viability.

Jess

>> and would still allow us to have lazy binding for the variants.  At a
>>> bare minimum we can support arbitrary V register calling conventions for
>>> free until we start using the V registers in glibc, which is likely
>>> quite a way off.
>>> 
>> 
>> We already have patches to use V registers in glibc, and we are already
>> using them.  We already upstreamed the ifunc support for binutils and glibc
>> required to use vector instructions.  We use them for mem* and str*
>> routines.  Adopting the position that the resolver isn't allowed to call
>> any mem* or str* functions is unreasonable.  Hence the need for this patch.
>> 
>> Jim
Palmer Dabbelt Aug. 11, 2021, 11:52 p.m. UTC | #6
On Wed, 11 Aug 2021 16:32:38 PDT (-0700), jrtc27@jrtc27.com wrote:
> On 12 Aug 2021, at 00:21, Palmer Dabbelt <palmer@dabbelt.com> wrote:
>> On Wed, 11 Aug 2021 16:00:31 PDT (-0700), Jim Wilson wrote:
>>> On Wed, Aug 11, 2021 at 3:13 PM Palmer Dabbelt <palmer@dabbelt.com> wrote:
>>> 
>>>> I'm trying to stay away from the foundation stuff these days, but from
>>>> reading that spec it doesn't look like the variant CC has anything to do
>>>> specifically with the V extension but instead is about allowing for
>>>> arbitrary calling conventions for X and F registers as well.  Handling
>>>> non-standard X-register calling conventions is really a whole different
>>>> beast than handling non-standard F or V register calling conventions.
>>>> It's not crazy to allow for these, but lumping them all into a single
>>>> bit just makes this unnecessarily difficult.
>>>> 
>>> 
>>> This is the exact same solution that the aarch64 port uses.  We aren't
>>> inventing anything new here.
>>> 
>>> The intent is not to support arbitrary calling conventions.  The intent is
>>> that we will have two official calling conventions once we add vector
>>> support, one that uses vector registers and one
>>> that doesn't.  Most code will use the non-vector register ABI, and we
>>> expect glibc will be compiled this way for maximum portability.  But we
>>> will need to support the second ABI that does have vector registers, and
>>> hence we only need a single bit for that at this time.
>>> 
>>> If you aren't willing to participate in the official ABI committee, then
>>> you should at least be willing to accept their decisions.
>> 
>> All I'm trying to do here is point out that the ABI you've defined has some much broader costs to implement than the features you're advertising you want to use it for.  If this is the ABI you want that's fine, I don't really care, just go make it part of the official psABI spec and we'll get on with our lives.
>
> As I explained in my reply, in practice the solution for the vector calling convention specifically ends up being the solution for the more general problem, and so there are no such broader costs beyond the edge-case you identified that’s just an underspecification issue.
>
> Regarding making it a part of the psABI spec, it has already been approved by all relevant parties, it has just been waiting on a proof-of-concept, or better, patch for either glibc or FreeBSD, which this provides, since for important features like this our policy is to not merge additions to the psABI spec until patches exist to prove its viability.

OK, no problem.  Next time just please make it clear when you guys send 
something out for review that all discussion here is irrelevant, because 
it'll save us the time.

IIUC the ABI patch still isn't merged, which is what makes it actually 
official.  I'll take this when it lands.

>
> Jess
>
>>> and would still allow us to have lazy binding for the variants.  At a
>>>> bare minimum we can support arbitrary V register calling conventions for
>>>> free until we start using the V registers in glibc, which is likely
>>>> quite a way off.
>>>> 
>>> 
>>> We already have patches to use V registers in glibc, and we are already
>>> using them.  We already upstreamed the ifunc support for binutils and glibc
>>> required to use vector instructions.  We use them for mem* and str*
>>> routines.  Adopting the position that the resolver isn't allowed to call
>>> any mem* or str* functions is unreasonable.  Hence the need for this patch.
>>> 
>>> Jim
Jessica Clarke Aug. 12, 2021, 12:54 a.m. UTC | #7
On 12 Aug 2021, at 00:52, Palmer Dabbelt <palmer@dabbelt.com> wrote:
> On Wed, 11 Aug 2021 16:32:38 PDT (-0700), jrtc27@jrtc27.com wrote:
>> On 12 Aug 2021, at 00:21, Palmer Dabbelt <palmer@dabbelt.com> wrote:
>>> On Wed, 11 Aug 2021 16:00:31 PDT (-0700), Jim Wilson wrote:
>>>> On Wed, Aug 11, 2021 at 3:13 PM Palmer Dabbelt <palmer@dabbelt.com> wrote:
>>>>> I'm trying to stay away from the foundation stuff these days, but from
>>>>> reading that spec it doesn't look like the variant CC has anything to do
>>>>> specifically with the V extension but instead is about allowing for
>>>>> arbitrary calling conventions for X and F registers as well.  Handling
>>>>> non-standard X-register calling conventions is really a whole different
>>>>> beast than handling non-standard F or V register calling conventions.
>>>>> It's not crazy to allow for these, but lumping them all into a single
>>>>> bit just makes this unnecessarily difficult.
>>>> This is the exact same solution that the aarch64 port uses.  We aren't
>>>> inventing anything new here.
>>>> The intent is not to support arbitrary calling conventions.  The intent is
>>>> that we will have two official calling conventions once we add vector
>>>> support, one that uses vector registers and one
>>>> that doesn't.  Most code will use the non-vector register ABI, and we
>>>> expect glibc will be compiled this way for maximum portability.  But we
>>>> will need to support the second ABI that does have vector registers, and
>>>> hence we only need a single bit for that at this time.
>>>> If you aren't willing to participate in the official ABI committee, then
>>>> you should at least be willing to accept their decisions.
>>> All I'm trying to do here is point out that the ABI you've defined has some much broader costs to implement than the features you're advertising you want to use it for.  If this is the ABI you want that's fine, I don't really care, just go make it part of the official psABI spec and we'll get on with our lives.
>> 
>> As I explained in my reply, in practice the solution for the vector calling convention specifically ends up being the solution for the more general problem, and so there are no such broader costs beyond the edge-case you identified that’s just an underspecification issue.
>> 
>> Regarding making it a part of the psABI spec, it has already been approved by all relevant parties, it has just been waiting on a proof-of-concept, or better, patch for either glibc or FreeBSD, which this provides, since for important features like this our policy is to not merge additions to the psABI spec until patches exist to prove its viability.
> 
> OK, no problem.  Next time just please make it clear when you guys send something out for review that all discussion here is irrelevant, because it'll save us the time.

That’s an unfair characterisation and, whilst I appreciate that Jim’s reply was antagonistic in places, please do not take it out on me by using that kind of tone; I’ve tried to address all your comments and concerns with technical justifications. Were there to be actual problems with the implementation we would of course take that feedback; that is one of the reasons why we don’t merge a consequential specification change until people have actually implemented it. In this case AArch64 has already gone this route two years ago and lived to tell the tale without obvious issues so in this specific case there’s unlikely to be a problem that would result in us not adopting the proposal, but it’s always possible we’ve missed something. My point was “this is the specification we (the psABI chairs, plus representatives from both the GNU and LLVM world) will be going with unless anyone has a good reason why it is a bad idea”.

However, there is a small element of truth to your comment. Discussion here isn’t irrelevant, but it’s not the primary place for ABI specification discussions. Because the RISC-V psABI tries to show no preferential treatment to a specific OS or toolchain (or, in the case of Linux, libc implementation) and seek a specification that works for all, the primary place for discussions is not a GNU-specific or Linux-specific mailing list but the psABI repository, mailing list and meetings (for which there publicly-available minutes), and those who want to have more of an input into it should contribute to the discussions there (anyone is free to offer feedback on psABI issues/PRs). When necessary we do try to reach out to the mailing lists for the specific communities involved (e.g. on the compiler front both GCC/binutils and LLVM have their own regular RISC-V sync-up calls where issues can be brought up and discussed with the outcome reported back to the psABI task group), and feedback on proof-of-concept patches that relate to the specification rather than the implementation will be taken into account, but to avoid fragmentation the ultimate sources of rationale and places for discussion are in general those common central locations.

I’m also not sure why you’re acting so surprised about this addition to the psABI; you yourself reviewed the pull request in question back in May and, as far as I can tell, it had already adopted the STO_RISCV_VARIANT_CC nomenclature, and was only using vector registers as an example of a non-standard calling convention, by that point. Since your conditional approval the wording has been overhauled, in part to address your feedback on wording that was the condition for your approval, but nothing that actually changes things as far as implementations are concerned.

And to be clear, I had no part in the authorship or publishing of this patch, nor do I have any kind of affiliation with SiFive. What SiFive, or its employees, do is entirely up to them and completely independent of myself as co-chair of the psABI specification task group. All I ask is that patches exist; whether they’re proposed as RFC/PoC or not is none of my business beyond making sure that implementations don’t officially adopt new ABI features before they are made official (or at least the kind that matter for compatibility) as that causes as much of a headache for me as it does you.

> IIUC the ABI patch still isn't merged, which is what makes it actually official.

Of course. If in light of the technical explanations from Jim and myself you still feel there are reasons why this should not be adopted then please do express them, but otherwise I expect it will be merged shortly.

> I'll take this when it lands.

Sure; I’d do the same in your shoes too.

Jess

>> Jess
>> 
>>>> and would still allow us to have lazy binding for the variants.  At a
>>>>> bare minimum we can support arbitrary V register calling conventions for
>>>>> free until we start using the V registers in glibc, which is likely
>>>>> quite a way off.
>>>> We already have patches to use V registers in glibc, and we are already
>>>> using them.  We already upstreamed the ifunc support for binutils and glibc
>>>> required to use vector instructions.  We use them for mem* and str*
>>>> routines.  Adopting the position that the resolver isn't allowed to call
>>>> any mem* or str* functions is unreasonable.  Hence the need for this patch.
>>>> Jim
Jim Wilson Oct. 25, 2021, 9:08 p.m. UTC | #8
On Wed, Aug 11, 2021 at 4:52 PM Palmer Dabbelt <palmer@dabbelt.com> wrote:

> IIUC the ABI patch still isn't merged, which is what makes it actually
> official.  I'll take this when it lands.
>

Just following up, the psABI patch was merged in
    https://github.com/riscv/riscv-elf-psabi-doc/pull/190
and the psABI has gone into public review for the first official release.
So we would like this patch from Kai merged in.
    https://sourceware.org/pipermail/libc-alpha/2021-August/129931.html
in case you lost track of the original message on this thread.

Jim
DJ Delorie Oct. 29, 2021, 12:20 a.m. UTC | #9
This patch looks good to me and follows the spec as committed.  I think
it's good to have a "we can't trust the ABI" bit that bypasses all the
problems with lazy relocation and ABI registers.

However, it no longer compiles due to 490e6c62aa31a8aa5c4a059f6e646ede121edf0a
https://sourceware.org/pipermail/libc-alpha/2021-October/131680.html
https://patchwork.sourceware.org/project/glibc/patch/20211006050935.1311740-1-maskray@google.com/

Speficially, this function changed parameters:

-elf_machine_rela (struct link_map *map, const ElfW(Rela) *reloc,
-		  const ElfW(Sym) *sym, const struct r_found_version *version,
+elf_machine_rela (struct link_map *map, struct r_scope_elem *scope[],
+		  const ElfW(Rela) *reloc, const ElfW(Sym) *sym,
+		  const struct r_found_version *version,
 		  void *const reloc_addr, int skip_ifunc)

Hsiangkai Wang <kai.wang@sifive.com> writes:
> +/* RISC-V specific values for the Dyn d_tag field.  */
> +#define DT_RISCV_VARIANT_CC	(DT_LOPROC + 1)
> +#define DT_RISCV_NUM		2

This matches the spec, ok.  I note that (DT_LOPROC) is left unused, so
while there is only one DT_* tag, the "number" of used tags is 2, with
the first tag left undefined.

> +/* RISC-V specific values for the st_other field.  */
> +#define STO_RISCV_VARIANT_CC	0x80

This matches the spec, ok.

> diff --git a/sysdeps/riscv/dl-dtprocnum.h b/sysdeps/riscv/dl-dtprocnum.h
> new file mode 100644
> index 0000000..f189fd7
> --- /dev/null
> +++ b/sysdeps/riscv/dl-dtprocnum.h
> @@ -0,0 +1,21 @@
> +/* Configuration of lookup functions.  RISC-V version.
> +   Copyright (C) 2019-2021 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library.  If not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +/* Number of extra dynamic section entries for this architecture.  By
> +   default there are none.  */
> +#define DT_THISPROCNUM	DT_RISCV_NUM

Ok.

> diff --git a/sysdeps/riscv/dl-machine.h b/sysdeps/riscv/dl-machine.h

> +/* Translate a processor specific dynamic tag to the index in l_info array.  */
> +#define DT_RISCV(x) (DT_RISCV_##x - DT_LOPROC + DT_NUM)

Ok.

> @@ -299,6 +302,29 @@ elf_machine_lazy_rel (struct link_map *map, ElfW(Addr) l_addr,
>    /* Check for unexpected PLT reloc type.  */
>    if (__glibc_likely (r_type == R_RISCV_JUMP_SLOT))
>      {
> +      if (__glibc_unlikely (map->l_info[DT_RISCV (VARIANT_CC)] != NULL))
> +	{

If the object MAY have CC's...

> +	  /* Check the symbol table for variant CC symbols.  */
> +	  const Elf_Symndx symndx = ELFW(R_SYM) (reloc->r_info);
> +	  const ElfW(Sym) *symtab =
> +	    (const void *)D_PTR (map, l_info[DT_SYMTAB]);
> +	  const ElfW(Sym) *sym = &symtab[symndx];

Find the symbol we're relocating against; we already know reloc is not NULL.

> +	  if (__glibc_unlikely (sym->st_other & STO_RISCV_VARIANT_CC))
> +	    {

and if it IS a CC type..

> +	      /* Avoid lazy resolution of variant CC symbols.  */
> +	      const struct r_found_version *version = NULL;
> +	      if (map->l_info[VERSYMIDX (DT_VERSYM)] != NULL)
> +		{
> +		  const ElfW(Half) *vernum =
> +		    (const void *)D_PTR (map, l_info[VERSYMIDX (DT_VERSYM)]);
> +		  version = &map->l_versions[vernum[symndx] & 0x7fff];
> +		}
> +	      elf_machine_rela (map, reloc, sym, version, reloc_addr,
> +				skip_ifunc);

THEN we relocate it now

> +	      return;
> +	    }
> +	}
> +

Else we continue to keep it lazy.

(conveniently, this code is the same as aarch64's)

>        if (__glibc_unlikely (map->l_mach.plt == 0))
>  	{
>  	  if (l_addr)
diff mbox series

Patch

diff --git a/elf/elf.h b/elf/elf.h
index 4738dfa..0de29bf 100644
--- a/elf/elf.h
+++ b/elf/elf.h
@@ -3889,6 +3889,13 @@  enum
 
 #define R_TILEGX_NUM		130
 
+/* RISC-V specific values for the Dyn d_tag field.  */
+#define DT_RISCV_VARIANT_CC	(DT_LOPROC + 1)
+#define DT_RISCV_NUM		2
+
+/* RISC-V specific values for the st_other field.  */
+#define STO_RISCV_VARIANT_CC	0x80
+
 /* RISC-V ELF Flags */
 #define EF_RISCV_RVC 			0x0001
 #define EF_RISCV_FLOAT_ABI 		0x0006
diff --git a/sysdeps/riscv/dl-dtprocnum.h b/sysdeps/riscv/dl-dtprocnum.h
new file mode 100644
index 0000000..f189fd7
--- /dev/null
+++ b/sysdeps/riscv/dl-dtprocnum.h
@@ -0,0 +1,21 @@ 
+/* Configuration of lookup functions.  RISC-V version.
+   Copyright (C) 2019-2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   <https://www.gnu.org/licenses/>.  */
+
+/* Number of extra dynamic section entries for this architecture.  By
+   default there are none.  */
+#define DT_THISPROCNUM	DT_RISCV_NUM
diff --git a/sysdeps/riscv/dl-machine.h b/sysdeps/riscv/dl-machine.h
index aedf69f..6488483 100644
--- a/sysdeps/riscv/dl-machine.h
+++ b/sysdeps/riscv/dl-machine.h
@@ -54,6 +54,9 @@ 
 #define ELF_MACHINE_NO_REL 1
 #define ELF_MACHINE_NO_RELA 0
 
+/* Translate a processor specific dynamic tag to the index in l_info array.  */
+#define DT_RISCV(x) (DT_RISCV_##x - DT_LOPROC + DT_NUM)
+
 /* Return nonzero iff ELF header is compatible with the running host.  */
 static inline int __attribute_used__
 elf_machine_matches_host (const ElfW(Ehdr) *ehdr)
@@ -299,6 +302,29 @@  elf_machine_lazy_rel (struct link_map *map, ElfW(Addr) l_addr,
   /* Check for unexpected PLT reloc type.  */
   if (__glibc_likely (r_type == R_RISCV_JUMP_SLOT))
     {
+      if (__glibc_unlikely (map->l_info[DT_RISCV (VARIANT_CC)] != NULL))
+	{
+	  /* Check the symbol table for variant CC symbols.  */
+	  const Elf_Symndx symndx = ELFW(R_SYM) (reloc->r_info);
+	  const ElfW(Sym) *symtab =
+	    (const void *)D_PTR (map, l_info[DT_SYMTAB]);
+	  const ElfW(Sym) *sym = &symtab[symndx];
+	  if (__glibc_unlikely (sym->st_other & STO_RISCV_VARIANT_CC))
+	    {
+	      /* Avoid lazy resolution of variant CC symbols.  */
+	      const struct r_found_version *version = NULL;
+	      if (map->l_info[VERSYMIDX (DT_VERSYM)] != NULL)
+		{
+		  const ElfW(Half) *vernum =
+		    (const void *)D_PTR (map, l_info[VERSYMIDX (DT_VERSYM)]);
+		  version = &map->l_versions[vernum[symndx] & 0x7fff];
+		}
+	      elf_machine_rela (map, reloc, sym, version, reloc_addr,
+				skip_ifunc);
+	      return;
+	    }
+	}
+
       if (__glibc_unlikely (map->l_mach.plt == 0))
 	{
 	  if (l_addr)