[v2,0/5] linux: Avoid va_list for generic syscall wrappers if possible

Message ID 20230325140815.4170296-1-xry111@xry111.site
Series linux: Avoid va_list for generic syscall wrappers if possible

Message

Xi Ruoyao March 25, 2023, 2:08 p.m. UTC
Currently GCC generates highly sub-optimal code for variadic functions on
architectures where the calling convention prefers registers for argument
passing.  This is GCC PR100955.  While it's technically a missed
optimization in GCC, it seems non-trivial to fix (I've not seen any
compiler that optimizes this properly yet).

As the generic Linux syscall wrappers often use a fixed number of
arguments, and "overreading" the variable arguments is usually safe
(fcntl etc. are already doing this), we can pretend these wrappers were
declared with named arguments and make the compiler do the right thing.
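
For readers unfamiliar with the pattern, the following is a minimal sketch of
a wrapper written in the usual va_list style, where the optional argument is
read only when the flags say it was passed.  It is not glibc code: demo_open,
demo_raw_open, and demo_last are hypothetical names invented for this
illustration, and the "system call" is a stub that only records its inputs.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdarg.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for the real system call.  It only records
   what it was given so the example stays self-contained.  */
struct demo_call { const char *file; int oflag; mode_t mode; };
static struct demo_call demo_last;

static int
demo_raw_open (const char *file, int oflag, mode_t mode)
{
  assert (file != NULL);
  demo_last = (struct demo_call) { file, oflag, mode };
  return 3;  /* Pretend file descriptor.  */
}

/* Variadic wrapper: the optional third argument is fetched with
   va_arg only when O_CREAT says the caller supplied it.  */
int
demo_open (const char *file, int oflag, ...)
{
  mode_t mode = 0;

  if (oflag & O_CREAT)
    {
      va_list ap;
      va_start (ap, oflag);
      /* Read as int: a mode_t narrower than int is promoted to int
         when passed through "...".  */
      mode = (mode_t) va_arg (ap, int);
      va_end (ap);
    }

  return demo_raw_open (file, oflag, mode);
}
```

A caller would write demo_open ("f", O_RDONLY) or
demo_open ("f", O_WRONLY | O_CREAT, 0644); only the second form reads the
variable argument.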

Add a macro __ASSUME_SYSCALL_NAMED_WORKS to control this, which should
be defined if we can safely replace "..." with several named arguments.
Use an internal function prototype with named arguments if it's defined,
for fcntl64, fcntl_nocancel, ioctl, mremap, open64, open64_nocancel,
openat64, openat64_nocancel, prctl, ptrace, and the generic syscall()
wrapper.  I've not changed open* without "64" because I don't have a
test platform.  shm_ctl is also not changed because it takes an
aggregate variable argument, which is trickier than integers or
pointers.

Define this macro for LoongArch, x86-64, and AArch64.  This should
also be suitable for some other architectures (I think it will be fine on
RISC-V) but again I don't have a test platform.

This is the first time I've made such a large change in glibc, so it's
likely I've done something wrong.  Please correct me :).

Xi Ruoyao (5):
  linux: Add __ASSUME_SYSCALL_NAMED_WORKS to allow avoiding va_list for
    generic syscall
  linux: [__ASSUME_SYSCALL_NAMED_WORKS] Avoid using va_list for various
    syscall wrappers
  LoongArch: Define __ASSUME_SYSCALL_NAMED_WORKS for Linux
  x86_64: Define __ASSUME_SYSCALL_NAMED_WORKS for Linux
  aarch64: Define __ASSUME_SYSCALL_NAMED_WORKS for Linux

 include/fcntl.h                               | 32 ++++++++-----
 .../unix/sysv/linux/aarch64/kernel-features.h |  9 ++++
 sysdeps/unix/sysv/linux/fcntl64.c             | 40 +++++++++++++----
 sysdeps/unix/sysv/linux/fcntl_nocancel.c      | 30 ++++++++++---
 sysdeps/unix/sysv/linux/ioctl.c               | 38 ++++++++++++----
 .../sysv/linux/loongarch/kernel-features.h    | 29 ++++++++++++
 sysdeps/unix/sysv/linux/mremap.c              | 36 +++++++++++++--
 sysdeps/unix/sysv/linux/not-cancel.h          |  8 ++--
 sysdeps/unix/sysv/linux/open64.c              | 35 ++++++++++++---
 sysdeps/unix/sysv/linux/open64_nocancel.c     | 29 ++++++++++--
 sysdeps/unix/sysv/linux/openat64.c            | 30 +++++++++++--
 sysdeps/unix/sysv/linux/openat64_nocancel.c   | 29 ++++++++++--
 sysdeps/unix/sysv/linux/prctl.c               | 23 +++++++++-
 sysdeps/unix/sysv/linux/ptrace.c              | 45 ++++++++++++++-----
 sysdeps/unix/sysv/linux/syscall.c             | 35 +++++++++++----
 .../unix/sysv/linux/x86_64/kernel-features.h  |  9 ++++
 16 files changed, 382 insertions(+), 75 deletions(-)
 create mode 100644 sysdeps/unix/sysv/linux/loongarch/kernel-features.h
  

Comments

Carlos O'Donell March 27, 2023, 2:04 p.m. UTC | #1
On 3/25/23 10:08, Xi Ruoyao via Libc-alpha wrote:
> Currently GCC generates highly sub-optimal code for variadic functions on
> architectures where the calling convention prefers registers for argument
> passing.  This is GCC PR100955.  While it's technically a missed
> optimization in GCC, it seems non-trivial to fix (I've not seen any
> compiler that optimizes this properly yet).

I'm glad to see we have a gcc PR open for this.

We should be working to improve the compiler instead of working around the issue in
glibc.

> As the generic Linux syscall wrappers often use a fixed number of
> arguments, and "overreading" the variable arguments is usually safe
> (fcntl etc. are already doing this), we can pretend these wrappers were
> declared with named arguments and make the compiler do the right thing.



> Add a macro __ASSUME_SYSCALL_NAMED_WORKS to control this, which should
> be defined if we can safely replace "..." with several named arguments.
> Use an internal function prototype with named arguments if it's defined,
> for fcntl64, fcntl_nocancel, ioctl, mremap, open64, open64_nocancel,
> openat64, openat64_nocancel, prctl, ptrace, and the generic syscall()
> wrapper.  I've not changed open* without "64" because I don't have a
> test platform.  shm_ctl is also not changed because it takes an
> aggregate variable argument, which is trickier than integers or
> pointers.


 
> Define this macro for LoongArch, x86-64, and AArch64.  This should
> also be suitable for some other architectures (I think it will be fine on
> RISC-V) but again I don't have a test platform.
> 
> This is the first time I've made such a large change in glibc, so it's
> likely I've done something wrong.  Please correct me :).

There are a few things that I see are wrong, and I'll list them out here at the top level:

(1) Error prone macro usage.

#ifdef foo
/* Some Stuff.  */
#else
/* Other stuff. */
#endif

(a) Always define foo, either 0 or 1.
(b) Always use #if foo / #else
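
As a minimal sketch of points (a) and (b): the macro below is always defined,
to 0 or 1, and is tested with #if.  DEMO_NAMED_ARGS_WORK and
demo_selected_path are hypothetical names used only for this illustration.
One likely benefit, stated here as an assumption rather than as the
reviewer's reasoning, is that GCC's -Wundef can then diagnose a misspelled
or forgotten macro, whereas #ifdef would silently take the #else branch.

```c
#include <assert.h>

/* Hypothetical feature macro.  Every configuration defines it,
   to either 0 or 1; it is never left undefined.  */
#define DEMO_NAMED_ARGS_WORK 0

/* Reject any value other than 0 or 1 at compile time.  */
static_assert (DEMO_NAMED_ARGS_WORK == 0 || DEMO_NAMED_ARGS_WORK == 1,
               "DEMO_NAMED_ARGS_WORK must be 0 or 1");

/* Tested with #if, not #ifdef.  */
#if DEMO_NAMED_ARGS_WORK
static const char demo_path[] = "named";
#else
static const char demo_path[] = "variadic";
#endif

/* Report which branch the preprocessor selected.  */
const char *
demo_selected_path (void)
{
  return demo_path;
}
```

With the definition above, demo_selected_path () returns the string from the
#else branch; changing the macro to 1 selects the other one.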

(2) Public declarations must match internal declarations.

The biggest problem I see here is that the public declarations of the functions
you are changing are all variadic, and so my strong opinion is that we should not
change the internal definitions of these functions to be non-variadic.

I expect there are going to be knock-on effects with developer tooling that
expects the implementation not to over-read, e.g. valgrind flagging reads
of registers that hold undefined values.

---

In summary, I think this is a compiler problem, and that working around this in glibc
is going to result in:

- Odd corner case ABI issues between public declarations of variadic functions and
  internal non-variadic definitions.

- Poorer testing of #else code that uses variadic arguments, as the public interface
  requires.

I don't support going in this direction.

Is there an alternative that could generate better code that doesn't go this way?

> Xi Ruoyao (5):
>   linux: Add __ASSUME_SYSCALL_NAMED_WORKS to allow avoiding va_list for
>     generic syscall
>   linux: [__ASSUME_SYSCALL_NAMED_WORKS] Avoid using va_list for various
>     syscall wrappers
>   LoongArch: Define __ASSUME_SYSCALL_NAMED_WORKS for Linux
>   x86_64: Define __ASSUME_SYSCALL_NAMED_WORKS for Linux
>   aarch64: Define __ASSUME_SYSCALL_NAMED_WORKS for Linux
> 
>  include/fcntl.h                               | 32 ++++++++-----
>  .../unix/sysv/linux/aarch64/kernel-features.h |  9 ++++
>  sysdeps/unix/sysv/linux/fcntl64.c             | 40 +++++++++++++----
>  sysdeps/unix/sysv/linux/fcntl_nocancel.c      | 30 ++++++++++---
>  sysdeps/unix/sysv/linux/ioctl.c               | 38 ++++++++++++----
>  .../sysv/linux/loongarch/kernel-features.h    | 29 ++++++++++++
>  sysdeps/unix/sysv/linux/mremap.c              | 36 +++++++++++++--
>  sysdeps/unix/sysv/linux/not-cancel.h          |  8 ++--
>  sysdeps/unix/sysv/linux/open64.c              | 35 ++++++++++++---
>  sysdeps/unix/sysv/linux/open64_nocancel.c     | 29 ++++++++++--
>  sysdeps/unix/sysv/linux/openat64.c            | 30 +++++++++++--
>  sysdeps/unix/sysv/linux/openat64_nocancel.c   | 29 ++++++++++--
>  sysdeps/unix/sysv/linux/prctl.c               | 23 +++++++++-
>  sysdeps/unix/sysv/linux/ptrace.c              | 45 ++++++++++++++-----
>  sysdeps/unix/sysv/linux/syscall.c             | 35 +++++++++++----
>  .../unix/sysv/linux/x86_64/kernel-features.h  |  9 ++++
>  16 files changed, 382 insertions(+), 75 deletions(-)
>  create mode 100644 sysdeps/unix/sysv/linux/loongarch/kernel-features.h
>
  
Xi Ruoyao March 27, 2023, 2:44 p.m. UTC | #2
On Mon, 2023-03-27 at 10:04 -0400, Carlos O'Donell wrote:

> In summary, I think this is a compiler problem

Definitely true.

> and that working around this in glibc
> is going to result in:
> 
> - Odd corner case ABI issues between public declarations of variadic functions and
>   internal non-variadic definitions.
> 
> - Poorer testing of #else code that uses variadic arguments, as the public interface
>   requires.
> 
> I don't support going in this direction.

Valid reasons.  I'll abandon this series then.

But I wish these had been raised earlier (in the discussion about
LoongArch syscall.S) so I wouldn't have written all the code :).

> Is there an alternative that could generate better code that doesn't go this way?

For LoongArch I can improve GCC to save only the GARs containing the
arguments actually read by va_arg (i.e. one GAR for things like open() or
fcntl() instead of all 8 GARs), but I guess the patch will be delayed
until GCC 14.

In general, I have no idea how to make GCC avoid saving GARs
unnecessarily with va_arg.
  
Xi Ruoyao March 27, 2023, 2:45 p.m. UTC | #3
On Mon, 2023-03-27 at 10:04 -0400, Carlos O'Donell wrote:

> In summary, I think this is a compiler problem

Definitely true.

> and that working around this in glibc
> is going to result in:
> 
> - Odd corner case ABI issues between public declarations of variadic
> functions and
>   internal non-variadic definitions.
> 
> - Poorer testing of #else code that uses variadic arguments, as the
> public interface
>   requires.
> 
> I don't support going in this direction.

Valid reasons.  I'll abandon this series then.

But I wish these had been raised earlier (in the discussion about
LoongArch syscall.S) so I wouldn't have written all the code :).

> Is there an alternative that could generate better code that doesn't
> go this way?

For LoongArch I can improve GCC to save only the GARs containing the
arguments actually read by va_arg (i.e. one GAR for things like open() or
fcntl() instead of all 8 GARs), but I guess the patch will be delayed
until GCC 14.

In general, I have no idea how to make GCC avoid saving GARs
unnecessarily with va_arg.
  
Xi Ruoyao March 27, 2023, 2:51 p.m. UTC | #4
Sorry, the mail was duplicated because of a network issue.

On Mon, 2023-03-27 at 22:45 +0800, Xi Ruoyao wrote:
> On Mon, 2023-03-27 at 10:04 -0400, Carlos O'Donell wrote:
> 
> > In summary, I think this is a compiler problem
> 
> Definitely true.

/* snip */
  
caiyinyu April 4, 2023, 1:25 a.m. UTC | #5
On 2023/3/27 at 10:45 PM, Xi Ruoyao wrote:
> On Mon, 2023-03-27 at 10:04 -0400, Carlos O'Donell wrote:
>
>> In summary, I think this is a compiler problem
> Definitely true.
>
>> and that working around this in glibc
>> is going to result in:
>>
>> - Odd corner case ABI issues between public declarations of variadic
>> functions and
>>    internal non-variadic definitions.
>>
>> - Poorer testing of #else code that uses variadic arguments, as the
>> public interface
>>    requires.
>>
>> I don't support going in this direction.
> Valid reasons.  Abandon this series then.
>
> But I hope these could be raised earlier (in the discussion about
> LoongArch syscall.S) so I wouldn't write all the code :).
>
>> Is there an alternative that could generate better code that doesn't
>> go this way?
> For LoongArch I can improve GCC to save only the GARs containing the
> arguments really used in va_arg (i.e. one GAR for things like open() or
> fcntl() instead of all 8 GARs), but I guess the patch will be delayed
> into GCC 14.
>
> Generally I've not got an idea about how to make GCC avoid saving GARs
> unnecessarily with va_arg.

So I believe that the assembly implementation of syscalls is still
necessary, especially for users who are using GCC <= 13.

patch:

https://sourceware.org/pipermail/libc-alpha/2023-March/146588.html

>
  
Xi Ruoyao April 4, 2023, 12:12 p.m. UTC | #6
On Tue, 2023-04-04 at 09:25 +0800, caiyinyu wrote:
> 
> On 2023/3/27 at 10:45 PM, Xi Ruoyao wrote:
> > On Mon, 2023-03-27 at 10:04 -0400, Carlos O'Donell wrote:
> > 
> > > In summary, I think this is a compiler problem
> > Definitely true.
> > 
> > > and that working around this in glibc
> > > is going to result in:
> > > 
> > > - Odd corner case ABI issues between public declarations of variadic
> > > functions and
> > >    internal non-variadic definitions.
> > > 
> > > - Poorer testing of #else code that uses variadic arguments, as the
> > > public interface
> > >    requires.
> > > 
> > > I don't support going in this direction.
> > Valid reasons.  Abandon this series then.
> > 
> > But I hope these could be raised earlier (in the discussion about
> > LoongArch syscall.S) so I wouldn't write all the code :).
> > 
> > > Is there an alternative that could generate better code that doesn't
> > > go this way?
> > For LoongArch I can improve GCC to save only the GARs containing the
> > arguments really used in va_arg (i.e. one GAR for things like open() or
> > fcntl() instead of all 8 GARs), but I guess the patch will be delayed
> > into GCC 14.
> > 
> > Generally I've not got an idea about how to make GCC avoid saving GARs
> > unnecessarily with va_arg.
> 
> So I believe that the assembly implementation of syscalls is still 
> necessary, especially for users who are using GCC<=13.

Maybe we can use a custom C implementation (like RISC-V) as well.  But
strictly speaking the RISC-V syscall.c invokes undefined behavior
(as my proposal does), so I agree with using assembly here.

> patch:
> 
> https://sourceware.org/pipermail/libc-alpha/2023-March/146588.html
>