Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break

Message ID 73c0124c-4794-6e40-460c-b26df407f322@rivosinc.com
State Not applicable
Headers
Series Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break |

Checks

Context Check Description
dj/TryBot-apply_patch fail Patch failed to apply to master at the time it was sent
dj/TryBot-32bit fail Patch series failed to apply

Commit Message

Vineet Gupta Dec. 20, 2022, 8:05 p.m. UTC
  Hi folks,

Apologies for the extraneous CC (and the top post), but I would really 
appreciate some feedback on this to close on the V-ext plumbing support 
in kernel/glibc. This is one of the two contentious issues (other being 
prctl enable) preventing us from getting to an RVV enabled SW ecosystem.

The premise is : for preserving V-ext registers across signal handling, 
the natural way is to add V reg storage to kernel struct sigcontext 
where scalar / fp regs are currently saved. But this doesn’t seem to be 
the right way to go:

1. Breaks the userspace ABI (even if user programs were recompiled) 
because RV glibc port for historical reasons has defined its own version 
of struct sigcontext (vs. relying on kernel exported UAPI header).

2. Even if we were to expand sigcontext (in both kernel and glibc, which 
is always hard to time) there's still a (different) ABI breakage for 
existing binaries despite earlier proposed __extension__ union trick [2] 
since it still breaks old binaries w.r.t. size of the sigcontext struct.

3. glibc {set,get,*}context() routines use struct mcontext_t which is 
analogous to kernel struct sigcontext (in respective ucontext structs 
[1]). Thus ideally mcontext_t needs to be expanded too but need not be, 
given its semantics to save callee-saved regs only : per current psABI 
RVVV regs are caller-saved/call-clobbered [3]. Apparently this 
connection of sigcontext to mcontext_t is also historical as some arches 
used/still-use sigreturn to restore regs in setcontext [4]

Does anyone disagree that 1-3 are not valid reasons.

So the proposal here is to *not* add V-ext state to kernel sigcontext 
but instead dynamically to struct rt_sigframe, similar to aarch64 
kernel. This avoids touching glibc sigcontext as well.

struct rt_sigframe {
   struct siginfo info;
   struct ucontext uc;
+__u8 sc_extn[] __attribute__((__aligned__(16))); // C99 flexible length 
array to handle implementation defined VLEN wide regs
}

The only downside to this is that SA_SIGINFO signal handlers don’t have 
direct access to V state (but it seems aarch64 kernel doesn’t either).

Does anyone really disagree with this proposal.

Attached is a proof-of-concept kernel patch which implements this 
proposal with no need for any corresponding glibc change.

Thx,
-Vineet


[1] ucontex in kernel and glibc respectively.

kernel: arch/riscv/include/uapi/asm/ucontext.h

struct ucontext {
  unsigned long uc_flags;
  struct ucontext *uc_link;
  stack_t uc_stack;
  sigset_t uc_sigmask;
  __u8 __unused[1024 / 8 - sizeof(sigset_t)];
  struct sigcontext uc_mcontext;
}

glibc: sysdeps/unix/sysv/linux/riscv/sys/ucontext.h

typedef struct ucontext_t
   {
     unsigned long int  __uc_flags;
     struct ucontext_t *uc_link;
     stack_t            uc_stack;
     sigset_t           uc_sigmask;
     /* padding to allow future sigset_t expansion */
     char   __glibc_reserved[1024 / 8 - sizeof (sigset_t)];
      mcontext_t uc_mcontext;
} ucontext_t;

[2] https://sourceware.org/pipermail/libc-alpha/2022-January/135610.html
[3] 
https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc
[4] https://sourceware.org/legacy-ml/libc-alpha/2014-04/msg00006.html




On 12/8/22 19:39, Vineet Gupta wrote:
> Hi Florian,
>
> P.S. Since I'm revisiting a year old thread with some new CC 
> recipients, here's the link to original patch/thread [1]
>
> On 9/17/21 20:04, Vincent Chen wrote:
>> On Thu, Sep 16, 2021 at 4:14 PM Florian Weimer <fweimer@redhat.com> 
>> wrote:
>>>>>> This changes the size of struct ucontext_t, which is an ABI break
>>>>>> (getcontext callers are supposed to provide their own object).
>>>>>>
>>>> The riscv vector registers are all caller-saved registers except for
>>>> VCSR. Therefore, the struct mcontext_t needs to reserve a space for
>>>> it. In addition, RISCV ISA is growing, so I also hope the struct
>>>> mcontext_t has a space for future expansion. Based on the above ideas,
>>>> I reserved a 5K space here.
>>> You have reserved space in ucontext_t that you could use for this.
>>>
>> Sorry, I cannot really understand what you mean. The following is the
>> contents of ucontext_t
>> typedef struct ucontext_t
>>    {
>>      unsigned long int  __uc_flags;
>>      struct ucontext_t *uc_link;
>>      stack_t            uc_stack;
>>      sigset_t           uc_sigmask;
>>      /* There's some padding here to allow sigset_t to be expanded in 
>> the
>>         future.  Though this is unlikely, other architectures put 
>> uc_sigmask
>>         at the end of this structure and explicitly state it can be
>>         expanded, so we didn't want to box ourselves in here. */
>>      char               __glibc_reserved[1024 / 8 - sizeof (sigset_t)];
>>      /* We can't put uc_sigmask at the end of this structure because 
>> we need
>>         to be able to expand sigcontext in the future.  For example, the
>>         vector ISA extension will almost certainly add ISA state.  We 
>> want
>>         to ensure all user-visible ISA state can be saved and 
>> restored via a
>>         ucontext, so we're putting this at the end in order to allow for
>>         infinite extensibility.  Since we know this will be extended 
>> and we
>>         assume sigset_t won't be extended an extreme amount, we're
>>         prioritizing this.  */
>>      mcontext_t uc_mcontext;
>>    } ucontext_t;
>>
>> Currently, we only reserve a space, __glibc_reserved[], for the future
>> expansion of sigset_t.
>> Do you mean I could use __glibc_reserved[] to for future expansion of
>> ISA as well?
>
> Given unlikely sigset expansion, we could in theory use some of those 
> reserved fields to store pointers (offsets) to actual V state, but not 
> for actual V state which is way too large for non-embedded machines 
> with typical 128 or even wider V regs.
>
>
>>
>>>>>> This shouldn't be necessary if the additional vector registers are
>>>>>> caller-saved.
>>>> Here I am a little confused about the usage of struct mcontext_t. As
>>>> far as I know, the struct mcontext_t is used to save the
>>>> machine-specific information in user context operation. Therefore, in
>>>> this case, the struct mcontext_t is allowed to reserve the space only
>>>> for saving caller-saved registers. However, in the signal handler, the
>>>> user seems to be allowed to use uc_mcontext whose data type is struct
>>>> mcontext_t to access the content of the signal context. In this case,
>>>> the struct mcontext_t may need to be the same as the struct sigcontext
>>>> defined at kernel. However, it will have a conflict with your
>>>> suggestion because the struct sigcontext cannot just reserve a space
>>>> for saving caller-saved registers. Could you help me point out my
>>>> misunderstanding? Thank you.
>
> I think the confusion comes from apparent equivalence of kernel struct 
> sigcontext and glibc mcontext_t as they appear in respective struct 
> ucontext definitions.
> I've enumerated the actual RV structs below to keep them handy in one 
> place for discussion.
>
>>> struct sigcontext is allocated by the kernel, so you can have pointers
>>> in reserved fields to out-of-line start, or after struct sigcontext.
>
> In this scheme, would the actual V regfile contents (at the 
> out-of-line location w.r.t kernel sigcontext) be anonymous for glibc 
> i.e. do we not need to expose them to glibc userspace ABI ?
>
>
>>> I don't know how the kernel implements this, but there is considerable
>>> flexibility and extensibility.  The main issues comes from small stacks
>>> which are incompatible with large register files.
>
> Simplistically, Linux kernel needs to preserve the V regfile across 
> task switch. The necessary evil that follows is preserving V across 
> signal-handling (sigaction/sigreturn).
>
> In RV kernel we have following:
>
> struct rt_sigframe {
>   struct siginfo info;
>   struct ucontext uc;
> };
>
> struct ucontext {
>    unsigned long uc_flags;
>    struct ucontext *uc_link;
>    stack_t uc_stack;
>    sigset_t uc_sigmask;
>    __u8 __unused[1024 / 8 - sizeof(sigset_t)];     // this is for 
> sigset_t expansion
>    struct sigcontext uc_mcontext;
> };
>
> struct sigcontext {
>    struct user_regs_struct sc_regs;
>    union __riscv_fp_state sc_fpregs;
> +  __u8 sc_extn[4096+128] __attribute__((__aligned__(16)));   // 
> handle 128B V regs
> };
>
> The sc_extn[] would have V state (regfile + control state) in kernel 
> defined format.
>
> As I understand it, you are suggesting to prevent ABI break, we should 
> not add anything to kernel struct sigcontext i.e. do something like this
>
> struct rt_sigframe {
>   struct siginfo info;
>   struct ucontext uc;
> +__u8 sc_extn[4096+128] __attribute__((__aligned__(16)));
> }
>
> So kernel sig handling can continue to save/restore the V regfile on 
> user stack, w/o making it part of actual struct sigcontext.
> So they are not explicitly visible to userspace at all - is that 
> feasible ? I know that SA_SIGINFO handlers can access the scalar/fp 
> regs, they won't do it V.
> Is there a POSIX req for SA_SIGINFO handlers being able to access all 
> machine regs saved by signal handling.
>
> An alternate approach is what Vincent did originally, to add sc_exn to 
> struct sigcontext. Here to prevent ABI breakage, we can choose to not 
> reflect this in the glibc sigcontext. But the question remains, is 
> that OK ?
>
> The other topic is changing glibc mcontext_t to add V-regs. It would 
> seem one has to as mcontext is "visually equivalent" to struct 
> sigcontext in the respective ucontext structs. But in unserspace 
> *context routine semantics only require callee-regs to be saved, which 
> V regs are not per psABI [2]. So looks like this can be avoided which 
> is what Vincent did in v2 series [3]
>
>
> [1] 
> https://sourceware.org/pipermail/libc-alpha/2021-September/130899.html
> [2] 
> https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc
> [3] https://sourceware.org/pipermail/libc-alpha/2022-January/135416.html
  

Comments

Vincent Chen Dec. 21, 2022, 3:53 p.m. UTC | #1
Hi Vineet,
Thank you for creating this discussion thread to get some consensus
and propose a way to solve this problem. Actually, I don't object to
your proposal. I just don't understand why my solution is
inappropriate. IIUC, the struct sigcontext is used by the kernel to
preserve the context of the register before entering the signal
handler. Because the memory region used to save the register context
is in user space, user space can obtain register context through the
same struct sigcontext to parse the same memory region. Therefore, we
don't want to break ABI to cause this mechanism to fail in the
different kernel and Glibc combinations. Back to my approach, as you
mentioned that my patch changes the size of struct sigcontext.
However, this size difference does not seem to break the above
mechanism. I enumerate the possible case below for discussion.

1. Kernel without RVV support + user program using original Glibc sigcontext.
This is the current Glibc case. It has no problems.

2. Kernel with RVV support + user program using the new sigcontext definition
The mechanism can work smoothly because the sigcontext definition in
the kernel matches the definition in user programs.

3. Kernel without RVV support + user program using the new sigcontext definition
Because the kernel does not store vector registers context to memory,
the __reserved[4224] in GLIBC sigcontext is unneeded. Therefore, the
struct sigcontext in user programs will waste a lot of memory due to
__reserved[4224] if user programs allocate memory for it. But, the
mechanism still can work smoothly.

4. Kernel with RVV support + user program using original Glibc sigcontext
In this case, the kernel needs to save vector registers context to
memory. The user program may encounter memory issues if the user space
does not reserve enough memory size for the kernel to create the
sigcontext. However, we can't seem to avoid this case since there is
no flexible memory area in struct sigcontext for future expansion.

From the above enumeration, my approach in the 3rd case will be a
problem. But, it may be solved by replacing the __reserved[4224] in
struct sigcontext with the " C99 flexible length array". Therefore,
the new patch will become below.

--- a/sysdeps/unix/sysv/linux/riscv/bits/sigcontext.h
+++ b/sysdeps/unix/sysv/linux/riscv/bits/sigcontext.h

@@ -22,10 +22,28 @@
 # error "Never use <bits/sigcontext.h> directly; include <signal.h> instead."
 #endif

+#define sigcontext kernel_sigcontext
+#include <asm/sigcontext.h>
+#undef sigcontext

 struct sigcontext {
 /* gregs[0] holds the program counter. */
- unsigned long int gregs[32];
- unsigned long long int fpregs[66] __attribute__ ((__aligned__ (16)));
+ __extension__ union {
+ unsigned long int gregs[32];
+ /* Kernel uses struct user_regs_struct to save x1-x31 and pc
+ to the signal context, so please use sc_regs to access these
+ these registers from the signal context. */
+ struct user_regs_struct sc_regs;
+ };

+ __extension__ union {
+ unsigned long long int fpregs[66] __attribute__ ((__aligned__ (16)));
+ /* Kernel uses struct __riscv_fp_state to save f0-f31 and fcsr
+ to the signal context, so please use sc_fpregs to access these
+ fpu registers from the signal context. */
+ union __riscv_fp_state sc_fpregs;
+ };
+
+ __u8 sc_extn[] __attribute__((__aligned__(16)));
 };

 #endif


This change can reduce memory waste size to 16 bytes in the worst
case. The best case happens when the sc_extn locates at a 16-byte
aligned address. The size of the struct sigcontext is still the same.

If the above inference is acceptable, I want to mention some
advantages of my patch. This approach allows user programs to directly
access the vector register context. Besides, new user programs can use
kernel-defined struct sigcontext to access the context of the
register. Actually, the memory layout of the FPU register in
kernel-defined struct sigcontext is different from the Glibc-defined
struct sigcontext. It probably causes the user programs to get the
wrong value of FPU registers from the context. Therefore, my approach
can help user programs get the correct FPU registers because the user
program is able to use kernel-defined struct sigcontext to access the
FPU register context. It will help RISC-V users get rid of the
historical burden in Glibc sigcontext.h.


Thanks,
Vincent Chen

On Wed, Dec 21, 2022 at 4:05 AM Vineet Gupta <vineetg@rivosinc.com> wrote:
>
> Hi folks,
>
> Apologies for the extraneous CC (and the top post), but I would really
> appreciate some feedback on this to close on the V-ext plumbing support
> in kernel/glibc. This is one of the two contentious issues (other being
> prctl enable) preventing us from getting to an RVV enabled SW ecosystem.
>
> The premise is : for preserving V-ext registers across signal handling,
> the natural way is to add V reg storage to kernel struct sigcontext
> where scalar / fp regs are currently saved. But this doesn’t seem to be
> the right way to go:
>
> 1. Breaks the userspace ABI (even if user programs were recompiled)
> because RV glibc port for historical reasons has defined its own version
> of struct sigcontext (vs. relying on kernel exported UAPI header).
>
> 2. Even if we were to expand sigcontext (in both kernel and glibc, which
> is always hard to time) there's still a (different) ABI breakage for
> existing binaries despite earlier proposed __extension__ union trick [2]
> since it still breaks old binaries w.r.t. size of the sigcontext struct.
>
> 3. glibc {set,get,*}context() routines use struct mcontext_t which is
> analogous to kernel struct sigcontext (in respective ucontext structs
> [1]). Thus ideally mcontext_t needs to be expanded too but need not be,
> given its semantics to save callee-saved regs only : per current psABI
> RVVV regs are caller-saved/call-clobbered [3]. Apparently this
> connection of sigcontext to mcontext_t is also historical as some arches
> used/still-use sigreturn to restore regs in setcontext [4]
>
> Does anyone disagree that 1-3 are not valid reasons.
>
> So the proposal here is to *not* add V-ext state to kernel sigcontext
> but instead dynamically to struct rt_sigframe, similar to aarch64
> kernel. This avoids touching glibc sigcontext as well.
>
> struct rt_sigframe {
>    struct siginfo info;
>    struct ucontext uc;
> +__u8 sc_extn[] __attribute__((__aligned__(16))); // C99 flexible length
> array to handle implementation defined VLEN wide regs
> }
>
> The only downside to this is that SA_SIGINFO signal handlers don’t have
> direct access to V state (but it seems aarch64 kernel doesn’t either).
>
> Does anyone really disagree with this proposal.
>
> Attached is a proof-of-concept kernel patch which implements this
> proposal with no need for any corresponding glibc change.
>
> Thx,
> -Vineet
>
>
> [1] ucontex in kernel and glibc respectively.
>
> kernel: arch/riscv/include/uapi/asm/ucontext.h
>
> struct ucontext {
>   unsigned long uc_flags;
>   struct ucontext *uc_link;
>   stack_t uc_stack;
>   sigset_t uc_sigmask;
>   __u8 __unused[1024 / 8 - sizeof(sigset_t)];
>   struct sigcontext uc_mcontext;
> }
>
> glibc: sysdeps/unix/sysv/linux/riscv/sys/ucontext.h
>
> typedef struct ucontext_t
>    {
>      unsigned long int  __uc_flags;
>      struct ucontext_t *uc_link;
>      stack_t            uc_stack;
>      sigset_t           uc_sigmask;
>      /* padding to allow future sigset_t expansion */
>      char   __glibc_reserved[1024 / 8 - sizeof (sigset_t)];
>       mcontext_t uc_mcontext;
> } ucontext_t;
>
> [2] https://sourceware.org/pipermail/libc-alpha/2022-January/135610.html
> [3]
> https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc
> [4] https://sourceware.org/legacy-ml/libc-alpha/2014-04/msg00006.html
>
>
>
>
> On 12/8/22 19:39, Vineet Gupta wrote:
> > Hi Florian,
> >
> > P.S. Since I'm revisiting a year old thread with some new CC
> > recipients, here's the link to original patch/thread [1]
> >
> > On 9/17/21 20:04, Vincent Chen wrote:
> >> On Thu, Sep 16, 2021 at 4:14 PM Florian Weimer <fweimer@redhat.com>
> >> wrote:
> >>>>>> This changes the size of struct ucontext_t, which is an ABI break
> >>>>>> (getcontext callers are supposed to provide their own object).
> >>>>>>
> >>>> The riscv vector registers are all caller-saved registers except for
> >>>> VCSR. Therefore, the struct mcontext_t needs to reserve a space for
> >>>> it. In addition, RISCV ISA is growing, so I also hope the struct
> >>>> mcontext_t has a space for future expansion. Based on the above ideas,
> >>>> I reserved a 5K space here.
> >>> You have reserved space in ucontext_t that you could use for this.
> >>>
> >> Sorry, I cannot really understand what you mean. The following is the
> >> contents of ucontext_t
> >> typedef struct ucontext_t
> >>    {
> >>      unsigned long int  __uc_flags;
> >>      struct ucontext_t *uc_link;
> >>      stack_t            uc_stack;
> >>      sigset_t           uc_sigmask;
> >>      /* There's some padding here to allow sigset_t to be expanded in
> >> the
> >>         future.  Though this is unlikely, other architectures put
> >> uc_sigmask
> >>         at the end of this structure and explicitly state it can be
> >>         expanded, so we didn't want to box ourselves in here. */
> >>      char               __glibc_reserved[1024 / 8 - sizeof (sigset_t)];
> >>      /* We can't put uc_sigmask at the end of this structure because
> >> we need
> >>         to be able to expand sigcontext in the future.  For example, the
> >>         vector ISA extension will almost certainly add ISA state.  We
> >> want
> >>         to ensure all user-visible ISA state can be saved and
> >> restored via a
> >>         ucontext, so we're putting this at the end in order to allow for
> >>         infinite extensibility.  Since we know this will be extended
> >> and we
> >>         assume sigset_t won't be extended an extreme amount, we're
> >>         prioritizing this.  */
> >>      mcontext_t uc_mcontext;
> >>    } ucontext_t;
> >>
> >> Currently, we only reserve a space, __glibc_reserved[], for the future
> >> expansion of sigset_t.
> >> Do you mean I could use __glibc_reserved[] to for future expansion of
> >> ISA as well?
> >
> > Given unlikely sigset expansion, we could in theory use some of those
> > reserved fields to store pointers (offsets) to actual V state, but not
> > for actual V state which is way too large for non-embedded machines
> > with typical 128 or even wider V regs.
> >
> >
> >>
> >>>>>> This shouldn't be necessary if the additional vector registers are
> >>>>>> caller-saved.
> >>>> Here I am a little confused about the usage of struct mcontext_t. As
> >>>> far as I know, the struct mcontext_t is used to save the
> >>>> machine-specific information in user context operation. Therefore, in
> >>>> this case, the struct mcontext_t is allowed to reserve the space only
> >>>> for saving caller-saved registers. However, in the signal handler, the
> >>>> user seems to be allowed to use uc_mcontext whose data type is struct
> >>>> mcontext_t to access the content of the signal context. In this case,
> >>>> the struct mcontext_t may need to be the same as the struct sigcontext
> >>>> defined at kernel. However, it will have a conflict with your
> >>>> suggestion because the struct sigcontext cannot just reserve a space
> >>>> for saving caller-saved registers. Could you help me point out my
> >>>> misunderstanding? Thank you.
> >
> > I think the confusion comes from apparent equivalence of kernel struct
> > sigcontext and glibc mcontext_t as they appear in respective struct
> > ucontext definitions.
> > I've enumerated the actual RV structs below to keep them handy in one
> > place for discussion.
> >
> >>> struct sigcontext is allocated by the kernel, so you can have pointers
> >>> in reserved fields to out-of-line start, or after struct sigcontext.
> >
> > In this scheme, would the actual V regfile contents (at the
> > out-of-line location w.r.t kernel sigcontext) be anonymous for glibc
> > i.e. do we not need to expose them to glibc userspace ABI ?
> >
> >
> >>> I don't know how the kernel implements this, but there is considerable
> >>> flexibility and extensibility.  The main issues comes from small stacks
> >>> which are incompatible with large register files.
> >
> > Simplistically, Linux kernel needs to preserve the V regfile across
> > task switch. The necessary evil that follows is preserving V across
> > signal-handling (sigaction/sigreturn).
> >
> > In RV kernel we have following:
> >
> > struct rt_sigframe {
> >   struct siginfo info;
> >   struct ucontext uc;
> > };
> >
> > struct ucontext {
> >    unsigned long uc_flags;
> >    struct ucontext *uc_link;
> >    stack_t uc_stack;
> >    sigset_t uc_sigmask;
> >    __u8 __unused[1024 / 8 - sizeof(sigset_t)];     // this is for
> > sigset_t expansion
> >    struct sigcontext uc_mcontext;
> > };
> >
> > struct sigcontext {
> >    struct user_regs_struct sc_regs;
> >    union __riscv_fp_state sc_fpregs;
> > +  __u8 sc_extn[4096+128] __attribute__((__aligned__(16)));   //
> > handle 128B V regs
> > };
> >
> > The sc_extn[] would have V state (regfile + control state) in kernel
> > defined format.
> >
> > As I understand it, you are suggesting to prevent ABI break, we should
> > not add anything to kernel struct sigcontext i.e. do something like this
> >
> > struct rt_sigframe {
> >   struct siginfo info;
> >   struct ucontext uc;
> > +__u8 sc_extn[4096+128] __attribute__((__aligned__(16)));
> > }
> >
> > So kernel sig handling can continue to save/restore the V regfile on
> > user stack, w/o making it part of actual struct sigcontext.
> > So they are not explicitly visible to userspace at all - is that
> > feasible ? I know that SA_SIGINFO handlers can access the scalar/fp
> > regs, they won't do it V.
> > Is there a POSIX req for SA_SIGINFO handlers being able to access all
> > machine regs saved by signal handling.
> >
> > An alternate approach is what Vincent did originally, to add sc_exn to
> > struct sigcontext. Here to prevent ABI breakage, we can choose to not
> > reflect this in the glibc sigcontext. But the question remains, is
> > that OK ?
> >
> > The other topic is changing glibc mcontext_t to add V-regs. It would
> > seem one has to as mcontext is "visually equivalent" to struct
> > sigcontext in the respective ucontext structs. But in unserspace
> > *context routine semantics only require callee-regs to be saved, which
> > V regs are not per psABI [2]. So looks like this can be avoided which
> > is what Vincent did in v2 series [3]
> >
> >
> > [1]
> > https://sourceware.org/pipermail/libc-alpha/2021-September/130899.html
> > [2]
> > https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc
> > [3] https://sourceware.org/pipermail/libc-alpha/2022-January/135416.html
  
Vineet Gupta Dec. 21, 2022, 7:45 p.m. UTC | #2
Hi Vincent,

On 12/21/22 07:53, Vincent Chen wrote:
> Hi Vineet,
> Thank you for creating this discussion thread to get some consensus
> and propose a way to solve this problem. Actually, I don't object to
> your proposal. I just don't understand why my solution is
> inappropriate.

It is not inappropriate, in fact it is more natural to do it your way :-)
And if everything was rebuilt there was no issue. As some reviewers also 
pointed out the issue was with existing binaries with smaller sigcontext 
breaking with expanded sigcontext in kernel and/or glibc itself.


> IIUC, the struct sigcontext is used by the kernel to
> preserve the context of the register before entering the signal
> handler. Because the memory region used to save the register context
> is in user space, user space can obtain register context through the
> same struct sigcontext to parse the same memory region. Therefore, we
> don't want to break ABI to cause this mechanism to fail in the
> different kernel and Glibc combinations. Back to my approach, as you
> mentioned that my patch changes the size of struct sigcontext.
> However, this size difference does not seem to break the above
> mechanism. I enumerate the possible case below for discussion.
>
> 1. Kernel without RVV support + user program using original Glibc sigcontext.
> This is the current Glibc case. It has no problems.
>
> 2. Kernel with RVV support + user program using the new sigcontext definition
> The mechanism can work smoothly because the sigcontext definition in
> the kernel matches the definition in user programs.

Right but what about existing binaries. Imagine if they had

struct foo{
     struct sigcontext s;
     int bar;
}

Now with sigcontext expanded, bar is not at the expected location in memory.

> 3. Kernel without RVV support + user program using the new sigcontext definition
> Because the kernel does not store vector registers context to memory,
> the __reserved[4224] in GLIBC sigcontext is unneeded. Therefore, the
> struct sigcontext in user programs will waste a lot of memory due to
> __reserved[4224] if user programs allocate memory for it. But, the
> mechanism still can work smoothly.
>
> 4. Kernel with RVV support + user program using original Glibc sigcontext
> In this case, the kernel needs to save vector registers context to
> memory. The user program may encounter memory issues if the user space
> does not reserve enough memory size for the kernel to create the
> sigcontext. However, we can't seem to avoid this case since there is
> no flexible memory area in struct sigcontext for future expansion.
>
>  From the above enumeration, my approach in the 3rd case will be a
> problem. But, it may be solved by replacing the __reserved[4224] in
> struct sigcontext with the " C99 flexible length array". Therefore,
> the new patch will become below.
>
> --- a/sysdeps/unix/sysv/linux/riscv/bits/sigcontext.h
> +++ b/sysdeps/unix/sysv/linux/riscv/bits/sigcontext.h
>
> @@ -22,10 +22,28 @@
>   # error "Never use <bits/sigcontext.h> directly; include <signal.h> instead."
>   #endif
>
> +#define sigcontext kernel_sigcontext
> +#include <asm/sigcontext.h>
> +#undef sigcontext
>
>   struct sigcontext {
>   /* gregs[0] holds the program counter. */
> - unsigned long int gregs[32];
> - unsigned long long int fpregs[66] __attribute__ ((__aligned__ (16)));
> + __extension__ union {
> + unsigned long int gregs[32];
> + /* Kernel uses struct user_regs_struct to save x1-x31 and pc
> + to the signal context, so please use sc_regs to access these
> + these registers from the signal context. */
> + struct user_regs_struct sc_regs;
> + };
>
> + __extension__ union {
> + unsigned long long int fpregs[66] __attribute__ ((__aligned__ (16)));
> + /* Kernel uses struct __riscv_fp_state to save f0-f31 and fcsr
> + to the signal context, so please use sc_fpregs to access these
> + fpu registers from the signal context. */
> + union __riscv_fp_state sc_fpregs;
> + };
> +
> + __u8 sc_extn[] __attribute__((__aligned__(16)));
>   };
>
>   #endif
>
>
> This change can reduce memory waste size to 16 bytes in the worst
> case. The best case happens when the sc_extn locates at a 16-byte
> aligned address. The size of the struct sigcontext is still the same.

Its a neat trick. But the additional stack alignment means we could 
still potentially changing the size of sigcontext - even if by 16 bytes 
- again for existing binaries.

I agree that struct sigcontext is not something people commonly use in 
their code. And also not sure if the concern of breaking existing 
binaries with struct sigcontext is a real problem or a theoretical 
exercise. Hence I wanted some of the maintainers to weigh-in. I don't 
have issues with your approach, just that in the prior 2 reviews it 
seemed it was a no go.


> If the above inference is acceptable, I want to mention some
> advantages of my patch. This approach allows user programs to directly
> access the vector register context.

Correct, that is very true.

> Besides, new user programs can use
> kernel-defined struct sigcontext to access the context of the
> register. Actually, the memory layout of the FPU register in
> kernel-defined struct sigcontext is different from the Glibc-defined
> struct sigcontext. It probably causes the user programs to get the
> wrong value of FPU registers from the context. Therefore, my approach
> can help user programs get the correct FPU registers because the user
> program is able to use kernel-defined struct sigcontext to access the
> FPU register context. It will help RISC-V users get rid of the
> historical burden in Glibc sigcontext.h.

Indeed.

Thx,
-Vineet



>
>
> Thanks,
> Vincent Chen
>
> On Wed, Dec 21, 2022 at 4:05 AM Vineet Gupta <vineetg@rivosinc.com> wrote:
>> Hi folks,
>>
>> Apologies for the extraneous CC (and the top post), but I would really
>> appreciate some feedback on this to close on the V-ext plumbing support
>> in kernel/glibc. This is one of the two contentious issues (other being
>> prctl enable) preventing us from getting to an RVV enabled SW ecosystem.
>>
>> The premise is : for preserving V-ext registers across signal handling,
>> the natural way is to add V reg storage to kernel struct sigcontext
>> where scalar / fp regs are currently saved. But this doesn’t seem to be
>> the right way to go:
>>
>> 1. Breaks the userspace ABI (even if user programs were recompiled)
>> because RV glibc port for historical reasons has defined its own version
>> of struct sigcontext (vs. relying on kernel exported UAPI header).
>>
>> 2. Even if we were to expand sigcontext (in both kernel and glibc, which
>> is always hard to time) there's still a (different) ABI breakage for
>> existing binaries despite earlier proposed __extension__ union trick [2]
>> since it still breaks old binaries w.r.t. size of the sigcontext struct.
>>
>> 3. glibc {set,get,*}context() routines use struct mcontext_t which is
>> analogous to kernel struct sigcontext (in respective ucontext structs
>> [1]). Thus ideally mcontext_t needs to be expanded too but need not be,
>> given its semantics to save callee-saved regs only : per current psABI
>> RVVV regs are caller-saved/call-clobbered [3]. Apparently this
>> connection of sigcontext to mcontext_t is also historical as some arches
>> used/still-use sigreturn to restore regs in setcontext [4]
>>
>> Does anyone disagree that 1-3 are not valid reasons.
>>
>> So the proposal here is to *not* add V-ext state to kernel sigcontext
>> but instead dynamically to struct rt_sigframe, similar to aarch64
>> kernel. This avoids touching glibc sigcontext as well.
>>
>> struct rt_sigframe {
>>     struct siginfo info;
>>     struct ucontext uc;
>> +__u8 sc_extn[] __attribute__((__aligned__(16))); // C99 flexible length
>> array to handle implementation defined VLEN wide regs
>> }
>>
>> The only downside to this is that SA_SIGINFO signal handlers don’t have
>> direct access to V state (but it seems aarch64 kernel doesn’t either).
>>
>> Does anyone really disagree with this proposal.
>>
>> Attached is a proof-of-concept kernel patch which implements this
>> proposal with no need for any corresponding glibc change.
>>
>> Thx,
>> -Vineet
>>
>>
>> [1] ucontex in kernel and glibc respectively.
>>
>> kernel: arch/riscv/include/uapi/asm/ucontext.h
>>
>> struct ucontext {
>>    unsigned long uc_flags;
>>    struct ucontext *uc_link;
>>    stack_t uc_stack;
>>    sigset_t uc_sigmask;
>>    __u8 __unused[1024 / 8 - sizeof(sigset_t)];
>>    struct sigcontext uc_mcontext;
>> }
>>
>> glibc: sysdeps/unix/sysv/linux/riscv/sys/ucontext.h
>>
>> typedef struct ucontext_t
>>     {
>>       unsigned long int  __uc_flags;
>>       struct ucontext_t *uc_link;
>>       stack_t            uc_stack;
>>       sigset_t           uc_sigmask;
>>       /* padding to allow future sigset_t expansion */
>>       char   __glibc_reserved[1024 / 8 - sizeof (sigset_t)];
>>        mcontext_t uc_mcontext;
>> } ucontext_t;
>>
>> [2] https://sourceware.org/pipermail/libc-alpha/2022-January/135610.html
>> [3]
>> https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc
>> [4] https://sourceware.org/legacy-ml/libc-alpha/2014-04/msg00006.html
>>
>>
>>
>>
>> On 12/8/22 19:39, Vineet Gupta wrote:
>>> Hi Florian,
>>>
>>> P.S. Since I'm revisiting a year old thread with some new CC
>>> recipients, here's the link to original patch/thread [1]
>>>
>>> On 9/17/21 20:04, Vincent Chen wrote:
>>>> On Thu, Sep 16, 2021 at 4:14 PM Florian Weimer <fweimer@redhat.com>
>>>> wrote:
>>>>>>>> This changes the size of struct ucontext_t, which is an ABI break
>>>>>>>> (getcontext callers are supposed to provide their own object).
>>>>>>>>
>>>>>> The riscv vector registers are all caller-saved registers except for
>>>>>> VCSR. Therefore, the struct mcontext_t needs to reserve a space for
>>>>>> it. In addition, RISCV ISA is growing, so I also hope the struct
>>>>>> mcontext_t has a space for future expansion. Based on the above ideas,
>>>>>> I reserved a 5K space here.
>>>>> You have reserved space in ucontext_t that you could use for this.
>>>>>
>>>> Sorry, I cannot really understand what you mean. The following is the
>>>> contents of ucontext_t
>>>> typedef struct ucontext_t
>>>>     {
>>>>       unsigned long int  __uc_flags;
>>>>       struct ucontext_t *uc_link;
>>>>       stack_t            uc_stack;
>>>>       sigset_t           uc_sigmask;
>>>>       /* There's some padding here to allow sigset_t to be expanded in
>>>> the
>>>>          future.  Though this is unlikely, other architectures put
>>>> uc_sigmask
>>>>          at the end of this structure and explicitly state it can be
>>>>          expanded, so we didn't want to box ourselves in here. */
>>>>       char               __glibc_reserved[1024 / 8 - sizeof (sigset_t)];
>>>>       /* We can't put uc_sigmask at the end of this structure because
>>>> we need
>>>>          to be able to expand sigcontext in the future.  For example, the
>>>>          vector ISA extension will almost certainly add ISA state.  We
>>>> want
>>>>          to ensure all user-visible ISA state can be saved and
>>>> restored via a
>>>>          ucontext, so we're putting this at the end in order to allow for
>>>>          infinite extensibility.  Since we know this will be extended
>>>> and we
>>>>          assume sigset_t won't be extended an extreme amount, we're
>>>>          prioritizing this.  */
>>>>       mcontext_t uc_mcontext;
>>>>     } ucontext_t;
>>>>
>>>> Currently, we only reserve a space, __glibc_reserved[], for the future
>>>> expansion of sigset_t.
>>>> Do you mean I could use __glibc_reserved[] to for future expansion of
>>>> ISA as well?
>>> Given unlikely sigset expansion, we could in theory use some of those
>>> reserved fields to store pointers (offsets) to actual V state, but not
>>> for actual V state which is way too large for non-embedded machines
>>> with typical 128 or even wider V regs.
>>>
>>>
>>>>>>>> This shouldn't be necessary if the additional vector registers are
>>>>>>>> caller-saved.
>>>>>> Here I am a little confused about the usage of struct mcontext_t. As
>>>>>> far as I know, the struct mcontext_t is used to save the
>>>>>> machine-specific information in user context operation. Therefore, in
>>>>>> this case, the struct mcontext_t is allowed to reserve the space only
>>>>>> for saving caller-saved registers. However, in the signal handler, the
>>>>>> user seems to be allowed to use uc_mcontext whose data type is struct
>>>>>> mcontext_t to access the content of the signal context. In this case,
>>>>>> the struct mcontext_t may need to be the same as the struct sigcontext
>>>>>> defined at kernel. However, it will have a conflict with your
>>>>>> suggestion because the struct sigcontext cannot just reserve a space
>>>>>> for saving caller-saved registers. Could you help me point out my
>>>>>> misunderstanding? Thank you.
>>> I think the confusion comes from apparent equivalence of kernel struct
>>> sigcontext and glibc mcontext_t as they appear in respective struct
>>> ucontext definitions.
>>> I've enumerated the actual RV structs below to keep them handy in one
>>> place for discussion.
>>>
>>>>> struct sigcontext is allocated by the kernel, so you can have pointers
>>>>> in reserved fields to out-of-line start, or after struct sigcontext.
>>> In this scheme, would the actual V regfile contents (at the
>>> out-of-line location w.r.t kernel sigcontext) be anonymous for glibc
>>> i.e. do we not need to expose them to glibc userspace ABI ?
>>>
>>>
>>>>> I don't know how the kernel implements this, but there is considerable
>>>>> flexibility and extensibility.  The main issues comes from small stacks
>>>>> which are incompatible with large register files.
>>> Simplistically, Linux kernel needs to preserve the V regfile across
>>> task switch. The necessary evil that follows is preserving V across
>>> signal-handling (sigaction/sigreturn).
>>>
>>> In RV kernel we have following:
>>>
>>> struct rt_sigframe {
>>>    struct siginfo info;
>>>    struct ucontext uc;
>>> };
>>>
>>> struct ucontext {
>>>     unsigned long uc_flags;
>>>     struct ucontext *uc_link;
>>>     stack_t uc_stack;
>>>     sigset_t uc_sigmask;
>>>     __u8 __unused[1024 / 8 - sizeof(sigset_t)];     // this is for
>>> sigset_t expansion
>>>     struct sigcontext uc_mcontext;
>>> };
>>>
>>> struct sigcontext {
>>>     struct user_regs_struct sc_regs;
>>>     union __riscv_fp_state sc_fpregs;
>>> +  __u8 sc_extn[4096+128] __attribute__((__aligned__(16)));   //
>>> handle 128B V regs
>>> };
>>>
>>> The sc_extn[] would have V state (regfile + control state) in kernel
>>> defined format.
>>>
>>> As I understand it, you are suggesting to prevent ABI break, we should
>>> not add anything to kernel struct sigcontext i.e. do something like this
>>>
>>> struct rt_sigframe {
>>>    struct siginfo info;
>>>    struct ucontext uc;
>>> +__u8 sc_extn[4096+128] __attribute__((__aligned__(16)));
>>> }
>>>
>>> So kernel sig handling can continue to save/restore the V regfile on
>>> user stack, w/o making it part of actual struct sigcontext.
>>> So they are not explicitly visible to userspace at all - is that
>>> feasible ? I know that SA_SIGINFO handlers can access the scalar/fp
>>> regs, they won't do it V.
>>> Is there a POSIX req for SA_SIGINFO handlers being able to access all
>>> machine regs saved by signal handling.
>>>
>>> An alternate approach is what Vincent did originally, to add sc_exn to
>>> struct sigcontext. Here to prevent ABI breakage, we can choose to not
>>> reflect this in the glibc sigcontext. But the question remains, is
>>> that OK ?
>>>
>>> The other topic is changing glibc mcontext_t to add V-regs. It would
>>> seem one has to as mcontext is "visually equivalent" to struct
>>> sigcontext in the respective ucontext structs. But in unserspace
>>> *context routine semantics only require callee-regs to be saved, which
>>> V regs are not per psABI [2]. So looks like this can be avoided which
>>> is what Vincent did in v2 series [3]
>>>
>>>
>>> [1]
>>> https://sourceware.org/pipermail/libc-alpha/2021-September/130899.html
>>> [2]
>>> https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc
>>> [3] https://sourceware.org/pipermail/libc-alpha/2022-January/135416.html
  
Vineet Gupta Dec. 21, 2022, 7:52 p.m. UTC | #3
On 12/21/22 11:45, Vineet Gupta wrote:
>
> 4. Kernel with RVV support + user program using original Glibc sigcontext
> In this case, the kernel needs to save vector registers context to
> memory. The user program may encounter memory issues if the user space
> does not reserve enough memory size for the kernel to create the
> sigcontext. However, we can't seem to avoid this case since there is
> no flexible memory area in struct sigcontext for future expansion.

This is not an issue, if we don't change sigcontext (in kernel and 
glibc) - it is essentially the case of existing binaries.
kernel still saves regs on user stack, in rt_sigframe, its just that  
userspace is not able to access them in SA_SIGINFO signal handers.
aarch64 have this implemented this approach and it is likely they can't 
do that either for SVE regs.
  
Vincent Chen Dec. 22, 2022, 1:50 a.m. UTC | #4
On Thu, Dec 22, 2022 at 3:45 AM Vineet Gupta <vineetg@rivosinc.com> wrote:
>
> Hi Vincent,
>
> On 12/21/22 07:53, Vincent Chen wrote:
> > Hi Vineet,
> > Thank you for creating this discussion thread to get some consensus
> > and propose a way to solve this problem. Actually, I don't object to
> > your proposal. I just don't understand why my solution is
> > inappropriate.
>
> It is not inappropriate, in fact it is more natural to do it your way :-)
> And if everything was rebuilt there was no issue. As some reviewers also
> pointed out the issue was with existing binaries with smaller sigcontext
> breaking with expanded sigcontext in kernel and/or glibc itself.

Thank you for your detailed explanations :-) I still have some
questions and hope you can help me clarify them.
>
>
> > IIUC, the struct sigcontext is used by the kernel to
> > preserve the context of the register before entering the signal
> > handler. Because the memory region used to save the register context
> > is in user space, user space can obtain register context through the
> > same struct sigcontext to parse the same memory region. Therefore, we
> > don't want to break ABI to cause this mechanism to fail in the
> > different kernel and Glibc combinations. Back to my approach, as you
> > mentioned that my patch changes the size of struct sigcontext.
> > However, this size difference does not seem to break the above
> > mechanism. I enumerate the possible case below for discussion.
> >
> > 1. Kernel without RVV support + user program using original Glibc sigcontext.
> > This is the current Glibc case. It has no problems.
> >
> > 2. Kernel with RVV support + user program using the new sigcontext definition
> > The mechanism can work smoothly because the sigcontext definition in
> > the kernel matches the definition in user programs.
>
> Right but what about existing binaries. Imagine if they had
>
> struct foo{
>      struct sigcontext s;
>      int bar;
> }
>
> Now with sigcontext expanded, bar is not at the expected location in memory.

I really miss considering this case. I guess the following example is
one of the cases you want to mention.
1. a.out
#include <bits/sigcontext.h>
...
struct foo{
   struct sigcontext s;
   int bar;
} sc;
int main (void) {
lala(&sc);              // it defined in lala.so
}

2. lala.so
#include <bits/sigcontext.h>
struct foo{
   struct sigcontext s;
   int bar;
} sc;
void lala(struct foo *ptr) {
}
If the lala.so and a.out are compiled with different sizes of the
struct sigcontext, it will have an issue apparently. But, as you
mentioned, I am also curious if this example is a real problem or just
a theoretical exercise.

>
> > 3. Kernel without RVV support + user program using the new sigcontext definition
> > Because the kernel does not store vector registers context to memory,
> > the __reserved[4224] in GLIBC sigcontext is unneeded. Therefore, the
> > struct sigcontext in user programs will waste a lot of memory due to
> > __reserved[4224] if user programs allocate memory for it. But, the
> > mechanism still can work smoothly.
> >
> > 4. Kernel with RVV support + user program using original Glibc sigcontext
> > In this case, the kernel needs to save vector registers context to
> > memory. The user program may encounter memory issues if the user space
> > does not reserve enough memory size for the kernel to create the
> > sigcontext. However, we can't seem to avoid this case since there is
> > no flexible memory area in struct sigcontext for future expansion.
> >
> >  From the above enumeration, my approach in the 3rd case will be a
> > problem. But, it may be solved by replacing the __reserved[4224] in
> > struct sigcontext with the " C99 flexible length array". Therefore,
> > the new patch will become below.
> >
> > --- a/sysdeps/unix/sysv/linux/riscv/bits/sigcontext.h
> > +++ b/sysdeps/unix/sysv/linux/riscv/bits/sigcontext.h
> >
> > @@ -22,10 +22,28 @@
> >   # error "Never use <bits/sigcontext.h> directly; include <signal.h> instead."
> >   #endif
> >
> > +#define sigcontext kernel_sigcontext
> > +#include <asm/sigcontext.h>
> > +#undef sigcontext
> >
> >   struct sigcontext {
> >   /* gregs[0] holds the program counter. */
> > - unsigned long int gregs[32];
> > - unsigned long long int fpregs[66] __attribute__ ((__aligned__ (16)));
> > + __extension__ union {
> > + unsigned long int gregs[32];
> > + /* Kernel uses struct user_regs_struct to save x1-x31 and pc
> > + to the signal context, so please use sc_regs to access these
> > + these registers from the signal context. */
> > + struct user_regs_struct sc_regs;
> > + };
> >
> > + __extension__ union {
> > + unsigned long long int fpregs[66] __attribute__ ((__aligned__ (16)));
> > + /* Kernel uses struct __riscv_fp_state to save f0-f31 and fcsr
> > + to the signal context, so please use sc_fpregs to access these
> > + fpu registers from the signal context. */
> > + union __riscv_fp_state sc_fpregs;
> > + };
> > +
> > + __u8 sc_extn[] __attribute__((__aligned__(16)));
> >   };
> >
> >   #endif
> >
> >
> > This change can reduce memory waste size to 16 bytes in the worst
> > case. The best case happens when the sc_extn locates at a 16-byte
> > aligned address. The size of the struct sigcontext is still the same.
>
> Its a neat trick. But the additional stack alignment means we could
> still potentially changing the size of sigcontext - even if by 16 bytes
> - again for existing binaries.
>
> I agree that struct sigcontext is not something people commonly use in
> their code. And also not sure if the concern of breaking existing
> binaries with struct sigcontext is a real problem or a theoretical
> exercise. Hence I wanted some of the maintainers to weigh-in. I don't
> have issues with your approach, just that in the prior 2 reviews it
> seemed it was a no go.

I agree with you that we need more maintainers to weigh-in to find an
appropriate solution. In my opinion, if the prior example is not
extensively used, maybe it is a good time to get rid of the historical
burden.

Thanks,
Vincent

>
>
> > If the above inference is acceptable, I want to mention some
> > advantages of my patch. This approach allows user programs to directly
> > access the vector register context.
>
> Correct, that is very true.
>
> > Besides, new user programs can use
> > kernel-defined struct sigcontext to access the context of the
> > register. Actually, the memory layout of the FPU register in
> > kernel-defined struct sigcontext is different from the Glibc-defined
> > struct sigcontext. It probably causes the user programs to get the
> > wrong value of FPU registers from the context. Therefore, my approach
> > can help user programs get the correct FPU registers because the user
> > program is able to use kernel-defined struct sigcontext to access the
> > FPU register context. It will help RISC-V users get rid of the
> > historical burden in Glibc sigcontext.h.
>
> Indeed.
>
> Thx,
> -Vineet
>
>
>
> >
> >
> > Thanks,
> > Vincent Chen
> >
> > On Wed, Dec 21, 2022 at 4:05 AM Vineet Gupta <vineetg@rivosinc.com> wrote:
> >> Hi folks,
> >>
> >> Apologies for the extraneous CC (and the top post), but I would really
> >> appreciate some feedback on this to close on the V-ext plumbing support
> >> in kernel/glibc. This is one of the two contentious issues (other being
> >> prctl enable) preventing us from getting to an RVV enabled SW ecosystem.
> >>
> >> The premise is : for preserving V-ext registers across signal handling,
> >> the natural way is to add V reg storage to kernel struct sigcontext
> >> where scalar / fp regs are currently saved. But this doesn’t seem to be
> >> the right way to go:
> >>
> >> 1. Breaks the userspace ABI (even if user programs were recompiled)
> >> because RV glibc port for historical reasons has defined its own version
> >> of struct sigcontext (vs. relying on kernel exported UAPI header).
> >>
> >> 2. Even if we were to expand sigcontext (in both kernel and glibc, which
> >> is always hard to time) there's still a (different) ABI breakage for
> >> existing binaries despite earlier proposed __extension__ union trick [2]
> >> since it still breaks old binaries w.r.t. size of the sigcontext struct.
> >>
> >> 3. glibc {set,get,*}context() routines use struct mcontext_t which is
> >> analogous to kernel struct sigcontext (in respective ucontext structs
> >> [1]). Thus ideally mcontext_t needs to be expanded too but need not be,
> >> given its semantics to save callee-saved regs only : per current psABI
> >> RVVV regs are caller-saved/call-clobbered [3]. Apparently this
> >> connection of sigcontext to mcontext_t is also historical as some arches
> >> used/still-use sigreturn to restore regs in setcontext [4]
> >>
> >> Does anyone disagree that 1-3 are not valid reasons.
> >>
> >> So the proposal here is to *not* add V-ext state to kernel sigcontext
> >> but instead dynamically to struct rt_sigframe, similar to aarch64
> >> kernel. This avoids touching glibc sigcontext as well.
> >>
> >> struct rt_sigframe {
> >>     struct siginfo info;
> >>     struct ucontext uc;
> >> +__u8 sc_extn[] __attribute__((__aligned__(16))); // C99 flexible length
> >> array to handle implementation defined VLEN wide regs
> >> }
> >>
> >> The only downside to this is that SA_SIGINFO signal handlers don’t have
> >> direct access to V state (but it seems aarch64 kernel doesn’t either).
> >>
> >> Does anyone really disagree with this proposal.
> >>
> >> Attached is a proof-of-concept kernel patch which implements this
> >> proposal with no need for any corresponding glibc change.
> >>
> >> Thx,
> >> -Vineet
> >>
> >>
> >> [1] ucontex in kernel and glibc respectively.
> >>
> >> kernel: arch/riscv/include/uapi/asm/ucontext.h
> >>
> >> struct ucontext {
> >>    unsigned long uc_flags;
> >>    struct ucontext *uc_link;
> >>    stack_t uc_stack;
> >>    sigset_t uc_sigmask;
> >>    __u8 __unused[1024 / 8 - sizeof(sigset_t)];
> >>    struct sigcontext uc_mcontext;
> >> }
> >>
> >> glibc: sysdeps/unix/sysv/linux/riscv/sys/ucontext.h
> >>
> >> typedef struct ucontext_t
> >>     {
> >>       unsigned long int  __uc_flags;
> >>       struct ucontext_t *uc_link;
> >>       stack_t            uc_stack;
> >>       sigset_t           uc_sigmask;
> >>       /* padding to allow future sigset_t expansion */
> >>       char   __glibc_reserved[1024 / 8 - sizeof (sigset_t)];
> >>        mcontext_t uc_mcontext;
> >> } ucontext_t;
> >>
> >> [2] https://sourceware.org/pipermail/libc-alpha/2022-January/135610.html
> >> [3]
> >> https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc
> >> [4] https://sourceware.org/legacy-ml/libc-alpha/2014-04/msg00006.html
> >>
> >>
> >>
> >>
> >> On 12/8/22 19:39, Vineet Gupta wrote:
> >>> Hi Florian,
> >>>
> >>> P.S. Since I'm revisiting a year old thread with some new CC
> >>> recipients, here's the link to original patch/thread [1]
> >>>
> >>> On 9/17/21 20:04, Vincent Chen wrote:
> >>>> On Thu, Sep 16, 2021 at 4:14 PM Florian Weimer <fweimer@redhat.com>
> >>>> wrote:
> >>>>>>>> This changes the size of struct ucontext_t, which is an ABI break
> >>>>>>>> (getcontext callers are supposed to provide their own object).
> >>>>>>>>
> >>>>>> The riscv vector registers are all caller-saved registers except for
> >>>>>> VCSR. Therefore, the struct mcontext_t needs to reserve a space for
> >>>>>> it. In addition, RISCV ISA is growing, so I also hope the struct
> >>>>>> mcontext_t has a space for future expansion. Based on the above ideas,
> >>>>>> I reserved a 5K space here.
> >>>>> You have reserved space in ucontext_t that you could use for this.
> >>>>>
> >>>> Sorry, I cannot really understand what you mean. The following is the
> >>>> contents of ucontext_t
> >>>> typedef struct ucontext_t
> >>>>     {
> >>>>       unsigned long int  __uc_flags;
> >>>>       struct ucontext_t *uc_link;
> >>>>       stack_t            uc_stack;
> >>>>       sigset_t           uc_sigmask;
> >>>>       /* There's some padding here to allow sigset_t to be expanded in
> >>>> the
> >>>>          future.  Though this is unlikely, other architectures put
> >>>> uc_sigmask
> >>>>          at the end of this structure and explicitly state it can be
> >>>>          expanded, so we didn't want to box ourselves in here. */
> >>>>       char               __glibc_reserved[1024 / 8 - sizeof (sigset_t)];
> >>>>       /* We can't put uc_sigmask at the end of this structure because
> >>>> we need
> >>>>          to be able to expand sigcontext in the future.  For example, the
> >>>>          vector ISA extension will almost certainly add ISA state.  We
> >>>> want
> >>>>          to ensure all user-visible ISA state can be saved and
> >>>> restored via a
> >>>>          ucontext, so we're putting this at the end in order to allow for
> >>>>          infinite extensibility.  Since we know this will be extended
> >>>> and we
> >>>>          assume sigset_t won't be extended an extreme amount, we're
> >>>>          prioritizing this.  */
> >>>>       mcontext_t uc_mcontext;
> >>>>     } ucontext_t;
> >>>>
> >>>> Currently, we only reserve a space, __glibc_reserved[], for the future
> >>>> expansion of sigset_t.
> >>>> Do you mean I could use __glibc_reserved[] to for future expansion of
> >>>> ISA as well?
> >>> Given unlikely sigset expansion, we could in theory use some of those
> >>> reserved fields to store pointers (offsets) to actual V state, but not
> >>> for actual V state which is way too large for non-embedded machines
> >>> with typical 128 or even wider V regs.
> >>>
> >>>
> >>>>>>>> This shouldn't be necessary if the additional vector registers are
> >>>>>>>> caller-saved.
> >>>>>> Here I am a little confused about the usage of struct mcontext_t. As
> >>>>>> far as I know, the struct mcontext_t is used to save the
> >>>>>> machine-specific information in user context operation. Therefore, in
> >>>>>> this case, the struct mcontext_t is allowed to reserve the space only
> >>>>>> for saving caller-saved registers. However, in the signal handler, the
> >>>>>> user seems to be allowed to use uc_mcontext whose data type is struct
> >>>>>> mcontext_t to access the content of the signal context. In this case,
> >>>>>> the struct mcontext_t may need to be the same as the struct sigcontext
> >>>>>> defined at kernel. However, it will have a conflict with your
> >>>>>> suggestion because the struct sigcontext cannot just reserve a space
> >>>>>> for saving caller-saved registers. Could you help me point out my
> >>>>>> misunderstanding? Thank you.
> >>> I think the confusion comes from apparent equivalence of kernel struct
> >>> sigcontext and glibc mcontext_t as they appear in respective struct
> >>> ucontext definitions.
> >>> I've enumerated the actual RV structs below to keep them handy in one
> >>> place for discussion.
> >>>
> >>>>> struct sigcontext is allocated by the kernel, so you can have pointers
> >>>>> in reserved fields to out-of-line start, or after struct sigcontext.
> >>> In this scheme, would the actual V regfile contents (at the
> >>> out-of-line location w.r.t kernel sigcontext) be anonymous for glibc
> >>> i.e. do we not need to expose them to glibc userspace ABI ?
> >>>
> >>>
> >>>>> I don't know how the kernel implements this, but there is considerable
> >>>>> flexibility and extensibility.  The main issues comes from small stacks
> >>>>> which are incompatible with large register files.
> >>> Simplistically, Linux kernel needs to preserve the V regfile across
> >>> task switch. The necessary evil that follows is preserving V across
> >>> signal-handling (sigaction/sigreturn).
> >>>
> >>> In RV kernel we have following:
> >>>
> >>> struct rt_sigframe {
> >>>    struct siginfo info;
> >>>    struct ucontext uc;
> >>> };
> >>>
> >>> struct ucontext {
> >>>     unsigned long uc_flags;
> >>>     struct ucontext *uc_link;
> >>>     stack_t uc_stack;
> >>>     sigset_t uc_sigmask;
> >>>     __u8 __unused[1024 / 8 - sizeof(sigset_t)];     // this is for
> >>> sigset_t expansion
> >>>     struct sigcontext uc_mcontext;
> >>> };
> >>>
> >>> struct sigcontext {
> >>>     struct user_regs_struct sc_regs;
> >>>     union __riscv_fp_state sc_fpregs;
> >>> +  __u8 sc_extn[4096+128] __attribute__((__aligned__(16)));   //
> >>> handle 128B V regs
> >>> };
> >>>
> >>> The sc_extn[] would have V state (regfile + control state) in kernel
> >>> defined format.
> >>>
> >>> As I understand it, you are suggesting to prevent ABI break, we should
> >>> not add anything to kernel struct sigcontext i.e. do something like this
> >>>
> >>> struct rt_sigframe {
> >>>    struct siginfo info;
> >>>    struct ucontext uc;
> >>> +__u8 sc_extn[4096+128] __attribute__((__aligned__(16)));
> >>> }
> >>>
> >>> So kernel sig handling can continue to save/restore the V regfile on
> >>> user stack, w/o making it part of actual struct sigcontext.
> >>> So they are not explicitly visible to userspace at all - is that
> >>> feasible ? I know that SA_SIGINFO handlers can access the scalar/fp
> >>> regs, they won't do it V.
> >>> Is there a POSIX req for SA_SIGINFO handlers being able to access all
> >>> machine regs saved by signal handling.
> >>>
> >>> An alternate approach is what Vincent did originally, to add sc_exn to
> >>> struct sigcontext. Here to prevent ABI breakage, we can choose to not
> >>> reflect this in the glibc sigcontext. But the question remains, is
> >>> that OK ?
> >>>
> >>> The other topic is changing glibc mcontext_t to add V-regs. It would
> >>> seem one has to as mcontext is "visually equivalent" to struct
> >>> sigcontext in the respective ucontext structs. But in unserspace
> >>> *context routine semantics only require callee-regs to be saved, which
> >>> V regs are not per psABI [2]. So looks like this can be avoided which
> >>> is what Vincent did in v2 series [3]
> >>>
> >>>
> >>> [1]
> >>> https://sourceware.org/pipermail/libc-alpha/2021-September/130899.html
> >>> [2]
> >>> https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc
> >>> [3] https://sourceware.org/pipermail/libc-alpha/2022-January/135416.html
>
  
Vincent Chen Dec. 22, 2022, 3:37 a.m. UTC | #5
On Thu, Dec 22, 2022 at 3:52 AM Vineet Gupta <vineetg@rivosinc.com> wrote:
>
>
>
> On 12/21/22 11:45, Vineet Gupta wrote:
> >
> > 4. Kernel with RVV support + user program using original Glibc sigcontext
> > In this case, the kernel needs to save vector registers context to
> > memory. The user program may encounter memory issues if the user space
> > does not reserve enough memory size for the kernel to create the
> > sigcontext. However, we can't seem to avoid this case since there is
> > no flexible memory area in struct sigcontext for future expansion.
>
> This is not an issue, if we don't change sigcontext (in kernel and
> glibc) - it is essentially the case of existing binaries.
> kernel still saves regs on user stack, in rt_sigframe, its just that
> userspace is not able to access them in SA_SIGINFO signal handers.
> aarch64 have this implemented this approach and it is likely they can't
> do that either for SVE regs.

Sorry, I don't clearly describe the case. As you mentioned, the kernel
will save the vector registers to the user stack or user-specified
memory region by struct rt_sigframe in your patch. But, if there is an
existing binary compiled with the original sigcontext.h, it will
assume that the kernel only occupies the sizeof(struct sigcontext) to
save these registers. It will not aware the RVV extension is supported
and not expect that the kernel with RVV support needs an extra huge
memory region on its stack or specified memory region to save vector
registers context. In this case, the user program will encounter
memory corruption issues if the size of the memory region specified by
the user program is not enough to store these vector registers'
context.
  
Richard Henderson Dec. 22, 2022, 5:32 a.m. UTC | #6
On 12/21/22 11:52, Vineet Gupta wrote:
> This is not an issue, if we don't change sigcontext (in kernel and glibc) - it is 
> essentially the case of existing binaries. kernel still saves regs on user stack, in
> rt_sigframe, its just that userspace is not able to access them in SA_SIGINFO signal
> handers. aarch64 have this implemented this approach and it is likely they can't do
> that either for SVE regs.

aarch64 can certainly access the SVE regs on the signal stack.  It simply requires that 
you parse the chain of extensions within __reserved to find it.
It's quite well designed, really.

What you can't do is "only" declare a sigcontext_t and be able to construct a new context, 
nor copy the entire context via structure assignment.

There is room within the risc-v context for a similar scheme, via

     sigcontext.sc_fpregs.q.reserved[3]

E.g.

     reserved[0] -> magic
     reserved[1] -> displacement to "extension area"
     reserved[2] -> size of "extension area"

Thus the area can be located anywhere within 4GB and expand to 4GB.
Not that I'd hope any signal frame would require 4GB.  :-)


r~
  
Richard Henderson Dec. 22, 2022, 5:34 a.m. UTC | #7
On 12/21/22 11:45, Vineet Gupta wrote:
>> + __extension__ union {
>> + unsigned long long int fpregs[66] __attribute__ ((__aligned__ (16)));
>> + /* Kernel uses struct __riscv_fp_state to save f0-f31 and fcsr
>> + to the signal context, so please use sc_fpregs to access these
>> + fpu registers from the signal context. */
>> + union __riscv_fp_state sc_fpregs;
>> + };
>> +
>> + __u8 sc_extn[] __attribute__((__aligned__(16)));
>>   };
>>
>>   #endif
>>
>>
>> This change can reduce memory waste size to 16 bytes in the worst
>> case. The best case happens when the sc_extn locates at a 16-byte
>> aligned address. The size of the struct sigcontext is still the same.
> 
> Its a neat trick. But the additional stack alignment means we could still potentially 
> changing the size of sigcontext - even if by 16 bytes - again for existing binaries.

The riscv sigcontext is already aligned by 16, via __riscv_q_ext_state, fwiw.


r~
  
Andy Chiu Dec. 22, 2022, 6:33 p.m. UTC | #8
On Thu, Dec 22, 2022 at 1:32 PM Richard Henderson
<richard.henderson@linaro.org> wrote:
> E.g.
>
>      reserved[0] -> magic
>      reserved[1] -> displacement to "extension area"
>      reserved[2] -> size of "extension area"
>
> Thus the area can be located anywhere within 4GB and expand to 4GB.
> Not that I'd hope any signal frame would require 4GB.  :-)
>

By encoding the extension magic into fp reserved space, and attaching
actual Vector states underneath, it is possible to make no size
changes to the sigcontext itself. In fact the comment section of
__riscv_q_ext_state specifies those bytes were purposely reserved for
sigcontext expansion. If this is the case then maybe we should just
use those reserved spaces anyway.

struct __riscv_q_ext_state {
        __u64 f[64] __attribute__((aligned(16)));
        __u32 fcsr;
        /*
         * Reserved for expansion of sigcontext structure.  Currently zeroed
         * upon signal, and must be zero upon sigreturn.
         */
        __u32 reserved[3];
};

Here is a way that keeps the size and layout of sigcontext, while it
also manages to let the kernel write Vector state into an user's
signal stack. This approach also lets the user space leverage existing
reserved space to get context from new extensions. We introduce a new
struct, __riscv_extra_ext_header, unioning with __riscv_fp_state in
sigcontext. __riscv_extra_ext_header is the same size as
__riscv_fp_state. The only purpose of the struct is to point to the
magic header of a following extension, e.g. Vector, located at the
reserved space. If there is no more extension to come, then all of
those bytes should be zeros.

 struct sigcontext {
        struct user_regs_struct sc_regs;
-       union __riscv_fp_state sc_fpregs;
+       union {
+               union __riscv_fp_state sc_fpregs;
+               struct __riscv_extra_ext_header sc_extdesc;
+       };
 };

I wrote a PoC patch for this and it has been pushed into the following git tree:
https://github.com/sifive/riscv-linux/tree/dev/andyc/for-next-v13
I tested it on a rv32 QEMU virt machine and the user space can get/set
Vector registers normally. I haven't tested it on rv64 yet but it
should be no difference. The patch is not the final version and maybe
I missed some basic ideas. But if everyone agrees with this approach
then I would like to start formalizing and submit the series.
  
Vineet Gupta Dec. 22, 2022, 7:25 p.m. UTC | #9
On 12/21/22 19:37, Vincent Chen wrote:
> On Thu, Dec 22, 2022 at 3:52 AM Vineet Gupta <vineetg@rivosinc.com> wrote:
>>
>>
>> On 12/21/22 11:45, Vineet Gupta wrote:
>>> 4. Kernel with RVV support + user program using original Glibc sigcontext
>>> In this case, the kernel needs to save vector registers context to
>>> memory. The user program may encounter memory issues if the user space
>>> does not reserve enough memory size for the kernel to create the
>>> sigcontext. However, we can't seem to avoid this case since there is
>>> no flexible memory area in struct sigcontext for future expansion.
>> This is not an issue, if we don't change sigcontext (in kernel and
>> glibc) - it is essentially the case of existing binaries.
>> kernel still saves regs on user stack, in rt_sigframe, its just that
>> userspace is not able to access them in SA_SIGINFO signal handers.
>> aarch64 have this implemented this approach and it is likely they can't
>> do that either for SVE regs.
> Sorry, I don't clearly describe the case. As you mentioned, the kernel
> will save the vector registers to the user stack or user-specified
> memory region by struct rt_sigframe in your patch. But, if there is an
> existing binary compiled with the original sigcontext.h, it will
> assume that the kernel only occupies the sizeof(struct sigcontext) to
> save these registers. It will not aware the RVV extension is supported
> and not expect that the kernel with RVV support needs an extra huge
> memory region on its stack or specified memory region to save vector
> registers context. In this case, the user program will encounter
> memory corruption issues if the size of the memory region specified by
> the user program is not enough to store these vector registers'
> context.

No, it will not. In this scheme struct sigcontext remains same as 
before. Kernel is copying the RVV context not in sigcontext, but beyond 
the canonical sigcontext, in layout specified in the rt_sigframe. Please 
take a look at my patch again. It works.

Again I don't care what scheme we follow, I just want o make forward 
progress.

-Vineet
  
Vineet Gupta Dec. 22, 2022, 8:27 p.m. UTC | #10
On 12/22/22 10:33, Andy Chiu wrote:
> On Thu, Dec 22, 2022 at 1:32 PM Richard Henderson
> <richard.henderson@linaro.org> wrote:
>> E.g.
>>
>>       reserved[0] -> magic
>>       reserved[1] -> displacement to "extension area"
>>       reserved[2] -> size of "extension area"
>>
>> Thus the area can be located anywhere within 4GB and expand to 4GB.
>> Not that I'd hope any signal frame would require 4GB.  :-)
>>
> By encoding the extension magic into fp reserved space, and attaching
> actual Vector states underneath, it is possible to make no size
> changes to the sigcontext itself. In fact the comment section of
> __riscv_q_ext_state specifies those bytes were purposely reserved for
> sigcontext expansion. If this is the case then maybe we should just
> use those reserved spaces anyway.
>
> struct __riscv_q_ext_state {
>          __u64 f[64] __attribute__((aligned(16)));
>          __u32 fcsr;
>          /*
>           * Reserved for expansion of sigcontext structure.  Currently zeroed
>           * upon signal, and must be zero upon sigreturn.
>           */
>          __u32 reserved[3];
> };
>
> Here is a way that keeps the size and layout of sigcontext, while it
> also manages to let the kernel write Vector state into an user's
> signal stack. This approach also lets the user space leverage existing
> reserved space to get context from new extensions. We introduce a new
> struct, __riscv_extra_ext_header, unioning with __riscv_fp_state in
> sigcontext. __riscv_extra_ext_header is the same size as
> __riscv_fp_state. The only purpose of the struct is to point to the
> magic header of a following extension, e.g. Vector, located at the
> reserved space. If there is no more extension to come, then all of
> those bytes should be zeros.
>
>   struct sigcontext {
>          struct user_regs_struct sc_regs;
> -       union __riscv_fp_state sc_fpregs;
> +       union {
> +               union __riscv_fp_state sc_fpregs;
> +               struct __riscv_extra_ext_header sc_extdesc;
> +       };
>   };
>
> I wrote a PoC patch for this and it has been pushed into the following git tree:
> https://github.com/sifive/riscv-linux/tree/dev/andyc/for-next-v13
> I tested it on a rv32 QEMU virt machine and the user space can get/set
> Vector registers normally. I haven't tested it on rv64 yet but it
> should be no difference. The patch is not the final version and maybe
> I missed some basic ideas. But if everyone agrees with this approach
> then I would like to start formalizing and submit the series.

This approach looks perfect. Lets productize it to fold this patch into 
the respective patch(es).
We would then need fixups to not unconditionally enable V on fork/execve 
and hook that up to a prctl.
Let me work on that and provide something on top of your series.

-Vineet
  
Vineet Gupta Dec. 22, 2022, 8:30 p.m. UTC | #11
On 12/21/22 21:32, Richard Henderson wrote:
> On 12/21/22 11:52, Vineet Gupta wrote:
>> This is not an issue, if we don't change sigcontext (in kernel and 
>> glibc) - it is essentially the case of existing binaries. kernel 
>> still saves regs on user stack, in
>> rt_sigframe, its just that userspace is not able to access them in 
>> SA_SIGINFO signal
>> handers. aarch64 have this implemented this approach and it is likely 
>> they can't do
>> that either for SVE regs.
>
> aarch64 can certainly access the SVE regs on the signal stack.  It 
> simply requires that you parse the chain of extensions within 
> __reserved to find it.
> It's quite well designed, really.

Yep I've been staring at it this week and really appreciate the 
extensible design. Indeed one can do thru the existing __reserved in 
sigcontext to access that from userspace.


>
> What you can't do is "only" declare a sigcontext_t and be able to 
> construct a new context, nor copy the entire context via structure 
> assignment.
>
> There is room within the risc-v context for a similar scheme, via
>
>     sigcontext.sc_fpregs.q.reserved[3]
>
> E.g.
>
>     reserved[0] -> magic
>     reserved[1] -> displacement to "extension area"
>     reserved[2] -> size of "extension area"
>
> Thus the area can be located anywhere within 4GB and expand to 4GB.
> Not that I'd hope any signal frame would require 4GB.  :-)

Looks like we almost missed this. Thx for the pointer Richard.

-Vineet
  
Andrew Waterman Dec. 22, 2022, 9:38 p.m. UTC | #12
On Thu, Dec 22, 2022 at 12:30 PM Vineet Gupta <vineetg@rivosinc.com> wrote:
>
>
>
> On 12/21/22 21:32, Richard Henderson wrote:
> > On 12/21/22 11:52, Vineet Gupta wrote:
> >> This is not an issue, if we don't change sigcontext (in kernel and
> >> glibc) - it is essentially the case of existing binaries. kernel
> >> still saves regs on user stack, in
> >> rt_sigframe, its just that userspace is not able to access them in
> >> SA_SIGINFO signal
> >> handers. aarch64 have this implemented this approach and it is likely
> >> they can't do
> >> that either for SVE regs.
> >
> > aarch64 can certainly access the SVE regs on the signal stack.  It
> > simply requires that you parse the chain of extensions within
> > __reserved to find it.
> > It's quite well designed, really.
>
> Yep I've been staring at it this week and really appreciate the
> extensible design. Indeed one can do thru the existing __reserved in
> sigcontext to access that from userspace.

Sorry y'all had to reverse-engineer our logic: this was exactly our
intent for those reserved words when we defined the current ABI.  It's
also why the current ABI requires them to be zero: as a sentinel to
signify the end of the list of extension areas.

>
>
> >
> > What you can't do is "only" declare a sigcontext_t and be able to
> > construct a new context, nor copy the entire context via structure
> > assignment.
> >
> > There is room within the risc-v context for a similar scheme, via
> >
> >     sigcontext.sc_fpregs.q.reserved[3]
> >
> > E.g.
> >
> >     reserved[0] -> magic
> >     reserved[1] -> displacement to "extension area"
> >     reserved[2] -> size of "extension area"
> >
> > Thus the area can be located anywhere within 4GB and expand to 4GB.
> > Not that I'd hope any signal frame would require 4GB.  :-)
>
> Looks like we almost missed this. Thx for the pointer Richard.
>
> -Vineet
>
  
Richard Henderson Dec. 22, 2022, 10:33 p.m. UTC | #13
On 12/22/22 10:33, Andy Chiu wrote:
> I wrote a PoC patch for this and it has been pushed into the following git tree:
> https://github.com/sifive/riscv-linux/tree/dev/andyc/for-next-v13

I had a look at your include/uapi/, and it looks good.
Mere nits:

> struct __riscv_q_ext_state {
> 	__u64 f[64] __attribute__((aligned(16)));
> 	__u32 fcsr;
> 	/*
> 	 * Reserved for expansion of sigcontext structure.  Currently zeroed
> 	 * upon signal, and must be zero upon sigreturn.
> 	 */
> 	__u32 reserved[3];
> };
> 
> struct __riscv_ctx_hdr {
> 	__u32 magic;
> 	__u32 size;
> 	__u32 reserved;
> };

Thinking about the _next_ extension on the chain, perhaps drop the 3rd word from here, so 
that (&hdr + 1) is 8-byte aligned (which may be enough depending on what the extension 
contains)?

> struct __riscv_extra_ext_header {
> 	__u64 ignored[64] __attribute__((aligned(16)));
> 	__u32 padding;
> 	/*
> 	 * Reserved for expansion of sigcontext structure.  Currently zeroed
> 	 * upon signal, and must be zero upon sigreturn.
> 	 */
> 	struct __riscv_ctx_hdr hdr;
> };

     __u32 __padding[129]
or
     __u64 __padding[65]

depending on your answer to the above?

It might reduce confusion to move (or replicate, for redundancy) the aligned(16) from the 
innermost __riscv_q_ext_state.f[] to the outermost sc_fpregs and/or sigcontext.
  
Conor Dooley Dec. 22, 2022, 11:47 p.m. UTC | #14
On Fri, Dec 23, 2022 at 02:33:26AM +0800, Andy Chiu wrote:
 
> I wrote a PoC patch for this and it has been pushed into the following git tree:
> https://github.com/sifive/riscv-linux/tree/dev/andyc/for-next-v13
> I tested it on a rv32 QEMU virt machine and the user space can get/set
> Vector registers normally. I haven't tested it on rv64 yet but it
> should be no difference. The patch is not the final version and maybe
> I missed some basic ideas.

> But if everyone agrees with this approach
> then I would like to start formalizing and submit the series.

Between yourself and the Rivos folk, you should probably sort out who is
doing what with the series at the very least, so that you're not both
working on "competing" v13s...
  
Vineet Gupta Dec. 22, 2022, 11:58 p.m. UTC | #15
On 12/22/22 15:47, Conor Dooley wrote:
> On Fri, Dec 23, 2022 at 02:33:26AM +0800, Andy Chiu wrote:
>   
>> I wrote a PoC patch for this and it has been pushed into the following git tree:
>> https://github.com/sifive/riscv-linux/tree/dev/andyc/for-next-v13
>> I tested it on a rv32 QEMU virt machine and the user space can get/set
>> Vector registers normally. I haven't tested it on rv64 yet but it
>> should be no difference. The patch is not the final version and maybe
>> I missed some basic ideas.
>> But if everyone agrees with this approach
>> then I would like to start formalizing and submit the series.
> Between yourself and the Rivos folk, you should probably sort out who is
> doing what with the series at the very least, so that you're not both
> working on "competing" v13s...

No we are not competing ;-)
I'm mostly facilitating since this got stuck in a stalemate and original 
contributors had gone radio silent for a while.

-Vineet
  
Vincent Chen Dec. 23, 2022, 2:27 a.m. UTC | #16
On Fri, Dec 23, 2022 at 3:25 AM Vineet Gupta <vineetg@rivosinc.com> wrote:
>
>
> On 12/21/22 19:37, Vincent Chen wrote:
> > On Thu, Dec 22, 2022 at 3:52 AM Vineet Gupta <vineetg@rivosinc.com> wrote:
> >>
> >>
> >> On 12/21/22 11:45, Vineet Gupta wrote:
> >>> 4. Kernel with RVV support + user program using original Glibc sigcontext
> >>> In this case, the kernel needs to save vector registers context to
> >>> memory. The user program may encounter memory issues if the user space
> >>> does not reserve enough memory size for the kernel to create the
> >>> sigcontext. However, we can't seem to avoid this case since there is
> >>> no flexible memory area in struct sigcontext for future expansion.
> >> This is not an issue, if we don't change sigcontext (in kernel and
> >> glibc) - it is essentially the case of existing binaries.
> >> kernel still saves regs on user stack, in rt_sigframe, its just that
> >> userspace is not able to access them in SA_SIGINFO signal handers.
> >> aarch64 have this implemented this approach and it is likely they can't
> >> do that either for SVE regs.
> > Sorry, I don't clearly describe the case. As you mentioned, the kernel
> > will save the vector registers to the user stack or user-specified
> > memory region by struct rt_sigframe in your patch. But, if there is an
> > existing binary compiled with the original sigcontext.h, it will
> > assume that the kernel only occupies the sizeof(struct sigcontext) to
> > save these registers. It will not aware the RVV extension is supported
> > and not expect that the kernel with RVV support needs an extra huge
> > memory region on its stack or specified memory region to save vector
> > registers context. In this case, the user program will encounter
> > memory corruption issues if the size of the memory region specified by
> > the user program is not enough to store these vector registers'
> > context.
>
> No, it will not. In this scheme struct sigcontext remains same as
> before. Kernel is copying the RVV context not in sigcontext, but beyond
> the canonical sigcontext, in layout specified in the rt_sigframe. Please
> take a look at my patch again. It works.

If I understand correctly, in your patch, the kernel uses rt_sigframe
to back up all register contexts in the user space, including RVV
registers. Therefore, the user program needs to reserve enough memory
space for the kernel, which enough size of this memory space is the
sizeof(rt_sigframe) plus rvv_sc_size. However, the rvv_sc_size is
unexpected to existing RISC-V programs. Therefore, some memory of the
existing program may be corrupted by the kernel when the kernel backs
up the RVV registers context.

>
> Again I don't care what scheme we follow, I just want o make forward
> progress.
>

 I understand your thoughts and I sincerely appreciate everything you do.

> -Vineet
>
  
Vineet Gupta Dec. 23, 2022, 7:42 p.m. UTC | #17
On 12/22/22 18:27, Vincent Chen wrote:
> If I understand correctly, in your patch, the kernel uses rt_sigframe
> to back up all register contexts in the user space, including RVV
> registers.

Discussing this all moot point but still...

> Therefore, the user program needs to reserve enough memory
> space for the kernel, which enough size of this memory space is the
> sizeof(rt_sigframe) plus rvv_sc_size.

In my patch, rt_sigframe has the c99 flexible array. So it doesn't add 
any extra space on its own.
The total size increase is same whether we add it to kernel sigcontext 
or rt_sigframe. And since glibc sigcontext is not changed, application 
is unaware of rvv_sc_size in either case.

> However, the rvv_sc_size is
> unexpected to existing RISC-V programs.

Again not sure how it is different in both cases.

> Therefore, some memory of the
> existing program may be corrupted by the kernel when the kernel backs
> up the RVV registers context.

kernel builds signal frame on top of existing user stack.

setup_rt_frame
     get_sigframe
           sp = regs->sp;

So it can't possibly corrupt any existing user stack area. Sure when 
expanding the stack user stack rlimit etc may hit when doing put_user. 
But again that is same for both approaches.

FWIW kernel with my patch can be found below: it survives full glibc 
testsuite run w/o any regression so it definitely works w/o any obvious 
user memory corruption.

git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/linux.git 
#rvv-v13.2-use-rt_sigframe

-Vineet
  
Andy Chiu Dec. 28, 2022, 10:53 a.m. UTC | #18
On Fri, Dec 23, 2022 at 4:28 AM Vineet Gupta <vineetg@rivosinc.com> wrote:
> This approach looks perfect. Lets productize it to fold this patch into
> the respective patch(es).
> We would then need fixups to not unconditionally enable V on fork/execve
> and hook that up to a prctl.
> Let me work on that and provide something on top of your series.

Hi Vineet, I have included the approach into the Vector series
according to suggestions, which makes it formaler than the PoC one.
Additionally, I picked up your prctl patch and added a kconfig to
compile a kernel that won't unconditionally enable V. Please tell me
if this does not seem right to you. I will submit the series if this
seems well to you and let's discuss some more details further in that
thread. Here is the tree, thanks:

https://github.com/sifive/riscv-linux/tree/dev/andyc/for-next-v13.1-newapi-prctl

-Andy
  
Vineet Gupta Jan. 3, 2023, 7:17 p.m. UTC | #19
Hi Andy,

On 12/28/22 02:53, Andy Chiu wrote:
> On Fri, Dec 23, 2022 at 4:28 AM Vineet Gupta <vineetg@rivosinc.com> wrote:
>> This approach looks perfect. Lets productize it to fold this patch into
>> the respective patch(es).
>> We would then need fixups to not unconditionally enable V on fork/execve
>> and hook that up to a prctl.
>> Let me work on that and provide something on top of your series.
> Hi Vineet, I have included the approach into the Vector series
> according to suggestions, which makes it formaler than the PoC one.
> Additionally, I picked up your prctl patch and added a kconfig to
> compile a kernel that won't unconditionally enable V. Please tell me
> if this does not seem right to you.

The prctl support in there is really rudimentary and incomplete. There's 
more work needed to use the dynamic state of enablement - for say signal 
frame etc. The new Kconfig CONFIG_RISCV_VSTATE_INIT_ALL seems like a 
hack bolted on top.
It would be best to drop it in the current state and rework properly 
based on your patches.

> I will submit the series if this
> seems well to you and let's discuss some more details further in that
> thread. Here is the tree, thanks:
>
> https://github.com/sifive/riscv-linux/tree/dev/andyc/for-next-v13.1-newapi-prctl

I would also suggesting dropping the 2 patches for in-kernel enablement 
for your submission as it might require some more thinking/design and 
builds naturally on top of the baseline patches.

Thx,
-Vineet
  
Andy Chiu Jan. 4, 2023, 4:34 p.m. UTC | #20
Hi Vineet,

On Wed, Jan 4, 2023 at 3:17 AM Vineet Gupta <vineetg@rivosinc.com> wrote:
> The prctl support in there is really rudimentary and incomplete. There's
> more work needed to use the dynamic state of enablement - for say signal
> frame etc.
Yes, I agree that signal and ptrace need special handling if we'd turn
off Vector with prctl. For example, we may not need to save/restore
vector context on context switches and signal handlings. And we may
have to prevent ptrace from setting/getting vector context in such
case. I can implement this into the series if this is what you're
looking for. Or you could share the code somewhere so that I could
merge it.

> The new Kconfig CONFIG_RISCV_VSTATE_INIT_ALL seems like a
> hack bolted on top.
IIUC, most opinions suggested that we should keep the default Vector
state to ON in thread:
https://lore.kernel.org/all/20220921214439.1491510-17-stillson@rivosinc.com/T/#u
So IMHO adding a build option to those who prefer not to
unconditionally enable V should be sufficient.

> I would also suggesting dropping the 2 patches for in-kernel enablement
> for your submission as it might require some more thinking/design and
> builds naturally on top of the baseline patches.
Yes, I agree. Those patches were heavily copied from arm neon, which
will not benefit from hardware feature on riscv-V. I will refine those
patches and submit independently, on top of the baseline patch.

Thanks,
Andy
  
Vineet Gupta Jan. 4, 2023, 8:46 p.m. UTC | #21
On 1/4/23 08:34, Andy Chiu wrote:
> Hi Vineet,
>
> On Wed, Jan 4, 2023 at 3:17 AM Vineet Gupta <vineetg@rivosinc.com> wrote:
>> The prctl support in there is really rudimentary and incomplete. There's
>> more work needed to use the dynamic state of enablement - for say signal
>> frame etc.
> Yes, I agree that signal and ptrace need special handling if we'd turn
> off Vector with prctl. For example, we may not need to save/restore
> vector context on context switches and signal handlings. And we may
> have to prevent ptrace from setting/getting vector context in such
> case. I can implement this into the series if this is what you're
> looking for.

Perfect. This is exactly the coverage I was hoping to see. Go for it.

>> The new Kconfig CONFIG_RISCV_VSTATE_INIT_ALL seems like a
>> hack bolted on top.
> IIUC, most opinions suggested that we should keep the default Vector
> state to ON in thread:
> https://lore.kernel.org/all/20220921214439.1491510-17-stillson@rivosinc.com/T/#u

Actually community feedback is that they *don't * want the default 
vector state to be on due to power implications, increased stack and 
memory usage for vector contents (in that thread and else where as 
well). So we should keep it disabled by default, but indeed we could 
have that Kconfig option to enable it. Granted distro kernels will keep 
it disabled by default, this lets vendors enable it selectively until 
the full userspace enabling bits are in place.

> So IMHO adding a build option to those who prefer not to
> unconditionally enable V should be sufficient.

As above, it should be other way round.

Thx,
-Vineet
  
Philipp Tomsich Jan. 4, 2023, 9:29 p.m. UTC | #22
On Wed, 4 Jan 2023 at 21:46, Vineet Gupta <vineetg@rivosinc.com> wrote:
>
>
>
> On 1/4/23 08:34, Andy Chiu wrote:
> > Hi Vineet,
> >
> > On Wed, Jan 4, 2023 at 3:17 AM Vineet Gupta <vineetg@rivosinc.com> wrote:
> >> The prctl support in there is really rudimentary and incomplete. There's
> >> more work needed to use the dynamic state of enablement - for say signal
> >> frame etc.
> > Yes, I agree that signal and ptrace need special handling if we'd turn
> > off Vector with prctl. For example, we may not need to save/restore
> > vector context on context switches and signal handlings. And we may
> > have to prevent ptrace from setting/getting vector context in such
> > case. I can implement this into the series if this is what you're
> > looking for.
>
> Perfect. This is exactly the coverage I was hoping to see. Go for it.
>
> >> The new Kconfig CONFIG_RISCV_VSTATE_INIT_ALL seems like a
> >> hack bolted on top.
> > IIUC, most opinions suggested that we should keep the default Vector
> > state to ON in thread:
> > https://lore.kernel.org/all/20220921214439.1491510-17-stillson@rivosinc.com/T/#u
>
> Actually community feedback is that they *don't * want the default
> vector state to be on due to power implications, increased stack and
> memory usage for vector contents (in that thread and else where as
> well). So we should keep it disabled by default, but indeed we could
> have that Kconfig option to enable it. Granted distro kernels will keep
> it disabled by default, this lets vendors enable it selectively until
> the full userspace enabling bits are in place.

Should we punt this to the ELF (e.g., using a RISC-V specific
attribute) and take a per-process decision on whether to start in ON
or OFF?
I don't feel fully comfortable with a KCONFIG that could change and
invalidate the assumptions a userspace process could have made…

Alternatively, we could establish the convention of having two stub
libraries that set up either enabled or disable state from their
.init_array to provide a mechanism for folks that want to make an
explicit assumption.  Although this may try to overdesign a solution
for a non-issue.

Philipp.

>
> > So IMHO adding a build option to those who prefer not to
> > unconditionally enable V should be sufficient.
>
> As above, it should be other way round.
>
> Thx,
> -Vineet
  
Andrew Waterman Jan. 4, 2023, 9:37 p.m. UTC | #23
On Wed, Jan 4, 2023 at 1:29 PM Philipp Tomsich <philipp.tomsich@vrull.eu> wrote:
>
> On Wed, 4 Jan 2023 at 21:46, Vineet Gupta <vineetg@rivosinc.com> wrote:
> >
> >
> >
> > On 1/4/23 08:34, Andy Chiu wrote:
> > > Hi Vineet,
> > >
> > > On Wed, Jan 4, 2023 at 3:17 AM Vineet Gupta <vineetg@rivosinc.com> wrote:
> > >> The prctl support in there is really rudimentary and incomplete. There's
> > >> more work needed to use the dynamic state of enablement - for say signal
> > >> frame etc.
> > > Yes, I agree that signal and ptrace need special handling if we'd turn
> > > off Vector with prctl. For example, we may not need to save/restore
> > > vector context on context switches and signal handlings. And we may
> > > have to prevent ptrace from setting/getting vector context in such
> > > case. I can implement this into the series if this is what you're
> > > looking for.
> >
> > Perfect. This is exactly the coverage I was hoping to see. Go for it.
> >
> > >> The new Kconfig CONFIG_RISCV_VSTATE_INIT_ALL seems like a
> > >> hack bolted on top.
> > > IIUC, most opinions suggested that we should keep the default Vector
> > > state to ON in thread:
> > > https://lore.kernel.org/all/20220921214439.1491510-17-stillson@rivosinc.com/T/#u
> >
> > Actually community feedback is that they *don't * want the default
> > vector state to be on due to power implications, increased stack and
> > memory usage for vector contents (in that thread and else where as
> > well). So we should keep it disabled by default, but indeed we could
> > have that Kconfig option to enable it. Granted distro kernels will keep
> > it disabled by default, this lets vendors enable it selectively until
> > the full userspace enabling bits are in place.
>
> Should we punt this to the ELF (e.g., using a RISC-V specific
> attribute) and take a per-process decision on whether to start in ON
> or OFF?
> I don't feel fully comfortable with a KCONFIG that could change and
> invalidate the assumptions a userspace process could have made…

I am supremely confident we will eventually have userspace that
unconditionally wants V (for optimized C library routines at minimum),
and that it will follow very closely on the heels of V becoming
mainstream.  So, your proposal to embed this information in the ELF
header (so that the kernel can enable V automatically on program load,
or so the dynamic loader can execute the `prctl` call on library load,
or whatever) seems more forward-looking to me than making this a
Kconfig option.

>
> Alternatively, we could establish the convention of having two stub
> libraries that set up either enabled or disable state from their
> .init_array to provide a mechanism for folks that want to make an
> explicit assumption.  Although this may try to overdesign a solution
> for a non-issue.
>
> Philipp.
>
> >
> > > So IMHO adding a build option to those who prefer not to
> > > unconditionally enable V should be sufficient.
> >
> > As above, it should be other way round.
> >
> > Thx,
> > -Vineet
  
Vineet Gupta Jan. 4, 2023, 10:43 p.m. UTC | #24
On 1/4/23 13:29, Philipp Tomsich wrote:
>>>> The new Kconfig CONFIG_RISCV_VSTATE_INIT_ALL seems like a
>>>> hack bolted on top.
>>> IIUC, most opinions suggested that we should keep the default Vector
>>> state to ON in thread:
>>> https://lore.kernel.org/all/20220921214439.1491510-17-stillson@rivosinc.com/T/#u
>> Actually community feedback is that they *don't * want the default
>> vector state to be on due to power implications, increased stack and
>> memory usage for vector contents (in that thread and else where as
>> well). So we should keep it disabled by default, but indeed we could
>> have that Kconfig option to enable it. Granted distro kernels will keep
>> it disabled by default, this lets vendors enable it selectively until
>> the full userspace enabling bits are in place.
> Should we punt this to the ELF (e.g., using a RISC-V specific
> attribute) and take a per-process decision on whether to start in ON
> or OFF?
> I don't feel fully comfortable with a KCONFIG that could change and
> invalidate the assumptions a userspace process could have made…

The Kconfig is just a stop gap for vendors to enable V development while 
the full userspace stuff is sorted out.

Indeed RISCV_ATTRIBUTES section has -march info, but we need to do some 
development around it to parse it and use it.
There are still corner cases such as non-V executable dlopen a dso - so 
kernel elf parser doing this might not cover all cases.
Similar logic will need to be added to glibc loader - eventually.

Adding the full plumbing is a chicken-and-egg problem.


> Alternatively, we could establish the convention of having two stub
> libraries that set up either enabled or disable state from their
> .init_array to provide a mechanism for folks that want to make an
> explicit assumption.  Although this may try to overdesign a solution
> for a non-issue.

I was thinking more along the lines of x86 GLIBC_TUNABLES to enable it 
via env/sub-shell on a per-task basis - the tunable hook could in turn 
verify that Vector support does exist - or it could invoke the prctl 
unconditionally (which would fail if V didn't exist etc).
  
Kito Cheng Jan. 9, 2023, 1:33 p.m. UTC | #25
Hi Vineet:

> >>>> The new Kconfig CONFIG_RISCV_VSTATE_INIT_ALL seems like a
> >>>> hack bolted on top.
> >>> IIUC, most opinions suggested that we should keep the default Vector
> >>> state to ON in thread:
> >>> https://lore.kernel.org/all/20220921214439.1491510-17-stillson@rivosinc.com/T/#u
> >> Actually community feedback is that they *don't * want the default
> >> vector state to be on due to power implications, increased stack and
> >> memory usage for vector contents (in that thread and else where as
> >> well). So we should keep it disabled by default, but indeed we could
> >> have that Kconfig option to enable it. Granted distro kernels will keep
> >> it disabled by default, this lets vendors enable it selectively until
> >> the full userspace enabling bits are in place.
> > Should we punt this to the ELF (e.g., using a RISC-V specific
> > attribute) and take a per-process decision on whether to start in ON
> > or OFF?
> > I don't feel fully comfortable with a KCONFIG that could change and
> > invalidate the assumptions a userspace process could have made…
>
> The Kconfig is just a stop gap for vendors to enable V development while
> the full userspace stuff is sorted out.
>
> Indeed RISCV_ATTRIBUTES section has -march info, but we need to do some
> development around it to parse it and use it.

I don't think RISCV_ATTRIBUTES is the right place to check that - even if the
program compiles without V, it still can enable V and then get performance
benefit by ifunc in glibc, or even some 3rd party libraries might also be
optimized with V ext.

And don't forget other shared libraries in the system,
are we going to check all dependent libraries at program load time?
it will require resolving the library dependency at kernel.
Or we intend to enable V only if executable compiles with V?

> There are still corner cases such as non-V executable dlopen a dso - so
> kernel elf parser doing this might not cover all cases.
>
> Similar logic will need to be added to glibc loader - eventually.
>
> Adding the full plumbing is a chicken-and-egg problem.
>
>
> > Alternatively, we could establish the convention of having two stub
> > libraries that set up either enabled or disable state from their
> > .init_array to provide a mechanism for folks that want to make an
> > explicit assumption.  Although this may try to overdesign a solution
> > for a non-issue.
>
> I was thinking more along the lines of x86 GLIBC_TUNABLES to enable it
> via env/sub-shell on a per-task basis - the tunable hook could in turn
> verify that Vector support does exist - or it could invoke the prctl
> unconditionally (which would fail if V didn't exist etc).
  
Vineet Gupta Jan. 9, 2023, 7:16 p.m. UTC | #26
Hi Kito,

On 1/9/23 05:33, Kito Cheng wrote:
> Hi Vineet:
>
>>>>>> The new Kconfig CONFIG_RISCV_VSTATE_INIT_ALL seems like a
>>>>>> hack bolted on top.
>>>>> IIUC, most opinions suggested that we should keep the default Vector
>>>>> state to ON in thread:
>>>>> https://lore.kernel.org/all/20220921214439.1491510-17-stillson@rivosinc.com/T/#u
>>>> Actually community feedback is that they *don't * want the default
>>>> vector state to be on due to power implications, increased stack and
>>>> memory usage for vector contents (in that thread and else where as
>>>> well). So we should keep it disabled by default, but indeed we could
>>>> have that Kconfig option to enable it. Granted distro kernels will keep
>>>> it disabled by default, this lets vendors enable it selectively until
>>>> the full userspace enabling bits are in place.
>>> Should we punt this to the ELF (e.g., using a RISC-V specific
>>> attribute) and take a per-process decision on whether to start in ON
>>> or OFF?
>>> I don't feel fully comfortable with a KCONFIG that could change and
>>> invalidate the assumptions a userspace process could have made…
>> The Kconfig is just a stop gap for vendors to enable V development while
>> the full userspace stuff is sorted out.
>>
>> Indeed RISCV_ATTRIBUTES section has -march info, but we need to do some
>> development around it to parse it and use it.
> I don't think RISCV_ATTRIBUTES is the right place to check that -

What a timing. I just finished testing initial kernel patch to parse the 
elf section and on to tag parsing now ;-)
https://git.kernel.org/pub/scm/linux/kernel/git/vgupta/linux.git/log/?h=topic-elf-attr 
<https://git.kernel.org/pub/scm/linux/kernel/git/vgupta/linux.git/log/?h=topic-elf-attr>

> even if the
> program compiles without V, it still can enable V and then get performance
> benefit by ifunc in glibc, or even some 3rd party libraries might also be
> optimized with V ext.

Right kernel can only handle dynamic executable and/or the the loader 
itself. If V is used distro wide we are covered.
And it can then also pass this info (V enabled as HWCAP*, no need for 
everything)

But you are not suggesting that there is a scenario with executable 
built somehow with V instructions (even .byte encoded) but not have that 
info encoded in RV_ATTR_TAG_arch string. And I'd argue that it is user 
error, they need to make sure that -march had 'v' passed to compiler 
and/or assembler.

> And don't forget other shared libraries in the system,

No I've not forgotten about shared libs (and there's also a case of 
non-V built executable dlopen a V built dso) which can't be handled by 
above.

> are we going to check all dependent libraries at program load time?
> it will require resolving the library dependency at kernel.
> Or we intend to enable V only if executable compiles with V?

So we need a similar parsing in glibc loader which creates a union of "V 
enabled in any lib" and then invokes the prctl to enable, if it is not 
already.

-Vineet
  
Kito Cheng Jan. 10, 2023, 1:21 p.m. UTC | #27
Hi Vineet:


> But you are not suggesting that there is a scenario with executable
> built somehow with V instructions (even .byte encoded) but not have that
> info encoded in RV_ATTR_TAG_arch string. And I'd argue that it is user
> error, they need to make sure that -march had 'v' passed to compiler
> and/or assembler.


The concept of Tag_RISCV_arch attribute is minimal execution
environment requirement of the executable or shared libraries; use
glibc as an example, we can compile glibc with rv64gc only and then it
can contain vector optimized routines like memcpy and memcpy, and
those function are resolved by ifunc, which means only use those
routines when vector extension are available, so the Tag_RISCV_arch
for the glibc is rv64gc, not rv64gcv since V is not minimal execution
environment requirement.

My expectation is most distro will still distribute with rv64gc for a
while and then optimize function with vector extension for some
libraries, and those vector code will guarded with some runtime check
mechanism maybe IFUNC, so Tag_RISCV_arch for those libraries won't
contain V.

It's not clear in psABI spec, but intend to fix in future:
https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/292
  
Vineet Gupta Jan. 10, 2023, 6:07 p.m. UTC | #28
Hi Kito,

On 1/10/23 05:21, Kito Cheng wrote:
> Hi Vineet:
>
>
>> But you are not suggesting that there is a scenario with executable
>> built somehow with V instructions (even .byte encoded) but not have that
>> info encoded in RV_ATTR_TAG_arch string. And I'd argue that it is user
>> error, they need to make sure that -march had 'v' passed to compiler
>> and/or assembler.
>
> The concept of Tag_RISCV_arch attribute is minimal execution
> environment requirement of the executable or shared libraries; use
> glibc as an example, we can compile glibc with rv64gc only and then it
> can contain vector optimized routines like memcpy and memcpy, and
> those function are resolved by ifunc, which means only use those
> routines when vector extension are available, so the Tag_RISCV_arch
> for the glibc is rv64gc, not rv64gcv since V is not minimal execution
> environment requirement.

I understand where you are coming from. This "minimal" info can be used 
in a "compile-once-used-multiple" kind of a paradigm where a glibc with 
V enabled ifunc can still run on non-V hardware.


> My expectation is most distro will still distribute with rv64gc for a
> while and then optimize function with vector extension for some
> libraries, and those vector code will guarded with some runtime check
> mechanism maybe IFUNC, so Tag_RISCV_arch for those libraries won't
> contain V.

Yes bulk of glibc might not have vector code, but those V ifunc routines 
do and IMO this information needs to be recorded somewhere in the elf. 
Case in point being the current issue with how to enable V unit. 
Community wants a per-process enable, using an explicit prctl from 
userspace (since RV doesn't have fault-on-first use hardware mechanism 
unlike some of the other arches). But how does the glibc loader know to 
invoke prctl. We can't just rely on user env GLIBC_TUNABLE etc since 
that might not be accurate. It needs somethign concrete which IMO can 
come from elf attributes. If not, do you have suggestions on how to 
solve this issue ?

Granted the case of executable itself using V insns directly is less 
likely than the linked/dlopen dso, so we can punt this being done in 
kernel elf loader and do it in the glibc loader for the DT_NEEDED dsos.

> It's not clear in psABI spec, but intend to fix in future:
> https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/292

Please don't change the semantics of Tag_RISCV_arch itself. Keep the 
minimum if you want, but also have something which reflects the absolute 
-march used to build. If nothing it can be used to annotate binaries how 
they were built.

Thx,
-Vineet
  
Richard Henderson Jan. 11, 2023, 1:22 a.m. UTC | #29
On 1/10/23 10:07, Vineet Gupta wrote:
> Yes bulk of glibc might not have vector code, but those V ifunc routines do and IMO this 
> information needs to be recorded somewhere in the elf. Case in point being the current 
> issue with how to enable V unit. Community wants a per-process enable, using an explicit 
> prctl from userspace (since RV doesn't have fault-on-first use hardware mechanism unlike 
> some of the other arches). But how does the glibc loader know to invoke prctl. We can't 
> just rely on user env GLIBC_TUNABLE etc since that might not be accurate. It needs 
> somethign concrete which IMO can come from elf attributes. If not, do you have suggestions 
> on how to solve this issue ?

Why not just fault on first use to enable?  That's vastly less complicated than trying to 
plumb anything through elf resulting in a prctl.

You might say "but the fault could fail to allocate memory", but honestly, the prctl isn't 
able to fail either -- if it doesn't work, the process must exit.  And, surely, there's 
some minimal vector configuration for which the allocation must succeed.


r~
  
Jeff Law Jan. 11, 2023, 4:28 a.m. UTC | #30
On 1/10/23 18:22, Richard Henderson wrote:
> On 1/10/23 10:07, Vineet Gupta wrote:
>> Yes bulk of glibc might not have vector code, but those V ifunc 
>> routines do and IMO this information needs to be recorded somewhere in 
>> the elf. Case in point being the current issue with how to enable V 
>> unit. Community wants a per-process enable, using an explicit prctl 
>> from userspace (since RV doesn't have fault-on-first use hardware 
>> mechanism unlike some of the other arches). But how does the glibc 
>> loader know to invoke prctl. We can't just rely on user env 
>> GLIBC_TUNABLE etc since that might not be accurate. It needs somethign 
>> concrete which IMO can come from elf attributes. If not, do you have 
>> suggestions on how to solve this issue ?
> 
> Why not just fault on first use to enable?  That's vastly less 
> complicated than trying to plumb anything through elf resulting in a prctl.
Well, the answer is in Vineet's paragraph -- the hardware apparently 
doesn't have fault-on-first-use which is mighty unfortunate.

Jeff
  
Richard Henderson Jan. 11, 2023, 4:57 a.m. UTC | #31
On 1/10/23 20:28, Jeff Law wrote:
> 
> 
> On 1/10/23 18:22, Richard Henderson wrote:
>> On 1/10/23 10:07, Vineet Gupta wrote:
>>> Yes bulk of glibc might not have vector code, but those V ifunc routines do and IMO 
>>> this information needs to be recorded somewhere in the elf. Case in point being the 
>>> current issue with how to enable V unit. Community wants a per-process enable, using an 
>>> explicit prctl from userspace (since RV doesn't have fault-on-first use hardware 
>>> mechanism unlike some of the other arches). But how does the glibc loader know to 
>>> invoke prctl. We can't just rely on user env GLIBC_TUNABLE etc since that might not be 
>>> accurate. It needs somethign concrete which IMO can come from elf attributes. If not, 
>>> do you have suggestions on how to solve this issue ?
>>
>> Why not just fault on first use to enable?  That's vastly less complicated than trying 
>> to plumb anything through elf resulting in a prctl.
> Well, the answer is in Vineet's paragraph -- the hardware apparently doesn't have 
> fault-on-first-use which is mighty unfortunate.

Nonsense -- sstatus.vs stores {off, initial, clean, dirty} state, just like fpu.
Now treat the vector unit just like fpu lazy migration.


r~
  
Anup Patel Jan. 11, 2023, 5:05 a.m. UTC | #32
On Wed, Jan 11, 2023 at 6:53 AM Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 1/10/23 10:07, Vineet Gupta wrote:
> > Yes bulk of glibc might not have vector code, but those V ifunc routines do and IMO this
> > information needs to be recorded somewhere in the elf. Case in point being the current
> > issue with how to enable V unit. Community wants a per-process enable, using an explicit
> > prctl from userspace (since RV doesn't have fault-on-first use hardware mechanism unlike
> > some of the other arches). But how does the glibc loader know to invoke prctl. We can't
> > just rely on user env GLIBC_TUNABLE etc since that might not be accurate. It needs
> > somethign concrete which IMO can come from elf attributes. If not, do you have suggestions
> > on how to solve this issue ?
>
> Why not just fault on first use to enable?  That's vastly less complicated than trying to
> plumb anything through elf resulting in a prctl.
>
> You might say "but the fault could fail to allocate memory", but honestly, the prctl isn't
> able to fail either -- if it doesn't work, the process must exit.  And, surely, there's
> some minimal vector configuration for which the allocation must succeed.

IMO, this is a very good suggestion.

For the benefit of everyone, both sstatus.FS and sstatus.VS have the
following states:
1. Off (0): All off and any access to float / vector will result in exception
2. Initial (1): None dirty or clean, some on
3. Clean (2): None dirty, some clean
4. Dirty (3): Some dirty

For float, we are setting sstatus.FS = 1 (Initial) in start_thread() by
default for all tasks and we are doing lazy save-restore in fstate_save()
and fstate_restore().

For vector, we can take a different approach where start_thread()
will by default set sstatus.VS = 0 (Off) for all tasks. Now whenever
any task access vector state, Linux RISC-V will get an exception
and at that point in time we can allocate memory for the vector
state and also set sstatus.VS = 1 (Initial) for that task. The save
restore of the vector state can still be lazy for the tasks using it.

Regards,
Anup
  
Jeff Law Jan. 11, 2023, 5:07 a.m. UTC | #33
On 1/10/23 21:57, Richard Henderson wrote:
> On 1/10/23 20:28, Jeff Law wrote:
>>
>>
>> On 1/10/23 18:22, Richard Henderson wrote:
>>> On 1/10/23 10:07, Vineet Gupta wrote:
>>>> Yes bulk of glibc might not have vector code, but those V ifunc 
>>>> routines do and IMO this information needs to be recorded somewhere 
>>>> in the elf. Case in point being the current issue with how to enable 
>>>> V unit. Community wants a per-process enable, using an explicit 
>>>> prctl from userspace (since RV doesn't have fault-on-first use 
>>>> hardware mechanism unlike some of the other arches). But how does 
>>>> the glibc loader know to invoke prctl. We can't just rely on user 
>>>> env GLIBC_TUNABLE etc since that might not be accurate. It needs 
>>>> somethign concrete which IMO can come from elf attributes. If not, 
>>>> do you have suggestions on how to solve this issue ?
>>>
>>> Why not just fault on first use to enable?  That's vastly less 
>>> complicated than trying to plumb anything through elf resulting in a 
>>> prctl.
>> Well, the answer is in Vineet's paragraph -- the hardware apparently 
>> doesn't have fault-on-first-use which is mighty unfortunate.
> 
> Nonsense -- sstatus.vs stores {off, initial, clean, dirty} state, just 
> like fpu.
> Now treat the vector unit just like fpu lazy migration.
Then let's do something sensible.    Manually enabling via prctl seems 
silly if we have fault on first use.

jeff
  
Richard Henderson Jan. 11, 2023, 5:23 a.m. UTC | #34
On 1/10/23 17:22, Richard Henderson wrote:
> And, surely, there's some minimal vector configuration for which the allocation must succeed.

To answer my own question here, no, there does not seem to be a way to cap VLMAX in the OS 
or the hypervisor -- vsetvli rd, r0, e8 will always set VL to the VLMAX for which the cpu 
is configured.

(ARM SVE can artificially limit the vector length.  Linux chooses the default vector 
length so that state fits within the existing 4k signal stack frame.  This is good enough 
for the vector usage within e.g. strlen.  In order to take advantage of any larger vector 
length the hardware may support, one must use a prctl.)


r~
  
Andy Chiu Jan. 11, 2023, 6 a.m. UTC | #35
On Wed, Jan 11, 2023 at 1:07 PM Jeff Law <jlaw@ventanamicro.com> wrote:
>
>
>
> On 1/10/23 21:57, Richard Henderson wrote:
> > On 1/10/23 20:28, Jeff Law wrote:
> >>
> >>
> >> On 1/10/23 18:22, Richard Henderson wrote:
> >>> On 1/10/23 10:07, Vineet Gupta wrote:
> >>>> Yes bulk of glibc might not have vector code, but those V ifunc
> >>>> routines do and IMO this information needs to be recorded somewhere
> >>>> in the elf. Case in point being the current issue with how to enable
> >>>> V unit. Community wants a per-process enable, using an explicit
> >>>> prctl from userspace (since RV doesn't have fault-on-first use
> >>>> hardware mechanism unlike some of the other arches). But how does
> >>>> the glibc loader know to invoke prctl. We can't just rely on user
> >>>> env GLIBC_TUNABLE etc since that might not be accurate. It needs
> >>>> somethign concrete which IMO can come from elf attributes. If not,
> >>>> do you have suggestions on how to solve this issue ?
> >>>
> >>> Why not just fault on first use to enable?  That's vastly less
> >>> complicated than trying to plumb anything through elf resulting in a
> >>> prctl.
> >> Well, the answer is in Vineet's paragraph -- the hardware apparently
> >> doesn't have fault-on-first-use which is mighty unfortunate.
> >
> > Nonsense -- sstatus.vs stores {off, initial, clean, dirty} state, just
> > like fpu.
> > Now treat the vector unit just like fpu lazy migration.
> Then let's do something sensible.    Manually enabling via prctl seems
> silly if we have fault on first use.
Yes, faulting on first use is a viable way of approaching. However, my
concern is that doing this on a system with libraries having common
V-optimized routines such as memcpy, memset would essentially trap
every process to m-mode starting up. This might take more cost than a
prctl syscall. And if every process on the system wants to be
benefited from V-optimized ifuncs, then having an additional prctl to
call at start time seems tedious as well.

Andy
  
Jeff Law Jan. 11, 2023, 6:20 a.m. UTC | #36
On 1/10/23 23:00, Andy Chiu wrote:
> On Wed, Jan 11, 2023 at 1:07 PM Jeff Law <jlaw@ventanamicro.com> wrote:
>>
>>
>>
>> On 1/10/23 21:57, Richard Henderson wrote:
>>> On 1/10/23 20:28, Jeff Law wrote:
>>>>
>>>>
>>>> On 1/10/23 18:22, Richard Henderson wrote:
>>>>> On 1/10/23 10:07, Vineet Gupta wrote:
>>>>>> Yes bulk of glibc might not have vector code, but those V ifunc
>>>>>> routines do and IMO this information needs to be recorded somewhere
>>>>>> in the elf. Case in point being the current issue with how to enable
>>>>>> V unit. Community wants a per-process enable, using an explicit
>>>>>> prctl from userspace (since RV doesn't have fault-on-first use
>>>>>> hardware mechanism unlike some of the other arches). But how does
>>>>>> the glibc loader know to invoke prctl. We can't just rely on user
>>>>>> env GLIBC_TUNABLE etc since that might not be accurate. It needs
>>>>>> somethign concrete which IMO can come from elf attributes. If not,
>>>>>> do you have suggestions on how to solve this issue ?
>>>>>
>>>>> Why not just fault on first use to enable?  That's vastly less
>>>>> complicated than trying to plumb anything through elf resulting in a
>>>>> prctl.
>>>> Well, the answer is in Vineet's paragraph -- the hardware apparently
>>>> doesn't have fault-on-first-use which is mighty unfortunate.
>>>
>>> Nonsense -- sstatus.vs stores {off, initial, clean, dirty} state, just
>>> like fpu.
>>> Now treat the vector unit just like fpu lazy migration.
>> Then let's do something sensible.    Manually enabling via prctl seems
>> silly if we have fault on first use.
> Yes, faulting on first use is a viable way of approaching. However, my
> concern is that doing this on a system with libraries having common
> V-optimized routines such as memcpy, memset would essentially trap
> every process to m-mode starting up. This might take more cost than a
> prctl syscall. And if every process on the system wants to be
> benefited from V-optimized ifuncs, then having an additional prctl to
> call at start time seems tedious as well.
>
It's not perfect, but it's workable.  Explicitly turning things on seems 
like madness.  It boils down to having to annotate every binary and DSO 
and also adds complexity to JITs, the dynamic loader and probably all 
kinds of places we haven't thought through yet.

Fault on first use is well understood and has been implemented on many 
architectures through the decades, even with its warts.

jeff
  
Andy Chiu Jan. 11, 2023, 9:28 a.m. UTC | #37
On Wed, Jan 11, 2023 at 2:20 PM Jeff Law <jlaw@ventanamicro.com> wrote:
> Fault on first use is well understood and has been implemented on many
> architectures through the decades, even with its warts.

Unfortunately, we don't have a direct way of acknowledging if an
illegal instruction is caused by illegitimate use of V instructions.
Unlike ARM64, where reading ESR_EL1.EC is enough to distinguish the
fault, we may have to perform a sw decode on the faulting instruction.
Then see if it is the first-use fault, or a more general illegal
instruction fault.

Yes, we may just enable V for a process whenever we find an OP-V major
opcode, or a LOAD/STORE-FP with vector-encoded width on illegal
instruction. But it could be kind of messy, IF, later extensions would
also like to be enabled at first-use-fault. (e.g. ARM has SME followed
by SVE). And implementing this decoding logic in sw just seems
redundant to me because hw has already done that for us.

Besides, ARM64 has individual mappings of traps for the use of
FP-related units in EL1 and EL0. So SIMD running in kernel mode would
not take additional instruction to enable the unit. I assume these
kinds of CSR-controlling instructions would have to flush hw internal
buffers to some extent. And doing these takes additional latencies.

Andy
  
Andy Chiu Jan. 11, 2023, 12:13 p.m. UTC | #38
On Wed, Jan 11, 2023 at 5:28 PM Andy Chiu <andy.chiu@sifive.com> wrote:
>
> On Wed, Jan 11, 2023 at 2:20 PM Jeff Law <jlaw@ventanamicro.com> wrote:
> > Fault on first use is well understood and has been implemented on many
> > architectures through the decades, even with its warts.
>
> Unfortunately, we don't have a direct way of acknowledging if an
> illegal instruction is caused by illegitimate use of V instructions.
> Unlike ARM64, where reading ESR_EL1.EC is enough to distinguish the
> fault, we may have to perform a sw decode on the faulting instruction.
> Then see if it is the first-use fault, or a more general illegal
> instruction fault.
After taking more considerations, I think this could be minor. The
first V-instruction of a valid program that uses Vector is limited to
vset{i}vl{i}, vl<nf>r, or vs<nf>r. And perhaps some r/w of
vector-specific CSRs. Decoding these instructions should be relatively
constraint and easy. And we need this decoding only once for each
process since we don't have to do lazy save/restore.
>
> Yes, we may just enable V for a process whenever we find an OP-V major
> opcode, or a LOAD/STORE-FP with vector-encoded width on illegal
> instruction. But it could be kind of messy, IF, later extensions would
> also like to be enabled at first-use-fault. (e.g. ARM has SME followed
> by SVE). And implementing this decoding logic in sw just seems
> redundant to me because hw has already done that for us.
Let's limit our discussion to the scope of VS enablement for now.
>
> Besides, ARM64 has individual mappings of traps for the use of
> FP-related units in EL1 and EL0. So SIMD running in kernel mode would
> not take additional instruction to enable the unit. I assume these
> kinds of CSR-controlling instructions would have to flush hw internal
> buffers to some extent. And doing these takes additional latencies.
We already do some VS/FS settings on the entry of kernel code. So this
should be minor as well.

Anyway, I agree that faulting on first-uses is a better way to make
per-process control of VS feasible. Sorry for disturbing the list.

Thanks,
Andy
  
Conor Dooley Jan. 23, 2023, 12:17 p.m. UTC | #39
Hey Andy!

On Wed, Jan 11, 2023 at 08:13:27PM +0800, Andy Chiu wrote:
> On Wed, Jan 11, 2023 at 5:28 PM Andy Chiu <andy.chiu@sifive.com> wrote:
> >
> > On Wed, Jan 11, 2023 at 2:20 PM Jeff Law <jlaw@ventanamicro.com> wrote:
> > > Fault on first use is well understood and has been implemented on many
> > > architectures through the decades, even with its warts.
> >
> > Unfortunately, we don't have a direct way of acknowledging if an
> > illegal instruction is caused by illegitimate use of V instructions.
> > Unlike ARM64, where reading ESR_EL1.EC is enough to distinguish the
> > fault, we may have to perform a sw decode on the faulting instruction.
> > Then see if it is the first-use fault, or a more general illegal
> > instruction fault.
> After taking more considerations, I think this could be minor. The
> first V-instruction of a valid program that uses Vector is limited to
> vset{i}vl{i}, vl<nf>r, or vs<nf>r. And perhaps some r/w of
> vector-specific CSRs. Decoding these instructions should be relatively
> constraint and easy. And we need this decoding only once for each
> process since we don't have to do lazy save/restore.
> >
> > Yes, we may just enable V for a process whenever we find an OP-V major
> > opcode, or a LOAD/STORE-FP with vector-encoded width on illegal
> > instruction. But it could be kind of messy, IF, later extensions would
> > also like to be enabled at first-use-fault. (e.g. ARM has SME followed
> > by SVE). And implementing this decoding logic in sw just seems
> > redundant to me because hw has already done that for us.
> Let's limit our discussion to the scope of VS enablement for now.
> >
> > Besides, ARM64 has individual mappings of traps for the use of
> > FP-related units in EL1 and EL0. So SIMD running in kernel mode would
> > not take additional instruction to enable the unit. I assume these
> > kinds of CSR-controlling instructions would have to flush hw internal
> > buffers to some extent. And doing these takes additional latencies.
> We already do some VS/FS settings on the entry of kernel code. So this
> should be minor as well.
> 
> Anyway, I agree that faulting on first-uses is a better way to make
> per-process control of VS feasible.

> Sorry for disturbing the list.

Meh, all of these discussions seem worthwhile to me!

Now that things have died down though, I'm curious - what are your
plans? Still going to submit another version of this series?

Thanks,
Conor.
  
Andy Chiu Jan. 23, 2023, 1:29 p.m. UTC | #40
Hey Conor,

On Mon, Jan 23, 2023 at 8:18 PM Conor Dooley <conor.dooley@microchip.com> wrote:
> Meh, all of these discussions seem worthwhile to me!
>
> Now that things have died down though, I'm curious - what are your
> plans? Still going to submit another version of this series?
>
Yes, we have implemented most of it and are planning to send the
series in recent days. Thanks to Vineet, he is helping me to sort out
some last bits before the submission. Here are some points related to
this thread that will be in v13:

1. allocate V context in the first-use trap
2. drop prctl V-controlling because it conflicts with the idea of the
first-use trap.
2. sigframe/ptrace will not have V context if a process's VS is off
3. If the kernel is compiled with CONFIG_RISCV_ISA_V enabled, then the
auxv always reports size of the sigframe as if there is a V context.
This is because user space may need information from auxv to set up an
alternative signal stack, and it may not know if it would use V. ARM64
also reports the size assuming all extensions are used.

Thanks,
Andy
  

Patch

From 169eea1ef072c8403277a66313b00258080ac92c Mon Sep 17 00:00:00 2001
From: Vineet Gupta <vineetg@rivosinc.com>
Date: Wed, 21 Sep 2022 14:43:52 -0700
Subject: [PATCH] riscv: Add sigcontext save/restore for vector

V state needs to be preserved across signal handling on user stack.
To avoid glibc ABI break, this is not added to struct sigcontext (just as
for int/fp regs) but to struct rt_sigframe. Also this is all done
dynamically (vs. some static allocation) to cleanly handle implementation
defined VLEN wide V-regs.

We also borrow arm64 style of "context header" to tag the extension
state to allow for easy integration of future extensions.

Co-developed-by: Vincent Chen <vincent.chen@sifive.com>
Co-developed-by: Greentime Hu <greentime.hu@sifive.com>
Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Signed-off-by: Greentime Hu <greentime.hu@sifive.com>
Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
[vineetg: reworked to not change struct sigcontext,
          wireup init_rt_signal_env]
---
 arch/riscv/include/asm/processor.h       |   1 +
 arch/riscv/include/uapi/asm/sigcontext.h |  18 +++
 arch/riscv/kernel/asm-offsets.c          |   2 +
 arch/riscv/kernel/setup.c                |   2 +
 arch/riscv/kernel/signal.c               | 171 +++++++++++++++++++++--
 5 files changed, 186 insertions(+), 8 deletions(-)

diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h
index 95917a2b24f9..854854b377b2 100644
--- a/arch/riscv/include/asm/processor.h
+++ b/arch/riscv/include/asm/processor.h
@@ -85,6 +85,7 @@  int riscv_of_parent_hartid(struct device_node *node, unsigned long *hartid);
 
 extern void riscv_fill_hwcap(void);
 extern int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src);
+void init_rt_signal_env(void);
 
 #endif /* __ASSEMBLY__ */
 
diff --git a/arch/riscv/include/uapi/asm/sigcontext.h b/arch/riscv/include/uapi/asm/sigcontext.h
index 84f2dfcfdbce..411bf6985784 100644
--- a/arch/riscv/include/uapi/asm/sigcontext.h
+++ b/arch/riscv/include/uapi/asm/sigcontext.h
@@ -8,6 +8,24 @@ 
 
 #include <asm/ptrace.h>
 
+/* The Magic number for signal context frame header. */
+#define RVV_MAGIC	0x53465457
+#define END_MAGIC	0x0
+
+/* The size of END signal context header. */
+#define END_HDR_SIZE	0x0
+
+/* Every optional extension state needs to have the hdr. */
+struct __riscv_ctx_hdr {
+	__u32 magic;
+	__u32 size;
+};
+
+struct __sc_riscv_v_state {
+	struct __riscv_ctx_hdr head;
+	struct __riscv_v_state v_state;
+} __attribute__((aligned(16)));
+
 /*
  * Signal context structure
  *
diff --git a/arch/riscv/kernel/asm-offsets.c b/arch/riscv/kernel/asm-offsets.c
index 37e3e6a8d877..80316ef7bb78 100644
--- a/arch/riscv/kernel/asm-offsets.c
+++ b/arch/riscv/kernel/asm-offsets.c
@@ -75,6 +75,8 @@  void asm_offsets(void)
 	OFFSET(TSK_STACK_CANARY, task_struct, stack_canary);
 #endif
 
+	OFFSET(RISCV_V_STATE_MAGIC, __riscv_ctx_hdr, magic);
+	OFFSET(RISCV_V_STATE_SIZE, __riscv_ctx_hdr, size);
 	OFFSET(RISCV_V_STATE_VSTART, __riscv_v_state, vstart);
 	OFFSET(RISCV_V_STATE_VL, __riscv_v_state, vl);
 	OFFSET(RISCV_V_STATE_VTYPE, __riscv_v_state, vtype);
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index 2dfc463b86bb..aa0eedd3b890 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -299,6 +299,8 @@  void __init setup_arch(char **cmdline_p)
 	riscv_init_cbom_blocksize();
 	riscv_fill_hwcap();
 	apply_boot_alternatives();
+	/* needs to be after riscv_fill_hwcap */
+	init_rt_signal_env();
 }
 
 static int __init topology_init(void)
diff --git a/arch/riscv/kernel/signal.c b/arch/riscv/kernel/signal.c
index 5c591123c440..ee234c319e5b 100644
--- a/arch/riscv/kernel/signal.c
+++ b/arch/riscv/kernel/signal.c
@@ -21,15 +21,27 @@ 
 #include <asm/csr.h>
 
 extern u32 __user_rt_sigreturn[2];
+static size_t rvv_sc_size;
 
 #define DEBUG_SIG 0
 
 struct rt_sigframe {
 	struct siginfo info;
-	struct ucontext uc;
 #ifndef CONFIG_MMU
 	u32 sigreturn_code[2];
 #endif
+	struct ucontext uc;
+	/*
+	 * Placeholder for additional state for V ext (and others in future).
+	 *  - Not added to struct sigcontext (unlike int/fp regs) to remain
+	 *    compatible with existing glibc struct sigcontext
+	 *  - Not added here explicitly either to allow for
+	 *     - Implementation defined VLEN wide V reg
+	 *     - Ability to do this per process
+	 * The actual V state struct is defined in uapi header.
+	 * Note: The alignment of 16 is ABI mandated for stack entries.
+	 */
+	__u8 sc_extn[] __attribute__((__aligned__(16)));
 };
 
 #ifdef CONFIG_FPU
@@ -86,16 +98,142 @@  static long save_fp_state(struct pt_regs *regs,
 #define restore_fp_state(task, regs) (0)
 #endif
 
-static long restore_sigcontext(struct pt_regs *regs,
-	struct sigcontext __user *sc)
+#ifdef CONFIG_RISCV_ISA_V
+
+static long save_v_state(struct pt_regs *regs, void **sc_vec)
+{
+	/*
+	 * Put __sc_riscv_v_state to the user's signal context space pointed
+	 * by sc_vec and the datap point the address right
+	 * after __sc_riscv_v_state.
+	 */
+	struct __sc_riscv_v_state __user *state = (struct __sc_riscv_v_state *) (*sc_vec);
+	void __user *datap = state + 1;
+	long err;
+
+	err = __put_user(RVV_MAGIC, &state->head.magic);
+	err = __put_user(rvv_sc_size, &state->head.size);
+
+	vstate_save(current, regs);
+	/* Copy additional vstate (except V regfile). */
+	err = __copy_to_user(&state->v_state, &current->thread.vstate,
+			     RISCV_V_STATE_DATAP);
+	if (unlikely(err))
+		return err;
+
+	/* Copy the pointer datap itself. */
+	err = __put_user(datap, &state->v_state.datap);
+	if (unlikely(err))
+		return err;
+
+	/* Copy the V regfile to user space datap. */
+	err = __copy_to_user(datap, current->thread.vstate.datap, riscv_vsize);
+
+	*sc_vec += rvv_sc_size;
+
+	return err;
+}
+
+static long restore_v_state(struct pt_regs *regs, void **sc_vec)
+{
+	long err;
+	struct __sc_riscv_v_state __user *state = (struct __sc_riscv_v_state *)(*sc_vec);
+	void __user *datap;
+
+	/* ctx_hdr check for RVV_MAGIC already done in caller. */
+
+	/* Copy everything of __sc_riscv_v_state except datap. */
+	err = __copy_from_user(&current->thread.vstate, &state->v_state,
+			       RISCV_V_STATE_DATAP);
+	if (unlikely(err))
+		return err;
+
+	/* Copy the pointer datap itself. */
+	err = __get_user(datap, &state->v_state.datap);
+	if (unlikely(err))
+		return err;
+
+	/* Copy the whole vector content from user space datap. */
+	err = __copy_from_user(current->thread.vstate.datap, datap, riscv_vsize);
+	if (unlikely(err))
+		return err;
+
+	vstate_restore(current, regs);
+
+	*sc_vec += rvv_sc_size;
+
+	return err;
+}
+
+#else
+#define save_v_state(task, regs) (0)
+#define restore_v_state(task, regs) (0)
+#endif
+
+static long restore_sigcontext(struct rt_sigframe __user *frame,
+			       struct pt_regs *regs)
 {
+	struct sigcontext __user *sc = &frame->uc.uc_mcontext;
+	void *sc_extn = &frame->sc_extn;
 	long err;
+
 	/* sc_regs is structured the same as the start of pt_regs */
 	err = __copy_from_user(regs, &sc->sc_regs, sizeof(sc->sc_regs));
 	/* Restore the floating-point state. */
 	if (has_fpu())
 		err |= restore_fp_state(regs, &sc->sc_fpregs);
+
+	while (1 && !err) {
+		struct __riscv_ctx_hdr *head = (struct __riscv_ctx_hdr *)sc_extn;
+		__u32 magic, size;
+
+		err |= __get_user(magic, &head->magic);
+		err |= __get_user(size, &head->size);
+		if (err)
+			goto done;
+
+		switch (magic) {
+		case END_MAGIC:
+			if (size != END_HDR_SIZE)
+				goto invalid;
+			goto done;
+		case RVV_MAGIC:
+			if (!has_vector() || (size != rvv_sc_size))
+				goto invalid;
+			err |= restore_v_state(regs, &sc_extn);
+			break;
+		default:
+			goto invalid;
+		}
+	}
+done:
 	return err;
+
+invalid:
+	return -EINVAL;
+}
+
+static size_t cal_rt_frame_size(void)
+{
+	struct rt_sigframe __user *frame;
+	static size_t frame_size;
+	size_t total_context_size = 0;
+
+	if (frame_size)
+		goto done;
+
+	total_context_size = sizeof(*frame);
+
+	if (has_vector())
+		total_context_size += rvv_sc_size;
+
+	/* Add a __riscv_ctx_hdr for END signal context header. */
+	total_context_size += sizeof(struct __riscv_ctx_hdr);
+
+	frame_size = round_up(total_context_size, 16);
+done:
+	return frame_size;
+
 }
 
 SYSCALL_DEFINE0(rt_sigreturn)
@@ -104,13 +242,14 @@  SYSCALL_DEFINE0(rt_sigreturn)
 	struct rt_sigframe __user *frame;
 	struct task_struct *task;
 	sigset_t set;
+	size_t frame_size = cal_rt_frame_size();
 
 	/* Always make any pending restarted system calls return -EINTR */
 	current->restart_block.fn = do_no_restart_syscall;
 
 	frame = (struct rt_sigframe __user *)regs->sp;
 
-	if (!access_ok(frame, sizeof(*frame)))
+	if (!access_ok(frame, frame_size))
 		goto badframe;
 
 	if (__copy_from_user(&set, &frame->uc.uc_sigmask, sizeof(set)))
@@ -118,7 +257,7 @@  SYSCALL_DEFINE0(rt_sigreturn)
 
 	set_current_blocked(&set);
 
-	if (restore_sigcontext(regs, &frame->uc.uc_mcontext))
+	if (restore_sigcontext(frame, regs))
 		goto badframe;
 
 	if (restore_altstack(&frame->uc.uc_stack))
@@ -141,15 +280,24 @@  SYSCALL_DEFINE0(rt_sigreturn)
 }
 
 static long setup_sigcontext(struct rt_sigframe __user *frame,
-	struct pt_regs *regs)
+			     struct pt_regs *regs)
 {
 	struct sigcontext __user *sc = &frame->uc.uc_mcontext;
+	void *sc_extn = &frame->sc_extn;
 	long err;
+
 	/* sc_regs is structured the same as the start of pt_regs */
 	err = __copy_to_user(&sc->sc_regs, regs, sizeof(sc->sc_regs));
 	/* Save the floating-point state. */
 	if (has_fpu())
 		err |= save_fp_state(regs, &sc->sc_fpregs);
+	/* Save the vector state. */
+	if (has_vector())
+		err |= save_v_state(regs, &sc_extn);
+
+	/* Put END __riscv_ctx_hdr at the end. */
+	err = __put_user(END_MAGIC, &((struct __riscv_ctx_hdr *)sc_extn)->magic);
+	err = __put_user(END_HDR_SIZE, &((struct __riscv_ctx_hdr *)sc_extn)->size);
 	return err;
 }
 
@@ -180,10 +328,11 @@  static int setup_rt_frame(struct ksignal *ksig, sigset_t *set,
 	struct pt_regs *regs)
 {
 	struct rt_sigframe __user *frame;
+	size_t frame_size = cal_rt_frame_size();
 	long err = 0;
 
-	frame = get_sigframe(ksig, regs, sizeof(*frame));
-	if (!access_ok(frame, sizeof(*frame)))
+	frame = get_sigframe(ksig, regs, frame_size);
+	if (!access_ok(frame, frame_size))
 		return -EFAULT;
 
 	err |= copy_siginfo_to_user(&frame->info, &ksig->info);
@@ -329,3 +478,9 @@  asmlinkage __visible void do_notify_resume(struct pt_regs *regs,
 	if (thread_info_flags & _TIF_NOTIFY_RESUME)
 		resume_user_mode_work(regs);
 }
+
+void __init init_rt_signal_env(void)
+{
+	/* Vector regfile + control regs. */
+	rvv_sc_size = sizeof(struct __sc_riscv_v_state) + riscv_vsize;
+}
-- 
2.34.1