[RFC,06/10] arm64/sve: Disallow VL setting for individual threads by default

Message ID 1484220369-23970-7-git-send-email-Dave.Martin@arm.com
State New, archived
Headers

Commit Message

Dave Martin Jan. 12, 2017, 11:26 a.m. UTC
  General-purpose code in userspace is not expected to work correctly
if multiple threads are allowed to run concurrently with different
vector lengths in a single process.

This patch adds an explicit flag PR_SVE_SET_VL_THREAD to request
this behaviour.  Without the flag, vector length setting is
permitted only for a single-threaded process (which matches the
expected usage model of setting the vector length at process
startup).

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
 arch/arm64/kernel/fpsimd.c | 12 +++++++++++-
 include/uapi/linux/prctl.h |  1 +
 2 files changed, 12 insertions(+), 1 deletion(-)
  

Comments

Yao Qi Jan. 16, 2017, 11:34 a.m. UTC | #1
On 17-01-12 11:26:05, Dave Martin wrote:
> General-purpose code in userspace is not expected to work correctly
> if multiple threads are allowed to run concurrently with different
> vector lengths in a single process.
> 
> This patch adds an explicit flag PR_SVE_SET_VL_THREAD to request
> this behaviour.  Without the flag, vector length setting is
> permitted only for a single-threaded process (which matches the
> expected usage model of setting the vector length at process
> startup).

Hi Dave,
PR_SVE_SET_VL_THREAD can be arch-independent, IMO, because prctl
needs a scope.  Looks some of them are system-wide, some of them are
about threads within the same process (like, PR_MPX_ENABLE_MANAGEMENT).
IOW, PR_SVE_SET_VL_THREAD can be general flag, to indicate the scope
of each new ptrcl command is per-thread.

I happen to see PR_SET_FP_MODE in man pages, which is about setting
FP register modes in runtime.  It is a little similar to setting VL in
this patch.  However the doc doesn't mention the effect or the scope
of this command.
  
Dave Martin Jan. 16, 2017, 12:23 p.m. UTC | #2
On Mon, Jan 16, 2017 at 11:34:39AM +0000, Yao Qi wrote:
> On 17-01-12 11:26:05, Dave Martin wrote:
> > General-purpose code in userspace is not expected to work correctly
> > if multiple threads are allowed to run concurrently with different
> > vector lengths in a single process.
> > 
> > This patch adds an explicit flag PR_SVE_SET_VL_THREAD to request
> > this behaviour.  Without the flag, vector length setting is
> > permitted only for a single-threaded process (which matches the
> > expected usage model of setting the vector length at process
> > startup).

To be clear, PR_SVE_SET_VL_THREAD is not persistent, it just overrides
the default one-thread-per-process restriction for this prctl call.

The idea is that if someone writes some code to set the VL and then
moves the code to a multithreaded environment, by default it will stop
working.  This is a hint that some actual work is likely needed to
port their code to work with multiple threads.

> Hi Dave,
> PR_SVE_SET_VL_THREAD can be arch-independent, IMO, because prctl
> needs a scope.  Looks some of them are system-wide, some of them are
> about threads within the same process (like, PR_MPX_ENABLE_MANAGEMENT).
> IOW, PR_SVE_SET_VL_THREAD can be general flag, to indicate the scope
> of each new ptrcl command is per-thread.

This can't be backported to the existing prctls because that would
change their behaviour.   Rather, what each prctl applies (thread or
process) is part of the definition of that particular prctl.

Since there are no other prctl() calls that can apply per-thread or
per-process, or that differ only in this regard, is seems a bit esoteric
to try to apply this concept across all prctls... ?

Which prctl()s are system-wide?  I didn't see any, but I may have missed
something.

> I happen to see PR_SET_FP_MODE in man pages, which is about setting
> FP register modes in runtime.  It is a little similar to setting VL in
> this patch.  However the doc doesn't mention the effect or the scope
> of this command.

The various FP/SIMD twiddling prctls() all seem to be arch-specific.
PR_SET_FP_MODE only exists for mips.

Unless the semantics are really the same, I'm not too keen on an arm64
prctl with the same name.

Putting "ARM64" in the name of the new prctls might be clearer, but
nobody seemed to care so far...

Cheers
---Dave
  

Patch

diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 5f2c24a..32caca3 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -21,6 +21,7 @@ 
 #include <linux/cpu_pm.h>
 #include <linux/kernel.h>
 #include <linux/init.h>
+#include <linux/prctl.h>
 #include <linux/sched.h>
 #include <linux/signal.h>
 #include <linux/hardirq.h>
@@ -498,8 +499,17 @@  int sve_set_vector_length(struct task_struct *task,
 {
 	BUG_ON(task == current && preemptible());
 
+	/*
+	 * To avoid accidents, forbid setting for individual threads of a
+	 * multithreaded process.  User code that knows what it's doing can
+	 * pass PR_SVE_SET_VL_THREAD to override this restriction:
+	 */
+	if (!(flags & PR_SVE_SET_VL_THREAD) && get_nr_threads(task) != 1)
+		return -EINVAL;
+
+	flags &= ~(unsigned long)PR_SVE_SET_VL_THREAD;
 	if (flags)
-		return -EINVAL; /* No flags defined yet */
+		return -EINVAL; /* No other flags defined yet */
 
 	if (!sve_vl_valid(vl))
 		return -EINVAL;
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index e32e2da..c55530b 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -199,6 +199,7 @@  struct prctl_mm_map {
 
 /* arm64 Scalable Vector Extension controls */
 #define PR_SVE_SET_VL			48	/* set task vector length */
+# define PR_SVE_SET_VL_THREAD		(1 << 1) /* set just this thread */
 #define PR_SVE_GET_VL			49	/* get task vector length */
 
 #endif /* _LINUX_PRCTL_H */