powerpc: Enable demuxed sysv IPC syscalls

Message ID 1904883.fPjB3uFKBy@wuerfel
State Not applicable
Delegated to: Tulio Magno Quites Machado Filho
Headers

Commit Message

Arnd Bergmann Dec. 14, 2015, 10:04 p.m. UTC
  On Monday 14 December 2015 10:40:12 Paul E. Murphy wrote:
> On 12/03/2015 05:13 PM, Arnd Bergmann wrote:
> > On Friday 04 December 2015 00:09:08 Arnd Bergmann wrote:
> >> On Thursday 03 December 2015 23:12:10 Andreas Schwab wrote:
> >>>>  #endif
> >>>> @@ -64,7 +68,12 @@ compat_symbol (libc, __old_msgctl, msgctl, GLIBC_2_0);
> >>>>  int
> >>>>  __new_msgctl (int msqid, int cmd, struct msqid_ds *buf)
> >>>>  {
> >>>> +#ifdef __ASSUME_MSGCTL_SYSCALL
> >>>> +  return INLINE_SYSCALL (msgctl, 3, msqid, cmd | __IPC_64, buf);
> >>>
> >>> Why does a brand new syscall need IPC_64?
> >>
> >> This is a bug in the kernel, which we should fix there.  The same
> >> problem currently exists on ARM and AVR32, which also support the
> >> old IPC API (pre-__IPC64) and are adding separate syscalls now.
> > 
> > Correction, I looked at the wrong place: ARM and AVR32 have had this
> > problem for a long time, so we can't fix it any more. But we should
> > fix it for PowerPC and all other architectures that add these calls
> > in the future.
> 
> I'm not clear as to what you are suggesting for this patch. Looking at the
> kernel code, it does not look trivial to remove the IPC_64 bit. It seems
> to boil down to whether ARCH_WANT_IPC_PARSE_VERSION is configured on the
> kernel.
> 
> Should the compat versions of these be left untouched? Or is it safe to
> switch them to the demuxed version?

It is a little tricky indeed, especially on powerpc where we do this
differently on 32-bit and 64-bit kernels.

I think the patch below would be the simplest way to do this for all
three affected syscalls, but we probably don't want to it this late
in the kernel cycle. Maybe we can do it this way for 4.5, and change
the powerpc syscall table for 4.4 to leave out the separate calls
until we get this right?

	Arnd
  

Comments

Paul E. Murphy Dec. 15, 2015, 9:20 p.m. UTC | #1
On 12/14/2015 04:04 PM, Arnd Bergmann wrote:
> On Monday 14 December 2015 10:40:12 Paul E. Murphy wrote:
>> On 12/03/2015 05:13 PM, Arnd Bergmann wrote:
>>> On Friday 04 December 2015 00:09:08 Arnd Bergmann wrote:
>>>> On Thursday 03 December 2015 23:12:10 Andreas Schwab wrote:
>>>>>>  #endif
>>>>>> @@ -64,7 +68,12 @@ compat_symbol (libc, __old_msgctl, msgctl, GLIBC_2_0);
>>>>>>  int
>>>>>>  __new_msgctl (int msqid, int cmd, struct msqid_ds *buf)
>>>>>>  {
>>>>>> +#ifdef __ASSUME_MSGCTL_SYSCALL
>>>>>> +  return INLINE_SYSCALL (msgctl, 3, msqid, cmd | __IPC_64, buf);
>>>>>
>>>>> Why does a brand new syscall need IPC_64?
>>>>
>>>> This is a bug in the kernel, which we should fix there.  The same
>>>> problem currently exists on ARM and AVR32, which also support the
>>>> old IPC API (pre-__IPC64) and are adding separate syscalls now.
>>>
>>> Correction, I looked at the wrong place: ARM and AVR32 have had this
>>> problem for a long time, so we can't fix it any more. But we should
>>> fix it for PowerPC and all other architectures that add these calls
>>> in the future.
>>
>> I'm not clear as to what you are suggesting for this patch. Looking at the
>> kernel code, it does not look trivial to remove the IPC_64 bit. It seems
>> to boil down to whether ARCH_WANT_IPC_PARSE_VERSION is configured on the
>> kernel.
>>
>> Should the compat versions of these be left untouched? Or is it safe to
>> switch them to the demuxed version?
> 
> It is a little tricky indeed, especially on powerpc where we do this
> differently on 32-bit and 64-bit kernels.
> 
> I think the patch below would be the simplest way to do this for all
> three affected syscalls, but we probably don't want to it this late
> in the kernel cycle. Maybe we can do it this way for 4.5, and change
> the powerpc syscall table for 4.4 to leave out the separate calls
> until we get this right?
> 
> 	Arnd

Would a respin removing shmctl, semctl, and msgctl be acceptable?

I'd argue switching to the demuxed version is still better than going
through the mux. At worst this superfluously sets the IPC64 bit for
kernel's which assume it. Is it even possible to change the behavior
of these syscalls now?
  
Arnd Bergmann Dec. 15, 2015, 9:46 p.m. UTC | #2
On Tuesday 15 December 2015 15:20:14 Paul E. Murphy wrote:
> 
> Would a respin removing shmctl, semctl, and msgctl be acceptable?
> 
> I'd argue switching to the demuxed version is still better than going
> through the mux. At worst this superfluously sets the IPC64 bit for
> kernel's which assume it. Is it even possible to change the behavior
> of these syscalls now?

They have not yet been part of a released kernel, so I think it's still
possible to change or remove them before 4.4 comes out. If we get the
4.4 release with the syscalls in place, we should not change the behavior
any more, otherwise life gets too confusing in user space.

Similarly, I'd argue that removing just those three syscalls but leaving
the other ones in place would only make it harder to use them correctly
later, because every libc that wants to use them has to not only check
if the separate calls are available at all, but also whether the
three *ctl calls need special treatment.

	Arnd
  
Paul E. Murphy Dec. 15, 2015, 11:04 p.m. UTC | #3
On 12/15/2015 03:46 PM, Arnd Bergmann wrote:
> On Tuesday 15 December 2015 15:20:14 Paul E. Murphy wrote:
>>
>> Would a respin removing shmctl, semctl, and msgctl be acceptable?
>>
>> I'd argue switching to the demuxed version is still better than going
>> through the mux. At worst this superfluously sets the IPC64 bit for
>> kernel's which assume it. Is it even possible to change the behavior
>> of these syscalls now?
> 
> They have not yet been part of a released kernel, so I think it's still
> possible to change or remove them before 4.4 comes out. If we get the
> 4.4 release with the syscalls in place, we should not change the behavior
> any more, otherwise life gets too confusing in user space.
> 
> Similarly, I'd argue that removing just those three syscalls but leaving
> the other ones in place would only make it harder to use them correctly
> later, because every libc that wants to use them has to not only check
> if the separate calls are available at all, but also whether the
> three *ctl calls need special treatment.
> 

So, the bug is that some kernels check for, and clear the IPC64 bit, and
some assume it, but don't clear it? I guess that can't be checked at
compile time.

Isn't that still an issue with the current implementations? I guess
IA64 gets around it by generating these syscalls code dynamically.
  
Michael Ellerman Dec. 16, 2015, 8:08 a.m. UTC | #4
On Mon, 2015-12-14 at 23:04 +0100, Arnd Bergmann wrote:
> On Monday 14 December 2015 10:40:12 Paul E. Murphy wrote:
> > On 12/03/2015 05:13 PM, Arnd Bergmann wrote:
> > > On Friday 04 December 2015 00:09:08 Arnd Bergmann wrote:
> > > > On Thursday 03 December 2015 23:12:10 Andreas Schwab wrote:
> > > > > >  #endif
> > > > > > @@ -64,7 +68,12 @@ compat_symbol (libc, __old_msgctl, msgctl, GLIBC_2_0);
> > > > > >  int
> > > > > >  __new_msgctl (int msqid, int cmd, struct msqid_ds *buf)
> > > > > >  {
> > > > > > +#ifdef __ASSUME_MSGCTL_SYSCALL
> > > > > > +  return INLINE_SYSCALL (msgctl, 3, msqid, cmd | __IPC_64, buf);
> > > > > 
> > > > > Why does a brand new syscall need IPC_64?
> > > > 
> > > > This is a bug in the kernel, which we should fix there.  The same
> > > > problem currently exists on ARM and AVR32, which also support the
> > > > old IPC API (pre-__IPC64) and are adding separate syscalls now.
> > > 
> > > Correction, I looked at the wrong place: ARM and AVR32 have had this
> > > problem for a long time, so we can't fix it any more. But we should
> > > fix it for PowerPC and all other architectures that add these calls
> > > in the future.
> > 
> > I'm not clear as to what you are suggesting for this patch. Looking at the
> > kernel code, it does not look trivial to remove the IPC_64 bit. It seems
> > to boil down to whether ARCH_WANT_IPC_PARSE_VERSION is configured on the
> > kernel.
> > 
> > Should the compat versions of these be left untouched? Or is it safe to
> > switch them to the demuxed version?
> 
> It is a little tricky indeed, especially on powerpc where we do this
> differently on 32-bit and 64-bit kernels.
> 
> I think the patch below would be the simplest way to do this for all
> three affected syscalls, but we probably don't want to it this late
> in the kernel cycle. Maybe we can do it this way for 4.5, and change
> the powerpc syscall table for 4.4 to leave out the separate calls
> until we get this right?


Hi Arnd,

I'm willing to revert the powerpc kernel patch for 4.4 if you think it's worth
it so that we can clean up the kernel side before trying again.

Having said that it doesn't seem like it buys us that much. Just that userspace
no longer has to worry about passing IPC_64, or is there more to it that I'm
missing? I guess that's a worthwhile cleanup though.

cheers
  
Andreas Schwab Dec. 16, 2015, 8:42 a.m. UTC | #5
Michael Ellerman <mpe@ellerman.id.au> writes:

> Having said that it doesn't seem like it buys us that much. Just that userspace
> no longer has to worry about passing IPC_64, or is there more to it that I'm
> missing? I guess that's a worthwhile cleanup though.

By removing the requirement to add IPC_64 you remove the need for a
special wrapper, and the syscalls can be implemented by just adding it
to syscalls.list.  That should be the goal for every new syscall.  Also,
it doesn't make sense to offer obsolete interfaces (the non-IPC_64
variants) through such syscalls.

Andreas.
  
Arnd Bergmann Dec. 16, 2015, 9 a.m. UTC | #6
On Wednesday 16 December 2015 09:42:53 Andreas Schwab wrote:
> Michael Ellerman <mpe@ellerman.id.au> writes:
> 
> > Having said that it doesn't seem like it buys us that much. Just that userspace
> > no longer has to worry about passing IPC_64, or is there more to it that I'm
> > missing? I guess that's a worthwhile cleanup though.
> 
> By removing the requirement to add IPC_64 you remove the need for a
> special wrapper, and the syscalls can be implemented by just adding it
> to syscalls.list.  That should be the goal for every new syscall.  Also,
> it doesn't make sense to offer obsolete interfaces (the non-IPC_64
> variants) through such syscalls.

Right, the other point is that we can do it consistently for all
architectures that currently don't have separate ipc syscalls but
may want to add them.

I made a table when this first came up:
https://docs.google.com/spreadsheets/d/18GxXEHE2ywnSr-SPoGFd1ABz6wEM1ex-JMu5lEraaH8

So the affected architectures are cris, frv, m32r, m68k, mips-o32, mn10300,
s390, sh, sparc and x86.

	Arnd
  
Michael Ellerman Dec. 16, 2015, 9:21 a.m. UTC | #7
On Wed, 2015-12-16 at 10:00 +0100, Arnd Bergmann wrote:
> On Wednesday 16 December 2015 09:42:53 Andreas Schwab wrote:
> > Michael Ellerman <mpe@ellerman.id.au> writes:

> > > Having said that it doesn't seem like it buys us that much. Just that userspace
> > > no longer has to worry about passing IPC_64, or is there more to it that I'm
> > > missing? I guess that's a worthwhile cleanup though.
> > 
> > By removing the requirement to add IPC_64 you remove the need for a
> > special wrapper, and the syscalls can be implemented by just adding it
> > to syscalls.list.  That should be the goal for every new syscall.  Also,
> > it doesn't make sense to offer obsolete interfaces (the non-IPC_64
> > variants) through such syscalls.
> 
> Right, the other point is that we can do it consistently for all
> architectures that currently don't have separate ipc syscalls but
> may want to add them.

OK, sounds good.

> I made a table when this first came up:
> https://docs.google.com/spreadsheets/d/18GxXEHE2ywnSr-SPoGFd1ABz6wEM1ex-JMu5lEraaH8
> 
> So the affected architectures are cris, frv, m32r, m68k, mips-o32, mn10300,
> s390, sh, sparc and x86.

Eek, powerpc is red, bad!

So the only complication with reverting the patch is that we've since added
sys_mlock2 after the IPC calls, as NR 378.

I guess because we haven't released a kernel with mlock2 using that number I'll
just renumber it, as if the IPC calls were never there.

cheers
  
Arnd Bergmann Dec. 16, 2015, 9:57 a.m. UTC | #8
On Wednesday 16 December 2015 20:21:47 Michael Ellerman wrote:
> On Wed, 2015-12-16 at 10:00 +0100, Arnd Bergmann wrote:
> > On Wednesday 16 December 2015 09:42:53 Andreas Schwab wrote:
> > > Michael Ellerman <mpe@ellerman.id.au> writes:
> 
> > > > Having said that it doesn't seem like it buys us that much. Just that userspace
> > > > no longer has to worry about passing IPC_64, or is there more to it that I'm
> > > > missing? I guess that's a worthwhile cleanup though.
> > > 
> > > By removing the requirement to add IPC_64 you remove the need for a
> > > special wrapper, and the syscalls can be implemented by just adding it
> > > to syscalls.list.  That should be the goal for every new syscall.  Also,
> > > it doesn't make sense to offer obsolete interfaces (the non-IPC_64
> > > variants) through such syscalls.
> > 
> > Right, the other point is that we can do it consistently for all
> > architectures that currently don't have separate ipc syscalls but
> > may want to add them.
> 
> OK, sounds good.
> 
> > I made a table when this first came up:
> > https://docs.google.com/spreadsheets/d/18GxXEHE2ywnSr-SPoGFd1ABz6wEM1ex-JMu5lEraaH8
> > 
> > So the affected architectures are cris, frv, m32r, m68k, mips-o32, mn10300,
> > s390, sh, sparc and x86.
> 
> Eek, powerpc is red, bad!
> 
> So the only complication with reverting the patch is that we've since added
> sys_mlock2 after the IPC calls, as NR 378.
> 
> I guess because we haven't released a kernel with mlock2 using that number I'll
> just renumber it, as if the IPC calls were never there.

I think it would be better to leave the numbers as they are and bring them back
with the name numbers when we get there, just to avoid any incompatibilities with
people using mlock2 on prerelease kernels.

	Arnd
  
Andreas Schwab Dec. 16, 2015, 9:59 a.m. UTC | #9
Michael Ellerman <mpe@ellerman.id.au> writes:

> So the only complication with reverting the patch is that we've since added
> sys_mlock2 after the IPC calls, as NR 378.
>
> I guess because we haven't released a kernel with mlock2 using that number I'll
> just renumber it, as if the IPC calls were never there.

There is no need to renumber, a gap doesn't pose any difficulties.
Syscall numbers can be reserved any time, the important part is that you
get ENOSYS when trying to call the new ipc syscalls until the final ABI
is implemented.

Andreas.
  

Patch

diff --git a/ipc/compat.c b/ipc/compat.c
index 9b3c85f8a538..e583e7e83c0e 100644
--- a/ipc/compat.c
+++ b/ipc/compat.c
@@ -495,20 +495,19 @@  static inline int put_compat_msqid_ds(struct msqid64_ds *m,
 	return err;
 }
 
-COMPAT_SYSCALL_DEFINE3(msgctl, int, first, int, second, void __user *, uptr)
+static long do_compat_msgctl(int msqid, int cmd, void __user *uptr, int version)
 {
 	int err, err2;
 	struct msqid64_ds m64;
-	int version = compat_ipc_parse_version(&second);
 	void __user *p;
 
 	memset(&m64, 0, sizeof(m64));
 
-	switch (second & (~IPC_64)) {
+	switch (cmd & (~IPC_64)) {
 	case IPC_INFO:
 	case IPC_RMID:
 	case MSG_INFO:
-		err = sys_msgctl(first, second, uptr);
+		err = sys_msgctl(msqid, cmd, uptr);
 		break;
 
 	case IPC_SET:
@@ -523,13 +522,13 @@  COMPAT_SYSCALL_DEFINE3(msgctl, int, first, int, second, void __user *, uptr)
 		if (copy_to_user(p, &m64, sizeof(m64)))
 			err = -EFAULT;
 		else
-			err = sys_msgctl(first, second, p);
+			err = sys_msgctl(msqid, cmd, p);
 		break;
 
 	case IPC_STAT:
 	case MSG_STAT:
 		p = compat_alloc_user_space(sizeof(m64));
-		err = sys_msgctl(first, second, p);
+		err = sys_msgctl(msqid, cmd, p);
 		if (err < 0)
 			break;
 		if (copy_from_user(&m64, p, sizeof(m64)))
@@ -549,6 +548,18 @@  COMPAT_SYSCALL_DEFINE3(msgctl, int, first, int, second, void __user *, uptr)
 	return err;
 }
 
+COMPAT_SYSCALL_DEFINE3(msgctl, int, msqid, int, cmd, void __user *, uptr)
+{
+	int version = compat_ipc_parse_version(&cmd);
+
+	return do_compat_msgctl(msqid, cmd, uptr, version);
+}
+
+COMPAT_SYSCALL_DEFINE3(msgctl64, int, first, int, second, void __user *, uptr)
+{
+	return do_compat_msgctl(msqid, cmd, uptr, IPC_64);
+}
+
 COMPAT_SYSCALL_DEFINE3(shmat, int, shmid, compat_uptr_t, shmaddr, int, shmflg)
 {
 	unsigned long ret;
diff --git a/ipc/msg.c b/ipc/msg.c
index 59559a215401..acbb82a80647 100644
--- a/ipc/msg.c
+++ b/ipc/msg.c
@@ -520,15 +520,13 @@  out_unlock:
 	return err;
 }
 
-SYSCALL_DEFINE3(msgctl, int, msqid, int, cmd, struct msqid_ds __user *, buf)
+static long do_msgctl(int msqid, int cmd, struct msqid_ds __user * buf, int version)
 {
-	int version;
 	struct ipc_namespace *ns;
 
 	if (msqid < 0 || cmd < 0)
 		return -EINVAL;
 
-	version = ipc_parse_version(&cmd);
 	ns = current->nsproxy->ipc_ns;
 
 	switch (cmd) {
@@ -545,6 +543,18 @@  SYSCALL_DEFINE3(msgctl, int, msqid, int, cmd, struct msqid_ds __user *, buf)
 	}
 }
 
+SYSCALL_DEFINE3(msgctl, int, msqid, int, cmd, struct msqid_ds __user *, buf)
+{
+	int version = ipc_parse_version(&cmd);
+
+	return do_msgctl(msqid, cmd, buf, version);
+}
+
+SYSCALL_DEFINE3(msgctl64, int, msqid, int, cmd, struct msqid_ds __user *, buf)
+{
+	return do_msgctl(msqid, cmd, buf, IPC_64);
+}
+
 static int testmsg(struct msg_msg *msg, long type, int mode)
 {
 	switch (mode) {