Error checking for SETXID (bug 13347)
Commit Message
Check for syscall error in the SETXID implementation in NPTL (bug 13347).
At this point, we can only abort the process because we have already
switched credentials on other threads. Returning an error would still
leave the process in an inconsistent state.
The new xtest needs root privileges to run.
Comments
On Mon, 24 Mar 2014, Florian Weimer wrote:
> Check for syscall error in the SETXID implementation in NPTL (bug 13347).
>
> At this point, we can only abort the process because we have already switched
> credentials on other threads. Returning an error would still leave the
> process in an inconsistent state.
This may be the best possible in the absence of a kernel interface for
setting ids atomically for the whole process, but such an interface would
be the desired long-term fix, with aborting from the present code just a
fallback - is there ongoing work to agree such an interface?
On Mon, Mar 24, 2014 at 03:19:59PM +0000, Joseph S. Myers wrote:
> On Mon, 24 Mar 2014, Florian Weimer wrote:
>
> > Check for syscall error in the SETXID implementation in NPTL (bug 13347).
> >
> > At this point, we can only abort the process because we have already switched
> > credentials on other threads. Returning an error would still leave the
> > process in an inconsistent state.
>
> This may be the best possible in the absence of a kernel interface for
> setting ids atomically for the whole process, but such an interface would
> be the desired long-term fix, with aborting from the present code just a
> fallback - is there ongoing work to agree such an interface?
Are you sure you can't make it so that all setuid calls but the first
can't fail? There are basically only 2 reasons it might fail: ulimit
on old kernels (this was fixed in early 3.x series), and failure to
allocate the kernel data structure to represent the new permissions
(they're refcounted/shared, so this failure can only happen on the
first one under normal circumstances, but might be possible still if
setfsuid, etc. are in use).
I'm very much in favor of pressing the kernel folks for a syscall that
makes the change atomically for all threads, but this is vastly
complicated by the fact that they're storing permissions which should
be thread-local (fsuid) in the same object as ones which should be
process-global. So implementing this correctly at the kernel level
might require radical changes to how they represent permissions... :(
Rich
On 03/24/2014 04:19 PM, Joseph S. Myers wrote:
> On Mon, 24 Mar 2014, Florian Weimer wrote:
>
>> Check for syscall error in the SETXID implementation in NPTL (bug 13347).
>>
>> At this point, we can only abort the process because we have already switched
>> credentials on other threads. Returning an error would still leave the
>> process in an inconsistent state.
>
> This may be the best possible in the absence of a kernel interface for
> setting ids atomically for the whole process, but such an interface would
> be the desired long-term fix, with aborting from the present code just a
> fallback - is there ongoing work to agree such an interface?
As far as I know, the credentials switching work goes in the other
direction, providing additional per-thread credentials to glibc-based
userspace:
<http://thread.gmane.org/gmane.linux.file-systems/81751>
Probably like most kernel developers, I'm not convinced that the POSIX
semantics are useful.
On 03/24/2014 04:32 PM, Rich Felker wrote:
> On Mon, Mar 24, 2014 at 03:19:59PM +0000, Joseph S. Myers wrote:
>> On Mon, 24 Mar 2014, Florian Weimer wrote:
>>
>>> Check for syscall error in the SETXID implementation in NPTL (bug 13347).
>>>
>>> At this point, we can only abort the process because we have already switched
>>> credentials on other threads. Returning an error would still leave the
>>> process in an inconsistent state.
>>
>> This may be the best possible in the absence of a kernel interface for
>> setting ids atomically for the whole process, but such an interface would
>> be the desired long-term fix, with aborting from the present code just a
>> fallback - is there ongoing work to agree such an interface?
>
> Are you sure you can't make it so that all setuid calls but the first
> can't fail?
We already are in this situation if application only ever uses the
SETXID wrappers, I think. That's why the test has to resort to directly
invoking the syscall.
Florian Weimer <fweimer@redhat.com> writes:
> Probably like most kernel developers, I'm not convinced that the POSIX
> semantics are useful.
Has anybody raised this on the Austin Group?
Andreas.
On Mon, Mar 24, 2014 at 04:44:04PM +0100, Florian Weimer wrote:
> On 03/24/2014 04:32 PM, Rich Felker wrote:
> >On Mon, Mar 24, 2014 at 03:19:59PM +0000, Joseph S. Myers wrote:
> >>On Mon, 24 Mar 2014, Florian Weimer wrote:
> >>
> >>>Check for syscall error in the SETXID implementation in NPTL (bug 13347).
> >>>
> >>>At this point, we can only abort the process because we have already switched
> >>>credentials on other threads. Returning an error would still leave the
> >>>process in an inconsistent state.
> >>
> >>This may be the best possible in the absence of a kernel interface for
> >>setting ids atomically for the whole process, but such an interface would
> >>be the desired long-term fix, with aborting from the present code just a
> >>fallback - is there ongoing work to agree such an interface?
> >
> >Are you sure you can't make it so that all setuid calls but the first
> >can't fail?
>
> We already are in this situation if application only ever uses the
> SETXID wrappers, I think. That's why the test has to resort to
> directly invoking the syscall.
I don't understand how your message is an answer to the question you
quoted. I was asking whether there might be a way to setup the
conditions prior to making the setuid syscalls such that if the first
one succeeds, the subsequent ones cannot fail. I'll admit it's
unlikely that this is possible, but you can reduce the chances of
failure by temporaily setting RLIMIT_NPROC to infinity before making
the syscalls.
Rich
On 03/24/2014 04:49 PM, Rich Felker wrote:
> On Mon, Mar 24, 2014 at 04:44:04PM +0100, Florian Weimer wrote:
>> On 03/24/2014 04:32 PM, Rich Felker wrote:
>>> On Mon, Mar 24, 2014 at 03:19:59PM +0000, Joseph S. Myers wrote:
>>>> On Mon, 24 Mar 2014, Florian Weimer wrote:
>>>>
>>>>> Check for syscall error in the SETXID implementation in NPTL (bug 13347).
>>>>>
>>>>> At this point, we can only abort the process because we have already switched
>>>>> credentials on other threads. Returning an error would still leave the
>>>>> process in an inconsistent state.
>>>>
>>>> This may be the best possible in the absence of a kernel interface for
>>>> setting ids atomically for the whole process, but such an interface would
>>>> be the desired long-term fix, with aborting from the present code just a
>>>> fallback - is there ongoing work to agree such an interface?
>>>
>>> Are you sure you can't make it so that all setuid calls but the first
>>> can't fail?
>>
>> We already are in this situation if application only ever uses the
>> SETXID wrappers, I think. That's why the test has to resort to
>> directly invoking the syscall.
>
> I don't understand how your message is an answer to the question you
> quoted.
The negations and quantifiers are confusing, so I wasn't sure what you
were asking.
> I was asking whether there might be a way to setup the
> conditions prior to making the setuid syscalls such that if the first
> one succeeds, the subsequent ones cannot fail.
Not in general, no, because the kernel implementation calls into the
Linux Security Module framework, whose modules typically implement
additional preconditions we cannot check in glibc due to insufficient
information.
On Mon, Mar 24, 2014 at 04:57:23PM +0100, Florian Weimer wrote:
> >I was asking whether there might be a way to setup the
> >conditions prior to making the setuid syscalls such that if the first
> >one succeeds, the subsequent ones cannot fail.
>
> Not in general, no, because the kernel implementation calls into the
> Linux Security Module framework, whose modules typically implement
> additional preconditions we cannot check in glibc due to
> insufficient information.
Yes, I'm well aware of the Linux Insecurity Modules framework. Any
framework that can make standard functions with documented interface
contracts violate their own interface contracts subtracts from the
security of a system rather than adding to it, and I really have no
problem with telling users this if they're running broken Insecurity
Modules.
But back to the topic, I was assuming correct behavior from the
kernel. If the kernel misbehaves, aborting is a perfectly reasonable
response (but if LSM's make the kernel lie, can you even tell if it
misbehaved?).
Rich
On Mon, Mar 24, 2014 at 04:41:59PM +0100, Florian Weimer wrote:
> On 03/24/2014 04:19 PM, Joseph S. Myers wrote:
> >On Mon, 24 Mar 2014, Florian Weimer wrote:
> >
> >>Check for syscall error in the SETXID implementation in NPTL (bug 13347).
> >>
> >>At this point, we can only abort the process because we have already switched
> >>credentials on other threads. Returning an error would still leave the
> >>process in an inconsistent state.
> >
> >This may be the best possible in the absence of a kernel interface for
> >setting ids atomically for the whole process, but such an interface would
> >be the desired long-term fix, with aborting from the present code just a
> >fallback - is there ongoing work to agree such an interface?
>
> As far as I know, the credentials switching work goes in the other
> direction, providing additional per-thread credentials to
> glibc-based userspace:
>
> <http://thread.gmane.org/gmane.linux.file-systems/81751>
>
> Probably like most kernel developers, I'm not convinced that the
> POSIX semantics are useful.
It is a critical security flaw to have multiple tasks
(threads/processes) running in the same virtual address space with
different privileges. I have described potential attacks for this
situation before; I could lookup the references if you care.
BTW, normally it's a bad idea to call setuid in a multithreaded
program anyway. However the situation does arise quite often,
especially in programs written in higher-level languages like Java
where the language runtime may have started threads for its own
internal purposes before the program drops privileges via a FFI-like
interface to the low-level libc functions like setuid. I've seen this
issue before in software I've audited. So glibc needs to handle it
correctly, Linux should too at the kernel level, and the
POSIX-specified behavior is "right" from a security standpoint.
Rich
On Mon 24 Mar 2014 12:26:32 Rich Felker wrote:
> On Mon, Mar 24, 2014 at 04:57:23PM +0100, Florian Weimer wrote:
> > >I was asking whether there might be a way to setup the
> > >conditions prior to making the setuid syscalls such that if the first
> > >one succeeds, the subsequent ones cannot fail.
> >
> > Not in general, no, because the kernel implementation calls into the
> > Linux Security Module framework, whose modules typically implement
> > additional preconditions we cannot check in glibc due to
> > insufficient information.
>
> Yes, I'm well aware of the Linux Insecurity Modules framework. Any
> framework that can make standard functions with documented interface
> contracts violate their own interface contracts subtracts from the
> security of a system rather than adding to it, and I really have no
> problem with telling users this if they're running broken Insecurity
> Modules.
>
> But back to the topic, I was assuming correct behavior from the
> kernel. If the kernel misbehaves, aborting is a perfectly reasonable
> response (but if LSM's make the kernel lie, can you even tell if it
> misbehaved?).
trying to stack the deck against failure is a good idea, but that is
orthogonal to checking the return value. there's no good reason at all to not
check & abort when the call fails.
-mike
On Mon, Mar 24, 2014 at 03:16:44PM -0400, Mike Frysinger wrote:
> On Mon 24 Mar 2014 12:26:32 Rich Felker wrote:
> > On Mon, Mar 24, 2014 at 04:57:23PM +0100, Florian Weimer wrote:
> > > >I was asking whether there might be a way to setup the
> > > >conditions prior to making the setuid syscalls such that if the first
> > > >one succeeds, the subsequent ones cannot fail.
> > >
> > > Not in general, no, because the kernel implementation calls into the
> > > Linux Security Module framework, whose modules typically implement
> > > additional preconditions we cannot check in glibc due to
> > > insufficient information.
> >
> > Yes, I'm well aware of the Linux Insecurity Modules framework. Any
> > framework that can make standard functions with documented interface
> > contracts violate their own interface contracts subtracts from the
> > security of a system rather than adding to it, and I really have no
> > problem with telling users this if they're running broken Insecurity
> > Modules.
> >
> > But back to the topic, I was assuming correct behavior from the
> > kernel. If the kernel misbehaves, aborting is a perfectly reasonable
> > response (but if LSM's make the kernel lie, can you even tell if it
> > misbehaved?).
>
> trying to stack the deck against failure is a good idea, but that is
> orthogonal to checking the return value. there's no good reason at all to not
> check & abort when the call fails.
> -mike
Agreed.
Rich
On 03/25/2014 04:07 AM, Rich Felker wrote:
> On Mon, Mar 24, 2014 at 03:16:44PM -0400, Mike Frysinger wrote:
>>> But back to the topic, I was assuming correct behavior from the
>>> kernel. If the kernel misbehaves, aborting is a perfectly reasonable
>>> response (but if LSM's make the kernel lie, can you even tell if it
>>> misbehaved?).
>>
>> trying to stack the deck against failure is a good idea, but that is
>> orthogonal to checking the return value. there's no good reason at all to not
>> check & abort when the call fails.
>> -mike
>
> Agreed.
So what about the patch? I have put kernel support on my to-do list,
but I have other kernel items that I want to deal with first.
On 03/24/2014 07:22 PM, Rich Felker wrote:
> It is a critical security flaw to have multiple tasks
> (threads/processes) running in the same virtual address space with
> different privileges. I have described potential attacks for this
> situation before; I could lookup the references if you care.
I think you are wrong—the kernel does it all the time. It is okay as
long as you can control what code you run.
We already support it through setfsuid/setfsgid, which is per-thread,
not per-process.
On 03/24/2014 04:41 PM, Florian Weimer wrote:
> As far as I know, the credentials switching work goes in the other
> direction, providing additional per-thread credentials to glibc-based
> userspace:
>
> <http://thread.gmane.org/gmane.linux.file-systems/81751>
>
> Probably like most kernel developers, I'm not convinced that the POSIX
> semantics are useful.
The credentials switching system call discussion just recommenced:
<http://marc.info/?l=linux-fsdevel&m=139587987829677>
I'll take a breath by sending more mail to that thread. :-/
On Thu, Mar 27, 2014 at 02:38:54PM +0100, Florian Weimer wrote:
> On 03/24/2014 07:22 PM, Rich Felker wrote:
>
> >It is a critical security flaw to have multiple tasks
> >(threads/processes) running in the same virtual address space with
> >different privileges. I have described potential attacks for this
> >situation before; I could lookup the references if you care.
>
> I think you are wrong—the kernel does it all the time. It is okay
The kernel does a lot of things that are not good security practices.
Appeal to authority is not a valid argument.
> as long as you can control what code you run.
>
> We already support it through setfsuid/setfsgid, which is
> per-thread, not per-process.
The whole reason you're changing uids is because you can't be sure
about what code you run; your code could (and probably does) have
arbitrary code execution vulnerabilities you're not aware of. This is
why you use privilege separation enforced by the kernel/mmu/etc.
rather than just assuming your code is all safe and correct. If, after
calling setuid with the intent of fully dropping root privileges, the
code that subsequently runs has write access to memory used by a code
that continues to run with root privileges, it's trivial for an
exploit in the "unprivileged" code to elevate itself to compromise of
the code that's still running with privileges.
Rich
On 03/27/2014 04:21 PM, Rich Felker wrote:
>> We already support it through setfsuid/setfsgid, which is
>> per-thread, not per-process.
>
> The whole reason you're changing uids is because you can't be sure
> about what code you run;
Ah, no, you can also change credentials to impersonate a user and access
resources with the privileges of that user. A file server does this,
for example.
On Thu, Mar 27, 2014 at 04:27:46PM +0100, Florian Weimer wrote:
> On 03/27/2014 04:21 PM, Rich Felker wrote:
>
> >>We already support it through setfsuid/setfsgid, which is
> >>per-thread, not per-process.
> >
> >The whole reason you're changing uids is because you can't be sure
> >about what code you run;
>
> Ah, no, you can also change credentials to impersonate a user and
> access resources with the privileges of that user. A file server
> does this, for example.
That's what setfsuid is for. setuid is pretty much exclusively for
dropping privileges.
Rich
On Mon 24 Mar 2014 15:30:30 Florian Weimer wrote:
> Check for syscall error in the SETXID implementation in NPTL (bug 13347).
>
> At this point, we can only abort the process because we have already
> switched credentials on other threads. Returning an error would still
> leave the process in an inconsistent state.
>
> The new xtest needs root privileges to run.
patch looks OK to me. improving the kernel layer is an independent issue.
-mike
On 03/24/2014 03:30 PM, Florian Weimer wrote:
> Check for syscall error in the SETXID implementation in NPTL (bug 13347).
>
> At this point, we can only abort the process because we have already
> switched credentials on other threads. Returning an error would still
> leave the process in an inconsistent state.
>
> The new xtest needs root privileges to run.
It turns out that the patch is wrong/incomplete. The abort needs to be
restricted to cases where we actually see inconsistent failure/success
behavior. I will work on a fix.
commit 06f66b8da0d3d2914f1d6f66ebf422007c5b00b7
Author: Florian Weimer <fweimer@redhat.com>
Date: Mon Mar 24 15:24:02 2014 +0100
Check for syscall error in the SETXID implementation in NPTL (bug 13347).
At this point, we can only abort the process because we have already
switched credentials on other threads. Returning an error would still
leave the process in an inconsistent state.
The new xtest needs root privileges to run.
@@ -1,3 +1,10 @@
+2014-03-24 Florian Weimer <fweimer@redhat.com>
+
+ [BZ #13347]
+ * nptl/nptl-init.c (sighandler_setxid): Check system call result.
+ * nptl/tst-setuid2.c: New file.
+ * nptl/Makefile (xtests): Add tst-setuid2.
+
2014-03-24 Joseph Myers <joseph@codesourcery.com>
[BZ #16284]
@@ -270,7 +270,8 @@ tests = tst-typesizes \
tst-vfork1 tst-vfork2 tst-vfork1x tst-vfork2x \
tst-getpid1 tst-getpid2 tst-getpid3 \
tst-initializers1 $(patsubst %,tst-initializers1-%,c89 gnu89 c99 gnu99)
-xtests = tst-setuid1 tst-setuid1-static tst-mutexpp1 tst-mutexpp6 tst-mutexpp10
+xtests = tst-setuid1 tst-setuid1-static tst-setuid2 \
+ tst-mutexpp1 tst-mutexpp6 tst-mutexpp10
test-srcs = tst-oddstacklimit
# Files which must not be linked with libpthread.
@@ -232,6 +232,7 @@ sighandler_setxid (int sig, siginfo_t *si, void *ctx)
/* Determine the process ID. It might be negative if the thread is
in the middle of a fork() call. */
pid_t pid = THREAD_GETMEM (THREAD_SELF, pid);
+ int result;
if (__glibc_unlikely (pid < 0))
pid = -pid;
@@ -245,8 +246,12 @@ sighandler_setxid (int sig, siginfo_t *si, void *ctx)
return;
INTERNAL_SYSCALL_DECL (err);
- INTERNAL_SYSCALL_NCS (__xidcmd->syscall_no, err, 3, __xidcmd->id[0],
- __xidcmd->id[1], __xidcmd->id[2]);
+ result = INTERNAL_SYSCALL_NCS (__xidcmd->syscall_no, err, 3, __xidcmd->id[0],
+ __xidcmd->id[1], __xidcmd->id[2]);
+ if (__glibc_unlikely (INTERNAL_SYSCALL_ERROR_P (result, err)))
+ /* Safety check. This should never happen if the setxid system
+ calls are only ever called through their glibc wrappers. */
+ abort ();
/* Reset the SETXID flag. */
struct pthread *self = THREAD_SELF;
new file mode 100644
@@ -0,0 +1,145 @@
+/* Copyright (C) 2014 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <http://www.gnu.org/licenses/>. */
+
+#include <errno.h>
+#include <pthread.h>
+#include <signal.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <sys/syscall.h>
+#include <unistd.h>
+
+/* Check that a partial setuid failure aborts the process. */
+
+static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
+static pthread_cond_t cond_send;
+static void (*func_sent) (void);
+static pthread_cond_t cond_recv;
+
+#define FAIL(fmt, ...) \
+ do { printf ("FAIL: " fmt "\n", __VA_ARGS__); _exit (1); } while (0)
+
+static void *
+thread_func (void *ctx __attribute__ ((unused)))
+{
+ int ret = pthread_mutex_lock (&mutex);
+ if (ret != 0)
+ FAIL ("pthread_mutex_lock (thread): %d", ret);
+
+ while (true)
+ {
+ if (func_sent != NULL)
+ {
+ void (*func) (void) = func_sent;
+ ret = pthread_mutex_unlock (&mutex);
+ if (ret != 0)
+ FAIL ("pthread_mutex_unlock (thread): %d", ret);
+ func ();
+ ret = pthread_mutex_lock (&mutex);
+ if (ret != 0)
+ FAIL ("pthread_mutex_lock (thread): %d", ret);
+ func_sent = NULL;
+ ret = pthread_cond_signal (&cond_recv);
+ if (ret != 0)
+ FAIL ("pthread_cond_signal (recv): %d", ret);
+ }
+ ret = pthread_cond_wait (&cond_send, &mutex);
+ if (ret != 0)
+ FAIL ("pthread_cond_wait (send): %d", ret);
+ }
+ return NULL;
+}
+
+static void
+run_on_thread (void (*func) (void))
+{
+ int ret = pthread_mutex_lock (&mutex);
+ if (ret != 0)
+ FAIL ("pthread_mutex_lock (%s): %d", __func__, ret);
+ func_sent = func;
+ ret = pthread_mutex_unlock (&mutex);
+ if (ret != 0)
+ FAIL ("pthread_mutex_unlock (%s): %d", __func__, ret);
+
+ ret = pthread_cond_signal (&cond_send);
+ if (ret != 0)
+ FAIL ("pthread_mutex_lock (%s): %d", __func__, ret);
+
+ ret = pthread_mutex_lock (&mutex);
+ if (ret != 0)
+ FAIL ("pthread_mutex_lock (%s): %d", __func__, ret);
+
+ while (func_sent != NULL)
+ {
+ ret = pthread_cond_wait (&cond_recv, &mutex);
+ if (ret != 0)
+ FAIL ("pthread_mutex_wait (%s): %d", __func__, ret);
+ }
+ ret = pthread_mutex_unlock (&mutex);
+ if (ret != 0)
+ FAIL ("pthread_mutex_unlock (%s): %d", __func__, ret);
+}
+
+static void
+change_thread_ids (void)
+{
+ long ret = syscall (__NR_setresuid, 2001, 2002, 2003);
+ if (ret != 0)
+ FAIL ("setresuid (2001, 2002, 2003): %ld", ret);
+}
+
+static uid_t ruid, euid, suid;
+
+static void
+get_thread_ids (void)
+{
+ if (getresuid (&ruid, &euid, &suid) < 0)
+ FAIL ("getresuid: %m (%d)", errno);
+}
+
+static void
+abort_expected (int signal __attribute__ ((unused)))
+{
+ _exit (0);
+}
+
+static int
+do_test (void)
+{
+ pthread_t thread;
+ int ret = pthread_create (&thread, NULL, thread_func, NULL);
+ if (ret != 0)
+ FAIL ("pthread_create: %d", ret);
+
+ run_on_thread (change_thread_ids);
+
+ signal (SIGABRT, &abort_expected);
+ /* This should abort the process. */
+ if (setresuid (1001, 1002, 1003) < 0)
+ FAIL ("setresuid: %m (%d)", errno);
+ signal (SIGABRT, SIG_DFL);
+
+ /* If we get here, check that the kernel did the right thing. */
+ run_on_thread (get_thread_ids);
+ if (ruid != 1001 || euid != 1002 || euid != 1003)
+ FAIL ("unexpected UIDs after setuid: %ld, %ld, %ld",
+ (long) ruid, (long) euid, (long) suid);
+ return 0;
+}
+
+#define TEST_FUNCTION do_test ()
+#include "../test-skeleton.c"