signals: Bug or manpage inconsistency?

Message ID alpine.DEB.2.20.1705301750390.1950@nanos
State New, archived

Commit Message

Thomas Gleixner May 30, 2017, 4:14 p.m. UTC
  On Tue, 30 May 2017, Thomas Gleixner wrote:
> The commit which added the queuing of blocked and ignored signals is in the
> history tree with a pretty useless changelog.
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
> 
>  commit 98fc8ab9e74389e0c7001052597f61336dc62833
>  Author: Linus Torvalds <torvalds@penguin.transmeta.com>
>  Date:   Tue Feb 11 20:49:03 2003 -0800
> 
>      Don't wake up processes unnecessarily for ignored signals
> 
> It rewrites sig_ignored() and adds the following to it:
> 
> +       /*
> +        * Blocked signals are never ignored, since the
> +        * signal handler may change by the time it is
> +        * unblocked.
> +        */
> +       if (sigismember(&t->blocked, sig))
> +               return 0;
> 
> I have no idea how that is related to $subject of the commit and why this
> decision was made.
> 
> Linus, any recollection?

So I found at least some explanation by studying the spec some more.

There are two variants of ignored signals:

  1) handler is SIG_IGN

  2) handler is SIG_DFL and default action is 'ignore'

     These are the signals in SIG_KERNEL_IGNORE_MASK:

     #define SIG_KERNEL_IGNORE_MASK (\
             rt_sigmask(SIGCONT)   |  rt_sigmask(SIGCHLD)   | \
             rt_sigmask(SIGWINCH)  |  rt_sigmask(SIGURG)    )

     These signals must not be discarded while they are blocked.

So my understanding of the spec is:

  #1 Can discard the signals as long as SIG_IGN is set whether the signal
     is blocked or not

  #2 Must queue them if the signal is blocked, otherwise discard
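
A minimal sketch of the two rules above as a pure decision function may help when comparing them with the patch. It is a userspace model only: the enum and the function name are hypothetical and do not exist in kernel/signal.c.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; none of this is kernel code. */
enum model_handler { MODEL_SIG_DFL, MODEL_SIG_IGN, MODEL_SIG_HANDLER };

/*
 * Returns true if, under the reading of the spec given above, a newly
 * generated signal may be discarded at send time.
 */
static bool model_may_discard(enum model_handler h, bool default_is_ignore,
			      bool blocked)
{
	switch (h) {
	case MODEL_SIG_IGN:
		/* Variant #1: discard whether the signal is blocked or not. */
		return true;
	case MODEL_SIG_DFL:
		/* Variant #2: discard only if default-ignored and unblocked. */
		return default_is_ignore && !blocked;
	default:
		/* A real handler is installed: never discard. */
		return false;
	}
}
```

For example, model_may_discard(MODEL_SIG_DFL, true, true) is false: a blocked SIGURG with the default disposition has to be queued.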

I changed the logic accordingly with the patch below, and a quick test
run of LTP and glibc test cases produces no failures.

Thoughts?

Thanks,

	tglx

8<--------------------
Subject: signals: Reduce scope of blocked signals in sig_handler_ignored()
From: Thomas Gleixner <tglx@linutronix.de>
Date: Tue, 30 May 2017 18:01:33 +0200

Add proper changelog and a big fat comment in the code.

Not-yet-signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/signal.c |    9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)
  

Comments

Oleg Nesterov May 30, 2017, 5:04 p.m. UTC | #1
On 05/30, Thomas Gleixner wrote:
>
> So I found at least some explanation by studying the spec some more.
>
> There are two variants of ignored signals:
>
>   1) handler is SIG_IGN
>
>   2) handler is SIG_DFL and default action is 'ignore'

Yes, and note that sys_rt_sigaction() discards the pending signal in both cases.
So even with this change the logic won't look 100% consistent.

I can't comment, I never tried to understand the rationale behind the current
behaviour. But at least the sending path should never drop a blocked SIG_DFL
signal, there is no other way to ensure you won't miss a signal during exec.
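
The blocked SIG_DFL case can be probed from userspace. The following is a rough sketch, assuming a single-threaded caller and a signal whose default action is "ignore" (SIGURG or SIGWINCH, say); the helper name is made up for illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper, not from the kernel tree.
 * Returns 1 if sig is still pending after being raised while blocked with
 * SIG_DFL installed, 0 if it was dropped, -1 on error.
 */
static int blocked_sig_dfl_stays_pending(int sig)
{
	const struct timespec zero = { 0, 0 };
	struct sigaction dfl, old_act;
	sigset_t set, old_mask, pending;
	int result = -1;

	sigemptyset(&set);
	if (sigaddset(&set, sig) != 0)
		return -1;

	memset(&dfl, 0, sizeof(dfl));
	dfl.sa_handler = SIG_DFL;
	sigemptyset(&dfl.sa_mask);

	if (sigprocmask(SIG_BLOCK, &set, &old_mask) != 0)
		return -1;
	if (sigaction(sig, &dfl, &old_act) != 0) {
		sigprocmask(SIG_SETMASK, &old_mask, NULL);
		return -1;
	}

	if (raise(sig) == 0 && sigpending(&pending) == 0)
		result = sigismember(&pending, sig);

	/* Dequeue whatever is pending so nothing fires once the mask is restored. */
	sigtimedwait(&set, NULL, &zero);

	sigaction(sig, &old_act, NULL);
	sigprocmask(SIG_SETMASK, &old_mask, NULL);
	return result;
}
```

Calling blocked_sig_dfl_stays_pending(SIGURG) reports whether the sending path kept the signal queued.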

> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -70,6 +70,13 @@ static int sig_handler_ignored(void __us
>  		(handler == SIG_DFL && sig_kernel_ignore(sig));
>  }
>
> +static int sig_handler_is_sigign(struct task_struct *t, int sig)
> +{
> +	void __user *handler = sig_handler(t, sig);
> +
> +	return handler == SIG_IGN;
> +}
> +
>  static int sig_task_ignored(struct task_struct *t, int sig, bool force)
>  {
>  	void __user *handler;
> @@ -91,7 +98,7 @@ static int sig_ignored(struct task_struc
>  	 * unblocked.
>  	 */
>  	if (sigismember(&t->blocked, sig) || sigismember(&t->real_blocked, sig))
> -		return 0;
> +		return sig_handler_is_sigign(t, sig);

We can probably make a simpler change, but this doesn't matter.

Obviously this is a user-visible change and it can break something. Say, an
application does sigwaitinfo(SIGCHLD) and SIGCHLD is ignored (SIG_IGN), this
will no longer work.

I won't argue, but perhaps it is too late to change this historical behaviour.

Although perhaps we can clean up do_sigtimedwait() for a start. ->real_blocked
doesn't look nice. I think we can replace it with task->sigwait_mask and then
change sig_handler() to do

	sigismember(sigwait_mask, sig) ? SIG_ERR :
		t->sighand->action[sig - 1].sa.sa_handler;

This needs other changes; say, sig_fatal() will need to use sig_handler() too.
Then it would be safer to drop the SIG_IGN signals unconditionally.
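
To make the idea concrete, here is a small userspace model of the proposed sig_handler(). It is only a sketch: struct model_task, MODEL_NSIG and model_sig_handler() are invented for illustration, and the real task_struct looks nothing like this.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>

#define MODEL_NSIG 64	/* assumed table size for the model */

typedef void (*model_handler_t)(int);

/* Hypothetical stand-in for the relevant bits of a task. */
struct model_task {
	sigset_t sigwait_mask;			/* signals a sigwait-style call waits for */
	model_handler_t action[MODEL_NSIG];	/* indexed by sig - 1 */
};

/*
 * A signal somebody is waiting for reports SIG_ERR, which is neither
 * SIG_IGN nor SIG_DFL, so an "is it ignored?" check can never match it.
 */
static model_handler_t model_sig_handler(const struct model_task *t, int sig)
{
	if (sig < 1 || sig > MODEL_NSIG)
		return SIG_ERR;	/* out of range: no usable handler in the model */

	return sigismember(&t->sigwait_mask, sig) == 1 ?
		SIG_ERR : t->action[sig - 1];
}
```

With this shape, a SIG_IGN entry stops looking ignored as soon as the signal is added to sigwait_mask, which is the property the cleanup relies on.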

Oleg.
  
Linus Torvalds May 30, 2017, 5:19 p.m. UTC | #2
On Tue, May 30, 2017 at 10:04 AM, Oleg Nesterov <oleg@redhat.com> wrote:
>
> I can't comment, I never tried to understand the rationale behind the current
> behaviour. But at least the sending path should never drop a blocked SIG_DFL
> signal, there is no other way to ensure you won't miss a signal during exec.

Note that both SIG_DFL _and_ SIG_IGN are possible after exec, so if
you don't want to drop particular signals to the new process (which
may then add its own handler and want them), using the signal blocked
mask is the right thing to do for both of them.

SIG_IGN doesn't mean "ignore signal forever". It means "ignore signals
right now", and I think that our current signal blocking semantics are
likely the correct ones, exactly because it means "when you start
blocking signals, the kernel will not drop them".

There is no difference wrt SIG_DFL and SIG_IGN in this sense.
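
The "ignore signals right now" reading can be checked with a short userspace probe. This is a sketch under the assumption of a single-threaded caller; the helper and handler names are hypothetical. It raises a signal while it is blocked and set to SIG_IGN, then installs a handler and unblocks, and reports whether the handler ran.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t probe_hits;

static void probe_handler(int sig)
{
	(void)sig;
	probe_hits++;
}

/*
 * Hypothetical helper. Returns 1 if a signal raised while blocked and
 * ignored is delivered after a handler is installed and the signal is
 * unblocked, 0 if it was dropped, -1 on error.
 */
static int blocked_sig_ign_survives_handler_change(int sig)
{
	struct sigaction ign, handler, old_act;
	sigset_t set, old_mask;
	int result = -1;

	sigemptyset(&set);
	if (sigaddset(&set, sig) != 0)
		return -1;

	memset(&ign, 0, sizeof(ign));
	ign.sa_handler = SIG_IGN;
	sigemptyset(&ign.sa_mask);

	memset(&handler, 0, sizeof(handler));
	handler.sa_handler = probe_handler;
	sigemptyset(&handler.sa_mask);

	if (sigaction(sig, &ign, &old_act) != 0)
		return -1;
	if (sigprocmask(SIG_BLOCK, &set, &old_mask) != 0) {
		sigaction(sig, &old_act, NULL);
		return -1;
	}

	probe_hits = 0;
	if (raise(sig) == 0 && sigaction(sig, &handler, NULL) == 0) {
		/* A still-pending signal is delivered before this call returns. */
		sigprocmask(SIG_UNBLOCK, &set, NULL);
		result = probe_hits > 0;
	}

	sigaction(sig, &old_act, NULL);
	sigprocmask(SIG_SETMASK, &old_mask, NULL);
	return result;
}
```

Running it with SIGUSR1 shows which semantics the running kernel implements; the patch under discussion would change the answer.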

> Obviously this is a user-visible change and it can break something. Say, an
> application does sigwaitinfo(SIGCHLD) and SIGCHLD is ignored (SIG_IGN), this
> will no longer work.

That's an interesting special case. Yes, SIG_IGN actually has magical
properties wrt SIGCHLD. It basically means the opposite of ignoring
it, it's an "implicit signal handler".  So I could imagine people
using SIG_IGN to avoid the signal handler, but then block SIGCHLD and
using sigwait() for it.

That sounds nonportable as hell, but I could imagine people doing it
because it happens to work.

So again, I really wouldn't want to change existing semantics unless
there is a big real reason for it. Our current semantics are not
wrong.

                Linus
  
Oleg Nesterov May 30, 2017, 7:18 p.m. UTC | #3
On 05/30, Linus Torvalds wrote:
>
> On Tue, May 30, 2017 at 10:04 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > I can't comment, I never tried to understand the rationale behind the current
> > behaviour. But at least the sending path should never drop a blocked SIG_DFL
> > signal, there is no other way to ensure you won't miss a signal during exec.
>
> Note that both SIG_DFL _and_ SIG_IGN are possible after exec,

Yes, if it was already ignored before exec. But compatibility aside, the
only important case is when it is SIG_DFL because of flush_signal_handlers().

> SIG_IGN doesn't mean "ignore signal forever". It means "ignore signals
> right now", and I think that our current signal blocking semantics are
> likely the correct ones,

I am not saying it is incorrect, but I agree with Thomas in that this
sigismember(t->blocked) in sig_ignored() doesn't look really nice.

> exactly because it means "when you start
> blocking signals, the kernel will not drop them".

Only if the process is single-threaded, or the signal is private, or it is
blocked by all threads. Otherwise it will wake up another thread for no reason,
and the signal will be dropped in get_signal().

And again, this doesn't look consistent with do_sigaction(). It even has a
comment which explains that we want to flush the ignored signals, blocked
or not.

Never mind, I am not trying to argue, and

> So again, I really wouldn't want to change existing semantics unless
> there is a big real reason for it. Our current semantics are not
> wrong.

I certainly agree.

Oleg.
  
Thomas Gleixner May 30, 2017, 8:54 p.m. UTC | #4
On Tue, 30 May 2017, Linus Torvalds wrote:
> On Tue, May 30, 2017 at 10:04 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> > Obviously this is a user-visible change and it can break something. Say, an
> > application does sigwaitinfo(SIGCHLD) and SIGCHLD is ignored (SIG_IGN), this
> > will no longer work.
> 
> That's an interesting special case. Yes, SIG_IGN actually has magical
> properties wrt SIGCHLD. It basically means the opposite of ignoring
> it, it's an "implicit signal handler".  So I could imagine people
> using SIG_IGN to avoid the signal handler, but then block SIGCHLD and
> using sigwait() for it.
> 
> That sounds nonportable as hell, but I could imagine people doing it
> because it happens to work.

Just that it does not work. See do_notify_parent()

	if (!tsk->ptrace && sig == SIGCHLD &&
	    (psig->action[SIGCHLD-1].sa.sa_handler == SIG_IGN ||
	     (psig->action[SIGCHLD-1].sa.sa_flags & SA_NOCLDWAIT))) {
		/*
		 * We are exiting and our parent doesn't care.  POSIX.1
		 * defines special semantics for setting SIGCHLD to SIG_IGN
		 * or setting the SA_NOCLDWAIT flag: we should be reaped
		 * automatically and not left for our parent's wait4 call.
		 * Rather than having the parent do it as a magic kind of
		 * signal handler, we just set this to tell do_exit that we
		 * can be cleaned up without becoming a zombie.  Note that
		 * we still call __wake_up_parent in this case, because a
		 * blocked sys_wait4 might now return -ECHILD.
		 *
		 * Whether we send SIGCHLD or not for SA_NOCLDWAIT
		 * is implementation-defined: we do (if you don't want
		 * it, just use SIG_IGN instead).
		 */
		autoreap = true;
		if (psig->action[SIGCHLD-1].sa.sa_handler == SIG_IGN)
			sig = 0;
	}
        if (valid_signal(sig) && sig)
                __group_send_sig_info(sig, &info, tsk->parent);

So if the parent has SIG_IGN we do not send a signal at all. So it's not a
really interesting special case and the magic properties are not that magic
either. Test case below. The parent waits forever.

Thanks,

	tglx
---

#include <unistd.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>

int main(void)
{
	struct sigaction action;
	sigset_t set;
	int signum;

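	sigemptyset(&set);
	sigaddset(&set, SIGCHLD);

	/* Ignore SIGCHLD explicitly ... */
	memset(&action, 0, sizeof(action));
	action.sa_handler = SIG_IGN;
	sigaction(SIGCHLD, &action, NULL);

	/* ... and block it, so that a sent signal would stay pending. */
	sigprocmask(SIG_BLOCK, &set, NULL);

	if (fork() == 0) {
		sleep(1);
		printf("Child exiting\n");
		exit(0);
	}

	/* Never returns: do_notify_parent() sends no SIGCHLD for SIG_IGN. */
	sigwait(&set, &signum);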
	printf("Parent exiting\n");
	return 0;
}
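
A variant of the test case that does not hang may be easier to run in a test suite. This sketch swaps sigwait() for sigtimedwait() with a timeout; the function name and the timeout parameter are assumptions made for illustration, and a single-threaded caller is assumed.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <time.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Hypothetical helper. Returns 1 if SIGCHLD arrived for an exiting child
 * while SIGCHLD was blocked and set to SIG_IGN, 0 if the wait timed out,
 * -1 on error.
 */
static int sigchld_seen_with_sig_ign(time_t timeout_sec)
{
	struct timespec timeout = { timeout_sec, 0 };
	struct sigaction ign, old_act;
	sigset_t set, old_mask;
	int result = -1;
	pid_t pid;

	sigemptyset(&set);
	sigaddset(&set, SIGCHLD);

	memset(&ign, 0, sizeof(ign));
	ign.sa_handler = SIG_IGN;
	sigemptyset(&ign.sa_mask);

	if (sigaction(SIGCHLD, &ign, &old_act) != 0)
		return -1;
	if (sigprocmask(SIG_BLOCK, &set, &old_mask) != 0) {
		sigaction(SIGCHLD, &old_act, NULL);
		return -1;
	}

	pid = fork();
	if (pid == 0)
		_exit(0);	/* child: exit right away */

	if (pid > 0) {
		int sig;

		do {
			sig = sigtimedwait(&set, NULL, &timeout);
		} while (sig < 0 && errno == EINTR);

		if (sig == SIGCHLD)
			result = 1;
		else if (sig < 0 && errno == EAGAIN)
			result = 0;

		/* Make sure the child is gone before the disposition changes. */
		waitpid(pid, NULL, 0);
	}

	sigaction(SIGCHLD, &old_act, NULL);
	sigprocmask(SIG_SETMASK, &old_mask, NULL);
	return result;
}
```

sigchld_seen_with_sig_ign(1) waits at most one second instead of forever.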
  
Eric W. Biederman May 31, 2017, 12:48 a.m. UTC | #5
Thomas Gleixner <tglx@linutronix.de> writes:

> On Tue, 30 May 2017, Linus Torvalds wrote:
>> On Tue, May 30, 2017 at 10:04 AM, Oleg Nesterov <oleg@redhat.com> wrote:
>> > Obviously this is a user-visible change and it can break something. Say, an
>> > application does sigwaitinfo(SIGCHLD) and SIGCHLD is ignored (SIG_IGN), this
>> > will no longer work.
>> 
>> That's an interesting special case. Yes, SIG_IGN actually has magical
>> properties wrt SIGCHLD. It basically means the opposite of ignoring
>> it, it's an "implicit signal handler".  So I could imagine people
>> using SIG_IGN to avoid the signal handler, but then block SIGCHLD and
>> using sigwait() for it.
>> 
>> That sounds nonportable as hell, but I could imagine people doing it
>> because it happens to work.
>
> Just that it does not work. See do_notify_parent()
>
> 	if (!tsk->ptrace && sig == SIGCHLD &&
> 	    (psig->action[SIGCHLD-1].sa.sa_handler == SIG_IGN ||
> 	     (psig->action[SIGCHLD-1].sa.sa_flags & SA_NOCLDWAIT))) {
> 		/*
> 		 * We are exiting and our parent doesn't care.  POSIX.1
> 		 * defines special semantics for setting SIGCHLD to SIG_IGN
> 		 * or setting the SA_NOCLDWAIT flag: we should be reaped
> 		 * automatically and not left for our parent's wait4 call.
> 		 * Rather than having the parent do it as a magic kind of
> 		 * signal handler, we just set this to tell do_exit that we
> 		 * can be cleaned up without becoming a zombie.  Note that
> 		 * we still call __wake_up_parent in this case, because a
> 		 * blocked sys_wait4 might now return -ECHILD.
> 		 *
> 		 * Whether we send SIGCHLD or not for SA_NOCLDWAIT
> 		 * is implementation-defined: we do (if you don't want
> 		 * it, just use SIG_IGN instead).
> 		 */
> 		autoreap = true;
> 		if (psig->action[SIGCHLD-1].sa.sa_handler == SIG_IGN)
> 			sig = 0;
> 	}
>         if (valid_signal(sig) && sig)
>                 __group_send_sig_info(sig, &info, tsk->parent);
>
> So if the parent has SIG_IGN we do not send a signal at all. So it's not a
> really interesting special case and the magic properties are not that magic
> either. Test case below. The parent waits forever.

Which would suggest that, to be consistent, we should ignore
blocks for other signals on send when the signal handler is SIG_IGN.

Hmm.

For blocked signals, because there is only one siginfo ever allocated,
as I read it the code naturally blocks the signal until it is
dequeued and rearmed.

I suspect what you want to do is a little more in the magic
dequeue_signal for timers and look if the signal handler
is SIG_IGN.  I think the clean solution would be to
treat timers whose signal handler is SIG_IGN as blocked
signals and simply not dequeue them.

If they are not dequeued they won't reschedule and won't restart.
Then when the signal handler finally changes you immediately get
one pending signal and then the timers fire normally.

That gets tricky though because the signal numbers are not dedicated
to posix timers.

It might instead require noting that the handler is SIG_IGN when
dequeued and simply disabling the timer, with an enable that kicks
in when someone calls sigaction and changes the handler.

Eric
  
Eric W. Biederman May 31, 2017, 1:10 a.m. UTC | #6
ebiederm@xmission.com (Eric W. Biederman) writes:

> Thomas Gleixner <tglx@linutronix.de> writes:
>
>> On Tue, 30 May 2017, Linus Torvalds wrote:
>>> On Tue, May 30, 2017 at 10:04 AM, Oleg Nesterov <oleg@redhat.com> wrote:
>>> > Obviously this is a user-visible change and it can break something. Say, an
>>> > application does sigwaitinfo(SIGCHLD) and SIGCHLD is ignored (SIG_IGN), this
>>> > will no longer work.
>>> 
>>> That's an interesting special case. Yes, SIG_IGN actually has magical
>>> properties wrt SIGCHLD. It basically means the opposite of ignoring
>>> it, it's an "implicit signal handler".  So I could imagine people
>>> using SIG_IGN to avoid the signal handler, but then block SIGCHLD and
>>> using sigwait() for it.
>>> 
>>> That sounds nonportable as hell, but I could imagine people doing it
>>> because it happens to work.
>>
>> Just that it does not work. See do_notify_parent()
>>
>> 	if (!tsk->ptrace && sig == SIGCHLD &&
>> 	    (psig->action[SIGCHLD-1].sa.sa_handler == SIG_IGN ||
>> 	     (psig->action[SIGCHLD-1].sa.sa_flags & SA_NOCLDWAIT))) {
>> 		/*
>> 		 * We are exiting and our parent doesn't care.  POSIX.1
>> 		 * defines special semantics for setting SIGCHLD to SIG_IGN
>> 		 * or setting the SA_NOCLDWAIT flag: we should be reaped
>> 		 * automatically and not left for our parent's wait4 call.
>> 		 * Rather than having the parent do it as a magic kind of
>> 		 * signal handler, we just set this to tell do_exit that we
>> 		 * can be cleaned up without becoming a zombie.  Note that
>> 		 * we still call __wake_up_parent in this case, because a
>> 		 * blocked sys_wait4 might now return -ECHILD.
>> 		 *
>> 		 * Whether we send SIGCHLD or not for SA_NOCLDWAIT
>> 		 * is implementation-defined: we do (if you don't want
>> 		 * it, just use SIG_IGN instead).
>> 		 */
>> 		autoreap = true;
>> 		if (psig->action[SIGCHLD-1].sa.sa_handler == SIG_IGN)
>> 			sig = 0;
>> 	}
>>         if (valid_signal(sig) && sig)
>>                 __group_send_sig_info(sig, &info, tsk->parent);
>>
>> So if the parent has SIG_IGN we do not send a signal at all. So it's not a
>> really interesting special case and the magic properties are not that magic
>> either. Test case below. The parent waits forever.
>
> Which would suggest that, to be consistent, we should ignore
> blocks for other signals on send when the signal handler is SIG_IGN.
>
> Hmm.
>
> For blocked signals, because there is only one siginfo ever allocated,
> as I read it the code naturally blocks the signal until it is
> dequeued and rearmed.
>
> I suspect what you want to do is a little more in the magic
> dequeue_signal for timers and look if the signal handler
> is SIG_IGN.  I think the clean solution would be to
> treat timers whose signal handler is SIG_IGN as blocked
> signals and simply not dequeue them.
>
> If they are not dequeued they won't reschedule and won't restart.
> Then when the signal handler finally changes you immediately get
> one pending signal and then the timers fire normally.
>
> That gets tricky though because the signal numbers are not dedicated
> to posix timers.
>
> It might instead require noting that the handler is SIG_IGN when
> dequeued and simply disabling the timer, with an enable that kicks
> in when someone calls sigaction and changes the handler.

The point my tired brain is making is that I don't think you actually
care about SIG_IGN vs blocked signals.

Sigh.  But then again you have two places to worry about blocked
signals: send_sigqueue telling you the signals are ignored,
and signals being dequeued and ignored with dequeue_signal
in get_signal, do_sigtimedwait, and signalfd_dequeue.

Now I see why you are asking about semantics.  If send_sigqueue could
always look at SIG_IGN you would only have one spot to worry about.

However the big practical question is whether you can block those signals
and pick them up with sigtimedwait or with signalfd.  It looks like
you can today, as neither sigtimedwait nor signalfd cares whether the
signal handler is set to SIG_IGN.
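
The signalfd half of that question can be probed with a few lines of userspace code. This is a sketch, assuming Linux with signalfd(2) available and a single-threaded caller; the helper name is hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>
#include <sys/signalfd.h>
#include <sys/types.h>

/*
 * Hypothetical helper. Returns 1 if a signal raised while blocked and set
 * to SIG_IGN can be read back through a signalfd, 0 if nothing is
 * readable, -1 on error.
 */
static int blocked_sig_ign_readable_via_signalfd(int sig)
{
	struct signalfd_siginfo info;
	struct sigaction ign, old_act;
	sigset_t set, old_mask;
	int fd, result = -1;
	ssize_t n;

	sigemptyset(&set);
	if (sigaddset(&set, sig) != 0)
		return -1;

	memset(&ign, 0, sizeof(ign));
	ign.sa_handler = SIG_IGN;
	sigemptyset(&ign.sa_mask);

	if (sigaction(sig, &ign, &old_act) != 0)
		return -1;
	if (sigprocmask(SIG_BLOCK, &set, &old_mask) != 0) {
		sigaction(sig, &old_act, NULL);
		return -1;
	}

	/* Non-blocking, so an empty queue shows up as EAGAIN instead of a hang. */
	fd = signalfd(-1, &set, SFD_NONBLOCK | SFD_CLOEXEC);
	if (fd >= 0) {
		if (raise(sig) == 0) {
			n = read(fd, &info, sizeof(info));
			if (n == (ssize_t)sizeof(info) &&
			    info.ssi_signo == (unsigned int)sig)
				result = 1;
			else if (n < 0 && errno == EAGAIN)
				result = 0;
		}
		close(fd);
	}

	sigaction(sig, &old_act, NULL);
	sigprocmask(SIG_SETMASK, &old_mask, NULL);
	return result;
}
```

The same structure works for the sigtimedwait half by replacing the signalfd read with a zero-timeout sigtimedwait() call.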

Which means you wind up having 4 places you need to deal with ignored
signals: send_sigqueue, get_signal, do_sigtimedwait, and
signalfd_dequeue.  It feels like it would be nice to move the timer
requeue out of dequeue_signal and into its callers, with an appropriate
set of helpers.

Sigh.

I hope that helps a little.

Eric
  

Patch

Index: b/kernel/signal.c
===================================================================
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -70,6 +70,13 @@  static int sig_handler_ignored(void __us
 		(handler == SIG_DFL && sig_kernel_ignore(sig));
 }
 
+static int sig_handler_is_sigign(struct task_struct *t, int sig)
+{
+	void __user *handler = sig_handler(t, sig);
+
+	return handler == SIG_IGN;
+}
+
 static int sig_task_ignored(struct task_struct *t, int sig, bool force)
 {
 	void __user *handler;
@@ -91,7 +98,7 @@  static int sig_ignored(struct task_struc
 	 * unblocked.
 	 */
 	if (sigismember(&t->blocked, sig) || sigismember(&t->real_blocked, sig))
-		return 0;
+		return sig_handler_is_sigign(t, sig);
 
 	if (!sig_task_ignored(t, sig, force))
 		return 0;