* Re: [PATCH] signals: Generate warning when flush_signals() is called from non-kthread context
2015-05-01 19:38 ` [PATCH] signals: Generate warning when flush_signals() is called from non-kthread context Ingo Molnar
@ 2015-05-02 8:30 ` NeilBrown
2015-05-02 16:27 ` Linus Torvalds
2015-05-04 17:35 ` Oleg Nesterov
2015-05-02 11:56 ` Evgeniy Polyakov
` (2 subsequent siblings)
3 siblings, 2 replies; 21+ messages in thread
From: NeilBrown @ 2015-05-02 8:30 UTC (permalink / raw)
To: Ingo Molnar
Cc: Linus Torvalds, Evgeniy Polyakov, Stephen Smalley,
Alex Williamson, Oleg Nesterov, linux-kernel, kvm
[-- Attachment #1: Type: text/plain, Size: 2044 bytes --]
On Fri, 1 May 2015 21:38:13 +0200 Ingo Molnar <mingo@kernel.org> wrote:
> drivers/md/md.c
> drivers/md/raid1.c
> drivers/md/raid5.c
>
> Hm, so I'm not super sure about the flush_signals() in
> raid1.c:make_request() AFAICS we can do direct RAID1 writes in
> raid1_unplug(). That looks unsafe ... I've Cc:-ed Neil.
>
> raid5.c seems safe: raid5_unplug() doesn't create requests directly,
> leaves it all for the mddev kthread.
Both raid1.c and raid5.c call flush_signals() in the make_request function
(in unusual circumstances).
I wanted a uninterruptible wait which didn't add to load-average. That
approach works in kernel threads...
All the calls in md.c are in a kernel thread so safe, but I'd rather have an
explicit "uninterruptible, but no load-average" wait....
I should probably change the make_request code to queue the request
somewhere rather than wait for it to be serviceable.
I'll look into that...
> In any case, it seems to me that the patch below would be justified?
> Totally untested and so. __flush_signals() not affected.
Fine by me - does seem justified.
Thanks,
NeilBrown
>
> Thanks,
>
> Ingo
>
> ---
> kernel/signal.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/kernel/signal.c b/kernel/signal.c
> index d51c5ddd855c..100e30afe5d2 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -427,6 +427,10 @@ void flush_signals(struct task_struct *t)
> {
> unsigned long flags;
>
> + /* Only kthreads are allowed to destroy signals: */
> + if (WARN_ON_ONCE(!(current->flags & PF_KTHREAD)))
> + return;
> +
> spin_lock_irqsave(&t->sighand->siglock, flags);
> __flush_signals(t);
> spin_unlock_irqrestore(&t->sighand->siglock, flags);
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 811 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] signals: Generate warning when flush_signals() is called from non-kthread context
2015-05-02 8:30 ` NeilBrown
@ 2015-05-02 16:27 ` Linus Torvalds
2015-05-07 12:54 ` Peter Zijlstra
2015-05-04 17:35 ` Oleg Nesterov
1 sibling, 1 reply; 21+ messages in thread
From: Linus Torvalds @ 2015-05-02 16:27 UTC (permalink / raw)
To: NeilBrown
Cc: Ingo Molnar, Evgeniy Polyakov, Stephen Smalley, Alex Williamson,
Oleg Nesterov, linux-kernel, kvm
On Sat, May 2, 2015 at 1:30 AM, NeilBrown <neilb@suse.de> wrote:
>
> All the calls in md.c are in a kernel thread so safe, but I'd rather have an
> explicit "uninterruptible, but no load-average" wait....
Hmm. Our task state is a bitmask anyway, so we could probably just add a
#define __TASK_NOLOAD 16
(and move the EXIT_xyz defines *away* from the list that is actually
the task state), and teach our load average thing to not count those
kinds of waits. Then you could just use
TASK_UNINTERRUPTIBLE | __TASK_NOLOAD
to make processes not count towards the load.
Or - probably preferably - we could really clean things up, and make
things much more like the bitmask it *should* be, and have explicit
bits for
- SLEEPING/STOPPED/EXITING ("why not running?")
- LOADAVG (accounted towards load)
- WAKESIG (ie "interruptible")
- WAKEKILL (this we already have)
and just make the rule be that we use "__TASK_xyz" for the actual
individual bits, and "TASK_xyz" for the helper combinations. So then
we'd have
#define TASK_UNINTERRUPTIBLE (__TASK_SLEEPING | __TASK_LOADAVG)
#define TASK_INTERRUPTIBLE (__TASK_SLEEPING | __TASK_WAKESIG)
#define TASK_KILLABLE (__TASK_SLEEPING | __TASK_WAKEKILL)
#define TASK_NOLOADAVG (__TASK_SLEEPING)
which is almost certainly how this *should* have been done, but isn't,
because of historical use.
Cleaning up like that *should* be fairly simple, but I'd be a bit
nervous about getting all the state comparisons right (we have an
unholy mix of "check this bit" and "check this whole state", and we'd
need to make sure we get those cases all right).
Ingo, what do you think? This is mostly a scheduler interface issue..
Linus
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] signals: Generate warning when flush_signals() is called from non-kthread context
2015-05-02 16:27 ` Linus Torvalds
@ 2015-05-07 12:54 ` Peter Zijlstra
0 siblings, 0 replies; 21+ messages in thread
From: Peter Zijlstra @ 2015-05-07 12:54 UTC (permalink / raw)
To: Linus Torvalds
Cc: NeilBrown, Ingo Molnar, Evgeniy Polyakov, Stephen Smalley,
Alex Williamson, Oleg Nesterov, linux-kernel, kvm
On Sat, May 02, 2015 at 09:27:49AM -0700, Linus Torvalds wrote:
> On Sat, May 2, 2015 at 1:30 AM, NeilBrown <neilb@suse.de> wrote:
> >
> > All the calls in md.c are in a kernel thread so safe, but I'd rather have an
> > explicit "uninterruptible, but no load-average" wait....
>
> Hmm. Our task state is a bitmask anyway, so we could probably just add a
>
> #define __TASK_NOLOAD 16
>
> (and move the EXIT_xyz defines *away* from the list that is actually
> the task state), and teach our load average thing to not count those
> kinds of waits. Then you could just use
>
> TASK_UNINTERRUPTIBLE | __TASK_NOLOAD
>
> to make processes not count towards the load.
>
> Or - probably preferably - we could really clean things up, and make
> things much more like the bitmask it *should* be, and have explicit
> bits for
>
> - SLEEPING/STOPPED/EXITING ("why not running?")
> - LOADAVG (accounted towards load)
> - WAKESIG (ie "interruptible")
> - WAKEKILL (this we already have)
>
> and just make the rule be that we use "__TASK_xyz" for the actual
> individual bits, and "TASK_xyz" for the helper combinations. So then
> we'd have
>
> #define TASK_UNINTERRUPTIBLE (__TASK_SLEEPING | __TASK_LOADAVG)
> #define TASK_INTERRUPTIBLE (__TASK_SLEEPING | __TASK_WAKESIG)
> #define TASK_KILLABLE (__TASK_SLEEPING | __TASK_WAKEKILL)
> #define TASK_NOLOADAVG (__TASK_SLEEPING)
>
> which is almost certainly how this *should* have been done, but isn't,
> because of historical use.
>
> Cleaning up like that *should* be fairly simple, but I'd be a bit
> nervous about getting all the state comparisons right (we have an
> unholy mix of "check this bit" and "check this whole state", and we'd
> need to make sure we get those cases all right).
>
> Ingo, what do you think? This is mostly a scheduler interface issue..
Hehe, a little something like this:
https://lkml.org/lkml/2013/11/12/710
Lemme go clean that up and finish it :-)
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] signals: Generate warning when flush_signals() is called from non-kthread context
2015-05-02 8:30 ` NeilBrown
2015-05-02 16:27 ` Linus Torvalds
@ 2015-05-04 17:35 ` Oleg Nesterov
2015-05-07 13:33 ` Jiri Kosina
1 sibling, 1 reply; 21+ messages in thread
From: Oleg Nesterov @ 2015-05-04 17:35 UTC (permalink / raw)
To: NeilBrown
Cc: Ingo Molnar, Linus Torvalds, Evgeniy Polyakov, Stephen Smalley,
Alex Williamson, linux-kernel, kvm
On 05/02, NeilBrown wrote:
>
> All the calls in md.c are in a kernel thread so safe, but I'd rather have an
> explicit "uninterruptible, but no load-average" wait....
Could you please explain why md_thread() does allow_signal(SIGKILL) ?
I am just curious. It looks as if we want to allow user-space to "call"
thread->run(), and this looks strange.
Oleg.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] signals: Generate warning when flush_signals() is called from non-kthread context
2015-05-04 17:35 ` Oleg Nesterov
@ 2015-05-07 13:33 ` Jiri Kosina
2015-05-07 22:37 ` NeilBrown
0 siblings, 1 reply; 21+ messages in thread
From: Jiri Kosina @ 2015-05-07 13:33 UTC (permalink / raw)
To: Oleg Nesterov
Cc: NeilBrown, Ingo Molnar, Linus Torvalds, Evgeniy Polyakov,
Stephen Smalley, Alex Williamson, linux-kernel, kvm
On Mon, 4 May 2015, Oleg Nesterov wrote:
> > All the calls in md.c are in a kernel thread so safe, but I'd rather have an
> > explicit "uninterruptible, but no load-average" wait....
>
> Could you please explain why md_thread() does allow_signal(SIGKILL) ?
>
> I am just curious. It looks as if we want to allow user-space to "call"
> thread->run(), and this looks strange.
One would think that this is because md wants to be notified when system
is going to be halted/rebooted, and userspace init (whatever that is)
decides to do 'kill -9 -1' to perform the final shutdown of md (the
question is why it really should be needed, becasue all filesystems should
be R/O by that time anyway).
--
Jiri Kosina
SUSE Labs
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] signals: Generate warning when flush_signals() is called from non-kthread context
2015-05-07 13:33 ` Jiri Kosina
@ 2015-05-07 22:37 ` NeilBrown
0 siblings, 0 replies; 21+ messages in thread
From: NeilBrown @ 2015-05-07 22:37 UTC (permalink / raw)
To: Jiri Kosina
Cc: Oleg Nesterov, Ingo Molnar, Linus Torvalds, Evgeniy Polyakov,
Stephen Smalley, Alex Williamson, linux-kernel, kvm
[-- Attachment #1: Type: text/plain, Size: 1387 bytes --]
On Thu, 7 May 2015 15:33:53 +0200 (CEST) Jiri Kosina <jkosina@suse.cz> wrote:
> On Mon, 4 May 2015, Oleg Nesterov wrote:
>
> > > All the calls in md.c are in a kernel thread so safe, but I'd rather have an
> > > explicit "uninterruptible, but no load-average" wait....
> >
> > Could you please explain why md_thread() does allow_signal(SIGKILL) ?
> >
> > I am just curious. It looks as if we want to allow user-space to "call"
> > thread->run(), and this looks strange.
>
> One would think that this is because md wants to be notified when system
> is going to be halted/rebooted, and userspace init (whatever that is)
> decides to do 'kill -9 -1' to perform the final shutdown of md (the
> question is why it really should be needed, becasue all filesystems should
> be R/O by that time anyway).
>
Something like that. It is historical strangeness that might have seemed
like a good idea at the time, but is hard to justify.
There is an alt-sysrq that will remount filesystems read-only, but no
alt-sysrq to switch RAID arrays to "idle/clean". So I leverages alt-sysrq-K,
which does "kill -9 -1" (I think).
Since md gained the ability to switch to idle/clean after a short timeout,
and also the ability to use a write-intent-bitmap so only little bits of the
array are ever non idle/clean, this all became much less interesting.
NeilBrown
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 811 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] signals: Generate warning when flush_signals() is called from non-kthread context
2015-05-01 19:38 ` [PATCH] signals: Generate warning when flush_signals() is called from non-kthread context Ingo Molnar
2015-05-02 8:30 ` NeilBrown
@ 2015-05-02 11:56 ` Evgeniy Polyakov
2015-05-02 16:33 ` Richard Weinberger
2015-05-03 17:34 ` Oleg Nesterov
3 siblings, 0 replies; 21+ messages in thread
From: Evgeniy Polyakov @ 2015-05-02 11:56 UTC (permalink / raw)
To: Ingo Molnar, Linus Torvalds, Neil Brown, Stephen Smalley
Cc: Alex Williamson, Oleg Nesterov, linux-kernel, kvm
Hi Ingo
01.05.2015, 22:38, "Ingo Molnar" <mingo@kernel.org>:
> * Linus Torvalds <torvalds@linux-foundation.org> wrote:
>> On Fri, May 1, 2015 at 10:40 AM, Alex Williamson
>> <alex.williamson@redhat.com> wrote:
>>> - Flush signals on interrupted wait to retain polling interval (Alex Williamson)
>> This cannot *possibly* be right. If I read this patch right, you're
>> randomly just getting rid of signals. No way in hell is that correct.
>>
>> "flush_signals()" is only for kernel threads, where it's a hacky
>> alternative to actually handling them (since kernel threads never
>> rreturn to user space and cannot really "handle" a signal). But you're
>> doing it in the ->remove handler for the device, which can be called
>> by arbitrary system processes. This is not a kernel thread thing, as
>> far as I can see.
>>
>> If you cannot handle signals, you damn well shouldn't be using
>> "wait_event_interruptible_timeout()" to begin with. Get rid of the
>> "interruptible", since it apparently *isn't* interruptible.
>>
>> So I'm not pulling this.
>>
>> Now I'm worried that other drivers do insane things like this. I
>> wonder if we should add some sanity test to flush_signals() to make
>> sure that it can only ever get called from a kernel thread.
> Looks unsafe: called from various module exit handlers in:
>
> drivers/w1/slaves/w1_bq27000.c
> drivers/w1/slaves/w1_ds2406.c
> drivers/w1/slaves/w1_ds2408.c
> drivers/w1/slaves/w1_ds2413.c
> drivers/w1/slaves/w1_ds2423.c
> drivers/w1/slaves/w1_ds2431.c
> drivers/w1/slaves/w1_ds2433.c
> drivers/w1/slaves/w1_ds2760.c
> drivers/w1/slaves/w1_ds2780.c
> drivers/w1/slaves/w1_ds2781.c
> drivers/w1/slaves/w1_ds28e04.c
> drivers/w1/slaves/w1_smem.c
> drivers/w1/slaves/w1_therm.c
>
> which would be executed in rmmod context, losing signals.
> Cc:-ed Evgeniy.
w1 uses a little bit strange refcnt logic, basically every w1 module waits
for all its users to release their w1 resources and doesn't exit until its safe.
To wait for resources to be freed at module exit time it checks its refcnt to drop to zero periodically
and sleeps between the checks for a second. w1 flushes signals between these
checks for no particular reason, it can be safely removed from w1_unregister_family()
and interruptible sleep replaced with the normal one.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] signals: Generate warning when flush_signals() is called from non-kthread context
2015-05-01 19:38 ` [PATCH] signals: Generate warning when flush_signals() is called from non-kthread context Ingo Molnar
2015-05-02 8:30 ` NeilBrown
2015-05-02 11:56 ` Evgeniy Polyakov
@ 2015-05-02 16:33 ` Richard Weinberger
2015-05-03 17:34 ` Oleg Nesterov
3 siblings, 0 replies; 21+ messages in thread
From: Richard Weinberger @ 2015-05-02 16:33 UTC (permalink / raw)
To: Ingo Molnar
Cc: Linus Torvalds, Neil Brown, Evgeniy Polyakov, Stephen Smalley,
Alex Williamson, Oleg Nesterov, linux-kernel, kvm
On Fri, May 1, 2015 at 9:38 PM, Ingo Molnar <mingo@kernel.org> wrote:
> diff --git a/kernel/signal.c b/kernel/signal.c
> index d51c5ddd855c..100e30afe5d2 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -427,6 +427,10 @@ void flush_signals(struct task_struct *t)
> {
> unsigned long flags;
>
> + /* Only kthreads are allowed to destroy signals: */
> + if (WARN_ON_ONCE(!(current->flags & PF_KTHREAD)))
Shouldn't this be t->flags?
--
Thanks,
//richard
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] signals: Generate warning when flush_signals() is called from non-kthread context
2015-05-01 19:38 ` [PATCH] signals: Generate warning when flush_signals() is called from non-kthread context Ingo Molnar
` (2 preceding siblings ...)
2015-05-02 16:33 ` Richard Weinberger
@ 2015-05-03 17:34 ` Oleg Nesterov
2015-05-04 16:45 ` [PATCH 0/1] signals: don't abuse __flush_signals() in selinux_bprm_committed_creds() Oleg Nesterov
2015-05-06 10:19 ` [PATCH] signals: Generate warning when flush_signals() is called from non-kthread context Ingo Molnar
3 siblings, 2 replies; 21+ messages in thread
From: Oleg Nesterov @ 2015-05-03 17:34 UTC (permalink / raw)
To: Ingo Molnar
Cc: Linus Torvalds, Neil Brown, Evgeniy Polyakov, Stephen Smalley,
Alex Williamson, linux-kernel, kvm
On 05/01, Ingo Molnar wrote:
>
> * Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
> > On Fri, May 1, 2015 at 10:40 AM, Alex Williamson
> > <alex.williamson@redhat.com> wrote:
> > >
> > > - Flush signals on interrupted wait to retain polling interval (Alex Williamson)
> >
> > This cannot *possibly* be right. If I read this patch right, you're
> > randomly just getting rid of signals. No way in hell is that correct.
> >
> > "flush_signals()" is only for kernel threads, where it's a hacky
> > alternative to actually handling them (since kernel threads never
> > rreturn to user space and cannot really "handle" a signal). But you're
> > doing it in the ->remove handler for the device, which can be called
> > by arbitrary system processes. This is not a kernel thread thing, as
> > far as I can see.
> >
> > If you cannot handle signals, you damn well shouldn't be using
> > "wait_event_interruptible_timeout()" to begin with. Get rid of the
> > "interruptible", since it apparently *isn't* interruptible.
> >
> > So I'm not pulling this.
> >
> > Now I'm worried that other drivers do insane things like this. I
> > wonder if we should add some sanity test to flush_signals() to make
> > sure that it can only ever get called from a kernel thread.
> >
> > Oleg?
>
> So there are these uses:
>
> triton:~/tip> git grep -lw flush_signals
> arch/arm/common/bL_switcher.c
>
>
> Looks safe: used within the bL_switcher_thread() kthread.
safe but wrong at first glance... I mean, it is pointless. Looks like,
bL_switcher_thread() wrongly thinks that wait_event_interruptible() can
lead to signal_pending().
> drivers/block/drbd/drbd_main.c
> drivers/block/drbd/drbd_nl.c
> drivers/block/drbd/drbd_receiver.c
> drivers/block/drbd/drbd_worker.c
Oh, I didn't know this helper is abused so much. I'll try to recheck
tomorrow, but it seems that it should die...
> drivers/md/md.c
I can't understand this code... The comment says:
/* We need to wait INTERRUPTIBLE so that
* we don't add to the load-average.
* That means we need to be sure no signals are
* pending
*/
if (signal_pending(current))
flush_signals(current);
and this is wrong. However, signal_pending() can be true because of
allow_signal(SIGKILL) above. But why it does allow_signal() ?
> fs/lockd/svc.c
> fs/nfs/callback.c
> fs/nfsd/nfssvc.c
>
> Looks safe: lockd, nfsd plus nfsv4.%u-svc kthreads.
Yes, this case looks fine. But perhaps it makes sense to add another
helper... Or not, I'll try to think more.
> I also found a __flush_signals() use in:
>
> security/selinux/hooks.c
>
> Now that's selinux_bprm_committed_creds(), apparently executed on
> exec(). Also does stuff like:
>
> memset(&itimer, 0, sizeof itimer);
> for (i = 0; i < 3; i++)
> do_setitimer(i, &itimer, NULL);
>
> and unblocks signals as well:
>
> sigemptyset(¤t->blocked);
>
> but this appears to be kind of legit: the task failed to get the
> required permissions, and guns go off.
and I simply can't understand this code... but I feel that it can't
be correct ;) Will try to re-read tomorrow.
> In any case, it seems to me that the patch below would be justified?
> Totally untested and so. __flush_signals() not affected.
Yes, agreed, it can't be right if the caller is not kthread.
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -427,6 +427,10 @@ void flush_signals(struct task_struct *t)
> {
> unsigned long flags;
>
> + /* Only kthreads are allowed to destroy signals: */
> + if (WARN_ON_ONCE(!(current->flags & PF_KTHREAD)))
> + return;
> +
But I am not sure this can't make some buggy driver even more buggy.
Just suppose it does something
do {
if (signal_pending())
flush_signals();
} while (wait_event_interruptible(...));
and this change will turn this into busy-wait loop.
So perhaps another change which just adds WARN_ON_RATELIMIT() without
"return" will be safer?
Oleg.
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH 0/1] signals: don't abuse __flush_signals() in selinux_bprm_committed_creds()
2015-05-03 17:34 ` Oleg Nesterov
@ 2015-05-04 16:45 ` Oleg Nesterov
2015-05-04 16:45 ` [PATCH 1/1] " Oleg Nesterov
2015-05-06 10:19 ` [PATCH] signals: Generate warning when flush_signals() is called from non-kthread context Ingo Molnar
1 sibling, 1 reply; 21+ messages in thread
From: Oleg Nesterov @ 2015-05-04 16:45 UTC (permalink / raw)
To: Andrew Morton, Ingo Molnar
Cc: Linus Torvalds, Neil Brown, Evgeniy Polyakov, Stephen Smalley,
Alex Williamson, linux-kernel, kvm, Paul Moore, Eric Paris
On 05/03, Oleg Nesterov wrote:
>
> On 05/01, Ingo Molnar wrote:
> >
>
> > I also found a __flush_signals() use in:
> >
> > security/selinux/hooks.c
> >
> and I simply can't understand this code... but I feel that it can't
> be correct ;) Will try to re-read tomorrow.
Yes, this doesn't look right. Lets kill __flush_signals() first, there
are no other users.
And I am not sure it is fine to flush SIGSTOP... do we really want this?
Oleg.
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH 1/1] signals: don't abuse __flush_signals() in selinux_bprm_committed_creds()
2015-05-04 16:45 ` [PATCH 0/1] signals: don't abuse __flush_signals() in selinux_bprm_committed_creds() Oleg Nesterov
@ 2015-05-04 16:45 ` Oleg Nesterov
2015-05-04 19:43 ` Paul Moore
0 siblings, 1 reply; 21+ messages in thread
From: Oleg Nesterov @ 2015-05-04 16:45 UTC (permalink / raw)
To: Andrew Morton, Ingo Molnar
Cc: Linus Torvalds, Neil Brown, Evgeniy Polyakov, Stephen Smalley,
Alex Williamson, linux-kernel, kvm, Paul Moore, Eric Paris
selinux_bprm_committed_creds()->__flush_signals() is not right, we
shouldn't clear TIF_SIGPENDING unconditionally. There can be other
reasons for signal_pending(): freezing(), JOBCTL_PENDING_MASK, and
potentially more.
Also change this code to check fatal_signal_pending() rather than
SIGNAL_GROUP_EXIT, it looks a bit better.
Now we can kill __flush_signals() before it finds another buggy user.
Note: this code looks racy, we can flush a signal which was sent after
the task SID has been updated.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
include/linux/sched.h | 1 -
kernel/signal.c | 13 ++++---------
security/selinux/hooks.c | 6 ++++--
3 files changed, 8 insertions(+), 12 deletions(-)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 8db31ef..eb1ac84 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2345,7 +2345,6 @@ extern void sched_dead(struct task_struct *p);
extern void proc_caches_init(void);
extern void flush_signals(struct task_struct *);
-extern void __flush_signals(struct task_struct *);
extern void ignore_signals(struct task_struct *);
extern void flush_signal_handlers(struct task_struct *, int force_default);
extern int dequeue_signal(struct task_struct *tsk, sigset_t *mask, siginfo_t *info);
diff --git a/kernel/signal.c b/kernel/signal.c
index 16a3052..837ca7d 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -414,21 +414,16 @@ void flush_sigqueue(struct sigpending *queue)
}
/*
- * Flush all pending signals for a task.
+ * Flush all pending signals for this kthread.
*/
-void __flush_signals(struct task_struct *t)
-{
- clear_tsk_thread_flag(t, TIF_SIGPENDING);
- flush_sigqueue(&t->pending);
- flush_sigqueue(&t->signal->shared_pending);
-}
-
void flush_signals(struct task_struct *t)
{
unsigned long flags;
spin_lock_irqsave(&t->sighand->siglock, flags);
- __flush_signals(t);
+ clear_tsk_thread_flag(t, TIF_SIGPENDING);
+ flush_sigqueue(&t->pending);
+ flush_sigqueue(&t->signal->shared_pending);
spin_unlock_irqrestore(&t->sighand->siglock, flags);
}
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 6da7532..6907d11 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -2397,10 +2397,12 @@ static void selinux_bprm_committed_creds(struct linux_binprm *bprm)
for (i = 0; i < 3; i++)
do_setitimer(i, &itimer, NULL);
spin_lock_irq(¤t->sighand->siglock);
- if (!(current->signal->flags & SIGNAL_GROUP_EXIT)) {
- __flush_signals(current);
+ if (!fatal_signal_pending(current)) {
+ flush_sigqueue(¤t->pending);
+ flush_sigqueue(¤t->signal->shared_pending);
flush_signal_handlers(current, 1);
sigemptyset(¤t->blocked);
+ recalc_sigpending();
}
spin_unlock_irq(¤t->sighand->siglock);
}
--
1.5.5.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH 1/1] signals: don't abuse __flush_signals() in selinux_bprm_committed_creds()
2015-05-04 16:45 ` [PATCH 1/1] " Oleg Nesterov
@ 2015-05-04 19:43 ` Paul Moore
0 siblings, 0 replies; 21+ messages in thread
From: Paul Moore @ 2015-05-04 19:43 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Andrew Morton, Ingo Molnar, Linus Torvalds, Neil Brown,
Evgeniy Polyakov, Stephen Smalley, Alex Williamson, linux-kernel,
kvm, Eric Paris, selinux
On Monday, May 04, 2015 06:45:58 PM Oleg Nesterov wrote:
> selinux_bprm_committed_creds()->__flush_signals() is not right, we
> shouldn't clear TIF_SIGPENDING unconditionally. There can be other
> reasons for signal_pending(): freezing(), JOBCTL_PENDING_MASK, and
> potentially more.
>
> Also change this code to check fatal_signal_pending() rather than
> SIGNAL_GROUP_EXIT, it looks a bit better.
>
> Now we can kill __flush_signals() before it finds another buggy user.
[NOTE: Added the SELinux list to the CC line]
This looks reasonable to me, I'm going to apply it to selinux#next today.
> Note: this code looks racy, we can flush a signal which was sent after
> the task SID has been updated.
The whole signal flush thread has started some discussions about how we are
currently handling this, and if it still makes sense. Like many things, it
seemed like a good idea at the time, but after several years we're debating if
that is still the case. I expect we'll be changing this code soon.
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> ---
> include/linux/sched.h | 1 -
> kernel/signal.c | 13 ++++---------
> security/selinux/hooks.c | 6 ++++--
> 3 files changed, 8 insertions(+), 12 deletions(-)
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 8db31ef..eb1ac84 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -2345,7 +2345,6 @@ extern void sched_dead(struct task_struct *p);
>
> extern void proc_caches_init(void);
> extern void flush_signals(struct task_struct *);
> -extern void __flush_signals(struct task_struct *);
> extern void ignore_signals(struct task_struct *);
> extern void flush_signal_handlers(struct task_struct *, int force_default);
> extern int dequeue_signal(struct task_struct *tsk, sigset_t *mask,
> siginfo_t *info); diff --git a/kernel/signal.c b/kernel/signal.c
> index 16a3052..837ca7d 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -414,21 +414,16 @@ void flush_sigqueue(struct sigpending *queue)
> }
>
> /*
> - * Flush all pending signals for a task.
> + * Flush all pending signals for this kthread.
> */
> -void __flush_signals(struct task_struct *t)
> -{
> - clear_tsk_thread_flag(t, TIF_SIGPENDING);
> - flush_sigqueue(&t->pending);
> - flush_sigqueue(&t->signal->shared_pending);
> -}
> -
> void flush_signals(struct task_struct *t)
> {
> unsigned long flags;
>
> spin_lock_irqsave(&t->sighand->siglock, flags);
> - __flush_signals(t);
> + clear_tsk_thread_flag(t, TIF_SIGPENDING);
> + flush_sigqueue(&t->pending);
> + flush_sigqueue(&t->signal->shared_pending);
> spin_unlock_irqrestore(&t->sighand->siglock, flags);
> }
>
> diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> index 6da7532..6907d11 100644
> --- a/security/selinux/hooks.c
> +++ b/security/selinux/hooks.c
> @@ -2397,10 +2397,12 @@ static void selinux_bprm_committed_creds(struct
> linux_binprm *bprm) for (i = 0; i < 3; i++)
> do_setitimer(i, &itimer, NULL);
> spin_lock_irq(¤t->sighand->siglock);
> - if (!(current->signal->flags & SIGNAL_GROUP_EXIT)) {
> - __flush_signals(current);
> + if (!fatal_signal_pending(current)) {
> + flush_sigqueue(¤t->pending);
> + flush_sigqueue(¤t->signal->shared_pending);
> flush_signal_handlers(current, 1);
> sigemptyset(¤t->blocked);
> + recalc_sigpending();
> }
> spin_unlock_irq(¤t->sighand->siglock);
> }
--
paul moore
www.paul-moore.com
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] signals: Generate warning when flush_signals() is called from non-kthread context
2015-05-03 17:34 ` Oleg Nesterov
2015-05-04 16:45 ` [PATCH 0/1] signals: don't abuse __flush_signals() in selinux_bprm_committed_creds() Oleg Nesterov
@ 2015-05-06 10:19 ` Ingo Molnar
1 sibling, 0 replies; 21+ messages in thread
From: Ingo Molnar @ 2015-05-06 10:19 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Linus Torvalds, Neil Brown, Evgeniy Polyakov, Stephen Smalley,
Alex Williamson, linux-kernel, kvm
* Oleg Nesterov <oleg@redhat.com> wrote:
> > --- a/kernel/signal.c
> > +++ b/kernel/signal.c
> > @@ -427,6 +427,10 @@ void flush_signals(struct task_struct *t)
> > {
> > unsigned long flags;
> >
> > + /* Only kthreads are allowed to destroy signals: */
> > + if (WARN_ON_ONCE(!(current->flags & PF_KTHREAD)))
> > + return;
> > +
>
> But I am not sure this can't make some buggy driver even more buggy.
> Just suppose it does something
>
> do {
> if (signal_pending())
> flush_signals();
> } while (wait_event_interruptible(...));
>
> and this change will turn this into busy-wait loop.
>
> So perhaps another change which just adds WARN_ON_RATELIMIT()
> without "return" will be safer?
Yeah, absolutely.
Thanks,
Ingo
^ permalink raw reply [flat|nested] 21+ messages in thread