linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [BUG] 2.4.x RT signal leak with kupdated (and maybe others)
@ 2003-09-30 16:27 Benjamin Herrenschmidt
  2003-09-30 17:36 ` Andrea Arcangeli
  0 siblings, 1 reply; 7+ messages in thread
From: Benjamin Herrenschmidt @ 2003-09-30 16:27 UTC (permalink / raw)
  To: Linux Kernel list; +Cc: Marcelo Tosatti

Hi !

I finally figured out why on a friend machine, his nr_queued_signals is
continuously growing until reaching nr_max_signals, thus preventing
queuing of RT signals, for example causing do_notify_parent() to fail
(libpthread uses sig 33 which is RTMIN+1 typically) leading to all sorts
of zombies floating around etc...

The problem is a bug in kupdated (possibly shared by other kernel code
manipulating a task tsk->pending.signal mask "by hand") that gets
triggered, in this case, by the infamous noflushd, but other culprits
are possible.

The bug is simple: the SIGSTOP sent to kupdated gets queued (allocated
& queued actually) since we try to queue one non-RT signal nowadays.

However, when "receiving" it, kupdated will "manually" clear it from
signal pending mask and will _not_ dequeue it. Thus, that signal will
stay forever in kupdated signal queue, it will never be deallocated and
nr_queued_signals will never be decreased.

Actually, further sigstops will stack there as well since kupdated is
clearing it from tsk->pending.signal so further queuing won't "notice"
it's already there.
That clearing also prevents handle_stop_signal() from flushing it from
the queue when SIGCONT is received.

The only thing I can see that could get rid of those signals  is
flush_sigqueue(), but of course, this is never called for a kernel
thread like kupdated.

So there is a clear bug in kupdated, I suppose the fix is to call
something like dequeue_signal() from kupdated instead of hacking
tsk->pending.signal. I need to test a fix before I post a patch.

Do we have a smiliar bug(s) with other bits of kernel "manipulating"
the pending signal mask this way ? I don't know what others may do
here, so if you know something like that, please speak up.

Ben.



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] 2.4.x RT signal leak with kupdated (and maybe others)
  2003-09-30 16:27 [BUG] 2.4.x RT signal leak with kupdated (and maybe others) Benjamin Herrenschmidt
@ 2003-09-30 17:36 ` Andrea Arcangeli
  2003-09-30 17:47   ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 7+ messages in thread
From: Andrea Arcangeli @ 2003-09-30 17:36 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Linux Kernel list, Marcelo Tosatti

On Tue, Sep 30, 2003 at 06:27:55PM +0200, Benjamin Herrenschmidt wrote:
> Hi !
> 
> I finally figured out why on a friend machine, his nr_queued_signals is
> continuously growing until reaching nr_max_signals, thus preventing
> queuing of RT signals, for example causing do_notify_parent() to fail
> (libpthread uses sig 33 which is RTMIN+1 typically) leading to all sorts
> of zombies floating around etc...
> 
> The problem is a bug in kupdated (possibly shared by other kernel code
> manipulating a task tsk->pending.signal mask "by hand") that gets
> triggered, in this case, by the infamous noflushd, but other culprits
> are possible.
> 
> The bug is simple: the SIGSTOP sent to kupdated gets queued (allocated
> & queued actually) since we try to queue one non-RT signal nowadays.
> 
> However, when "receiving" it, kupdated will "manually" clear it from
> signal pending mask and will _not_ dequeue it. Thus, that signal will
> stay forever in kupdated signal queue, it will never be deallocated and
> nr_queued_signals will never be decreased.
> 
> Actually, further sigstops will stack there as well since kupdated is
> clearing it from tsk->pending.signal so further queuing won't "notice"
> it's already there.
> That clearing also prevents handle_stop_signal() from flushing it from
> the queue when SIGCONT is received.
> 
> The only thing I can see that could get rid of those signals  is
> flush_sigqueue(), but of course, this is never called for a kernel
> thread like kupdated.
> 
> So there is a clear bug in kupdated, I suppose the fix is to call
> something like dequeue_signal() from kupdated instead of hacking
> tsk->pending.signal. I need to test a fix before I post a patch.
> 
> Do we have a smiliar bug(s) with other bits of kernel "manipulating"
> the pending signal mask this way ? I don't know what others may do
> here, so if you know something like that, please speak up.

When I wrote the kupdate code, only the real time signals could be
queued. Now things have changed to carry the siginfo for non-RT too. The
fact we clear the pending by hand is what allows more than a RT signal
to be stacked, we shouldn't clear the bitflag unless we dequeue the
signal too. That's definitely a bug (though a minor one ;)

I sure agree it should be fixed with a dequeue_signal in kupdate.

BTW, things like this in daemonize don't protect against allocating
signals (the kernel deamon should flush_signals once in a while in the
main loop to do that):

	/* Block and flush all signals */
	sigfillset(&blocked);
	sigprocmask(SIG_BLOCK, &blocked, NULL);
	flush_signals(current);

But it's not a big problem you shouldn't send signals to daemons
anyways, only kupdate gives a semantic to signals.

I would suggest you to go ahead and cook the fix ;), thanks.

Andrea - If you prefer relying on open source software, check these links:
	    rsync.kernel.org::pub/scm/linux/kernel/bkcvs/linux-2.[45]/
	    http://www.cobite.com/cvsps/

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] 2.4.x RT signal leak with kupdated (and maybe others)
  2003-09-30 17:36 ` Andrea Arcangeli
@ 2003-09-30 17:47   ` Benjamin Herrenschmidt
  2003-09-30 18:22     ` Andrea Arcangeli
  0 siblings, 1 reply; 7+ messages in thread
From: Benjamin Herrenschmidt @ 2003-09-30 17:47 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Linux Kernel list, Marcelo Tosatti

> When I wrote the kupdate code, only the real time signals could be
> queued. Now things have changed to carry the siginfo for non-RT too. The
> fact we clear the pending by hand is what allows more than a RT signal
> to be stacked, we shouldn't clear the bitflag unless we dequeue the
> signal too. That's definitely a bug (though a minor one ;)

"Minor" but leads to interesting results in the end when coupled
with something like noflushd that regulary send those signals ;)

Not only we leak them, but we also get nr_queued_signals reaching
nr_max_signals. This has the side effect of making do_notify_parent()
silently fail when a pthread is dead (libpthread use an RT signal).

The end result is that after a few days, a machine running noflushd
and thread intensive apps like evolution and gkrellm will have dozens
(or even hundreds) of zombies as the child threads are never reclaimed
by libpthread "manager" thread since it never gets the signal...

> I sure agree it should be fixed with a dequeue_signal in kupdate.

I'll cook something tomorrow (I'm away from any 2.4 machine tonight).

> BTW, things like this in daemonize don't protect against allocating
> signals (the kernel deamon should flush_signals once in a while in the
> main loop to do that):
> 
> 	/* Block and flush all signals */
> 	sigfillset(&blocked);
> 	sigprocmask(SIG_BLOCK, &blocked, NULL);
> 	flush_signals(current);
> 
> But it's not a big problem you shouldn't send signals to daemons
> anyways, only kupdate gives a semantic to signals.

Interesting... though hopefully, I didn't see anybody else causing
such a constant increase of nr_queued_signals so far on this laptop...

Ben.



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] 2.4.x RT signal leak with kupdated (and maybe others)
  2003-09-30 17:47   ` Benjamin Herrenschmidt
@ 2003-09-30 18:22     ` Andrea Arcangeli
  2003-10-01  9:24       ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 7+ messages in thread
From: Andrea Arcangeli @ 2003-09-30 18:22 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Linux Kernel list, Marcelo Tosatti

On Tue, Sep 30, 2003 at 07:47:09PM +0200, Benjamin Herrenschmidt wrote:
> > When I wrote the kupdate code, only the real time signals could be
> > queued. Now things have changed to carry the siginfo for non-RT too. The
> > fact we clear the pending by hand is what allows more than a RT signal
> > to be stacked, we shouldn't clear the bitflag unless we dequeue the
> > signal too. That's definitely a bug (though a minor one ;)
> 
> "Minor" but leads to interesting results in the end when coupled
> with something like noflushd that regulary send those signals ;)
> 
> Not only we leak them, but we also get nr_queued_signals reaching
> nr_max_signals. This has the side effect of making do_notify_parent()
> silently fail when a pthread is dead (libpthread use an RT signal).
> 
> The end result is that after a few days, a machine running noflushd
> and thread intensive apps like evolution and gkrellm will have dozens
> (or even hundreds) of zombies as the child threads are never reclaimed
> by libpthread "manager" thread since it never gets the signal...

for noflushd users it's major (for everybody else is minor)

> Interesting... though hopefully, I didn't see anybody else causing
> such a constant increase of nr_queued_signals so far on this laptop...

That's because nobody else sends signals to the daemons I guess. And
even if they do the daemon won't clear the pending bitflag, so there's
no risk to queue more than 1 non-RT entry per signal per daemon like it
happened with kupdate.

Andrea - If you prefer relying on open source software, check these links:
	    rsync.kernel.org::pub/scm/linux/kernel/bkcvs/linux-2.[45]/
	    http://www.cobite.com/cvsps/

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] 2.4.x RT signal leak with kupdated (and maybe others)
  2003-09-30 18:22     ` Andrea Arcangeli
@ 2003-10-01  9:24       ` Benjamin Herrenschmidt
  2003-10-01  9:32         ` Nigel Cunningham
  2003-10-01 14:45         ` Andrea Arcangeli
  0 siblings, 2 replies; 7+ messages in thread
From: Benjamin Herrenschmidt @ 2003-10-01  9:24 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Linux Kernel list, Marcelo Tosatti


> That's because nobody else sends signals to the daemons I guess. And
> even if they do the daemon won't clear the pending bitflag, so there's
> no risk to queue more than 1 non-RT entry per signal per daemon like it
> happened with kupdate.

And any daemon can cause the same leak by sending it RT signals... I just
verified sending for example a bunch of 33's (SIGRTMIN) to khubd, that
increased the count permanently.

I agree this should not happen normally, and I suppose only root can do
that and we aren't here to prevent root from shooting itself in the
foot, are we ?

The question is should I spend some time adding some flush_signals() to
the loop of those various kernel daemons I can find or that isn't worth ?

Regarding kupated, dequeue_signal is a better option as we actually care
about the signal, I'm testing a patch it will be there in a few minutes.

Ben.
 


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] 2.4.x RT signal leak with kupdated (and maybe others)
  2003-10-01  9:24       ` Benjamin Herrenschmidt
@ 2003-10-01  9:32         ` Nigel Cunningham
  2003-10-01 14:45         ` Andrea Arcangeli
  1 sibling, 0 replies; 7+ messages in thread
From: Nigel Cunningham @ 2003-10-01  9:32 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Andrea Arcangeli, Linux Kernel Mailing List, Marcelo Tosatti

Hi.

For what it's worth, Software Suspend sends signals to daemons, but it
uses flush_signals() to ensure they don't have any other effect than
getting the process stopped during the cycle.

Nigel

On Wed, 2003-10-01 at 21:24, Benjamin Herrenschmidt wrote:
> > That's because nobody else sends signals to the daemons I guess. And
> > even if they do the daemon won't clear the pending bitflag, so there's
> > no risk to queue more than 1 non-RT entry per signal per daemon like it
> > happened with kupdate.
> 
> And any daemon can cause the same leak by sending it RT signals... I just
> verified sending for example a bunch of 33's (SIGRTMIN) to khubd, that
> increased the count permanently.
> 
> I agree this should not happen normally, and I suppose only root can do
> that and we aren't here to prevent root from shooting itself in the
> foot, are we ?
> 
> The question is should I spend some time adding some flush_signals() to
> the loop of those various kernel daemons I can find or that isn't worth ?
> 
> Regarding kupated, dequeue_signal is a better option as we actually care
> about the signal, I'm testing a patch it will be there in a few minutes.
> 
> Ben.
>  
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
-- 
Nigel Cunningham
495 St Georges Road South, Hastings 4201, New Zealand

You see, at just the right time, when we were still powerless,
Christ died for the ungodly.
	-- Romans 5:6, NIV.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] 2.4.x RT signal leak with kupdated (and maybe others)
  2003-10-01  9:24       ` Benjamin Herrenschmidt
  2003-10-01  9:32         ` Nigel Cunningham
@ 2003-10-01 14:45         ` Andrea Arcangeli
  1 sibling, 0 replies; 7+ messages in thread
From: Andrea Arcangeli @ 2003-10-01 14:45 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Linux Kernel list, Marcelo Tosatti

On Wed, Oct 01, 2003 at 11:24:01AM +0200, Benjamin Herrenschmidt wrote:
> 
> > That's because nobody else sends signals to the daemons I guess. And
> > even if they do the daemon won't clear the pending bitflag, so there's
> > no risk to queue more than 1 non-RT entry per signal per daemon like it
> > happened with kupdate.
> 
> And any daemon can cause the same leak by sending it RT signals... I just
> verified sending for example a bunch of 33's (SIGRTMIN) to khubd, that
> increased the count permanently.

Yes I know. I was only talking about non-RT, I was concerned about
sigterm/sigkill primarly, that is the only realistic scenario. If you
send SIGRTMIN that is a pilot error, the system will never attempt to do
that, while it's not so obvious for sigterm. That's why I said at least
not more than 1 signal will be queued.

> I agree this should not happen normally, and I suppose only root can do
> that and we aren't here to prevent root from shooting itself in the
> foot, are we ?
> 
> The question is should I spend some time adding some flush_signals() to
> the loop of those various kernel daemons I can find or that isn't worth ?

It's up to you, I believe what we have to fix is the kupdate, the
SIGRTMIN can't matter. At most we can care about the sigterm, but even
the sigterm if it happens, it happens at shutdown time, so at the end it
doesn't matter much either for most systems (and that one isn't a leak,
only some minor memory lost).

So I guess you can fix only kupdate, and care to add the flush_signals
only in 2.6 that has the very same problem.

> Regarding kupated, dequeue_signal is a better option as we actually care
> about the signal, I'm testing a patch it will be there in a few minutes.

thanks

Andrea - If you prefer relying on open source software, check these links:
	    rsync.kernel.org::pub/scm/linux/kernel/bkcvs/linux-2.[45]/
	    http://www.cobite.com/cvsps/

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2003-10-01 14:44 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-09-30 16:27 [BUG] 2.4.x RT signal leak with kupdated (and maybe others) Benjamin Herrenschmidt
2003-09-30 17:36 ` Andrea Arcangeli
2003-09-30 17:47   ` Benjamin Herrenschmidt
2003-09-30 18:22     ` Andrea Arcangeli
2003-10-01  9:24       ` Benjamin Herrenschmidt
2003-10-01  9:32         ` Nigel Cunningham
2003-10-01 14:45         ` Andrea Arcangeli

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).