linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] fix the long standing exec vs kill race
@ 2007-12-02 15:14 Oleg Nesterov
  2007-12-02 17:06 ` Simon Holm Thøgersen
  2007-12-03 16:37 ` Linus Torvalds
  0 siblings, 2 replies; 8+ messages in thread
From: Oleg Nesterov @ 2007-12-02 15:14 UTC (permalink / raw)
  To: Andrew Morton, Davide Libenzi, Ingo Molnar, Linus Torvalds,
	Roland McGrath
  Cc: linux-kernel

Depends on
	[PATCH] __group_complete_signal: fix coredump with group stop race
	http://marc.info/?l=linux-kernel&m=119653436116036

Needs review and testing.

Please comment, I think at least the idea is promising.

Oleg.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/2] fix the long standing exec vs kill race
  2007-12-02 15:14 [PATCH 0/2] fix the long standing exec vs kill race Oleg Nesterov
@ 2007-12-02 17:06 ` Simon Holm Thøgersen
  2007-12-02 17:18   ` Oleg Nesterov
  2007-12-03 16:37 ` Linus Torvalds
  1 sibling, 1 reply; 8+ messages in thread
From: Simon Holm Thøgersen @ 2007-12-02 17:06 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Davide Libenzi, Ingo Molnar, Linus Torvalds,
	Roland McGrath, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 562 bytes --]


søn, 02 12 2007 kl. 18:14 +0300, skrev Oleg Nesterov:
> Depends on
> 	[PATCH] __group_complete_signal: fix coredump with group stop race
> 	http://marc.info/?l=linux-kernel&m=119653436116036
> 
> Needs review and testing.
> 
> Please comment, I think at least the idea is promising.
> 
I have an issue that sounds related, but I might be completely off. I
would expect the simple attached program to keep receiving the same
signal, i.e. respond to
	killall signal-exec -s SIGHUP

I tried your patches, but they didn't help.

Any ideas?


Simon Holm Thøgersen

[-- Attachment #2: signal-exec.c --]
[-- Type: text/x-csrc, Size: 469 bytes --]

#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static char **argv_;

static void handler(int signal)
{
	printf("got signal %d\n", signal);
	execv(argv_[0], argv_);
}

int main(int argc, char *argv[])
{
	printf("spawned\n");
	argv_ = argv;
	if (signal(SIGTERM, handler) == SIG_ERR)
		err(1, "could not set signal handler for SIGTERM");
	if (signal(SIGHUP, handler) == SIG_ERR)
		err(1, "could not set signal handler for SIGTERM");
	sleep(60);
	return 0;
}


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/2] fix the long standing exec vs kill race
  2007-12-02 17:06 ` Simon Holm Thøgersen
@ 2007-12-02 17:18   ` Oleg Nesterov
  2007-12-02 18:08     ` Simon Holm Thøgersen
  2007-12-02 20:26     ` Linus Torvalds
  0 siblings, 2 replies; 8+ messages in thread
From: Oleg Nesterov @ 2007-12-02 17:18 UTC (permalink / raw)
  To: Simon Holm Th?gersen
  Cc: Andrew Morton, Davide Libenzi, Ingo Molnar, Linus Torvalds,
	Roland McGrath, linux-kernel

On 12/02, Simon Holm Th?gersen wrote:
> 
> s??n, 02 12 2007 kl. 18:14 +0300, skrev Oleg Nesterov:
> > 
> > Please comment, I think at least the idea is promising.
> > 
> I have an issue that sounds related, but I might be completely off. I
> would expect the simple attached program to keep receiving the same
> signal, i.e. respond to
> 	killall signal-exec -s SIGHUP
> 
> I tried your patches, but they didn't help.
> 
> Any ideas?
> 
> 
> Simon Holm Th??gersen

> #include <signal.h>
> #include <stdio.h>
> #include <unistd.h>
> 
> static char **argv_;
> 
> static void handler(int signal)
> {
> 	printf("got signal %d\n", signal);
> 	execv(argv_[0], argv_);
> }
> 
> int main(int argc, char *argv[])
> {
> 	printf("spawned\n");
> 	argv_ = argv;
> 	if (signal(SIGTERM, handler) == SIG_ERR)
> 		err(1, "could not set signal handler for SIGTERM");
> 	if (signal(SIGHUP, handler) == SIG_ERR)
> 		err(1, "could not set signal handler for SIGTERM");
> 	sleep(60);
> 	return 0;
> }
> 

I think this is another issue which should be solved (?).

exec() from the signal handler doesn't do sys_sigreturn(), so we don't unblock
the signal, and it remains blocked after exec().

Hmm. Is this linux bug, or application bug?

Oleg.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/2] fix the long standing exec vs kill race
  2007-12-02 17:18   ` Oleg Nesterov
@ 2007-12-02 18:08     ` Simon Holm Thøgersen
  2007-12-02 18:52       ` Oleg Nesterov
  2007-12-02 20:26     ` Linus Torvalds
  1 sibling, 1 reply; 8+ messages in thread
From: Simon Holm Thøgersen @ 2007-12-02 18:08 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Davide Libenzi, Ingo Molnar, Linus Torvalds,
	Roland McGrath, linux-kernel


søn, 02 12 2007 kl. 20:18 +0300, skrev Oleg Nesterov:
> On 12/02, Simon Holm Th?gersen wrote:
> > 
> > s??n, 02 12 2007 kl. 18:14 +0300, skrev Oleg Nesterov:
> > > 
> > > Please comment, I think at least the idea is promising.
> > > 
> > I have an issue that sounds related, but I might be completely off. I
> > would expect the simple attached program to keep receiving the same
> > signal, i.e. respond to
> > 	killall signal-exec -s SIGHUP
> > 
> > I tried your patches, but they didn't help.
> > 
> > Any ideas?
> > 
> > 
> > Simon Holm Th??gersen
> 
> > #include <signal.h>
> > #include <stdio.h>
> > #include <unistd.h>
> > 
> > static char **argv_;
> > 
> > static void handler(int signal)
> > {
> > 	printf("got signal %d\n", signal);
> > 	execv(argv_[0], argv_);
> > }
> > 
> > int main(int argc, char *argv[])
> > {
> > 	printf("spawned\n");
> > 	argv_ = argv;
> > 	if (signal(SIGTERM, handler) == SIG_ERR)
> > 		err(1, "could not set signal handler for SIGTERM");
> > 	if (signal(SIGHUP, handler) == SIG_ERR)
> > 		err(1, "could not set signal handler for SIGTERM");
> > 	sleep(60);
> > 	return 0;
> > }
> > 
> 
> I think this is another issue which should be solved (?).
> 
> exec() from the signal handler doesn't do sys_sigreturn(), so we don't unblock
> the signal, and it remains blocked after exec().
> 
> Hmm. Is this linux bug, or application bug?

Good question. I haven't been able to find something in the
documentation for execve(2) and signal(2) saying it shouldn't be
possible, and it works on Solaris 10, so I'd say it is a Linux bug.
Actually, having another look at the documentation, signal(7) mentions
that POSIX.1-2003 requires that execve is safe to call from inside a
signal handler.


Simon Holm Thøgersen


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/2] fix the long standing exec vs kill race
  2007-12-02 18:08     ` Simon Holm Thøgersen
@ 2007-12-02 18:52       ` Oleg Nesterov
  0 siblings, 0 replies; 8+ messages in thread
From: Oleg Nesterov @ 2007-12-02 18:52 UTC (permalink / raw)
  To: Simon Holm Th?gersen
  Cc: Andrew Morton, Davide Libenzi, Ingo Molnar, Linus Torvalds,
	Roland McGrath, linux-kernel

On 12/02, Simon Holm Th?gersen wrote:
>
> s??n, 02 12 2007 kl. 20:18 +0300, skrev Oleg Nesterov:
> > On 12/02, Simon Holm Th?gersen wrote:
> > > 
> > > I have an issue that sounds related, but I might be completely off. I
> > > would expect the simple attached program to keep receiving the same
> > > signal, i.e. respond to
> > > 	killall signal-exec -s SIGHUP
> > > 
> > > I tried your patches, but they didn't help.
> > > 
> > > Any ideas?
> > > 
> > > 
> > > Simon Holm Th??gersen
> > 
> > > #include <signal.h>
> > > #include <stdio.h>
> > > #include <unistd.h>
> > > 
> > > static char **argv_;
> > > 
> > > static void handler(int signal)
> > > {
> > > 	printf("got signal %d\n", signal);
> > > 	execv(argv_[0], argv_);
> > > }
> > > 
> > > int main(int argc, char *argv[])
> > > {
> > > 	printf("spawned\n");
> > > 	argv_ = argv;
> > > 	if (signal(SIGTERM, handler) == SIG_ERR)
> > > 		err(1, "could not set signal handler for SIGTERM");
> > > 	if (signal(SIGHUP, handler) == SIG_ERR)
> > > 		err(1, "could not set signal handler for SIGTERM");
> > > 	sleep(60);
> > > 	return 0;
> > > }
> > > 
> > 
> > I think this is another issue which should be solved (?).
> > 
> > exec() from the signal handler doesn't do sys_sigreturn(), so we don't unblock
> > the signal, and it remains blocked after exec().
> > 
> > Hmm. Is this linux bug, or application bug?
> 
> Good question. I haven't been able to find something in the
> documentation for execve(2) and signal(2) saying it shouldn't be
> possible, and it works on Solaris 10, so I'd say it is a Linux bug.

Well, as I said, I don't know what would be the right behaviour,

> Actually, having another look at the documentation, signal(7) mentions
> that POSIX.1-2003 requires that execve is safe to call from inside a
> signal handler.

... but this doesn't look very clear to me.

- Linux can perfectly exec from inside a signal handler

- the application should know that the signal is blocked when the handler runs

- exec should preserve the ->blocked mask

So, is this really buggy? Do we break the "execve should be signal-safe" rule?
I don't know, but our CC: list is good ;)

Oleg.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/2] fix the long standing exec vs kill race
  2007-12-02 17:18   ` Oleg Nesterov
  2007-12-02 18:08     ` Simon Holm Thøgersen
@ 2007-12-02 20:26     ` Linus Torvalds
  1 sibling, 0 replies; 8+ messages in thread
From: Linus Torvalds @ 2007-12-02 20:26 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Simon Holm Th?gersen, Andrew Morton, Davide Libenzi, Ingo Molnar,
	Roland McGrath, linux-kernel



On Sun, 2 Dec 2007, Oleg Nesterov wrote:
> 
> exec() from the signal handler doesn't do sys_sigreturn(), so we don't unblock
> the signal, and it remains blocked after exec().
> 
> Hmm. Is this linux bug, or application bug?

I think that's an application bug.

The kernel does the obvious (and required) thing: it preserves the 
list of blocked signals over the execve(). And if you call execve() from 
within a signal handler, that list of blocked signals will obviously 
include the signals that got blocked by the execution of the signal 
itself.

(Side note: I also suspect that the program is not strictly POSIX 
conforming, and that execve() isn't in the list of functions that are safe 
to call from a signal handler in the first place, but that's a totally 
separate issue).

So if havign the signal blocked isn't what the application wants, I'd 
suggest one of:
 - just set the signal mask by hand to whatever mask you want (perhaps 
   also marking the signal handler with SIGIGN or SIGDFL or whatever)
 - alternatively, if you control the program being execve'd, just do it in 
   that progam instead.
 - use siglongjmp in the signal handler to get out of the signal handler 
   context and do it that way.
 - use a "sigatomic_t" flag, set it in the signal handler, and then do the 
   execve() in the main loop if it's set.

The last one is the safest one in many ways (since it doesn't care if you 
get a hundred of those signals in close succession - and you could also 
make it a counter or something if you want to actually count those 
things).

		Linus

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/2] fix the long standing exec vs kill race
  2007-12-02 15:14 [PATCH 0/2] fix the long standing exec vs kill race Oleg Nesterov
  2007-12-02 17:06 ` Simon Holm Thøgersen
@ 2007-12-03 16:37 ` Linus Torvalds
  2007-12-03 17:41   ` Oleg Nesterov
  1 sibling, 1 reply; 8+ messages in thread
From: Linus Torvalds @ 2007-12-03 16:37 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Davide Libenzi, Ingo Molnar, Roland McGrath, linux-kernel



On Sun, 2 Dec 2007, Oleg Nesterov wrote:
>
> Depends on
> 	[PATCH] __group_complete_signal: fix coredump with group stop race
> 	http://marc.info/?l=linux-kernel&m=119653436116036
> 
> Needs review and testing.
> 
> Please comment, I think at least the idea is promising.

It looks clean and sane to me, but I'm currently more worried about 
2.6.24, and even the first patch this depends on (coredump/stop race) 
makes me a bit nervous since all these things tend to have some rather 
subtle interactions with other parts that depended on the exact semantics 
of all the signal issues.

So my gut feel - considering that none of the problems involved here are 
exactly new - is that this is good material for early in the 2.6.25 cycle.

But I think the whole series looks ok, and if people press me and convince 
me it's (a) well tested and (b) needed early, then I guess it can be 
pushed into 2.6.24.

Anybody?

		Linus

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/2] fix the long standing exec vs kill race
  2007-12-03 16:37 ` Linus Torvalds
@ 2007-12-03 17:41   ` Oleg Nesterov
  0 siblings, 0 replies; 8+ messages in thread
From: Oleg Nesterov @ 2007-12-03 17:41 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Davide Libenzi, Ingo Molnar, Roland McGrath, linux-kernel

On 12/03, Linus Torvalds wrote:
> 
> On Sun, 2 Dec 2007, Oleg Nesterov wrote:
> >
> > Depends on
> > 	[PATCH] __group_complete_signal: fix coredump with group stop race
> > 	http://marc.info/?l=linux-kernel&m=119653436116036
> > 
> > Needs review and testing.
> > 
> > Please comment, I think at least the idea is promising.
> 
> It looks clean and sane to me, but I'm currently more worried about 
> 2.6.24, and even the first patch this depends on (coredump/stop race) 
> makes me a bit nervous since all these things tend to have some rather 
> subtle interactions with other parts that depended on the exact semantics 
> of all the signal issues.
> 
> So my gut feel - considering that none of the problems involved here are 
> exactly new - is that this is good material for early in the 2.6.25 cycle.
> 
> But I think the whole series looks ok, and if people press me and convince 
> me it's (a) well tested and (b) needed early, then I guess it can be 
> pushed into 2.6.24.

No, no, I don't think this should be pushed into 2.6.24 (even the first patch).

These problems are very old afaics, and nobody complained so far.

Even if correct, this needs more testing. I don't think this can break exec
or coredump in some "obvious" way, but I'm afraid this can introduce new
races / corner cases.

<offtopic>

I hope that with the new meaning of ->group_exit_task we can re-introduce the
"coredump signal "freezes" the thread group at sender's side" property, but we
need some hack to do this. OTOH, it was always a hack.

</offtopic>

Oleg.


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2007-12-03 17:41 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-12-02 15:14 [PATCH 0/2] fix the long standing exec vs kill race Oleg Nesterov
2007-12-02 17:06 ` Simon Holm Thøgersen
2007-12-02 17:18   ` Oleg Nesterov
2007-12-02 18:08     ` Simon Holm Thøgersen
2007-12-02 18:52       ` Oleg Nesterov
2007-12-02 20:26     ` Linus Torvalds
2007-12-03 16:37 ` Linus Torvalds
2007-12-03 17:41   ` Oleg Nesterov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).