All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: kernel panic on kill(0, SIGTERM) with PGID == 0
       [not found] <4BE01C86.3050908@secunet.com>
@ 2010-05-09 18:45 ` Oleg Nesterov
  2010-05-09 19:06   ` Oleg Nesterov
  2010-05-10  7:20   ` Mathias Krause
  0 siblings, 2 replies; 18+ messages in thread
From: Oleg Nesterov @ 2010-05-09 18:45 UTC (permalink / raw)
  To: Mathias Krause; +Cc: Eric W. Biederman, linux-kernel

sorry for delay, vacation.

On 05/04, Mathias Krause wrote:
>
> Hi Oleg, Hi Eric,
>
> I stumbled across a nasty bug related to the special init I'm using
> (cinit) and a process trying to kill it's process group. That always ends
> in a kernel NULL pointer dereference. git bisect brought me to that
> commit:
>
> | commit 430c623121ea88ca80595c99fdc63b7f8a803ae5
> | Author: Oleg Nesterov <oleg@tv-sign.ru>
> | Date:   Fri Feb 8 04:19:11 2008 -0800
> |
> | start the global /sbin/init with 0,0 special pids
> |
> | As Eric pointed out, there is no problem with init starting with sid == pgid
> | == 0, and this was historical linux behavior changed in 2.6.18.
> |
> | Remove kernel_init()->__set_special_pids(), this is unneeded and complicates
> | the rules for sys_setsid().
> |
> | This change and the previous change in daemonize() mean that /sbin/init does
> | not need the special "session != 1" hack in sys_setsid() any longer. We can'
> | remove this check yet, we should cleanup copy_process(CLONE_NEWPID) first, s
> | update the comment only.
> |
> | Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
> | Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
> | Cc: Pavel Emelyanov <xemul@openvz.org>
> | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> | Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
>
> Well, it actually is a problem for my setup. If neither init nor any of
> the programs init starts ever change the PGID, all will live in the
> process group 0. That's bad when some of the started programs tries to
> kill its process group. That will in fact kill _all_ processes. So far so
> bad.

sorry again, I'll try to comment this later...

And I think this should be discussed on lkml, cc'ed.

> But it even gets worser because process group 0 contains some
> special processes, like swapper (PID: 0). Normally swapper will never be
> reachable for userland because PID 0 is handled special by kill(2) but
> killing the current process group while having a PGID of 0 will also try
> to kill those special processes like swapper. This ends in the following
> kernel null pointer deref:
>
> [    3.595820] BUG: unable to handle kernel NULL pointer dereference at 000003a8

Thanks Mathias.

I think this should be fixed anyway. Could you try the patch below?

In any case swapper should be immune to signals, and its ->thread_group
should be properly initiallized (the patch does only this).

> [    3.595820]  [<c012b45b>] __group_send_sig_info+0x7b/0xa0
> [    3.595820]  [<c012b5bd>] group_send_sig_info+0x5d/0x80
> [    3.595820]  [<c012b628>] __kill_pgrp_info+0x48/0x70
> [    3.595820]  [<c012b679>] kill_pgrp_info+0x29/0x40

Looks like, you kernel is old. Any chance you can also test the recent
kernel?

> May be a minor bug, because it can be work around by calling setpgid(0,0)
> in init

setpgid(0,0) just moves the caller's pgrp from PGID 0, that is why it
helps.

> but I think it should be fixed, anyway.

Completely agreed.

> A reproducer is attached. It contains a substitute for init that triggers
> the bug.

Thanks.

I didn't try it, but it looks overcomplicated to trigger this bug, or
I missed something? Afaics, init could be just

	int main(void)
	{
		kill(0, SIGGKILL);
	}

No?

Oleg.

We should also change INIT_SIGHAND, but _hopefully_ this is enough
to fix the crash.

--- x/include/linux/init_task.h
+++ x/include/linux/init_task.h
@@ -172,6 +172,7 @@ extern struct cred init_cred;
 		[PIDTYPE_PGID] = INIT_PID_LINK(PIDTYPE_PGID),		\
 		[PIDTYPE_SID]  = INIT_PID_LINK(PIDTYPE_SID),		\
 	},								\
+	.thread_group	= LIST_HEAD_INIT(tsk.thread_group),		\
 	.dirties = INIT_PROP_LOCAL_SINGLE(dirties),			\
 	INIT_IDS							\
 	INIT_PERF_EVENTS(tsk)						\



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: kernel panic on kill(0, SIGTERM) with PGID == 0
  2010-05-09 18:45 ` kernel panic on kill(0, SIGTERM) with PGID == 0 Oleg Nesterov
@ 2010-05-09 19:06   ` Oleg Nesterov
  2010-05-10  7:20   ` Mathias Krause
  1 sibling, 0 replies; 18+ messages in thread
From: Oleg Nesterov @ 2010-05-09 19:06 UTC (permalink / raw)
  To: Mathias Krause; +Cc: Eric W. Biederman, linux-kernel

forgot to mention...

On 05/09, Oleg Nesterov wrote:
>
> We should also change INIT_SIGHAND, but _hopefully_ this is enough
> to fix the crash.

Also. I strongly believe we should change INIT_STRUCT_PID.tasks, it
should be hlist_empty(). But this needs another discussion.

Oleg.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: kernel panic on kill(0, SIGTERM) with PGID == 0
  2010-05-09 18:45 ` kernel panic on kill(0, SIGTERM) with PGID == 0 Oleg Nesterov
  2010-05-09 19:06   ` Oleg Nesterov
@ 2010-05-10  7:20   ` Mathias Krause
  2010-05-10 19:49     ` [PATCH 0/4] swapper fixes (Was: kernel panic on kill(0, SIGTERM) with PGID == 0) Oleg Nesterov
  1 sibling, 1 reply; 18+ messages in thread
From: Mathias Krause @ 2010-05-10  7:20 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Eric W. Biederman, linux-kernel

Hello  Oleg,

Oleg Nesterov wrote:
> sorry for delay, vacation.
>

No problem. Thanks for replying.

>> But it even gets worser because process group 0 contains some
>> special processes, like swapper (PID: 0). Normally swapper will never be
>> reachable for userland because PID 0 is handled special by kill(2) but
>> killing the current process group while having a PGID of 0 will also try
>> to kill those special processes like swapper. This ends in the following
>> kernel null pointer deref:
>>
>> [    3.595820] BUG: unable to handle kernel NULL pointer dereference at 000003a8
> 
> Thanks Mathias.
> 
> I think this should be fixed anyway. Could you try the patch below?

See below.

> 
> In any case swapper should be immune to signals, and its ->thread_group
> should be properly initiallized (the patch does only this).
> 
>> [    3.595820]  [<c012b45b>] __group_send_sig_info+0x7b/0xa0
>> [    3.595820]  [<c012b5bd>] group_send_sig_info+0x5d/0x80
>> [    3.595820]  [<c012b628>] __kill_pgrp_info+0x48/0x70
>> [    3.595820]  [<c012b679>] kill_pgrp_info+0x29/0x40
> 
> Looks like, you kernel is old. Any chance you can also test the recent
> kernel?
> 

It's old because it's the result of bisecting the cause of the problem.
It's actually some 2.6.24 kernel but I could reproduce the bug with
2.6.34-rc4, too.

>> May be a minor bug, because it can be work around by calling setpgid(0,0)
>> in init
> 
> setpgid(0,0) just moves the caller's pgrp from PGID 0, that is why it
> helps.
> 

Right.

>> but I think it should be fixed, anyway.
> 
> Completely agreed.
> 
>> A reproducer is attached. It contains a substitute for init that triggers
>> the bug.
> 
> Thanks.
> 
> I didn't try it, but it looks overcomplicated to trigger this bug, or
> I missed something? Afaics, init could be just
> 
> 	int main(void)
> 	{
> 		kill(0, SIGGKILL);
> 	}
> 
> No?
> 

Yes, sure. Killing the process group, while having a PGID of 0 are the
only prerequisites to trigger this bug. In my example I forked a child
and let it do the call to kill to not have init  (PID 1) beeing killed,
too. The kernel doesn't like that. :)
But your example should also work.


> Oleg.
> 
> We should also change INIT_SIGHAND, but _hopefully_ this is enough
> to fix the crash.
> 
> --- x/include/linux/init_task.h
> +++ x/include/linux/init_task.h
> @@ -172,6 +172,7 @@ extern struct cred init_cred;
>  		[PIDTYPE_PGID] = INIT_PID_LINK(PIDTYPE_PGID),		\
>  		[PIDTYPE_SID]  = INIT_PID_LINK(PIDTYPE_SID),		\
>  	},								\
> +	.thread_group	= LIST_HEAD_INIT(tsk.thread_group),		\
>  	.dirties = INIT_PROP_LOCAL_SINGLE(dirties),			\
>  	INIT_IDS							\
>  	INIT_PERF_EVENTS(tsk)						\
> 
> 

This works for me. Thanks.

Tested-by: Mathias Krause <mathias.krause@secunet.com>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 0/4] swapper fixes (Was: kernel panic on kill(0, SIGTERM) with PGID == 0)
  2010-05-10  7:20   ` Mathias Krause
@ 2010-05-10 19:49     ` Oleg Nesterov
  2010-05-10 19:49       ` [PATCH 1/4] INIT_TASK() should initialize ->thread_group list Oleg Nesterov
                         ` (5 more replies)
  0 siblings, 6 replies; 18+ messages in thread
From: Oleg Nesterov @ 2010-05-10 19:49 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Cedric Le Goater, Dave Hansen, Eric Biederman, Herbert Poetzl,
	Ingo Molnar, Mathias Krause, Roland McGrath, Serge Hallyn,
	Sukadev Bhattiprolu, linux-kernel

Hello,

Mathias Krause reports that a buggy (or special) /sbin/init can
crash the kernel if it sends a signal to its pgrp/sid before it
changes its initial (0,0) pids. See the changelog for 1/4.

git-bisect blames "start the global /sbin/init with 0,0 special pids"
commit 430c623121ea88ca80595c99fdc63b7f8a803ae5, but in fact the
problem was caused by another change, see 2/4.

The patches do not depend on each other, 3/4 fixes another problem,
4/4 is purely cosmetic.

Oleg.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/4] INIT_TASK() should initialize ->thread_group list
  2010-05-10 19:49     ` [PATCH 0/4] swapper fixes (Was: kernel panic on kill(0, SIGTERM) with PGID == 0) Oleg Nesterov
@ 2010-05-10 19:49       ` Oleg Nesterov
  2010-05-11  7:52         ` Serge E. Hallyn
  2010-05-12  2:15         ` Sukadev Bhattiprolu
  2010-05-10 19:50       ` [PATCH 2/4] pids: init_struct_pid.tasks should never see the swapper process Oleg Nesterov
                         ` (4 subsequent siblings)
  5 siblings, 2 replies; 18+ messages in thread
From: Oleg Nesterov @ 2010-05-10 19:49 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Cedric Le Goater, Dave Hansen, Eric Biederman, Herbert Poetzl,
	Ingo Molnar, Mathias Krause, Roland McGrath, Serge Hallyn,
	Sukadev Bhattiprolu, linux-kernel

The trivial /sbin/init doing

	int main(void)
	{
		kill(0, SIGKILL)
	}

crashes the kernel.

This happens because __kill_pgrp_info(init_struct_pid) also sends SIGKILL
to the swapper process which runs with the uninitialized ->thread_group.

Change INIT_TASK() to initialize ->thread_group properly.

Note: the real problem is that the swapper process must not be visible to
signals, see the next patch. But this change is right anyway and fixes
the crash.

Reported-and-tested-by: Mathias Krause <mathias.krause@secunet.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---

 include/linux/init_task.h |    1 +
 1 file changed, 1 insertion(+)

--- 34-rc1/include/linux/init_task.h~1_INIT_TASK_THREAD_GROUP	2010-05-10 19:44:19.000000000 +0200
+++ 34-rc1/include/linux/init_task.h	2010-05-10 19:45:27.000000000 +0200
@@ -172,6 +172,7 @@ extern struct cred init_cred;
 		[PIDTYPE_PGID] = INIT_PID_LINK(PIDTYPE_PGID),		\
 		[PIDTYPE_SID]  = INIT_PID_LINK(PIDTYPE_SID),		\
 	},								\
+	.thread_group	= LIST_HEAD_INIT(tsk.thread_group),		\
 	.dirties = INIT_PROP_LOCAL_SINGLE(dirties),			\
 	INIT_IDS							\
 	INIT_PERF_EVENTS(tsk)						\


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 2/4] pids: init_struct_pid.tasks should never see the swapper process
  2010-05-10 19:49     ` [PATCH 0/4] swapper fixes (Was: kernel panic on kill(0, SIGTERM) with PGID == 0) Oleg Nesterov
  2010-05-10 19:49       ` [PATCH 1/4] INIT_TASK() should initialize ->thread_group list Oleg Nesterov
@ 2010-05-10 19:50       ` Oleg Nesterov
  2010-05-11  9:54         ` Serge E. Hallyn
  2010-05-10 19:50       ` [PATCH 3/4] pids: fix fork_idle() to setup ->pids correctly Oleg Nesterov
                         ` (3 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Oleg Nesterov @ 2010-05-10 19:50 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Cedric Le Goater, Dave Hansen, Eric Biederman, Herbert Poetzl,
	Ingo Molnar, Mathias Krause, Roland McGrath, Serge Hallyn,
	Sukadev Bhattiprolu, linux-kernel

"statically initialize struct pid for swapper" commit 820e45db says:

	Statically initialize a struct pid for the swapper process (pid_t == 0)
	and attach it to init_task.  This is needed so task_pid(), task_pgrp()
	and task_session() interfaces work on the swapper process also.

OK, but:

	- it doesn't make sense to add init_task.pids[].node into
	  init_struct_pid.tasks[], and in fact this just wrong.

	  idle threads are special, they shouldn't be visible on any
	  global list. In particular do_each_pid_task(init_struct_pid)
	  shouldn't see swapper.

	  This is the actual reason why kill(0, SIGKILL) from /sbin/init
	  (which starts with 0,0 special pids) crashes the kernel. The
	  signal sent to pgid/sid == 0 must never see idle threads, even
	  if the previous patch fixed the crash itself.

	- we have other idle threads running on the non-boot CPUs, see
	  the next patch.

Change INIT_STRUCT_PID/INIT_PID_LINK to create the empty/unhashed
hlist_head/hlist_node. Like any other idle thread swapper can never exit,
so detach_pid()->__hlist_del() is not possible, but we could change
INIT_PID_LINK() to set pprev = &next if needed.

All we need is the valid swapper->pids[].pid == &init_struct_pid.

Reported-by: Mathias Krause <mathias.krause@secunet.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---

 include/linux/init_task.h |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--- 34-rc1/include/linux/init_task.h~2_INIT_PID_DONT_LINK_SWAPPER	2010-05-10 19:45:27.000000000 +0200
+++ 34-rc1/include/linux/init_task.h	2010-05-10 20:26:08.000000000 +0200
@@ -53,9 +53,9 @@ extern struct group_info init_groups;
 #define INIT_STRUCT_PID {						\
 	.count 		= ATOMIC_INIT(1),				\
 	.tasks		= {						\
-		{ .first = &init_task.pids[PIDTYPE_PID].node },		\
-		{ .first = &init_task.pids[PIDTYPE_PGID].node },	\
-		{ .first = &init_task.pids[PIDTYPE_SID].node },		\
+		{ .first = NULL },					\
+		{ .first = NULL },					\
+		{ .first = NULL },					\
 	},								\
 	.rcu		= RCU_HEAD_INIT,				\
 	.level		= 0,						\
@@ -70,7 +70,7 @@ extern struct group_info init_groups;
 {								\
 	.node = {						\
 		.next = NULL,					\
-		.pprev = &init_struct_pid.tasks[type].first,	\
+		.pprev = NULL,					\
 	},							\
 	.pid = &init_struct_pid,				\
 }


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 3/4] pids: fix fork_idle() to setup ->pids correctly
  2010-05-10 19:49     ` [PATCH 0/4] swapper fixes (Was: kernel panic on kill(0, SIGTERM) with PGID == 0) Oleg Nesterov
  2010-05-10 19:49       ` [PATCH 1/4] INIT_TASK() should initialize ->thread_group list Oleg Nesterov
  2010-05-10 19:50       ` [PATCH 2/4] pids: init_struct_pid.tasks should never see the swapper process Oleg Nesterov
@ 2010-05-10 19:50       ` Oleg Nesterov
  2010-05-11  8:54         ` Serge E. Hallyn
  2010-05-10 19:51       ` [PATCH 4/4] INIT_SIGHAND: use SIG_DFL instead of NULL Oleg Nesterov
                         ` (2 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Oleg Nesterov @ 2010-05-10 19:50 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Cedric Le Goater, Dave Hansen, Eric Biederman, Herbert Poetzl,
	Ingo Molnar, Mathias Krause, Roland McGrath, Serge Hallyn,
	Sukadev Bhattiprolu, linux-kernel

copy_process(pid => &init_struct_pid) doesn't do attach_pid/etc.

It shouldn't, but this means that the idle threads run with the wrong
pids copied from the caller's task_struct. In x86 case the caller is
either kernel_init() thread or keventd.

In particular, this means that after the series of cpu_up/cpu_down an
idle thread (which never exits) can run with .pid pointing to nowhere.

Change fork_idle() to initialize idle->pids[] correctly. We only set
.pid = &init_struct_pid but do not add .node to list, INIT_TASK() does
the same for the boot-cpu idle thread (swapper).

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---

 kernel/fork.c |   14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

--- 34-rc1/kernel/fork.c~3_FORK_IDLE_SET_PIDS	2010-03-24 18:07:03.000000000 +0100
+++ 34-rc1/kernel/fork.c	2010-05-10 20:45:33.000000000 +0200
@@ -1339,6 +1339,16 @@ noinline struct pt_regs * __cpuinit __at
 	return regs;
 }
 
+static inline void init_idle_pids(struct pid_link *links)
+{
+	enum pid_type type;
+
+	for (type = PIDTYPE_PID; type < PIDTYPE_MAX; ++type) {
+		INIT_HLIST_NODE(&links[type].node); /* not really needed */
+		links[type].pid = &init_struct_pid;
+	}
+}
+
 struct task_struct * __cpuinit fork_idle(int cpu)
 {
 	struct task_struct *task;
@@ -1346,8 +1356,10 @@ struct task_struct * __cpuinit fork_idle
 
 	task = copy_process(CLONE_VM, 0, idle_regs(&regs), 0, NULL,
 			    &init_struct_pid, 0);
-	if (!IS_ERR(task))
+	if (!IS_ERR(task)) {
+		init_idle_pids(task->pids);
 		init_idle(task, cpu);
+	}
 
 	return task;
 }


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 4/4] INIT_SIGHAND: use SIG_DFL instead of NULL
  2010-05-10 19:49     ` [PATCH 0/4] swapper fixes (Was: kernel panic on kill(0, SIGTERM) with PGID == 0) Oleg Nesterov
                         ` (2 preceding siblings ...)
  2010-05-10 19:50       ` [PATCH 3/4] pids: fix fork_idle() to setup ->pids correctly Oleg Nesterov
@ 2010-05-10 19:51       ` Oleg Nesterov
  2010-05-11  8:54         ` Serge E. Hallyn
  2010-05-10 21:08       ` [PATCH 0/4] swapper fixes (Was: kernel panic on kill(0, SIGTERM) with PGID == 0) Andrew Morton
  2010-05-10 23:55       ` Roland McGrath
  5 siblings, 1 reply; 18+ messages in thread
From: Oleg Nesterov @ 2010-05-10 19:51 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Cedric Le Goater, Dave Hansen, Eric Biederman, Herbert Poetzl,
	Ingo Molnar, Mathias Krause, Roland McGrath, Serge Hallyn,
	Sukadev Bhattiprolu, linux-kernel

Cosmetic, no changes in the compiled code. Just s/NULL/SIG_DFL/ to make
it more readable and grep-friendly.

Note: probably SIG_IGN makes more sense, we could kill ignore_signals().
But then kernel_init() should do flush_signal_handlers() before exec().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---

 include/linux/init_task.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- 34-rc1/include/linux/init_task.h~4_INIT_SIGHAND_USE_SIG_DFL	2010-05-10 20:26:08.000000000 +0200
+++ 34-rc1/include/linux/init_task.h	2010-05-10 21:06:03.000000000 +0200
@@ -43,7 +43,7 @@ extern struct nsproxy init_nsproxy;
 
 #define INIT_SIGHAND(sighand) {						\
 	.count		= ATOMIC_INIT(1), 				\
-	.action		= { { { .sa_handler = NULL, } }, },		\
+	.action		= { { { .sa_handler = SIG_DFL, } }, },		\
 	.siglock	= __SPIN_LOCK_UNLOCKED(sighand.siglock),	\
 	.signalfd_wqh	= __WAIT_QUEUE_HEAD_INITIALIZER(sighand.signalfd_wqh),	\
 }


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/4] swapper fixes (Was: kernel panic on kill(0, SIGTERM) with PGID == 0)
  2010-05-10 19:49     ` [PATCH 0/4] swapper fixes (Was: kernel panic on kill(0, SIGTERM) with PGID == 0) Oleg Nesterov
                         ` (3 preceding siblings ...)
  2010-05-10 19:51       ` [PATCH 4/4] INIT_SIGHAND: use SIG_DFL instead of NULL Oleg Nesterov
@ 2010-05-10 21:08       ` Andrew Morton
  2010-05-10 21:41         ` Oleg Nesterov
  2010-05-10 23:55       ` Roland McGrath
  5 siblings, 1 reply; 18+ messages in thread
From: Andrew Morton @ 2010-05-10 21:08 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Cedric Le Goater, Dave Hansen, Eric Biederman, Herbert Poetzl,
	Ingo Molnar, Mathias Krause, Roland McGrath, Serge Hallyn,
	Sukadev Bhattiprolu, linux-kernel

On Mon, 10 May 2010 21:49:17 +0200
Oleg Nesterov <oleg@redhat.com> wrote:

> Hello,
> 
> Mathias Krause reports that a buggy (or special) /sbin/init can
> crash the kernel if it sends a signal to its pgrp/sid before it
> changes its initial (0,0) pids. See the changelog for 1/4.
> 
> git-bisect blames "start the global /sbin/init with 0,0 special pids"
> commit 430c623121ea88ca80595c99fdc63b7f8a803ae5, but in fact the
> problem was caused by another change, see 2/4.
> 
> The patches do not depend on each other, 3/4 fixes another problem,
> 4/4 is purely cosmetic.
> 

Do you see a need to merge these into 2.6.34?  (I don't)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/4] swapper fixes (Was: kernel panic on kill(0, SIGTERM) with PGID == 0)
  2010-05-10 21:08       ` [PATCH 0/4] swapper fixes (Was: kernel panic on kill(0, SIGTERM) with PGID == 0) Andrew Morton
@ 2010-05-10 21:41         ` Oleg Nesterov
  0 siblings, 0 replies; 18+ messages in thread
From: Oleg Nesterov @ 2010-05-10 21:41 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Cedric Le Goater, Dave Hansen, Eric Biederman, Herbert Poetzl,
	Ingo Molnar, Mathias Krause, Roland McGrath, Serge Hallyn,
	Sukadev Bhattiprolu, linux-kernel

On 05/10, Andrew Morton wrote:
>
> On Mon, 10 May 2010 21:49:17 +0200
> Oleg Nesterov <oleg@redhat.com> wrote:
>
> > Hello,
> >
> > Mathias Krause reports that a buggy (or special) /sbin/init can
> > crash the kernel if it sends a signal to its pgrp/sid before it
> > changes its initial (0,0) pids. See the changelog for 1/4.
> >
> > git-bisect blames "start the global /sbin/init with 0,0 special pids"
> > commit 430c623121ea88ca80595c99fdc63b7f8a803ae5, but in fact the
> > problem was caused by another change, see 2/4.
> >
> > The patches do not depend on each other, 3/4 fixes another problem,
> > 4/4 is purely cosmetic.
> >
>
> Do you see a need to merge these into 2.6.34?  (I don't)

No, the problem is minor, it is not possible to exploit it unless
/sbin/init does "bad things".

And the long CC asks for review. Although 1/4 is "obviously good" in
any case and I strongly believe 2/4 is right at least in general.

Oleg.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/4] swapper fixes (Was: kernel panic on kill(0, SIGTERM) with PGID == 0)
  2010-05-10 19:49     ` [PATCH 0/4] swapper fixes (Was: kernel panic on kill(0, SIGTERM) with PGID == 0) Oleg Nesterov
                         ` (4 preceding siblings ...)
  2010-05-10 21:08       ` [PATCH 0/4] swapper fixes (Was: kernel panic on kill(0, SIGTERM) with PGID == 0) Andrew Morton
@ 2010-05-10 23:55       ` Roland McGrath
  5 siblings, 0 replies; 18+ messages in thread
From: Roland McGrath @ 2010-05-10 23:55 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Cedric Le Goater, Dave Hansen, Eric Biederman,
	Herbert Poetzl, Ingo Molnar, Mathias Krause, Serge Hallyn,
	Sukadev Bhattiprolu, linux-kernel

All 4
Acked-by: Roland McGrath <roland@redhat.com>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/4] INIT_TASK() should initialize ->thread_group list
  2010-05-10 19:49       ` [PATCH 1/4] INIT_TASK() should initialize ->thread_group list Oleg Nesterov
@ 2010-05-11  7:52         ` Serge E. Hallyn
  2010-05-12  2:15         ` Sukadev Bhattiprolu
  1 sibling, 0 replies; 18+ messages in thread
From: Serge E. Hallyn @ 2010-05-11  7:52 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Cedric Le Goater, Dave Hansen, Eric Biederman,
	Herbert Poetzl, Ingo Molnar, Mathias Krause, Roland McGrath,
	Sukadev Bhattiprolu, linux-kernel

Quoting Oleg Nesterov (oleg@redhat.com):
> The trivial /sbin/init doing
> 
> 	int main(void)
> 	{
> 		kill(0, SIGKILL)
> 	}
> 
> crashes the kernel.
> 
> This happens because __kill_pgrp_info(init_struct_pid) also sends SIGKILL
> to the swapper process which runs with the uninitialized ->thread_group.
> 
> Change INIT_TASK() to initialize ->thread_group properly.
> 
> Note: the real problem is that the swapper process must not be visible to
> signals, see the next patch. But this change is right anyway and fixes
> the crash.
> 
> Reported-and-tested-by: Mathias Krause <mathias.krause@secunet.com>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

Acked-by: Serge E. Hallyn <serue@us.ibm.com>

> ---
> 
>  include/linux/init_task.h |    1 +
>  1 file changed, 1 insertion(+)
> 
> --- 34-rc1/include/linux/init_task.h~1_INIT_TASK_THREAD_GROUP	2010-05-10 19:44:19.000000000 +0200
> +++ 34-rc1/include/linux/init_task.h	2010-05-10 19:45:27.000000000 +0200
> @@ -172,6 +172,7 @@ extern struct cred init_cred;
>  		[PIDTYPE_PGID] = INIT_PID_LINK(PIDTYPE_PGID),		\
>  		[PIDTYPE_SID]  = INIT_PID_LINK(PIDTYPE_SID),		\
>  	},								\
> +	.thread_group	= LIST_HEAD_INIT(tsk.thread_group),		\
>  	.dirties = INIT_PROP_LOCAL_SINGLE(dirties),			\
>  	INIT_IDS							\
>  	INIT_PERF_EVENTS(tsk)						\

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/4] pids: fix fork_idle() to setup ->pids correctly
  2010-05-10 19:50       ` [PATCH 3/4] pids: fix fork_idle() to setup ->pids correctly Oleg Nesterov
@ 2010-05-11  8:54         ` Serge E. Hallyn
  0 siblings, 0 replies; 18+ messages in thread
From: Serge E. Hallyn @ 2010-05-11  8:54 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Cedric Le Goater, Dave Hansen, Eric Biederman,
	Herbert Poetzl, Ingo Molnar, Mathias Krause, Roland McGrath,
	Sukadev Bhattiprolu, linux-kernel

Quoting Oleg Nesterov (oleg@redhat.com):
> copy_process(pid => &init_struct_pid) doesn't do attach_pid/etc.
> 
> It shouldn't, but this means that the idle threads run with the wrong
> pids copied from the caller's task_struct. In x86 case the caller is
> either kernel_init() thread or keventd.
> 
> In particular, this means that after the series of cpu_up/cpu_down an
> idle thread (which never exits) can run with .pid pointing to nowhere.
> 
> Change fork_idle() to initialize idle->pids[] correctly. We only set
> .pid = &init_struct_pid but do not add .node to list, INIT_TASK() does
> the same for the boot-cpu idle thread (swapper).
> 
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

Acked-by: Serge E. Hallyn <serue@us.ibm.com>

> ---
> 
>  kernel/fork.c |   14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
> 
> --- 34-rc1/kernel/fork.c~3_FORK_IDLE_SET_PIDS	2010-03-24 18:07:03.000000000 +0100
> +++ 34-rc1/kernel/fork.c	2010-05-10 20:45:33.000000000 +0200
> @@ -1339,6 +1339,16 @@ noinline struct pt_regs * __cpuinit __at
>  	return regs;
>  }
> 
> +static inline void init_idle_pids(struct pid_link *links)
> +{
> +	enum pid_type type;
> +
> +	for (type = PIDTYPE_PID; type < PIDTYPE_MAX; ++type) {
> +		INIT_HLIST_NODE(&links[type].node); /* not really needed */
> +		links[type].pid = &init_struct_pid;
> +	}
> +}
> +
>  struct task_struct * __cpuinit fork_idle(int cpu)
>  {
>  	struct task_struct *task;
> @@ -1346,8 +1356,10 @@ struct task_struct * __cpuinit fork_idle
> 
>  	task = copy_process(CLONE_VM, 0, idle_regs(&regs), 0, NULL,
>  			    &init_struct_pid, 0);
> -	if (!IS_ERR(task))
> +	if (!IS_ERR(task)) {
> +		init_idle_pids(task->pids);
>  		init_idle(task, cpu);
> +	}
> 
>  	return task;
>  }

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/4] INIT_SIGHAND: use SIG_DFL instead of NULL
  2010-05-10 19:51       ` [PATCH 4/4] INIT_SIGHAND: use SIG_DFL instead of NULL Oleg Nesterov
@ 2010-05-11  8:54         ` Serge E. Hallyn
  0 siblings, 0 replies; 18+ messages in thread
From: Serge E. Hallyn @ 2010-05-11  8:54 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Cedric Le Goater, Dave Hansen, Eric Biederman,
	Herbert Poetzl, Ingo Molnar, Mathias Krause, Roland McGrath,
	Sukadev Bhattiprolu, linux-kernel

Quoting Oleg Nesterov (oleg@redhat.com):
> Cosmetic, no changes in the compiled code. Just s/NULL/SIG_DFL/ to make
> it more readable and grep-friendly.
> 
> Note: probably SIG_IGN makes more sense, we could kill ignore_signals().
> But then kernel_init() should do flush_signal_handlers() before exec().
> 
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

Acked-by: Serge E. Hallyn <serue@us.ibm.com>

> ---
> 
>  include/linux/init_task.h |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> --- 34-rc1/include/linux/init_task.h~4_INIT_SIGHAND_USE_SIG_DFL	2010-05-10 20:26:08.000000000 +0200
> +++ 34-rc1/include/linux/init_task.h	2010-05-10 21:06:03.000000000 +0200
> @@ -43,7 +43,7 @@ extern struct nsproxy init_nsproxy;
> 
>  #define INIT_SIGHAND(sighand) {						\
>  	.count		= ATOMIC_INIT(1), 				\
> -	.action		= { { { .sa_handler = NULL, } }, },		\
> +	.action		= { { { .sa_handler = SIG_DFL, } }, },		\
>  	.siglock	= __SPIN_LOCK_UNLOCKED(sighand.siglock),	\
>  	.signalfd_wqh	= __WAIT_QUEUE_HEAD_INITIALIZER(sighand.signalfd_wqh),	\
>  }

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/4] pids: init_struct_pid.tasks should never see the swapper process
  2010-05-10 19:50       ` [PATCH 2/4] pids: init_struct_pid.tasks should never see the swapper process Oleg Nesterov
@ 2010-05-11  9:54         ` Serge E. Hallyn
  2010-05-12 16:03           ` Oleg Nesterov
  0 siblings, 1 reply; 18+ messages in thread
From: Serge E. Hallyn @ 2010-05-11  9:54 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Cedric Le Goater, Dave Hansen, Eric Biederman,
	Herbert Poetzl, Ingo Molnar, Mathias Krause, Roland McGrath,
	Sukadev Bhattiprolu, linux-kernel

Quoting Oleg Nesterov (oleg@redhat.com):
> "statically initialize struct pid for swapper" commit 820e45db says:
> 
> 	Statically initialize a struct pid for the swapper process (pid_t == 0)
> 	and attach it to init_task.  This is needed so task_pid(), task_pgrp()
> 	and task_session() interfaces work on the swapper process also.
> 
> OK, but:
> 
> 	- it doesn't make sense to add init_task.pids[].node into
> 	  init_struct_pid.tasks[], and in fact this just wrong.
> 
> 	  idle threads are special, they shouldn't be visible on any
> 	  global list. In particular do_each_pid_task(init_struct_pid)
> 	  shouldn't see swapper.
> 
> 	  This is the actual reason why kill(0, SIGKILL) from /sbin/init
> 	  (which starts with 0,0 special pids) crashes the kernel. The
> 	  signal sent to pgid/sid == 0 must never see idle threads, even
> 	  if the previous patch fixed the crash itself.
> 
> 	- we have other idle threads running on the non-boot CPUs, see
> 	  the next patch.
> Change INIT_STRUCT_PID/INIT_PID_LINK to create the empty/unhashed
> hlist_head/hlist_node. Like any other idle thread swapper can never exit,
> so detach_pid()->__hlist_del() is not possible, but we could change
> INIT_PID_LINK() to set pprev = &next if needed.
> 
> All we need is the valid swapper->pids[].pid == &init_struct_pid.
> 
> Reported-by: Mathias Krause <mathias.krause@secunet.com>

Crimey, trying to find some way this could get dereferenced, finding
myself impressed with the likes of set_ftrace_swapper().

Anyway, not finding anything, so

> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

Acked-by: Serge E. Hallyn <serue@us.ibm.com>

thanks,
-serge

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/4] INIT_TASK() should initialize ->thread_group list
  2010-05-10 19:49       ` [PATCH 1/4] INIT_TASK() should initialize ->thread_group list Oleg Nesterov
  2010-05-11  7:52         ` Serge E. Hallyn
@ 2010-05-12  2:15         ` Sukadev Bhattiprolu
  2010-05-12 15:54           ` Oleg Nesterov
  1 sibling, 1 reply; 18+ messages in thread
From: Sukadev Bhattiprolu @ 2010-05-12  2:15 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Cedric Le Goater, Dave Hansen, Eric Biederman,
	Herbert Poetzl, Ingo Molnar, Mathias Krause, Roland McGrath,
	Serge Hallyn, Sukadev Bhattiprolu, linux-kernel

Oleg Nesterov [oleg@redhat.com] wrote:
| The trivial /sbin/init doing
| 
| 	int main(void)
| 	{
| 		kill(0, SIGKILL)
| 	}
| 
| crashes the kernel.

Really subtle. Good catch.

So, now init is not part of any process group until it calls setsid().
So the above SIGKILL is lost right ? - i.e it does not kill even init
itself.

In my quick test, the following init process lives on inspite of the
SIGKILL.

	main()
	{
		kill(0, SIGKILL);

		while(1)
			sleep(1); 
	}

I don't have a better solution. Maybe a hung init is better than a
crashed kernel. the patches look good.

Acked-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/4] INIT_TASK() should initialize ->thread_group list
  2010-05-12  2:15         ` Sukadev Bhattiprolu
@ 2010-05-12 15:54           ` Oleg Nesterov
  0 siblings, 0 replies; 18+ messages in thread
From: Oleg Nesterov @ 2010-05-12 15:54 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: Andrew Morton, Cedric Le Goater, Dave Hansen, Eric Biederman,
	Herbert Poetzl, Ingo Molnar, Mathias Krause, Roland McGrath,
	Serge Hallyn, Sukadev Bhattiprolu, linux-kernel

On 05/11, Sukadev Bhattiprolu wrote:
>
> Oleg Nesterov [oleg@redhat.com] wrote:
> | The trivial /sbin/init doing
> |
> | 	int main(void)
> | 	{
> | 		kill(0, SIGKILL)
> | 	}
> |
> | crashes the kernel.
>
> Really subtle. Good catch.

Thanks to Mathias ;)

> So, now init is not part of any process group until it calls setsid().
> So the above SIGKILL is lost right ? - i.e it does not kill even init
> itself.

No, no. swapper != init. With or without these patches (more precisely,
the next patch) /sbin/init still belongs to the 0 pgrp/sid.

> In my quick test, the following init process lives on inspite of the
> SIGKILL.

Yes, /sbin/init is not killable, that is why it survies.

But:

> 	main()
> 	{
> 		kill(0, SIGKILL);
>
> 		while(1)
> 			sleep(1);
> 	}

Yes. if /sbin/init exits the kernel panics. The real test-case shouldn't
exit, like your example.

> I don't have a better solution. Maybe a hung init is better than a
> crashed kernel.

Agreed!!! I sent the patch a long ago. But security people do not
like it, they use exit() from init to provoke the crash intentionally.
And I still think they are wrong, but this is another story.

> Acked-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>

Thanks!

Oleg.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/4] pids: init_struct_pid.tasks should never see the swapper process
  2010-05-11  9:54         ` Serge E. Hallyn
@ 2010-05-12 16:03           ` Oleg Nesterov
  0 siblings, 0 replies; 18+ messages in thread
From: Oleg Nesterov @ 2010-05-12 16:03 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Andrew Morton, Cedric Le Goater, Dave Hansen, Eric Biederman,
	Herbert Poetzl, Ingo Molnar, Mathias Krause, Roland McGrath,
	Sukadev Bhattiprolu, linux-kernel

On 05/11, Serge E. Hallyn wrote:
>
> Quoting Oleg Nesterov (oleg@redhat.com):
> > Change INIT_STRUCT_PID/INIT_PID_LINK to create the empty/unhashed
> > hlist_head/hlist_node. Like any other idle thread swapper can never exit,
> > so detach_pid()->__hlist_del() is not possible, but we could change
> > INIT_PID_LINK() to set pprev = &next if needed.
> >
> > All we need is the valid swapper->pids[].pid == &init_struct_pid.
> >
> > Reported-by: Mathias Krause <mathias.krause@secunet.com>
>
> Crimey, trying to find some way this could get dereferenced,

Yes, I was worried too. But afaics we should never use this hlist_node.
Except, of course, it is linked into pid->task.

> finding
> myself impressed with the likes of set_ftrace_swapper().
>
> Anyway, not finding anything, so
>
> > Signed-off-by: Oleg Nesterov <oleg@redhat.com>
>
> Acked-by: Serge E. Hallyn <serue@us.ibm.com>

Thanks for review!

Oleg.


^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2010-05-12 16:07 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <4BE01C86.3050908@secunet.com>
2010-05-09 18:45 ` kernel panic on kill(0, SIGTERM) with PGID == 0 Oleg Nesterov
2010-05-09 19:06   ` Oleg Nesterov
2010-05-10  7:20   ` Mathias Krause
2010-05-10 19:49     ` [PATCH 0/4] swapper fixes (Was: kernel panic on kill(0, SIGTERM) with PGID == 0) Oleg Nesterov
2010-05-10 19:49       ` [PATCH 1/4] INIT_TASK() should initialize ->thread_group list Oleg Nesterov
2010-05-11  7:52         ` Serge E. Hallyn
2010-05-12  2:15         ` Sukadev Bhattiprolu
2010-05-12 15:54           ` Oleg Nesterov
2010-05-10 19:50       ` [PATCH 2/4] pids: init_struct_pid.tasks should never see the swapper process Oleg Nesterov
2010-05-11  9:54         ` Serge E. Hallyn
2010-05-12 16:03           ` Oleg Nesterov
2010-05-10 19:50       ` [PATCH 3/4] pids: fix fork_idle() to setup ->pids correctly Oleg Nesterov
2010-05-11  8:54         ` Serge E. Hallyn
2010-05-10 19:51       ` [PATCH 4/4] INIT_SIGHAND: use SIG_DFL instead of NULL Oleg Nesterov
2010-05-11  8:54         ` Serge E. Hallyn
2010-05-10 21:08       ` [PATCH 0/4] swapper fixes (Was: kernel panic on kill(0, SIGTERM) with PGID == 0) Andrew Morton
2010-05-10 21:41         ` Oleg Nesterov
2010-05-10 23:55       ` Roland McGrath

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.