linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] kernel/exit: do panic earlier to get coredump if global init task exit
@ 2019-12-12  6:14 qiwuchen55
  2019-12-12  9:51 ` Oleg Nesterov
  0 siblings, 1 reply; 6+ messages in thread
From: qiwuchen55 @ 2019-12-12  6:14 UTC (permalink / raw)
  To: peterz, mingo, oleg, christian.brauner
  Cc: kernel-team, linux-kernel, chenqiwu

From: chenqiwu <chenqiwu@xiaomi.com>

When global init task get a chance to be killed, panic will happen in
later calling steps by do_exit()->exit_notify()->forget_original_parent()
->find_child_reaper() if all init threads have exited.

However, it's hard to extract the coredump of init task from a kernel
crashdump, since exit_mm() has released its mm before panic. In order
to get the backtrace of init task in userspace, it's better to do panic
earlier at the beginning of exitting route.

It's worth noting that we must take case of a multi-threaded init exitting
issue. We need the test for signal_group_exit() to ensure that it is all
threads exiting and not just the current thread.

Of course testing signal_group_exit() is not sufficient. It is still
possible that this is someone calling exit(2) in an old binary or someone
calling pthread_exit(3) from the last thread of init. To handle that case,
we need the test to be:

(signal_group_exit(tsk->signal) || thread_group_empty(tsk)).

Because exiting the final thread of init should also trigger the panic.

Signed-off-by: chenqiwu <chenqiwu@xiaomi.com>
---
 kernel/exit.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index e10de98..ba7d1aa 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -578,10 +578,6 @@ static struct task_struct *find_child_reaper(struct task_struct *father,
 	}
 
 	write_unlock_irq(&tasklist_lock);
-	if (unlikely(pid_ns == &init_pid_ns)) {
-		panic("Attempted to kill init! exitcode=0x%08x\n",
-			father->signal->group_exit_code ?: father->exit_code);
-	}
 
 	list_for_each_entry_safe(p, n, dead, ptrace_entry) {
 		list_del_init(&p->ptrace_entry);
@@ -785,6 +781,9 @@ void __noreturn do_exit(long code)
 		panic("Aiee, killing interrupt handler!");
 	if (unlikely(!tsk->pid))
 		panic("Attempted to kill the idle task!");
+	if (unlikely(is_global_init(tsk) &&
+		(signal_group_exit(tsk->signal) || thread_group_empty(tsk))))
+		panic("Attempted to kill init! exitcode=0x%08lx\n", code);
 
 	/*
 	 * If do_exit is called because this processes oopsed, it's possible
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] kernel/exit: do panic earlier to get coredump if global init task exit
  2019-12-12  6:14 [PATCH] kernel/exit: do panic earlier to get coredump if global init task exit qiwuchen55
@ 2019-12-12  9:51 ` Oleg Nesterov
  2019-12-12 10:08   ` Oleg Nesterov
  0 siblings, 1 reply; 6+ messages in thread
From: Oleg Nesterov @ 2019-12-12  9:51 UTC (permalink / raw)
  To: qiwuchen55
  Cc: peterz, mingo, christian.brauner, kernel-team, linux-kernel, chenqiwu

On 12/12, qiwuchen55@gmail.com wrote:
>
> Of course testing signal_group_exit() is not sufficient. It is still
> possible that this is someone calling exit(2)

Or execve(), so

> @@ -785,6 +781,9 @@ void __noreturn do_exit(long code)
>  		panic("Aiee, killing interrupt handler!");
>  	if (unlikely(!tsk->pid))
>  		panic("Attempted to kill the idle task!");
> +	if (unlikely(is_global_init(tsk) &&
> +		(signal_group_exit(tsk->signal) || thread_group_empty(tsk))))
> +		panic("Attempted to kill init! exitcode=0x%08lx\n", code);

so this can panic() if one of init's threads does does exec.

Oleg.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] kernel/exit: do panic earlier to get coredump if global init task exit
  2019-12-12  9:51 ` Oleg Nesterov
@ 2019-12-12 10:08   ` Oleg Nesterov
  2019-12-12 11:05     ` Christian Brauner
  0 siblings, 1 reply; 6+ messages in thread
From: Oleg Nesterov @ 2019-12-12 10:08 UTC (permalink / raw)
  To: qiwuchen55
  Cc: peterz, mingo, christian.brauner, kernel-team, linux-kernel, chenqiwu

can't you use is_global_init() && group_dead ?

On 12/12, Oleg Nesterov wrote:
>
> On 12/12, qiwuchen55@gmail.com wrote:
> >
> > Of course testing signal_group_exit() is not sufficient. It is still
> > possible that this is someone calling exit(2)
> 
> Or execve(), so
> 
> > @@ -785,6 +781,9 @@ void __noreturn do_exit(long code)
> >  		panic("Aiee, killing interrupt handler!");
> >  	if (unlikely(!tsk->pid))
> >  		panic("Attempted to kill the idle task!");
> > +	if (unlikely(is_global_init(tsk) &&
> > +		(signal_group_exit(tsk->signal) || thread_group_empty(tsk))))
> > +		panic("Attempted to kill init! exitcode=0x%08lx\n", code);
> 
> so this can panic() if one of init's threads does does exec.
> 
> Oleg.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] kernel/exit: do panic earlier to get coredump if global init task exit
  2019-12-12 10:08   ` Oleg Nesterov
@ 2019-12-12 11:05     ` Christian Brauner
  2019-12-14  6:27       ` chenqiwu
  0 siblings, 1 reply; 6+ messages in thread
From: Christian Brauner @ 2019-12-12 11:05 UTC (permalink / raw)
  To: Oleg Nesterov, qiwuchen55
  Cc: qiwuchen55, peterz, mingo, kernel-team, linux-kernel, chenqiwu

On Thu, Dec 12, 2019 at 11:08:38AM +0100, Oleg Nesterov wrote:
> can't you use is_global_init() && group_dead ?

Seems reasonable.
Looks like we can move
group_dead = atomic_dec_and_test(&tsk->signal->live);
further up...

(Ideally I'd like to have a test for this to ensure that this lets you
capture a global init coredump but that might be tricky. But since you've
seem to have run into this case maybe you even have something that could
be turned into a test? (Similar to how we already have a purely opt-in
test for pstore.))

Christian

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] kernel/exit: do panic earlier to get coredump if global init task exit
  2019-12-12 11:05     ` Christian Brauner
@ 2019-12-14  6:27       ` chenqiwu
  2019-12-14 11:48         ` Christian Brauner
  0 siblings, 1 reply; 6+ messages in thread
From: chenqiwu @ 2019-12-14  6:27 UTC (permalink / raw)
  To: Christian Brauner, peterz, mingo; +Cc: kernel-team, linux-kernel, chenqiwu

On Thu, Dec 12, 2019 at 12:05:14PM +0100, Christian Brauner wrote:
> On Thu, Dec 12, 2019 at 11:08:38AM +0100, Oleg Nesterov wrote:
> > can't you use is_global_init() && group_dead ?
> 
> Seems reasonable.
> Looks like we can move
> group_dead = atomic_dec_and_test(&tsk->signal->live);
> further up...
> 
> (Ideally I'd like to have a test for this to ensure that this lets you
> capture a global init coredump but that might be tricky. But since you've
> seem to have run into this case maybe you even have something that could
> be turned into a test? (Similar to how we already have a purely opt-in
> test for pstore.))
> 
> Christian

Hi all,
I agree that using is_global_init() && group_dead is more reasonable.

The crash isuee happened on a Android phone by reboot stress test.
panic log:
[   84.048521] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[   84.048521]
[   84.048540] CPU: 2 PID: 1 Comm: init Tainted: G S         O    4.14.117-perf-g8035d1a #1
[   84.048544] Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150 RAPHAEL (DT)
[   84.048550] Call trace:
[   84.048564]  dump_backtrace+0x0/0x268
[   84.048569]  show_stack+0x14/0x20
[   84.048577]  dump_stack+0xc4/0x100
[   84.048584]  panic+0x1f0/0x410
[   84.048591]  complete_and_exit+0x0/0x20
[   84.048596]  do_group_exit+0x8c/0xa0
[   84.048602]  get_signal+0x1c0/0x790
[   84.048608]  do_notify_resume+0x184/0xc30
[   84.048613]  work_pending+0x8/0x10

From the kdump loaded by crash utility, all threads of global init have exited,
the group_dead value of global init has truned to 0 by atomic_dec_and_test().
crash> ps init
   PID    PPID  CPU       TASK        ST  %MEM     VSZ    RSS  COMM
>     1      0   2  ffffffcd77526000  ??   0.0       0      0  init
    534      1   4  ffffffcd6b9a9000  ZO   0.0       0      0  init
    535      1   4  ffffffcd6b9aa000  ZO   0.0       0      0  init
crash> ps -g 1
PID: 1      TASK: ffffffcd77526000  CPU: 2   COMMAND: "init"
  (no threads)
crash> struct task_struct.signal ffffffcd77526000
  signal = 0xffffffcd77530000
crash> struct signal_struct 0xffffffcd77530000
struct signal_struct {
  sigcnt = {
    counter = 1
  },
  live = {
    counter = 0
  },
  nr_threads = 1,
  thread_head = {
    next = 0xffffffcd77526730,
    prev = 0xffffffcd77526730
  },
  group_exit_code = 11,
  notify_count = 0,
  group_exit_task = 0x0,
  group_stop_count = 0,
  flags = 4,
  ...
 }

However, as Christian said, the test for this is tricky since we must
make sure all of init threads exited. I make a test for is_global_init()
and send a SIGSEGV signal to global init task in userspace. The phone
crash imeddiately and reboot to collect kdump. Then I extract the coredump
of global init task from kdump successfully.
(gdb) core-file core.1.init
Core was generated by `/system/bin/init second_stage'.
#0  _exit () at bionic/libc/arch-arm64/syscalls/_exit.S:9
9           cmn     x0, #(MAX_ERRNO + 1)
(gdb) bt
#0  _exit () at bionic/libc/arch-arm64/syscalls/_exit.S:9
#1  0x00000055606db11c in android::init::InstallRebootSignalHandlers()::$_14::operator()(int) const (this=<optimized out>, signal=11)
    at system/core/init/reboot_utils.cpp:141
#2  0x00000055606db100 in android::init::InstallRebootSignalHandlers()::$_14::__invoke(int) (signal=11) at system/core/init/reboot_utils.cpp:138
#3  0x0000007f8de236a0 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

So from the following test, we have confidence that the following patch can help us
to get the extra coredump of init when global init task do real exit.

diff --git a/kernel/exit.c b/kernel/exit.c
index 8e288e8..9454106 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -476,10 +476,6 @@ static struct task_struct *find_child_reaper(struct task_struct *father,
        }

        write_unlock_irq(&tasklist_lock);
-       if (unlikely(pid_ns == &init_pid_ns)) {
-               panic("Attempted to kill init! exitcode=0x%08x\n",
-                       father->signal->group_exit_code ?: father->exit_code);
-       }

        list_for_each_entry_safe(p, n, dead, ptrace_entry) {
                list_del_init(&p->ptrace_entry);
@@ -687,6 +683,10 @@ void do_exit(long code)
        if (unlikely(!tsk->pid))
                panic("Attempted to kill the idle task!");

+       group_dead = atomic_dec_and_test(&tsk->signal->live);
+       if (unlikely(is_global_init(tsk) && group_dead))
+               panic("Attempted to kill init! exitcode=0x%08lx\n", code);
+
        /*
         * If do_exit is called because this processes oopsed, it's possible
         * that get_fs() was left as KERNEL_DS, so reset it to USER_DS before
@@ -743,7 +743,6 @@ void do_exit(long code)
        if (tsk->mm)
                sync_mm_rss(tsk->mm);
        acct_update_integrals(tsk);
-       group_dead = atomic_dec_and_test(&tsk->signal->live);
        if (group_dead) {
                hrtimer_cancel(&tsk->signal->real_timer);
                exit_itimers(tsk->signal);



^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] kernel/exit: do panic earlier to get coredump if global init task exit
  2019-12-14  6:27       ` chenqiwu
@ 2019-12-14 11:48         ` Christian Brauner
  0 siblings, 0 replies; 6+ messages in thread
From: Christian Brauner @ 2019-12-14 11:48 UTC (permalink / raw)
  To: chenqiwu; +Cc: peterz, mingo, kernel-team, linux-kernel, chenqiwu, oleg

On Sat, Dec 14, 2019 at 02:27:04PM +0800, chenqiwu wrote:
> On Thu, Dec 12, 2019 at 12:05:14PM +0100, Christian Brauner wrote:
> > On Thu, Dec 12, 2019 at 11:08:38AM +0100, Oleg Nesterov wrote:
> > > can't you use is_global_init() && group_dead ?
> > 
> > Seems reasonable.
> > Looks like we can move
> > group_dead = atomic_dec_and_test(&tsk->signal->live);
> > further up...
> > 
> > (Ideally I'd like to have a test for this to ensure that this lets you
> > capture a global init coredump but that might be tricky. But since you've
> > seem to have run into this case maybe you even have something that could
> > be turned into a test? (Similar to how we already have a purely opt-in
> > test for pstore.))
> > 
> > Christian
> 
> Hi all,
> I agree that using is_global_init() && group_dead is more reasonable.
> 
> The crash isuee happened on a Android phone by reboot stress test.
> panic log:
> [   84.048521] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
> [   84.048521]
> [   84.048540] CPU: 2 PID: 1 Comm: init Tainted: G S         O    4.14.117-perf-g8035d1a #1
> [   84.048544] Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150 RAPHAEL (DT)
> [   84.048550] Call trace:
> [   84.048564]  dump_backtrace+0x0/0x268
> [   84.048569]  show_stack+0x14/0x20
> [   84.048577]  dump_stack+0xc4/0x100
> [   84.048584]  panic+0x1f0/0x410
> [   84.048591]  complete_and_exit+0x0/0x20
> [   84.048596]  do_group_exit+0x8c/0xa0
> [   84.048602]  get_signal+0x1c0/0x790
> [   84.048608]  do_notify_resume+0x184/0xc30
> [   84.048613]  work_pending+0x8/0x10
> 
> From the kdump loaded by crash utility, all threads of global init have exited,
> the group_dead value of global init has truned to 0 by atomic_dec_and_test().
> crash> ps init
>    PID    PPID  CPU       TASK        ST  %MEM     VSZ    RSS  COMM
> >     1      0   2  ffffffcd77526000  ??   0.0       0      0  init
>     534      1   4  ffffffcd6b9a9000  ZO   0.0       0      0  init
>     535      1   4  ffffffcd6b9aa000  ZO   0.0       0      0  init
> crash> ps -g 1
> PID: 1      TASK: ffffffcd77526000  CPU: 2   COMMAND: "init"
>   (no threads)
> crash> struct task_struct.signal ffffffcd77526000
>   signal = 0xffffffcd77530000
> crash> struct signal_struct 0xffffffcd77530000
> struct signal_struct {
>   sigcnt = {
>     counter = 1
>   },
>   live = {
>     counter = 0
>   },
>   nr_threads = 1,
>   thread_head = {
>     next = 0xffffffcd77526730,
>     prev = 0xffffffcd77526730
>   },
>   group_exit_code = 11,
>   notify_count = 0,
>   group_exit_task = 0x0,
>   group_stop_count = 0,
>   flags = 4,
>   ...
>  }
> 
> However, as Christian said, the test for this is tricky since we must
> make sure all of init threads exited. I make a test for is_global_init()
> and send a SIGSEGV signal to global init task in userspace. The phone
> crash imeddiately and reboot to collect kdump. Then I extract the coredump
> of global init task from kdump successfully.
> (gdb) core-file core.1.init
> Core was generated by `/system/bin/init second_stage'.
> #0  _exit () at bionic/libc/arch-arm64/syscalls/_exit.S:9
> 9           cmn     x0, #(MAX_ERRNO + 1)
> (gdb) bt
> #0  _exit () at bionic/libc/arch-arm64/syscalls/_exit.S:9
> #1  0x00000055606db11c in android::init::InstallRebootSignalHandlers()::$_14::operator()(int) const (this=<optimized out>, signal=11)
>     at system/core/init/reboot_utils.cpp:141
> #2  0x00000055606db100 in android::init::InstallRebootSignalHandlers()::$_14::__invoke(int) (signal=11) at system/core/init/reboot_utils.cpp:138
> #3  0x0000007f8de236a0 in ?? ()
> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
> 
> So from the following test, we have confidence that the following patch can help us

Thanks. Can you please resend this as a proper patch for v2?

Christian

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-12-14 11:53 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-12  6:14 [PATCH] kernel/exit: do panic earlier to get coredump if global init task exit qiwuchen55
2019-12-12  9:51 ` Oleg Nesterov
2019-12-12 10:08   ` Oleg Nesterov
2019-12-12 11:05     ` Christian Brauner
2019-12-14  6:27       ` chenqiwu
2019-12-14 11:48         ` Christian Brauner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).