* [PATCH] kernel/exit: do panic earlier to get coredump if global init task exit @ 2019-12-12 6:14 qiwuchen55 2019-12-12 9:51 ` Oleg Nesterov 0 siblings, 1 reply; 6+ messages in thread From: qiwuchen55 @ 2019-12-12 6:14 UTC (permalink / raw) To: peterz, mingo, oleg, christian.brauner Cc: kernel-team, linux-kernel, chenqiwu From: chenqiwu <chenqiwu@xiaomi.com> When global init task get a chance to be killed, panic will happen in later calling steps by do_exit()->exit_notify()->forget_original_parent() ->find_child_reaper() if all init threads have exited. However, it's hard to extract the coredump of init task from a kernel crashdump, since exit_mm() has released its mm before panic. In order to get the backtrace of init task in userspace, it's better to do panic earlier at the beginning of exitting route. It's worth noting that we must take case of a multi-threaded init exitting issue. We need the test for signal_group_exit() to ensure that it is all threads exiting and not just the current thread. Of course testing signal_group_exit() is not sufficient. It is still possible that this is someone calling exit(2) in an old binary or someone calling pthread_exit(3) from the last thread of init. To handle that case, we need the test to be: (signal_group_exit(tsk->signal) || thread_group_empty(tsk)). Because exiting the final thread of init should also trigger the panic. Signed-off-by: chenqiwu <chenqiwu@xiaomi.com> --- kernel/exit.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/kernel/exit.c b/kernel/exit.c index e10de98..ba7d1aa 100644 --- a/kernel/exit.c +++ b/kernel/exit.c @@ -578,10 +578,6 @@ static struct task_struct *find_child_reaper(struct task_struct *father, } write_unlock_irq(&tasklist_lock); - if (unlikely(pid_ns == &init_pid_ns)) { - panic("Attempted to kill init! exitcode=0x%08x\n", - father->signal->group_exit_code ?: father->exit_code); - } list_for_each_entry_safe(p, n, dead, ptrace_entry) { list_del_init(&p->ptrace_entry); @@ -785,6 +781,9 @@ void __noreturn do_exit(long code) panic("Aiee, killing interrupt handler!"); if (unlikely(!tsk->pid)) panic("Attempted to kill the idle task!"); + if (unlikely(is_global_init(tsk) && + (signal_group_exit(tsk->signal) || thread_group_empty(tsk)))) + panic("Attempted to kill init! exitcode=0x%08lx\n", code); /* * If do_exit is called because this processes oopsed, it's possible -- 1.9.1 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] kernel/exit: do panic earlier to get coredump if global init task exit 2019-12-12 6:14 [PATCH] kernel/exit: do panic earlier to get coredump if global init task exit qiwuchen55 @ 2019-12-12 9:51 ` Oleg Nesterov 2019-12-12 10:08 ` Oleg Nesterov 0 siblings, 1 reply; 6+ messages in thread From: Oleg Nesterov @ 2019-12-12 9:51 UTC (permalink / raw) To: qiwuchen55 Cc: peterz, mingo, christian.brauner, kernel-team, linux-kernel, chenqiwu On 12/12, qiwuchen55@gmail.com wrote: > > Of course testing signal_group_exit() is not sufficient. It is still > possible that this is someone calling exit(2) Or execve(), so > @@ -785,6 +781,9 @@ void __noreturn do_exit(long code) > panic("Aiee, killing interrupt handler!"); > if (unlikely(!tsk->pid)) > panic("Attempted to kill the idle task!"); > + if (unlikely(is_global_init(tsk) && > + (signal_group_exit(tsk->signal) || thread_group_empty(tsk)))) > + panic("Attempted to kill init! exitcode=0x%08lx\n", code); so this can panic() if one of init's threads does does exec. Oleg. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] kernel/exit: do panic earlier to get coredump if global init task exit 2019-12-12 9:51 ` Oleg Nesterov @ 2019-12-12 10:08 ` Oleg Nesterov 2019-12-12 11:05 ` Christian Brauner 0 siblings, 1 reply; 6+ messages in thread From: Oleg Nesterov @ 2019-12-12 10:08 UTC (permalink / raw) To: qiwuchen55 Cc: peterz, mingo, christian.brauner, kernel-team, linux-kernel, chenqiwu can't you use is_global_init() && group_dead ? On 12/12, Oleg Nesterov wrote: > > On 12/12, qiwuchen55@gmail.com wrote: > > > > Of course testing signal_group_exit() is not sufficient. It is still > > possible that this is someone calling exit(2) > > Or execve(), so > > > @@ -785,6 +781,9 @@ void __noreturn do_exit(long code) > > panic("Aiee, killing interrupt handler!"); > > if (unlikely(!tsk->pid)) > > panic("Attempted to kill the idle task!"); > > + if (unlikely(is_global_init(tsk) && > > + (signal_group_exit(tsk->signal) || thread_group_empty(tsk)))) > > + panic("Attempted to kill init! exitcode=0x%08lx\n", code); > > so this can panic() if one of init's threads does does exec. > > Oleg. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] kernel/exit: do panic earlier to get coredump if global init task exit 2019-12-12 10:08 ` Oleg Nesterov @ 2019-12-12 11:05 ` Christian Brauner 2019-12-14 6:27 ` chenqiwu 0 siblings, 1 reply; 6+ messages in thread From: Christian Brauner @ 2019-12-12 11:05 UTC (permalink / raw) To: Oleg Nesterov, qiwuchen55 Cc: qiwuchen55, peterz, mingo, kernel-team, linux-kernel, chenqiwu On Thu, Dec 12, 2019 at 11:08:38AM +0100, Oleg Nesterov wrote: > can't you use is_global_init() && group_dead ? Seems reasonable. Looks like we can move group_dead = atomic_dec_and_test(&tsk->signal->live); further up... (Ideally I'd like to have a test for this to ensure that this lets you capture a global init coredump but that might be tricky. But since you've seem to have run into this case maybe you even have something that could be turned into a test? (Similar to how we already have a purely opt-in test for pstore.)) Christian ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] kernel/exit: do panic earlier to get coredump if global init task exit 2019-12-12 11:05 ` Christian Brauner @ 2019-12-14 6:27 ` chenqiwu 2019-12-14 11:48 ` Christian Brauner 0 siblings, 1 reply; 6+ messages in thread From: chenqiwu @ 2019-12-14 6:27 UTC (permalink / raw) To: Christian Brauner, peterz, mingo; +Cc: kernel-team, linux-kernel, chenqiwu On Thu, Dec 12, 2019 at 12:05:14PM +0100, Christian Brauner wrote: > On Thu, Dec 12, 2019 at 11:08:38AM +0100, Oleg Nesterov wrote: > > can't you use is_global_init() && group_dead ? > > Seems reasonable. > Looks like we can move > group_dead = atomic_dec_and_test(&tsk->signal->live); > further up... > > (Ideally I'd like to have a test for this to ensure that this lets you > capture a global init coredump but that might be tricky. But since you've > seem to have run into this case maybe you even have something that could > be turned into a test? (Similar to how we already have a purely opt-in > test for pstore.)) > > Christian Hi all, I agree that using is_global_init() && group_dead is more reasonable. The crash isuee happened on a Android phone by reboot stress test. panic log: [ 84.048521] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b [ 84.048521] [ 84.048540] CPU: 2 PID: 1 Comm: init Tainted: G S O 4.14.117-perf-g8035d1a #1 [ 84.048544] Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150 RAPHAEL (DT) [ 84.048550] Call trace: [ 84.048564] dump_backtrace+0x0/0x268 [ 84.048569] show_stack+0x14/0x20 [ 84.048577] dump_stack+0xc4/0x100 [ 84.048584] panic+0x1f0/0x410 [ 84.048591] complete_and_exit+0x0/0x20 [ 84.048596] do_group_exit+0x8c/0xa0 [ 84.048602] get_signal+0x1c0/0x790 [ 84.048608] do_notify_resume+0x184/0xc30 [ 84.048613] work_pending+0x8/0x10 From the kdump loaded by crash utility, all threads of global init have exited, the group_dead value of global init has truned to 0 by atomic_dec_and_test(). crash> ps init PID PPID CPU TASK ST %MEM VSZ RSS COMM > 1 0 2 ffffffcd77526000 ?? 0.0 0 0 init 534 1 4 ffffffcd6b9a9000 ZO 0.0 0 0 init 535 1 4 ffffffcd6b9aa000 ZO 0.0 0 0 init crash> ps -g 1 PID: 1 TASK: ffffffcd77526000 CPU: 2 COMMAND: "init" (no threads) crash> struct task_struct.signal ffffffcd77526000 signal = 0xffffffcd77530000 crash> struct signal_struct 0xffffffcd77530000 struct signal_struct { sigcnt = { counter = 1 }, live = { counter = 0 }, nr_threads = 1, thread_head = { next = 0xffffffcd77526730, prev = 0xffffffcd77526730 }, group_exit_code = 11, notify_count = 0, group_exit_task = 0x0, group_stop_count = 0, flags = 4, ... } However, as Christian said, the test for this is tricky since we must make sure all of init threads exited. I make a test for is_global_init() and send a SIGSEGV signal to global init task in userspace. The phone crash imeddiately and reboot to collect kdump. Then I extract the coredump of global init task from kdump successfully. (gdb) core-file core.1.init Core was generated by `/system/bin/init second_stage'. #0 _exit () at bionic/libc/arch-arm64/syscalls/_exit.S:9 9 cmn x0, #(MAX_ERRNO + 1) (gdb) bt #0 _exit () at bionic/libc/arch-arm64/syscalls/_exit.S:9 #1 0x00000055606db11c in android::init::InstallRebootSignalHandlers()::$_14::operator()(int) const (this=<optimized out>, signal=11) at system/core/init/reboot_utils.cpp:141 #2 0x00000055606db100 in android::init::InstallRebootSignalHandlers()::$_14::__invoke(int) (signal=11) at system/core/init/reboot_utils.cpp:138 #3 0x0000007f8de236a0 in ?? () Backtrace stopped: previous frame identical to this frame (corrupt stack?) So from the following test, we have confidence that the following patch can help us to get the extra coredump of init when global init task do real exit. diff --git a/kernel/exit.c b/kernel/exit.c index 8e288e8..9454106 100644 --- a/kernel/exit.c +++ b/kernel/exit.c @@ -476,10 +476,6 @@ static struct task_struct *find_child_reaper(struct task_struct *father, } write_unlock_irq(&tasklist_lock); - if (unlikely(pid_ns == &init_pid_ns)) { - panic("Attempted to kill init! exitcode=0x%08x\n", - father->signal->group_exit_code ?: father->exit_code); - } list_for_each_entry_safe(p, n, dead, ptrace_entry) { list_del_init(&p->ptrace_entry); @@ -687,6 +683,10 @@ void do_exit(long code) if (unlikely(!tsk->pid)) panic("Attempted to kill the idle task!"); + group_dead = atomic_dec_and_test(&tsk->signal->live); + if (unlikely(is_global_init(tsk) && group_dead)) + panic("Attempted to kill init! exitcode=0x%08lx\n", code); + /* * If do_exit is called because this processes oopsed, it's possible * that get_fs() was left as KERNEL_DS, so reset it to USER_DS before @@ -743,7 +743,6 @@ void do_exit(long code) if (tsk->mm) sync_mm_rss(tsk->mm); acct_update_integrals(tsk); - group_dead = atomic_dec_and_test(&tsk->signal->live); if (group_dead) { hrtimer_cancel(&tsk->signal->real_timer); exit_itimers(tsk->signal); ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] kernel/exit: do panic earlier to get coredump if global init task exit 2019-12-14 6:27 ` chenqiwu @ 2019-12-14 11:48 ` Christian Brauner 0 siblings, 0 replies; 6+ messages in thread From: Christian Brauner @ 2019-12-14 11:48 UTC (permalink / raw) To: chenqiwu; +Cc: peterz, mingo, kernel-team, linux-kernel, chenqiwu, oleg On Sat, Dec 14, 2019 at 02:27:04PM +0800, chenqiwu wrote: > On Thu, Dec 12, 2019 at 12:05:14PM +0100, Christian Brauner wrote: > > On Thu, Dec 12, 2019 at 11:08:38AM +0100, Oleg Nesterov wrote: > > > can't you use is_global_init() && group_dead ? > > > > Seems reasonable. > > Looks like we can move > > group_dead = atomic_dec_and_test(&tsk->signal->live); > > further up... > > > > (Ideally I'd like to have a test for this to ensure that this lets you > > capture a global init coredump but that might be tricky. But since you've > > seem to have run into this case maybe you even have something that could > > be turned into a test? (Similar to how we already have a purely opt-in > > test for pstore.)) > > > > Christian > > Hi all, > I agree that using is_global_init() && group_dead is more reasonable. > > The crash isuee happened on a Android phone by reboot stress test. > panic log: > [ 84.048521] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b > [ 84.048521] > [ 84.048540] CPU: 2 PID: 1 Comm: init Tainted: G S O 4.14.117-perf-g8035d1a #1 > [ 84.048544] Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150 RAPHAEL (DT) > [ 84.048550] Call trace: > [ 84.048564] dump_backtrace+0x0/0x268 > [ 84.048569] show_stack+0x14/0x20 > [ 84.048577] dump_stack+0xc4/0x100 > [ 84.048584] panic+0x1f0/0x410 > [ 84.048591] complete_and_exit+0x0/0x20 > [ 84.048596] do_group_exit+0x8c/0xa0 > [ 84.048602] get_signal+0x1c0/0x790 > [ 84.048608] do_notify_resume+0x184/0xc30 > [ 84.048613] work_pending+0x8/0x10 > > From the kdump loaded by crash utility, all threads of global init have exited, > the group_dead value of global init has truned to 0 by atomic_dec_and_test(). > crash> ps init > PID PPID CPU TASK ST %MEM VSZ RSS COMM > > 1 0 2 ffffffcd77526000 ?? 0.0 0 0 init > 534 1 4 ffffffcd6b9a9000 ZO 0.0 0 0 init > 535 1 4 ffffffcd6b9aa000 ZO 0.0 0 0 init > crash> ps -g 1 > PID: 1 TASK: ffffffcd77526000 CPU: 2 COMMAND: "init" > (no threads) > crash> struct task_struct.signal ffffffcd77526000 > signal = 0xffffffcd77530000 > crash> struct signal_struct 0xffffffcd77530000 > struct signal_struct { > sigcnt = { > counter = 1 > }, > live = { > counter = 0 > }, > nr_threads = 1, > thread_head = { > next = 0xffffffcd77526730, > prev = 0xffffffcd77526730 > }, > group_exit_code = 11, > notify_count = 0, > group_exit_task = 0x0, > group_stop_count = 0, > flags = 4, > ... > } > > However, as Christian said, the test for this is tricky since we must > make sure all of init threads exited. I make a test for is_global_init() > and send a SIGSEGV signal to global init task in userspace. The phone > crash imeddiately and reboot to collect kdump. Then I extract the coredump > of global init task from kdump successfully. > (gdb) core-file core.1.init > Core was generated by `/system/bin/init second_stage'. > #0 _exit () at bionic/libc/arch-arm64/syscalls/_exit.S:9 > 9 cmn x0, #(MAX_ERRNO + 1) > (gdb) bt > #0 _exit () at bionic/libc/arch-arm64/syscalls/_exit.S:9 > #1 0x00000055606db11c in android::init::InstallRebootSignalHandlers()::$_14::operator()(int) const (this=<optimized out>, signal=11) > at system/core/init/reboot_utils.cpp:141 > #2 0x00000055606db100 in android::init::InstallRebootSignalHandlers()::$_14::__invoke(int) (signal=11) at system/core/init/reboot_utils.cpp:138 > #3 0x0000007f8de236a0 in ?? () > Backtrace stopped: previous frame identical to this frame (corrupt stack?) > > So from the following test, we have confidence that the following patch can help us Thanks. Can you please resend this as a proper patch for v2? Christian ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2019-12-14 11:53 UTC | newest] Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-12-12 6:14 [PATCH] kernel/exit: do panic earlier to get coredump if global init task exit qiwuchen55 2019-12-12 9:51 ` Oleg Nesterov 2019-12-12 10:08 ` Oleg Nesterov 2019-12-12 11:05 ` Christian Brauner 2019-12-14 6:27 ` chenqiwu 2019-12-14 11:48 ` Christian Brauner
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).