* kernel BUG at kernel/sched/core.c:3490!
From: Qian Cai @ 2019-01-01 5:44 UTC
  To: Ingo Molnar, Peter Zijlstra; +Cc: linux kernel

Running some mmap() workloads to put the system into a low-memory situation
with swapping and OOM triggers this BUG():

void __noreturn do_task_dead(void)
{
	/* Causes final put_task_struct in finish_task_switch(): */
	set_special_state(TASK_DEAD);

	/* Tell freezer to ignore us: */
	current->flags |= PF_NOFREEZE;

	__schedule(false);
	BUG();

	/* Avoid "noreturn function does return" - but don't continue if BUG() is a NOP: */
	for (;;)
		cpu_relax();
}

[ 422.863911] kernel BUG at kernel/sched/core.c:3490!
[ 422.868634] oom01 (3177) used greatest stack depth: 28712 bytes left
[ 422.869109] invalid opcode: 0000 [#1] SMP KASAN NOPTI
[ 422.880325] CPU: 86 PID: 3235 Comm: oom01 Kdump: loaded Tainted: G W 4.20.0+ #5
[ 422.888995] Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 09/07/2018
[ 422.897590] RIP: 0010:do_task_dead+0x73/0x90
[ 422.901893] Code: 48 c7 43 10 80 00 00 00 4c 89 ee 4c 89 e7 e8 34 26 8a 00 48 8d 7b 24 e8 3b b6 2e 00 81 4b 24 00 80 00 00 31 ff e8 8d 4c 89 00 <0f> 0b 48 c7 c7 40 2c 53 b1 e8 da e7 51 00 0f 1f 44 00 00 66 2e 0f
[ 422.920783] RSP: 0018:ffff888392daf5a8 EFLAGS: 00010282
[ 422.926048] RAX: 0000000000000000 RBX: ffff88810e23aec0 RCX: 0000000000000000
[ 422.933234] RDX: dffffc0000000000 RSI: dffffc0000000000 RDI: ffffed10725b5ea8
[ 422.940419] RBP: ffff888392daf5c0 R08: fffffbfff6338a76 R09: fffffbfff6338a75
[ 422.947604] R10: fffffbfff6338a75 R11: ffffffffb19c53ab R12: ffff88810e23b510
[ 422.954789] R13: 0000000000000246 R14: 0000000000000000 R15: ffff888638f636d8
[ 422.961974] FS: 00007f105ff5d700(0000) GS:ffff88905e300000(0000) knlGS:0000000000000000
[ 422.970118] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 422.975905] CR2: 00007f105ff5d9d0 CR3: 0000000a23e1e000 CR4: 00000000001406a0
[ 422.983087] Call Trace:
[ 422.985564] do_exit+0x95e/0xe30
[ 422.988821] ? dump_align+0x50/0x50
[ 422.992338] ? mm_update_next_owner+0x570/0x570
[ 422.996906] ? __kasan_slab_free+0x1af/0x210
[ 423.001207] ? kmem_cache_free+0xc0/0x350
[ 423.005247] ? __dequeue_signal+0x2bd/0x370
[ 423.009460] ? dequeue_signal+0x90/0x2d0
[ 423.013410] ? get_signal+0x296/0xf30
[ 423.017100] ? do_signal+0x93/0x9d0
[ 423.020619] ? exit_to_usermode_loop+0x130/0x170
[ 423.025271] ? prepare_exit_to_usermode+0x1d7/0x1f0
[ 423.030187] ? retint_user+0x8/0x18
[ 423.033704] ? trace_hardirqs_off+0x9d/0x230
[ 423.038005] ? trace_hardirqs_on_caller+0x230/0x230
[ 423.042919] ? do_raw_spin_trylock+0x180/0x180
[ 423.047396] ? do_raw_spin_lock+0xf0/0x1f0
[ 423.051524] ? rwlock_bug.part.0+0x60/0x60
[ 423.055656] ? __dequeue_signal+0x2bd/0x370
[ 423.059871] ? _raw_spin_unlock_irqrestore+0x34/0x50
[ 423.064873] ? __debug_check_no_obj_freed+0x204/0x330
[ 423.069963] ? debug_object_free+0x10/0x10
[ 423.074092] ? trace_hardirqs_on_caller+0x9f/0x230
[ 423.078920] ? check_stack_object+0x22/0x60
[ 423.083138] ? debug_lockdep_rcu_enabled+0x22/0x40
[ 423.087964] ? kmem_cache_free+0x22e/0x350
[ 423.092091] ? __dequeue_signal+0x2bd/0x370
[ 423.096309] ? debug_lockdep_rcu_enabled+0x22/0x40
[ 423.101137] ? get_signal+0x530/0xf30
[ 423.104828] ? __flush_itimer_signals+0x310/0x310
[ 423.109567] ? check_flags.part.18+0x220/0x220
[ 423.114045] ? recalc_sigpending+0x6e/0x110
[ 423.118258] ? __sigqueue_alloc+0x4e0/0x4e0
[ 423.122470] ? lockdep_hardirqs_on+0x11/0x290
[ 423.126861] do_group_exit+0xc1/0x1d0
[ 423.130552] ? __x64_sys_exit+0x30/0x30
[ 423.134416] get_signal+0x4a6/0xf30
[ 423.137933] ? ptrace_notify+0xb0/0xb0
[ 423.141708] ? force_sig_fault+0xb3/0xf0
[ 423.145659] ? force_sigsegv+0x90/0x90
[ 423.149440] ? set_signal_archinfo+0x6f/0xa0
[ 423.153738] ? __do_page_fault+0x6b1/0x6d0
[ 423.157863] ? mm_fault_error+0x140/0x140
[ 423.161903] do_signal+0x93/0x9d0
[ 423.165244] ? lockdep_hardirqs_on+0x11/0x290
[ 423.169632] ? trace_hardirqs_on+0x9d/0x230
[ 423.173846] ? ftrace_destroy_function_files+0x50/0x50
[ 423.179022] ? do_raw_spin_trylock+0x180/0x180
[ 423.183498] ? setup_sigcontext+0x260/0x260
[ 423.187712] ? do_page_fault+0x119/0x53c
[ 423.191663] ? lockdep_hardirqs_on+0x11/0x290
[ 423.196050] ? trace_hardirqs_on+0x9d/0x230
[ 423.200264] ? ftrace_destroy_function_files+0x50/0x50
[ 423.205441] ? task_work_run+0x118/0x1a0
[ 423.209393] ? mark_held_locks+0x23/0xb0
[ 423.213345] ? trace_hardirqs_on_thunk+0x1a/0x1c
[ 423.217999] ? retint_user+0x18/0x18
[ 423.221602] exit_to_usermode_loop+0x130/0x170
[ 423.226081] ? lockdep_sys_exit_thunk+0x29/0x29
[ 423.230647] prepare_exit_to_usermode+0x1d7/0x1f0
[ 423.235387] ? syscall_slow_exit_work+0x380/0x380
[ 423.240127] ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 423.244866] ? page_fault+0x5/0x20
[ 423.248294] retint_user+0x8/0x18
[ 423.251637] RIP: 0033:0x40f910
[ 423.254719] Code: Bad RIP value.
[ 423.257970] RSP: 002b:00007f105ff5cec0 EFLAGS: 00010206
[ 423.263230] RAX: 0000000000001000 RBX: 00000000c0000000 RCX: 00007f5dd40bd497
[ 423.270413] RDX: 0000000013ff8000 RSI: 00000000c0000000 RDI: 0000000000000000
[ 423.277594] RBP: 00007f0edef5c000 R08: 00000000ffffffff R09: 0000000000000000
[ 423.284777] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000000001
[ 423.291958] R13: 00007ffe090ed89f R14: 0000000000000000 R15: 00007f105ff5cfc0
[ 423.299141] Modules linked in: af_packet nls_iso8859_1 nls_cp437 vfat fat ses enclosure efivars ip_tables x_tables xfs libcrc32c crc32c_generic crypto_hash sd_mod smartpqi tg3 scsi_transport_sas mlx5_core libphy firmware_class dm_mirror dm_region_hash dm_log dm_mod efivarfs

* Re: kernel BUG at kernel/sched/core.c:3490!
From: Peter Zijlstra @ 2019-01-07 13:52 UTC
  To: Qian Cai; +Cc: Ingo Molnar, linux kernel, Oleg Nesterov, gkohli

On Tue, Jan 01, 2019 at 12:44:35AM -0500, Qian Cai wrote:
> Running some mmap() workloads to put the system into a low-memory situation
> with swapping and OOM triggers this BUG():
>
> void __noreturn do_task_dead(void)
> {
> 	/* Causes final put_task_struct in finish_task_switch(): */
> 	set_special_state(TASK_DEAD);
>
> 	/* Tell freezer to ignore us: */
> 	current->flags |= PF_NOFREEZE;
>
> 	__schedule(false);
> 	BUG();
>
> 	/* Avoid "noreturn function does return" - but don't continue if BUG() is a NOP: */
> 	for (;;)
> 		cpu_relax();
> }

This would mean that we somehow lose the TASK_DEAD state before hitting
schedule(), but that is something that should be avoided by
set_special_state(), which is supposed to serialize against concurrent
wake-ups.

Also see commit:

  b5bf9a90bbeb ("sched/core: Introduce set_special_state()")

How readily does this reproduce?

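For readers following along, the serialization Peter refers to can be sketched in userspace. This is a rough model and not kernel code: `struct task`, `model_set_special_state()`, `model_wake_up()` and the small spinlock are hypothetical stand-ins, and the state values are the v4.20-era ones from include/linux/sched.h as best recalled, so check them against your tree. The assumption modelled is that the waker checks and rewrites the state under the same lock (pi_lock in the kernel) that the special-state store takes.

```c
#include <stdatomic.h>

/* State bits as of v4.20-era include/linux/sched.h (assumed; check your tree). */
#define TASK_RUNNING         0x0000
#define TASK_INTERRUPTIBLE   0x0001
#define TASK_UNINTERRUPTIBLE 0x0002
#define TASK_DEAD            0x0080
#define TASK_NORMAL          (TASK_INTERRUPTIBLE | TASK_UNINTERRUPTIBLE)

/* Hypothetical stand-in for task_struct: just the state and its lock. */
struct task {
	atomic_flag pi_lock;	/* models task_struct::pi_lock */
	long state;
};

void model_lock(struct task *t)
{
	while (atomic_flag_test_and_set_explicit(&t->pi_lock, memory_order_acquire))
		;	/* spin until the lock is free */
}

void model_unlock(struct task *t)
{
	atomic_flag_clear_explicit(&t->pi_lock, memory_order_release);
}

/* Store a special state (e.g. TASK_DEAD) under the lock the waker also takes. */
void model_set_special_state(struct task *t, long state)
{
	model_lock(t);
	t->state = state;
	model_unlock(t);
}

/*
 * Models the state check at the top of a wake-up: only a task whose state
 * matches the waker's mask is made runnable. Returns 1 if the task was woken.
 */
int model_wake_up(struct task *t, long mask)
{
	int woken = 0;

	model_lock(t);
	if (t->state & mask) {
		t->state = TASK_RUNNING;
		woken = 1;
	}
	model_unlock(t);
	return woken;
}
```

In this model, a wake-up with the TASK_NORMAL mask that arrives after `model_set_special_state(&t, TASK_DEAD)` finds no matching bit and leaves the state alone. Because the check and the write happen under one lock, a waker that saw an older sleeping state cannot later write TASK_RUNNING over TASK_DEAD, which is the property the BUG() report appears to contradict.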
* Re: kernel BUG at kernel/sched/core.c:3490!
From: Qian Cai @ 2019-01-07 14:36 UTC
  To: Peter Zijlstra; +Cc: Ingo Molnar, linux kernel, Oleg Nesterov, gkohli

On 1/7/19 8:52 AM, Peter Zijlstra wrote:
> On Tue, Jan 01, 2019 at 12:44:35AM -0500, Qian Cai wrote:
>> Running some mmap() workloads to put the system into a low-memory situation
>> with swapping and OOM triggers this BUG():
>>
>> void __noreturn do_task_dead(void)
>> {
>> 	/* Causes final put_task_struct in finish_task_switch(): */
>> 	set_special_state(TASK_DEAD);
>>
>> 	/* Tell freezer to ignore us: */
>> 	current->flags |= PF_NOFREEZE;
>>
>> 	__schedule(false);
>> 	BUG();
>>
>> 	/* Avoid "noreturn function does return" - but don't continue if BUG() is a NOP: */
>> 	for (;;)
>> 		cpu_relax();
>> }
>
> This would mean that we somehow lose the TASK_DEAD state before hitting
> schedule(), but that is something that should be avoided by
> set_special_state(), which is supposed to serialize against concurrent
> wake-ups.
>
> Also see commit: b5bf9a90bbeb ("sched/core: Introduce set_special_state()")
>
> How readily does this reproduce?

Running LTP oom01 [1] has triggered it at least once in every five attempts so
far on v4.20+. I have not tried much on v5.0-rc1 yet.

[1] https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/oom/oom01.c

* Re: kernel BUG at kernel/sched/core.c:3490!
From: Oleg Nesterov @ 2019-01-07 17:56 UTC
  To: Qian Cai; +Cc: Peter Zijlstra, Ingo Molnar, linux kernel, gkohli

On 01/07, Qian Cai wrote:
>
> On 1/7/19 8:52 AM, Peter Zijlstra wrote:
> > On Tue, Jan 01, 2019 at 12:44:35AM -0500, Qian Cai wrote:
> >> Running some mmap() workloads to put the system into a low-memory situation
> >> with swapping and OOM triggers this BUG():
> >>
> >> void __noreturn do_task_dead(void)
> >> {
> >> 	/* Causes final put_task_struct in finish_task_switch(): */
> >> 	set_special_state(TASK_DEAD);
> >>
> >> 	/* Tell freezer to ignore us: */
> >> 	current->flags |= PF_NOFREEZE;
> >>
> >> 	__schedule(false);
> >> 	BUG();
> >>
> >> 	/* Avoid "noreturn function does return" - but don't continue if BUG() is a NOP: */
> >> 	for (;;)
> >> 		cpu_relax();
> >> }
> >
> > This would mean that we somehow lose the TASK_DEAD state before hitting
> > schedule(), but that is something that should be avoided by
> > set_special_state(), which is supposed to serialize against concurrent
> > wake-ups.

Or maybe pick_next_task() somehow returns the deactivated TASK_DEAD task?

> > How readily does this reproduce?
>
> Running LTP oom01 [1] has triggered it at least once in every five attempts so
> far on v4.20+. I have not tried much on v5.0-rc1 yet.

Can you add

	pr_crit("XXX: %ld %d\n", current->state, current->on_rq);

before that BUG() and reproduce?

Oleg.

* Re: kernel BUG at kernel/sched/core.c:3490!
From: Kohli, Gaurav @ 2019-01-11 10:37 UTC
  To: Oleg Nesterov, Qian Cai; +Cc: Peter Zijlstra, Ingo Molnar, linux kernel

On 1/7/2019 11:26 PM, Oleg Nesterov wrote:
> pr_crit("XXX: %ld %d\n", current->state, current->on_rq);

Can we also print the flags? This may help identify the path of the problem:

	pr_crit("XXX: %ld %d 0x%x\n", current->state, current->on_rq, current->flags);

--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

* Re: kernel BUG at kernel/sched/core.c:3490!
From: Qian Cai @ 2019-01-11 16:17 UTC
  To: Kohli, Gaurav, Oleg Nesterov; +Cc: Peter Zijlstra, Ingo Molnar, linux kernel

On Fri, 2019-01-11 at 16:07 +0530, Kohli, Gaurav wrote:
>
> On 1/7/2019 11:26 PM, Oleg Nesterov wrote:
> > pr_crit("XXX: %ld %d\n", current->state, current->on_rq);
>
> Can we also print the flags? This may help identify the path of the problem:
>
> pr_crit("XXX: %ld %d 0x%x\n", current->state, current->on_rq,
> current->flags);
>

XXX: 0 1 0x40844c

* Re: kernel BUG at kernel/sched/core.c:3490!
From: Kohli, Gaurav @ 2019-01-12 8:54 UTC
  To: Qian Cai, Oleg Nesterov; +Cc: Peter Zijlstra, Ingo Molnar, linux kernel

Hi Peter, Oleg,

Going by the flags and state, this seems to be possible only from the code
below:

XXX: 0 1 0x40844c

PF_NOFREEZE
PF_RANDOMIZE
PF_SIGNALED
PF_FORKNOEXEC
PF_EXITING
PF_EXITPIDONE

The above state shows that do_exit() ran properly. This could happen if,
somehow, after the parked state, TASK_WAKEKILL got set and
signal_pending_state() returned 1 in the case below:

	switch_count = &prev->nivcsw;
	if (!preempt && prev->state) {
		if (unlikely(signal_pending_state(prev->state, prev))) {
			prev->state = TASK_RUNNING;
		} else {
			deactivate_task(rq, prev, DEQUEUE_SLEEP | DEQUEUE_NOCLOCK);

Regards
Gaurav

On 1/11/2019 9:47 PM, Qian Cai wrote:
> On Fri, 2019-01-11 at 16:07 +0530, Kohli, Gaurav wrote:
>>
>> On 1/7/2019 11:26 PM, Oleg Nesterov wrote:
>>> pr_crit("XXX: %ld %d\n", current->state, current->on_rq);
>>
>> Can we also print the flags? This may help identify the path of the problem:
>>
>> pr_crit("XXX: %ld %d 0x%x\n", current->state, current->on_rq,
>> current->flags);
>>
>
> XXX: 0 1 0x40844c
>

--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

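The flag decoding in the message above can be checked mechanically. The following is a small sketch, assuming the PF_* bit values from a v4.20-era include/linux/sched.h (quoted from memory, so verify them against your tree). The `pf_names` table and the `decode_pf_flags()` helper are hypothetical names introduced for illustration, and the table only lists the six flags relevant to this report.

```c
#include <stdio.h>
#include <string.h>

/* PF_* bits as of v4.20-era include/linux/sched.h (assumed; check your tree). */
struct pf_name {
	unsigned int bit;
	const char *name;
};

static const struct pf_name pf_names[] = {
	{ 0x00000004, "PF_EXITING" },
	{ 0x00000008, "PF_EXITPIDONE" },
	{ 0x00000040, "PF_FORKNOEXEC" },
	{ 0x00000400, "PF_SIGNALED" },
	{ 0x00008000, "PF_NOFREEZE" },
	{ 0x00400000, "PF_RANDOMIZE" },
};

/*
 * Hypothetical helper: write the names of the known bits set in flags into
 * buf, separated by '|'. Returns the bits that were not named, either
 * because they are not in the table or because buf ran out of space.
 */
unsigned int decode_pf_flags(unsigned int flags, char *buf, size_t len)
{
	size_t used = 0;
	size_t i;

	if (len)
		buf[0] = '\0';
	for (i = 0; i < sizeof(pf_names) / sizeof(pf_names[0]); i++) {
		int n;

		if (!(flags & pf_names[i].bit))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? "|" : "", pf_names[i].name);
		if (n < 0 || (size_t)n >= len - used) {
			if (used < len)
				buf[used] = '\0';	/* drop the partial name */
			break;
		}
		used += (size_t)n;
		flags &= ~pf_names[i].bit;
	}
	return flags;
}
```

Calling `decode_pf_flags(0x40844c, buf, sizeof(buf))` with a 128-byte buffer returns 0 and names the same six flags listed in the message, which confirms that no other PF_* bit was set in the reported value.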
* Re: kernel BUG at kernel/sched/core.c:3490!
From: Oleg Nesterov @ 2019-01-16 17:32 UTC
  To: Kohli, Gaurav; +Cc: Qian Cai, Peter Zijlstra, Ingo Molnar, linux kernel

On 01/12, Kohli, Gaurav wrote:
>
> Hi Peter, Oleg,
>
> Going by the flags and state, this seems to be possible only from the code
> below:

Not sure I understand you,

> XXX: 0 1 0x40844c
> PF_NOFREEZE
> PF_RANDOMIZE
> PF_SIGNALED
> PF_FORKNOEXEC
> PF_EXITING
> PF_EXITPIDONE
>
> The above state shows that do_exit() ran properly. This could happen if,
> somehow, after the parked state, TASK_WAKEKILL got set and
> signal_pending_state() returned 1 in the case below:
>
> 	switch_count = &prev->nivcsw;
> 	if (!preempt && prev->state) {
> 		if (unlikely(signal_pending_state(prev->state, prev))) {
> 			prev->state = TASK_RUNNING;
> 		} else {
> 			deactivate_task(rq, prev, DEQUEUE_SLEEP | DEQUEUE_NOCLOCK);

or task->state was TASK_RUNNING when __schedule() was called, or the
deactivated dead task was woken up later...

The only problem is that every case looks "obviously impossible" ;)

I have no idea what's going on; I can only suggest more stupid debugging
patches which might narrow down the problem.

Oleg.

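To make the hypotheses above concrete, here is a userspace sketch of just the quoted branch of __schedule(). It is a simplified model under stated assumptions and not kernel code: `struct model_task`, `model_signal_pending_state()` and `model_schedule_prev()` are hypothetical names, the two `*_pending` fields stand in for signal_pending() and __fatal_signal_pending(), and the state values are the v4.20-era ones as best recalled. The reported "XXX: 0 1" corresponds to state == TASK_RUNNING with on_rq == 1.

```c
/* State bits as of v4.20-era include/linux/sched.h (assumed; check your tree). */
#define TASK_RUNNING         0x0000
#define TASK_INTERRUPTIBLE   0x0001
#define TASK_DEAD            0x0080
#define TASK_WAKEKILL        0x0100

/* Hypothetical stand-in for the few task_struct fields that matter here. */
struct model_task {
	long state;
	int on_rq;
	int signal_pending;		/* models signal_pending(p) */
	int fatal_signal_pending;	/* models __fatal_signal_pending(p) */
};

/* Follows the logic of the kernel's signal_pending_state(). */
int model_signal_pending_state(long state, const struct model_task *p)
{
	if (!(state & (TASK_INTERRUPTIBLE | TASK_WAKEKILL)))
		return 0;
	if (!p->signal_pending)
		return 0;
	return (state & TASK_INTERRUPTIBLE) || p->fatal_signal_pending;
}

/*
 * Models only the quoted branch of __schedule(): decide whether prev stays
 * on the runqueue. Returns 1 if prev is still runnable afterwards, which is
 * the condition under which do_task_dead() could resume and reach BUG().
 */
int model_schedule_prev(struct model_task *prev, int preempt)
{
	if (!preempt && prev->state) {
		if (model_signal_pending_state(prev->state, prev))
			prev->state = TASK_RUNNING;
		else
			prev->on_rq = 0;	/* stands in for deactivate_task() */
	}
	return prev->on_rq;
}
```

In this model a task in plain TASK_DEAD is always dequeued, even with a fatal signal pending, because TASK_DEAD contains neither TASK_INTERRUPTIBLE nor TASK_WAKEKILL. The task stays runnable only if TASK_WAKEKILL is somehow set alongside a pending fatal signal (Gaurav's case) or if the state is already TASK_RUNNING on entry (Oleg's first case). Oleg's remaining case, a later wake-up of the already deactivated task, is outside what this branch model covers.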