* [PATCH 1/2] mm,oom_reaper: Show trace of unable to reap victim thread.
@ 2018-03-20 11:57 Tetsuo Handa
  2018-03-20 11:57 ` [PATCH 2/2] mm,oom_reaper: Correct MAX_OOM_REAP_RETRIES'th attempt Tetsuo Handa
  2018-03-20 12:28 ` [PATCH 1/2] mm,oom_reaper: Show trace of unable to reap victim thread Michal Hocko
  0 siblings, 2 replies; 20+ messages in thread
From: Tetsuo Handa @ 2018-03-20 11:57 UTC (permalink / raw)
  To: linux-mm; +Cc: Tetsuo Handa, David Rientjes, Michal Hocko

I found that it is not difficult to hit "oom_reaper: unable to reap pid:"
messages when the victim thread is in copy_process(). Since the stack
trace of a victim thread that could not be reaped is likely helpful for
finding locations which should use killable waits, this patch shows it.

[ 226.608508] oom_reaper: unable to reap pid:9261 (a.out)
[ 226.611971] a.out D13056 9261 6927 0x00100084
[ 226.615879] Call Trace:
[ 226.617926]  ? __schedule+0x25f/0x780
[ 226.620559]  schedule+0x2d/0x80
[ 226.623356]  rwsem_down_write_failed+0x2bb/0x440
[ 226.626426]  ? rwsem_down_write_failed+0x55/0x440
[ 226.629458]  ? anon_vma_fork+0x124/0x150
[ 226.632679]  call_rwsem_down_write_failed+0x13/0x20
[ 226.635884]  down_write+0x49/0x60
[ 226.638867]  ? copy_process.part.41+0x12f2/0x1fe0
[ 226.642042]  copy_process.part.41+0x12f2/0x1fe0 /* i_mmap_lock_write() in dup_mmap() */
[ 226.645087]  ? _do_fork+0xe6/0x560
[ 226.647991]  _do_fork+0xe6/0x560
[ 226.650495]  ? syscall_trace_enter+0x1a9/0x240
[ 226.653443]  ? retint_user+0x18/0x18
[ 226.656601]  ? page_fault+0x2f/0x50
[ 226.659159]  ? trace_hardirqs_on_caller+0x11f/0x1b0
[ 226.662399]  do_syscall_64+0x74/0x230
[ 226.664989]  entry_SYSCALL_64_after_hwframe+0x42/0xb7

[ 380.525910] oom_reaper: unable to reap pid:6927 (a.out)
[ 380.529274] a.out R running task 13544 6927 1 0x00100084
[ 380.533264] Call Trace:
[ 380.535433]  ? __schedule+0x25f/0x780
[ 380.538052]  ? mark_held_locks+0x60/0x80
[ 380.540775]  schedule+0x2d/0x80
[ 380.543258]  schedule_timeout+0x1a4/0x350
[ 380.546053]  ? __next_timer_interrupt+0xd0/0xd0
[ 380.549033]  msleep+0x25/0x30
[ 380.551458]  shrink_inactive_list+0x5b7/0x690
[ 380.554408]  ? __lock_acquire+0x1f1/0xfd0
[ 380.557228]  ? find_held_lock+0x2d/0x90
[ 380.559959]  shrink_node_memcg+0x340/0x770
[ 380.562776]  ? __lock_acquire+0x246/0xfd0
[ 380.565532]  ? mem_cgroup_iter+0x121/0x4f0
[ 380.568337]  ? mem_cgroup_iter+0x121/0x4f0
[ 380.571203]  shrink_node+0xd8/0x370
[ 380.573745]  do_try_to_free_pages+0xe3/0x390
[ 380.576759]  try_to_free_pages+0xc8/0x110
[ 380.579476]  __alloc_pages_slowpath+0x28a/0x8d9
[ 380.582639]  __alloc_pages_nodemask+0x21d/0x260
[ 380.585662]  new_slab+0x558/0x760
[ 380.588188]  ___slab_alloc+0x353/0x6f0
[ 380.590962]  ? copy_process.part.41+0x121f/0x1fe0
[ 380.594008]  ? find_held_lock+0x2d/0x90
[ 380.596750]  ? copy_process.part.41+0x121f/0x1fe0
[ 380.599852]  __slab_alloc+0x41/0x7a
[ 380.602469]  ? copy_process.part.41+0x121f/0x1fe0
[ 380.605607]  kmem_cache_alloc+0x1a6/0x1f0
[ 380.608528]  copy_process.part.41+0x121f/0x1fe0 /* kmem_cache_alloc(GFP_KERNEL) in dup_mmap() */
[ 380.611613]  ? _do_fork+0xe6/0x560
[ 380.614299]  _do_fork+0xe6/0x560
[ 380.616752]  ? syscall_trace_enter+0x1a9/0x240
[ 380.619813]  ? retint_user+0x18/0x18
[ 380.622588]  ? page_fault+0x2f/0x50
[ 380.625184]  ? trace_hardirqs_on_caller+0x11f/0x1b0
[ 380.628841]  do_syscall_64+0x74/0x230
[ 380.631660]  entry_SYSCALL_64_after_hwframe+0x42/0xb7

[ 600.285293] oom_reaper: unable to reap pid:7550 (a.out)
[ 600.289112] a.out D13104 7550 6931 0x00100084
[ 600.292918] Call Trace:
[ 600.295506]  ? __schedule+0x25f/0x780
[ 600.298614]  ? __lock_acquire+0x246/0xfd0
[ 600.301879]  schedule+0x2d/0x80
[ 600.304623]  schedule_timeout+0x1fd/0x350
[ 600.307633]  ? find_held_lock+0x2d/0x90
[ 600.310909]  ? mark_held_locks+0x60/0x80
[ 600.314001]  ? _raw_spin_unlock_irq+0x24/0x30
[ 600.317382]  wait_for_completion+0xab/0x130
[ 600.320579]  ? wake_up_q+0x70/0x70
[ 600.323419]  flush_work+0x1bd/0x260
[ 600.326434]  ? flush_work+0x174/0x260
[ 600.329332]  ? destroy_worker+0x90/0x90
[ 600.332301]  drain_all_pages+0x16d/0x1e0
[ 600.335464]  __alloc_pages_slowpath+0x443/0x8d9
[ 600.338743]  __alloc_pages_nodemask+0x21d/0x260
[ 600.342178]  new_slab+0x558/0x760
[ 600.344861]  ___slab_alloc+0x353/0x6f0
[ 600.347744]  ? copy_process.part.41+0x121f/0x1fe0
[ 600.351213]  ? find_held_lock+0x2d/0x90
[ 600.354119]  ? copy_process.part.41+0x121f/0x1fe0
[ 600.357491]  __slab_alloc+0x41/0x7a
[ 600.360206]  ? copy_process.part.41+0x121f/0x1fe0
[ 600.363480]  kmem_cache_alloc+0x1a6/0x1f0
[ 600.366579]  copy_process.part.41+0x121f/0x1fe0 /* kmem_cache_alloc(GFP_KERNEL) in dup_mmap() */
[ 600.369761]  ? _do_fork+0xe6/0x560
[ 600.372382]  _do_fork+0xe6/0x560
[ 600.375142]  ? syscall_trace_enter+0x1a9/0x240
[ 600.378258]  ? retint_user+0x18/0x18
[ 600.380978]  ? page_fault+0x2f/0x50
[ 600.383814]  ? trace_hardirqs_on_caller+0x11f/0x1b0
[ 600.387096]  do_syscall_64+0x74/0x230
[ 600.389928]  entry_SYSCALL_64_after_hwframe+0x42/0xb7

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Michal Hocko <mhocko@suse.com>
Cc: David Rientjes <rientjes@google.com>
---
 mm/oom_kill.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 5336985..900300c 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -41,6 +41,7 @@
 #include <linux/kthread.h>
 #include <linux/init.h>
 #include <linux/mmu_notifier.h>
+#include <linux/sched/debug.h>
 #include <asm/tlb.h>
 #include "internal.h"
@@ -596,6 +597,7 @@ static void oom_reap_task(struct task_struct *tsk)
 	pr_info("oom_reaper: unable to reap pid:%d (%s)\n",
 		task_pid_nr(tsk), tsk->comm);
+	sched_show_task(tsk);
 	debug_show_all_locks();

 done:
--
1.8.3.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread
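As an aside for readers unfamiliar with the reaper's control flow, the failure path this patch extends can be modeled outside the kernel. The following Python sketch is illustrative only: the reap_fn callback and the returned event list are inventions of the model, while the constant and the retry/complain structure mirror oom_reap_task() in mm/oom_kill.c with this patch applied.

```python
MAX_OOM_REAP_RETRIES = 10  # same constant as mm/oom_kill.c

def oom_reap_task(reap_fn):
    """Model of oom_reap_task() with this patch applied.

    reap_fn stands in for __oom_reap_task_mm(); the returned list
    models what would be printed on the failure path."""
    events = []
    attempts = 0
    while True:
        attempts += 1                      # the attempts++ in the while condition
        if attempts > MAX_OOM_REAP_RETRIES:
            break                          # retries exhausted
        if reap_fn():
            break                          # reaped successfully
        # schedule_timeout_idle(HZ/10) would sleep ~100ms here
    if attempts <= MAX_OOM_REAP_RETRIES:
        return events                      # goto done: nothing printed
    events.append("oom_reaper: unable to reap pid:...")
    events.append("sched_show_task(tsk)")      # the call added by this patch
    events.append("debug_show_all_locks()")
    return events
```

Run with a reap_fn that always fails, the model emits the "unable to reap" message followed by the new stack dump; with one that succeeds it stays silent, matching the kernel behavior.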
* [PATCH 2/2] mm,oom_reaper: Correct MAX_OOM_REAP_RETRIES'th attempt.
  2018-03-20 11:57 [PATCH 1/2] mm,oom_reaper: Show trace of unable to reap victim thread Tetsuo Handa
@ 2018-03-20 11:57 ` Tetsuo Handa
  2018-03-20 12:12   ` Michal Hocko
  2018-03-20 12:28 ` [PATCH 1/2] mm,oom_reaper: Show trace of unable to reap victim thread Michal Hocko
  1 sibling, 1 reply; 20+ messages in thread
From: Tetsuo Handa @ 2018-03-20 11:57 UTC (permalink / raw)
  To: linux-mm; +Cc: Tetsuo Handa, David Rientjes, Michal Hocko

I got "oom_reaper: unable to reap pid:" messages when the victim thread
was blocked inside free_pgtables() (which occurred after returning from
unmap_vmas() and setting MMF_OOM_SKIP). We don't need to complain when
__oom_reap_task_mm() returns true (e.g. by finding MMF_OOM_SKIP already
set) on oom_reap_task()'s MAX_OOM_REAP_RETRIES'th attempt.

[ 663.593821] Killed process 7558 (a.out) total-vm:4176kB, anon-rss:84kB, file-rss:0kB, shmem-rss:0kB
[ 664.684801] oom_reaper: unable to reap pid:7558 (a.out)
[ 664.892292] a.out D13272 7558 6931 0x00100084
[ 664.895765] Call Trace:
[ 664.897574]  ? __schedule+0x25f/0x780
[ 664.900099]  schedule+0x2d/0x80
[ 664.902260]  rwsem_down_write_failed+0x2bb/0x440
[ 664.905249]  ? rwsem_down_write_failed+0x55/0x440
[ 664.908335]  ? free_pgd_range+0x569/0x5e0
[ 664.911145]  call_rwsem_down_write_failed+0x13/0x20
[ 664.914121]  down_write+0x49/0x60
[ 664.916519]  ? unlink_file_vma+0x28/0x50
[ 664.919255]  unlink_file_vma+0x28/0x50
[ 664.922234]  free_pgtables+0x36/0x100
[ 664.924797]  exit_mmap+0xbb/0x180
[ 664.927220]  mmput+0x50/0x110
[ 664.929504]  copy_process.part.41+0xb61/0x1fe0
[ 664.932448]  ? _do_fork+0xe6/0x560
[ 664.934902]  ? _do_fork+0xe6/0x560
[ 664.937361]  _do_fork+0xe6/0x560
[ 664.939742]  ? syscall_trace_enter+0x1a9/0x240
[ 664.942693]  ? retint_user+0x18/0x18
[ 664.945309]  ? page_fault+0x2f/0x50
[ 664.947896]  ? trace_hardirqs_on_caller+0x11f/0x1b0
[ 664.951075]  do_syscall_64+0x74/0x230
[ 664.953747]  entry_SYSCALL_64_after_hwframe+0x42/0xb7

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Michal Hocko <mhocko@suse.com>
Cc: David Rientjes <rientjes@google.com>
---
 mm/oom_kill.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 900300c..1cb2b98 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -588,11 +588,11 @@ static void oom_reap_task(struct task_struct *tsk)
 	struct mm_struct *mm = tsk->signal->oom_mm;

 	/* Retry the down_read_trylock(mmap_sem) a few times */
-	while (attempts++ < MAX_OOM_REAP_RETRIES && !__oom_reap_task_mm(tsk, mm))
+	while (attempts++ < MAX_OOM_REAP_RETRIES) {
+		if (__oom_reap_task_mm(tsk, mm))
+			goto done;
 		schedule_timeout_idle(HZ/10);
-
-	if (attempts <= MAX_OOM_REAP_RETRIES)
-		goto done;
+	}

 	pr_info("oom_reaper: unable to reap pid:%d (%s)\n",
--
1.8.3.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] mm,oom_reaper: Correct MAX_OOM_REAP_RETRIES'th attempt. 2018-03-20 11:57 ` [PATCH 2/2] mm,oom_reaper: Correct MAX_OOM_REAP_RETRIES'th attempt Tetsuo Handa @ 2018-03-20 12:12 ` Michal Hocko 2018-03-20 12:37 ` Tetsuo Handa 0 siblings, 1 reply; 20+ messages in thread From: Michal Hocko @ 2018-03-20 12:12 UTC (permalink / raw) To: Tetsuo Handa; +Cc: linux-mm, David Rientjes On Tue 20-03-18 20:57:56, Tetsuo Handa wrote: > I got "oom_reaper: unable to reap pid:" messages when the victim thread > was blocked inside free_pgtables() (which occurred after returning from > unmap_vmas() and setting MMF_OOM_SKIP). We don't need to complain when > __oom_reap_task_mm() returned true (by e.g. finding MMF_OOM_SKIP already > set) when oom_reap_task() was trying MAX_OOM_REAP_RETRIES'th attempt. > > [ 663.593821] Killed process 7558 (a.out) total-vm:4176kB, anon-rss:84kB, file-rss:0kB, shmem-rss:0kB > [ 664.684801] oom_reaper: unable to reap pid:7558 (a.out) I do not see "oom_reaper: reaped process..." so has the task been reaped? > [ 664.892292] a.out D13272 7558 6931 0x00100084 > [ 664.895765] Call Trace: > [ 664.897574] ? __schedule+0x25f/0x780 > [ 664.900099] schedule+0x2d/0x80 > [ 664.902260] rwsem_down_write_failed+0x2bb/0x440 > [ 664.905249] ? rwsem_down_write_failed+0x55/0x440 > [ 664.908335] ? free_pgd_range+0x569/0x5e0 > [ 664.911145] call_rwsem_down_write_failed+0x13/0x20 > [ 664.914121] down_write+0x49/0x60 > [ 664.916519] ? unlink_file_vma+0x28/0x50 > [ 664.919255] unlink_file_vma+0x28/0x50 > [ 664.922234] free_pgtables+0x36/0x100 > [ 664.924797] exit_mmap+0xbb/0x180 > [ 664.927220] mmput+0x50/0x110 > [ 664.929504] copy_process.part.41+0xb61/0x1fe0 > [ 664.932448] ? _do_fork+0xe6/0x560 > [ 664.934902] ? _do_fork+0xe6/0x560 > [ 664.937361] _do_fork+0xe6/0x560 > [ 664.939742] ? syscall_trace_enter+0x1a9/0x240 > [ 664.942693] ? retint_user+0x18/0x18 > [ 664.945309] ? page_fault+0x2f/0x50 > [ 664.947896] ? 
trace_hardirqs_on_caller+0x11f/0x1b0 > [ 664.951075] do_syscall_64+0x74/0x230 > [ 664.953747] entry_SYSCALL_64_after_hwframe+0x42/0xb7 > > Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> > Cc: Michal Hocko <mhocko@suse.com> > Cc: David Rientjes <rientjes@google.com> > --- > mm/oom_kill.c | 8 ++++---- > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/mm/oom_kill.c b/mm/oom_kill.c > index 900300c..1cb2b98 100644 > --- a/mm/oom_kill.c > +++ b/mm/oom_kill.c > @@ -588,11 +588,11 @@ static void oom_reap_task(struct task_struct *tsk) > struct mm_struct *mm = tsk->signal->oom_mm; > > /* Retry the down_read_trylock(mmap_sem) a few times */ > - while (attempts++ < MAX_OOM_REAP_RETRIES && !__oom_reap_task_mm(tsk, mm)) > + while (attempts++ < MAX_OOM_REAP_RETRIES) { > + if (__oom_reap_task_mm(tsk, mm)) > + goto done; > schedule_timeout_idle(HZ/10); > - > - if (attempts <= MAX_OOM_REAP_RETRIES) > - goto done; > + } I do not see how this improves anything. Even if __oom_reap_task_mm suceeded during the last attempt then attempts == MAX_OOM_REAP_RETRIES and the if below would bail out. Or what do I miss? > pr_info("oom_reaper: unable to reap pid:%d (%s)\n", -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 20+ messages in thread
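Michal's objection can be checked mechanically. The Python sketch below is illustrative only (the function names and the success_on parameter are inventions of the model; success_on = 0 means __oom_reap_task_mm() never succeeds, success_on = k means it first succeeds on the k'th attempt). It models both versions of the loop and shows they decide identically for every case, including a success on the MAX_OOM_REAP_RETRIES'th attempt, which already reaches done in the original code:

```python
MAX_OOM_REAP_RETRIES = 10  # same constant as mm/oom_kill.c

def original(success_on):
    """Model of the pre-patch loop:
    while (attempts++ < MAX && !__oom_reap_task_mm()) sleep;
    if (attempts <= MAX) goto done;
    Returns True when the silent 'done' path is taken."""
    attempts = 0
    while True:
        attempts += 1                       # attempts++ in the condition
        if attempts > MAX_OOM_REAP_RETRIES:
            break                           # retries exhausted
        if attempts == success_on:          # __oom_reap_task_mm() returned true
            break
        # schedule_timeout_idle(HZ/10) would sleep here
    return attempts <= MAX_OOM_REAP_RETRIES

def patched(success_on):
    """Model of the loop as restructured by the patch under review."""
    attempts = 0
    while attempts < MAX_OOM_REAP_RETRIES:
        attempts += 1
        if attempts == success_on:
            return True                     # goto done
        # sleep between attempts
    return False                            # falls through to pr_info()
```

Both models agree for success_on in 0..10, which supports the review comment: the restructuring changes no observable behavior, and the real fix (checking MMF_OOM_SKIP) is elsewhere.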
* Re: [PATCH 2/2] mm,oom_reaper: Correct MAX_OOM_REAP_RETRIES'th attempt. 2018-03-20 12:12 ` Michal Hocko @ 2018-03-20 12:37 ` Tetsuo Handa 2018-03-20 12:47 ` [PATCH 2/2] mm,oom_reaper: Check for MMF_OOM_SKIP before complain Tetsuo Handa 2018-03-20 13:24 ` [PATCH 2/2] mm,oom_reaper: Correct MAX_OOM_REAP_RETRIES'th attempt Michal Hocko 0 siblings, 2 replies; 20+ messages in thread From: Tetsuo Handa @ 2018-03-20 12:37 UTC (permalink / raw) To: mhocko; +Cc: linux-mm, rientjes Michal Hocko wrote: > On Tue 20-03-18 20:57:56, Tetsuo Handa wrote: > > I got "oom_reaper: unable to reap pid:" messages when the victim thread > > was blocked inside free_pgtables() (which occurred after returning from > > unmap_vmas() and setting MMF_OOM_SKIP). We don't need to complain when > > __oom_reap_task_mm() returned true (by e.g. finding MMF_OOM_SKIP already > > set) when oom_reap_task() was trying MAX_OOM_REAP_RETRIES'th attempt. > > > > [ 663.593821] Killed process 7558 (a.out) total-vm:4176kB, anon-rss:84kB, file-rss:0kB, shmem-rss:0kB > > [ 664.684801] oom_reaper: unable to reap pid:7558 (a.out) > > I do not see "oom_reaper: reaped process..." so has the task been > reaped? The log is http://I-love.SAKURA.ne.jp/tmp/serial-20180320.txt.xz . I think the task succeeded to call unmap_vmas() in exit_mmap(), for no "oom_reaper: reaped process..." message means that __oom_reap_task_mm() returned without setting MMF_UNSTABLE. > > > [ 664.892292] a.out D13272 7558 6931 0x00100084 > > [ 664.895765] Call Trace: > > [ 664.897574] ? __schedule+0x25f/0x780 > > [ 664.900099] schedule+0x2d/0x80 > > [ 664.902260] rwsem_down_write_failed+0x2bb/0x440 > > [ 664.905249] ? rwsem_down_write_failed+0x55/0x440 > > [ 664.908335] ? free_pgd_range+0x569/0x5e0 > > [ 664.911145] call_rwsem_down_write_failed+0x13/0x20 > > [ 664.914121] down_write+0x49/0x60 > > [ 664.916519] ? 
unlink_file_vma+0x28/0x50 > > [ 664.919255] unlink_file_vma+0x28/0x50 > > [ 664.922234] free_pgtables+0x36/0x100 > > [ 664.924797] exit_mmap+0xbb/0x180 > > [ 664.927220] mmput+0x50/0x110 > > [ 664.929504] copy_process.part.41+0xb61/0x1fe0 > > [ 664.932448] ? _do_fork+0xe6/0x560 > > [ 664.934902] ? _do_fork+0xe6/0x560 > > [ 664.937361] _do_fork+0xe6/0x560 > > [ 664.939742] ? syscall_trace_enter+0x1a9/0x240 > > [ 664.942693] ? retint_user+0x18/0x18 > > [ 664.945309] ? page_fault+0x2f/0x50 > > [ 664.947896] ? trace_hardirqs_on_caller+0x11f/0x1b0 > > [ 664.951075] do_syscall_64+0x74/0x230 > > [ 664.953747] entry_SYSCALL_64_after_hwframe+0x42/0xb7 > > > > Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> > > Cc: Michal Hocko <mhocko@suse.com> > > Cc: David Rientjes <rientjes@google.com> > > --- > > mm/oom_kill.c | 8 ++++---- > > 1 file changed, 4 insertions(+), 4 deletions(-) > > > > diff --git a/mm/oom_kill.c b/mm/oom_kill.c > > index 900300c..1cb2b98 100644 > > --- a/mm/oom_kill.c > > +++ b/mm/oom_kill.c > > @@ -588,11 +588,11 @@ static void oom_reap_task(struct task_struct *tsk) > > struct mm_struct *mm = tsk->signal->oom_mm; > > > > /* Retry the down_read_trylock(mmap_sem) a few times */ > > - while (attempts++ < MAX_OOM_REAP_RETRIES && !__oom_reap_task_mm(tsk, mm)) > > + while (attempts++ < MAX_OOM_REAP_RETRIES) { > > + if (__oom_reap_task_mm(tsk, mm)) > > + goto done; > > schedule_timeout_idle(HZ/10); > > - > > - if (attempts <= MAX_OOM_REAP_RETRIES) > > - goto done; > > + } > > I do not see how this improves anything. Even if __oom_reap_task_mm > suceeded during the last attempt then attempts == MAX_OOM_REAP_RETRIES > and the if below would bail out. Or what do I miss? Oops, I optimized incorrectly. The original patch was - if (attempts <= MAX_OOM_REAP_RETRIES) + if (attempts <= MAX_OOM_REAP_RETRIES || + test_bit(MMF_OOM_SKIP, &mm->flags)) goto done; with title "Check for MMF_OOM_SKIP before complain". 
Even if the last attempt failed, there is no need to complain if MMF_OOM_SKIP was already set. > > > pr_info("oom_reaper: unable to reap pid:%d (%s)\n", > > -- > Michal Hocko > SUSE Labs > ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 2/2] mm,oom_reaper: Check for MMF_OOM_SKIP before complain.
  2018-03-20 12:37 ` Tetsuo Handa
@ 2018-03-20 12:47   ` Tetsuo Handa
  2018-03-20 20:49     ` David Rientjes
  2018-03-20 13:24   ` [PATCH 2/2] mm,oom_reaper: Correct MAX_OOM_REAP_RETRIES'th attempt Michal Hocko
  1 sibling, 1 reply; 20+ messages in thread
From: Tetsuo Handa @ 2018-03-20 12:47 UTC (permalink / raw)
  To: mhocko; +Cc: linux-mm, rientjes

>From c8954b88c8e78d10829c646c95e076516a54e84f Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Tue, 20 Mar 2018 21:42:12 +0900
Subject: [PATCH] mm,oom_reaper: Check for MMF_OOM_SKIP before complain.

I got "oom_reaper: unable to reap pid:" messages when the victim thread
was blocked inside free_pgtables() (which occurred after returning from
unmap_vmas() and setting MMF_OOM_SKIP). We don't need to complain when
exit_mmap() already set MMF_OOM_SKIP.

[ 663.593821] Killed process 7558 (a.out) total-vm:4176kB, anon-rss:84kB, file-rss:0kB, shmem-rss:0kB
[ 664.684801] oom_reaper: unable to reap pid:7558 (a.out)
[ 664.892292] a.out D13272 7558 6931 0x00100084
[ 664.895765] Call Trace:
[ 664.897574]  ? __schedule+0x25f/0x780
[ 664.900099]  schedule+0x2d/0x80
[ 664.902260]  rwsem_down_write_failed+0x2bb/0x440
[ 664.905249]  ? rwsem_down_write_failed+0x55/0x440
[ 664.908335]  ? free_pgd_range+0x569/0x5e0
[ 664.911145]  call_rwsem_down_write_failed+0x13/0x20
[ 664.914121]  down_write+0x49/0x60
[ 664.916519]  ? unlink_file_vma+0x28/0x50
[ 664.919255]  unlink_file_vma+0x28/0x50
[ 664.922234]  free_pgtables+0x36/0x100
[ 664.924797]  exit_mmap+0xbb/0x180
[ 664.927220]  mmput+0x50/0x110
[ 664.929504]  copy_process.part.41+0xb61/0x1fe0
[ 664.932448]  ? _do_fork+0xe6/0x560
[ 664.934902]  ? _do_fork+0xe6/0x560
[ 664.937361]  _do_fork+0xe6/0x560
[ 664.939742]  ? syscall_trace_enter+0x1a9/0x240
[ 664.942693]  ? retint_user+0x18/0x18
[ 664.945309]  ? page_fault+0x2f/0x50
[ 664.947896]  ? trace_hardirqs_on_caller+0x11f/0x1b0
[ 664.951075]  do_syscall_64+0x74/0x230
[ 664.953747]  entry_SYSCALL_64_after_hwframe+0x42/0xb7

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Michal Hocko <mhocko@suse.com>
Cc: David Rientjes <rientjes@google.com>
---
 mm/oom_kill.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 5336985..dfd3705 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -590,7 +590,8 @@ static void oom_reap_task(struct task_struct *tsk)
 	while (attempts++ < MAX_OOM_REAP_RETRIES && !__oom_reap_task_mm(tsk, mm))
 		schedule_timeout_idle(HZ/10);

-	if (attempts <= MAX_OOM_REAP_RETRIES)
+	if (attempts <= MAX_OOM_REAP_RETRIES ||
+	    test_bit(MMF_OOM_SKIP, &mm->flags))
 		goto done;
--
1.8.3.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread
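The effect of the added check can be modeled as follows. This is a Python sketch with hypothetical parameter names, not kernel code: when every trylock-based reap attempt fails but exit_mmap() has already set MMF_OOM_SKIP (as in the free_pgtables() trace above, where unmap_vmas() has already released the victim's memory), the "unable to reap" complaint is suppressed.

```python
MAX_OOM_REAP_RETRIES = 10  # same constant as mm/oom_kill.c

def should_complain(reap_succeeds, mmf_oom_skip_set):
    """Model of the post-patch decision in oom_reap_task().

    reap_succeeds stands in for __oom_reap_task_mm() returning true;
    mmf_oom_skip_set models exit_mmap() having already set
    MMF_OOM_SKIP on the victim's mm."""
    for _ in range(MAX_OOM_REAP_RETRIES):
        if reap_succeeds:
            return False              # reaped: goto done, no message
        # schedule_timeout_idle(HZ/10) between attempts
    # all attempts failed; the patch adds the test_bit(MMF_OOM_SKIP) check
    return not mmf_oom_skip_set
```

So a complaint is printed only when the retries are exhausted and nobody (neither the reaper nor exit_mmap()) has marked the mm as skipped, which is the one case still worth reporting.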
* Re: [PATCH 2/2] mm,oom_reaper: Check for MMF_OOM_SKIP before complain. 2018-03-20 12:47 ` [PATCH 2/2] mm,oom_reaper: Check for MMF_OOM_SKIP before complain Tetsuo Handa @ 2018-03-20 20:49 ` David Rientjes 2018-03-22 10:46 ` [PATCH] " Tetsuo Handa 0 siblings, 1 reply; 20+ messages in thread From: David Rientjes @ 2018-03-20 20:49 UTC (permalink / raw) To: Tetsuo Handa; +Cc: mhocko, linux-mm On Tue, 20 Mar 2018, Tetsuo Handa wrote: > I got "oom_reaper: unable to reap pid:" messages when the victim thread > was blocked inside free_pgtables() (which occurred after returning from > unmap_vmas() and setting MMF_OOM_SKIP). We don't need to complain when > exit_mmap() already set MMF_OOM_SKIP. > > [ 663.593821] Killed process 7558 (a.out) total-vm:4176kB, anon-rss:84kB, file-rss:0kB, shmem-rss:0kB > [ 664.684801] oom_reaper: unable to reap pid:7558 (a.out) > [ 664.892292] a.out D13272 7558 6931 0x00100084 > [ 664.895765] Call Trace: > [ 664.897574] ? __schedule+0x25f/0x780 > [ 664.900099] schedule+0x2d/0x80 > [ 664.902260] rwsem_down_write_failed+0x2bb/0x440 > [ 664.905249] ? rwsem_down_write_failed+0x55/0x440 > [ 664.908335] ? free_pgd_range+0x569/0x5e0 > [ 664.911145] call_rwsem_down_write_failed+0x13/0x20 > [ 664.914121] down_write+0x49/0x60 > [ 664.916519] ? unlink_file_vma+0x28/0x50 > [ 664.919255] unlink_file_vma+0x28/0x50 > [ 664.922234] free_pgtables+0x36/0x100 > [ 664.924797] exit_mmap+0xbb/0x180 > [ 664.927220] mmput+0x50/0x110 > [ 664.929504] copy_process.part.41+0xb61/0x1fe0 > [ 664.932448] ? _do_fork+0xe6/0x560 > [ 664.934902] ? _do_fork+0xe6/0x560 > [ 664.937361] _do_fork+0xe6/0x560 > [ 664.939742] ? syscall_trace_enter+0x1a9/0x240 > [ 664.942693] ? retint_user+0x18/0x18 > [ 664.945309] ? page_fault+0x2f/0x50 > [ 664.947896] ? 
trace_hardirqs_on_caller+0x11f/0x1b0 > [ 664.951075] do_syscall_64+0x74/0x230 > [ 664.953747] entry_SYSCALL_64_after_hwframe+0x42/0xb7 > > Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> > Cc: Michal Hocko <mhocko@suse.com> > Cc: David Rientjes <rientjes@google.com> Acked-by: David Rientjes <rientjes@google.com> But you'll need to send it to akpm. ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH] mm,oom_reaper: Check for MMF_OOM_SKIP before complain.
  2018-03-20 20:49 ` David Rientjes
@ 2018-03-22 10:46   ` Tetsuo Handa
  2018-03-22 11:59     ` Michal Hocko
  0 siblings, 1 reply; 20+ messages in thread
From: Tetsuo Handa @ 2018-03-22 10:46 UTC (permalink / raw)
  To: akpm; +Cc: rientjes, mhocko, linux-mm

>From b141cdbe0db852549c94d5b1e6a9967ca69d59fd Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Thu, 22 Mar 2018 19:44:12 +0900
Subject: [PATCH] mm,oom_reaper: Check for MMF_OOM_SKIP before complain.

I got "oom_reaper: unable to reap pid:" messages when the victim thread
was blocked inside free_pgtables() (which occurred after returning from
unmap_vmas() and setting MMF_OOM_SKIP). We don't need to complain when
exit_mmap() already set MMF_OOM_SKIP.

[ 663.593821] Killed process 7558 (a.out) total-vm:4176kB, anon-rss:84kB, file-rss:0kB, shmem-rss:0kB
[ 664.684801] oom_reaper: unable to reap pid:7558 (a.out)
[ 664.892292] a.out D13272 7558 6931 0x00100084
[ 664.895765] Call Trace:
[ 664.897574]  ? __schedule+0x25f/0x780
[ 664.900099]  schedule+0x2d/0x80
[ 664.902260]  rwsem_down_write_failed+0x2bb/0x440
[ 664.905249]  ? rwsem_down_write_failed+0x55/0x440
[ 664.908335]  ? free_pgd_range+0x569/0x5e0
[ 664.911145]  call_rwsem_down_write_failed+0x13/0x20
[ 664.914121]  down_write+0x49/0x60
[ 664.916519]  ? unlink_file_vma+0x28/0x50
[ 664.919255]  unlink_file_vma+0x28/0x50
[ 664.922234]  free_pgtables+0x36/0x100
[ 664.924797]  exit_mmap+0xbb/0x180
[ 664.927220]  mmput+0x50/0x110
[ 664.929504]  copy_process.part.41+0xb61/0x1fe0
[ 664.932448]  ? _do_fork+0xe6/0x560
[ 664.934902]  ? _do_fork+0xe6/0x560
[ 664.937361]  _do_fork+0xe6/0x560
[ 664.939742]  ? syscall_trace_enter+0x1a9/0x240
[ 664.942693]  ? retint_user+0x18/0x18
[ 664.945309]  ? page_fault+0x2f/0x50
[ 664.947896]  ? trace_hardirqs_on_caller+0x11f/0x1b0
[ 664.951075]  do_syscall_64+0x74/0x230
[ 664.953747]  entry_SYSCALL_64_after_hwframe+0x42/0xb7

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Michal Hocko <mhocko@suse.com>
---
 mm/oom_kill.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 5336985..dfd3705 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -590,7 +590,8 @@ static void oom_reap_task(struct task_struct *tsk)
 	while (attempts++ < MAX_OOM_REAP_RETRIES && !__oom_reap_task_mm(tsk, mm))
 		schedule_timeout_idle(HZ/10);

-	if (attempts <= MAX_OOM_REAP_RETRIES)
+	if (attempts <= MAX_OOM_REAP_RETRIES ||
+	    test_bit(MMF_OOM_SKIP, &mm->flags))
 		goto done;
--
1.8.3.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread
* Re: [PATCH] mm,oom_reaper: Check for MMF_OOM_SKIP before complain. 2018-03-22 10:46 ` [PATCH] " Tetsuo Handa @ 2018-03-22 11:59 ` Michal Hocko 0 siblings, 0 replies; 20+ messages in thread From: Michal Hocko @ 2018-03-22 11:59 UTC (permalink / raw) To: Tetsuo Handa; +Cc: akpm, rientjes, linux-mm On Thu 22-03-18 19:46:36, Tetsuo Handa wrote: > >From b141cdbe0db852549c94d5b1e6a9967ca69d59fd Mon Sep 17 00:00:00 2001 > From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> > Date: Thu, 22 Mar 2018 19:44:12 +0900 > Subject: [PATCH] mm,oom_reaper: Check for MMF_OOM_SKIP before complain. > > I got "oom_reaper: unable to reap pid:" messages when the victim thread > was blocked inside free_pgtables() (which occurred after returning from > unmap_vmas() and setting MMF_OOM_SKIP). We don't need to complain when > exit_mmap() already set MMF_OOM_SKIP. ... because we do not report a failure when MMF_OOM_SKIP is set already (see __oom_reap_task_mm). > [ 663.593821] Killed process 7558 (a.out) total-vm:4176kB, anon-rss:84kB, file-rss:0kB, shmem-rss:0kB > [ 664.684801] oom_reaper: unable to reap pid:7558 (a.out) > [ 664.892292] a.out D13272 7558 6931 0x00100084 > [ 664.895765] Call Trace: > [ 664.897574] ? __schedule+0x25f/0x780 > [ 664.900099] schedule+0x2d/0x80 > [ 664.902260] rwsem_down_write_failed+0x2bb/0x440 > [ 664.905249] ? rwsem_down_write_failed+0x55/0x440 > [ 664.908335] ? free_pgd_range+0x569/0x5e0 > [ 664.911145] call_rwsem_down_write_failed+0x13/0x20 > [ 664.914121] down_write+0x49/0x60 > [ 664.916519] ? unlink_file_vma+0x28/0x50 > [ 664.919255] unlink_file_vma+0x28/0x50 > [ 664.922234] free_pgtables+0x36/0x100 > [ 664.924797] exit_mmap+0xbb/0x180 > [ 664.927220] mmput+0x50/0x110 > [ 664.929504] copy_process.part.41+0xb61/0x1fe0 > [ 664.932448] ? _do_fork+0xe6/0x560 > [ 664.934902] ? _do_fork+0xe6/0x560 > [ 664.937361] _do_fork+0xe6/0x560 > [ 664.939742] ? syscall_trace_enter+0x1a9/0x240 > [ 664.942693] ? retint_user+0x18/0x18 > [ 664.945309] ? 
page_fault+0x2f/0x50 > [ 664.947896] ? trace_hardirqs_on_caller+0x11f/0x1b0 > [ 664.951075] do_syscall_64+0x74/0x230 > [ 664.953747] entry_SYSCALL_64_after_hwframe+0x42/0xb7 > > Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> > Acked-by: David Rientjes <rientjes@google.com> > Cc: Michal Hocko <mhocko@suse.com> OK, fair enough. I think it is on borderline of usefulness but it doesn't seem harmful. I can already see somebody asking why the MMF_OOM_SKIP really requires mmap_sem in __oom_reap_task_mm while it doesn't in oom_reap_task but what ever. Acked-by: Michal Hocko <mhocko@suse.com> > --- > mm/oom_kill.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/mm/oom_kill.c b/mm/oom_kill.c > index 5336985..dfd3705 100644 > --- a/mm/oom_kill.c > +++ b/mm/oom_kill.c > @@ -590,7 +590,8 @@ static void oom_reap_task(struct task_struct *tsk) > while (attempts++ < MAX_OOM_REAP_RETRIES && !__oom_reap_task_mm(tsk, mm)) > schedule_timeout_idle(HZ/10); > > - if (attempts <= MAX_OOM_REAP_RETRIES) > + if (attempts <= MAX_OOM_REAP_RETRIES || > + test_bit(MMF_OOM_SKIP, &mm->flags)) > goto done; > > > -- > 1.8.3.1 -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] mm,oom_reaper: Correct MAX_OOM_REAP_RETRIES'th attempt. 2018-03-20 12:37 ` Tetsuo Handa 2018-03-20 12:47 ` [PATCH 2/2] mm,oom_reaper: Check for MMF_OOM_SKIP before complain Tetsuo Handa @ 2018-03-20 13:24 ` Michal Hocko 2018-03-20 13:37 ` Tetsuo Handa 1 sibling, 1 reply; 20+ messages in thread From: Michal Hocko @ 2018-03-20 13:24 UTC (permalink / raw) To: Tetsuo Handa; +Cc: linux-mm, rientjes On Tue 20-03-18 21:37:42, Tetsuo Handa wrote: > Michal Hocko wrote: > > On Tue 20-03-18 20:57:56, Tetsuo Handa wrote: > > > I got "oom_reaper: unable to reap pid:" messages when the victim thread > > > was blocked inside free_pgtables() (which occurred after returning from > > > unmap_vmas() and setting MMF_OOM_SKIP). We don't need to complain when > > > __oom_reap_task_mm() returned true (by e.g. finding MMF_OOM_SKIP already > > > set) when oom_reap_task() was trying MAX_OOM_REAP_RETRIES'th attempt. > > > > > > [ 663.593821] Killed process 7558 (a.out) total-vm:4176kB, anon-rss:84kB, file-rss:0kB, shmem-rss:0kB > > > [ 664.684801] oom_reaper: unable to reap pid:7558 (a.out) > > > > I do not see "oom_reaper: reaped process..." so has the task been > > reaped? > > The log is http://I-love.SAKURA.ne.jp/tmp/serial-20180320.txt.xz . 
xzgrep oom_reaper: serial-20180320.txt.xz [ 97.144805] oom_reaper: reaped process 2142 (cleanupd), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB [ 103.447171] oom_reaper: reaped process 659 (NetworkManager), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB [ 208.906879] oom_reaper: reaped process 1415 (postgres), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB [ 214.179760] oom_reaper: reaped process 588 (irqbalance), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB [ 215.013260] oom_reaper: reaped process 591 (systemd-logind), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB [ 221.483724] oom_reaper: reaped process 7388 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB [ 223.140437] oom_reaper: reaped process 9252 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB [ 224.735597] oom_reaper: reaped process 9257 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB [ 226.608508] oom_reaper: unable to reap pid:9261 (a.out) [ 354.342331] oom_reaper: unable to reap pid:16611 (a.out) [ 380.525910] oom_reaper: unable to reap pid:6927 (a.out) [ 408.397539] oom_reaper: unable to reap pid:6927 (a.out) [ 435.702005] oom_reaper: unable to reap pid:7554 (a.out) [ 466.269660] oom_reaper: unable to reap pid:7560 (a.out) [ 495.621196] oom_reaper: unable to reap pid:7563 (a.out) [ 534.279602] oom_reaper: reaped process 22038 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB [ 536.271631] oom_reaper: unable to reap pid:6928 (a.out) [ 600.285293] oom_reaper: unable to reap pid:7550 (a.out) [ 664.684801] oom_reaper: unable to reap pid:7558 (a.out) [ 743.188067] oom_reaper: reaped process 7562 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB [ 744.980868] oom_reaper: unable to reap pid:6928 (a.out) None of the reaped taks is reported as "unable to reap..." So what actually gets fixed by this patch? -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] mm,oom_reaper: Correct MAX_OOM_REAP_RETRIES'th attempt. 2018-03-20 13:24 ` [PATCH 2/2] mm,oom_reaper: Correct MAX_OOM_REAP_RETRIES'th attempt Michal Hocko @ 2018-03-20 13:37 ` Tetsuo Handa 2018-03-20 13:44 ` Michal Hocko 0 siblings, 1 reply; 20+ messages in thread From: Tetsuo Handa @ 2018-03-20 13:37 UTC (permalink / raw) To: mhocko; +Cc: linux-mm, rientjes Michal Hocko wrote: > On Tue 20-03-18 21:37:42, Tetsuo Handa wrote: > > Michal Hocko wrote: > > > On Tue 20-03-18 20:57:56, Tetsuo Handa wrote: > > > > I got "oom_reaper: unable to reap pid:" messages when the victim thread > > > > was blocked inside free_pgtables() (which occurred after returning from > > > > unmap_vmas() and setting MMF_OOM_SKIP). We don't need to complain when > > > > __oom_reap_task_mm() returned true (by e.g. finding MMF_OOM_SKIP already > > > > set) when oom_reap_task() was trying MAX_OOM_REAP_RETRIES'th attempt. > > > > > > > > [ 663.593821] Killed process 7558 (a.out) total-vm:4176kB, anon-rss:84kB, file-rss:0kB, shmem-rss:0kB > > > > [ 664.684801] oom_reaper: unable to reap pid:7558 (a.out) > > > > > > I do not see "oom_reaper: reaped process..." so has the task been > > > reaped? > > > > The log is http://I-love.SAKURA.ne.jp/tmp/serial-20180320.txt.xz . 
> > xzgrep oom_reaper: serial-20180320.txt.xz > [ 97.144805] oom_reaper: reaped process 2142 (cleanupd), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB > [ 103.447171] oom_reaper: reaped process 659 (NetworkManager), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB > [ 208.906879] oom_reaper: reaped process 1415 (postgres), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB > [ 214.179760] oom_reaper: reaped process 588 (irqbalance), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB > [ 215.013260] oom_reaper: reaped process 591 (systemd-logind), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB > [ 221.483724] oom_reaper: reaped process 7388 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB > [ 223.140437] oom_reaper: reaped process 9252 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB > [ 224.735597] oom_reaper: reaped process 9257 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB > [ 226.608508] oom_reaper: unable to reap pid:9261 (a.out) > [ 354.342331] oom_reaper: unable to reap pid:16611 (a.out) > [ 380.525910] oom_reaper: unable to reap pid:6927 (a.out) > [ 408.397539] oom_reaper: unable to reap pid:6927 (a.out) > [ 435.702005] oom_reaper: unable to reap pid:7554 (a.out) > [ 466.269660] oom_reaper: unable to reap pid:7560 (a.out) > [ 495.621196] oom_reaper: unable to reap pid:7563 (a.out) > [ 534.279602] oom_reaper: reaped process 22038 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB > [ 536.271631] oom_reaper: unable to reap pid:6928 (a.out) > [ 600.285293] oom_reaper: unable to reap pid:7550 (a.out) > [ 664.684801] oom_reaper: unable to reap pid:7558 (a.out) > [ 743.188067] oom_reaper: reaped process 7562 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB > [ 744.980868] oom_reaper: unable to reap pid:6928 (a.out) > > None of the reaped taks is reported as "unable to reap..." So what > actually gets fixed by this patch? 
This patch (which I sent as "[PATCH 2/2] mm,oom_reaper: Check for MMF_OOM_SKIP before complain.") makes the OOM reaper not complain if MMF_OOM_SKIP was already set by exit_mmap(). ^ permalink raw reply [flat|nested] 20+ messages in thread
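For readers skimming the archive: the change Tetsuo describes boils down to testing MMF_OOM_SKIP before printing the complaint in oom_reap_task(). The following is only a sketch against the mm/oom_kill.c of that era, not the exact patch posted in this thread; the retry loop and helper names around the added test_bit() are reconstructed and may differ in detail:

```c
/* Sketch: suppress the "unable to reap" message when exit_mmap() has
 * already set MMF_OOM_SKIP, i.e. the victim's memory is gone anyway.
 * The two added lines are the test_bit() check in the condition below. */
static void oom_reap_task(struct task_struct *tsk)
{
	int attempts = 0;
	struct mm_struct *mm = tsk->signal->oom_mm;

	/* Retry the reap for a while before giving up. */
	while (attempts++ < MAX_OOM_REAP_RETRIES && !__oom_reap_task_mm(tsk, mm))
		schedule_timeout_idle(HZ / 10);

	if (attempts <= MAX_OOM_REAP_RETRIES ||
	    test_bit(MMF_OOM_SKIP, &mm->flags))
		goto done;	/* reaped, or exit_mmap() beat us to it */

	pr_info("oom_reaper: unable to reap pid:%d (%s)\n",
		task_pid_nr(tsk), tsk->comm);
	debug_show_all_locks();
done:
	/* set MMF_OOM_SKIP, drop the reference, ... */
}
```

The point is the added test_bit(): if MMF_OOM_SKIP is already set there is nothing left to reap, so the failure message would be noise.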
* Re: [PATCH 2/2] mm,oom_reaper: Correct MAX_OOM_REAP_RETRIES'th attempt. 2018-03-20 13:37 ` Tetsuo Handa @ 2018-03-20 13:44 ` Michal Hocko 0 siblings, 0 replies; 20+ messages in thread From: Michal Hocko @ 2018-03-20 13:44 UTC (permalink / raw) To: Tetsuo Handa; +Cc: linux-mm, rientjes On Tue 20-03-18 22:37:31, Tetsuo Handa wrote: > Michal Hocko wrote: > > On Tue 20-03-18 21:37:42, Tetsuo Handa wrote: > > > Michal Hocko wrote: > > > > On Tue 20-03-18 20:57:56, Tetsuo Handa wrote: > > > > > I got "oom_reaper: unable to reap pid:" messages when the victim thread > > > > > was blocked inside free_pgtables() (which occurred after returning from > > > > > unmap_vmas() and setting MMF_OOM_SKIP). We don't need to complain when > > > > > __oom_reap_task_mm() returned true (by e.g. finding MMF_OOM_SKIP already > > > > > set) when oom_reap_task() was trying MAX_OOM_REAP_RETRIES'th attempt. > > > > > > > > > > [ 663.593821] Killed process 7558 (a.out) total-vm:4176kB, anon-rss:84kB, file-rss:0kB, shmem-rss:0kB > > > > > [ 664.684801] oom_reaper: unable to reap pid:7558 (a.out) > > > > > > > > I do not see "oom_reaper: reaped process..." so has the task been > > > > reaped? > > > > > > The log is http://I-love.SAKURA.ne.jp/tmp/serial-20180320.txt.xz . 
> > > > xzgrep oom_reaper: serial-20180320.txt.xz > > [ 97.144805] oom_reaper: reaped process 2142 (cleanupd), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB > > [ 103.447171] oom_reaper: reaped process 659 (NetworkManager), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB > > [ 208.906879] oom_reaper: reaped process 1415 (postgres), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB > > [ 214.179760] oom_reaper: reaped process 588 (irqbalance), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB > > [ 215.013260] oom_reaper: reaped process 591 (systemd-logind), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB > > [ 221.483724] oom_reaper: reaped process 7388 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB > > [ 223.140437] oom_reaper: reaped process 9252 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB > > [ 224.735597] oom_reaper: reaped process 9257 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB > > [ 226.608508] oom_reaper: unable to reap pid:9261 (a.out) > > [ 354.342331] oom_reaper: unable to reap pid:16611 (a.out) > > [ 380.525910] oom_reaper: unable to reap pid:6927 (a.out) > > [ 408.397539] oom_reaper: unable to reap pid:6927 (a.out) > > [ 435.702005] oom_reaper: unable to reap pid:7554 (a.out) > > [ 466.269660] oom_reaper: unable to reap pid:7560 (a.out) > > [ 495.621196] oom_reaper: unable to reap pid:7563 (a.out) > > [ 534.279602] oom_reaper: reaped process 22038 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB > > [ 536.271631] oom_reaper: unable to reap pid:6928 (a.out) > > [ 600.285293] oom_reaper: unable to reap pid:7550 (a.out) > > [ 664.684801] oom_reaper: unable to reap pid:7558 (a.out) > > [ 743.188067] oom_reaper: reaped process 7562 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB > > [ 744.980868] oom_reaper: unable to reap pid:6928 (a.out) > > > > None of the reaped taks is reported as "unable to reap..." So what > > actually gets fixed by this patch? 
> > This patch (which I sent as "[PATCH 2/2] mm,oom_reaper: Check for MMF_OOM_SKIP before complain.") > makes the OOM reaper not to complain if MMF_OOM_SKIP was already set by exit_mmap(). Sigh. So can you send a patch with a meaningful changelog showing the problem and explaining how it is fixed? -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/2] mm,oom_reaper: Show trace of unable to reap victim thread. 2018-03-20 11:57 [PATCH 1/2] mm,oom_reaper: Show trace of unable to reap victim thread Tetsuo Handa 2018-03-20 11:57 ` [PATCH 2/2] mm,oom_reaper: Correct MAX_OOM_REAP_RETRIES'th attempt Tetsuo Handa @ 2018-03-20 12:28 ` Michal Hocko 2018-03-20 12:52 ` Tetsuo Handa 1 sibling, 1 reply; 20+ messages in thread From: Michal Hocko @ 2018-03-20 12:28 UTC (permalink / raw) To: Tetsuo Handa; +Cc: linux-mm, David Rientjes On Tue 20-03-18 20:57:55, Tetsuo Handa wrote: > I found that it is not difficult to hit "oom_reaper: unable to reap pid:" > messages if the victim thread is doing copy_process(). Since I noticed > that it is likely helpful to show trace of unable to reap victim thread > for finding locations which should use killable wait, this patch does so. > > [ 226.608508] oom_reaper: unable to reap pid:9261 (a.out) > [ 226.611971] a.out D13056 9261 6927 0x00100084 > [ 226.615879] Call Trace: > [ 226.617926] ? __schedule+0x25f/0x780 > [ 226.620559] schedule+0x2d/0x80 > [ 226.623356] rwsem_down_write_failed+0x2bb/0x440 > [ 226.626426] ? rwsem_down_write_failed+0x55/0x440 > [ 226.629458] ? anon_vma_fork+0x124/0x150 > [ 226.632679] call_rwsem_down_write_failed+0x13/0x20 > [ 226.635884] down_write+0x49/0x60 > [ 226.638867] ? copy_process.part.41+0x12f2/0x1fe0 > [ 226.642042] copy_process.part.41+0x12f2/0x1fe0 /* i_mmap_lock_write() in dup_mmap() */ > [ 226.645087] ? _do_fork+0xe6/0x560 > [ 226.647991] _do_fork+0xe6/0x560 > [ 226.650495] ? syscall_trace_enter+0x1a9/0x240 > [ 226.653443] ? retint_user+0x18/0x18 > [ 226.656601] ? page_fault+0x2f/0x50 > [ 226.659159] ? trace_hardirqs_on_caller+0x11f/0x1b0 > [ 226.662399] do_syscall_64+0x74/0x230 > [ 226.664989] entry_SYSCALL_64_after_hwframe+0x42/0xb7 A single stack trace in the changelog would be sufficient IMHO. Appart from that. What do you expect users will do about this trace? 
Sure they will see a path which holds mmap_sem, we will see a bug report but we can hardly do anything about that. We simply cannot drop the lock from that path in 99% of situations. So _why_ do we want to add more information to the log? [...] > Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> > Cc: Michal Hocko <mhocko@suse.com> > Cc: David Rientjes <rientjes@google.com> > --- > mm/oom_kill.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/mm/oom_kill.c b/mm/oom_kill.c > index 5336985..900300c 100644 > --- a/mm/oom_kill.c > +++ b/mm/oom_kill.c > @@ -41,6 +41,7 @@ > #include <linux/kthread.h> > #include <linux/init.h> > #include <linux/mmu_notifier.h> > +#include <linux/sched/debug.h> > > #include <asm/tlb.h> > #include "internal.h" > @@ -596,6 +597,7 @@ static void oom_reap_task(struct task_struct *tsk) > > pr_info("oom_reaper: unable to reap pid:%d (%s)\n", > task_pid_nr(tsk), tsk->comm); > + sched_show_task(tsk); > debug_show_all_locks(); > > done: > -- > 1.8.3.1 > -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/2] mm,oom_reaper: Show trace of unable to reap victim thread. 2018-03-20 12:28 ` [PATCH 1/2] mm,oom_reaper: Show trace of unable to reap victim thread Michal Hocko @ 2018-03-20 12:52 ` Tetsuo Handa 2018-03-20 13:19 ` Michal Hocko 0 siblings, 1 reply; 20+ messages in thread From: Tetsuo Handa @ 2018-03-20 12:52 UTC (permalink / raw) To: mhocko; +Cc: linux-mm, rientjes Michal Hocko wrote: > On Tue 20-03-18 20:57:55, Tetsuo Handa wrote: > > I found that it is not difficult to hit "oom_reaper: unable to reap pid:" > > messages if the victim thread is doing copy_process(). Since I noticed > > that it is likely helpful to show trace of unable to reap victim thread > > for finding locations which should use killable wait, this patch does so. > > > > [ 226.608508] oom_reaper: unable to reap pid:9261 (a.out) > > [ 226.611971] a.out D13056 9261 6927 0x00100084 > > [ 226.615879] Call Trace: > > [ 226.617926] ? __schedule+0x25f/0x780 > > [ 226.620559] schedule+0x2d/0x80 > > [ 226.623356] rwsem_down_write_failed+0x2bb/0x440 > > [ 226.626426] ? rwsem_down_write_failed+0x55/0x440 > > [ 226.629458] ? anon_vma_fork+0x124/0x150 > > [ 226.632679] call_rwsem_down_write_failed+0x13/0x20 > > [ 226.635884] down_write+0x49/0x60 > > [ 226.638867] ? copy_process.part.41+0x12f2/0x1fe0 > > [ 226.642042] copy_process.part.41+0x12f2/0x1fe0 /* i_mmap_lock_write() in dup_mmap() */ > > [ 226.645087] ? _do_fork+0xe6/0x560 > > [ 226.647991] _do_fork+0xe6/0x560 > > [ 226.650495] ? syscall_trace_enter+0x1a9/0x240 > > [ 226.653443] ? retint_user+0x18/0x18 > > [ 226.656601] ? page_fault+0x2f/0x50 > > [ 226.659159] ? trace_hardirqs_on_caller+0x11f/0x1b0 > > [ 226.662399] do_syscall_64+0x74/0x230 > > [ 226.664989] entry_SYSCALL_64_after_hwframe+0x42/0xb7 > > A single stack trace in the changelog would be sufficient IMHO. > Appart from that. What do you expect users will do about this trace? 
> Sure they will see a path which holds mmap_sem, we will see a bug report > but we can hardly do anything about that. We simply cannot drop the lock > from that path in 99% of situations. So _why_ do we want to add more > information to the log? This case is blocked at i_mmap_lock_write(). If we can add error handling path there, we can replace i_mmap_lock_write() with i_mmap_lock_write_killable() and bail out soon. This patch helps finding such locations. ^ permalink raw reply [flat|nested] 20+ messages in thread
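To make Tetsuo's suggestion concrete: no i_mmap_lock_write_killable() existed in the tree at the time of this thread, so the helper below is hypothetical, defined by analogy with down_write_killable(); the dup_mmap() fragment likewise only illustrates where the bail-out would go and its error label is invented. A sketch of the idea, not a tested patch:

```c
/* Hypothetical killable variant of i_mmap_lock_write(), by analogy
 * with down_write_killable(). Returns 0 on success, -EINTR if a fatal
 * signal (e.g. SIGKILL from the OOM killer) is pending. */
static inline int i_mmap_lock_write_killable(struct address_space *mapping)
{
	return down_write_killable(&mapping->i_mmap_rwsem);
}

/* In dup_mmap(), the file-backed case would then become roughly: */
	if (file) {
		struct address_space *mapping = file->f_mapping;

		/* was: i_mmap_lock_write(mapping); */
		if (i_mmap_lock_write_killable(mapping)) {
			retval = -EINTR;
			goto fail_nomem;	/* hypothetical error path */
		}
		/* ... link the new vma into the mapping ... */
		i_mmap_unlock_write(mapping);
	}
```

An OOM victim that bails out here unwinds copy_process() and releases mmap_sem, so the OOM reaper's down_read_trylock() can succeed instead of backing off.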
* Re: [PATCH 1/2] mm,oom_reaper: Show trace of unable to reap victim thread. 2018-03-20 12:52 ` Tetsuo Handa @ 2018-03-20 13:19 ` Michal Hocko 2018-03-20 13:30 ` Tetsuo Handa 0 siblings, 1 reply; 20+ messages in thread From: Michal Hocko @ 2018-03-20 13:19 UTC (permalink / raw) To: Tetsuo Handa; +Cc: linux-mm, rientjes On Tue 20-03-18 21:52:33, Tetsuo Handa wrote: > Michal Hocko wrote: > > On Tue 20-03-18 20:57:55, Tetsuo Handa wrote: > > > I found that it is not difficult to hit "oom_reaper: unable to reap pid:" > > > messages if the victim thread is doing copy_process(). Since I noticed > > > that it is likely helpful to show trace of unable to reap victim thread > > > for finding locations which should use killable wait, this patch does so. > > > > > > [ 226.608508] oom_reaper: unable to reap pid:9261 (a.out) > > > [ 226.611971] a.out D13056 9261 6927 0x00100084 > > > [ 226.615879] Call Trace: > > > [ 226.617926] ? __schedule+0x25f/0x780 > > > [ 226.620559] schedule+0x2d/0x80 > > > [ 226.623356] rwsem_down_write_failed+0x2bb/0x440 > > > [ 226.626426] ? rwsem_down_write_failed+0x55/0x440 > > > [ 226.629458] ? anon_vma_fork+0x124/0x150 > > > [ 226.632679] call_rwsem_down_write_failed+0x13/0x20 > > > [ 226.635884] down_write+0x49/0x60 > > > [ 226.638867] ? copy_process.part.41+0x12f2/0x1fe0 > > > [ 226.642042] copy_process.part.41+0x12f2/0x1fe0 /* i_mmap_lock_write() in dup_mmap() */ > > > [ 226.645087] ? _do_fork+0xe6/0x560 > > > [ 226.647991] _do_fork+0xe6/0x560 > > > [ 226.650495] ? syscall_trace_enter+0x1a9/0x240 > > > [ 226.653443] ? retint_user+0x18/0x18 > > > [ 226.656601] ? page_fault+0x2f/0x50 > > > [ 226.659159] ? trace_hardirqs_on_caller+0x11f/0x1b0 > > > [ 226.662399] do_syscall_64+0x74/0x230 > > > [ 226.664989] entry_SYSCALL_64_after_hwframe+0x42/0xb7 > > > > A single stack trace in the changelog would be sufficient IMHO. > > Appart from that. What do you expect users will do about this trace? 
> > Sure they will see a path which holds mmap_sem, we will see a bug report > > but we can hardly do anything about that. We simply cannot drop the lock > > from that path in 99% of situations. So _why_ do we want to add more > > information to the log? > > This case is blocked at i_mmap_lock_write(). But why does i_mmap_lock_write matter for oom_reaping. We are not touching hugetlb mappings. dup_mmap holds mmap_sem for write which is the most probable source of the backoff. -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/2] mm,oom_reaper: Show trace of unable to reap victim thread. 2018-03-20 13:19 ` Michal Hocko @ 2018-03-20 13:30 ` Tetsuo Handa 2018-03-20 13:34 ` Michal Hocko 0 siblings, 1 reply; 20+ messages in thread From: Tetsuo Handa @ 2018-03-20 13:30 UTC (permalink / raw) To: mhocko; +Cc: linux-mm, rientjes Michal Hocko wrote: > On Tue 20-03-18 21:52:33, Tetsuo Handa wrote: > > Michal Hocko wrote: > > > A single stack trace in the changelog would be sufficient IMHO. > > > Appart from that. What do you expect users will do about this trace? > > > Sure they will see a path which holds mmap_sem, we will see a bug report > > > but we can hardly do anything about that. We simply cannot drop the lock > > > from that path in 99% of situations. So _why_ do we want to add more > > > information to the log? > > > > This case is blocked at i_mmap_lock_write(). > > But why does i_mmap_lock_write matter for oom_reaping. We are not > touching hugetlb mappings. dup_mmap holds mmap_sem for write which is > the most probable source of the backoff. If i_mmap_lock_write can bail out upon SIGKILL, the OOM victim will be able to release mmap_sem held for write, which helps the OOM reaper not to back off. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/2] mm,oom_reaper: Show trace of unable to reap victim thread. 2018-03-20 13:30 ` Tetsuo Handa @ 2018-03-20 13:34 ` Michal Hocko 2018-03-20 13:50 ` Tetsuo Handa 0 siblings, 1 reply; 20+ messages in thread From: Michal Hocko @ 2018-03-20 13:34 UTC (permalink / raw) To: Tetsuo Handa; +Cc: linux-mm, rientjes On Tue 20-03-18 22:30:16, Tetsuo Handa wrote: > Michal Hocko wrote: > > On Tue 20-03-18 21:52:33, Tetsuo Handa wrote: > > > Michal Hocko wrote: > > > > A single stack trace in the changelog would be sufficient IMHO. > > > > Appart from that. What do you expect users will do about this trace? > > > > Sure they will see a path which holds mmap_sem, we will see a bug report > > > > but we can hardly do anything about that. We simply cannot drop the lock > > > > from that path in 99% of situations. So _why_ do we want to add more > > > > information to the log? > > > > > > This case is blocked at i_mmap_lock_write(). > > > > But why does i_mmap_lock_write matter for oom_reaping. We are not > > touching hugetlb mappings. dup_mmap holds mmap_sem for write which is > > the most probable source of the backoff. > > If i_mmap_lock_write can bail out upon SIGKILL, the OOM victim will be able to > release mmap_sem held for write, which helps the OOM reaper not to back off. There are so many other blocking calls (including allocations) in dup_mmap that I do not really think i_mmap_lock_write is the biggest problem. That will be likely the case for other mmap_sem write lockers. Really I am not sure dumping more information is beneficial here. -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/2] mm,oom_reaper: Show trace of unable to reap victim thread. 2018-03-20 13:34 ` Michal Hocko @ 2018-03-20 13:50 ` Tetsuo Handa 2018-03-20 14:10 ` Michal Hocko 0 siblings, 1 reply; 20+ messages in thread From: Tetsuo Handa @ 2018-03-20 13:50 UTC (permalink / raw) To: mhocko; +Cc: linux-mm, rientjes Michal Hocko wrote: > On Tue 20-03-18 22:30:16, Tetsuo Handa wrote: > > Michal Hocko wrote: > > > On Tue 20-03-18 21:52:33, Tetsuo Handa wrote: > > > > Michal Hocko wrote: > > > > > A single stack trace in the changelog would be sufficient IMHO. > > > > > Appart from that. What do you expect users will do about this trace? > > > > > Sure they will see a path which holds mmap_sem, we will see a bug report > > > > > but we can hardly do anything about that. We simply cannot drop the lock > > > > > from that path in 99% of situations. So _why_ do we want to add more > > > > > information to the log? > > > > > > > > This case is blocked at i_mmap_lock_write(). > > > > > > But why does i_mmap_lock_write matter for oom_reaping. We are not > > > touching hugetlb mappings. dup_mmap holds mmap_sem for write which is > > > the most probable source of the backoff. > > > > If i_mmap_lock_write can bail out upon SIGKILL, the OOM victim will be able to > > release mmap_sem held for write, which helps the OOM reaper not to back off. > > There are so many other blocking calls (including allocations) in > dup_mmap Yes. But > that I do not really think i_mmap_lock_write is the biggest > problem. That will be likely the case for other mmap_sem write lockers. i_mmap_lock_write() is one of the problems which we could afford fixing. 8 out of 11 "oom_reaper: unable to reap" messages are blocked at i_mmap_lock_write(). [ 226.608508] oom_reaper: unable to reap pid:9261 (a.out) [ 226.611971] a.out D13056 9261 6927 0x00100084 [ 226.615879] Call Trace: [ 226.617926] ? 
__schedule+0x25f/0x780 [ 226.620559] schedule+0x2d/0x80 [ 226.623356] rwsem_down_write_failed+0x2bb/0x440 [ 226.626426] ? rwsem_down_write_failed+0x55/0x440 [ 226.629458] ? anon_vma_fork+0x124/0x150 [ 226.632679] call_rwsem_down_write_failed+0x13/0x20 [ 226.635884] down_write+0x49/0x60 [ 226.638867] ? copy_process.part.41+0x12f2/0x1fe0 [ 226.642042] copy_process.part.41+0x12f2/0x1fe0 [ 226.645087] ? _do_fork+0xe6/0x560 [ 226.647991] _do_fork+0xe6/0x560 [ 226.650495] ? syscall_trace_enter+0x1a9/0x240 [ 226.653443] ? retint_user+0x18/0x18 [ 226.656601] ? page_fault+0x2f/0x50 [ 226.659159] ? trace_hardirqs_on_caller+0x11f/0x1b0 [ 226.662399] do_syscall_64+0x74/0x230 [ 226.664989] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [ 354.342331] oom_reaper: unable to reap pid:16611 (a.out) [ 354.346707] a.out D13312 16611 6927 0x00100084 [ 354.351173] Call Trace: [ 354.354164] ? __schedule+0x25f/0x780 [ 354.357633] schedule+0x2d/0x80 [ 354.360976] rwsem_down_write_failed+0x2bb/0x440 [ 354.364774] ? rwsem_down_write_failed+0x55/0x440 [ 354.368749] call_rwsem_down_write_failed+0x13/0x20 [ 354.372703] down_write+0x49/0x60 [ 354.376106] ? copy_process.part.41+0x12f2/0x1fe0 [ 354.379956] copy_process.part.41+0x12f2/0x1fe0 [ 354.383791] ? _do_fork+0xe6/0x560 [ 354.386967] _do_fork+0xe6/0x560 [ 354.390016] ? syscall_trace_enter+0x1a9/0x240 [ 354.393756] ? retint_user+0x18/0x18 [ 354.397296] ? page_fault+0x2f/0x50 [ 354.401081] ? trace_hardirqs_on_caller+0x11f/0x1b0 [ 354.404772] do_syscall_64+0x74/0x230 [ 354.407749] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [ 380.525910] oom_reaper: unable to reap pid:6927 (a.out) [ 380.529274] a.out R running task 13544 6927 1 0x00100084 [ 380.533264] Call Trace: [ 380.535433] ? __schedule+0x25f/0x780 [ 380.538052] ? mark_held_locks+0x60/0x80 [ 380.540775] schedule+0x2d/0x80 [ 380.543258] schedule_timeout+0x1a4/0x350 [ 380.546053] ? 
__next_timer_interrupt+0xd0/0xd0 [ 380.549033] msleep+0x25/0x30 [ 380.551458] shrink_inactive_list+0x5b7/0x690 [ 380.554408] ? __lock_acquire+0x1f1/0xfd0 [ 380.557228] ? find_held_lock+0x2d/0x90 [ 380.559959] shrink_node_memcg+0x340/0x770 [ 380.562776] ? __lock_acquire+0x246/0xfd0 [ 380.565532] ? mem_cgroup_iter+0x121/0x4f0 [ 380.568337] ? mem_cgroup_iter+0x121/0x4f0 [ 380.571203] shrink_node+0xd8/0x370 [ 380.573745] do_try_to_free_pages+0xe3/0x390 [ 380.576759] try_to_free_pages+0xc8/0x110 [ 380.579476] __alloc_pages_slowpath+0x28a/0x8d9 [ 380.582639] __alloc_pages_nodemask+0x21d/0x260 [ 380.585662] new_slab+0x558/0x760 [ 380.588188] ___slab_alloc+0x353/0x6f0 [ 380.590962] ? copy_process.part.41+0x121f/0x1fe0 [ 380.594008] ? find_held_lock+0x2d/0x90 [ 380.596750] ? copy_process.part.41+0x121f/0x1fe0 [ 380.599852] __slab_alloc+0x41/0x7a [ 380.602469] ? copy_process.part.41+0x121f/0x1fe0 [ 380.605607] kmem_cache_alloc+0x1a6/0x1f0 [ 380.608528] copy_process.part.41+0x121f/0x1fe0 [ 380.611613] ? _do_fork+0xe6/0x560 [ 380.614299] _do_fork+0xe6/0x560 [ 380.616752] ? syscall_trace_enter+0x1a9/0x240 [ 380.619813] ? retint_user+0x18/0x18 [ 380.622588] ? page_fault+0x2f/0x50 [ 380.625184] ? trace_hardirqs_on_caller+0x11f/0x1b0 [ 380.628841] do_syscall_64+0x74/0x230 [ 380.631660] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [ 408.397539] oom_reaper: unable to reap pid:6927 (a.out) [ 408.400862] a.out D13544 6927 1 0x00100084 [ 408.404262] Call Trace: [ 408.406449] ? __schedule+0x25f/0x780 [ 408.409063] schedule+0x2d/0x80 [ 408.411440] rwsem_down_write_failed+0x2bb/0x440 [ 408.414544] ? rwsem_down_write_failed+0x55/0x440 [ 408.417684] call_rwsem_down_write_failed+0x13/0x20 [ 408.420866] down_write+0x49/0x60 [ 408.423442] ? copy_process.part.41+0x12f2/0x1fe0 [ 408.426521] copy_process.part.41+0x12f2/0x1fe0 [ 408.429587] ? _do_fork+0xe6/0x560 [ 408.432134] _do_fork+0xe6/0x560 [ 408.434601] ? syscall_trace_enter+0x1a9/0x240 [ 408.437729] ? retint_user+0x18/0x18 [ 408.440391] ? 
page_fault+0x2f/0x50 [ 408.442955] ? trace_hardirqs_on_caller+0x11f/0x1b0 [ 408.446850] do_syscall_64+0x74/0x230 [ 408.449523] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [ 435.702005] oom_reaper: unable to reap pid:7554 (a.out) [ 435.706759] a.out D13328 7554 6928 0x00100084 [ 435.711417] Call Trace: [ 435.714814] ? __schedule+0x25f/0x780 [ 435.718813] schedule+0x2d/0x80 [ 435.722308] rwsem_down_write_failed+0x2bb/0x440 [ 435.726661] ? rwsem_down_write_failed+0x55/0x440 [ 435.730881] ? anon_vma_fork+0x124/0x150 [ 435.734935] call_rwsem_down_write_failed+0x13/0x20 [ 435.739278] down_write+0x49/0x60 [ 435.743009] ? copy_process.part.41+0x12f2/0x1fe0 [ 435.747173] copy_process.part.41+0x12f2/0x1fe0 [ 435.751304] ? _do_fork+0xe6/0x560 [ 435.754924] _do_fork+0xe6/0x560 [ 435.758538] ? syscall_trace_enter+0x1a9/0x240 [ 435.762533] ? retint_user+0x18/0x18 [ 435.766115] ? page_fault+0x2f/0x50 [ 435.769646] ? trace_hardirqs_on_caller+0x11f/0x1b0 [ 435.773850] do_syscall_64+0x74/0x230 [ 435.777364] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [ 466.269660] oom_reaper: unable to reap pid:7560 (a.out) [ 466.273267] a.out D13120 7560 6928 0x00100084 [ 466.276878] Call Trace: [ 466.279146] ? __schedule+0x25f/0x780 [ 466.281945] schedule+0x2d/0x80 [ 466.284421] rwsem_down_write_failed+0x2bb/0x440 [ 466.287748] ? rwsem_down_write_failed+0x55/0x440 [ 466.291012] ? anon_vma_fork+0x124/0x150 [ 466.293990] call_rwsem_down_write_failed+0x13/0x20 [ 466.297344] down_write+0x49/0x60 [ 466.299925] ? copy_process.part.41+0x12f2/0x1fe0 [ 466.303310] copy_process.part.41+0x12f2/0x1fe0 [ 466.306504] ? _do_fork+0xe6/0x560 [ 466.309257] _do_fork+0xe6/0x560 [ 466.311911] ? syscall_trace_enter+0x1a9/0x240 [ 466.315112] ? retint_user+0x18/0x18 [ 466.317937] ? page_fault+0x2f/0x50 [ 466.320667] ? 
trace_hardirqs_on_caller+0x11f/0x1b0 [ 466.324043] do_syscall_64+0x74/0x230 [ 466.326999] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [ 495.621196] oom_reaper: unable to reap pid:7563 (a.out) [ 495.624785] a.out D12992 7563 6928 0x00100084 [ 495.628398] Call Trace: [ 495.630634] ? __schedule+0x25f/0x780 [ 495.633464] schedule+0x2d/0x80 [ 495.635952] rwsem_down_write_failed+0x2bb/0x440 [ 495.639169] ? rwsem_down_write_failed+0x55/0x440 [ 495.642432] ? anon_vma_fork+0x124/0x150 [ 495.645216] call_rwsem_down_write_failed+0x13/0x20 [ 495.648651] down_write+0x49/0x60 [ 495.651446] ? copy_process.part.41+0x12f2/0x1fe0 [ 495.654843] copy_process.part.41+0x12f2/0x1fe0 [ 495.658031] ? _do_fork+0xe6/0x560 [ 495.660718] _do_fork+0xe6/0x560 [ 495.663415] ? syscall_trace_enter+0x1a9/0x240 [ 495.666662] ? retint_user+0x18/0x18 [ 495.669369] ? page_fault+0x2f/0x50 [ 495.672162] ? trace_hardirqs_on_caller+0x11f/0x1b0 [ 495.675529] do_syscall_64+0x74/0x230 [ 495.678418] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [ 536.271631] oom_reaper: unable to reap pid:6928 (a.out) [ 536.275197] a.out D12992 6928 1 0x00100084 [ 536.278972] Call Trace: [ 536.281110] ? __schedule+0x25f/0x780 [ 536.283883] schedule+0x2d/0x80 [ 536.286593] rwsem_down_write_failed+0x2bb/0x440 [ 536.289821] ? rwsem_down_write_failed+0x55/0x440 [ 536.293113] ? anon_vma_fork+0x124/0x150 [ 536.296262] call_rwsem_down_write_failed+0x13/0x20 [ 536.299554] down_write+0x49/0x60 [ 536.302514] ? copy_process.part.41+0x12f2/0x1fe0 [ 536.305816] copy_process.part.41+0x12f2/0x1fe0 [ 536.309032] ? _do_fork+0xe6/0x560 [ 536.312206] _do_fork+0xe6/0x560 [ 536.314810] ? syscall_trace_enter+0x1a9/0x240 [ 536.318335] ? retint_user+0x18/0x18 [ 536.321064] ? page_fault+0x2f/0x50 [ 536.323757] ? 
trace_hardirqs_on_caller+0x11f/0x1b0 [ 536.327396] do_syscall_64+0x74/0x230 [ 536.330175] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [ 600.285293] oom_reaper: unable to reap pid:7550 (a.out) [ 600.289112] a.out D13104 7550 6931 0x00100084 [ 600.292918] Call Trace: [ 600.295506] ? __schedule+0x25f/0x780 [ 600.298614] ? __lock_acquire+0x246/0xfd0 [ 600.301879] schedule+0x2d/0x80 [ 600.304623] schedule_timeout+0x1fd/0x350 [ 600.307633] ? find_held_lock+0x2d/0x90 [ 600.310909] ? mark_held_locks+0x60/0x80 [ 600.314001] ? _raw_spin_unlock_irq+0x24/0x30 [ 600.317382] wait_for_completion+0xab/0x130 [ 600.320579] ? wake_up_q+0x70/0x70 [ 600.323419] flush_work+0x1bd/0x260 [ 600.326434] ? flush_work+0x174/0x260 [ 600.329332] ? destroy_worker+0x90/0x90 [ 600.332301] drain_all_pages+0x16d/0x1e0 [ 600.335464] __alloc_pages_slowpath+0x443/0x8d9 [ 600.338743] __alloc_pages_nodemask+0x21d/0x260 [ 600.342178] new_slab+0x558/0x760 [ 600.344861] ___slab_alloc+0x353/0x6f0 [ 600.347744] ? copy_process.part.41+0x121f/0x1fe0 [ 600.351213] ? find_held_lock+0x2d/0x90 [ 600.354119] ? copy_process.part.41+0x121f/0x1fe0 [ 600.357491] __slab_alloc+0x41/0x7a [ 600.360206] ? copy_process.part.41+0x121f/0x1fe0 [ 600.363480] kmem_cache_alloc+0x1a6/0x1f0 [ 600.366579] copy_process.part.41+0x121f/0x1fe0 [ 600.369761] ? _do_fork+0xe6/0x560 [ 600.372382] _do_fork+0xe6/0x560 [ 600.375142] ? syscall_trace_enter+0x1a9/0x240 [ 600.378258] ? retint_user+0x18/0x18 [ 600.380978] ? page_fault+0x2f/0x50 [ 600.383814] ? trace_hardirqs_on_caller+0x11f/0x1b0 [ 600.387096] do_syscall_64+0x74/0x230 [ 600.389928] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [ 664.684801] oom_reaper: unable to reap pid:7558 (a.out) [ 664.892292] a.out D13272 7558 6931 0x00100084 [ 664.895765] Call Trace: [ 664.897574] ? __schedule+0x25f/0x780 [ 664.900099] schedule+0x2d/0x80 [ 664.902260] rwsem_down_write_failed+0x2bb/0x440 [ 664.905249] ? rwsem_down_write_failed+0x55/0x440 [ 664.908335] ? 
free_pgd_range+0x569/0x5e0 [ 664.911145] call_rwsem_down_write_failed+0x13/0x20 [ 664.914121] down_write+0x49/0x60 [ 664.916519] ? unlink_file_vma+0x28/0x50 [ 664.919255] unlink_file_vma+0x28/0x50 [ 664.922234] free_pgtables+0x36/0x100 [ 664.924797] exit_mmap+0xbb/0x180 [ 664.927220] mmput+0x50/0x110 [ 664.929504] copy_process.part.41+0xb61/0x1fe0 [ 664.932448] ? _do_fork+0xe6/0x560 [ 664.934902] ? _do_fork+0xe6/0x560 [ 664.937361] _do_fork+0xe6/0x560 [ 664.939742] ? syscall_trace_enter+0x1a9/0x240 [ 664.942693] ? retint_user+0x18/0x18 [ 664.945309] ? page_fault+0x2f/0x50 [ 664.947896] ? trace_hardirqs_on_caller+0x11f/0x1b0 [ 664.951075] do_syscall_64+0x74/0x230 [ 664.953747] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [ 744.980868] oom_reaper: unable to reap pid:6928 (a.out) [ 744.984653] a.out D12992 6928 1 0x00100084 [ 744.988278] Call Trace: [ 744.990524] ? __schedule+0x25f/0x780 [ 744.993435] schedule+0x2d/0x80 [ 744.995877] rwsem_down_write_failed+0x2bb/0x440 [ 744.999272] ? rwsem_down_write_failed+0x55/0x440 [ 745.002759] ? anon_vma_fork+0x124/0x150 [ 745.005789] call_rwsem_down_write_failed+0x13/0x20 [ 745.009324] down_write+0x49/0x60 [ 745.012134] ? copy_process.part.41+0x12f2/0x1fe0 [ 745.015574] copy_process.part.41+0x12f2/0x1fe0 [ 745.018845] ? _do_fork+0xe6/0x560 [ 745.021698] _do_fork+0xe6/0x560 [ 745.024275] ? syscall_trace_enter+0x1a9/0x240 [ 745.027443] ? retint_user+0x18/0x18 [ 745.030281] ? page_fault+0x2f/0x50 [ 745.033138] ? trace_hardirqs_on_caller+0x11f/0x1b0 [ 745.036703] do_syscall_64+0x74/0x230 [ 745.039476] entry_SYSCALL_64_after_hwframe+0x42/0xb7 > > Really I am not sure dumping more information is beneficial here. Converting to use killable where we can afford is beneficial. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/2] mm,oom_reaper: Show trace of unable to reap victim thread. 2018-03-20 13:50 ` Tetsuo Handa @ 2018-03-20 14:10 ` Michal Hocko 2018-03-20 14:20 ` Tetsuo Handa 0 siblings, 1 reply; 20+ messages in thread From: Michal Hocko @ 2018-03-20 14:10 UTC (permalink / raw) To: Tetsuo Handa; +Cc: linux-mm, rientjes On Tue 20-03-18 22:50:21, Tetsuo Handa wrote: > Michal Hocko wrote: > > On Tue 20-03-18 22:30:16, Tetsuo Handa wrote: > > > Michal Hocko wrote: > > > > On Tue 20-03-18 21:52:33, Tetsuo Handa wrote: > > > > > Michal Hocko wrote: > > > > > > A single stack trace in the changelog would be sufficient IMHO. > > > > > > Appart from that. What do you expect users will do about this trace? > > > > > > Sure they will see a path which holds mmap_sem, we will see a bug report > > > > > > but we can hardly do anything about that. We simply cannot drop the lock > > > > > > from that path in 99% of situations. So _why_ do we want to add more > > > > > > information to the log? > > > > > > > > > > This case is blocked at i_mmap_lock_write(). > > > > > > > > But why does i_mmap_lock_write matter for oom_reaping. We are not > > > > touching hugetlb mappings. dup_mmap holds mmap_sem for write which is > > > > the most probable source of the backoff. > > > > > > If i_mmap_lock_write can bail out upon SIGKILL, the OOM victim will be able to > > > release mmap_sem held for write, which helps the OOM reaper not to back off. > > > > There are so many other blocking calls (including allocations) in > > dup_mmap > > Yes. But > > > that I do not really think i_mmap_lock_write is the biggest > > problem. That will be likely the case for other mmap_sem write lockers. > > i_mmap_lock_write() is one of the problems which we could afford fixing. > 8 out of 11 "oom_reaper: unable to reap" messages are blocked at i_mmap_lock_write(). > [...] > > Really I am not sure dumping more information is beneficial here. > > Converting to use killable where we can afford is beneficial. 
I am not questioning that. I am questioning the additional information, because we won't be able to do anything about the mmap_sem holder most of the time. Because they tend to block on allocations... -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/2] mm,oom_reaper: Show trace of unable to reap victim thread. 2018-03-20 14:10 ` Michal Hocko @ 2018-03-20 14:20 ` Tetsuo Handa 2018-03-20 20:46 ` David Rientjes 0 siblings, 1 reply; 20+ messages in thread From: Tetsuo Handa @ 2018-03-20 14:20 UTC (permalink / raw) To: mhocko; +Cc: linux-mm, rientjes Michal Hocko wrote: > On Tue 20-03-18 22:50:21, Tetsuo Handa wrote: > > Michal Hocko wrote: > > > On Tue 20-03-18 22:30:16, Tetsuo Handa wrote: > > > > Michal Hocko wrote: > > > > > On Tue 20-03-18 21:52:33, Tetsuo Handa wrote: > > > > > > Michal Hocko wrote: > > > > > > > A single stack trace in the changelog would be sufficient IMHO. > > > > > > > Appart from that. What do you expect users will do about this trace? > > > > > > > Sure they will see a path which holds mmap_sem, we will see a bug report > > > > > > > but we can hardly do anything about that. We simply cannot drop the lock > > > > > > > from that path in 99% of situations. So _why_ do we want to add more > > > > > > > information to the log? > > > > > > > > > > > > This case is blocked at i_mmap_lock_write(). > > > > > > > > > > But why does i_mmap_lock_write matter for oom_reaping. We are not > > > > > touching hugetlb mappings. dup_mmap holds mmap_sem for write which is > > > > > the most probable source of the backoff. > > > > > > > > If i_mmap_lock_write can bail out upon SIGKILL, the OOM victim will be able to > > > > release mmap_sem held for write, which helps the OOM reaper not to back off. > > > > > > There are so many other blocking calls (including allocations) in > > > dup_mmap > > > > Yes. But > > > > > that I do not really think i_mmap_lock_write is the biggest > > > problem. That will be likely the case for other mmap_sem write lockers. > > > > i_mmap_lock_write() is one of the problems which we could afford fixing. > > 8 out of 11 "oom_reaper: unable to reap" messages are blocked at i_mmap_lock_write(). > > > [...] 
> > > Really I am not sure dumping more information is beneficial here. > > > > Converting to use killable where we can afford is beneficial. > > I am not questioning that. I am questioning the additional information > because we won't be able to do anything about the mmap_sem holder most of > the time. Because they tend to block on allocations... serial-20180320.txt.xz is saying that they tend to block on i_mmap_lock_write() rather than on memory allocations. Making memory allocations killable will need a lot of work, but I think that making the frequently hit down_write() callers killable won't need nearly as much. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/2] mm,oom_reaper: Show trace of unable to reap victim thread.
  2018-03-20 14:20 ` Tetsuo Handa
@ 2018-03-20 20:46 ` David Rientjes
  0 siblings, 0 replies; 20+ messages in thread
From: David Rientjes @ 2018-03-20 20:46 UTC (permalink / raw)
To: Tetsuo Handa; +Cc: mhocko, linux-mm

On Tue, 20 Mar 2018, Tetsuo Handa wrote:

> > I am not questioning that. I am questioning the additional information,
> > because we won't be able to do anything about the mmap_sem holder most of
> > the time, because they tend to block on allocations...
>
> serial-20180320.txt.xz is saying that they tend to block on i_mmap_lock_write() rather
> than on memory allocations. Making memory allocations killable will need a lot of work,
> but I think that making the frequently hit down_write() calls killable won't need so
> much work.

I have available to me more than 28,222,058 occurrences of successful oom reaping and
13,018 occurrences of failing to grab mm->mmap_sem. The number of failures is low on
production workloads, so I don't see an issue with emitting a stack trace in these
instances if it can help improve things. But at my 0.04% failure rate, I can't say that
I would look into them much unless the failure results in the system livelocking.

Unless there's a compelling reason why this shouldn't be done (too much email sent to
linux-mm as a result? :), I say just go ahead and add the darn stack trace.
end of thread, other threads: [~2018-03-22 11:59 UTC | newest]

Thread overview: 20+ messages
2018-03-20 11:57 [PATCH 1/2] mm,oom_reaper: Show trace of unable to reap victim thread Tetsuo Handa
2018-03-20 11:57 ` [PATCH 2/2] mm,oom_reaper: Correct MAX_OOM_REAP_RETRIES'th attempt Tetsuo Handa
2018-03-20 12:12   ` Michal Hocko
2018-03-20 12:37     ` Tetsuo Handa
2018-03-20 12:47       ` [PATCH 2/2] mm,oom_reaper: Check for MMF_OOM_SKIP before complain Tetsuo Handa
2018-03-20 20:49         ` David Rientjes
2018-03-22 10:46           ` [PATCH] " Tetsuo Handa
2018-03-22 11:59             ` Michal Hocko
2018-03-20 13:24   ` [PATCH 2/2] mm,oom_reaper: Correct MAX_OOM_REAP_RETRIES'th attempt Michal Hocko
2018-03-20 13:37     ` Tetsuo Handa
2018-03-20 13:44       ` Michal Hocko
2018-03-20 12:28 ` [PATCH 1/2] mm,oom_reaper: Show trace of unable to reap victim thread Michal Hocko
2018-03-20 12:52   ` Tetsuo Handa
2018-03-20 13:19     ` Michal Hocko
2018-03-20 13:30       ` Tetsuo Handa
2018-03-20 13:34         ` Michal Hocko
2018-03-20 13:50           ` Tetsuo Handa
2018-03-20 14:10             ` Michal Hocko
2018-03-20 14:20               ` Tetsuo Handa
2018-03-20 20:46                 ` David Rientjes