linux-mm.kvack.org archive mirror
* oom-reaper choosing wrong processes.
@ 2016-07-18 23:18 Dave Jones
  2016-07-19  9:08 ` Michal Hocko
  0 siblings, 1 reply; 11+ messages in thread
From: Dave Jones @ 2016-07-18 23:18 UTC (permalink / raw)
  To: linux-mm

I have a patch I use for fuzz-testing runs that adds some code to the oom killer
so that it only ever kills tasks whose process name begins with 'trinity-'.
(The idea being that, since trinity is the only thing really running on this box, killing
anything else would be a mistake.)  For the longest time it's been completely benign, but on
4.7-rc7 I see this:


[ 4599.949520] Out of memory: Kill process 2692 (trinity-main) score 0 or sacrifice child
[ 4599.949550] Killed process 2692 (trinity-main) total-vm:3189400kB, anon-rss:1060kB, file-rss:864kB, shmem-rss:60kB
[ 4599.958881] oom_reaper: reaped process 2692 (trinity-main), now anon-rss:0kB, file-rss:0kB, shmem-rss:60kB
[ 4607.741744] oom_reaper: reaped process 21525 (trinity-c15), now anon-rss:0kB, file-rss:0kB, shmem-rss:82072kB
[ 4607.744269] oom_reaper: reaped process 21525 (trinity-c15), now anon-rss:0kB, file-rss:0kB, shmem-rss:82072kB
[ 4607.746502] oom_reaper: reaped process 26334 (trinity-c4), now anon-rss:0kB, file-rss:0kB, shmem-rss:140112kB
[ 4607.747159] oom_reaper: reaped process 21525 (trinity-c15), now anon-rss:0kB, file-rss:0kB, shmem-rss:82072kB
[ 4607.748741] oom_reaper: reaped process 26334 (trinity-c4), now anon-rss:0kB, file-rss:0kB, shmem-rss:140112kB
[ 4607.749447] oom_reaper: reaped process 21525 (trinity-c15), now anon-rss:0kB, file-rss:0kB, shmem-rss:82072kB
[ 4607.751078] oom_reaper: reaped process 26334 (trinity-c4), now anon-rss:0kB, file-rss:0kB, shmem-rss:140112kB
[ 4607.752320] oom_reaper: reaped process 21525 (trinity-c15), now anon-rss:0kB, file-rss:0kB, shmem-rss:82072kB
[ 4607.753335] oom_reaper: reaped process 26334 (trinity-c4), now anon-rss:0kB, file-rss:0kB, shmem-rss:140112kB
[ 4607.754481] oom_reaper: reaped process 21525 (trinity-c15), now anon-rss:0kB, file-rss:0kB, shmem-rss:82072kB
[ 4607.755643] oom_reaper: reaped process 26334 (trinity-c4), now anon-rss:0kB, file-rss:0kB, shmem-rss:140112kB
[ 4607.757109] oom_reaper: reaped process 21525 (trinity-c15), now anon-rss:0kB, file-rss:0kB, shmem-rss:82072kB
[ 4607.759743] oom_reaper: reaped process 21525 (trinity-c15), now anon-rss:0kB, file-rss:0kB, shmem-rss:82072kB

Looks good.

Then..

[ 4607.765352] sendmail-mta invoked oom-killer: gfp_mask=0x24201ca(GFP_HIGHUSER_MOVABLE|__GFP_COLD), order=0, oom_score_adj=0
[ 4607.765359] sendmail-mta cpuset=/ mems_allowed=0
[ 4607.765365] CPU: 2 PID: 1842 Comm: sendmail-mta Not tainted 4.7.0-rc7-think+ #2
[ 4607.765369]  0000000000000000 00000000c4c0ff2b ffff88045038f718 ffffffffa9589f5b
[ 4607.765372]  ffff88044f39d440 ffff88045038fa60 ffff88045038f7a8 ffffffffa9288c68
[ 4607.765375]  0000000000000206 ffff88045038f748 ffffffffa90f9a88 0000000000000206
[ 4607.765375] Call Trace:
[ 4607.765382]  [<ffffffffa9589f5b>] dump_stack+0x68/0x9d
[ 4607.765385]  [<ffffffffa9288c68>] dump_header.isra.16+0x98/0x4f0
[ 4607.765389]  [<ffffffffa90f9a88>] ? preempt_count_sub+0x18/0xd0
[ 4607.765393]  [<ffffffffa9d6cc82>] ? _raw_spin_unlock_irqrestore+0x42/0x70
[ 4607.765396]  [<ffffffffa9593f84>] ? ___ratelimit+0x114/0x1a0
[ 4607.765398]  [<ffffffffa9289f7f>] oom_kill_process+0x3ff/0x7a0
[ 4607.765405]  [<ffffffffa928a923>] out_of_memory+0x583/0x5a0
[ 4607.765407]  [<ffffffffa928a6e3>] ? out_of_memory+0x343/0x5a0
[ 4607.765410]  [<ffffffffa928a3a0>] ? check_panic_on_oom+0x80/0x80
[ 4607.765413]  [<ffffffffa9141642>] ? trace_hardirqs_on_caller+0x182/0x280
[ 4607.765415]  [<ffffffffa9d65fda>] ? mutex_trylock+0x14a/0x240
[ 4607.765417]  [<ffffffffa929295b>] ? __alloc_pages_nodemask+0xf3b/0x1710
[ 4607.765420]  [<ffffffffa9292f4f>] __alloc_pages_nodemask+0x152f/0x1710
[ 4607.765424]  [<ffffffffa9291a20>] ? warn_alloc_failed+0x220/0x220
[ 4607.765426]  [<ffffffffa913bb2d>] ? get_lock_stats+0x3d/0x70
[ 4607.765428]  [<ffffffffa90f9a88>] ? preempt_count_sub+0x18/0xd0
[ 4607.765436]  [<ffffffffa9282dac>] ? pagecache_get_page+0x2c/0x370
[ 4607.765438]  [<ffffffffa9286513>] filemap_fault+0x5b3/0x880
[ 4607.765443]  [<ffffffffa942cd5f>] ext4_filemap_fault+0x4f/0x70
[ 4607.765446]  [<ffffffffa92d0113>] __do_fault+0x143/0x300
[ 4607.765449]  [<ffffffffa92cffd0>] ? wp_page_copy.isra.82+0x860/0x860
[ 4607.765452]  [<ffffffffa9149967>] ? do_raw_spin_unlock+0x97/0x130
[ 4607.765458]  [<ffffffffa92d70be>] handle_mm_fault+0x165e/0x2390
[ 4607.765461]  [<ffffffffa92d5a60>] ? copy_page_range+0xeb0/0xeb0
[ 4607.765463]  [<ffffffffa9035836>] ? native_sched_clock+0x66/0x160
[ 4607.765465]  [<ffffffffa9035836>] ? native_sched_clock+0x66/0x160
[ 4607.765468]  [<ffffffffa913c5bf>] ? __lock_is_held+0x8f/0xd0
[ 4607.765471]  [<ffffffffa92c99ab>] ? vmacache_find+0xeb/0x140
[ 4607.765474]  [<ffffffffa907240d>] __do_page_fault+0x1ed/0x650
[ 4607.765477]  [<ffffffffa9072890>] do_page_fault+0x20/0x70
[ 4607.765480]  [<ffffffffa927d51d>] ? context_tracking_exit+0x1d/0x20
[ 4607.765482]  [<ffffffffa9d6f21f>] page_fault+0x1f/0x30
[ 4607.765505] Mem-Info:
[ 4607.765518] active_anon:3867180 inactive_anon:426354 isolated_anon:96
                active_file:1238 inactive_file:1906 isolated_file:23
                unevictable:16 dirty:173 writeback:0 unstable:0
                slab_reclaimable:39583 slab_unreclaimable:16385
                mapped:76607 shmem:4291476 pagetables:2902 bounce:0
                free:34703 free_pcp:183 free_cma:0
[ 4607.765526] DMA free:15880kB min:60kB low:72kB high:84kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15984kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:16kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[ 4607.765530] lowmem_reserve[]: 0 3263 17181 17181
[ 4607.765537] DMA32 free:68284kB min:12824kB low:16164kB high:19504kB active_anon:2741916kB inactive_anon:548580kB active_file:60kB inactive_file:420kB unevictable:4kB isolated(anon):0kB isolated(file):0kB present:3459008kB managed:3407724kB mlocked:4kB dirty:128kB writeback:0kB mapped:221140kB shmem:3288392kB slab_reclaimable:23288kB slab_unreclaimable:7616kB kernel_stack:896kB pagetables:1100kB unstable:0kB bounce:0kB free_pcp:260kB local_pcp:120kB free_cma:0kB writeback_tmp:0kB pages_scanned:5008 all_unreclaimable? yes
[ 4607.765540] lowmem_reserve[]: 0 0 13917 13917
[ 4607.765547] Normal free:54648kB min:54696kB low:68944kB high:83192kB active_anon:12726804kB inactive_anon:1156836kB active_file:4892kB inactive_file:7204kB unevictable:60kB isolated(anon):384kB isolated(file):92kB present:17283072kB managed:14251916kB mlocked:60kB dirty:564kB writeback:0kB mapped:85288kB shmem:13877512kB slab_reclaimable:135044kB slab_unreclaimable:57908kB kernel_stack:4800kB pagetables:10508kB unstable:0kB bounce:0kB free_pcp:472kB local_pcp:124kB free_cma:0kB writeback_tmp:0kB pages_scanned:313472 all_unreclaimable? yes
[ 4607.765551] lowmem_reserve[]: 0 0 0 0
[ 4607.765568] DMA: 2*4kB (U) 2*8kB (U) 1*16kB (U) 3*32kB (U) 2*64kB (U) 2*128kB (U) 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15880kB
[ 4607.765590] DMA32: 376*4kB (UME) 398*8kB (UME) 312*16kB (UE) 163*32kB (UME) 62*64kB (UM) 30*128kB (U) 24*256kB (UE) 7*512kB (UE) 1*1024kB (U) 1*2048kB (M) 8*4096kB (M) = 68272kB
[ 4607.765605] Normal: 1358*4kB (UME) 1694*8kB (UME) 921*16kB (UEH) 474*32kB (UME) 87*64kB (UM) 2*128kB (U) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 54712kB
[ 4607.765607] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[ 4607.765609] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 4607.765610] 4295139 total pagecache pages
[ 4607.765611] 299 pages in swap cache
[ 4607.765613] Swap cache stats: add 14353678, delete 14353379, find 1451516/2271689
[ 4607.765614] Free swap  = 0kB
[ 4607.765614] Total swap = 7964668kB
[ 4607.765615] 5189516 pages RAM
[ 4607.765616] 0 pages HighMem/MovableOnly
[ 4607.765617] 770632 pages reserved
[ 4607.765618] 0 pages hwpoisoned
[ 4607.765619] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
[ 4607.765637] [  749]     0   749    13116      782      29       3      385             0 systemd-journal
[ 4607.765641] [  793]     0   793    10640       10      23       3      285         -1000 systemd-udevd
[ 4607.765647] [ 1647]     0  1647    11928       16      27       3      111             0 rpcbind
[ 4607.765651] [ 1653]     0  1653     5841        0      15       3       54             0 rpc.idmapd
[ 4607.765656] [ 1655]     0  1655    11052       24      26       3      114             0 systemd-logind
[ 4607.765687] [ 1657]     0  1657    64579      181      28       3      161             0 rsyslogd
[ 4607.765691] [ 1660]     0  1660     1058        0       8       3       38             0 acpid
[ 4607.765696] [ 1661]     0  1661     7414       22      18       3       52             0 cron
[ 4607.765700] [ 1662]     0  1662     6993        0      19       3       54             0 atd
[ 4607.765704] [ 1664]   105  1664    10744       40      26       3       79          -900 dbus-daemon
[ 4607.765708] [ 1671]     0  1671     6264       29      17       3      157             0 smartd
[ 4607.765712] [ 1738]     0  1738    16948        0      37       3      204         -1000 sshd
[ 4607.765716] [ 1742]     0  1742     9461        0      22       3      195             0 rpc.mountd
[ 4607.765721] [ 1776]     0  1776     3624        0      12       3       39             0 agetty
[ 4607.765725] [ 1797]     0  1797     3319        0      10       3       48             0 mcelog
[ 4607.765729] [ 1799]     0  1799     4824       21      15       3       39             0 irqbalance
[ 4607.765733] [ 1803]   108  1803    25492       42      24       3      124             0 ntpd
[ 4607.765737] [ 1842]     0  1842    19793       48      39       3      410             0 sendmail-mta
[ 4607.765746] [ 1878]     0  1878     5121        0      13       3      262             0 dhclient
[ 4607.765752] [ 2145]  1000  2145    15627        0      33       3      213             0 systemd
[ 4607.765756] [ 2148]  1000  2148    19584        4      40       3      438             0 (sd-pam)
[ 4607.765760] [ 2643]  1000  2643     7465      433      19       3      152             0 tmux
[ 4607.765764] [ 2644]  1000  2644     5864        0      16       3      508             0 bash
[ 4607.765768] [ 2678]  1000  2678     3328       89      11       3       19             0 test-multi.sh
[ 4607.765774] [ 2693]  1000  2693     5864        1      16       3      507             0 bash
[ 4607.765782] [ 6456]  1000  6456     3091       21      11       3       24             0 dmesg
[ 4607.765787] [18624]  1000 18624   750863    43368     520       6        0           500 trinity-c10
[ 4607.765792] [21525]  1000 21525   797320    20517     493       7        0           500 trinity-c15
[ 4607.765796] [22023]  1000 22023   797349     1985     319       7        0           500 trinity-c2
[ 4607.765814] [22658]  1000 22658   797382        1     458       7        0           500 trinity-c0
[ 4607.765818] [26334]  1000 26334   797217    34960     412       7        0           500 trinity-c4
[ 4607.765823] [26388]  1000 26388   797383     9401     118       7        0           500 trinity-c11
[ 4607.765826] oom_kill_process: would have killed process 749 (systemd-journal), but continuing instead...
[ 4608.147644] oom_reaper: reaped process 26334 (trinity-c4), now anon-rss:0kB, file-rss:0kB, shmem-rss:136724kB
[ 4608.148218] oom_reaper: reaped process 18624 (trinity-c10), now anon-rss:0kB, file-rss:0kB, shmem-rss:174356kB
[ 4608.149795] oom_reaper: reaped process 21525 (trinity-c15), now anon-rss:0kB, file-rss:0kB, shmem-rss:86288kB
[ 4608.150734] oom_reaper: reaped process 18624 (trinity-c10), now anon-rss:0kB, file-rss:0kB, shmem-rss:174348kB
[ 4608.152489] oom_reaper: reaped process 21525 (trinity-c15), now anon-rss:0kB, file-rss:0kB, shmem-rss:86288kB
[ 4608.156127] oom_reaper: reaped process 18624 (trinity-c10), now anon-rss:0kB, file-rss:0kB, shmem-rss:174336kB
[ 4608.158798] oom_reaper: reaped process 26334 (trinity-c4), now anon-rss:0kB, file-rss:0kB, shmem-rss:136652kB
[ 4608.161336] oom_reaper: reaped process 26334 (trinity-c4), now anon-rss:0kB, file-rss:0kB, shmem-rss:136652kB
[ 4608.163836] oom_reaper: reaped process 26334 (trinity-c4), now anon-rss:0kB, file-rss:0kB, shmem-rss:136652kB


Whoa. Why did it pick systemd-journal?
My 'skip over !trinity processes' code kicks in, it then kills the right processes, and the box lives on,
but if I hadn't had that diff, the wrong process would have been killed.

	Dave


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: oom-reaper choosing wrong processes.
  2016-07-18 23:18 oom-reaper choosing wrong processes Dave Jones
@ 2016-07-19  9:08 ` Michal Hocko
  2016-07-19 10:52   ` Tetsuo Handa
  2016-07-19 15:33   ` Dave Jones
  0 siblings, 2 replies; 11+ messages in thread
From: Michal Hocko @ 2016-07-19  9:08 UTC (permalink / raw)
  To: Dave Jones; +Cc: linux-mm

On Mon 18-07-16 19:18:50, Dave Jones wrote:
[...]
> [ 4607.765352] sendmail-mta invoked oom-killer: gfp_mask=0x24201ca(GFP_HIGHUSER_MOVABLE|__GFP_COLD), order=0, oom_score_adj=0
> [ 4607.765359] sendmail-mta cpuset=/ mems_allowed=0
[...]
> [ 4607.765619] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
> [ 4607.765637] [  749]     0   749    13116      782      29       3      385             0 systemd-journal
> [ 4607.765641] [  793]     0   793    10640       10      23       3      285         -1000 systemd-udevd
> [ 4607.765647] [ 1647]     0  1647    11928       16      27       3      111             0 rpcbind
> [ 4607.765651] [ 1653]     0  1653     5841        0      15       3       54             0 rpc.idmapd
> [ 4607.765656] [ 1655]     0  1655    11052       24      26       3      114             0 systemd-logind
> [ 4607.765687] [ 1657]     0  1657    64579      181      28       3      161             0 rsyslogd
> [ 4607.765691] [ 1660]     0  1660     1058        0       8       3       38             0 acpid
> [ 4607.765696] [ 1661]     0  1661     7414       22      18       3       52             0 cron
> [ 4607.765700] [ 1662]     0  1662     6993        0      19       3       54             0 atd
> [ 4607.765704] [ 1664]   105  1664    10744       40      26       3       79          -900 dbus-daemon
> [ 4607.765708] [ 1671]     0  1671     6264       29      17       3      157             0 smartd
> [ 4607.765712] [ 1738]     0  1738    16948        0      37       3      204         -1000 sshd
> [ 4607.765716] [ 1742]     0  1742     9461        0      22       3      195             0 rpc.mountd
> [ 4607.765721] [ 1776]     0  1776     3624        0      12       3       39             0 agetty
> [ 4607.765725] [ 1797]     0  1797     3319        0      10       3       48             0 mcelog
> [ 4607.765729] [ 1799]     0  1799     4824       21      15       3       39             0 irqbalance
> [ 4607.765733] [ 1803]   108  1803    25492       42      24       3      124             0 ntpd
> [ 4607.765737] [ 1842]     0  1842    19793       48      39       3      410             0 sendmail-mta
> [ 4607.765746] [ 1878]     0  1878     5121        0      13       3      262             0 dhclient
> [ 4607.765752] [ 2145]  1000  2145    15627        0      33       3      213             0 systemd
> [ 4607.765756] [ 2148]  1000  2148    19584        4      40       3      438             0 (sd-pam)
> [ 4607.765760] [ 2643]  1000  2643     7465      433      19       3      152             0 tmux
> [ 4607.765764] [ 2644]  1000  2644     5864        0      16       3      508             0 bash
> [ 4607.765768] [ 2678]  1000  2678     3328       89      11       3       19             0 test-multi.sh
> [ 4607.765774] [ 2693]  1000  2693     5864        1      16       3      507             0 bash
> [ 4607.765782] [ 6456]  1000  6456     3091       21      11       3       24             0 dmesg
> [ 4607.765787] [18624]  1000 18624   750863    43368     520       6        0           500 trinity-c10
> [ 4607.765792] [21525]  1000 21525   797320    20517     493       7        0           500 trinity-c15
> [ 4607.765796] [22023]  1000 22023   797349     1985     319       7        0           500 trinity-c2
> [ 4607.765814] [22658]  1000 22658   797382        1     458       7        0           500 trinity-c0
> [ 4607.765818] [26334]  1000 26334   797217    34960     412       7        0           500 trinity-c4
> [ 4607.765823] [26388]  1000 26388   797383     9401     118       7        0           500 trinity-c11
> [ 4607.765826] oom_kill_process: would have killed process 749 (systemd-journal), but continuing instead...
> [ 4608.147644] oom_reaper: reaped process 26334 (trinity-c4), now anon-rss:0kB, file-rss:0kB, shmem-rss:136724kB
> [ 4608.148218] oom_reaper: reaped process 18624 (trinity-c10), now anon-rss:0kB, file-rss:0kB, shmem-rss:174356kB
> [ 4608.149795] oom_reaper: reaped process 21525 (trinity-c15), now anon-rss:0kB, file-rss:0kB, shmem-rss:86288kB
> [ 4608.150734] oom_reaper: reaped process 18624 (trinity-c10), now anon-rss:0kB, file-rss:0kB, shmem-rss:174348kB
> [ 4608.152489] oom_reaper: reaped process 21525 (trinity-c15), now anon-rss:0kB, file-rss:0kB, shmem-rss:86288kB
> [ 4608.156127] oom_reaper: reaped process 18624 (trinity-c10), now anon-rss:0kB, file-rss:0kB, shmem-rss:174336kB
> [ 4608.158798] oom_reaper: reaped process 26334 (trinity-c4), now anon-rss:0kB, file-rss:0kB, shmem-rss:136652kB
> [ 4608.161336] oom_reaper: reaped process 26334 (trinity-c4), now anon-rss:0kB, file-rss:0kB, shmem-rss:136652kB
> [ 4608.163836] oom_reaper: reaped process 26334 (trinity-c4), now anon-rss:0kB, file-rss:0kB, shmem-rss:136652kB
> 
> 
> Whoa. Why did it pick systemd-journal ?

Who picked that? select_bad_process?

> My 'skip over !trinity processes' code kicks in, and it then kills the
> right processes, and the box lives on, but if I hadn't have had that
> diff, the wrong process would have been killed.

Could you post your diff please? oom_reaper will reap not only the oom
victims but also tasks which hit the oom path while they are exiting.

see out_of_memory:
	/*
	 * If current has a pending SIGKILL or is exiting, then automatically
	 * select it.  The goal is to allow it to allocate so that it may
	 * quickly exit and free its memory.
	 */
	if (task_will_free_mem(current)) {
		mark_oom_victim(current);
		wake_oom_reaper(current);
		return true;
	}
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: oom-reaper choosing wrong processes.
  2016-07-19  9:08 ` Michal Hocko
@ 2016-07-19 10:52   ` Tetsuo Handa
  2016-07-19 15:36     ` Dave Jones
  2016-07-19 15:33   ` Dave Jones
  1 sibling, 1 reply; 11+ messages in thread
From: Tetsuo Handa @ 2016-07-19 10:52 UTC (permalink / raw)
  To: Michal Hocko, Dave Jones; +Cc: linux-mm

On 2016/07/19 8:18, Dave Jones wrote:
> Whoa. Why did it pick systemd-journal ?

I guess that it is because all of the trinity processes' mm already had MMF_OOM_REAPED set.

The OOM reaper sets MMF_OOM_REAPED when an OOM reap operation succeeds. But the
"[ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name" listing
includes processes whose mm already has MMF_OOM_REAPED set. As a result, trinity-c15 and
trinity-c4 are shown again in the listing. While I can't confirm that trinity-c10, trinity-c2,
trinity-c0 and trinity-c11 had already been OOM killed, I guess they had been, and their mm
already had MMF_OOM_REAPED set.

> My 'skip over !trinity processes' code kicks in, and it then kills the right processes, and the box lives on,
> but if I hadn't have had that diff, the wrong process would have been killed.

As of Linux 4.7, processes whose mm already has MMF_OOM_REAPED set can be selected many
times, because MMF_OOM_REAPED is not checked when taking the task_will_free_mem() shortcut in
out_of_memory(). (This will be fixed in Linux 4.8.) That is, I guess that your system would
already have hit

  panic("Out of memory and no killable processes...\n")

if trinity processes with an MMF_OOM_REAPED mm had not been selected again and again.
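
(For illustration only - a minimal sketch of the kind of guard being described here, written
against the 4.7 snippet Michal quoted above rather than copied from the actual 4.8 fix; the
extra current->mm test is an assumption added just to keep the flag check safe:)

	/*
	 * Illustrative only, not the upstream change: skip the
	 * task_will_free_mem() shortcut once current's mm has already
	 * been reaped, because reaping it again cannot free anything
	 * and a proper victim has to be selected instead.
	 */
	if (task_will_free_mem(current) &&
	    !(current->mm && test_bit(MMF_OOM_REAPED, &current->mm->flags))) {
		mark_oom_victim(current);
		wake_oom_reaper(current);
		return true;
	}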


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: oom-reaper choosing wrong processes.
  2016-07-19  9:08 ` Michal Hocko
  2016-07-19 10:52   ` Tetsuo Handa
@ 2016-07-19 15:33   ` Dave Jones
  2016-07-20  7:09     ` Michal Hocko
  1 sibling, 1 reply; 11+ messages in thread
From: Dave Jones @ 2016-07-19 15:33 UTC (permalink / raw)
  To: Michal Hocko; +Cc: linux-mm

On Tue, Jul 19, 2016 at 11:08:58AM +0200, Michal Hocko wrote:
 > On Mon 18-07-16 19:18:50, Dave Jones wrote:
 > [...]
 > > [ 4607.765352] sendmail-mta invoked oom-killer: gfp_mask=0x24201ca(GFP_HIGHUSER_MOVABLE|__GFP_COLD), order=0, oom_score_adj=0
 > > [ 4607.765359] sendmail-mta cpuset=/ mems_allowed=0
 > [...]
 > > [ 4607.765619] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
 > > [ 4607.765637] [  749]     0   749    13116      782      29       3      385             0 systemd-journal
 > > [ 4607.765641] [  793]     0   793    10640       10      23       3      285         -1000 systemd-udevd
 > > [ 4607.765647] [ 1647]     0  1647    11928       16      27       3      111             0 rpcbind
 > > [ 4607.765651] [ 1653]     0  1653     5841        0      15       3       54             0 rpc.idmapd
 > > [ 4607.765656] [ 1655]     0  1655    11052       24      26       3      114             0 systemd-logind
 > > [ 4607.765687] [ 1657]     0  1657    64579      181      28       3      161             0 rsyslogd
 > > [ 4607.765691] [ 1660]     0  1660     1058        0       8       3       38             0 acpid
 > > [ 4607.765696] [ 1661]     0  1661     7414       22      18       3       52             0 cron
 > > [ 4607.765700] [ 1662]     0  1662     6993        0      19       3       54             0 atd
 > > [ 4607.765704] [ 1664]   105  1664    10744       40      26       3       79          -900 dbus-daemon
 > > [ 4607.765708] [ 1671]     0  1671     6264       29      17       3      157             0 smartd
 > > [ 4607.765712] [ 1738]     0  1738    16948        0      37       3      204         -1000 sshd
 > > [ 4607.765716] [ 1742]     0  1742     9461        0      22       3      195             0 rpc.mountd
 > > [ 4607.765721] [ 1776]     0  1776     3624        0      12       3       39             0 agetty
 > > [ 4607.765725] [ 1797]     0  1797     3319        0      10       3       48             0 mcelog
 > > [ 4607.765729] [ 1799]     0  1799     4824       21      15       3       39             0 irqbalance
 > > [ 4607.765733] [ 1803]   108  1803    25492       42      24       3      124             0 ntpd
 > > [ 4607.765737] [ 1842]     0  1842    19793       48      39       3      410             0 sendmail-mta
 > > [ 4607.765746] [ 1878]     0  1878     5121        0      13       3      262             0 dhclient
 > > [ 4607.765752] [ 2145]  1000  2145    15627        0      33       3      213             0 systemd
 > > [ 4607.765756] [ 2148]  1000  2148    19584        4      40       3      438             0 (sd-pam)
 > > [ 4607.765760] [ 2643]  1000  2643     7465      433      19       3      152             0 tmux
 > > [ 4607.765764] [ 2644]  1000  2644     5864        0      16       3      508             0 bash
 > > [ 4607.765768] [ 2678]  1000  2678     3328       89      11       3       19             0 test-multi.sh
 > > [ 4607.765774] [ 2693]  1000  2693     5864        1      16       3      507             0 bash
 > > [ 4607.765782] [ 6456]  1000  6456     3091       21      11       3       24             0 dmesg
 > > [ 4607.765787] [18624]  1000 18624   750863    43368     520       6        0           500 trinity-c10
 > > [ 4607.765792] [21525]  1000 21525   797320    20517     493       7        0           500 trinity-c15
 > > [ 4607.765796] [22023]  1000 22023   797349     1985     319       7        0           500 trinity-c2
 > > [ 4607.765814] [22658]  1000 22658   797382        1     458       7        0           500 trinity-c0
 > > [ 4607.765818] [26334]  1000 26334   797217    34960     412       7        0           500 trinity-c4
 > > [ 4607.765823] [26388]  1000 26388   797383     9401     118       7        0           500 trinity-c11
 > > [ 4607.765826] oom_kill_process: would have killed process 749 (systemd-journal), but continuing instead...
 > > [ 4608.147644] oom_reaper: reaped process 26334 (trinity-c4), now anon-rss:0kB, file-rss:0kB, shmem-rss:136724kB
 > > [ 4608.148218] oom_reaper: reaped process 18624 (trinity-c10), now anon-rss:0kB, file-rss:0kB, shmem-rss:174356kB
 > > [ 4608.149795] oom_reaper: reaped process 21525 (trinity-c15), now anon-rss:0kB, file-rss:0kB, shmem-rss:86288kB
 > > [ 4608.150734] oom_reaper: reaped process 18624 (trinity-c10), now anon-rss:0kB, file-rss:0kB, shmem-rss:174348kB
 > > [ 4608.152489] oom_reaper: reaped process 21525 (trinity-c15), now anon-rss:0kB, file-rss:0kB, shmem-rss:86288kB
 > > [ 4608.156127] oom_reaper: reaped process 18624 (trinity-c10), now anon-rss:0kB, file-rss:0kB, shmem-rss:174336kB
 > > [ 4608.158798] oom_reaper: reaped process 26334 (trinity-c4), now anon-rss:0kB, file-rss:0kB, shmem-rss:136652kB
 > > [ 4608.161336] oom_reaper: reaped process 26334 (trinity-c4), now anon-rss:0kB, file-rss:0kB, shmem-rss:136652kB
 > > [ 4608.163836] oom_reaper: reaped process 26334 (trinity-c4), now anon-rss:0kB, file-rss:0kB, shmem-rss:136652kB
 > > 
 > > 
 > > Whoa. Why did it pick systemd-journal ?
 > 
 > Who has picked that? select_bad process?
 > 
 > > My 'skip over !trinity processes' code kicks in, and it then kills the
 > > right processes, and the box lives on, but if I hadn't have had that
 > > diff, the wrong process would have been killed.
 > 
 > Could you post your diff please? oom_reaper will reap not only the oom
 > victims but also tasks which hit the oom path and they are exiting.


commit fd1fe9a8ad00ee6a2e81a1152c3150d2289fdf10
Author: Dave Jones <davej@codemonkey.org.uk>
Date:   Mon Jun 15 23:41:09 2015 -0400

    only oom-kill tasks named trinity

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index ddf74487f848..2635db037dd5 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -767,6 +767,15 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
 	if (__ratelimit(&oom_rs))
 		dump_header(oc, p, memcg);
 
+	task_lock(p);
+	if (strncmp(p->comm, "trinity-", 8) != 0) {
+		printk(KERN_ERR "%s: would have killed process %d (%s), but continuing instead...\n",
+			__func__, task_pid_nr(p), p->comm);
+		task_unlock(p);
+		return;
+	}
+	task_unlock(p);
+
 	pr_err("%s: Kill process %d (%s) score %u or sacrifice child\n",
 		message, task_pid_nr(p), p->comm, points);



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: oom-reaper choosing wrong processes.
  2016-07-19 10:52   ` Tetsuo Handa
@ 2016-07-19 15:36     ` Dave Jones
  2016-07-20 10:40       ` Tetsuo Handa
  0 siblings, 1 reply; 11+ messages in thread
From: Dave Jones @ 2016-07-19 15:36 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: Michal Hocko, linux-mm

On Tue, Jul 19, 2016 at 07:52:28PM +0900, Tetsuo Handa wrote:
 > On 2016/07/19 8:18, Dave Jones wrote:
 > > Whoa. Why did it pick systemd-journal ?
 > 
 > I guess that it is because all trinity processes' mm already had MMF_OOM_REAPED set.
 > 
 > The OOM reaper sets MMF_OOM_REAPED when OOM reap operation succeeded. But
 > "[ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name" listing
 > includes processes whose mm already has MMF_OOM_REAPED set. As a result, trinity-c15 and
 > trinity-c4 are shown again in the listing. While I can't confirm that trinity-c10, trinity-c2,
 > trinity-c0 and trinity-c11 are already OOM killed, I guess they are already OOM killed and
 > their mm already had MMF_OOM_REAPED set.

That still doesn't explain why it picked the journal process, instead of waiting until
the previous reaping operation had actually killed those Trinity tasks.

 > > My 'skip over !trinity processes' code kicks in, and it then kills the right processes, and the box lives on,
 > > but if I hadn't have had that diff, the wrong process would have been killed.
 > 
 > As of Linux 4.7, processes whose mm already has MMF_OOM_REAPED can be selected for many
 > times due to not checking MMF_OOM_REAPED when using task_will_free_mem() shortcut in
 > out_of_memory(). (It will be fixed in Linux 4.8.) That is, I guess that your system had
 > already hit
 > 
 >   panic("Out of memory and no killable processes...\n")
 > 
 > if trinity processes with MMF_OOM_REAPED mm were not selected again and again.

That panic was not hit. The machine continued running after killing the right tasks.

	Dave


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: oom-reaper choosing wrong processes.
  2016-07-19 15:33   ` Dave Jones
@ 2016-07-20  7:09     ` Michal Hocko
  2016-07-20 13:23       ` Dave Jones
  0 siblings, 1 reply; 11+ messages in thread
From: Michal Hocko @ 2016-07-20  7:09 UTC (permalink / raw)
  To: Dave Jones; +Cc: linux-mm

On Tue 19-07-16 11:33:35, Dave Jones wrote:
> On Tue, Jul 19, 2016 at 11:08:58AM +0200, Michal Hocko wrote:
> > On Mon 18-07-16 19:18:50, Dave Jones wrote:
> > [...]
> > > [ 4607.765352] sendmail-mta invoked oom-killer: gfp_mask=0x24201ca(GFP_HIGHUSER_MOVABLE|__GFP_COLD), order=0, oom_score_adj=0
> > > [ 4607.765359] sendmail-mta cpuset=/ mems_allowed=0
> > [...]
> > > [ 4607.765619] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
> > > [ 4607.765637] [  749]     0   749    13116      782      29       3      385             0 systemd-journal
> > > [ 4607.765641] [  793]     0   793    10640       10      23       3      285         -1000 systemd-udevd
> > > [ 4607.765647] [ 1647]     0  1647    11928       16      27       3      111             0 rpcbind
> > > [ 4607.765651] [ 1653]     0  1653     5841        0      15       3       54             0 rpc.idmapd
> > > [ 4607.765656] [ 1655]     0  1655    11052       24      26       3      114             0 systemd-logind
> > > [ 4607.765687] [ 1657]     0  1657    64579      181      28       3      161             0 rsyslogd
> > > [ 4607.765691] [ 1660]     0  1660     1058        0       8       3       38             0 acpid
> > > [ 4607.765696] [ 1661]     0  1661     7414       22      18       3       52             0 cron
> > > [ 4607.765700] [ 1662]     0  1662     6993        0      19       3       54             0 atd
> > > [ 4607.765704] [ 1664]   105  1664    10744       40      26       3       79          -900 dbus-daemon
> > > [ 4607.765708] [ 1671]     0  1671     6264       29      17       3      157             0 smartd
> > > [ 4607.765712] [ 1738]     0  1738    16948        0      37       3      204         -1000 sshd
> > > [ 4607.765716] [ 1742]     0  1742     9461        0      22       3      195             0 rpc.mountd
> > > [ 4607.765721] [ 1776]     0  1776     3624        0      12       3       39             0 agetty
> > > [ 4607.765725] [ 1797]     0  1797     3319        0      10       3       48             0 mcelog
> > > [ 4607.765729] [ 1799]     0  1799     4824       21      15       3       39             0 irqbalance
> > > [ 4607.765733] [ 1803]   108  1803    25492       42      24       3      124             0 ntpd
> > > [ 4607.765737] [ 1842]     0  1842    19793       48      39       3      410             0 sendmail-mta
> > > [ 4607.765746] [ 1878]     0  1878     5121        0      13       3      262             0 dhclient
> > > [ 4607.765752] [ 2145]  1000  2145    15627        0      33       3      213             0 systemd
> > > [ 4607.765756] [ 2148]  1000  2148    19584        4      40       3      438             0 (sd-pam)
> > > [ 4607.765760] [ 2643]  1000  2643     7465      433      19       3      152             0 tmux
> > > [ 4607.765764] [ 2644]  1000  2644     5864        0      16       3      508             0 bash
> > > [ 4607.765768] [ 2678]  1000  2678     3328       89      11       3       19             0 test-multi.sh
> > > [ 4607.765774] [ 2693]  1000  2693     5864        1      16       3      507             0 bash
> > > [ 4607.765782] [ 6456]  1000  6456     3091       21      11       3       24             0 dmesg
> > > [ 4607.765787] [18624]  1000 18624   750863    43368     520       6        0           500 trinity-c10
> > > [ 4607.765792] [21525]  1000 21525   797320    20517     493       7        0           500 trinity-c15
> > > [ 4607.765796] [22023]  1000 22023   797349     1985     319       7        0           500 trinity-c2
> > > [ 4607.765814] [22658]  1000 22658   797382        1     458       7        0           500 trinity-c0
> > > [ 4607.765818] [26334]  1000 26334   797217    34960     412       7        0           500 trinity-c4
> > > [ 4607.765823] [26388]  1000 26388   797383     9401     118       7        0           500 trinity-c11
> > > [ 4607.765826] oom_kill_process: would have killed process 749 (systemd-journal), but continuing instead...
> > > [ 4608.147644] oom_reaper: reaped process 26334 (trinity-c4), now anon-rss:0kB, file-rss:0kB, shmem-rss:136724kB
> > > [ 4608.148218] oom_reaper: reaped process 18624 (trinity-c10), now anon-rss:0kB, file-rss:0kB, shmem-rss:174356kB
> > > [ 4608.149795] oom_reaper: reaped process 21525 (trinity-c15), now anon-rss:0kB, file-rss:0kB, shmem-rss:86288kB
> > > [ 4608.150734] oom_reaper: reaped process 18624 (trinity-c10), now anon-rss:0kB, file-rss:0kB, shmem-rss:174348kB
> > > [ 4608.152489] oom_reaper: reaped process 21525 (trinity-c15), now anon-rss:0kB, file-rss:0kB, shmem-rss:86288kB
> > > [ 4608.156127] oom_reaper: reaped process 18624 (trinity-c10), now anon-rss:0kB, file-rss:0kB, shmem-rss:174336kB
> > > [ 4608.158798] oom_reaper: reaped process 26334 (trinity-c4), now anon-rss:0kB, file-rss:0kB, shmem-rss:136652kB
> > > [ 4608.161336] oom_reaper: reaped process 26334 (trinity-c4), now anon-rss:0kB, file-rss:0kB, shmem-rss:136652kB
> > > [ 4608.163836] oom_reaper: reaped process 26334 (trinity-c4), now anon-rss:0kB, file-rss:0kB, shmem-rss:136652kB
> > > 
> > > 
> > > Whoa. Why did it pick systemd-journal ?
> > 
> > Who has picked that? select_bad process?

OK, I see. So select_bad_process has selected systemd-journal, but your
patch has overridden that decision and skipped the oom invocation
altogether. Seeing those trinity-* processes being oom reaped
(some of them repeatedly) is not that surprising, and Tetsuo is right
that this is due to out_of_memory->try_oom_reaper(). The current mmotm
tree handles this part differently to prevent repeated oom reaping, but
that is really minor, as multiple attempts to reap a task are not harmful.

The reason why systemd-journal has been selected is very similar.
E.g.
[ 4607.741744] oom_reaper: reaped process 21525 (trinity-c15), now anon-rss:0kB, file-rss:0kB, shmem-rss:82072kB

This task has already been oom reaped, so oom_badness will ignore it
(it simply doesn't make any sense to select this task because it has
already been killed or is exiting, and has been oom reaped as well). Others
might be in a similar position, or they might already have passed exit_mm
(which sets tsk->mm = NULL), so they are ignored by the oom killer as well.
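
(A paraphrased sketch of the two early exits referred to here, based on the 4.7-era
oom_badness(); illustrative rather than an exact copy of the upstream function:)

	/* Tasks whose threads have all passed exit_mm() have no mm left to
	 * account, so find_lock_task_mm() fails and the badness score is 0. */
	p = find_lock_task_mm(p);
	if (!p)
		return 0;

	/* Explicitly unkillable tasks, and tasks whose mm the oom_reaper has
	 * already torn down, are not worth selecting either. */
	adj = (long)p->signal->oom_score_adj;
	if (adj == OOM_SCORE_ADJ_MIN ||
	    test_bit(MMF_OOM_REAPED, &p->mm->flags)) {
		task_unlock(p);
		return 0;
	}

A score of 0 means select_bad_process() never picks the task, which is how the already
reaped trinity tasks drop out and systemd-journal ends up with the highest remaining score.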

HTH
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: oom-reaper choosing wrong processes.
  2016-07-19 15:36     ` Dave Jones
@ 2016-07-20 10:40       ` Tetsuo Handa
  0 siblings, 0 replies; 11+ messages in thread
From: Tetsuo Handa @ 2016-07-20 10:40 UTC (permalink / raw)
  To: davej; +Cc: mhocko, linux-mm

Dave Jones wrote:
> On Tue, Jul 19, 2016 at 07:52:28PM +0900, Tetsuo Handa wrote:
>  > On 2016/07/19 8:18, Dave Jones wrote:
>  > > Whoa. Why did it pick systemd-journal ?
>  > 
>  > I guess that it is because all trinity processes' mm already had MMF_OOM_REAPED set.
>  > 
>  > The OOM reaper sets MMF_OOM_REAPED when OOM reap operation succeeded. But
>  > "[ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name" listing
>  > includes processes whose mm already has MMF_OOM_REAPED set. As a result, trinity-c15 and
>  > trinity-c4 are shown again in the listing. While I can't confirm that trinity-c10, trinity-c2,
>  > trinity-c0 and trinity-c11 are already OOM killed, I guess they are already OOM killed and
>  > their mm already had MMF_OOM_REAPED set.
> 
> That still doesn't explain why it picked the journal process, instead of waiting until
> the previous reaping operation had actually killed those Trinity tasks.

I thought your patch did

--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -169,6 +169,8 @@ unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
 
 	if (oom_unkillable_task(p, memcg, nodemask))
 		return 0;
+	if (strncmp(p->comm, "trinity-", 8))
+		return 0;
 
 	p = find_lock_task_mm(p);
 	if (!p)

to OOM-kill only Trinity tasks. But your patch did not touch the OOM victim selection logic.
So it is a completely normal and expected result that systemd-journald was selected:
systemd-journald got the highest score among all OOM-killable users of an mm without
MMF_OOM_REAPED set. Nothing is wrong.

By the way, your patch needs to call put_task_struct(p) before returning.
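
(For reference, a hedged sketch of the same oom_kill_process() hook with that reference drop
added; it assumes, as in the 4.7 sources, that the caller took a reference on the chosen task
in select_bad_process():)

	task_lock(p);
	if (strncmp(p->comm, "trinity-", 8) != 0) {
		printk(KERN_ERR "%s: would have killed process %d (%s), but continuing instead...\n",
			__func__, task_pid_nr(p), p->comm);
		task_unlock(p);
		put_task_struct(p);	/* drop the reference taken in select_bad_process() */
		return;
	}
	task_unlock(p);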


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: oom-reaper choosing wrong processes.
  2016-07-20  7:09     ` Michal Hocko
@ 2016-07-20 13:23       ` Dave Jones
  2016-07-20 13:33         ` Dave Jones
  2016-07-20 13:36         ` Michal Hocko
  0 siblings, 2 replies; 11+ messages in thread
From: Dave Jones @ 2016-07-20 13:23 UTC (permalink / raw)
  To: Michal Hocko; +Cc: linux-mm

On Wed, Jul 20, 2016 at 09:09:23AM +0200, Michal Hocko wrote:
 > On Tue 19-07-16 11:33:35, Dave Jones wrote:
 > > On Tue, Jul 19, 2016 at 11:08:58AM +0200, Michal Hocko wrote:
 > > > On Mon 18-07-16 19:18:50, Dave Jones wrote:
 > > > [...]
 > > > > [ 4607.765352] sendmail-mta invoked oom-killer: gfp_mask=0x24201ca(GFP_HIGHUSER_MOVABLE|__GFP_COLD), order=0, oom_score_adj=0
 > > > > [ 4607.765359] sendmail-mta cpuset=/ mems_allowed=0
 > > > [...]
 > > > > [ 4607.765619] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
 > > > > [ 4607.765637] [  749]     0   749    13116      782      29       3      385             0 systemd-journal
 > > > > [ 4607.765641] [  793]     0   793    10640       10      23       3      285         -1000 systemd-udevd
 > > > > [ 4607.765647] [ 1647]     0  1647    11928       16      27       3      111             0 rpcbind
 > > > > [ 4607.765651] [ 1653]     0  1653     5841        0      15       3       54             0 rpc.idmapd
 > > > > [ 4607.765656] [ 1655]     0  1655    11052       24      26       3      114             0 systemd-logind
 > > > > [ 4607.765687] [ 1657]     0  1657    64579      181      28       3      161             0 rsyslogd
 > > > > [ 4607.765691] [ 1660]     0  1660     1058        0       8       3       38             0 acpid
 > > > > [ 4607.765696] [ 1661]     0  1661     7414       22      18       3       52             0 cron
 > > > > [ 4607.765700] [ 1662]     0  1662     6993        0      19       3       54             0 atd
 > > > > [ 4607.765704] [ 1664]   105  1664    10744       40      26       3       79          -900 dbus-daemon
 > > > > [ 4607.765708] [ 1671]     0  1671     6264       29      17       3      157             0 smartd
 > > > > [ 4607.765712] [ 1738]     0  1738    16948        0      37       3      204         -1000 sshd
 > > > > [ 4607.765716] [ 1742]     0  1742     9461        0      22       3      195             0 rpc.mountd
 > > > > [ 4607.765721] [ 1776]     0  1776     3624        0      12       3       39             0 agetty
 > > > > [ 4607.765725] [ 1797]     0  1797     3319        0      10       3       48             0 mcelog
 > > > > [ 4607.765729] [ 1799]     0  1799     4824       21      15       3       39             0 irqbalance
 > > > > [ 4607.765733] [ 1803]   108  1803    25492       42      24       3      124             0 ntpd
 > > > > [ 4607.765737] [ 1842]     0  1842    19793       48      39       3      410             0 sendmail-mta
 > > > > [ 4607.765746] [ 1878]     0  1878     5121        0      13       3      262             0 dhclient
 > > > > [ 4607.765752] [ 2145]  1000  2145    15627        0      33       3      213             0 systemd
 > > > > [ 4607.765756] [ 2148]  1000  2148    19584        4      40       3      438             0 (sd-pam)
 > > > > [ 4607.765760] [ 2643]  1000  2643     7465      433      19       3      152             0 tmux
 > > > > [ 4607.765764] [ 2644]  1000  2644     5864        0      16       3      508             0 bash
 > > > > [ 4607.765768] [ 2678]  1000  2678     3328       89      11       3       19             0 test-multi.sh
 > > > > [ 4607.765774] [ 2693]  1000  2693     5864        1      16       3      507             0 bash
 > > > > [ 4607.765782] [ 6456]  1000  6456     3091       21      11       3       24             0 dmesg
 > > > > [ 4607.765787] [18624]  1000 18624   750863    43368     520       6        0           500 trinity-c10
 > > > > [ 4607.765792] [21525]  1000 21525   797320    20517     493       7        0           500 trinity-c15
 > > > > [ 4607.765796] [22023]  1000 22023   797349     1985     319       7        0           500 trinity-c2
 > > > > [ 4607.765814] [22658]  1000 22658   797382        1     458       7        0           500 trinity-c0
 > > > > [ 4607.765818] [26334]  1000 26334   797217    34960     412       7        0           500 trinity-c4
 > > > > [ 4607.765823] [26388]  1000 26388   797383     9401     118       7        0           500 trinity-c11
 > > > > [ 4607.765826] oom_kill_process: would have killed process 749 (systemd-journal), but continuing instead...
 > > > > [ 4608.147644] oom_reaper: reaped process 26334 (trinity-c4), now anon-rss:0kB, file-rss:0kB, shmem-rss:136724kB
 > > > > [ 4608.148218] oom_reaper: reaped process 18624 (trinity-c10), now anon-rss:0kB, file-rss:0kB, shmem-rss:174356kB
 > > > > [ 4608.149795] oom_reaper: reaped process 21525 (trinity-c15), now anon-rss:0kB, file-rss:0kB, shmem-rss:86288kB
 > > > > [ 4608.150734] oom_reaper: reaped process 18624 (trinity-c10), now anon-rss:0kB, file-rss:0kB, shmem-rss:174348kB
 > > > > [ 4608.152489] oom_reaper: reaped process 21525 (trinity-c15), now anon-rss:0kB, file-rss:0kB, shmem-rss:86288kB
 > > > > [ 4608.156127] oom_reaper: reaped process 18624 (trinity-c10), now anon-rss:0kB, file-rss:0kB, shmem-rss:174336kB
 > > > > [ 4608.158798] oom_reaper: reaped process 26334 (trinity-c4), now anon-rss:0kB, file-rss:0kB, shmem-rss:136652kB
 > > > > [ 4608.161336] oom_reaper: reaped process 26334 (trinity-c4), now anon-rss:0kB, file-rss:0kB, shmem-rss:136652kB
 > > > > [ 4608.163836] oom_reaper: reaped process 26334 (trinity-c4), now anon-rss:0kB, file-rss:0kB, shmem-rss:136652kB
 > > > > 
 > > > > 
 > > > > Whoa. Why did it pick systemd-journal ?
 > > > 
 > > > Who has picked that? select_bad process?
 > 
 > OK, I see. So select_bad_process has selected systemd-journal but your
 > patch has declined this decision and skipped the oom invocation
 > altogether. Seeing those trinity-* processes being oom reaped
 > (repeatedly some of them) is not that surprising and Tetsuo is right
 > that this is due to out_of_memory->try_oom_reaper(). The current mmotm
 > tree has this part different to prevent from repeated oom reaping but
 > that is really minor as multiple attempts to reap a task is not harmful.
 > 
 > The reason why systemd-journal has been selected is very similar.
 > E.g.
 > [ 4607.741744] oom_reaper: reaped process 21525 (trinity-c15), now anon-rss:0kB, file-rss:0kB, shmem-rss:82072kB
 > 
 > so this task has been already oom reaped and so oom_badness will ignore
 > it (it simply doesn't make any sense to select this task because it
 > has been already killed or exiting and oom reaped as well). Others might
 > be in a similar position or they might have passed exit_mm->tsk->mm = NULL
 > so they are ignored by the oom killer as well.

I feel like I'm still missing something.  Why isn't "wait for the already reaped trinity tasks
to exit" the right thing to do here (as my diff forced it to do), instead of "pick even more
victims even though we've already got some reaped processes that haven't exited"?

Not killing systemd-journald allowed the machine to keep running just fine.
If I hadn't patched that out, it would have been killed unnecessarily.

	Dave
 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: oom-reaper choosing wrong processes.
  2016-07-20 13:23       ` Dave Jones
@ 2016-07-20 13:33         ` Dave Jones
  2016-07-20 13:39           ` Michal Hocko
  2016-07-20 13:36         ` Michal Hocko
  1 sibling, 1 reply; 11+ messages in thread
From: Dave Jones @ 2016-07-20 13:33 UTC (permalink / raw)
  To: Michal Hocko; +Cc: linux-mm

On Wed, Jul 20, 2016 at 09:23:04AM -0400, Dave Jones wrote:
 
 >  > so this task has been already oom reaped and so oom_badness will ignore
 >  > it (it simply doesn't make any sense to select this task because it
 >  > has been already killed or exiting and oom reaped as well). Others might
 >  > be in a similar position or they might have passed exit_mm->tsk->mm = NULL
 >  > so they are ignored by the oom killer as well.
 > 
 > I feel like I'm still missing something.  Why isn't "wait for the already reaped trinity tasks to exit"
 > the right thing to do here (as my diff forced it to do), instead of "pick even more victims even
 > though we've already got some reaped processes that haven't exited"
 > 
 > Not killing systemd-journald allowed the machine to keep running just fine.
 > If I hadn't have patched that out, it would have been killed unnecessarily.

nm, I figured it out. As Tetsuo pointed out, I was leaking a task struct,
so those already reaped trinity processes would never truly 'exit'.

Mea culpa, thanks.

	Dave


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: oom-reaper choosing wrong processes.
  2016-07-20 13:23       ` Dave Jones
  2016-07-20 13:33         ` Dave Jones
@ 2016-07-20 13:36         ` Michal Hocko
  1 sibling, 0 replies; 11+ messages in thread
From: Michal Hocko @ 2016-07-20 13:36 UTC (permalink / raw)
  To: Dave Jones; +Cc: linux-mm

On Wed 20-07-16 09:23:04, Dave Jones wrote:
> On Wed, Jul 20, 2016 at 09:09:23AM +0200, Michal Hocko wrote:
>  > On Tue 19-07-16 11:33:35, Dave Jones wrote:
>  > > On Tue, Jul 19, 2016 at 11:08:58AM +0200, Michal Hocko wrote:
>  > > > On Mon 18-07-16 19:18:50, Dave Jones wrote:
>  > > > [...]
>  > > > > [ 4607.765352] sendmail-mta invoked oom-killer: gfp_mask=0x24201ca(GFP_HIGHUSER_MOVABLE|__GFP_COLD), order=0, oom_score_adj=0
>  > > > > [ 4607.765359] sendmail-mta cpuset=/ mems_allowed=0
>  > > > [...]
>  > > > > [ 4607.765619] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
>  > > > > [ 4607.765637] [  749]     0   749    13116      782      29       3      385             0 systemd-journal
>  > > > > [ 4607.765641] [  793]     0   793    10640       10      23       3      285         -1000 systemd-udevd
>  > > > > [ 4607.765647] [ 1647]     0  1647    11928       16      27       3      111             0 rpcbind
>  > > > > [ 4607.765651] [ 1653]     0  1653     5841        0      15       3       54             0 rpc.idmapd
>  > > > > [ 4607.765656] [ 1655]     0  1655    11052       24      26       3      114             0 systemd-logind
>  > > > > [ 4607.765687] [ 1657]     0  1657    64579      181      28       3      161             0 rsyslogd
>  > > > > [ 4607.765691] [ 1660]     0  1660     1058        0       8       3       38             0 acpid
>  > > > > [ 4607.765696] [ 1661]     0  1661     7414       22      18       3       52             0 cron
>  > > > > [ 4607.765700] [ 1662]     0  1662     6993        0      19       3       54             0 atd
>  > > > > [ 4607.765704] [ 1664]   105  1664    10744       40      26       3       79          -900 dbus-daemon
>  > > > > [ 4607.765708] [ 1671]     0  1671     6264       29      17       3      157             0 smartd
>  > > > > [ 4607.765712] [ 1738]     0  1738    16948        0      37       3      204         -1000 sshd
>  > > > > [ 4607.765716] [ 1742]     0  1742     9461        0      22       3      195             0 rpc.mountd
>  > > > > [ 4607.765721] [ 1776]     0  1776     3624        0      12       3       39             0 agetty
>  > > > > [ 4607.765725] [ 1797]     0  1797     3319        0      10       3       48             0 mcelog
>  > > > > [ 4607.765729] [ 1799]     0  1799     4824       21      15       3       39             0 irqbalance
>  > > > > [ 4607.765733] [ 1803]   108  1803    25492       42      24       3      124             0 ntpd
>  > > > > [ 4607.765737] [ 1842]     0  1842    19793       48      39       3      410             0 sendmail-mta
>  > > > > [ 4607.765746] [ 1878]     0  1878     5121        0      13       3      262             0 dhclient
>  > > > > [ 4607.765752] [ 2145]  1000  2145    15627        0      33       3      213             0 systemd
>  > > > > [ 4607.765756] [ 2148]  1000  2148    19584        4      40       3      438             0 (sd-pam)
>  > > > > [ 4607.765760] [ 2643]  1000  2643     7465      433      19       3      152             0 tmux
>  > > > > [ 4607.765764] [ 2644]  1000  2644     5864        0      16       3      508             0 bash
>  > > > > [ 4607.765768] [ 2678]  1000  2678     3328       89      11       3       19             0 test-multi.sh
>  > > > > [ 4607.765774] [ 2693]  1000  2693     5864        1      16       3      507             0 bash
>  > > > > [ 4607.765782] [ 6456]  1000  6456     3091       21      11       3       24             0 dmesg
>  > > > > [ 4607.765787] [18624]  1000 18624   750863    43368     520       6        0           500 trinity-c10
>  > > > > [ 4607.765792] [21525]  1000 21525   797320    20517     493       7        0           500 trinity-c15
>  > > > > [ 4607.765796] [22023]  1000 22023   797349     1985     319       7        0           500 trinity-c2
>  > > > > [ 4607.765814] [22658]  1000 22658   797382        1     458       7        0           500 trinity-c0
>  > > > > [ 4607.765818] [26334]  1000 26334   797217    34960     412       7        0           500 trinity-c4
>  > > > > [ 4607.765823] [26388]  1000 26388   797383     9401     118       7        0           500 trinity-c11
>  > > > > [ 4607.765826] oom_kill_process: would have killed process 749 (systemd-journal), but continuing instead...
>  > > > > [ 4608.147644] oom_reaper: reaped process 26334 (trinity-c4), now anon-rss:0kB, file-rss:0kB, shmem-rss:136724kB
>  > > > > [ 4608.148218] oom_reaper: reaped process 18624 (trinity-c10), now anon-rss:0kB, file-rss:0kB, shmem-rss:174356kB
>  > > > > [ 4608.149795] oom_reaper: reaped process 21525 (trinity-c15), now anon-rss:0kB, file-rss:0kB, shmem-rss:86288kB
>  > > > > [ 4608.150734] oom_reaper: reaped process 18624 (trinity-c10), now anon-rss:0kB, file-rss:0kB, shmem-rss:174348kB
>  > > > > [ 4608.152489] oom_reaper: reaped process 21525 (trinity-c15), now anon-rss:0kB, file-rss:0kB, shmem-rss:86288kB
>  > > > > [ 4608.156127] oom_reaper: reaped process 18624 (trinity-c10), now anon-rss:0kB, file-rss:0kB, shmem-rss:174336kB
>  > > > > [ 4608.158798] oom_reaper: reaped process 26334 (trinity-c4), now anon-rss:0kB, file-rss:0kB, shmem-rss:136652kB
>  > > > > [ 4608.161336] oom_reaper: reaped process 26334 (trinity-c4), now anon-rss:0kB, file-rss:0kB, shmem-rss:136652kB
>  > > > > [ 4608.163836] oom_reaper: reaped process 26334 (trinity-c4), now anon-rss:0kB, file-rss:0kB, shmem-rss:136652kB
>  > > > > 
>  > > > > 
>  > > > > Whoa. Why did it pick systemd-journal ?
>  > > > 
>  > > > Who has picked that? select_bad process?
>  > 
>  > OK, I see. So select_bad_process has selected systemd-journal but your
>  > patch has declined this decision and skipped the oom invocation
>  > altogether. Seeing those trinity-* processes being oom reaped
>  > (repeatedly some of them) is not that surprising and Tetsuo is right
>  > that this is due to out_of_memory->try_oom_reaper(). The current mmotm
>  > tree has this part different to prevent from repeated oom reaping but
>  > that is really minor as multiple attempts to reap a task is not harmful.
>  > 
>  > The reason why systemd-journal has been selected is very similar.
>  > E.g.
>  > [ 4607.741744] oom_reaper: reaped process 21525 (trinity-c15), now anon-rss:0kB, file-rss:0kB, shmem-rss:82072kB
>  > 
>  > so this task has been already oom reaped and so oom_badness will ignore
>  > it (it simply doesn't make any sense to select this task because it
>  > has been already killed or exiting and oom reaped as well). Others might
>  > be in a similar position or they might have passed exit_mm->tsk->mm = NULL
>  > so they are ignored by the oom killer as well.
> 
> I feel like I'm still missing something.  Why isn't "wait for the
> already reaped trinity tasks to exit" the right thing to do here (as
> my diff forced it to do), instead of "pick even more victims even
> though we've already got some reaped processes that haven't exited"

OK, I was probably not clear enough. We wait for the oom reaper to free
up the address space, and the OOM killer will not select another task
while a victim is being reaped. The thing is that those tasks have
already been reaped; they just haven't exited yet. So they are no longer
proper oom victim candidates. They are still sitting on some shmem, but
that is not really reclaimable.
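
(To illustrate the shmem point - a paraphrased sketch of the per-VMA filter the 4.7-era oom
reaper applies while unmapping a victim, not the exact upstream loop; shared mappings such as
shmem are skipped, which is why the shmem-rss numbers in the logs above survive reaping:)

	for (vma = mm->mmap; vma; vma = vma->vm_next) {
		if (is_vm_hugetlb_page(vma))
			continue;
		/* mlocked VMAs would need munlocking first; keep it simple. */
		if (vma->vm_flags & VM_LOCKED)
			continue;
		/*
		 * Unmap only anonymous/private ranges, which can be dropped
		 * without extra work.  Shared mappings (shmem, shared files)
		 * are left alone, so their RSS is still charged afterwards.
		 */
		if (vma_is_anonymous(vma) || !(vma->vm_flags & VM_SHARED))
			unmap_page_range(&tlb, vma, vma->vm_start,
					 vma->vm_end, NULL);
	}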

Does that make more sense now?

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: oom-reaper choosing wrong processes.
  2016-07-20 13:33         ` Dave Jones
@ 2016-07-20 13:39           ` Michal Hocko
  0 siblings, 0 replies; 11+ messages in thread
From: Michal Hocko @ 2016-07-20 13:39 UTC (permalink / raw)
  To: Dave Jones; +Cc: linux-mm

On Wed 20-07-16 09:33:37, Dave Jones wrote:
> On Wed, Jul 20, 2016 at 09:23:04AM -0400, Dave Jones wrote:
>  
>  >  > so this task has been already oom reaped and so oom_badness will ignore
>  >  > it (it simply doesn't make any sense to select this task because it
>  >  > has been already killed or exiting and oom reaped as well). Others might
>  >  > be in a similar position or they might have passed exit_mm->tsk->mm = NULL
>  >  > so they are ignored by the oom killer as well.
>  > 
>  > I feel like I'm still missing something.  Why isn't "wait for the already reaped trinity tasks to exit"
>  > the right thing to do here (as my diff forced it to do), instead of "pick even more victims even
>  > though we've already got some reaped processes that haven't exited"
>  > 
>  > Not killing systemd-journald allowed the machine to keep running just fine.
>  > If I hadn't have patched that out, it would have been killed unnecessarily.
> 
> nm, I figured it out. As Tetsuo pointed out, I was leaking a task struct,
> so those already reaped trinity processes would never truly 'exit'.

A leaked task_struct would leak some memory, but it shouldn't have any
effect on the task's visibility to the oom killer. Tasks are basically
visible until they are unhashed from the task list. But the leak could
indeed have other side effects - like pinning a lot of memory, so that
the OOM kill wouldn't be enough to make forward progress.

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2016-07-20 13:39 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-07-18 23:18 oom-reaper choosing wrong processes Dave Jones
2016-07-19  9:08 ` Michal Hocko
2016-07-19 10:52   ` Tetsuo Handa
2016-07-19 15:36     ` Dave Jones
2016-07-20 10:40       ` Tetsuo Handa
2016-07-19 15:33   ` Dave Jones
2016-07-20  7:09     ` Michal Hocko
2016-07-20 13:23       ` Dave Jones
2016-07-20 13:33         ` Dave Jones
2016-07-20 13:39           ` Michal Hocko
2016-07-20 13:36         ` Michal Hocko
