* oomkillers gone wild.
@ 2012-06-04 15:27 ` Dave Jones
From: Dave Jones @ 2012-06-04 15:27 UTC (permalink / raw)
  To: David Rientjes; +Cc: Linux Kernel, linux-mm

I managed to drive a machine into a state of constant oom killing.
Looking over the logs, it started killing a bunch of processes that look
'wrong' to me.

Here's the first example. From here it just gets worse, taking out
sshd, NetworkManager, and other useful things rather than the memory hogs.

	Dave

[21622.923039] modprobe invoked oom-killer: gfp_mask=0x200da, order=0, oom_adj=0, oom_score_adj=0
[21622.924540] modprobe cpuset=/ mems_allowed=0
[21622.925989] Pid: 10940, comm: modprobe Not tainted 3.5.0-rc1+ #51
[21622.927447] Call Trace:
[21622.928984]  [<ffffffff816426b7>] dump_header+0x81/0x2d8
[21622.930427]  [<ffffffff810b7cd5>] ? trace_hardirqs_on_caller+0x115/0x1a0
[21622.931860]  [<ffffffff8164d742>] ? _raw_spin_unlock_irqrestore+0x42/0x80
[21622.933238]  [<ffffffff8131ee0e>] ? ___ratelimit+0x9e/0x130
[21622.934705]  [<ffffffff81144402>] oom_kill_process+0x282/0x2b0
[21622.936165]  [<ffffffff81144919>] out_of_memory+0x249/0x410
[21622.937688]  [<ffffffff8114a85d>] __alloc_pages_nodemask+0xa6d/0xab0
[21622.939215]  [<ffffffff81189cb6>] alloc_pages_vma+0xb6/0x190
[21622.940720]  [<ffffffff81168940>] __do_fault+0x2f0/0x530
[21622.942235]  [<ffffffff81086f81>] ? get_parent_ip+0x11/0x50
[21622.943802]  [<ffffffff81086f81>] ? get_parent_ip+0x11/0x50
[21622.945313]  [<ffffffff8116b253>] handle_pte_fault+0x93/0x9d0
[21622.946819]  [<ffffffff810b2450>] ? lock_release_holdtime.part.24+0x90/0x140
[21622.948527]  [<ffffffff8116c39f>] handle_mm_fault+0x21f/0x2e0
[21622.950080]  [<ffffffff81650a8b>] do_page_fault+0x13b/0x4b0
[21622.951620]  [<ffffffff81079c03>] ? up_write+0x23/0x40
[21622.953089]  [<ffffffff8164de03>] ? error_sti+0x5/0x6
[21622.954648]  [<ffffffff813264fd>] ? trace_hardirqs_off_thunk+0x3a/0x3c
[21622.956244]  [<ffffffff8164dc0f>] page_fault+0x1f/0x30
[21622.957866] Mem-Info:
[21622.959468] Node 0 DMA per-cpu:
[21622.961067] CPU    0: hi:    0, btch:   1 usd:   0
[21622.962682] CPU    1: hi:    0, btch:   1 usd:   0
[21622.964277] CPU    2: hi:    0, btch:   1 usd:   0
[21622.965877] CPU    3: hi:    0, btch:   1 usd:   0
[21622.967463] CPU    4: hi:    0, btch:   1 usd:   0
[21622.969056] CPU    5: hi:    0, btch:   1 usd:   0
[21622.970673] CPU    6: hi:    0, btch:   1 usd:   0
[21622.972272] CPU    7: hi:    0, btch:   1 usd:   0
[21622.973840] Node 0 DMA32 per-cpu:
[21622.975414] CPU    0: hi:  186, btch:  31 usd:  46
[21622.977075] CPU    1: hi:  186, btch:  31 usd:   0
[21622.978557] CPU    2: hi:  186, btch:  31 usd:  90
[21622.980103] CPU    3: hi:  186, btch:  31 usd: 102
[21622.981609] CPU    4: hi:  186, btch:  31 usd:   0
[21622.983067] CPU    5: hi:  186, btch:  31 usd:  70
[21622.984445] CPU    6: hi:  186, btch:  31 usd:  63
[21622.985854] CPU    7: hi:  186, btch:  31 usd:   0
[21622.987208] Node 0 Normal per-cpu:
[21622.988498] CPU    0: hi:  186, btch:  31 usd:   0
[21622.989772] CPU    1: hi:  186, btch:  31 usd:   0
[21622.990972] CPU    2: hi:  186, btch:  31 usd:   0
[21622.992201] CPU    3: hi:  186, btch:  31 usd:   0
[21622.993348] CPU    4: hi:  186, btch:  31 usd:   0
[21622.994475] CPU    5: hi:  186, btch:  31 usd:   0
[21622.995528] CPU    6: hi:  186, btch:  31 usd:   0
[21622.996551] CPU    7: hi:  186, btch:  31 usd:   0
[21622.997520] active_anon:38553 inactive_anon:39448 isolated_anon:224
[21622.997520]  active_file:84 inactive_file:233 isolated_file:0
[21622.997520]  unevictable:85 dirty:0 writeback:37692 unstable:0
[21622.997520]  free:19526 slab_reclaimable:23656 slab_unreclaimable:553126
[21622.997520]  mapped:50 shmem:1391 pagetables:2892 bounce:0
[21623.001856] Node 0 DMA free:15768kB min:264kB low:328kB high:396kB active_anon:20kB inactive_anon:100kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15632kB mlocked:0kB dirty:0kB writeback:100kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:2 all_unreclaimable? no
[21623.004699] lowmem_reserve[]: 0 2643 3878 3878
[21623.005684] Node 0 DMA32 free:51344kB min:45888kB low:57360kB high:68832kB active_anon:153016kB inactive_anon:152172kB active_file:268kB inactive_file:916kB unevictable:340kB isolated(anon):896kB isolated(file):0kB present:2707204kB mlocked:340kB dirty:0kB writeback:149916kB mapped:200kB shmem:44kB slab_reclaimable:3696kB slab_unreclaimable:1524768kB kernel_stack:725000kB pagetables:6976kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[21623.008972] lowmem_reserve[]: 0 0 1234 1234
[21623.010090] Node 0 Normal free:10992kB min:21424kB low:26780kB high:32136kB active_anon:1176kB inactive_anon:5520kB active_file:68kB inactive_file:16kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1264032kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:5520kB slab_reclaimable:90928kB slab_unreclaimable:687736kB kernel_stack:306280kB pagetables:4592kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:154144 all_unreclaimable? yes
[21623.013696] lowmem_reserve[]: 0 0 0 0
[21623.014957] Node 0 DMA: 3*4kB 0*8kB 1*16kB 0*32kB 4*64kB 1*128kB 2*256kB 1*512kB 2*1024kB 2*2048kB 2*4096kB = 15772kB
[21623.016244] Node 0 DMA32: 740*4kB 306*8kB 113*16kB 101*32kB 219*64kB 104*128kB 39*256kB 4*512kB 1*1024kB 0*2048kB 0*4096kB = 50832kB
[21623.017559] Node 0 Normal: 1356*4kB 382*8kB 5*16kB 0*32kB 2*64kB 0*128kB 1*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 10992kB
[21623.018880] 41885 total pagecache pages
[21623.020193] 39819 pages in swap cache
[21623.021507] Swap cache stats: add 51603753, delete 51563934, find 307374090/308668676
[21623.022846] Free swap  = 5162300kB
[21623.024181] Total swap = 6029308kB
[21623.035243] 1041904 pages RAM
[21623.036660] 70330 pages reserved
[21623.038082] 9037 pages shared
[21623.039509] 814628 pages non-shared
[21623.040899] [ pid ]   uid  tgid total_vm      rss cpu oom_adj oom_score_adj name
[21623.042271] [  335]     0   335     7906        1   2       0             0 systemd-journal
[21623.043645] [  341]     0   341     6590        1   0     -17         -1000 udevd
[21623.045019] [  466]     0   466    86847        2   6       0             0 NetworkManager
[21623.046421] [  470]     0   470     4866        3   0       0             0 smartd
[21623.047891] [  476]     0   476    29599       10   2       0             0 crond
[21623.049313] [  487]     0   487     7068        1   7       0             0 systemd-logind
[21623.050768] [  488]    38   488     8301       10   6       0             0 ntpd
[21623.052189] [  501]    81   501     5435        1   7     -13          -900 dbus-daemon
[21623.053606] [  504]     0   504    47106        1   2     -13          -900 polkitd
[21623.054998] [  507]     0   507    27239        1   7       0             0 agetty
[21623.056575] [  509]     0   509    17260        1   3     -13          -900 modem-manager
[21623.058055] [  520]     0   520     4790        2   4       0             0 rpcbind
[21623.059532] [  530]     0   530    19403        1   5     -17         -1000 sshd
[21623.060988] [  554]     0   554    60782        1   0       0             0 rsyslogd
[21623.062421] [  562]    29   562     5883        1   4       0             0 rpc.statd
[21623.063957] [  570]     0   570     6361        1   1       0             0 rpc.idmapd
[21623.065440] [  571]     0   571     5993        1   3       0             0 rpc.mountd
[21623.066911] [  588]     0   588    22206        1   2       0             0 dhclient
[21623.068409] [  613]     0   613    21273       26   7       0             0 sendmail
[21623.069905] [  633]    51   633    20167       10   1       0             0 sendmail
[21623.071427] [  736]  1000   736     6212      139   7       0             0 tmux
[21623.072933] [  737]  1000   737    29011        1   4       0             0 bash
[21623.074461] [  811]  1000   811    28022        2   3       0             0 test-random.sh
[21623.075977] [  818]  1000   818     4295        7   3       0             0 trinity
[21623.077512] [  820]  1000   820     4311       16   5       0             0 trinity
[21623.079038] [  821]  1000   821     4309       11   6       0             0 trinity
[21623.080572] [  822]  1000   822     4317        9   1       0             0 trinity
[21623.082114] [  823]  1000   823     4301        9   5       0             0 trinity
[21623.083689] [  825]  1000   825     4229        8   4       0             0 trinity
[21623.085225] [  826]  1000   826     4335       14   4       0             0 trinity
[21623.086784] [  827]  1000   827     4403       13   6       0             0 trinity
[21623.088311] [  828]  1000   828     4343       10   5       0             0 trinity
[21623.089849] [  829]  1000   829     4287       16   6       0             0 trinity
[21623.091393] [  830]  1000   830     4319        9   3       0             0 trinity
[21623.092927] [  831]  1000   831     4321        7   5       0             0 trinity
[21623.094467] [  832]  1000   832     4327       15   2       0             0 trinity
[21623.096003] [  833]  1000   833     4389       14   4       0             0 trinity
[21623.097496] [  889]     0   889     6589        1   1     -17         -1000 udevd
[21623.098983] [  899]     0   899     6589        1   4     -17         -1000 udevd
[21623.100424] [23414]  1000 23414     4309       14   2       0             0 trinity-watchdo
[21623.101840] [23752]  1000 23752     4319        9   4       0             0 trinity-watchdo
[21623.103213] [23753]  1000 23753     4343       17   4       0             0 trinity-watchdo
[21623.104583] [23978]  1000 23978     4327       18   0       0             0 trinity-watchdo
[21623.105902] [24392]  1000 24392     4229       10   3       0             0 trinity-watchdo
[21623.107162] [24408]  1000 24408     4389       20   3       0             0 trinity-watchdo
[21623.108408] [24593]  1000 24593     4403       13   3       0             0 trinity-watchdo
[21623.109607] [24656]  1000 24656     4321       10   5       0             0 trinity-watchdo
[21623.110744] [24686]  1000 24686     4335       16   1       0             0 trinity-watchdo
[21623.111838] [25172]  1000 25172     4287       18   4       0             0 trinity-watchdo
[21623.112893] [25229]  1000 25229     4317       17   2       0             0 trinity-watchdo
[21623.113837] [25338]  1000 25338     4301       14   2       0             0 trinity-watchdo
[21623.114758] [27690]  1000 27690     4311       24   6       0             0 trinity-watchdo
[21623.115728] [29047]  1000 29047     4295       12   7       0             0 trinity-watchdo
[21623.116597] [ 7092]  1000  7092  1051124    31660   3       0             0 trinity-child3
[21623.117430] [ 8230]  1000  8230     5070       50   5       0             0 trinity-child5
[21623.118269] [ 8510]  1000  8510     8696      119   5       0             0 trinity-child5
[21623.119088] [ 8572]  1000  8572     4758       30   0       0             0 trinity-child0
[21623.119887] [ 9084]  1000  9084     4998       36   0       0             0 trinity-child0
[21623.120687] [ 9138]  1000  9138     4343       48   0       0             0 trinity-child0
[21623.121479] [ 9518]  1000  9518     4317       45   6       0             0 trinity-child6
[21623.122280] [ 9563]  1000  9563     4857      216   0       0             0 trinity-child0
[21623.123077] [ 9626]  1000  9626     4317       40   1       0             0 trinity-child1
[21623.123875] [ 9700]  1000  9700     4317       27   2       0             0 trinity-child2
[21623.124658] [ 9967]  1000  9967     4747       40   4       0             0 trinity-child4
[21623.125422] [10011]  1000 10011     4317       36   7       0             0 trinity-child7
[21623.126182] [10064]  1000 10064     4311       80   7       0             0 trinity-child7
[21623.126928] [10072]  1000 10072     4623      346   6       0             0 trinity-child6
[21623.127668] [10077]  1000 10077     4403      231   5       0             0 trinity-child5
[21623.128402] [10079]  1000 10079     4834      241   7       0             0 trinity-child7
[21623.129137] [10084]  1000 10084     4733      510   4       0             0 trinity-child4
[21623.129869] [10085]  1000 10085     4403       20   3       0             0 trinity-child3
[21623.130594] [10086]  1000 10086     4858       27   3       0             0 trinity-child3
[21623.131310] [10099]  1000 10099     4321       27   5       0             0 trinity-child5
[21623.132022] [10100]  1000 10100     4387       26   7       0             0 trinity-child7
[21623.132733] [10102]  1000 10102     5520       27   2       0             0 trinity-child2
[21623.133432] [10117]  1000 10117   532719      293   5       0             0 trinity-child5
[21623.134136] [10143]  1000 10143     4319       25   4       0             0 trinity-child4
[21623.134834] [10144]  1000 10144     4287      121   6       0             0 trinity-child6
[21623.135531] [10178]  1000 10178     4295       25   2       0             0 trinity-child2
[21623.136228] [10224]  1000 10224     4319       24   6       0             0 trinity-child6
[21623.136926] [10235]  1000 10235     4319       24   3       0             0 trinity-child3
[21623.137617] [10243]  1000 10243     4499       25   5       0             0 trinity-child5
[21623.138298] [10250]  1000 10250     4229       27   6       0             0 trinity-child6
[21623.138981] [10251]  1000 10251     4389      109   0       0             0 trinity-child0
[21623.139658] [10254]  1000 10254     4319       27   1       0             0 trinity-child1
[21623.140336] [10369]  1000 10369     5010       45   3       0             0 trinity-child3
[21623.141007] [10389]  1000 10389     4371       76   2       0             0 trinity-child2
[21623.141679] [10458]  1000 10458  1052885       46   6       0             0 trinity-child6
[21623.142343] [10460]  1000 10460     5431       39   2       0             0 trinity-child2
[21623.143016] [10461]  1000 10461     4804       38   7       0             0 trinity-child7
[21623.143689] [10462]  1000 10462     4309       30   3       0             0 trinity-child3
[21623.144358] [10475]  1000 10475     4229       23   1       0             0 trinity-child1
[21623.145033] [10476]  1000 10476     4499      284   0       0             0 trinity-child0
[21623.145711] [10478]  1000 10478     4301      111   3       0             0 trinity-child3
[21623.146382] [10482]  1000 10482     4337       86   7       0             0 trinity-child7
[21623.147064] [10485]  1000 10485     4862      116   2       0             0 trinity-child2
[21623.147743] [10499]  1000 10499     4424       82   5       0             0 trinity-child5
[21623.148423] [10526]  1000 10526     5136       36   1       0             0 trinity-child1
[21623.149111] [10538]  1000 10538     4403      137   1       0             0 trinity-child1
[21623.149796] [10553]  1000 10553     4951      355   1       0             0 trinity-child1
[21623.150483] [10554]  1000 10554     4472       25   2       0             0 trinity-child2
[21623.151155] [10555]  1000 10555     4321       40   0       0             0 trinity-child0
[21623.151840] [10650]  1000 10650     5152      208   1       0             0 trinity-child1
[21623.152515] [10685]  1000 10685     4429      366   4       0             0 trinity-child4
[21623.153180] [10693]  1000 10693     4301       68   6       0             0 trinity-child6
[21623.153855] [10694]  1000 10694     4522       64   6       0             0 trinity-child6
[21623.154532] [10712]  1000 10712     4723      656   3       0             0 trinity-child3
[21623.155202] [10713]  1000 10713     4335      219   5       0             0 trinity-child5
[21623.155877] [10714]  1000 10714     4335      251   4       0             0 trinity-child4
[21623.156555] [10715]  1000 10715     4813      682   7       0             0 trinity-child7
[21623.157226] [10761]  1000 10761     4335      234   2       0             0 trinity-child2
[21623.157904] [10777]  1000 10777     4336      279   6       0             0 trinity-child6
[21623.158580] [10788]  1000 10788     4311       48   0       0             0 trinity-child0
[21623.159250] [10789]  1000 10789     4575      466   2       0             0 trinity-child2
[21623.159927] [10790]  1000 10790     4311       60   4       0             0 trinity-child4
[21623.160607] [10791]  1000 10791     4344      276   6       0             0 trinity-child6
[21623.161279] [10792]  1000 10792     4617      574   3       0             0 trinity-child3
[21623.161950] [10793]  1000 10793     4311       55   5       0             0 trinity-child5
[21623.162618] [10794]  1000 10794     4311       52   1       0             0 trinity-child1
[21623.163277] [10795]  1000 10795     5094      974   1       0             0 trinity-child1
[21623.163939] [10802]  1000 10802     4335       82   0       0             0 trinity-child0
[21623.164605] [10813]  1000 10813     4327      113   1       0             0 trinity-child1
[21623.165271] [10846]  1000 10846     4521      273   7       0             0 trinity-child7
[21623.165938] [10856]  1000 10856     4422      217   4       0             0 trinity-child4
[21623.166605] [10880]  1000 10880     4287       71   2       0             0 trinity-child2
[21623.167266] [10884]  1000 10884     4287       53   0       0             0 trinity-child0
[21623.167936] [10894]  1000 10894     4308       53   0       0             0 trinity-child0
[21623.168604] [10897]  1000 10897     4287       77   4       0             0 trinity-child4
[21623.169257] [10905]  1000 10905     4309       52   4       0             0 trinity-child4
[21623.169920] [10907]  1000 10907     4403      300   6       0             0 trinity-child6
[21623.170581] [10924]  1000 10924     4287       69   7       0             0 trinity-child7
[21623.171242] [10931]  1000 10931     4389      143   3       0             0 trinity-child3
[21623.171910] [10938]     0 10938     3182       53   7       0             0 modprobe
[21623.172576] [10940]     0 10940     3726       24   1       0             0 modprobe
[21623.173239] [10942]     0 10942     3182       52   4       0             0 modprobe
[21623.173907] [10944]     0 10944     3726       43   3       0             0 modprobe
[21623.174577] [10945]     0 10945     3726       34   4       0             0 modprobe
[21623.175237] [10946]  1000 10946     4287       19   2       0             0 trinity
[21623.175906] Out of memory: Kill process 588 (dhclient) score 1732579 or sacrifice child
[21623.176588] Killed process 588 (dhclient) total-vm:88824kB, anon-rss:0kB, file-rss:4kB



wtf?

We picked this:

[21623.066911] [  588]     0   588    22206        1   2       0             0 dhclient

over, say:

[21623.116597] [ 7092]  1000  7092  1051124    31660   3       0             0 trinity-child3

What went wrong here?

And why does that score look so weird?
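
For scale, assuming the 3.5-rc1 heuristic where badness points are raw
pages (rss + page tables + swap entries) measured against totalpages =
RAM + swap, the Mem-Info dump above bounds what a sane score could be.
A minimal userspace sketch of that arithmetic:

    #include <stdio.h>

    int main(void)
    {
        unsigned long ram_pages  = 1041904;        /* "1041904 pages RAM" */
        unsigned long swap_pages = 6029308 / 4;    /* "Total swap = 6029308kB", 4kB pages */
        unsigned long totalpages = ram_pages + swap_pages;

        printf("totalpages ~= %lu\n", totalpages); /* ~2.5 million */
        /* a score of 1732579 would mean dhclient holds ~68% of RAM+swap */
        printf("score/totalpages = %.0f%%\n", 1732579 * 100.0 / totalpages);
        return 0;
    }

That ratio is absurd for a daemon whose rss in the table above is a
single page, which is what makes the score look corrupted rather than
merely miscalibrated.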


If the death spiral of endless oom killing sounds interesting, I have about 300MB of log file
so far, which I can upload somewhere.

	Dave


* Re: oomkillers gone wild.
  2012-06-04 15:27 ` Dave Jones
@ 2012-06-04 23:30   ` David Rientjes
From: David Rientjes @ 2012-06-04 23:30 UTC (permalink / raw)
  To: Dave Jones, Linux Kernel, linux-mm

On Mon, 4 Jun 2012, Dave Jones wrote:

> we picked this..
> 
> [21623.066911] [  588]     0   588    22206        1   2       0             0 dhclient
> 
> over say..
> 
> [21623.116597] [ 7092]  1000  7092  1051124    31660   3       0             0 trinity-child3
> 
> What went wrong here ?
> 
> And why does that score look so.. weird.
> 

It sounds like it's because pid 588 has uid=0 and the adjustment for root 
processes is causing an overflow.  I assume this fixes it?
---
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -183,7 +183,7 @@ static bool oom_unkillable_task(struct task_struct *p,
 unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
 			  const nodemask_t *nodemask, unsigned long totalpages)
 {
-	unsigned long points;
+	long points;
 
 	if (oom_unkillable_task(p, memcg, nodemask))
 		return 0;
@@ -223,7 +223,7 @@ unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
 	 * Never return 0 for an eligible task regardless of the root bonus and
 	 * oom_score_adj (oom_score_adj can't be OOM_SCORE_ADJ_MIN here).
 	 */
-	return points ? points : 1;
+	return points > 0 ? points : 1;
 }
 
 /*
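
The failure mode this patch addresses is plain unsigned wraparound: with
points declared unsigned long, subtracting the root-process bonus
(assumed here to take the 3%-of-totalpages form used by the heuristic of
that era) from a tiny raw score wraps to a value near ULONG_MAX instead
of going negative. A minimal userspace sketch of the same arithmetic:

    #include <stdio.h>

    int main(void)
    {
        unsigned long totalpages = 2549231;  /* RAM + swap from the report */
        unsigned long points = 5000;         /* small daemon: rss + ptes + swapents */

        points -= 30 * totalpages / 1000;    /* root bonus underflows */
        printf("unsigned: %lu\n", points);   /* 18446744073709480140 */

        /* with the patch: signed intermediate, clamped at return time */
        long spoints = 5000 - (long)(30 * totalpages / 1000);
        printf("signed: %ld -> badness %ld\n", spoints, spoints > 0 ? spoints : 1);
        return 0;
    }

A wrapped score that large beats every legitimate candidate in
select_bad_process(), which is how a 1-page-rss root daemon can end up
looking like the best kill target.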

* Re: oomkillers gone wild.
  2012-06-04 23:30   ` David Rientjes
@ 2012-06-05 17:44     ` Dave Jones
From: Dave Jones @ 2012-06-05 17:44 UTC (permalink / raw)
  To: David Rientjes; +Cc: Linux Kernel, linux-mm

On Mon, Jun 04, 2012 at 04:30:57PM -0700, David Rientjes wrote:
 > On Mon, 4 Jun 2012, Dave Jones wrote:
 > 
 > > we picked this..
 > > 
 > > [21623.066911] [  588]     0   588    22206        1   2       0             0 dhclient
 > > 
 > > over say..
 > > 
 > > [21623.116597] [ 7092]  1000  7092  1051124    31660   3       0             0 trinity-child3
 > > 
 > > What went wrong here ?
 > > 
 > > And why does that score look so.. weird.
 > > 
 > 
 > It sounds like it's because pid 588 has uid=0 and the adjustment for root 
 > processes is causing an overflow.  I assume this fixes it?

Still doesn't seem right.

e.g.:

[42309.542776] [ pid ]   uid  tgid total_vm      rss cpu oom_adj oom_score_adj name
..
[42309.553933] [  500]    81   500     5435        1   4     -13          -900 dbus-daemon
..
[42309.597531] [ 9054]  1000  9054   528677    14540   3       0             0 trinity-child3
..

[42309.643057] Out of memory: Kill process 500 (dbus-daemon) score 511952 or sacrifice child
[42309.643620] Killed process 500 (dbus-daemon) total-vm:21740kB, anon-rss:0kB, file-rss:4kB

and a slew of similar 'wrong process' death-spiral kills follows.
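
A quick magnitude check on the dbus-daemon case, assuming the same
raw-pages heuristic: its oom_score_adj of -900 is scaled by totalpages,
so the adjustment term alone is on the order of minus two million pages,
dwarfing any plausible raw score for a 1-page-rss daemon:

    #include <stdio.h>

    int main(void)
    {
        unsigned long totalpages = 2549231;                /* RAM + swap, as before */
        long adj_term = -900L * (long)totalpages / 1000;   /* oom_score_adj scaling */

        printf("adj term: %ld pages\n", adj_term);         /* about -2.29 million */
        /* a correctly computed badness for dbus-daemon should therefore
         * clamp to the minimum of 1, not reach six figures */
        return 0;
    }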


	Dave


* Re: oomkillers gone wild.
  2012-06-05 17:44     ` Dave Jones
@ 2012-06-05 18:52       ` Dave Jones
From: Dave Jones @ 2012-06-05 18:52 UTC (permalink / raw)
  To: David Rientjes, Linux Kernel, linux-mm

On Tue, Jun 05, 2012 at 01:44:54PM -0400, Dave Jones wrote:
 > On Mon, Jun 04, 2012 at 04:30:57PM -0700, David Rientjes wrote:
 >  > On Mon, 4 Jun 2012, Dave Jones wrote:
 >  > 
 >  > > we picked this..
 >  > > 
 >  > > [21623.066911] [  588]     0   588    22206        1   2       0             0 dhclient
 >  > > 
 >  > > over say..
 >  > > 
 >  > > [21623.116597] [ 7092]  1000  7092  1051124    31660   3       0             0 trinity-child3
 >  > > 
 >  > > What went wrong here ?
 >  > > 
 >  > > And why does that score look so.. weird.
 >  > > 
 >  > 
 >  > It sounds like it's because pid 588 has uid=0 and the adjustment for root 
 >  > processes is causing an overflow.  I assume this fixes it?
 > 
 > Still doesn't seem right..
 > 
 > eg..
 > 
 > [42309.542776] [ pid ]   uid  tgid total_vm      rss cpu oom_adj oom_score_adj name
 > ..
 > [42309.553933] [  500]    81   500     5435        1   4     -13          -900 dbus-daemon
 > ..
 > [42309.597531] [ 9054]  1000  9054   528677    14540   3       0             0 trinity-child3
 > ..
 > 
 > [42309.643057] Out of memory: Kill process 500 (dbus-daemon) score 511952 or sacrifice child
 > [42309.643620] Killed process 500 (dbus-daemon) total-vm:21740kB, anon-rss:0kB, file-rss:4kB
 > 
 > and a slew of similar 'wrong process' death spiral kills follows..

So after manually killing all the greedy processes and getting the box to stop oom-killing
random things, it settled down. But I noticed something odd that I think I also saw a few
weeks ago...

# free
             total       used       free     shared    buffers     cached
Mem:       3886296    3666924     219372          0       2904      20008
-/+ buffers/cache:    3644012     242284
Swap:      6029308      14488    6014820

What's using up that memory?

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME 
142524 142420  99%    9.67K  47510	  3   1520320K task_struct
142560 142417  99%    1.75K   7920	 18    253440K signal_cache
142428 142302  99%    1.19K   5478	 26    175296K task_xstate
306064 289292  94%    0.36K   6956	 44    111296K debug_objects_cache
143488 143306  99%    0.50K   4484	 32     71744K cred_jar
142560 142421  99%    0.50K   4455       32     71280K task_delay_info
150753 145021  96%    0.45K   4308	 35     68928K kmalloc-128
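
Those task-related caches alone account for most of the "used" figure in
free above; a quick tally of the CACHE SIZE column:

    #include <stdio.h>

    int main(void)
    {
        /* CACHE SIZE column from the slab dump above, in kB */
        unsigned long caches_kb[] = {
            1520320,   /* task_struct */
             253440,   /* signal_cache */
             175296,   /* task_xstate */
             111296,   /* debug_objects_cache */
              71744,   /* cred_jar */
              71280,   /* task_delay_info */
              68928,   /* kmalloc-128 */
        };
        unsigned long total = 0;

        for (unsigned int i = 0; i < sizeof(caches_kb) / sizeof(caches_kb[0]); i++)
            total += caches_kb[i];
        printf("top caches: %lu kB\n", total);   /* 2272304 kB, ~2.2 GB of the ~3.6 GB used */
        return 0;
    }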

Why so many task_structs? There are only 128 processes running, and most of them
are kernel threads.

/sys/kernel/slab/task_struct/alloc_calls shows:

 142421 copy_process.part.21+0xbb/0x1790 age=8/19929576/48173720 pid=0-16867 cpus=0-7

I get the impression that the oom-killer hasn't cleaned up properly after killing some of
those forked processes.

Any thoughts?

	Dave



* Re: oomkillers gone wild.
  2012-06-05 18:52       ` Dave Jones
@ 2012-06-08 19:57         ` David Rientjes
From: David Rientjes @ 2012-06-08 19:57 UTC (permalink / raw)
  To: Dave Jones, Linux Kernel, linux-mm

On Tue, 5 Jun 2012, Dave Jones wrote:

>   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME 
> 142524 142420  99%    9.67K  47510	  3   1520320K task_struct
> 142560 142417  99%    1.75K   7920	 18    253440K signal_cache
> 142428 142302  99%    1.19K   5478	 26    175296K task_xstate
> 306064 289292  94%    0.36K   6956	 44    111296K debug_objects_cache
> 143488 143306  99%    0.50K   4484	 32     71744K cred_jar
> 142560 142421  99%    0.50K   4455       32     71280K task_delay_info
> 150753 145021  96%    0.45K   4308	 35     68928K kmalloc-128
> 
> Why so many task_structs ? There's only 128 processes running, and most of them
> are kernel threads.
> 

Do you have CONFIG_OPROFILE enabled?

> /sys/kernel/slab/task_struct/alloc_calls shows..
> 
>  142421 copy_process.part.21+0xbb/0x1790 age=8/19929576/48173720 pid=0-16867 cpus=0-7
> 
> I get the impression that the oom-killer hasn't cleaned up properly after killing some of
> those forked processes.
> 
> any thoughts ?
> 

If we're leaking task_structs, meaning that put_task_struct() isn't
actually freeing them when the refcount goes to 0, then it's certainly not
because of the oom killer, which only sends a SIGKILL to the selected
process.
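
The lifecycle in question, roughly as it appeared in
include/linux/sched.h of that era (a simplified sketch, not a literal
copy): each task_struct carries a usage refcount, and the object only
goes back to the slab when the last reference is dropped.

    static inline void get_task_struct(struct task_struct *t)
    {
        atomic_inc(&t->usage);          /* take a reference */
    }

    static inline void put_task_struct(struct task_struct *t)
    {
        if (atomic_dec_and_test(&t->usage))
            __put_task_struct(t);       /* returns the task_struct to the slab */
    }

A leak of the size shown in the slab stats means some path took a
reference without a matching put, so usage never reaches zero and the
task_struct outlives its dead process indefinitely.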

Have you tried kmemleak?
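
For reference, kmemleak needs CONFIG_DEBUG_KMEMLEAK and reports through
debugfs; a scan can be triggered and the results read back from userspace.
A minimal sketch, assuming debugfs is mounted at /sys/kernel/debug:

    #include <stdio.h>

    int main(void)
    {
        FILE *f = fopen("/sys/kernel/debug/kmemleak", "r+");
        char line[256];

        if (!f) {
            perror("kmemleak");
            return 1;
        }
        fputs("scan\n", f);                /* same as: echo scan > /sys/kernel/debug/kmemleak */
        fflush(f);
        rewind(f);
        while (fgets(line, sizeof(line), f))
            fputs(line, stdout);           /* one report block per suspected leak */
        fclose(f);
        return 0;
    }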

* Re: oomkillers gone wild.
  2012-06-08 19:57         ` David Rientjes
@ 2012-06-08 20:03           ` Dave Jones
From: Dave Jones @ 2012-06-08 20:03 UTC (permalink / raw)
  To: David Rientjes; +Cc: Linux Kernel, linux-mm

On Fri, Jun 08, 2012 at 12:57:36PM -0700, David Rientjes wrote:
 > On Tue, 5 Jun 2012, Dave Jones wrote:
 > 
 > >   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME 
 > > 142524 142420  99%    9.67K  47510	  3   1520320K task_struct
 > > 142560 142417  99%    1.75K   7920	 18    253440K signal_cache
 > > 142428 142302  99%    1.19K   5478	 26    175296K task_xstate
 > > 306064 289292  94%    0.36K   6956	 44    111296K debug_objects_cache
 > > 143488 143306  99%    0.50K   4484	 32     71744K cred_jar
 > > 142560 142421  99%    0.50K   4455       32     71280K task_delay_info
 > > 150753 145021  96%    0.45K   4308	 35     68928K kmalloc-128
 > > 
 > > Why so many task_structs ? There's only 128 processes running, and most of them
 > > are kernel threads.
 > > 
 > 
 > Do you have CONFIG_OPROFILE enabled?

It's modular (though I should just turn it off; I never use it these days), but not loaded.

 > > /sys/kernel/slab/task_struct/alloc_calls shows..
 > > 
 > >  142421 copy_process.part.21+0xbb/0x1790 age=8/19929576/48173720 pid=0-16867 cpus=0-7
 > > 
 > > I get the impression that the oom-killer hasn't cleaned up properly after killing some of
 > > those forked processes.
 > > 
 > > any thoughts ?
 > > 
 > 
 > If we're leaking task_struct's, meaning that put_task_struct() isn't 
 > actually freeing them when the refcount goes to 0, then it's certainly not 
 > because of the oom killer which only sends a SIGKILL to the selected 
 > process.
 > 
 > Have you tried kmemleak?

I'll give that a shot on Monday. thanks,

	Dave

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: oomkillers gone wild.
  2012-06-05 17:44     ` Dave Jones
@ 2012-06-08 20:15       ` David Rientjes
  -1 siblings, 0 replies; 34+ messages in thread
From: David Rientjes @ 2012-06-08 20:15 UTC (permalink / raw)
  To: Dave Jones, Linux Kernel, linux-mm

On Tue, 5 Jun 2012, Dave Jones wrote:

> Still doesn't seem right..
> 
> eg..
> 
> [42309.542776] [ pid ]   uid  tgid total_vm      rss cpu oom_adj oom_score_adj name
> ..
> [42309.553933] [  500]    81   500     5435        1   4     -13          -900 dbus-daemon
> ..
> [42309.597531] [ 9054]  1000  9054   528677    14540   3       0             0 trinity-child3
> ..
> 
> [42309.643057] Out of memory: Kill process 500 (dbus-daemon) score 511952 or sacrifice child
> [42309.643620] Killed process 500 (dbus-daemon) total-vm:21740kB, anon-rss:0kB, file-rss:4kB
> 
> and a slew of similar 'wrong process' death spiral kills follows..
> 

On a system not under oom conditions, i.e. before you start trinity, can 
you send the output of

	cat /proc/$(pidof dbus-daemon)/oom_score{_adj,}
	grep RSS /proc/$(pidof dbus-daemon)/status

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: oomkillers gone wild.
  2012-06-08 19:57         ` David Rientjes
@ 2012-06-08 20:37           ` Thomas Gleixner
  -1 siblings, 0 replies; 34+ messages in thread
From: Thomas Gleixner @ 2012-06-08 20:37 UTC (permalink / raw)
  To: David Rientjes; +Cc: Dave Jones, Linux Kernel, linux-mm

On Fri, 8 Jun 2012, David Rientjes wrote:
> On Tue, 5 Jun 2012, Dave Jones wrote:
> 
> >   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME 
> > 142524 142420  99%    9.67K  47510	  3   1520320K task_struct
> > 142560 142417  99%    1.75K   7920	 18    253440K signal_cache
> > 142428 142302  99%    1.19K   5478	 26    175296K task_xstate
> > 306064 289292  94%    0.36K   6956	 44    111296K debug_objects_cache
> > 143488 143306  99%    0.50K   4484	 32     71744K cred_jar
> > 142560 142421  99%    0.50K   4455       32     71280K task_delay_info
> > 150753 145021  96%    0.45K   4308	 35     68928K kmalloc-128
> > 
> > Why so many task_structs ? There's only 128 processes running, and most of them
> > are kernel threads.
> > 
> 
> Do you have CONFIG_OPROFILE enabled?
> 
> > /sys/kernel/slab/task_struct/alloc_calls shows..
> > 
> >  142421 copy_process.part.21+0xbb/0x1790 age=8/19929576/48173720 pid=0-16867 cpus=0-7
> > 
> > I get the impression that the oom-killer hasn't cleaned up properly after killing some of
> > those forked processes.
> > 
> > any thoughts ?
> > 
> 
> If we're leaking task_structs, meaning that put_task_struct() isn't
> actually freeing them when the refcount goes to 0, then it's certainly not
> because of the oom killer, which only sends a SIGKILL to the selected
> process.

I rather suspect that this is an asymmetry between get_ and
put_task_struct, and the refcount just doesn't go to zero.
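
As a toy user-space model of that pattern (simplified; the real
task_struct uses atomic usage counts, but the failure mode is the same):
one path takes a reference that is never dropped, so the count gets stuck
above zero and the object is never freed.

#include <stdio.h>
#include <stdlib.h>

struct obj {
	int refcount;
};

static struct obj *get_obj(struct obj *o)
{
	o->refcount++;
	return o;
}

static void put_obj(struct obj *o)
{
	if (--o->refcount == 0) {	/* freed only when it hits zero */
		free(o);
		printf("freed\n");
	}
}

int main(void)
{
	struct obj *o = calloc(1, sizeof(*o));

	o->refcount = 1;	/* initial reference */
	get_obj(o);		/* some path takes an extra reference ... */
	put_obj(o);		/* ... but only the initial one is dropped */
	/* refcount is stuck at 1: "freed" never prints, the object leaks */
	return 0;
}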

Thanks,

	tglx



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: oomkillers gone wild.
  2012-06-08 20:15       ` David Rientjes
@ 2012-06-08 21:03         ` Dave Jones
  -1 siblings, 0 replies; 34+ messages in thread
From: Dave Jones @ 2012-06-08 21:03 UTC (permalink / raw)
  To: David Rientjes; +Cc: Linux Kernel, linux-mm

On Fri, Jun 08, 2012 at 01:15:50PM -0700, David Rientjes wrote:
 > On Tue, 5 Jun 2012, Dave Jones wrote:
 > 
 > > Still doesn't seem right..
 > > 
 > > eg..
 > > 
 > > [42309.542776] [ pid ]   uid  tgid total_vm      rss cpu oom_adj oom_score_adj name
 > > ..
 > > [42309.553933] [  500]    81   500     5435        1   4     -13          -900 dbus-daemon
 > > ..
 > > [42309.597531] [ 9054]  1000  9054   528677    14540   3       0             0 trinity-child3
 > > ..
 > > 
 > > [42309.643057] Out of memory: Kill process 500 (dbus-daemon) score 511952 or sacrifice child
 > > [42309.643620] Killed process 500 (dbus-daemon) total-vm:21740kB, anon-rss:0kB, file-rss:4kB
 > > 
 > > and a slew of similar 'wrong process' death spiral kills follows..
 > > 
 > 
 > On a system not under oom conditions, i.e. before you start trinity, can 
 > you send the output of
 > 
 > 	cat /proc/$(pidof dbus-daemon)/oom_score{_adj,}
 > 	grep RSS /proc/$(pidof dbus-daemon)/status

# cat /proc/$(pidof dbus-daemon)/oom_score{_adj,}
-900
7441500919753
# grep RSS /proc/$(pidof dbus-daemon)/status
VmRSS:	    1660 kB



	Dave

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: oomkillers gone wild.
  2012-06-08 20:37           ` Thomas Gleixner
@ 2012-06-10  2:15             ` David Rientjes
  -1 siblings, 0 replies; 34+ messages in thread
From: David Rientjes @ 2012-06-10  2:15 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Dave Jones, Linux Kernel, linux-mm

On Fri, 8 Jun 2012, Thomas Gleixner wrote:

> > If we're leaking task_structs, meaning that put_task_struct() isn't
> > actually freeing them when the refcount goes to 0, then it's certainly not
> > because of the oom killer, which only sends a SIGKILL to the selected
> > process.
> 
> I rather suspect that this is an asymmetry between get_ and
> put_task_struct, and the refcount just doesn't go to zero.
> 

We found an actual task_struct leak within put_task_struct() because of
the free task notifiers during the 3.4 -rc cycle, so if kmemleak doesn't
show anything, would it be possible to bisect the problem, Dave?

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: oomkillers gone wild.
  2012-06-08 21:03         ` Dave Jones
@ 2012-06-10  2:21           ` David Rientjes
  -1 siblings, 0 replies; 34+ messages in thread
From: David Rientjes @ 2012-06-10  2:21 UTC (permalink / raw)
  To: Dave Jones, Linux Kernel, linux-mm

On Fri, 8 Jun 2012, Dave Jones wrote:

>  > On a system not under oom conditions, i.e. before you start trinity, can 
>  > you send the output of
>  > 
>  > 	cat /proc/$(pidof dbus-daemon)/oom_score{_adj,}
>  > 	grep RSS /proc/$(pidof dbus-daemon)/status
> 
> # cat /proc/$(pidof dbus-daemon)/oom_score{_adj,}
> -900
> 7441500919753
> # grep RSS /proc/$(pidof dbus-daemon)/status
> VmRSS:	    1660 kB

I'm suspecting you don't have my patch that changes the type of the 
automatic variable in oom_badness() to signed.  Could you retry this with 
that patch or pull 3.5-rc2 which already includes it?
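
The failure mode is the classic signed/unsigned promotion trap; a
self-contained sketch with made-up numbers (the kernel's expression
differs in detail) shows how a negative oom_score_adj turns into a huge
positive score once the arithmetic is done in unsigned long:

#include <stdio.h>

int main(void)
{
	unsigned long totalpages = 2000000;	/* illustrative only */
	int oom_score_adj = -900;
	unsigned long bad_points = 0;
	long good_points = 0;

	/* -900 is promoted to 2^64 - 900 before the multiply */
	bad_points += oom_score_adj * totalpages / 1000;

	/* signed arithmetic gives the intended negative bias */
	good_points += (long)oom_score_adj * (long)totalpages / 1000;

	printf("unsigned: %lu\n", bad_points);	/* ~1.8e16 */
	printf("signed:   %ld\n", good_points);	/* -1800000 */
	return 0;
}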

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: oomkillers gone wild.
  2012-06-10  2:21           ` David Rientjes
@ 2012-06-10  3:21             ` KOSAKI Motohiro
  -1 siblings, 0 replies; 34+ messages in thread
From: KOSAKI Motohiro @ 2012-06-10  3:21 UTC (permalink / raw)
  To: David Rientjes; +Cc: Dave Jones, Linux Kernel, linux-mm, kosaki.motohiro

(6/9/12 10:21 PM), David Rientjes wrote:
> On Fri, 8 Jun 2012, Dave Jones wrote:
>
>>   >  On a system not under oom conditions, i.e. before you start trinity, can
>>   >  you send the output of
>>   >
>>   >  	cat /proc/$(pidof dbus-daemon)/oom_score{_adj,}
>>   >  	grep RSS /proc/$(pidof dbus-daemon)/status
>>
>> # cat /proc/$(pidof dbus-daemon)/oom_score{_adj,}
>> -900
>> 7441500919753
>> # grep RSS /proc/$(pidof dbus-daemon)/status
>> VmRSS:	    1660 kB
>
> I'm suspecting you don't have my patch that changes the type of the
> automatic variable in oom_badness() to signed.  Could you retry this with
> that patch or pull 3.5-rc2 which already includes it?

Yes. Dave (Jones), as far as I can tell from your log, you are using x86_64, right?
In my testing, the current Linus tree works fine, at least in the normal case.
Please respin.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: oomkillers gone wild.
  2012-06-10  3:21             ` KOSAKI Motohiro
@ 2012-06-10 20:10               ` Dave Jones
  -1 siblings, 0 replies; 34+ messages in thread
From: Dave Jones @ 2012-06-10 20:10 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: David Rientjes, Linux Kernel, linux-mm

On Sat, Jun 09, 2012 at 11:21:47PM -0400, KOSAKI Motohiro wrote:
 > (6/9/12 10:21 PM), David Rientjes wrote:
 > > On Fri, 8 Jun 2012, Dave Jones wrote:
 > >
 > >>   >  On a system not under oom conditions, i.e. before you start trinity, can
 > >>   >  you send the output of
 > >>   >
 > >>   >  	cat /proc/$(pidof dbus-daemon)/oom_score{_adj,}
 > >>   >  	grep RSS /proc/$(pidof dbus-daemon)/status
 > >>
 > >> # cat /proc/$(pidof dbus-daemon)/oom_score{_adj,}
 > >> -900
 > >> 7441500919753
 > >> # grep RSS /proc/$(pidof dbus-daemon)/status
 > >> VmRSS:	    1660 kB
 > >
 > > I'm suspecting you don't have my patch that changes the type of the
 > > automatic variable in oom_badness() to signed.  Could you retry this with
 > > that patch or pull 3.5-rc2 which already includes it?

that was with the unsigned long -> long patch.

 > Yes. Dave (Jones), as far as I can tell from your log, you are using x86_64, right?

yes.

 > In my testing, the current Linus tree works fine, at least in the normal case.
 > Please respin.

To double check, here it is in rc2 (which has that patch)..

$ uname -r
3.5.0-rc2+
$ cat /proc/$(pidof dbus-daemon)/oom_score{_adj,}
-900
7441500919753
$ grep RSS /proc/$(pidof dbus-daemon)/status
VmRSS:	    1604 kB


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: oomkillers gone wild.
  2012-06-10 20:10               ` Dave Jones
@ 2012-06-10 23:52                 ` David Rientjes
  -1 siblings, 0 replies; 34+ messages in thread
From: David Rientjes @ 2012-06-10 23:52 UTC (permalink / raw)
  To: Dave Jones, KOSAKI Motohiro, Linux Kernel, linux-mm

On Sun, 10 Jun 2012, Dave Jones wrote:

> To double check, here it is in rc2 (which has that patch)..
> 
> $ uname -r
> 3.5.0-rc2+
> $ cat /proc/$(pidof dbus-daemon)/oom_score{_adj,}
> -900
> 7441500919753
> $ grep RSS /proc/$(pidof dbus-daemon)/status
> VmRSS:	    1604 kB

Eek, yes, that's definitely wrong.  The following should fix it.
---
 mm/oom_kill.c |   15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -184,6 +184,7 @@ unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
 			  const nodemask_t *nodemask, unsigned long totalpages)
 {
 	long points;
+	long adj;
 
 	if (oom_unkillable_task(p, memcg, nodemask))
 		return 0;
@@ -192,7 +193,8 @@ unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
 	if (!p)
 		return 0;
 
-	if (p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN) {
+	adj = p->signal->oom_score_adj;
+	if (adj == OOM_SCORE_ADJ_MIN) {
 		task_unlock(p);
 		return 0;
 	}
@@ -210,14 +212,11 @@ unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
 	 * implementation used by LSMs.
 	 */
 	if (has_capability_noaudit(p, CAP_SYS_ADMIN))
-		points -= 30 * totalpages / 1000;
+		adj -= 30;
 
-	/*
-	 * /proc/pid/oom_score_adj ranges from -1000 to +1000 such that it may
-	 * either completely disable oom killing or always prefer a certain
-	 * task.
-	 */
-	points += p->signal->oom_score_adj * totalpages / 1000;
+	/* Normalize to oom_score_adj units */
+	adj *= totalpages / 1000;
+	points += adj;
 
 	/*
 	 * Never return 0 for an eligible task regardless of the root bonus and
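
With illustrative numbers, a sketch of the new ordering (not the kernel
function itself): the root bonus and oom_score_adj are summed first in
oom_score_adj units, and the single multiply by totalpages / 1000 happens
once, entirely in signed arithmetic:

#include <stdio.h>

int main(void)
{
	long totalpages = 2000000;	/* illustrative only */
	long points = 100000;		/* stand-in for rss + swap + ... */
	long adj = -900;		/* oom_score_adj */

	adj -= 30;			/* root bonus, in adj units */
	adj *= totalpages / 1000;	/* normalize once */
	points += adj;

	printf("points = %ld\n", points);	/* -1760000 */
	return 0;
}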

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: oomkillers gone wild.
  2012-06-10 23:52                 ` David Rientjes
@ 2012-06-11  0:46                   ` Dave Jones
  -1 siblings, 0 replies; 34+ messages in thread
From: Dave Jones @ 2012-06-11  0:46 UTC (permalink / raw)
  To: David Rientjes; +Cc: KOSAKI Motohiro, Linux Kernel, linux-mm

On Sun, Jun 10, 2012 at 04:52:50PM -0700, David Rientjes wrote:
 > On Sun, 10 Jun 2012, Dave Jones wrote:
 > 
 > > To double check, here it is in rc2 (which has that patch)..
 > > 
 > > $ uname -r
 > > 3.5.0-rc2+
 > > $ cat /proc/$(pidof dbus-daemon)/oom_score{_adj,}
 > > -900
 > > 7441500919753
 > > $ grep RSS /proc/$(pidof dbus-daemon)/status
 > > VmRSS:	    1604 kB
 > 
 > Eek, yes, that's definitely wrong.  The following should fix it.

now prints..

-900
0

which I assume is much better ;-)

	Dave

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [patch 3.5-rc2] mm, oom: fix and cleanup oom score calculations
  2012-06-11  0:46                   ` Dave Jones
@ 2012-06-11  9:11                     ` David Rientjes
  -1 siblings, 0 replies; 34+ messages in thread
From: David Rientjes @ 2012-06-11  9:11 UTC (permalink / raw)
  To: Andrew Morton, Linus Torvalds
  Cc: Dave Jones, KOSAKI Motohiro, linux-kernel, linux-mm

The divide in p->signal->oom_score_adj * totalpages / 1000 within 
oom_badness() was causing an overflow of the signed long data type.

This adds both the root bias and p->signal->oom_score_adj before doing the 
normalization which fixes the issue and also cleans up the calculation.

Tested-by: Dave Jones <davej@redhat.com>
Signed-off-by: David Rientjes <rientjes@google.com>
---
 mm/oom_kill.c |   15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -184,6 +184,7 @@ unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
 			  const nodemask_t *nodemask, unsigned long totalpages)
 {
 	long points;
+	long adj;
 
 	if (oom_unkillable_task(p, memcg, nodemask))
 		return 0;
@@ -192,7 +193,8 @@ unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
 	if (!p)
 		return 0;
 
-	if (p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN) {
+	adj = p->signal->oom_score_adj;
+	if (adj == OOM_SCORE_ADJ_MIN) {
 		task_unlock(p);
 		return 0;
 	}
@@ -210,14 +212,11 @@ unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
 	 * implementation used by LSMs.
 	 */
 	if (has_capability_noaudit(p, CAP_SYS_ADMIN))
-		points -= 30 * totalpages / 1000;
+		adj -= 30;
 
-	/*
-	 * /proc/pid/oom_score_adj ranges from -1000 to +1000 such that it may
-	 * either completely disable oom killing or always prefer a certain
-	 * task.
-	 */
-	points += p->signal->oom_score_adj * totalpages / 1000;
+	/* Normalize to oom_score_adj units */
+	adj *= totalpages / 1000;
+	points += adj;
 
 	/*
 	 * Never return 0 for an eligible task regardless of the root bonus and

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [patch 3.5-rc2] mm, oom: fix and cleanup oom score calculations
  2012-06-11  9:11                     ` David Rientjes
@ 2012-06-11 19:13                       ` Dave Jones
  -1 siblings, 0 replies; 34+ messages in thread
From: Dave Jones @ 2012-06-11 19:13 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, Linus Torvalds, KOSAKI Motohiro, linux-kernel, linux-mm

On Mon, Jun 11, 2012 at 02:11:50AM -0700, David Rientjes wrote:
 > The divide in p->signal->oom_score_adj * totalpages / 1000 within 
 > oom_badness() was causing an overflow of the signed long data type.
 > 
 > This adds both the root bias and p->signal->oom_score_adj before doing the 
 > normalization which fixes the issue and also cleans up the calculation.
 > 
 > Tested-by: Dave Jones <davej@redhat.com>
 > Signed-off-by: David Rientjes <rientjes@google.com>

Even with this, I'm still seeing the wrong processes getting killed..


[17384.172181] trinity-child4 invoked oom-killer: gfp_mask=0x200da, order=0, oom_adj=0, oom_score_adj=0
[17384.172435] trinity-child4 cpuset=/ mems_allowed=0
[17384.172463] Pid: 32149, comm: trinity-child4 Not tainted 3.5.0-rc2+ #86
[17384.172495] Call Trace:
[17384.172515]  [<ffffffff8164e2f5>] ? _raw_spin_unlock+0x55/0x60
[17384.172547]  [<ffffffff81144014>] dump_header+0x84/0x350
[17384.172576]  [<ffffffff810b7f35>] ? trace_hardirqs_on_caller+0x115/0x1a0
[17384.172610]  [<ffffffff8164e262>] ? _raw_spin_unlock_irqrestore+0x42/0x80
[17384.172646]  [<ffffffff8131face>] ? ___ratelimit+0x9e/0x130
[17384.172676]  [<ffffffff8114486e>] oom_kill_process+0x1ee/0x340
[17384.172707]  [<ffffffff81144ea9>] out_of_memory+0x249/0x410
[17384.172737]  [<ffffffff8114aded>] __alloc_pages_nodemask+0xa6d/0xab0
[17384.172770]  [<ffffffff8118a9e6>] alloc_pages_vma+0xb6/0x190
[17384.172802]  [<ffffffff81169c0a>] do_wp_page+0xca/0x790
[17384.172830]  [<ffffffff8164e2d5>] ? _raw_spin_unlock+0x35/0x60
[17384.172861]  [<ffffffff8116becd>] handle_pte_fault+0x73d/0x9d0
[17384.172891]  [<ffffffff810b26b0>] ? lock_release_holdtime.part.24+0x90/0x140
[17384.172928]  [<ffffffff8116c96f>] handle_mm_fault+0x21f/0x2e0
[17384.172960]  [<ffffffff8165158b>] do_page_fault+0x13b/0x4b0
[17384.172990]  [<ffffffff81086f91>] ? get_parent_ip+0x11/0x50
[17384.173019]  [<ffffffff81086f91>] ? get_parent_ip+0x11/0x50
[17384.173049]  [<ffffffff81651c59>] ? sub_preempt_count+0x79/0xd0
[17384.173080]  [<ffffffff8164e903>] ? error_sti+0x5/0x6
[17384.173991]  [<ffffffff813271bd>] ? trace_hardirqs_off_thunk+0x3a/0x3c
[17384.174894]  [<ffffffff8164e70f>] page_fault+0x1f/0x30
[17384.175830]  [<ffffffff810b26b0>] ? lock_release_holdtime.part.24+0x90/0x140
[17384.176751]  [<ffffffff810b1fde>] ? put_lock_stats.isra.23+0xe/0x40
[17384.177643]  [<ffffffff8132705d>] ? __put_user_4+0x1d/0x30
[17384.178517]  [<ffffffff8106611c>] ? sys_getresuid+0x7c/0xd0
[17384.179383]  [<ffffffff81655c52>] system_call_fastpath+0x16/0x1b
[17384.180238] Mem-Info:
[17384.180996] Node 0 DMA per-cpu:
[17384.181652] CPU    0: hi:    0, btch:   1 usd:   0
[17384.182306] CPU    1: hi:    0, btch:   1 usd:   0
[17384.182944] CPU    2: hi:    0, btch:   1 usd:   0
[17384.183567] CPU    3: hi:    0, btch:   1 usd:   0
[17384.184177] CPU    4: hi:    0, btch:   1 usd:   0
[17384.184772] CPU    5: hi:    0, btch:   1 usd:   0
[17384.185442] CPU    6: hi:    0, btch:   1 usd:   0
[17384.186020] CPU    7: hi:    0, btch:   1 usd:   0
[17384.186591] Node 0 DMA32 per-cpu:
[17384.187158] CPU    0: hi:  186, btch:  31 usd:   1
[17384.187723] CPU    1: hi:  186, btch:  31 usd:   4
[17384.188271] CPU    2: hi:  186, btch:  31 usd:  35
[17384.188809] CPU    3: hi:  186, btch:  31 usd:  47
[17384.189338] CPU    4: hi:  186, btch:  31 usd:  21
[17384.189857] CPU    5: hi:  186, btch:  31 usd:  85
[17384.190367] CPU    6: hi:  186, btch:  31 usd: 134
[17384.190854] CPU    7: hi:  186, btch:  31 usd:  82
[17384.191328] Node 0 Normal per-cpu:
[17384.191795] CPU    0: hi:  186, btch:  31 usd:   5
[17384.192260] CPU    1: hi:  186, btch:  31 usd:   0
[17384.192717] CPU    2: hi:  186, btch:  31 usd:  35
[17384.193169] CPU    3: hi:  186, btch:  31 usd:  43
[17384.193611] CPU    4: hi:  186, btch:  31 usd:  46
[17384.194045] CPU    5: hi:  186, btch:  31 usd:  82
[17384.194472] CPU    6: hi:  186, btch:  31 usd:  31
[17384.194892] CPU    7: hi:  186, btch:  31 usd:  36
[17384.195345] active_anon:20877 inactive_anon:20816 isolated_anon:542
[17384.195345]  active_file:142 inactive_file:490 isolated_file:0
[17384.195345]  unevictable:97 dirty:0 writeback:20142 unstable:0
[17384.195345]  free:22184 slab_reclaimable:29060 slab_unreclaimable:581537
[17384.195345]  mapped:164 shmem:888 pagetables:3103 bounce:0
[17384.197465] Node 0 DMA free:15776kB min:264kB low:328kB high:396kB active_anon:0kB inactive_anon:44kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15632kB mlocked:0kB dirty:0kB writeback:60kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:48kB kernel_stack:8kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:2264 all_unreclaimable? yes
[17384.199388] lowmem_reserve[]: 0 2643 3878 3878
[17384.199895] Node 0 DMA32 free:51032kB min:45888kB low:57360kB high:68832kB active_anon:49052kB inactive_anon:49232kB active_file:8kB inactive_file:0kB unevictable:208kB isolated(anon):1504kB isolated(file):208kB present:2707204kB mlocked:208kB dirty:0kB writeback:49584kB mapped:404kB shmem:24kB slab_reclaimable:4304kB slab_unreclaimable:1914140kB kernel_stack:602920kB pagetables:3132kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:234420 all_unreclaimable? yes
[17384.202166] lowmem_reserve[]: 0 0 1234 1234
[17384.202751] Node 0 Normal free:21928kB min:21424kB low:26780kB high:32136kB active_anon:34460kB inactive_anon:33988kB active_file:560kB inactive_file:2140kB unevictable:180kB isolated(anon):664kB isolated(file):0kB present:1264032kB mlocked:180kB dirty:0kB writeback:30924kB mapped:252kB shmem:3528kB slab_reclaimable:111936kB slab_unreclaimable:411960kB kernel_stack:492392kB pagetables:9280kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:142173 all_unreclaimable? yes
[17384.205361] lowmem_reserve[]: 0 0 0 0
[17384.206038] Node 0 DMA: 1*4kB 3*8kB 4*16kB 0*32kB 1*64kB 2*128kB 2*256kB 1*512kB 2*1024kB 2*2048kB 2*4096kB = 15772kB
[17384.206747] Node 0 DMA32: 785*4kB 310*8kB 495*16kB 273*32kB 340*64kB 47*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 52100kB
[17384.208180] Node 0 Normal: 240*4kB 2342*8kB 6*16kB 0*32kB 0*64kB 1*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 21968kB
[17384.208952] 25433 total pagecache pages
[17384.209725] 23956 pages in swap cache
[17384.210503] Swap cache stats: add 1989506, delete 1965550, find 7882958/7981228
[17384.211307] Free swap  = 5833752kB
[17384.212115] Total swap = 6029308kB
[17384.220271] 1041904 pages RAM
[17384.221089] 70330 pages reserved
[17384.221894] 17795 pages shared
[17384.222698] 806729 pages non-shared
[17384.223507] [ pid ]   uid  tgid total_vm      rss cpu oom_adj oom_score_adj name
[17384.224410] [  327]     0   327     7866        1   7       0             0 systemd-journal
[17384.225345] [  334]     0   334     6566        1   2     -17         -1000 udevd
[17384.226282] [  482]     0   482    86847        1   0       0             0 NetworkManager
[17384.227205] [  485]     0   485     4866        1   0       0             0 smartd
[17384.228082] [  487]     0   487    29598        1   0       0             0 crond
[17384.228953] [  489]     0   489     7068        1   6       0             0 systemd-logind
[17384.229823] [  490]    38   490     8301        1   5       0             0 ntpd
[17384.230746] [  503]    81   503     5435        1   4     -13          -900 dbus-daemon
[17384.231678] [  506]     0   506    47106        1   7     -13          -900 polkitd
[17384.232614] [  510]     0   510    17260        1   0     -13          -900 modem-manager
[17384.233560] [  511]     0   511    27239        1   7       0             0 agetty
[17384.234564] [  522]     0   522     4790       16   3       0             0 rpcbind
[17384.235628] [  530]     0   530    22206        1   0       0             0 dhclient
[17384.236634] [  536]     0   536    19403        1   6     -17         -1000 sshd
[17384.237599] [  578]    29   578     5883        1   2       0             0 rpc.statd
[17384.238561] [  585]     0   585     5993        1   1       0             0 rpc.mountd
[17384.239522] [  588]     0   588     5845        1   3       0             0 rpc.idmapd
[17384.240479] [  598]     0   598    60782        1   5       0             0 rsyslogd
[17384.241427] [  616]     0   616    21273       38   2       0             0 sendmail
[17384.242358] [  636]    51   636    20167        1   4       0             0 sendmail
[17384.243267] [  786]  1000   786     6100       78   7       0             0 tmux
[17384.244158] [  787]  1000   787    28989        1   4       0             0 bash
[17384.244996] [  861]  1000   861    28022        2   2       0             0 test-random.sh
[17384.245892] [  868]  1000   868     4433       19   1       0             0 trinity
[17384.246744] [  869]  1000   869     4417        5   4       0             0 trinity
[17384.247574] [  870]  1000   870     4365       18   4       0             0 trinity
[17384.248332] [  871]  1000   871     4347        1   7       0             0 trinity
[17384.249117] [  872]  1000   872     4387       19   2       0             0 trinity
[17384.249885] [  873]  1000   873     4285       19   7       0             0 trinity
[17384.250582] [  874]  1000   874     4293       14   0       0             0 trinity
[17384.251258] [  875]  1000   875     4423       20   4       0             0 trinity
[17384.251910] [  876]  1000   876     4365       16   4       0             0 trinity
[17384.252537] [  877]  1000   877     4391       19   2       0             0 trinity
[17384.253149] [  878]  1000   878     4377       10   1       0             0 trinity
[17384.253741] [  879]  1000   879     4347        9   7       0             0 trinity
[17384.254313] [  880]  1000   880     4401       10   4       0             0 trinity
[17384.254877] [  881]  1000   881     4429       18   1       0             0 trinity
[17384.255406] [  882]  1000   882     4389        7   4       0             0 trinity
[17384.255913] [  883]  1000   883     4389       17   2       0             0 trinity
[17384.256414] [  973]     0   973     6566        1   6     -17         -1000 udevd
[17384.256912] [  974]     0   974     6549        1   3     -17         -1000 udevd
[17384.257399] [ 4895]  1000  4895     4387       30   6       0             0 trinity-watchdo
[17384.257897] [ 4952]  1000  4952     4365       28   4       0             0 trinity-watchdo
[17384.258394] [ 4953]  1000  4953     4433       31   1       0             0 trinity-watchdo
[17384.258890] [ 4954]  1000  4954     4347       23   5       0             0 trinity-watchdo
[17384.259389] [ 5300]  1000  5300     4417       16   1       0             0 trinity-watchdo
[17384.259886] [ 5688]  1000  5688     4389       15   6       0             0 trinity-watchdo
[17384.260373] [ 5868]  1000  5868     4429       30   3       0             0 trinity-watchdo
[17384.260847] [ 5972]  1000  5972     4377       20   2       0             0 trinity-watchdo
[17384.261314] [ 5973]  1000  5973     4389       29   4       0             0 trinity-watchdo
[17384.261775] [ 6125]  1000  6125     4347       19   4       0             0 trinity-watchdo
[17384.262232] [ 6136]  1000  6136     4293       18   2       0             0 trinity-watchdo
[17384.262688] [ 6137]  1000  6137     4401       17   6       0             0 trinity-watchdo
[17384.263143] [ 6138]  1000  6138     4423       29   6       0             0 trinity-watchdo
[17384.263593] [ 6139]  1000  6139     4285       30   4       0             0 trinity-watchdo
[17384.264036] [ 6227]  1000  6227     4365       31   6       0             0 trinity-watchdo
[17384.264476] [ 6288]  1000  6288     4391       30   3       0             0 trinity-watchdo
[17384.264930] [30721]  1000 30721     5239       24   7       0             0 trinity-child7
[17384.265361] [30881]  1000 30881     4897       27   2       0             0 trinity-child2
[17384.265791] [30921]  1000 30921     4389      242   0       0             0 trinity-child0
[17384.266216] [30928]  1000 30928     5479       36   0       0             0 trinity-child0
[17384.266637] [30939]  1000 30939     4864       33   1       0             0 trinity-child1
[17384.267052] [31051]  1000 31051     4365      171   2       0             0 trinity-child2
[17384.267467] [31052]  1000 31052     5849     1345   1       0             0 trinity-child1
[17384.267882] [31103]  1000 31103     6277     1764   6       0             0 trinity-child6
[17384.268297] [31109]  1000 31109     4429      109   7       0             0 trinity-child7
[17384.268709] [31137]  1000 31137     4465      303   0       0             0 trinity-child0
[17384.269124] [31141]  1000 31141     5220       65   3       0             0 trinity-child3
[17384.269535] [31158]  1000 31158     4425       70   3       0             0 trinity-child3
[17384.269948] [31160]  1000 31160     5978     1899   7       0             0 trinity-child7
[17384.270355] [31178]  1000 31178     4429      228   0       0             0 trinity-child0
[17384.270761] [31245]  1000 31245     4391      208   0       0             0 trinity-child0
[17384.271162] [31284]  1000 31284     5089      109   1       0             0 trinity-child1
[17384.271575] [31333]  1000 31333     5201      309   2       0             0 trinity-child2
[17384.271984] [31349]  1000 31349     4781      189   0       0             0 trinity-child0
[17384.272393] [31358]  1000 31358     4743       24   5       0             0 trinity-child5
[17384.272806] [31359]  1000 31359     4347       34   3       0             0 trinity-child3
[17384.273219] [31360]  1000 31360     4644       34   6       0             0 trinity-child6
[17384.273634] [31362]  1000 31362     4347       34   1       0             0 trinity-child1
[17384.274049] [31363]  1000 31363     4347       29   2       0             0 trinity-child2
[17384.274466] [31364]  1000 31364     4347       24   4       0             0 trinity-child4
[17384.274892] [31365]  1000 31365     4347       17   0       0             0 trinity-child0
[17384.275305] [31409]  1000 31409     4347        1   3       0             0 trinity-child3
[17384.275719] [31620]  1000 31620     6661       31   7       0             0 trinity-child7
[17384.276134] [31621]  1000 31621     7161       43   5       0             0 trinity-child5
[17384.276550] [31654]  1000 31654     5576       35   4       0             0 trinity-child4
[17384.276972] [31666]  1000 31666     4377       31   0       0             0 trinity-child0
[17384.277385] [31739]  1000 31739     4347       30   6       0             0 trinity-child6
[17384.277797] [31742]  1000 31742     4816      312   7       0             0 trinity-child7
[17384.278203] [31743]  1000 31743     4865      505   0       0             0 trinity-child0
[17384.278609] [31744]  1000 31744     4864       26   3       0             0 trinity-child3
[17384.279016] [31745]  1000 31745     4863       47   5       0             0 trinity-child5
[17384.279422] [31746]  1000 31746     4863       28   4       0             0 trinity-child4
[17384.279829] [31747]  1000 31747     4898       25   7       0             0 trinity-child7
[17384.280235] [31748]  1000 31748     4901       27   5       0             0 trinity-child5
[17384.280642] [31749]  1000 31749     4293       27   4       0             0 trinity-child4
[17384.281049] [31750]  1000 31750     4524      311   0       0             0 trinity-child0
[17384.281460] [31751]  1000 31751     4389       23   3       0             0 trinity-child3
[17384.281870] [31752]  1000 31752     5479       83   2       0             0 trinity-child2
[17384.282279] [31753]  1000 31753     5913      506   6       0             0 trinity-child6
[17384.282689] [31754]  1000 31754     5908      479   4       0             0 trinity-child4
[17384.283103] [31755]  1000 31755     4951       61   7       0             0 trinity-child7
[17384.283518] [31756]  1000 31756     4389       23   5       0             0 trinity-child5
[17384.283936] [31757]  1000 31757     4389       23   1       0             0 trinity-child1
[17384.284350] [31758]  1000 31758     5128      196   6       0             0 trinity-child6
[17384.284786] [31779]  1000 31779     4326       70   7       0             0 trinity-child7
[17384.285206] [31838]  1000 31838     4464      371   5       0             0 trinity-child5
[17384.285620] [31839]  1000 31839     4543      427   1       0             0 trinity-child1
[17384.286037] [31840]  1000 31840     4431      313   6       0             0 trinity-child6
[17384.286454] [31859]  1000 31859     4541      427   2       0             0 trinity-child2
[17384.286867] [31865]  1000 31865     4327       94   6       0             0 trinity-child6
[17384.287281] [31876]  1000 31876     4374      382   6       0             0 trinity-child6
[17384.287693] [31877]  1000 31877     4365      348   0       0             0 trinity-child0
[17384.288105] [31878]  1000 31878     4365      344   1       0             0 trinity-child1
[17384.288516] [31879]  1000 31879     4365      333   5       0             0 trinity-child5
[17384.288922] [31880]  1000 31880     4374      381   7       0             0 trinity-child7
[17384.289328] [31882]  1000 31882     4364      298   4       0             0 trinity-child4
[17384.289735] [31884]  1000 31884     4364      344   3       0             0 trinity-child3
[17384.290139] [31895]  1000 31895     4431      352   3       0             0 trinity-child3
[17384.290543] [31979]  1000 31979     4389      369   4       0             0 trinity-child4
[17384.290944] [31980]  1000 31980     4425      280   5       0             0 trinity-child5
[17384.291348] [31981]  1000 31981     5216     1164   2       0             0 trinity-child2
[17384.291750] [31983]  1000 31983     5546     1489   1       0             0 trinity-child1
[17384.292154] [31984]  1000 31984     4753      737   5       0             0 trinity-child5
[17384.292557] [31985]  1000 31985     5021      989   1       0             0 trinity-child1
[17384.292962] [31986]  1000 31986     4389      419   7       0             0 trinity-child7
[17384.293369] [31987]  1000 31987     4984      952   3       0             0 trinity-child3
[17384.293778] [31988]  1000 31988     4389      445   6       0             0 trinity-child6
[17384.294190] [31989]  1000 31989     6310     2260   6       0             0 trinity-child6
[17384.294603] [32072]  1000 32072     4648      568   2       0             0 trinity-child2
[17384.295026] [32073]  1000 32073     4285      227   1       0             0 trinity-child1
[17384.295441] [32074]  1000 32074     4648      609   3       0             0 trinity-child3
[17384.295856] [32079]  1000 32079     4429      292   3       0             0 trinity-child3
[17384.296267] [32083]  1000 32083     4429      292   4       0             0 trinity-child4
[17384.296677] [32084]  1000 32084     4549      538   5       0             0 trinity-child5
[17384.297083] [32085]  1000 32085     4648      567   6       0             0 trinity-child6
[17384.297490] [32094]  1000 32094     4429      128   5       0             0 trinity-child5
[17384.297899] [32095]  1000 32095     4429       34   3       0             0 trinity-child2
[17384.298306] [32104]  1000 32104     4465      330   7       0             0 trinity-child7
[17384.298715] [32117]  1000 32117     4648      577   4       0             0 trinity-child4
[17384.299128] [32128]  1000 32128     4458      382   4       0             0 trinity-child4
[17384.299537] [32131]  1000 32131     4433      126   6       0             0 trinity-child6
[17384.299948] [32132]  1000 32132     4433      120   5       0             0 trinity-child5
[17384.300356] [32133]  1000 32133     4432      373   7       0             0 trinity-child7
[17384.300764] [32135]  1000 32135     4433      121   1       0             0 trinity-child1
[17384.301172] [32136]  1000 32136     4432      344   2       0             0 trinity-child2
[17384.301581] [32138]  1000 32138     4387      152   2       0             0 trinity-child2
[17384.301987] [32146]  1000 32146     4387      148   3       0             0 trinity-child3
[17384.302391] [32148]  1000 32148     4433       74   0       0             0 trinity-child0
[17384.302795] [32149]  1000 32149     4433      103   4       0             0 trinity-child4
[17384.303200] [32150]  1000 32150     4433      115   3       0             0 trinity-child3
[17384.303606] [32151]  1000 32151     4285      200   7       0             0 trinity-child7
[17384.304011] [32162]  1000 32162     4365      160   4       0             0 trinity-child4
[17384.304417] [32163]  1000 32163     4387      156   1       0             0 trinity-child1
[17384.304832] [32178]  1000 32178     4387       35   1       0             0 trinity-child6
[17384.305238] [32179]  1000 32179     4391      214   2       0             0 trinity-child2
[17384.305646] [32180]  1000 32180     4387      147   0       0             0 trinity-child0
[17384.306053] [32181]  1000 32181     4387      153   4       0             0 trinity-child4
[17384.306459] [32182]  1000 32182     4423      225   4       0             0 trinity-child4
[17384.306865] [32183]  1000 32183     4885      621   5       0             0 trinity-child5
[17384.307270] [32184]  1000 32184     4885      632   1       0             0 trinity-child1
[17384.307676] [32185]  1000 32185     4423      233   6       0             0 trinity-child6
[17384.308086] [32186]  1000 32186     4423      142   2       0             0 trinity-child2
[17384.308493] [32187]  1000 32187     4423      118   0       0             0 trinity-child0
[17384.308897] [32188]  1000 32188     4423      185   7       0             0 trinity-child7
[17384.309302] [32189]  1000 32189     4423      232   3       0             0 trinity-child3
[17384.309707] [32190]  1000 32190     4387       35   4       0             0 trinity-child5
[17384.310116] Out of memory: Kill process 503 (dbus-daemon) score 511952 or sacrifice child
[17384.310534] Killed process 503 (dbus-daemon) total-vm:21740kB, anon-rss:0kB, file-rss:4kB


^ permalink raw reply	[flat|nested] 34+ messages in thread

[17384.199895] Node 0 DMA32 free:51032kB min:45888kB low:57360kB high:68832kB active_anon:49052kB inactive_anon:49232kB active_file:8kB inactive_file:0kB unevictable:208kB isolated(anon):1504kB isolated(file):208kB present:2707204kB mlocked:208kB dirty:0kB writeback:49584kB mapped:404kB shmem:24kB slab_reclaimable:4304kB slab_unreclaimable:1914140kB kernel_stack:602920kB pagetables:3132kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:234420 all_unreclaimable? yes
[17384.202166] lowmem_reserve[]: 0 0 1234 1234
[17384.202751] Node 0 Normal free:21928kB min:21424kB low:26780kB high:32136kB active_anon:34460kB inactive_anon:33988kB active_file:560kB inactive_file:2140kB unevictable:180kB isolated(anon):664kB isolated(file):0kB present:1264032kB mlocked:180kB dirty:0kB writeback:30924kB mapped:252kB shmem:3528kB slab_reclaimable:111936kB slab_unreclaimable:411960kB kernel_stack:492392kB pagetables:9280kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:142173 all_unreclaimable? yes
[17384.205361] lowmem_reserve[]: 0 0 0 0
[17384.206038] Node 0 DMA: 1*4kB 3*8kB 4*16kB 0*32kB 1*64kB 2*128kB 2*256kB 1*512kB 2*1024kB 2*2048kB 2*4096kB = 15772kB
[17384.206747] Node 0 DMA32: 785*4kB 310*8kB 495*16kB 273*32kB 340*64kB 47*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 52100kB
[17384.208180] Node 0 Normal: 240*4kB 2342*8kB 6*16kB 0*32kB 0*64kB 1*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 21968kB
[17384.208952] 25433 total pagecache pages
[17384.209725] 23956 pages in swap cache
[17384.210503] Swap cache stats: add 1989506, delete 1965550, find 7882958/7981228
[17384.211307] Free swap  = 5833752kB
[17384.212115] Total swap = 6029308kB
[17384.220271] 1041904 pages RAM
[17384.221089] 70330 pages reserved
[17384.221894] 17795 pages shared
[17384.222698] 806729 pages non-shared
[17384.223507] [ pid ]   uid  tgid total_vm      rss cpu oom_adj oom_score_adj name
[17384.224410] [  327]     0   327     7866        1   7       0             0 systemd-journal
[17384.225345] [  334]     0   334     6566        1   2     -17         -1000 udevd
[17384.226282] [  482]     0   482    86847        1   0       0             0 NetworkManager
[17384.227205] [  485]     0   485     4866        1   0       0             0 smartd
[17384.228082] [  487]     0   487    29598        1   0       0             0 crond
[17384.228953] [  489]     0   489     7068        1   6       0             0 systemd-logind
[17384.229823] [  490]    38   490     8301        1   5       0             0 ntpd
[17384.230746] [  503]    81   503     5435        1   4     -13          -900 dbus-daemon
[17384.231678] [  506]     0   506    47106        1   7     -13          -900 polkitd
[17384.232614] [  510]     0   510    17260        1   0     -13          -900 modem-manager
[17384.233560] [  511]     0   511    27239        1   7       0             0 agetty
[17384.234564] [  522]     0   522     4790       16   3       0             0 rpcbind
[17384.235628] [  530]     0   530    22206        1   0       0             0 dhclient
[17384.236634] [  536]     0   536    19403        1   6     -17         -1000 sshd
[17384.237599] [  578]    29   578     5883        1   2       0             0 rpc.statd
[17384.238561] [  585]     0   585     5993        1   1       0             0 rpc.mountd
[17384.239522] [  588]     0   588     5845        1   3       0             0 rpc.idmapd
[17384.240479] [  598]     0   598    60782        1   5       0             0 rsyslogd
[17384.241427] [  616]     0   616    21273       38   2       0             0 sendmail
[17384.242358] [  636]    51   636    20167        1   4       0             0 sendmail
[17384.243267] [  786]  1000   786     6100       78   7       0             0 tmux
[17384.244158] [  787]  1000   787    28989        1   4       0             0 bash
[17384.244996] [  861]  1000   861    28022        2   2       0             0 test-random.sh
[17384.245892] [  868]  1000   868     4433       19   1       0             0 trinity
[17384.246744] [  869]  1000   869     4417        5   4       0             0 trinity
[17384.247574] [  870]  1000   870     4365       18   4       0             0 trinity
[17384.248332] [  871]  1000   871     4347        1   7       0             0 trinity
[17384.249117] [  872]  1000   872     4387       19   2       0             0 trinity
[17384.249885] [  873]  1000   873     4285       19   7       0             0 trinity
[17384.250582] [  874]  1000   874     4293       14   0       0             0 trinity
[17384.251258] [  875]  1000   875     4423       20   4       0             0 trinity
[17384.251910] [  876]  1000   876     4365       16   4       0             0 trinity
[17384.252537] [  877]  1000   877     4391       19   2       0             0 trinity
[17384.253149] [  878]  1000   878     4377       10   1       0             0 trinity
[17384.253741] [  879]  1000   879     4347        9   7       0             0 trinity
[17384.254313] [  880]  1000   880     4401       10   4       0             0 trinity
[17384.254877] [  881]  1000   881     4429       18   1       0             0 trinity
[17384.255406] [  882]  1000   882     4389        7   4       0             0 trinity
[17384.255913] [  883]  1000   883     4389       17   2       0             0 trinity
[17384.256414] [  973]     0   973     6566        1   6     -17         -1000 udevd
[17384.256912] [  974]     0   974     6549        1   3     -17         -1000 udevd
[17384.257399] [ 4895]  1000  4895     4387       30   6       0             0 trinity-watchdo
[17384.257897] [ 4952]  1000  4952     4365       28   4       0             0 trinity-watchdo
[17384.258394] [ 4953]  1000  4953     4433       31   1       0             0 trinity-watchdo
[17384.258890] [ 4954]  1000  4954     4347       23   5       0             0 trinity-watchdo
[17384.259389] [ 5300]  1000  5300     4417       16   1       0             0 trinity-watchdo
[17384.259886] [ 5688]  1000  5688     4389       15   6       0             0 trinity-watchdo
[17384.260373] [ 5868]  1000  5868     4429       30   3       0             0 trinity-watchdo
[17384.260847] [ 5972]  1000  5972     4377       20   2       0             0 trinity-watchdo
[17384.261314] [ 5973]  1000  5973     4389       29   4       0             0 trinity-watchdo
[17384.261775] [ 6125]  1000  6125     4347       19   4       0             0 trinity-watchdo
[17384.262232] [ 6136]  1000  6136     4293       18   2       0             0 trinity-watchdo
[17384.262688] [ 6137]  1000  6137     4401       17   6       0             0 trinity-watchdo
[17384.263143] [ 6138]  1000  6138     4423       29   6       0             0 trinity-watchdo
[17384.263593] [ 6139]  1000  6139     4285       30   4       0             0 trinity-watchdo
[17384.264036] [ 6227]  1000  6227     4365       31   6       0             0 trinity-watchdo
[17384.264476] [ 6288]  1000  6288     4391       30   3       0             0 trinity-watchdo
[17384.264930] [30721]  1000 30721     5239       24   7       0             0 trinity-child7
[17384.265361] [30881]  1000 30881     4897       27   2       0             0 trinity-child2
[17384.265791] [30921]  1000 30921     4389      242   0       0             0 trinity-child0
[17384.266216] [30928]  1000 30928     5479       36   0       0             0 trinity-child0
[17384.266637] [30939]  1000 30939     4864       33   1       0             0 trinity-child1
[17384.267052] [31051]  1000 31051     4365      171   2       0             0 trinity-child2
[17384.267467] [31052]  1000 31052     5849     1345   1       0             0 trinity-child1
[17384.267882] [31103]  1000 31103     6277     1764   6       0             0 trinity-child6
[17384.268297] [31109]  1000 31109     4429      109   7       0             0 trinity-child7
[17384.268709] [31137]  1000 31137     4465      303   0       0             0 trinity-child0
[17384.269124] [31141]  1000 31141     5220       65   3       0             0 trinity-child3
[17384.269535] [31158]  1000 31158     4425       70   3       0             0 trinity-child3
[17384.269948] [31160]  1000 31160     5978     1899   7       0             0 trinity-child7
[17384.270355] [31178]  1000 31178     4429      228   0       0             0 trinity-child0
[17384.270761] [31245]  1000 31245     4391      208   0       0             0 trinity-child0
[17384.271162] [31284]  1000 31284     5089      109   1       0             0 trinity-child1
[17384.271575] [31333]  1000 31333     5201      309   2       0             0 trinity-child2
[17384.271984] [31349]  1000 31349     4781      189   0       0             0 trinity-child0
[17384.272393] [31358]  1000 31358     4743       24   5       0             0 trinity-child5
[17384.272806] [31359]  1000 31359     4347       34   3       0             0 trinity-child3
[17384.273219] [31360]  1000 31360     4644       34   6       0             0 trinity-child6
[17384.273634] [31362]  1000 31362     4347       34   1       0             0 trinity-child1
[17384.274049] [31363]  1000 31363     4347       29   2       0             0 trinity-child2
[17384.274466] [31364]  1000 31364     4347       24   4       0             0 trinity-child4
[17384.274892] [31365]  1000 31365     4347       17   0       0             0 trinity-child0
[17384.275305] [31409]  1000 31409     4347        1   3       0             0 trinity-child3
[17384.275719] [31620]  1000 31620     6661       31   7       0             0 trinity-child7
[17384.276134] [31621]  1000 31621     7161       43   5       0             0 trinity-child5
[17384.276550] [31654]  1000 31654     5576       35   4       0             0 trinity-child4
[17384.276972] [31666]  1000 31666     4377       31   0       0             0 trinity-child0
[17384.277385] [31739]  1000 31739     4347       30   6       0             0 trinity-child6
[17384.277797] [31742]  1000 31742     4816      312   7       0             0 trinity-child7
[17384.278203] [31743]  1000 31743     4865      505   0       0             0 trinity-child0
[17384.278609] [31744]  1000 31744     4864       26   3       0             0 trinity-child3
[17384.279016] [31745]  1000 31745     4863       47   5       0             0 trinity-child5
[17384.279422] [31746]  1000 31746     4863       28   4       0             0 trinity-child4
[17384.279829] [31747]  1000 31747     4898       25   7       0             0 trinity-child7
[17384.280235] [31748]  1000 31748     4901       27   5       0             0 trinity-child5
[17384.280642] [31749]  1000 31749     4293       27   4       0             0 trinity-child4
[17384.281049] [31750]  1000 31750     4524      311   0       0             0 trinity-child0
[17384.281460] [31751]  1000 31751     4389       23   3       0             0 trinity-child3
[17384.281870] [31752]  1000 31752     5479       83   2       0             0 trinity-child2
[17384.282279] [31753]  1000 31753     5913      506   6       0             0 trinity-child6
[17384.282689] [31754]  1000 31754     5908      479   4       0             0 trinity-child4
[17384.283103] [31755]  1000 31755     4951       61   7       0             0 trinity-child7
[17384.283518] [31756]  1000 31756     4389       23   5       0             0 trinity-child5
[17384.283936] [31757]  1000 31757     4389       23   1       0             0 trinity-child1
[17384.284350] [31758]  1000 31758     5128      196   6       0             0 trinity-child6
[17384.284786] [31779]  1000 31779     4326       70   7       0             0 trinity-child7
[17384.285206] [31838]  1000 31838     4464      371   5       0             0 trinity-child5
[17384.285620] [31839]  1000 31839     4543      427   1       0             0 trinity-child1
[17384.286037] [31840]  1000 31840     4431      313   6       0             0 trinity-child6
[17384.286454] [31859]  1000 31859     4541      427   2       0             0 trinity-child2
[17384.286867] [31865]  1000 31865     4327       94   6       0             0 trinity-child6
[17384.287281] [31876]  1000 31876     4374      382   6       0             0 trinity-child6
[17384.287693] [31877]  1000 31877     4365      348   0       0             0 trinity-child0
[17384.288105] [31878]  1000 31878     4365      344   1       0             0 trinity-child1
[17384.288516] [31879]  1000 31879     4365      333   5       0             0 trinity-child5
[17384.288922] [31880]  1000 31880     4374      381   7       0             0 trinity-child7
[17384.289328] [31882]  1000 31882     4364      298   4       0             0 trinity-child4
[17384.289735] [31884]  1000 31884     4364      344   3       0             0 trinity-child3
[17384.290139] [31895]  1000 31895     4431      352   3       0             0 trinity-child3
[17384.290543] [31979]  1000 31979     4389      369   4       0             0 trinity-child4
[17384.290944] [31980]  1000 31980     4425      280   5       0             0 trinity-child5
[17384.291348] [31981]  1000 31981     5216     1164   2       0             0 trinity-child2
[17384.291750] [31983]  1000 31983     5546     1489   1       0             0 trinity-child1
[17384.292154] [31984]  1000 31984     4753      737   5       0             0 trinity-child5
[17384.292557] [31985]  1000 31985     5021      989   1       0             0 trinity-child1
[17384.292962] [31986]  1000 31986     4389      419   7       0             0 trinity-child7
[17384.293369] [31987]  1000 31987     4984      952   3       0             0 trinity-child3
[17384.293778] [31988]  1000 31988     4389      445   6       0             0 trinity-child6
[17384.294190] [31989]  1000 31989     6310     2260   6       0             0 trinity-child6
[17384.294603] [32072]  1000 32072     4648      568   2       0             0 trinity-child2
[17384.295026] [32073]  1000 32073     4285      227   1       0             0 trinity-child1
[17384.295441] [32074]  1000 32074     4648      609   3       0             0 trinity-child3
[17384.295856] [32079]  1000 32079     4429      292   3       0             0 trinity-child3
[17384.296267] [32083]  1000 32083     4429      292   4       0             0 trinity-child4
[17384.296677] [32084]  1000 32084     4549      538   5       0             0 trinity-child5
[17384.297083] [32085]  1000 32085     4648      567   6       0             0 trinity-child6
[17384.297490] [32094]  1000 32094     4429      128   5       0             0 trinity-child5
[17384.297899] [32095]  1000 32095     4429       34   3       0             0 trinity-child2
[17384.298306] [32104]  1000 32104     4465      330   7       0             0 trinity-child7
[17384.298715] [32117]  1000 32117     4648      577   4       0             0 trinity-child4
[17384.299128] [32128]  1000 32128     4458      382   4       0             0 trinity-child4
[17384.299537] [32131]  1000 32131     4433      126   6       0             0 trinity-child6
[17384.299948] [32132]  1000 32132     4433      120   5       0             0 trinity-child5
[17384.300356] [32133]  1000 32133     4432      373   7       0             0 trinity-child7
[17384.300764] [32135]  1000 32135     4433      121   1       0             0 trinity-child1
[17384.301172] [32136]  1000 32136     4432      344   2       0             0 trinity-child2
[17384.301581] [32138]  1000 32138     4387      152   2       0             0 trinity-child2
[17384.301987] [32146]  1000 32146     4387      148   3       0             0 trinity-child3
[17384.302391] [32148]  1000 32148     4433       74   0       0             0 trinity-child0
[17384.302795] [32149]  1000 32149     4433      103   4       0             0 trinity-child4
[17384.303200] [32150]  1000 32150     4433      115   3       0             0 trinity-child3
[17384.303606] [32151]  1000 32151     4285      200   7       0             0 trinity-child7
[17384.304011] [32162]  1000 32162     4365      160   4       0             0 trinity-child4
[17384.304417] [32163]  1000 32163     4387      156   1       0             0 trinity-child1
[17384.304832] [32178]  1000 32178     4387       35   1       0             0 trinity-child6
[17384.305238] [32179]  1000 32179     4391      214   2       0             0 trinity-child2
[17384.305646] [32180]  1000 32180     4387      147   0       0             0 trinity-child0
[17384.306053] [32181]  1000 32181     4387      153   4       0             0 trinity-child4
[17384.306459] [32182]  1000 32182     4423      225   4       0             0 trinity-child4
[17384.306865] [32183]  1000 32183     4885      621   5       0             0 trinity-child5
[17384.307270] [32184]  1000 32184     4885      632   1       0             0 trinity-child1
[17384.307676] [32185]  1000 32185     4423      233   6       0             0 trinity-child6
[17384.308086] [32186]  1000 32186     4423      142   2       0             0 trinity-child2
[17384.308493] [32187]  1000 32187     4423      118   0       0             0 trinity-child0
[17384.308897] [32188]  1000 32188     4423      185   7       0             0 trinity-child7
[17384.309302] [32189]  1000 32189     4423      232   3       0             0 trinity-child3
[17384.309707] [32190]  1000 32190     4387       35   4       0             0 trinity-child5
[17384.310116] Out of memory: Kill process 503 (dbus-daemon) score 511952 or sacrifice child
[17384.310534] Killed process 503 (dbus-daemon) total-vm:21740kB, anon-rss:0kB, file-rss:4kB
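
As a rough cross-check against the numbers in the dump above, assuming the
commonly documented badness heuristic (points ~= rss + oom_score_adj *
totalpages / 1000 -- a simplification of the real 3.5-rc2 expression):

#include <stdio.h>

int main(void)
{
	long long ram_pages  = 1041904 - 70330;	/* "pages RAM" - "pages reserved" */
	long long swap_pages = 6029308 / 4;	/* "Total swap" in kB, 4kB pages */
	long long totalpages = ram_pages + swap_pages;

	long long oom_score_adj = -900;		/* dbus-daemon, from the task list */
	long long rss = 1;			/* pages, from the task list */

	long long points = rss + oom_score_adj * totalpages / 1000;

	printf("totalpages = %lld\n", totalpages);	/* 2478901 */
	printf("points     = %lld\n", points);		/* -2231009 */
	return 0;
}

An adjustment term of roughly -2.2 million pages should pin a 1-page-rss
dbus-daemon at the very bottom of the victim list, so the printed score of
511952 is hard to square with the heuristic.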

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org


Thread overview: 34+ messages
2012-06-04 15:27 oomkillers gone wild Dave Jones
2012-06-04 23:30 ` David Rientjes
2012-06-05 17:44   ` Dave Jones
2012-06-05 18:52     ` Dave Jones
2012-06-08 19:57       ` David Rientjes
2012-06-08 20:03         ` Dave Jones
2012-06-08 20:37         ` Thomas Gleixner
2012-06-10  2:15           ` David Rientjes
2012-06-08 20:15     ` David Rientjes
2012-06-08 21:03       ` Dave Jones
2012-06-10  2:21         ` David Rientjes
2012-06-10  3:21           ` KOSAKI Motohiro
2012-06-10 20:10             ` Dave Jones
2012-06-10 23:52               ` David Rientjes
2012-06-11  0:46                 ` Dave Jones
2012-06-11  9:11                   ` [patch 3.5-rc2] mm, oom: fix and cleanup oom score calculations David Rientjes
2012-06-11 19:13                     ` Dave Jones
