* bug in memcg oom-killer results in a hung syscall in another process in the same cgroup
From: Shayan Pooya @ 2016-07-09 23:49 UTC
To: cgroups mailinglist, LKML, linux-mm
I came across the following issue on kernel 3.16 (Ubuntu 14.04) and
then reproduced it on the 4.4 LTS kernel:
After a couple of memcg oom-kills in a cgroup, a syscall in
*another* process in the same cgroup hangs indefinitely.
Reproducing:
# mkdir -p strace_run
# mkdir /sys/fs/cgroup/memory/1
# echo 1073741824 > /sys/fs/cgroup/memory/1/memory.limit_in_bytes
# echo 0 > /sys/fs/cgroup/memory/1/memory.swappiness
# for i in $(seq 1000); do ./call-mem-hog /sys/fs/cgroup/memory/1/cgroup.procs & done
Where call-mem-hog is:
#!/bin/sh
set -ex
echo $$ > $1                                # move this shell (and its children) into the cgroup
echo "Adding $$ to $1"
strace -ff -tt ./mem-hog 2> strace_run/$$   # run the memory hog under strace
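mem-hog itself is not pasted here; a minimal sketch of that kind of
program (hypothetical; assuming it only keeps extending the heap with
sbrk() and touching the pages until the memcg limit triggers the OOM
killer) would be:

/* hypothetical mem-hog sketch, not necessarily the exact program used */
#include <string.h>
#include <unistd.h>

int main(void)
{
        for (;;) {
                char *p = sbrk(1 << 20);        /* grow the heap by 1 MiB via the brk syscall */
                if (p == (char *)-1)
                        continue;               /* brk refused; keep trying */
                memset(p, 0xff, 1 << 20);       /* fault the pages in as anon rss */
        }
}

With the 1 GiB limit and swappiness 0 set above, a loop like this
produces a steady stream of memcg oom-kills.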
Initially I thought it was a userspace bug in dash as it only happened
with /bin/sh (which points to dash) and not with bash. I see the
following hanging processes:
USER       PID %CPU %MEM   VSZ  RSS TTY   STAT START TIME COMMAND
root     20999  0.0  0.0  4508  100 pts/6 S    16:28 0:00 /bin/sh ./call-mem-hog /sys/fs/cgroup/memory/1/cgroup.procs
However, when using strace, I noticed that sometimes there is actually
a mem-hog process hanging in the brk syscall (memory.oom_control is 0,
i.e. the OOM killer is enabled, so this hang is not expected).
Sending a SIGABRT to the waiting strace process then resulted in the
mem-hog process getting oom-killed by the kernel.
* Re: bug in memcg oom-killer results in a hung syscall in another process in the same cgroup
From: Michal Hocko @ 2016-07-11 6:41 UTC
To: Shayan Pooya; +Cc: cgroups mailinglist, LKML, linux-mm
On Sat 09-07-16 16:49:32, Shayan Pooya wrote:
> I came across the following issue on kernel 3.16 (Ubuntu 14.04) and
> then reproduced it on the 4.4 LTS kernel:
> After a couple of memcg oom-kills in a cgroup, a syscall in
> *another* process in the same cgroup hangs indefinitely.
>
> Reproducing:
>
> # mkdir -p strace_run
> # mkdir /sys/fs/cgroup/memory/1
> # echo 1073741824 > /sys/fs/cgroup/memory/1/memory.limit_in_bytes
> # echo 0 > /sys/fs/cgroup/memory/1/memory.swappiness
> # for i in $(seq 1000); do ./call-mem-hog /sys/fs/cgroup/memory/1/cgroup.procs & done
>
> Where call-mem-hog is:
> #!/bin/sh
> set -ex
> echo $$ > $1                                # move this shell (and its children) into the cgroup
> echo "Adding $$ to $1"
> strace -ff -tt ./mem-hog 2> strace_run/$$   # run the memory hog under strace
>
>
> Initially I thought it was a userspace bug in dash as it only happened
> with /bin/sh (which points to dash) and not with bash. I see the
> following hanging processes:
>
> USER       PID %CPU %MEM   VSZ  RSS TTY   STAT START TIME COMMAND
> root     20999  0.0  0.0  4508  100 pts/6 S    16:28 0:00 /bin/sh ./call-mem-hog /sys/fs/cgroup/memory/1/cgroup.procs
>
> However, when using strace, I noticed that sometimes there is actually
> a mem-hog process hanging in the brk syscall (memory.oom_control is 0,
> i.e. the OOM killer is enabled, so this hang is not expected).
> Sending a SIGABRT to the waiting strace process then resulted in the
> mem-hog process getting oom-killed by the kernel.
Could you post the stack trace of the hung oom victim? Also could you
post the full kernel log?
--
Michal Hocko
SUSE Labs
* Re: bug in memcg oom-killer results in a hung syscall in another process in the same cgroup
From: Shayan Pooya @ 2016-07-11 17:40 UTC
To: Michal Hocko; +Cc: cgroups mailinglist, LKML, linux-mm
>
> Could you post the stack trace of the hung oom victim? Also could you
> post the full kernel log?
Here is the stack of the surviving process (it is *not* the
oom victim), from a run with 100 processes and *without* strace:
# cat /proc/7688/stack
[<ffffffff81100292>] futex_wait_queue_me+0xc2/0x120
[<ffffffff811005a6>] futex_wait+0x116/0x280
[<ffffffff81102d90>] do_futex+0x120/0x540
[<ffffffff81103231>] SyS_futex+0x81/0x180
[<ffffffff81825bf2>] entry_SYSCALL_64_fastpath+0x16/0x71
[<ffffffffffffffff>] 0xffffffffffffffff
Also:
# pgrep call-mem-hog | wc -l
30
They all look like:
USER       PID %CPU %MEM   VSZ  RSS TTY   STAT START TIME COMMAND
root      7570  0.0  0.0  4508  100 pts/9 S    10:14 0:00 /bin/sh ./call-mem-hog /sys/fs/cgroup/memory/1/cgroup.procs
# cat /sys/fs/cgroup/memory/1/cgroup.procs | wc -l
30
# uname -a
Linux sanblas 4.4.0-24-generic #43-Ubuntu SMP Wed Jun 8 19:27:37 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
root@sanblas:~/oom_stuff# grep 'Killed process' kern.log | wc -l
64
The full kern.log and the output of (echo t > /proc/sysrq-trigger) are
available at https://gist.github.com/pooya/da6fce58ce546c7a3631b2eb16152c0c
The kernel log for the oom-kills is pasted here:
-- Logs begin at Mon 2016-07-11 10:00:25 PDT. --
Jul 11 10:05:09 sanblas systemd[1]: Stopping CUPS Scheduler...
Jul 11 10:05:10 sanblas systemd[1]: Stopped CUPS Scheduler.
Jul 11 10:05:10 sanblas systemd[1]: Started CUPS Scheduler.
Jul 11 10:05:35 sanblas anacron[840]: Job `cron.daily' terminated
(mailing output)
Jul 11 10:05:35 sanblas anacron[840]: anacron: Can't find sendmail at
/usr/sbin/sendmail, not mailing output
Jul 11 10:05:35 sanblas anacron[840]: Can't find sendmail at
/usr/sbin/sendmail, not mailing output
Jul 11 10:10:26 sanblas anacron[840]: Job `cron.weekly' started
Jul 11 10:10:26 sanblas anacron[3456]: Updated timestamp for job
`cron.weekly' to 2016-07-11
Jul 11 10:10:48 sanblas anacron[840]: Job `cron.weekly' terminated
Jul 11 10:10:48 sanblas anacron[840]: Normal exit (2 jobs run)
Jul 11 10:14:02 sanblas kernel: mem-hog invoked oom-killer:
gfp_mask=0x24000c0, order=0, oom_score_adj=0
Jul 11 10:14:02 sanblas kernel: mem-hog cpuset=/ mems_allowed=0
Jul 11 10:14:02 sanblas kernel: CPU: 6 PID: 7546 Comm: mem-hog Not
tainted 4.4.0-24-generic #43-Ubuntu
Jul 11 10:14:02 sanblas kernel: Hardware name: Dell Inc. OptiPlex
9020/00V62H, BIOS A10 01/08/2015
Jul 11 10:14:02 sanblas kernel: 0000000000000286 000000002feb37d8
ffff8801da493c88 ffffffff813eab23
Jul 11 10:14:02 sanblas kernel: ffff8801da493d68 ffff8802134e44c0
ffff8801da493cf8 ffffffff8120906e
Jul 11 10:14:02 sanblas kernel: ffff8801da493d10 ffff8801da493cc8
ffffffff81190b3b ffff8800c0346e00
Jul 11 10:14:02 sanblas kernel: Call Trace:
Jul 11 10:14:02 sanblas kernel: [<ffffffff813eab23>] dump_stack+0x63/0x90
Jul 11 10:14:02 sanblas kernel: [<ffffffff8120906e>] dump_header+0x5a/0x1c5
Jul 11 10:14:02 sanblas kernel: [<ffffffff81190b3b>] ?
find_lock_task_mm+0x3b/0x80
Jul 11 10:14:02 sanblas kernel: [<ffffffff81191102>]
oom_kill_process+0x202/0x3c0
Jul 11 10:14:02 sanblas kernel: [<ffffffff811fce94>] ?
mem_cgroup_iter+0x204/0x390
Jul 11 10:14:02 sanblas kernel: [<ffffffff811feef3>]
mem_cgroup_out_of_memory+0x2b3/0x300
Jul 11 10:14:02 sanblas kernel: [<ffffffff811ffcc8>]
mem_cgroup_oom_synchronize+0x338/0x350
Jul 11 10:14:02 sanblas kernel: [<ffffffff811fb1f0>] ?
kzalloc_node.constprop.48+0x20/0x20
Jul 11 10:14:02 sanblas kernel: [<ffffffff811917b4>]
pagefault_out_of_memory+0x44/0xc0
Jul 11 10:14:02 sanblas kernel: [<ffffffff8106b2c2>] mm_fault_error+0x82/0x160
Jul 11 10:14:02 sanblas kernel: [<ffffffff8106b778>]
__do_page_fault+0x3d8/0x400
Jul 11 10:14:02 sanblas kernel: [<ffffffff8106b7c2>] do_page_fault+0x22/0x30
Jul 11 10:14:02 sanblas kernel: [<ffffffff81827d78>] page_fault+0x28/0x30
Jul 11 10:14:02 sanblas kernel: Task in /1 killed as a result of limit of /1
Jul 11 10:14:02 sanblas kernel: memory: usage 1048576kB, limit
1048576kB, failcnt 92
Jul 11 10:14:02 sanblas kernel: memory+swap: usage 0kB, limit
9007199254740988kB, failcnt 0
Jul 11 10:14:02 sanblas kernel: kmem: usage 0kB, limit
9007199254740988kB, failcnt 0
Jul 11 10:14:02 sanblas kernel: Memory cgroup stats for /1: cache:0KB
rss:1048576KB rss_huge:0KB mapped_file:0KB dirty:0KB writeback:0KB
inactive_anon:362496KB active_anon:684160KB inactive_file:0KB
active_file:0KB unevictable:0KB
Jul 11 10:14:02 sanblas kernel: [ pid ] uid tgid total_vm rss
nr_ptes nr_pmds swapents oom_score_adj name
Jul 11 10:14:02 sanblas kernel: [ 7536] 0 7536 35261 32488
75 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7538] 0 7538 26483 23679
58 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7540] 0 7540 32852 30110
70 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7542] 0 7542 25229 22435
56 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7544] 0 7544 21896 19089
46 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7546] 0 7546 31235 28465
64 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7548] 0 7548 25163 22380
55 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7550] 0 7550 16187 13404
37 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7552] 0 7552 16121 13346
37 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7554] 0 7554 24206 21392
52 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7556] 0 7556 18431 15621
41 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7558] 0 7558 11864 9037
26 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7560] 0 7560 11006 8249
28 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7562] 0 7562 8894 6100
21 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7564] 0 7564 6221 3427
16 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7566] 0 7566 1127 24
8 3 0 0 call-mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7567] 0 7567 1127 198
8 3 0 0 call-mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7568] 0 7568 1127 25
5 3 0 0 call-mem-hog
Jul 11 10:14:02 sanblas kernel: Memory cgroup out of memory: Kill
process 7536 (mem-hog) score 120 or sacrifice child
Jul 11 10:14:02 sanblas kernel: Killed process 7536 (mem-hog)
total-vm:141044kB, anon-rss:127864kB, file-rss:2088kB
Jul 11 10:14:02 sanblas kernel: mem-hog invoked oom-killer:
gfp_mask=0x24000c0, order=0, oom_score_adj=0
Jul 11 10:14:02 sanblas kernel: mem-hog cpuset=/ mems_allowed=0
Jul 11 10:14:02 sanblas kernel: CPU: 5 PID: 7540 Comm: mem-hog Not
tainted 4.4.0-24-generic #43-Ubuntu
Jul 11 10:14:02 sanblas kernel: Hardware name: Dell Inc. OptiPlex
9020/00V62H, BIOS A10 01/08/2015
Jul 11 10:14:02 sanblas kernel: 0000000000000286 000000009c7c8bd0
ffff8800c00a7c88 ffffffff813eab23
Jul 11 10:14:02 sanblas kernel: ffff8800c00a7d68 ffff8801efbab700
ffff8800c00a7cf8 ffffffff8120906e
Jul 11 10:14:02 sanblas kernel: ffff88021eb56d00 ffff8800c00a7cc8
ffffffff81190b3b ffff8801f8806e00
Jul 11 10:14:02 sanblas kernel: Call Trace:
Jul 11 10:14:02 sanblas kernel: [<ffffffff813eab23>] dump_stack+0x63/0x90
Jul 11 10:14:02 sanblas kernel: [<ffffffff8120906e>] dump_header+0x5a/0x1c5
Jul 11 10:14:02 sanblas kernel: [<ffffffff81190b3b>] ?
find_lock_task_mm+0x3b/0x80
Jul 11 10:14:02 sanblas kernel: [<ffffffff81191102>]
oom_kill_process+0x202/0x3c0
Jul 11 10:14:02 sanblas kernel: [<ffffffff811fce94>] ?
mem_cgroup_iter+0x204/0x390
Jul 11 10:14:02 sanblas kernel: [<ffffffff811feef3>]
mem_cgroup_out_of_memory+0x2b3/0x300
Jul 11 10:14:02 sanblas kernel: [<ffffffff811ffcc8>]
mem_cgroup_oom_synchronize+0x338/0x350
Jul 11 10:14:02 sanblas kernel: [<ffffffff811fb1f0>] ?
kzalloc_node.constprop.48+0x20/0x20
Jul 11 10:14:02 sanblas kernel: [<ffffffff811917b4>]
pagefault_out_of_memory+0x44/0xc0
Jul 11 10:14:02 sanblas kernel: [<ffffffff8106b2c2>] mm_fault_error+0x82/0x160
Jul 11 10:14:02 sanblas kernel: [<ffffffff8106b778>]
__do_page_fault+0x3d8/0x400
Jul 11 10:14:02 sanblas kernel: [<ffffffff8106b7c2>] do_page_fault+0x22/0x30
Jul 11 10:14:02 sanblas kernel: [<ffffffff81827d78>] page_fault+0x28/0x30
Jul 11 10:14:02 sanblas kernel: Task in /1 killed as a result of limit of /1
Jul 11 10:14:02 sanblas kernel: memory: usage 1048576kB, limit
1048576kB, failcnt 207
Jul 11 10:14:02 sanblas kernel: memory+swap: usage 0kB, limit
9007199254740988kB, failcnt 0
Jul 11 10:14:02 sanblas kernel: kmem: usage 0kB, limit
9007199254740988kB, failcnt 0
Jul 11 10:14:02 sanblas kernel: Memory cgroup stats for /1: cache:0KB
rss:1048576KB rss_huge:0KB mapped_file:0KB dirty:0KB writeback:0KB
inactive_anon:524908KB active_anon:523284KB inactive_file:0KB
active_file:0KB unevictable:0KB
Jul 11 10:14:02 sanblas kernel: [ pid ] uid tgid total_vm rss
nr_ptes nr_pmds swapents oom_score_adj name
Jul 11 10:14:02 sanblas kernel: [ 7538] 0 7538 27869 25065
60 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7540] 0 7540 34238 31496
72 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7542] 0 7542 26648 23821
58 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7544] 0 7544 24602 21861
51 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7546] 0 7546 34007 31236
69 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7548] 0 7548 27143 24360
59 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7550] 0 7550 18893 16110
42 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7552] 0 7552 17507 14731
40 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7554] 0 7554 25559 22777
54 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7556] 0 7556 19784 17007
43 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7558] 0 7558 13943 11148
30 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7560] 0 7560 15197 12407
36 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7562] 0 7562 11633 8871
26 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7564] 0 7564 9554 6793
22 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7566] 0 7566 4439 1640
14 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7568] 0 7568 1127 25
6 3 0 0 call-mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7569] 0 7569 1127 177
8 3 0 0 call-mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7570] 0 7570 1127 25
5 3 0 0 call-mem-hog
Jul 11 10:14:02 sanblas kernel: Memory cgroup out of memory: Kill
process 7540 (mem-hog) score 116 or sacrifice child
Jul 11 10:14:02 sanblas kernel: Killed process 7540 (mem-hog)
total-vm:136952kB, anon-rss:123912kB, file-rss:2072kB
Jul 11 10:14:02 sanblas kernel: mem-hog invoked oom-killer:
gfp_mask=0x24000c0, order=0, oom_score_adj=0
Jul 11 10:14:02 sanblas kernel: mem-hog cpuset=/ mems_allowed=0
Jul 11 10:14:02 sanblas kernel: CPU: 7 PID: 7560 Comm: mem-hog Not
tainted 4.4.0-24-generic #43-Ubuntu
Jul 11 10:14:02 sanblas kernel: Hardware name: Dell Inc. OptiPlex
9020/00V62H, BIOS A10 01/08/2015
Jul 11 10:14:02 sanblas kernel: 0000000000000286 00000000d75ee657
ffff8800c29cbc88 ffffffff813eab23
Jul 11 10:14:02 sanblas kernel: ffff8800c29cbd68 ffff8801ef8a1b80
ffff8800c29cbcf8 ffffffff8120906e
Jul 11 10:14:02 sanblas kernel: ffff88021ebd6d00 ffff8800c29cbcc8
ffffffff81190b3b ffff8801f8806e00
Jul 11 10:14:02 sanblas kernel: Call Trace:
Jul 11 10:14:02 sanblas kernel: [<ffffffff813eab23>] dump_stack+0x63/0x90
Jul 11 10:14:02 sanblas kernel: [<ffffffff8120906e>] dump_header+0x5a/0x1c5
Jul 11 10:14:02 sanblas kernel: [<ffffffff81190b3b>] ?
find_lock_task_mm+0x3b/0x80
Jul 11 10:14:02 sanblas kernel: [<ffffffff81191102>]
oom_kill_process+0x202/0x3c0
Jul 11 10:14:02 sanblas kernel: [<ffffffff811fce94>] ?
mem_cgroup_iter+0x204/0x390
Jul 11 10:14:02 sanblas kernel: [<ffffffff811feef3>]
mem_cgroup_out_of_memory+0x2b3/0x300
Jul 11 10:14:02 sanblas kernel: [<ffffffff811ffcc8>]
mem_cgroup_oom_synchronize+0x338/0x350
Jul 11 10:14:02 sanblas kernel: [<ffffffff811fb1f0>] ?
kzalloc_node.constprop.48+0x20/0x20
Jul 11 10:14:02 sanblas kernel: [<ffffffff811917b4>]
pagefault_out_of_memory+0x44/0xc0
Jul 11 10:14:02 sanblas kernel: [<ffffffff8106b2c2>] mm_fault_error+0x82/0x160
Jul 11 10:14:02 sanblas kernel: [<ffffffff8106b778>]
__do_page_fault+0x3d8/0x400
Jul 11 10:14:02 sanblas kernel: [<ffffffff8106b7c2>] do_page_fault+0x22/0x30
Jul 11 10:14:02 sanblas kernel: [<ffffffff81827d78>] page_fault+0x28/0x30
Jul 11 10:14:02 sanblas kernel: Task in /1 killed as a result of limit of /1
Jul 11 10:14:02 sanblas kernel: memory: usage 1048576kB, limit
1048576kB, failcnt 685
Jul 11 10:14:02 sanblas kernel: memory+swap: usage 0kB, limit
9007199254740988kB, failcnt 0
Jul 11 10:14:02 sanblas kernel: kmem: usage 0kB, limit
9007199254740988kB, failcnt 0
Jul 11 10:14:02 sanblas kernel: Memory cgroup stats for /1: cache:0KB
rss:1048576KB rss_huge:0KB mapped_file:0KB dirty:0KB writeback:0KB
inactive_anon:525500KB active_anon:523076KB inactive_file:0KB
active_file:0KB unevictable:0KB
Jul 11 10:14:02 sanblas kernel: [ pid ] uid tgid total_vm rss
nr_ptes nr_pmds swapents oom_score_adj name
Jul 11 10:14:02 sanblas kernel: [ 7538] 0 7538 29354 26581
63 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7542] 0 7542 29156 26393
63 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7544] 0 7544 24635 21861
52 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7546] 0 7546 35261 32488
72 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7548] 0 7548 29024 26271
62 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7550] 0 7550 20378 17624
45 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7552] 0 7552 21500 18688
47 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7554] 0 7554 28067 25283
59 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7556] 0 7556 23777 20964
51 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7558] 0 7558 16154 13324
35 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7560] 0 7560 16187 13396
38 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7562] 0 7562 15527 12762
34 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7564] 0 7564 11138 8308
26 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7566] 0 7566 6353 3553
18 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7568] 0 7568 4538 1797
14 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7570] 0 7570 1127 25
8 3 0 0 call-mem-hog
Jul 11 10:14:02 sanblas kernel: Memory cgroup out of memory: Kill
process 7546 (mem-hog) score 120 or sacrifice child
Jul 11 10:14:02 sanblas kernel: Killed process 7546 (mem-hog)
total-vm:141044kB, anon-rss:127848kB, file-rss:2104kB
Jul 11 10:14:02 sanblas kernel: mem-hog invoked oom-killer:
gfp_mask=0x24000c0, order=0, oom_score_adj=0
Jul 11 10:14:02 sanblas kernel: mem-hog cpuset=/ mems_allowed=0
Jul 11 10:14:02 sanblas kernel: CPU: 6 PID: 7554 Comm: mem-hog Not
tainted 4.4.0-24-generic #43-Ubuntu
Jul 11 10:14:02 sanblas kernel: Hardware name: Dell Inc. OptiPlex
9020/00V62H, BIOS A10 01/08/2015
Jul 11 10:14:02 sanblas kernel: 0000000000000286 000000001c5a024f
ffff8801f8b63c88 ffffffff813eab23
Jul 11 10:14:02 sanblas kernel: ffff8801f8b63d68 ffff8802130c8000
ffff8801f8b63cf8 ffffffff8120906e
Jul 11 10:14:02 sanblas kernel: ffff88021eb96d00 ffff8801f8b63cc8
ffffffff81190b3b ffff8800c0345280
Jul 11 10:14:02 sanblas kernel: Call Trace:
Jul 11 10:14:02 sanblas kernel: [<ffffffff813eab23>] dump_stack+0x63/0x90
Jul 11 10:14:02 sanblas kernel: [<ffffffff8120906e>] dump_header+0x5a/0x1c5
Jul 11 10:14:02 sanblas kernel: [<ffffffff81190b3b>] ?
find_lock_task_mm+0x3b/0x80
Jul 11 10:14:02 sanblas kernel: [<ffffffff81191102>]
oom_kill_process+0x202/0x3c0
Jul 11 10:14:02 sanblas kernel: [<ffffffff811fce94>] ?
mem_cgroup_iter+0x204/0x390
Jul 11 10:14:02 sanblas kernel: [<ffffffff811feef3>]
mem_cgroup_out_of_memory+0x2b3/0x300
Jul 11 10:14:02 sanblas kernel: [<ffffffff811ffcc8>]
mem_cgroup_oom_synchronize+0x338/0x350
Jul 11 10:14:02 sanblas kernel: [<ffffffff811fb1f0>] ?
kzalloc_node.constprop.48+0x20/0x20
Jul 11 10:14:02 sanblas kernel: [<ffffffff811917b4>]
pagefault_out_of_memory+0x44/0xc0
Jul 11 10:14:02 sanblas kernel: [<ffffffff8106b2c2>] mm_fault_error+0x82/0x160
Jul 11 10:14:02 sanblas kernel: [<ffffffff8106b778>]
__do_page_fault+0x3d8/0x400
Jul 11 10:14:02 sanblas kernel: [<ffffffff8106b7c2>] do_page_fault+0x22/0x30
Jul 11 10:14:02 sanblas kernel: [<ffffffff81827d78>] page_fault+0x28/0x30
Jul 11 10:14:02 sanblas kernel: Task in /1 killed as a result of limit of /1
Jul 11 10:14:02 sanblas kernel: memory: usage 1048576kB, limit
1048576kB, failcnt 1170
Jul 11 10:14:02 sanblas kernel: memory+swap: usage 0kB, limit
9007199254740988kB, failcnt 0
Jul 11 10:14:02 sanblas kernel: kmem: usage 0kB, limit
9007199254740988kB, failcnt 0
Jul 11 10:14:02 sanblas kernel: Memory cgroup stats for /1: cache:0KB
rss:1048576KB rss_huge:0KB mapped_file:0KB dirty:0KB writeback:0KB
inactive_anon:525444KB active_anon:523132KB inactive_file:0KB
active_file:0KB unevictable:0KB
Jul 11 10:14:02 sanblas kernel: [ pid ] uid tgid total_vm rss
nr_ptes nr_pmds swapents oom_score_adj name
Jul 11 10:14:02 sanblas kernel: [ 7538] 0 7538 31037 28227
66 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7542] 0 7542 31598 28833
68 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7544] 0 7544 27374 24626
57 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7548] 0 7548 30806 27985
66 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7550] 0 7550 20378 17624
45 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7552] 0 7552 23942 21194
52 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7554] 0 7554 30542 27787
64 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7556] 0 7556 26087 23273
56 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7558] 0 7558 19157 16357
41 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7560] 0 7560 17870 15108
41 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7562] 0 7562 19421 16655
42 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7564] 0 7564 14042 11274
31 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7566] 0 7566 8036 5198
21 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7568] 0 7568 7475 4698
20 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7570] 0 7570 1127 25
8 3 0 0 call-mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7572] 0 7572 1127 24
6 3 0 0 call-mem-hog
Jul 11 10:14:02 sanblas kernel: Memory cgroup out of memory: Kill
process 7542 (mem-hog) score 106 or sacrifice child
Jul 11 10:14:02 sanblas kernel: Killed process 7542 (mem-hog)
total-vm:126392kB, anon-rss:113328kB, file-rss:2004kB
Jul 11 10:14:02 sanblas kernel: mem-hog invoked oom-killer:
gfp_mask=0x24000c0, order=0, oom_score_adj=0
Jul 11 10:14:02 sanblas kernel: mem-hog cpuset=/ mems_allowed=0
Jul 11 10:14:02 sanblas kernel: CPU: 0 PID: 7538 Comm: mem-hog Not
tainted 4.4.0-24-generic #43-Ubuntu
Jul 11 10:14:02 sanblas kernel: Hardware name: Dell Inc. OptiPlex
9020/00V62H, BIOS A10 01/08/2015
Jul 11 10:14:02 sanblas kernel: 0000000000000286 000000004f1b33e0
ffff8800c2827c88 ffffffff813eab23
Jul 11 10:14:02 sanblas kernel: ffff8800c2827d68 ffff8801f8975280
ffff8800c2827cf8 ffffffff8120906e
Jul 11 10:14:02 sanblas kernel: ffff88021ea16d00 ffff8800c2827cc8
ffffffff81190b3b ffff8801ef8a5280
Jul 11 10:14:02 sanblas kernel: Call Trace:
Jul 11 10:14:02 sanblas kernel: [<ffffffff813eab23>] dump_stack+0x63/0x90
Jul 11 10:14:02 sanblas kernel: [<ffffffff8120906e>] dump_header+0x5a/0x1c5
Jul 11 10:14:02 sanblas kernel: [<ffffffff81190b3b>] ?
find_lock_task_mm+0x3b/0x80
Jul 11 10:14:02 sanblas kernel: [<ffffffff81191102>]
oom_kill_process+0x202/0x3c0
Jul 11 10:14:02 sanblas kernel: [<ffffffff811fce94>] ?
mem_cgroup_iter+0x204/0x390
Jul 11 10:14:02 sanblas kernel: [<ffffffff811feef3>]
mem_cgroup_out_of_memory+0x2b3/0x300
Jul 11 10:14:02 sanblas kernel: [<ffffffff811ffcc8>]
mem_cgroup_oom_synchronize+0x338/0x350
Jul 11 10:14:02 sanblas kernel: [<ffffffff811fb1f0>] ?
kzalloc_node.constprop.48+0x20/0x20
Jul 11 10:14:02 sanblas kernel: [<ffffffff811917b4>]
pagefault_out_of_memory+0x44/0xc0
Jul 11 10:14:02 sanblas kernel: [<ffffffff8106b2c2>] mm_fault_error+0x82/0x160
Jul 11 10:14:02 sanblas kernel: [<ffffffff8106b778>]
__do_page_fault+0x3d8/0x400
Jul 11 10:14:02 sanblas kernel: [<ffffffff8106b7c2>] do_page_fault+0x22/0x30
Jul 11 10:14:02 sanblas kernel: [<ffffffff81827d78>] page_fault+0x28/0x30
Jul 11 10:14:02 sanblas kernel: Task in /1 killed as a result of limit of /1
Jul 11 10:14:02 sanblas kernel: memory: usage 1048576kB, limit
1048576kB, failcnt 1419
Jul 11 10:14:02 sanblas kernel: memory+swap: usage 0kB, limit
9007199254740988kB, failcnt 0
Jul 11 10:14:02 sanblas kernel: kmem: usage 0kB, limit
9007199254740988kB, failcnt 0
Jul 11 10:14:02 sanblas kernel: Memory cgroup stats for /1: cache:0KB
rss:1048576KB rss_huge:0KB mapped_file:0KB dirty:0KB writeback:0KB
inactive_anon:525388KB active_anon:523188KB inactive_file:0KB
active_file:0KB unevictable:0KB
Jul 11 10:14:02 sanblas kernel: [ pid ] uid tgid total_vm rss
nr_ptes nr_pmds swapents oom_score_adj name
Jul 11 10:14:02 sanblas kernel: [ 7538] 0 7538 34799 31987
74 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7544] 0 7544 31070 28320
64 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7548] 0 7548 32324 29501
69 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7550] 0 7550 22391 19596
49 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7552] 0 7552 25493 22711
55 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7554] 0 7554 30542 27787
64 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7556] 0 7556 26912 24129
57 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7558] 0 7558 20741 17940
44 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7560] 0 7560 20774 17944
47 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7562] 0 7562 21434 18634
45 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7564] 0 7564 17045 14243
37 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7566] 0 7566 10049 7243
25 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7568] 0 7568 9653 6874
24 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7570] 0 7570 1127 25
8 3 0 0 call-mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7572] 0 7572 4571 1788
15 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7574] 0 7574 1127 25
6 3 0 0 call-mem-hog
Jul 11 10:14:02 sanblas kernel: Memory cgroup out of memory: Kill
process 7538 (mem-hog) score 118 or sacrifice child
Jul 11 10:14:02 sanblas kernel: Killed process 7538 (mem-hog)
total-vm:139196kB, anon-rss:125988kB, file-rss:1960kB
Jul 11 10:14:02 sanblas kernel: mem-hog invoked oom-killer:
gfp_mask=0x24000c0, order=0, oom_score_adj=0
Jul 11 10:14:02 sanblas kernel: mem-hog cpuset=/ mems_allowed=0
Jul 11 10:14:02 sanblas kernel: CPU: 0 PID: 7572 Comm: mem-hog Not
tainted 4.4.0-24-generic #43-Ubuntu
Jul 11 10:14:02 sanblas kernel: Hardware name: Dell Inc. OptiPlex
9020/00V62H, BIOS A10 01/08/2015
Jul 11 10:14:02 sanblas kernel: 0000000000000286 00000000f67c0969
ffff8800d325bc88 ffffffff813eab23
Jul 11 10:14:02 sanblas kernel: ffff8800d325bd68 ffff8801f8800000
ffff8800d325bcf8 ffffffff8120906e
Jul 11 10:14:02 sanblas kernel: ffff88021ea16d00 ffff8800d325bcc8
ffffffff81190b3b ffff8801f8800dc0
Jul 11 10:14:02 sanblas kernel: Call Trace:
Jul 11 10:14:02 sanblas kernel: [<ffffffff813eab23>] dump_stack+0x63/0x90
Jul 11 10:14:02 sanblas kernel: [<ffffffff8120906e>] dump_header+0x5a/0x1c5
Jul 11 10:14:02 sanblas kernel: [<ffffffff81190b3b>] ?
find_lock_task_mm+0x3b/0x80
Jul 11 10:14:02 sanblas kernel: [<ffffffff81191102>]
oom_kill_process+0x202/0x3c0
Jul 11 10:14:02 sanblas kernel: [<ffffffff811fce94>] ?
mem_cgroup_iter+0x204/0x390
Jul 11 10:14:02 sanblas kernel: [<ffffffff811feef3>]
mem_cgroup_out_of_memory+0x2b3/0x300
Jul 11 10:14:02 sanblas kernel: [<ffffffff811ffcc8>]
mem_cgroup_oom_synchronize+0x338/0x350
Jul 11 10:14:02 sanblas kernel: [<ffffffff811fb1f0>] ?
kzalloc_node.constprop.48+0x20/0x20
Jul 11 10:14:02 sanblas kernel: [<ffffffff811917b4>]
pagefault_out_of_memory+0x44/0xc0
Jul 11 10:14:02 sanblas kernel: [<ffffffff8106b2c2>] mm_fault_error+0x82/0x160
Jul 11 10:14:02 sanblas kernel: [<ffffffff8106b778>]
__do_page_fault+0x3d8/0x400
Jul 11 10:14:02 sanblas kernel: [<ffffffff8106b7c2>] do_page_fault+0x22/0x30
Jul 11 10:14:02 sanblas kernel: [<ffffffff81827d78>] page_fault+0x28/0x30
Jul 11 10:14:02 sanblas kernel: Task in /1 killed as a result of limit of /1
Jul 11 10:14:02 sanblas kernel: memory: usage 1048576kB, limit
1048576kB, failcnt 1973
Jul 11 10:14:02 sanblas kernel: memory+swap: usage 0kB, limit
9007199254740988kB, failcnt 0
Jul 11 10:14:02 sanblas kernel: kmem: usage 0kB, limit
9007199254740988kB, failcnt 0
Jul 11 10:14:02 sanblas kernel: Memory cgroup stats for /1: cache:0KB
rss:1048576KB rss_huge:0KB mapped_file:0KB dirty:0KB writeback:0KB
inactive_anon:525248KB active_anon:523328KB inactive_file:0KB
active_file:0KB unevictable:0KB
Jul 11 10:14:02 sanblas kernel: [ pid ] uid tgid total_vm rss
nr_ptes nr_pmds swapents oom_score_adj name
Jul 11 10:14:02 sanblas kernel: [ 7544] 0 7544 32786 30030
68 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7548] 0 7548 34238 31480
73 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7550] 0 7550 23777 20978
52 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7552] 0 7552 25526 22776
55 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7554] 0 7554 34304 31544
71 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7556] 0 7556 29387 26568
62 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7558] 0 7558 24866 22093
52 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7560] 0 7560 23579 20778
52 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7562] 0 7562 23117 20345
49 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7564] 0 7564 19058 16288
41 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7566] 0 7566 12557 9749
30 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7568] 0 7568 11336 8586
28 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7570] 0 7570 1127 25
8 3 0 0 call-mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7572] 0 7572 6914 4096
20 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7574] 0 7574 5231 2433
15 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7576] 0 7576 4373 1600
14 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7578] 0 7578 1127 25
4 3 0 0 call-mem-hog
Jul 11 10:14:02 sanblas kernel: Memory cgroup out of memory: Kill
process 7554 (mem-hog) score 116 or sacrifice child
Jul 11 10:14:02 sanblas kernel: Killed process 7554 (mem-hog)
total-vm:137216kB, anon-rss:124116kB, file-rss:2060kB
Jul 11 10:14:02 sanblas kernel: mem-hog invoked oom-killer:
gfp_mask=0x24000c0, order=0, oom_score_adj=0
Jul 11 10:14:02 sanblas kernel: mem-hog cpuset=/ mems_allowed=0
Jul 11 10:14:02 sanblas kernel: CPU: 7 PID: 7576 Comm: mem-hog Not
tainted 4.4.0-24-generic #43-Ubuntu
Jul 11 10:14:02 sanblas kernel: Hardware name: Dell Inc. OptiPlex
9020/00V62H, BIOS A10 01/08/2015
Jul 11 10:14:02 sanblas kernel: 0000000000000286 000000000d66bd99
ffff8800d983fc88 ffffffff813eab23
Jul 11 10:14:02 sanblas kernel: ffff8800d983fd68 ffff8801ef8a0dc0
ffff8800d983fcf8 ffffffff8120906e
Jul 11 10:14:02 sanblas kernel: ffff8800d983fd10 ffff8800d983fcc8
ffffffff81190b3b ffff8801ef8a6e00
Jul 11 10:14:02 sanblas kernel: Call Trace:
Jul 11 10:14:02 sanblas kernel: [<ffffffff813eab23>] dump_stack+0x63/0x90
Jul 11 10:14:02 sanblas kernel: [<ffffffff8120906e>] dump_header+0x5a/0x1c5
Jul 11 10:14:02 sanblas kernel: [<ffffffff81190b3b>] ?
find_lock_task_mm+0x3b/0x80
Jul 11 10:14:02 sanblas kernel: [<ffffffff81191102>]
oom_kill_process+0x202/0x3c0
Jul 11 10:14:02 sanblas kernel: [<ffffffff811fce94>] ?
mem_cgroup_iter+0x204/0x390
Jul 11 10:14:02 sanblas kernel: [<ffffffff811feef3>]
mem_cgroup_out_of_memory+0x2b3/0x300
Jul 11 10:14:02 sanblas kernel: [<ffffffff811ffcc8>]
mem_cgroup_oom_synchronize+0x338/0x350
Jul 11 10:14:02 sanblas kernel: [<ffffffff811fb1f0>] ?
kzalloc_node.constprop.48+0x20/0x20
Jul 11 10:14:02 sanblas kernel: [<ffffffff811917b4>]
pagefault_out_of_memory+0x44/0xc0
Jul 11 10:14:02 sanblas kernel: [<ffffffff8106b2c2>] mm_fault_error+0x82/0x160
Jul 11 10:14:02 sanblas kernel: [<ffffffff8106b778>]
__do_page_fault+0x3d8/0x400
Jul 11 10:14:02 sanblas kernel: [<ffffffff8106b7c2>] do_page_fault+0x22/0x30
Jul 11 10:14:02 sanblas kernel: [<ffffffff81827d78>] page_fault+0x28/0x30
Jul 11 10:14:02 sanblas kernel: Task in /1 killed as a result of limit of /1
Jul 11 10:14:02 sanblas kernel: memory: usage 1048576kB, limit
1048576kB, failcnt 2218
Jul 11 10:14:02 sanblas kernel: memory+swap: usage 0kB, limit
9007199254740988kB, failcnt 0
Jul 11 10:14:02 sanblas kernel: kmem: usage 0kB, limit
9007199254740988kB, failcnt 0
Jul 11 10:14:02 sanblas kernel: Memory cgroup stats for /1: cache:0KB
rss:1048576KB rss_huge:0KB mapped_file:0KB dirty:0KB writeback:0KB
inactive_anon:525320KB active_anon:523128KB inactive_file:0KB
active_file:0KB unevictable:0KB
Jul 11 10:14:02 sanblas kernel: [ pid ] uid tgid total_vm rss
nr_ptes nr_pmds swapents oom_score_adj name
Jul 11 10:14:02 sanblas kernel: [ 7544] 0 7544 36878 34120
75 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7548] 0 7548 36119 33327
76 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7550] 0 7550 24767 22033
54 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7552] 0 7552 27704 24949
60 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7556] 0 7556 32225 29403
68 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7558] 0 7558 26912 24138
56 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7560] 0 7560 23579 20778
52 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7562] 0 7562 26054 23312
54 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7564] 0 7564 21203 18399
45 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7566] 0 7566 14537 11727
34 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7568] 0 7568 13250 10499
31 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7570] 0 7570 1127 25
8 3 0 0 call-mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7572] 0 7572 7343 4556
21 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7574] 0 7574 7970 5204
21 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7576] 0 7576 5627 2853
17 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7578] 0 7578 6023 3240
17 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7580] 0 7580 4109 1348
13 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7581] 0 7581 1127 174
8 3 0 0 call-mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7582] 0 7582 1127 24
6 3 0 0 call-mem-hog
Jul 11 10:14:02 sanblas kernel: Memory cgroup out of memory: Kill
process 7544 (mem-hog) score 126 or sacrifice child
Jul 11 10:14:02 sanblas kernel: Killed process 7544 (mem-hog)
total-vm:147512kB, anon-rss:134392kB, file-rss:2088kB
Jul 11 10:14:02 sanblas kernel: mem-hog invoked oom-killer:
gfp_mask=0x24000c0, order=0, oom_score_adj=0
Jul 11 10:14:02 sanblas kernel: mem-hog cpuset=/ mems_allowed=0
Jul 11 10:14:02 sanblas kernel: CPU: 7 PID: 7576 Comm: mem-hog Not
tainted 4.4.0-24-generic #43-Ubuntu
Jul 11 10:14:02 sanblas kernel: Hardware name: Dell Inc. OptiPlex
9020/00V62H, BIOS A10 01/08/2015
Jul 11 10:14:02 sanblas kernel: 0000000000000286 000000000d66bd99
ffff8800d983fc88 ffffffff813eab23
Jul 11 10:14:02 sanblas kernel: ffff8800d983fd68 ffff8801ef8a0000
ffff8800d983fcf8 ffffffff8120906e
Jul 11 10:14:02 sanblas kernel: ffff8800d983fd10 ffff8800d983fcc8
ffffffff81190b3b ffff88021480b700
Jul 11 10:14:02 sanblas kernel: Call Trace:
Jul 11 10:14:02 sanblas kernel: [<ffffffff813eab23>] dump_stack+0x63/0x90
Jul 11 10:14:02 sanblas kernel: [<ffffffff8120906e>] dump_header+0x5a/0x1c5
Jul 11 10:14:02 sanblas kernel: [<ffffffff81190b3b>] ?
find_lock_task_mm+0x3b/0x80
Jul 11 10:14:02 sanblas kernel: [<ffffffff81191102>]
oom_kill_process+0x202/0x3c0
Jul 11 10:14:02 sanblas kernel: [<ffffffff811fce94>] ?
mem_cgroup_iter+0x204/0x390
Jul 11 10:14:02 sanblas kernel: [<ffffffff811feef3>]
mem_cgroup_out_of_memory+0x2b3/0x300
Jul 11 10:14:02 sanblas kernel: [<ffffffff811ffcc8>]
mem_cgroup_oom_synchronize+0x338/0x350
Jul 11 10:14:02 sanblas kernel: [<ffffffff811fb1f0>] ?
kzalloc_node.constprop.48+0x20/0x20
Jul 11 10:14:02 sanblas kernel: [<ffffffff811917b4>]
pagefault_out_of_memory+0x44/0xc0
Jul 11 10:14:02 sanblas kernel: [<ffffffff8106b2c2>] mm_fault_error+0x82/0x160
Jul 11 10:14:02 sanblas kernel: [<ffffffff8106b778>]
__do_page_fault+0x3d8/0x400
Jul 11 10:14:02 sanblas kernel: [<ffffffff8106b7c2>] do_page_fault+0x22/0x30
Jul 11 10:14:02 sanblas kernel: [<ffffffff81827d78>] page_fault+0x28/0x30
Jul 11 10:14:02 sanblas kernel: Task in /1 killed as a result of limit of /1
Jul 11 10:14:02 sanblas kernel: memory: usage 1048576kB, limit
1048576kB, failcnt 2505
Jul 11 10:14:02 sanblas kernel: memory+swap: usage 0kB, limit
9007199254740988kB, failcnt 0
Jul 11 10:14:02 sanblas kernel: kmem: usage 0kB, limit
9007199254740988kB, failcnt 0
Jul 11 10:14:02 sanblas kernel: Memory cgroup stats for /1: cache:0KB
rss:1048576KB rss_huge:0KB mapped_file:0KB dirty:0KB writeback:0KB
inactive_anon:525084KB active_anon:523364KB inactive_file:0KB
active_file:0KB unevictable:0KB
Jul 11 10:14:02 sanblas kernel: [ pid ] uid tgid total_vm rss
nr_ptes nr_pmds swapents oom_score_adj name
Jul 11 10:14:02 sanblas kernel: [ 7548] 0 7548 38297 35503
81 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7550] 0 7550 25592 22823
55 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7552] 0 7552 30146 27389
64 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7556] 0 7556 32885 30062
69 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7558] 0 7558 29717 26909
61 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7560] 0 7560 25658 22887
56 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7562] 0 7562 28793 26016
60 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7564] 0 7564 22886 20114
49 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7566] 0 7566 17705 14893
40 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7568] 0 7568 15461 12675
36 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7570] 0 7570 1127 25
8 3 0 0 call-mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7572] 0 7572 9455 6666
25 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7574] 0 7574 8135 5334
21 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7576] 0 7576 7277 4501
20 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7578] 0 7578 7673 4889
20 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7580] 0 7580 8366 5637
21 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7582] 0 7582 1127 24
8 3 0 0 call-mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7584] 0 7584 6221 3431
17 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7586] 0 7586 3317 412
11 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7587] 0 7587 1127 200
8 3 0 0 call-mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7588] 0 7588 1127 25
5 3 0 0 call-mem-hog
Jul 11 10:14:02 sanblas kernel: Memory cgroup out of memory: Kill
process 7548 (mem-hog) score 131 or sacrifice child
Jul 11 10:14:02 sanblas kernel: Killed process 7548 (mem-hog)
total-vm:153188kB, anon-rss:139964kB, file-rss:2048kB
Jul 11 10:14:02 sanblas kernel: mem-hog invoked oom-killer:
gfp_mask=0x24000c0, order=0, oom_score_adj=0
Jul 11 10:14:02 sanblas kernel: mem-hog cpuset=/ mems_allowed=0
Jul 11 10:14:02 sanblas kernel: CPU: 7 PID: 7580 Comm: mem-hog Not
tainted 4.4.0-24-generic #43-Ubuntu
Jul 11 10:14:02 sanblas kernel: Hardware name: Dell Inc. OptiPlex
9020/00V62H, BIOS A10 01/08/2015
Jul 11 10:14:02 sanblas kernel: 0000000000000286 00000000f5ff64c5
ffff8800d326bc88 ffffffff813eab23
Jul 11 10:14:02 sanblas kernel: ffff8800d326bd68 ffff8801f8801b80
ffff8800d326bcf8 ffffffff8120906e
Jul 11 10:14:02 sanblas kernel: ffff8800d326bd10 ffff8800d326bcc8
ffffffff81190b3b ffff8801f8970000
Jul 11 10:14:02 sanblas kernel: Call Trace:
Jul 11 10:14:02 sanblas kernel: [<ffffffff813eab23>] dump_stack+0x63/0x90
Jul 11 10:14:02 sanblas kernel: [<ffffffff8120906e>] dump_header+0x5a/0x1c5
Jul 11 10:14:02 sanblas kernel: [<ffffffff81190b3b>] ?
find_lock_task_mm+0x3b/0x80
Jul 11 10:14:02 sanblas kernel: [<ffffffff81191102>]
oom_kill_process+0x202/0x3c0
Jul 11 10:14:02 sanblas kernel: [<ffffffff811fce94>] ?
mem_cgroup_iter+0x204/0x390
Jul 11 10:14:02 sanblas kernel: [<ffffffff811feef3>]
mem_cgroup_out_of_memory+0x2b3/0x300
Jul 11 10:14:02 sanblas kernel: [<ffffffff811ffcc8>]
mem_cgroup_oom_synchronize+0x338/0x350
Jul 11 10:14:02 sanblas kernel: [<ffffffff811fb1f0>] ?
kzalloc_node.constprop.48+0x20/0x20
Jul 11 10:14:02 sanblas kernel: [<ffffffff811917b4>]
pagefault_out_of_memory+0x44/0xc0
Jul 11 10:14:02 sanblas kernel: [<ffffffff8106b2c2>] mm_fault_error+0x82/0x160
Jul 11 10:14:02 sanblas kernel: [<ffffffff8106b778>]
__do_page_fault+0x3d8/0x400
Jul 11 10:14:02 sanblas kernel: [<ffffffff8106b7c2>] do_page_fault+0x22/0x30
Jul 11 10:14:02 sanblas kernel: [<ffffffff81827d78>] page_fault+0x28/0x30
Jul 11 10:14:02 sanblas kernel: Task in /1 killed as a result of limit of /1
Jul 11 10:14:02 sanblas kernel: memory: usage 1048576kB, limit
1048576kB, failcnt 3648
Jul 11 10:14:02 sanblas kernel: memory+swap: usage 0kB, limit
9007199254740988kB, failcnt 0
Jul 11 10:14:02 sanblas kernel: kmem: usage 0kB, limit
9007199254740988kB, failcnt 0
Jul 11 10:14:02 sanblas kernel: Memory cgroup stats for /1: cache:0KB
rss:1048576KB rss_huge:0KB mapped_file:0KB dirty:0KB writeback:0KB
inactive_anon:525232KB active_anon:523216KB inactive_file:0KB
active_file:0KB unevictable:0KB
Jul 11 10:14:02 sanblas kernel: [ pid ] uid tgid total_vm rss
nr_ptes nr_pmds swapents oom_score_adj name
Jul 11 10:14:02 sanblas kernel: [ 7550] 0 7550 26813 24070
58 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7552] 0 7552 34535 31739
73 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7556] 0 7556 34106 31312
71 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7558] 0 7558 31103 28287
64 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7560] 0 7560 28694 25912
62 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7562] 0 7562 31367 28579
65 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7564] 0 7564 23744 20967
50 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7566] 0 7566 19949 17130
44 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7568] 0 7568 18167 15375
41 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7570] 0 7570 1127 25
8 3 0 0 call-mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7572] 0 7572 13547 10753
33 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7574] 0 7574 8960 6185
23 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7576] 0 7576 9158 6409
24 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7578] 0 7578 9719 6927
24 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7580] 0 7580 10412 7679
25 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7582] 0 7582 1127 24
8 3 0 0 call-mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7584] 0 7584 8630 5863
22 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7586] 0 7586 4868 2031
14 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7588] 0 7588 3812 1017
13 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7590] 0 7590 1127 25
6 3 0 0 call-mem-hog
Jul 11 10:14:02 sanblas kernel: Memory cgroup out of memory: Kill
process 7552 (mem-hog) score 117 or sacrifice child
Jul 11 10:14:02 sanblas kernel: Killed process 7552 (mem-hog)
total-vm:138140kB, anon-rss:124884kB, file-rss:2072kB
Jul 11 10:14:02 sanblas kernel: mem-hog invoked oom-killer:
gfp_mask=0x24000c0, order=0, oom_score_adj=0
Jul 11 10:14:02 sanblas kernel: mem-hog cpuset=/ mems_allowed=0
Jul 11 10:14:02 sanblas kernel: CPU: 2 PID: 7564 Comm: mem-hog Not
tainted 4.4.0-24-generic #43-Ubuntu
Jul 11 10:14:02 sanblas kernel: Hardware name: Dell Inc. OptiPlex
9020/00V62H, BIOS A10 01/08/2015
Jul 11 10:14:02 sanblas kernel: 0000000000000286 00000000233744d1
ffff880213fa3c88 ffffffff813eab23
Jul 11 10:14:02 sanblas kernel: ffff880213fa3d68 ffff8802134e3700
ffff880213fa3cf8 ffffffff8120906e
Jul 11 10:14:02 sanblas kernel: ffff880213fa3d10 ffff880213fa3cc8
ffffffff81190b3b ffff8802134e5280
Jul 11 10:14:02 sanblas kernel: Call Trace:
Jul 11 10:14:02 sanblas kernel: [<ffffffff813eab23>] dump_stack+0x63/0x90
Jul 11 10:14:02 sanblas kernel: [<ffffffff8120906e>] dump_header+0x5a/0x1c5
Jul 11 10:14:02 sanblas kernel: [<ffffffff81190b3b>] ?
find_lock_task_mm+0x3b/0x80
Jul 11 10:14:02 sanblas kernel: [<ffffffff81191102>]
oom_kill_process+0x202/0x3c0
Jul 11 10:14:02 sanblas kernel: [<ffffffff811fce94>] ?
mem_cgroup_iter+0x204/0x390
Jul 11 10:14:02 sanblas kernel: [<ffffffff811feef3>]
mem_cgroup_out_of_memory+0x2b3/0x300
Jul 11 10:14:02 sanblas kernel: [<ffffffff811ffcc8>]
mem_cgroup_oom_synchronize+0x338/0x350
Jul 11 10:14:02 sanblas kernel: [<ffffffff811fb1f0>] ?
kzalloc_node.constprop.48+0x20/0x20
Jul 11 10:14:02 sanblas kernel: [<ffffffff811917b4>]
pagefault_out_of_memory+0x44/0xc0
Jul 11 10:14:02 sanblas kernel: [<ffffffff8106b2c2>] mm_fault_error+0x82/0x160
Jul 11 10:14:02 sanblas kernel: [<ffffffff8106b778>]
__do_page_fault+0x3d8/0x400
Jul 11 10:14:02 sanblas kernel: [<ffffffff8106b7c2>] do_page_fault+0x22/0x30
Jul 11 10:14:02 sanblas kernel: [<ffffffff81827d78>] page_fault+0x28/0x30
Jul 11 10:14:02 sanblas kernel: Task in /1 killed as a result of limit of /1
Jul 11 10:14:02 sanblas kernel: memory: usage 1048576kB, limit
1048576kB, failcnt 3884
Jul 11 10:14:02 sanblas kernel: memory+swap: usage 0kB, limit
9007199254740988kB, failcnt 0
Jul 11 10:14:02 sanblas kernel: kmem: usage 0kB, limit
9007199254740988kB, failcnt 0
Jul 11 10:14:02 sanblas kernel: Memory cgroup stats for /1: cache:0KB
rss:1048576KB rss_huge:0KB mapped_file:0KB dirty:0KB writeback:0KB
inactive_anon:525132KB active_anon:523188KB inactive_file:0KB
active_file:0KB unevictable:0KB
Jul 11 10:14:02 sanblas kernel: [ pid ] uid tgid total_vm rss
nr_ptes nr_pmds swapents oom_score_adj name
Jul 11 10:14:02 sanblas kernel: [ 7550] 0 7550 28991 26246
62 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7556] 0 7556 36878 34082
77 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7558] 0 7558 32456 29673
67 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7560] 0 7560 31334 28550
68 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7562] 0 7562 33512 30755
69 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7564] 0 7564 28199 25388
59 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7566] 0 7566 20576 17789
46 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7568] 0 7568 20939 18146
46 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7570] 0 7570 1127 25
8 3 0 0 call-mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7572] 0 7572 16253 13457
38 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7574] 0 7574 10247 7437
25 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7576] 0 7576 11171 8388
28 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7578] 0 7578 11072 8312
27 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7580] 0 7580 11798 9063
28 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7582] 0 7582 1127 24
8 3 0 0 call-mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7584] 0 7584 9191 6390
23 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7586] 0 7586 6221 3416
17 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7588] 0 7588 4373 1610
14 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7590] 0 7590 4307 1513
15 3 0 0 mem-hog
Jul 11 10:14:02 sanblas kernel: [ 7592] 0 7592 1127 24
6 3 0 0 call-mem-hog
Jul 11 10:14:02 sanblas kernel: Memory cgroup out of memory: Kill
process 7556 (mem-hog) score 126 or sacrifice child
Jul 11 10:14:02 sanblas kernel: Killed process 7556 (mem-hog)
total-vm:147512kB, anon-rss:134396kB, file-rss:1932kB
Jul 11 10:14:02 sanblas kernel: Memory cgroup out of memory: Kill
process 7560 (mem-hog) score 121 or sacrifice child
Jul 11 10:14:02 sanblas kernel: Killed process 7560 (mem-hog)
total-vm:141968kB, anon-rss:128812kB, file-rss:2016kB
Jul 11 10:14:02 sanblas kernel: Memory cgroup out of memory: Kill
process 7562 (mem-hog) score 130 or sacrifice child
Jul 11 10:14:02 sanblas kernel: Killed process 7562 (mem-hog)
total-vm:151736kB, anon-rss:138572kB, file-rss:2124kB
Jul 11 10:14:02 sanblas kernel: Memory cgroup out of memory: Kill
process 7558 (mem-hog) score 133 or sacrifice child
Jul 11 10:14:02 sanblas kernel: Killed process 7558 (mem-hog)
total-vm:155168kB, anon-rss:142012kB, file-rss:2000kB
Jul 11 10:14:02 sanblas kernel: Memory cgroup out of memory: Kill
process 7550 (mem-hog) score 125 or sacrifice child
Jul 11 10:14:02 sanblas kernel: Killed process 7550 (mem-hog)
total-vm:146720kB, anon-rss:133544kB, file-rss:2048kB
Jul 11 10:14:02 sanblas kernel: Memory cgroup out of memory: Kill
process 7564 (mem-hog) score 123 or sacrifice child
Jul 11 10:14:02 sanblas kernel: Killed process 7564 (mem-hog)
total-vm:144740kB, anon-rss:131736kB, file-rss:1992kB
Jul 11 10:14:02 sanblas kernel: Memory cgroup out of memory: Kill
process 7566 (mem-hog) score 107 or sacrifice child
Jul 11 10:14:02 sanblas kernel: Killed process 7566 (mem-hog)
total-vm:126788kB, anon-rss:113500kB, file-rss:1972kB
Jul 11 10:14:02 sanblas kernel: Memory cgroup out of memory: Kill
process 7568 (mem-hog) score 110 or sacrifice child
Jul 11 10:14:02 sanblas kernel: Killed process 7568 (mem-hog)
total-vm:129824kB, anon-rss:116660kB, file-rss:2088kB
Jul 11 10:14:02 sanblas kernel: Memory cgroup out of memory: Kill
process 7574 (mem-hog) score 109 or sacrifice child
Jul 11 10:14:02 sanblas kernel: Killed process 7574 (mem-hog)
total-vm:128900kB, anon-rss:115876kB, file-rss:2004kB
Jul 11 10:14:02 sanblas kernel: Memory cgroup out of memory: Kill
process 7578 (mem-hog) score 111 or sacrifice child
Jul 11 10:14:02 sanblas kernel: Killed process 7578 (mem-hog)
total-vm:131144kB, anon-rss:118016kB, file-rss:2048kB
Jul 11 10:14:02 sanblas kernel: Memory cgroup out of memory: Kill
process 7572 (mem-hog) score 104 or sacrifice child
Jul 11 10:14:02 sanblas kernel: Killed process 7572 (mem-hog)
total-vm:124148kB, anon-rss:110848kB, file-rss:2048kB
Jul 11 10:14:02 sanblas kernel: Memory cgroup out of memory: Kill
process 7576 (mem-hog) score 110 or sacrifice child
Jul 11 10:14:02 sanblas kernel: Killed process 7576 (mem-hog)
total-vm:129956kB, anon-rss:116932kB, file-rss:2088kB
Jul 11 10:14:02 sanblas kernel: Memory cgroup out of memory: Kill
process 7584 (mem-hog) score 100 or sacrifice child
Jul 11 10:14:02 sanblas kernel: Killed process 7584 (mem-hog)
total-vm:120056kB, anon-rss:106876kB, file-rss:2016kB
Jul 11 10:14:02 sanblas kernel: Memory cgroup out of memory: Kill
process 7602 (mem-hog) score 111 or sacrifice child
Jul 11 10:14:02 sanblas kernel: Killed process 7602 (mem-hog)
total-vm:131672kB, anon-rss:118560kB, file-rss:1972kB
Jul 11 10:14:02 sanblas kernel: Memory cgroup out of memory: Kill
process 7580 (mem-hog) score 116 or sacrifice child
Jul 11 10:14:02 sanblas kernel: Killed process 7580 (mem-hog)
total-vm:136292kB, anon-rss:122976kB, file-rss:2124kB
Jul 11 10:14:02 sanblas kernel: Memory cgroup out of memory: Kill
process 7588 (mem-hog) score 119 or sacrifice child
Jul 11 10:14:02 sanblas kernel: Killed process 7588 (mem-hog)
total-vm:139988kB, anon-rss:126904kB, file-rss:2124kB
Jul 11 10:14:03 sanblas kernel: Memory cgroup out of memory: Kill
process 7600 (mem-hog) score 129 or sacrifice child
Jul 11 10:14:03 sanblas kernel: Killed process 7600 (mem-hog)
total-vm:150680kB, anon-rss:137516kB, file-rss:1960kB
Jul 11 10:14:03 sanblas kernel: Memory cgroup out of memory: Kill
process 7594 (mem-hog) score 119 or sacrifice child
Jul 11 10:14:03 sanblas kernel: Killed process 7594 (mem-hog)
total-vm:140384kB, anon-rss:127188kB, file-rss:1972kB
Jul 11 10:14:03 sanblas kernel: Memory cgroup out of memory: Kill
process 7592 (mem-hog) score 127 or sacrifice child
Jul 11 10:14:03 sanblas kernel: Killed process 7592 (mem-hog)
total-vm:148304kB, anon-rss:135112kB, file-rss:1932kB
Jul 11 10:14:03 sanblas kernel: Memory cgroup out of memory: Kill
process 7586 (mem-hog) score 139 or sacrifice child
Jul 11 10:14:03 sanblas kernel: Killed process 7586 (mem-hog)
total-vm:161900kB, anon-rss:148792kB, file-rss:1972kB
Jul 11 10:14:03 sanblas kernel: Memory cgroup out of memory: Kill
process 7590 (mem-hog) score 155 or sacrifice child
Jul 11 10:14:03 sanblas kernel: Killed process 7590 (mem-hog)
total-vm:179192kB, anon-rss:165940kB, file-rss:2004kB
Jul 11 10:14:03 sanblas kernel: Memory cgroup out of memory: Kill
process 7596 (mem-hog) score 143 or sacrifice child
Jul 11 10:14:03 sanblas kernel: Killed process 7596 (mem-hog)
total-vm:165860kB, anon-rss:152748kB, file-rss:1932kB
Jul 11 10:14:03 sanblas kernel: Memory cgroup out of memory: Kill
process 7606 (mem-hog) score 149 or sacrifice child
Jul 11 10:14:03 sanblas kernel: Killed process 7606 (mem-hog)
total-vm:172064kB, anon-rss:158812kB, file-rss:1972kB
Jul 11 10:14:03 sanblas kernel: Memory cgroup out of memory: Kill
process 7610 (mem-hog) score 145 or sacrifice child
Jul 11 10:14:03 sanblas kernel: Killed process 7610 (mem-hog)
total-vm:168104kB, anon-rss:154836kB, file-rss:1972kB
Jul 11 10:14:03 sanblas kernel: Memory cgroup out of memory: Kill
process 7622 (mem-hog) score 138 or sacrifice child
Jul 11 10:14:03 sanblas kernel: Killed process 7622 (mem-hog)
total-vm:160844kB, anon-rss:147708kB, file-rss:2104kB
Jul 11 10:14:03 sanblas kernel: Memory cgroup out of memory: Kill
process 7614 (mem-hog) score 148 or sacrifice child
Jul 11 10:14:03 sanblas kernel: Killed process 7614 (mem-hog)
total-vm:171272kB, anon-rss:158016kB, file-rss:2088kB
Jul 11 10:14:03 sanblas kernel: Memory cgroup out of memory: Kill
process 7624 (mem-hog) score 152 or sacrifice child
Jul 11 10:14:03 sanblas kernel: Killed process 7624 (mem-hog)
total-vm:175628kB, anon-rss:162536kB, file-rss:2088kB
Jul 11 10:14:03 sanblas kernel: Memory cgroup out of memory: Kill
process 7620 (mem-hog) score 140 or sacrifice child
Jul 11 10:14:03 sanblas kernel: Killed process 7620 (mem-hog)
total-vm:163088kB, anon-rss:150080kB, file-rss:1976kB
Jul 11 10:14:03 sanblas kernel: Memory cgroup out of memory: Kill process 7626 (mem-hog) score 139 or sacrifice child
Jul 11 10:14:03 sanblas kernel: Killed process 7626 (mem-hog) total-vm:161504kB, anon-rss:148268kB, file-rss:2088kB
Jul 11 10:14:03 sanblas kernel: Memory cgroup out of memory: Kill process 7636 (mem-hog) score 137 or sacrifice child
Jul 11 10:14:03 sanblas kernel: Killed process 7636 (mem-hog) total-vm:161240kB, anon-rss:148076kB, file-rss:1992kB
Jul 11 10:14:03 sanblas kernel: Memory cgroup out of memory: Kill process 7640 (mem-hog) score 153 or sacrifice child
Jul 11 10:14:03 sanblas kernel: Killed process 7640 (mem-hog) total-vm:177212kB, anon-rss:163888kB, file-rss:1972kB
Jul 11 10:14:03 sanblas kernel: Memory cgroup out of memory: Kill process 7642 (mem-hog) score 155 or sacrifice child
Jul 11 10:14:03 sanblas kernel: Killed process 7642 (mem-hog) total-vm:178928kB, anon-rss:165776kB, file-rss:2104kB
Jul 11 10:14:03 sanblas kernel: Memory cgroup out of memory: Kill process 7650 (mem-hog) score 160 or sacrifice child
Jul 11 10:14:03 sanblas kernel: Killed process 7650 (mem-hog) total-vm:184604kB, anon-rss:171320kB, file-rss:2124kB
Jul 11 10:14:03 sanblas kernel: Memory cgroup out of memory: Kill process 7644 (mem-hog) score 172 or sacrifice child
Jul 11 10:14:03 sanblas kernel: Killed process 7644 (mem-hog) total-vm:197540kB, anon-rss:184480kB, file-rss:1992kB
Jul 11 10:14:03 sanblas kernel: Memory cgroup out of memory: Kill process 7646 (mem-hog) score 193 or sacrifice child
Jul 11 10:14:03 sanblas kernel: Killed process 7646 (mem-hog) total-vm:219980kB, anon-rss:206948kB, file-rss:2004kB
Jul 11 10:14:03 sanblas kernel: Memory cgroup out of memory: Kill process 7648 (mem-hog) score 183 or sacrifice child
Jul 11 10:14:03 sanblas kernel: Killed process 7648 (mem-hog) total-vm:210344kB, anon-rss:197172kB, file-rss:2104kB
Jul 11 10:14:03 sanblas kernel: Memory cgroup out of memory: Kill process 7656 (mem-hog) score 208 or sacrifice child
Jul 11 10:14:03 sanblas kernel: Killed process 7656 (mem-hog) total-vm:235820kB, anon-rss:222804kB, file-rss:2016kB
Jul 11 10:14:03 sanblas kernel: Memory cgroup out of memory: Kill process 7660 (mem-hog) score 223 or sacrifice child
Jul 11 10:14:03 sanblas kernel: Killed process 7660 (mem-hog) total-vm:252188kB, anon-rss:238904kB, file-rss:2048kB
Jul 11 10:14:03 sanblas kernel: Memory cgroup out of memory: Kill process 7658 (mem-hog) score 230 or sacrifice child
Jul 11 10:14:03 sanblas kernel: Killed process 7658 (mem-hog) total-vm:259712kB, anon-rss:246532kB, file-rss:2048kB
Jul 11 10:14:03 sanblas kernel: Memory cgroup out of memory: Kill process 7672 (mem-hog) score 213 or sacrifice child
Jul 11 10:14:03 sanblas kernel: Killed process 7672 (mem-hog) total-vm:241892kB, anon-rss:228608kB, file-rss:1932kB
Jul 11 10:14:03 sanblas kernel: Memory cgroup out of memory: Kill process 7680 (mem-hog) score 185 or sacrifice child
Jul 11 10:14:03 sanblas kernel: Killed process 7680 (mem-hog) total-vm:211268kB, anon-rss:198004kB, file-rss:2048kB
Jul 11 10:14:03 sanblas kernel: Memory cgroup out of memory: Kill process 7682 (mem-hog) score 181 or sacrifice child
Jul 11 10:14:03 sanblas kernel: Killed process 7682 (mem-hog) total-vm:206912kB, anon-rss:193788kB, file-rss:1984kB
Jul 11 10:14:03 sanblas kernel: Memory cgroup out of memory: Kill process 7684 (mem-hog) score 197 or sacrifice child
Jul 11 10:14:03 sanblas kernel: Killed process 7684 (mem-hog) total-vm:224204kB, anon-rss:210924kB, file-rss:2104kB
Jul 11 10:14:03 sanblas kernel: Memory cgroup out of memory: Kill process 7694 (mem-hog) score 185 or sacrifice child
Jul 11 10:14:03 sanblas kernel: Killed process 7694 (mem-hog) total-vm:211796kB, anon-rss:198524kB, file-rss:1960kB
Jul 11 10:14:04 sanblas kernel: Memory cgroup out of memory: Kill process 7692 (mem-hog) score 186 or sacrifice child
Jul 11 10:14:04 sanblas kernel: Killed process 7692 (mem-hog) total-vm:212060kB, anon-rss:199020kB, file-rss:2016kB
Jul 11 10:14:04 sanblas kernel: Memory cgroup out of memory: Kill process 7704 (mem-hog) score 165 or sacrifice child
Jul 11 10:14:04 sanblas kernel: Killed process 7704 (mem-hog) total-vm:189884kB, anon-rss:176616kB, file-rss:1932kB
Jul 11 10:14:04 sanblas kernel: Memory cgroup out of memory: Kill process 7714 (mem-hog) score 162 or sacrifice child
Jul 11 10:14:04 sanblas kernel: Killed process 7714 (mem-hog) total-vm:186188kB, anon-rss:172916kB, file-rss:2060kB
Jul 11 10:14:04 sanblas kernel: Memory cgroup out of memory: Kill process 7706 (mem-hog) score 155 or sacrifice child
Jul 11 10:14:04 sanblas kernel: Killed process 7706 (mem-hog) total-vm:179456kB, anon-rss:166320kB, file-rss:1932kB
Jul 11 10:14:04 sanblas kernel: Memory cgroup out of memory: Kill process 7700 (mem-hog) score 184 or sacrifice child
Jul 11 10:14:04 sanblas kernel: Killed process 7700 (mem-hog) total-vm:209552kB, anon-rss:196392kB, file-rss:2072kB
Jul 11 10:14:04 sanblas kernel: Memory cgroup out of memory: Kill process 7716 (mem-hog) score 193 or sacrifice child
Jul 11 10:14:04 sanblas kernel: Killed process 7716 (mem-hog) total-vm:220376kB, anon-rss:207220kB, file-rss:2000kB
Jul 11 10:14:04 sanblas kernel: Memory cgroup out of memory: Kill process 7708 (mem-hog) score 240 or sacrifice child
Jul 11 10:14:04 sanblas kernel: Killed process 7708 (mem-hog) total-vm:270140kB, anon-rss:257088kB, file-rss:2000kB
Jul 11 10:14:04 sanblas kernel: Memory cgroup out of memory: Kill process 7712 (mem-hog) score 292 or sacrifice child
Jul 11 10:14:04 sanblas kernel: Killed process 7712 (mem-hog) total-vm:326636kB, anon-rss:313560kB, file-rss:2000kB
Jul 11 10:14:04 sanblas kernel: Memory cgroup out of memory: Kill process 7726 (mem-hog) score 327 or sacrifice child
Jul 11 10:14:04 sanblas kernel: Killed process 7726 (mem-hog) total-vm:364652kB, anon-rss:351328kB, file-rss:2088kB
Jul 11 10:14:04 sanblas kernel: Memory cgroup out of memory: Kill process 7722 (mem-hog) score 487 or sacrifice child
Jul 11 10:14:04 sanblas kernel: Killed process 7722 (mem-hog) total-vm:537044kB, anon-rss:523984kB, file-rss:2072kB
Jul 11 10:14:04 sanblas kernel: Memory cgroup out of memory: Kill process 7732 (mem-hog) score 973 or sacrifice child
Jul 11 10:14:04 sanblas kernel: Killed process 7732 (mem-hog) total-vm:1061216kB, anon-rss:1048040kB, file-rss:1976kB
Jul 11 10:15:26 sanblas systemd[1]: Starting Cleanup of Temporary Directories...
Jul 11 10:15:26 sanblas systemd-tmpfiles[7745]: [/usr/lib/tmpfiles.d/var.conf:14] Duplicate line for path "/var/log", ignoring.
Jul 11 10:15:26 sanblas systemd[1]: Started Cleanup of Temporary Directories.
Jul 11 10:17:01 sanblas CRON[7755]: pam_unix(cron:session): session opened for user root by (uid=0)
Jul 11 10:17:01 sanblas CRON[7756]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Jul 11 10:17:02 sanblas CRON[7755]: pam_unix(cron:session): session closed for user root
* Re: bug in memcg oom-killer results in a hung syscall in another process in the same cgroup
2016-07-11 17:40 ` Shayan Pooya
@ 2016-07-11 18:33 ` Shayan Pooya
2016-07-12 7:19 ` Michal Hocko
2016-07-12 7:17 ` Michal Hocko
1 sibling, 1 reply; 15+ messages in thread
From: Shayan Pooya @ 2016-07-11 18:33 UTC (permalink / raw)
To: Michal Hocko; +Cc: cgroups mailinglist, LKML, linux-mm
>> Could you post the stack trace of the hung oom victim? Also could you
>> post the full kernel log?
With strace, when running 500 concurrent mem-hog tasks on the same
kernel, 33 of them failed with:
strace: ../sysdeps/nptl/fork.c:136: __libc_fork: Assertion
`THREAD_GETMEM (self, tid) != ppid' failed.
Which is: https://sourceware.org/bugzilla/show_bug.cgi?id=15392
And discussed before at: https://lkml.org/lkml/2015/2/6/470 but that
patch was not accepted.
* Re: bug in memcg oom-killer results in a hung syscall in another process in the same cgroup
2016-07-11 17:40 ` Shayan Pooya
2016-07-11 18:33 ` Shayan Pooya
@ 2016-07-12 7:17 ` Michal Hocko
1 sibling, 0 replies; 15+ messages in thread
From: Michal Hocko @ 2016-07-12 7:17 UTC (permalink / raw)
To: Shayan Pooya; +Cc: cgroups mailinglist, LKML, linux-mm
On Mon 11-07-16 10:40:55, Shayan Pooya wrote:
> >
> > Could you post the stack trace of the hung oom victim? Also could you
> > post the full kernel log?
>
> Here is the stack of the process that survives (it is *not* the
> oom victim) in a run with 100 processes and *without* strace:
>
> # cat /proc/7688/stack
> [<ffffffff81100292>] futex_wait_queue_me+0xc2/0x120
> [<ffffffff811005a6>] futex_wait+0x116/0x280
> [<ffffffff81102d90>] do_futex+0x120/0x540
> [<ffffffff81103231>] SyS_futex+0x81/0x180
> [<ffffffff81825bf2>] entry_SYSCALL_64_fastpath+0x16/0x71
> [<ffffffffffffffff>] 0xffffffffffffffff
I am not sure I understand. Is this the hung task?
--
Michal Hocko
SUSE Labs
* Re: bug in memcg oom-killer results in a hung syscall in another process in the same cgroup
2016-07-11 18:33 ` Shayan Pooya
@ 2016-07-12 7:19 ` Michal Hocko
2016-07-12 15:35 ` Shayan Pooya
0 siblings, 1 reply; 15+ messages in thread
From: Michal Hocko @ 2016-07-12 7:19 UTC (permalink / raw)
To: Shayan Pooya; +Cc: cgroups mailinglist, LKML, linux-mm
On Mon 11-07-16 11:33:19, Shayan Pooya wrote:
> >> Could you post the stack trace of the hung oom victim? Also could you
> >> post the full kernel log?
>
> With strace, when running 500 concurrent mem-hog tasks on the same
> kernel, 33 of them failed with:
>
> strace: ../sysdeps/nptl/fork.c:136: __libc_fork: Assertion
> `THREAD_GETMEM (self, tid) != ppid' failed.
>
> Which is: https://sourceware.org/bugzilla/show_bug.cgi?id=15392
> And discussed before at: https://lkml.org/lkml/2015/2/6/470 but that
> patch was not accepted.
OK, so the problem is that the oom killed task doesn't report the futex
release properly? If yes then I fail to see how that is memcg specific.
Could you try to clarify what you consider a bug again, please? I am not
really sure I understand this report.
Thanks!
--
Michal Hocko
SUSE Labs
* Re: bug in memcg oom-killer results in a hung syscall in another process in the same cgroup
2016-07-12 7:19 ` Michal Hocko
@ 2016-07-12 15:35 ` Shayan Pooya
2016-07-12 15:52 ` Konstantin Khlebnikov
2016-07-13 8:08 ` Michal Hocko
0 siblings, 2 replies; 15+ messages in thread
From: Shayan Pooya @ 2016-07-12 15:35 UTC (permalink / raw)
To: Michal Hocko, Konstantin Khlebnikov, koct9i
Cc: cgroups mailinglist, LKML, linux-mm
>> With strace, when running 500 concurrent mem-hog tasks on the same
>> kernel, 33 of them failed with:
>>
>> strace: ../sysdeps/nptl/fork.c:136: __libc_fork: Assertion
>> `THREAD_GETMEM (self, tid) != ppid' failed.
>>
>> Which is: https://sourceware.org/bugzilla/show_bug.cgi?id=15392
>> And discussed before at: https://lkml.org/lkml/2015/2/6/470 but that
>> patch was not accepted.
>
> OK, so the problem is that the oom killed task doesn't report the futex
> release properly? If yes then I fail to see how that is memcg specific.
> Could you try to clarify what you consider a bug again, please? I am not
> really sure I understand this report.
It looks like it is just a very easy way to reproduce the problem that
Konstantin described in that lkml thread. That patch was not accepted
and I see no other fixes for that issue upstream. Here is a copy of
his root-cause analysis from said thread:
The whole sequence looks like this: the task calls fork(); glibc issues the
clone syscall with CLONE_CHILD_SETTID, passing a pointer to the TLS field
THREAD_SELF->tid as the argument. The child task gets a read-only copy of
the VM, including the TLS. The child calls put_user() to handle
CLONE_CHILD_SETTID from schedule_tail(). put_user() triggers a page fault,
which fails because do_wp_page() hits the memcg limit without invoking the
OOM-killer (it is a page fault from kernel space). put_user() returns
-EFAULT, which is ignored. The child returns to user space and trips the
assert (THREAD_GETMEM (self, tid) != ppid); glibc tries to print something
but deadlocks on its internal locks. Halt and catch fire.
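
To make the clone() side of this concrete, here is a minimal standalone
sketch of the path described above (an illustration only: it assumes the
x86-64 clone() argument order and is not the verbatim glibc code):

#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>          /* CLONE_CHILD_SETTID */
#include <signal.h>         /* SIGCHLD */
#include <sys/syscall.h>    /* SYS_clone */
#include <sys/wait.h>
#include <unistd.h>

static pid_t tid;           /* stands in for THREAD_SELF->tid in glibc's TLS */

int main(void)
{
	pid_t ppid = getpid();

	tid = ppid;         /* the child inherits this value via its COW copy */
	/* The kernel is asked to store the child's TID into &tid from
	 * schedule_tail() via put_user(); if that write is silently
	 * dropped under memcg pressure, tid keeps the parent's value. */
	long pid = syscall(SYS_clone,
			   CLONE_CHILD_SETTID | SIGCHLD,
			   NULL, NULL, &tid, 0);
	if (pid == 0) {
		/* glibc-style sanity check from __libc_fork */
		assert(tid != ppid);
		_exit(0);
	}
	wait(NULL);
	return 0;
}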
Regards
* Re: bug in memcg oom-killer results in a hung syscall in another process in the same cgroup
2016-07-12 15:35 ` Shayan Pooya
@ 2016-07-12 15:52 ` Konstantin Khlebnikov
2016-07-12 16:52 ` Oleg Nesterov
2016-07-12 22:57 ` Shayan Pooya
2016-07-13 8:08 ` Michal Hocko
1 sibling, 2 replies; 15+ messages in thread
From: Konstantin Khlebnikov @ 2016-07-12 15:52 UTC (permalink / raw)
To: Shayan Pooya, Michal Hocko, koct9i
Cc: cgroups mailinglist, LKML, linux-mm, Oleg Nesterov
On 12.07.2016 18:35, Shayan Pooya wrote:
>>> With strace, when running 500 concurrent mem-hog tasks on the same
>>> kernel, 33 of them failed with:
>>>
>>> strace: ../sysdeps/nptl/fork.c:136: __libc_fork: Assertion
>>> `THREAD_GETMEM (self, tid) != ppid' failed.
>>>
>>> Which is: https://sourceware.org/bugzilla/show_bug.cgi?id=15392
>>> And discussed before at: https://lkml.org/lkml/2015/2/6/470 but that
>>> patch was not accepted.
>>
>> OK, so the problem is that the oom killed task doesn't report the futex
>> release properly? If yes then I fail to see how that is memcg specific.
>> Could you try to clarify what you consider a bug again, please? I am not
>> really sure I understand this report.
>
> It looks like it is just a very easy way to reproduce the problem that
> Konstantin described in that lkml thread. That patch was not accepted
> and I see no other fixes for that issue upstream. Here is a copy of
> his root-cause analysis from said thread:
>
> The whole sequence looks like this: the task calls fork(); glibc issues the
> clone syscall with CLONE_CHILD_SETTID, passing a pointer to the TLS field
> THREAD_SELF->tid as the argument. The child task gets a read-only copy of
> the VM, including the TLS. The child calls put_user() to handle
> CLONE_CHILD_SETTID from schedule_tail(). put_user() triggers a page fault,
> which fails because do_wp_page() hits the memcg limit without invoking the
> OOM-killer (it is a page fault from kernel space). put_user() returns
> -EFAULT, which is ignored. The child returns to user space and trips the
> assert (THREAD_GETMEM (self, tid) != ppid); glibc tries to print something
> but deadlocks on its internal locks. Halt and catch fire.
>
>
Yep. Bug still not fixed in upstream. In our kernel I've plugged it with this:
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2808,8 +2808,9 @@ asmlinkage __visible void schedule_tail(struct task_struct *prev)
balance_callback(rq);
preempt_enable();
- if (current->set_child_tid)
- put_user(task_pid_vnr(current), current->set_child_tid);
+ if (current->set_child_tid &&
+ put_user(task_pid_vnr(current), current->set_child_tid))
+ force_sig(SIGSEGV, current);
}
Adding Oleg to CC. IIRC he had some ideas on how to fix this. =)
--
Konstantin
* Re: bug in memcg oom-killer results in a hung syscall in another process in the same cgroup
2016-07-12 15:52 ` Konstantin Khlebnikov
@ 2016-07-12 16:52 ` Oleg Nesterov
2016-07-12 22:57 ` Shayan Pooya
1 sibling, 0 replies; 15+ messages in thread
From: Oleg Nesterov @ 2016-07-12 16:52 UTC (permalink / raw)
To: Konstantin Khlebnikov
Cc: Shayan Pooya, Michal Hocko, koct9i, cgroups mailinglist, LKML, linux-mm
On 07/12, Konstantin Khlebnikov wrote:
>
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2808,8 +2808,9 @@ asmlinkage __visible void schedule_tail(struct task_struct *prev)
> balance_callback(rq);
> preempt_enable();
>
> - if (current->set_child_tid)
> - put_user(task_pid_vnr(current), current->set_child_tid);
> + if (current->set_child_tid &&
> + put_user(task_pid_vnr(current), current->set_child_tid))
> + force_sig(SIGSEGV, current);
> }
>
> Add Oleg into CC. IIRR he had some ideas how to fix this. =)
Heh. OK, OK, thank you Konstantin ;)
I'll try to recall tomorrow, but iirc I only have some ideas of how
we can happily blame the FAULT_FLAG_USER logic.
And, in this particular case, perhaps glibc/set_child_tid too, because
(again, iirc) it would be nice to simply kill it; it is only used for
some sanity checks...
Oleg.
* Re: bug in memcg oom-killer results in a hung syscall in another process in the same cgroup
2016-07-12 15:52 ` Konstantin Khlebnikov
2016-07-12 16:52 ` Oleg Nesterov
@ 2016-07-12 22:57 ` Shayan Pooya
2016-07-14 13:22 ` Oleg Nesterov
1 sibling, 1 reply; 15+ messages in thread
From: Shayan Pooya @ 2016-07-12 22:57 UTC (permalink / raw)
To: Konstantin Khlebnikov
Cc: Michal Hocko, koct9i, cgroups mailinglist, LKML, linux-mm, Oleg Nesterov
> Yep. Bug still not fixed in upstream. In our kernel I've plugged it with
> this:
>
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2808,8 +2808,9 @@ asmlinkage __visible void schedule_tail(struct
> task_struct *prev)
> balance_callback(rq);
> preempt_enable();
>
> - if (current->set_child_tid)
> - put_user(task_pid_vnr(current), current->set_child_tid);
> + if (current->set_child_tid &&
> + put_user(task_pid_vnr(current), current->set_child_tid))
> + force_sig(SIGSEGV, current);
> }
I just verified that with your patch there are no hung processes, and I
see processes getting SIGSEGV as expected.
Thanks!
* Re: bug in memcg oom-killer results in a hung syscall in another process in the same cgroup
2016-07-12 15:35 ` Shayan Pooya
2016-07-12 15:52 ` Konstantin Khlebnikov
@ 2016-07-13 8:08 ` Michal Hocko
1 sibling, 0 replies; 15+ messages in thread
From: Michal Hocko @ 2016-07-13 8:08 UTC (permalink / raw)
To: Shayan Pooya
Cc: Konstantin Khlebnikov, koct9i, cgroups mailinglist, LKML, linux-mm
On Tue 12-07-16 08:35:06, Shayan Pooya wrote:
> >> With strace, when running 500 concurrent mem-hog tasks on the same
> >> kernel, 33 of them failed with:
> >>
> >> strace: ../sysdeps/nptl/fork.c:136: __libc_fork: Assertion
> >> `THREAD_GETMEM (self, tid) != ppid' failed.
> >>
> >> Which is: https://sourceware.org/bugzilla/show_bug.cgi?id=15392
> >> And discussed before at: https://lkml.org/lkml/2015/2/6/470 but that
> >> patch was not accepted.
> >
> > OK, so the problem is that the oom killed task doesn't report the futex
> > release properly? If yes then I fail to see how that is memcg specific.
> > Could you try to clarify what you consider a bug again, please? I am not
> > really sure I understand this report.
>
> It looks like it is just a very easy way to reproduce the problem that
> Konstantin described in that lkml thread. That patch was not accepted
> and I see no other fixes for that issue upstream. Here is a copy of
> his root-cause analysis from said thread:
>
> The whole sequence looks like this: the task calls fork(); glibc issues the
> clone syscall with CLONE_CHILD_SETTID, passing a pointer to the TLS field
> THREAD_SELF->tid as the argument. The child task gets a read-only copy of
> the VM, including the TLS. The child calls put_user() to handle
> CLONE_CHILD_SETTID from schedule_tail(). put_user() triggers a page fault,
> which fails because do_wp_page() hits the memcg limit without invoking the
> OOM-killer (it is a page fault from kernel space). put_user() returns
> -EFAULT, which is ignored. The child returns to user space and trips the
> assert (THREAD_GETMEM (self, tid) != ppid); glibc tries to print something
> but deadlocks on its internal locks. Halt and catch fire.
OK, I see! Thanks for the clarification. So the bug is that the put_user()
return value is ignored. Let's see whether Konstantin's patch will be
accepted or Oleg comes up with something else.
--
Michal Hocko
SUSE Labs
* Re: bug in memcg oom-killer results in a hung syscall in another process in the same cgroup
2016-07-12 22:57 ` Shayan Pooya
@ 2016-07-14 13:22 ` Oleg Nesterov
2016-07-14 15:35 ` Shayan Pooya
0 siblings, 1 reply; 15+ messages in thread
From: Oleg Nesterov @ 2016-07-14 13:22 UTC (permalink / raw)
To: Shayan Pooya
Cc: Konstantin Khlebnikov, Michal Hocko, koct9i, cgroups mailinglist,
LKML, linux-mm
On 07/12, Shayan Pooya wrote:
>
> > Yep. Bug still not fixed in upstream. In our kernel I've plugged it with
> > this:
> >
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -2808,8 +2808,9 @@ asmlinkage __visible void schedule_tail(struct
> > task_struct *prev)
> > balance_callback(rq);
> > preempt_enable();
> >
> > - if (current->set_child_tid)
> > - put_user(task_pid_vnr(current), current->set_child_tid);
> > + if (current->set_child_tid &&
> > + put_user(task_pid_vnr(current), current->set_child_tid))
> > + force_sig(SIGSEGV, current);
> > }
>
> I just verified that with your patch there are no hung processes, and I
> see processes getting SIGSEGV as expected.
Well, but we can't do this. And "as expected" is actually just wrong. I still
think that the whole FAULT_FLAG_USER logic is not right. This needs another email.
fork() should not fail because there is a memory hog in the same memcg. Worse,
pthread_create() can kill the caller for the same reason. And we have the same
or even worse problem with ->clear_child_tid: pthread_join() can hang forever.
It is unlikely we want to kill the application in this case ;)
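
For reference, the exit side looks roughly like this (a condensed sketch
of the ->clear_child_tid handling in kernel/fork.c:mm_release(); the exact
code differs between kernel versions):

	if (tsk->clear_child_tid) {
		/* If this put_user() fails in the same silent way, the
		 * futex word is never zeroed and the FUTEX_WAIT that
		 * pthread_join() issues on it never wakes up. */
		put_user(0, tsk->clear_child_tid);
		sys_futex(tsk->clear_child_tid, FUTEX_WAKE,
			  1, NULL, NULL, 0);
		tsk->clear_child_tid = NULL;
	}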
And in fact I think that the problem has nothing to do with set/clear_child_tid
in particular.
I am just curious... can you reproduce the problem reliably? If yes, can you try
the patch below? Just in case: this is not the real fix in any case...
Oleg.
--- x/kernel/sched/core.c
+++ x/kernel/sched/core.c
@@ -2793,8 +2793,11 @@ asmlinkage __visible void schedule_tail(struct task_struct *prev)
balance_callback(rq);
preempt_enable();
- if (current->set_child_tid)
+ if (current->set_child_tid) {
+ mem_cgroup_oom_enable();
put_user(task_pid_vnr(current), current->set_child_tid);
+ mem_cgroup_oom_disable();
+ }
}
/*
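
For context, the two helpers used above look roughly like this (condensed
from include/linux/memcontrol.h around v4.4; the field name varies across
kernel versions):

static inline void mem_cgroup_oom_enable(void)
{
	/* Allow a page fault taken in this window to invoke the memcg
	 * OOM killer instead of silently failing with VM_FAULT_OOM. */
	WARN_ON(current->memcg_may_oom);
	current->memcg_may_oom = 1;
}

static inline void mem_cgroup_oom_disable(void)
{
	WARN_ON(!current->memcg_may_oom);
	current->memcg_may_oom = 0;
}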
* Re: bug in memcg oom-killer results in a hung syscall in another process in the same cgroup
2016-07-14 13:22 ` Oleg Nesterov
@ 2016-07-14 15:35 ` Shayan Pooya
2016-07-15 16:58 ` Shayan Pooya
0 siblings, 1 reply; 15+ messages in thread
From: Shayan Pooya @ 2016-07-14 15:35 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Konstantin Khlebnikov, Michal Hocko, Konstantin Khlebnikov,
cgroups mailinglist, LKML, linux-mm
> Well, but we can't do this. And "as expected" is actually just wrong. I still
> think that the whole FAULT_FLAG_USER logic is not right. This needs another email.
I meant it was expected given the content of the patch :) I think
Konstantin agrees that this patch cannot be merged upstream.
> fork() should not fail because there is a memory hog in the same memcg. Worse,
> pthread_create() can kill the caller for the same reason. And we have the same
> or even worse problem with ->clear_child_tid: pthread_join() can hang forever.
> It is unlikely we want to kill the application in this case ;)
>
> And in fact I think that the problem has nothing to do with set/clear_child_tid
> in particular.
>
> I am just curious... can you reproduce the problem reliably? If yes, can you try
> the patch below? Just in case: this is not the real fix in any case...
Yes. It deterministically results in hung processes in the vanilla kernel.
I'll try this patch.
> --- x/kernel/sched/core.c
> +++ x/kernel/sched/core.c
> @@ -2793,8 +2793,11 @@ asmlinkage __visible void schedule_tail(struct task_struct *prev)
> balance_callback(rq);
> preempt_enable();
>
> - if (current->set_child_tid)
> + if (current->set_child_tid) {
> + mem_cgroup_oom_enable();
> put_user(task_pid_vnr(current), current->set_child_tid);
> + mem_cgroup_oom_disable();
> + }
> }
>
> /*
>
* Re: bug in memcg oom-killer results in a hung syscall in another process in the same cgroup
2016-07-14 15:35 ` Shayan Pooya
@ 2016-07-15 16:58 ` Shayan Pooya
2016-07-18 13:53 ` Oleg Nesterov
0 siblings, 1 reply; 15+ messages in thread
From: Shayan Pooya @ 2016-07-15 16:58 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Konstantin Khlebnikov, Michal Hocko, Konstantin Khlebnikov,
cgroups mailinglist, LKML, linux-mm
>> I am just curious... can you reproduce the problem reliably? If yes, can you try
>> the patch below? Just in case: this is not the real fix in any case...
>
> Yes. It deterministically results in hung processes in the vanilla kernel.
> I'll try this patch.
I'll have to correct this: I can reproduce this issue easily on
high-end servers and normal laptops, but for some reason it does not
happen very often in VMware guests (maybe related to lower
parallelism).
>> --- x/kernel/sched/core.c
>> +++ x/kernel/sched/core.c
>> @@ -2793,8 +2793,11 @@ asmlinkage __visible void schedule_tail(struct task_struct *prev)
>> balance_callback(rq);
>> preempt_enable();
>>
>> - if (current->set_child_tid)
>> + if (current->set_child_tid) {
>> + mem_cgroup_oom_enable();
>> put_user(task_pid_vnr(current), current->set_child_tid);
>> + mem_cgroup_oom_disable();
>> + }
>> }
>>
>> /*
I tried this patch and I still see the same stuck processes (assuming
that's what you were curious about).
* Re: bug in memcg oom-killer results in a hung syscall in another process in the same cgroup
2016-07-15 16:58 ` Shayan Pooya
@ 2016-07-18 13:53 ` Oleg Nesterov
0 siblings, 0 replies; 15+ messages in thread
From: Oleg Nesterov @ 2016-07-18 13:53 UTC (permalink / raw)
To: Shayan Pooya
Cc: Konstantin Khlebnikov, Michal Hocko, Konstantin Khlebnikov,
cgroups mailinglist, LKML, linux-mm
On 07/15, Shayan Pooya wrote:
>
> >> --- x/kernel/sched/core.c
> >> +++ x/kernel/sched/core.c
> >> @@ -2793,8 +2793,11 @@ asmlinkage __visible void schedule_tail(struct task_struct *prev)
> >> balance_callback(rq);
> >> preempt_enable();
> >>
> >> - if (current->set_child_tid)
> >> + if (current->set_child_tid) {
> >> + mem_cgroup_oom_enable();
> >> put_user(task_pid_vnr(current), current->set_child_tid);
> >> + mem_cgroup_oom_disable();
> >> + }
> >> }
> >>
> >> /*
>
> I tried this patch and I still see the same stuck processes (assuming
> that's what you were curious about).
Of course. Because I am stupid. Firstly, I forgot to include another
change in fault.c. And now I see that change was wrong anyway.
I'll try to make another debugging patch later today, but let me repeat
that it won't fix the real problem anyway.
Thanks, and sorry for wasting your time.
Oleg.