From: Michal Hocko <mhocko@kernel.org>
To: Dave Jones <davej@codemonkey.org.uk>
Cc: linux-mm@kvack.org
Subject: Re: oom-reaper choosing wrong processes.
Date: Wed, 20 Jul 2016 15:36:27 +0200
Message-ID: <20160720133627.GN11249@dhcp22.suse.cz>
In-Reply-To: <20160720132304.GA11434@codemonkey.org.uk>

On Wed 20-07-16 09:23:04, Dave Jones wrote:
> On Wed, Jul 20, 2016 at 09:09:23AM +0200, Michal Hocko wrote:
>  > On Tue 19-07-16 11:33:35, Dave Jones wrote:
>  > > On Tue, Jul 19, 2016 at 11:08:58AM +0200, Michal Hocko wrote:
>  > > > On Mon 18-07-16 19:18:50, Dave Jones wrote:
>  > > > [...]
>  > > > > [ 4607.765352] sendmail-mta invoked oom-killer: gfp_mask=0x24201ca(GFP_HIGHUSER_MOVABLE|__GFP_COLD), order=0, oom_score_adj=0
>  > > > > [ 4607.765359] sendmail-mta cpuset=/ mems_allowed=0
>  > > > [...]
>  > > > > [ 4607.765619] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
>  > > > > [ 4607.765637] [  749]     0   749    13116      782      29       3      385             0 systemd-journal
>  > > > > [ 4607.765641] [  793]     0   793    10640       10      23       3      285         -1000 systemd-udevd
>  > > > > [ 4607.765647] [ 1647]     0  1647    11928       16      27       3      111             0 rpcbind
>  > > > > [ 4607.765651] [ 1653]     0  1653     5841        0      15       3       54             0 rpc.idmapd
>  > > > > [ 4607.765656] [ 1655]     0  1655    11052       24      26       3      114             0 systemd-logind
>  > > > > [ 4607.765687] [ 1657]     0  1657    64579      181      28       3      161             0 rsyslogd
>  > > > > [ 4607.765691] [ 1660]     0  1660     1058        0       8       3       38             0 acpid
>  > > > > [ 4607.765696] [ 1661]     0  1661     7414       22      18       3       52             0 cron
>  > > > > [ 4607.765700] [ 1662]     0  1662     6993        0      19       3       54             0 atd
>  > > > > [ 4607.765704] [ 1664]   105  1664    10744       40      26       3       79          -900 dbus-daemon
>  > > > > [ 4607.765708] [ 1671]     0  1671     6264       29      17       3      157             0 smartd
>  > > > > [ 4607.765712] [ 1738]     0  1738    16948        0      37       3      204         -1000 sshd
>  > > > > [ 4607.765716] [ 1742]     0  1742     9461        0      22       3      195             0 rpc.mountd
>  > > > > [ 4607.765721] [ 1776]     0  1776     3624        0      12       3       39             0 agetty
>  > > > > [ 4607.765725] [ 1797]     0  1797     3319        0      10       3       48             0 mcelog
>  > > > > [ 4607.765729] [ 1799]     0  1799     4824       21      15       3       39             0 irqbalance
>  > > > > [ 4607.765733] [ 1803]   108  1803    25492       42      24       3      124             0 ntpd
>  > > > > [ 4607.765737] [ 1842]     0  1842    19793       48      39       3      410             0 sendmail-mta
>  > > > > [ 4607.765746] [ 1878]     0  1878     5121        0      13       3      262             0 dhclient
>  > > > > [ 4607.765752] [ 2145]  1000  2145    15627        0      33       3      213             0 systemd
>  > > > > [ 4607.765756] [ 2148]  1000  2148    19584        4      40       3      438             0 (sd-pam)
>  > > > > [ 4607.765760] [ 2643]  1000  2643     7465      433      19       3      152             0 tmux
>  > > > > [ 4607.765764] [ 2644]  1000  2644     5864        0      16       3      508             0 bash
>  > > > > [ 4607.765768] [ 2678]  1000  2678     3328       89      11       3       19             0 test-multi.sh
>  > > > > [ 4607.765774] [ 2693]  1000  2693     5864        1      16       3      507             0 bash
>  > > > > [ 4607.765782] [ 6456]  1000  6456     3091       21      11       3       24             0 dmesg
>  > > > > [ 4607.765787] [18624]  1000 18624   750863    43368     520       6        0           500 trinity-c10
>  > > > > [ 4607.765792] [21525]  1000 21525   797320    20517     493       7        0           500 trinity-c15
>  > > > > [ 4607.765796] [22023]  1000 22023   797349     1985     319       7        0           500 trinity-c2
>  > > > > [ 4607.765814] [22658]  1000 22658   797382        1     458       7        0           500 trinity-c0
>  > > > > [ 4607.765818] [26334]  1000 26334   797217    34960     412       7        0           500 trinity-c4
>  > > > > [ 4607.765823] [26388]  1000 26388   797383     9401     118       7        0           500 trinity-c11
>  > > > > [ 4607.765826] oom_kill_process: would have killed process 749 (systemd-journal), but continuing instead...
>  > > > > [ 4608.147644] oom_reaper: reaped process 26334 (trinity-c4), now anon-rss:0kB, file-rss:0kB, shmem-rss:136724kB
>  > > > > [ 4608.148218] oom_reaper: reaped process 18624 (trinity-c10), now anon-rss:0kB, file-rss:0kB, shmem-rss:174356kB
>  > > > > [ 4608.149795] oom_reaper: reaped process 21525 (trinity-c15), now anon-rss:0kB, file-rss:0kB, shmem-rss:86288kB
>  > > > > [ 4608.150734] oom_reaper: reaped process 18624 (trinity-c10), now anon-rss:0kB, file-rss:0kB, shmem-rss:174348kB
>  > > > > [ 4608.152489] oom_reaper: reaped process 21525 (trinity-c15), now anon-rss:0kB, file-rss:0kB, shmem-rss:86288kB
>  > > > > [ 4608.156127] oom_reaper: reaped process 18624 (trinity-c10), now anon-rss:0kB, file-rss:0kB, shmem-rss:174336kB
>  > > > > [ 4608.158798] oom_reaper: reaped process 26334 (trinity-c4), now anon-rss:0kB, file-rss:0kB, shmem-rss:136652kB
>  > > > > [ 4608.161336] oom_reaper: reaped process 26334 (trinity-c4), now anon-rss:0kB, file-rss:0kB, shmem-rss:136652kB
>  > > > > [ 4608.163836] oom_reaper: reaped process 26334 (trinity-c4), now anon-rss:0kB, file-rss:0kB, shmem-rss:136652kB
>  > > > > 
>  > > > > 
>  > > > > Whoa. Why did it pick systemd-journal ?
>  > > > 
>  > > > Who has picked that? select_bad_process?
>  > 
>  > OK, I see. So select_bad_process selected systemd-journal, but your
>  > patch overrode that decision and skipped the oom invocation
>  > altogether. Seeing those trinity-* processes being oom reaped
>  > (some of them repeatedly) is not that surprising, and Tetsuo is right
>  > that this is due to out_of_memory->try_oom_reaper(). The current mmotm
>  > tree handles this part differently to prevent repeated oom reaping, but
>  > that is really minor, as multiple attempts to reap a task are not harmful.
>  > 
>  > The reason why systemd-journal has been selected is very similar.
>  > E.g.
>  > [ 4607.741744] oom_reaper: reaped process 21525 (trinity-c15), now anon-rss:0kB, file-rss:0kB, shmem-rss:82072kB
>  > 
>  > so this task has already been oom reaped, and oom_badness will ignore
>  > it (it simply doesn't make sense to select this task because it has
>  > already been killed or is exiting, and has been oom reaped as well).
>  > Others might be in a similar position, or they might have passed
>  > exit_mm->tsk->mm = NULL, so they are ignored by the oom killer as well.
> 
> I feel like I'm still missing something.  Why isn't "wait for the
> already reaped trinity tasks to exit" the right thing to do here (as
> my diff forced it to do), instead of "pick even more victims even
> though we've already got some reaped processes that haven't exited"?

OK, I was probably not clear enough. We are waiting for the oom reaper
to free up the address space, so the OOM killer will not select
another task while a victim is being reaped. The thing is that those
tasks have already been reaped; they just haven't finished exiting yet.
So they are no longer proper oom victim candidates. They are still
sitting on some shmem, but that is not really reclaimable.
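
To make that concrete, here is a minimal userspace sketch in C of the
selection logic described above. It is not the kernel code: struct task,
badness(), select_victim() and dump_excerpt are hypothetical names, the
score is a simplified sum of the rss, swapents, nr_ptes and nr_pmds
columns from the dump, and the scaling by oom_score_adj and any other
adjustments are left out. Marking all three trinity tasks as already
reaped is also an assumption for illustration; the quoted log only shows
21525 reaped before the dump.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one row of the OOM task dump. */
struct task {
    int pid;
    const char *name;
    long rss, nr_ptes, nr_pmds, swapents;
    int oom_score_adj;
    bool has_mm;  /* false once the task has dropped its mm on exit */
    bool reaped;  /* true once the oom reaper has processed the mm */
};

/* Simplified score: 0 means "never select this task". */
long badness(const struct task *t)
{
    /* Reaped, mm-less, or oom_score_adj == -1000 tasks are ignored. */
    if (!t->has_mm || t->reaped || t->oom_score_adj == -1000)
        return 0;
    return t->rss + t->swapents + t->nr_ptes + t->nr_pmds;
}

/* Return the task with the highest non-zero score, or NULL if none. */
const struct task *select_victim(const struct task *tasks, size_t n)
{
    const struct task *chosen = NULL;
    long best = 0;

    assert(tasks != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        long points = badness(&tasks[i]);
        if (points > best) {
            best = points;
            chosen = &tasks[i];
        }
    }
    return chosen;
}

/* A few rows copied from the dump; the reaped flags are assumed. */
const struct task dump_excerpt[] = {
    {   749, "systemd-journal",    782,  29, 3, 385,     0, true, false },
    {   793, "systemd-udevd",       10,  23, 3, 285, -1000, true, false },
    {  1842, "sendmail-mta",        48,  39, 3, 410,     0, true, false },
    {  2643, "tmux",               433,  19, 3, 152,     0, true, false },
    {  2644, "bash",                 0,  16, 3, 508,     0, true, false },
    { 18624, "trinity-c10",      43368, 520, 6,   0,   500, true, true  },
    { 21525, "trinity-c15",      20517, 493, 7,   0,   500, true, true  },
    { 26334, "trinity-c4",       34960, 412, 7,   0,   500, true, true  },
};
```

With the trinity rows skipped, select_victim() over these eight rows
returns pid 749: systemd-journal scores 782 + 385 + 29 + 3 = 1199, more
than any other remaining row. Clear the reaped flag on trinity-c10 and
its score in this model becomes 43894, which would win easily.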

Does that make more sense now?

-- 
Michal Hocko
SUSE Labs
