From: David Rientjes <rientjes@google.com>
To: Yafang Shao <laoar.shao@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>,
Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
Andrew Morton <akpm@linux-foundation.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Linux MM <linux-mm@kvack.org>
Subject: Re: [PATCH v2] memcg, oom: check memcg margin for parallel oom
Date: Fri, 17 Jul 2020 12:26:07 -0700 (PDT)
Message-ID: <alpine.DEB.2.23.453.2007171212210.3398972@chino.kir.corp.google.com>
In-Reply-To: <CALOAHbA5J23Fo3AmdANbPa_dDbjXJzLGb3PaZF8emfNENfcaJA@mail.gmail.com>
On Fri, 17 Jul 2020, Yafang Shao wrote:
> > > Actually the kernel is doing it now, see below,
> > >
> > > dump_header() <<<< dump lots of information
> > > __oom_kill_process
> > > p = find_lock_task_mm(victim);
> > > if (!p)
> > > return; <<<< without killing any process.
> > >
> >
> > Ah, this is catching an instance where the chosen process has already done
> > exit_mm(), good catch -- I can find examples of this by scraping kernel
> > logs from our fleet.
> >
> > So it appears there is precedent for dumping all the oom info without
> > actually performing any action, and I made the earlier point that
> > diagnostic information in the kernel log is still useful here. I think
> > it is still preferable that the kernel at least tell us why it didn't
> > do anything, but as you mention that already happens today.
> >
> > Would you like to send a patch that checks for mem_cgroup_margin() here as
> > well? A second patch could make the possible inaction more visible,
> > something like "Process ${pid} (${comm}) is already exiting" for the above
> > check or "Memcg ${memcg} is no longer out of memory".
> >
> > Another thing that these messages indicate, beyond telling us why the oom
> > killer didn't actually SIGKILL anything, is that we can expect some skew
> > in the memory stats that shows an availability of memory.
> >
>
> Agreed, these messages would be helpful.
> I will send a patch for it.
>
Thanks Yafang. We should also continue talking, in a separate thread, about
the challenges you encounter with the oom killer, whether at the system
level or for memcg limit ooms. It's clear that you are running into several
of the issues that we have previously seen ourselves.
I could do a full audit of all our oom killer changes that may be
interesting to you, but off the top of my head:
- A means of triggering a memcg oom through the kernel: think of sysrq+f
but scoped to processes attached to a memcg hierarchy. This allows
userspace to reliably oom kill processes on overcommitted systems
(SIGKILL can be insufficient if we depend on oom reaping, for example,
to make forward progress)
- Storing the state of a memcg's memory at the time reclaim has failed
and we must oom kill: when the memcg oom killer is disabled so that
userspace can handle it, if it triggers an oom kill through the kernel
because it prefers an oom kill on an overcommitted system, we need to
dump the state of the memory at oom rather than with the stack of the
explicit trigger
- Supplement memcg oom notification with an additional notification event
on kernel oom kill: allows users to register for an event that triggers
when the kernel oom killer kills something (and keeps a count of these
events available for read)
- Add a notion of an oom delay: on overcommitted systems, userspace may
  become unreliable or unresponsive despite our best efforts. This
  supplements the ability to disable the oom killer for a memcg hierarchy
  with the ability to disable it only for a set period of time, after
  which the oom killer intervenes and kills something (last ditch effort).
I'd be happy to discuss any of these topics if you are interested.