All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yafang Shao <laoar.shao@gmail.com>
To: David Rientjes <rientjes@google.com>
Cc: Michal Hocko <mhocko@kernel.org>,
	Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
	 Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	 Linux MM <linux-mm@kvack.org>
Subject: Re: [PATCH v2] memcg, oom: check memcg margin for parallel oom
Date: Thu, 16 Jul 2020 19:53:12 +0800	[thread overview]
Message-ID: <CALOAHbBKLCoaPMneDnk+SiRG5zKyDNoBXSJCXE3=efp4Bkpreg@mail.gmail.com> (raw)
In-Reply-To: <alpine.DEB.2.23.453.2007152356260.2921049@chino.kir.corp.google.com>

On Thu, Jul 16, 2020 at 3:04 PM David Rientjes <rientjes@google.com> wrote:
>
> On Thu, 16 Jul 2020, Yafang Shao wrote:
>
> > >  But yes, we store vital
> > > information about the memcg at the time of the first oom event when the
> > > oom killer is disabled (to allow userspace to determine what the best
> > > course of action is).
> > >
> >
> > It would be better if you could upstream the features in your kernel,
> > and I think it could also help the other users.
> >
>
> Everything we've discussed so far has been proposed in the past, actually.
> I think we stress the oom killer and use it at scale that others do not,
> so only a subset of users would find it interesting.  You are very likely
> one of those subset of users.
>
> We should certainly talk about other issues that we have run into that
> make the upstream oom killer unusable.  Are there other areas that you're
> currently focused on or having trouble with?  I'd be happy to have a
> discussion on how we have resolved a lot of its issues.
>
> > I understand what you mean "point of no return", but that seems a
> > workaround rather than a fix.
> > If you don't want to kill unnecessary processes, then checking the
> > memcg margin before sending sigkill is better, because as I said
> > before the race will be most likely happening in dump_header().
> > If you don't want to show strange OOM information like "your process
> > was oom killed and it shows usage is 60MB in a memcg limited
> > to 100MB", it is better to get the snapshot of the OOM when it is
> > triggered and then show it later, and I think it could also apply to
> > the global OOM.
> >
>
> It's likely a misunderstanding: I wasn't necessarily concerned about
> showing 60MB in a memcg limited to 100MB, that part we can deal with, the
> concern was after dumping all of that great information that instead of
> getting a "Killed process..." we get a "Oh, there's memory now, just
> kidding about everything we just dumped" ;)
>

Actually the kernel is doing it now, see bellow,

dump_header() <<<< dump lots of information
__oom_kill_process
    p = find_lock_task_mm(victim);
    if (!p)
       return;   <<<< without killing any process.


> We could likely enlighten userspace about that so that we don't consider
> that to be an actual oom kill.  But I also very much agree that after
> dump_header() would be appropriate as well since the goal is to prevent
> unnecessary oom killing.
>
> Would you mind sending a patch to check mem_cgroup_margin() on
> is_memcg_oom() prior to sending the SIGKILL to the victim and printing the
> "Killed process..." line?  We'd need a line that says "xKB of memory now
> available -- suppressing oom kill" or something along those lines so
> userspace understands what happened.  But the memory info that it emits
> both for the state of the memcg and system RAM may also be helpful to
> understand why we got to the oom kill in the first place, which is also a
> good thing.
>
> I'd happy ack that patch since it would be a comprehensive solution that
> avoids oom kill of user processes at all costs, which is a goal I think we
> can all rally behind.

I'd prefer to put dump_header() behind do_send_sig_info(), for example,

__oom_kill_process()
    do_send_sig_info()
    dump_header() <<<< may better put it behind wake_oom_reaper(), but
it may loses some information to dump...
    pr_err("%s: Killed process %d (%s)....")

Because the main goal of OOM is to kill a process to free pages ASAP
to avoid system stall or memcg stall.
We all find that  dump_header() may take a long time to finish
especially if there is a slow console, and this long time may cause a
great system stall, so we'd better defer the dump of it.

But that should be another topic.

-- 
Thanks
Yafang


  reply	other threads:[~2020-07-16 11:53 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-14 13:57 [PATCH v2] memcg, oom: check memcg margin for parallel oom Yafang Shao
2020-07-14 14:05 ` Michal Hocko
2020-07-14 14:30 ` Chris Down
2020-07-14 18:46 ` David Rientjes
2020-07-15  1:44   ` Yafang Shao
2020-07-15  2:44     ` David Rientjes
2020-07-15  3:10       ` Yafang Shao
2020-07-15  3:18         ` David Rientjes
2020-07-15  3:31           ` Yafang Shao
2020-07-15 17:30             ` David Rientjes
2020-07-16  2:38               ` Yafang Shao
2020-07-16  7:04                 ` David Rientjes
2020-07-16 11:53                   ` Yafang Shao [this message]
2020-07-16 12:21                     ` Michal Hocko
2020-07-16 13:09                       ` Tetsuo Handa
2020-07-16 19:53                     ` David Rientjes
2020-07-17  1:35                       ` Yafang Shao
2020-07-17 19:26                         ` David Rientjes
2020-07-18  2:15                           ` Yafang Shao
2020-07-16  5:54               ` Tetsuo Handa
2020-07-16  6:11                 ` Michal Hocko
2020-07-16  7:06                   ` David Rientjes
2020-07-16  6:08               ` Michal Hocko
2020-07-16  6:56                 ` David Rientjes
2020-07-16  7:12                   ` Michal Hocko
2020-07-16 20:04                     ` David Rientjes
2020-07-28 18:04                   ` Johannes Weiner
2020-07-15  6:56         ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CALOAHbBKLCoaPMneDnk+SiRG5zKyDNoBXSJCXE3=efp4Bkpreg@mail.gmail.com' \
    --to=laoar.shao@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=penguin-kernel@i-love.sakura.ne.jp \
    --cc=rientjes@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.