From: Johannes Weiner <hannes@cmpxchg.org>
To: Shakeel Butt <shakeelb@google.com>
Cc: Michal Hocko <mhocko@kernel.org>, Roman Gushchin <guro@fb.com>,
	Greg Thelen <gthelen@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linux MM <linux-mm@kvack.org>, Cgroups <cgroups@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] memcg: oom: ignore oom warnings from memory.max
Date: Tue, 5 May 2020 12:40:31 -0400	[thread overview]
Message-ID: <20200505164031.GC58018@cmpxchg.org> (raw)
In-Reply-To: <CALvZod48mu1w=fjpD=GXqCMdNq_8rExQ177_EP+Lx+TvR6fw+w@mail.gmail.com>

On Tue, May 05, 2020 at 08:35:45AM -0700, Shakeel Butt wrote:
> On Tue, May 5, 2020 at 8:27 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
> >
> > On Mon, May 04, 2020 at 12:23:51PM -0700, Shakeel Butt wrote:
> > > On Mon, May 4, 2020 at 9:06 AM Michal Hocko <mhocko@kernel.org> wrote:
> > > > I really hate to repeat myself but this is no different from a regular
> > > > oom situation.
> > >
> > > Conceptually yes, there is no difference, but there is no *divine
> > > restriction* against making a distinction if there is a real-world
> > > use-case that would benefit from it.
> >
> > I would wholeheartedly agree with this in general.
> >
> > However, we're talking about the very semantics that set memory.max
> > apart from memory.high: triggering OOM kills to enforce the limit.
> >
> > > > when the kernel cannot act and mentions that along with the
> > > > oom report so that whoever consumes that information can debug or act on
> > > > that fact.
> > > >
> > > > Silencing the oom report simply removes a potentially useful
> > > > aid for debugging a potential problem further.
> > >
> > > *Potentially* useful for debugging versus actually beneficial for
> > > the "sweep before tear down" use-case. Also, I am not saying that "no
> > > dumps for memory.max when no eligible tasks" should be a set-in-stone
> > > rule. We can always reevaluate if such information actually turns out
> > > to be useful.
> > >
> > > Johannes/Andrew, what's your opinion?
> >
> > I still think that if you want to sweep without triggering OOMs,
> > memory.high has the matching semantics.
> >
> > As you pointed out, it doesn't work well for foreign charges, but that
> > is more of a limitation in the implementation than in the semantics:
> >
> >         /*
> >          * If the hierarchy is above the normal consumption range, schedule
> >          * reclaim on returning to userland.  We can perform reclaim here
> >          * if __GFP_RECLAIM but let's always punt for simplicity and so that
> >          * GFP_KERNEL can consistently be used during reclaim.  @memcg is
> >          * not recorded as it most likely matches current's and won't
> >          * change in the meantime.  As high limit is checked again before
> >          * reclaim, the cost of mismatch is negligible.
> >          */
> >
> > Wouldn't it be more useful to fix that instead? It shouldn't be much
> > of a code change to do sync reclaim in try_charge().
> 
> Sync reclaim would really simplify the remote charging case. Though
> should sync reclaim be done only for remote charging, or for all
> charges?

I would do it for all __GFP_RECLAIM callers; there is no need to
special-case remote charging, IMO.

We can do both the reclaim and the penalty throttling synchronously,
i.e. all of mem_cgroup_handle_over_high(), and then punt to the
userspace resume path only when we either couldn't reclaim or are
still over the threshold after reclaim.
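
To make that concrete, a rough sketch of the try_charge() tail (just
a sketch, untested; the remote-charge case would still need the target
memcg recorded somewhere instead of implicitly using current's):

        /*
         * Sketch: handle the high excess synchronously when the
         * charging context can reclaim, instead of unconditionally
         * punting to the return-to-userspace path.
         */
        if (page_counter_read(&memcg->memory) > memcg->high) {
                current->memcg_nr_pages_over_high += batch;
                if (gfpflags_allow_blocking(gfp_mask))
                        /* reclaim and penalty-throttle right here */
                        mem_cgroup_handle_over_high();
                else
                        /* no reclaim context: defer as before */
                        set_notify_resume(current);
        }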

Btw, we should probably kick off high_work rather than arm the
userspace resume hook on foreign charges, right? Otherwise we may not
reclaim at all when we push a group over its limit from the outside
(and in a !__GFP_RECLAIM context).
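
That branch could look something like this, with a hypothetical
"foreign" test, since try_charge() would have to notice that the
charge target is not current's own memcg:

        if (foreign) {
                /*
                 * current's resume hook would reclaim the wrong
                 * group, or in a !__GFP_RECLAIM context never run,
                 * so let the async worker handle the excess.
                 */
                schedule_work(&memcg->high_work);
        } else {
                current->memcg_nr_pages_over_high += batch;
                set_notify_resume(current);
        }

(foreign is hand-waved here; something like checking memcg against
get_mem_cgroup_from_mm(current->mm), modulo reference counting.)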

Thread overview: 78+ messages
2020-04-30 18:27 [PATCH] memcg: oom: ignore oom warnings from memory.max Shakeel Butt
2020-04-30 19:06 ` Roman Gushchin
2020-04-30 19:30   ` Johannes Weiner
2020-04-30 20:23     ` Roman Gushchin
2020-04-30 19:31   ` Shakeel Butt
2020-04-30 19:29 ` Johannes Weiner
2020-04-30 20:20   ` Shakeel Butt
2020-05-04  6:57     ` Michal Hocko
2020-05-04 13:54       ` Shakeel Butt
2020-05-01  1:39 ` Yafang Shao
2020-05-01  2:04   ` Shakeel Butt
2020-05-01  2:12     ` Yafang Shao
2020-05-04  7:03   ` Michal Hocko
2020-05-04  7:26     ` Yafang Shao
2020-05-04  7:35       ` Michal Hocko
2020-05-04  7:40         ` Yafang Shao
2020-05-04  8:03           ` Michal Hocko
2020-05-04  6:56 ` Michal Hocko
2020-05-04 13:54   ` Shakeel Butt
2020-05-04 14:11     ` Michal Hocko
2020-05-04 14:53       ` Shakeel Butt
2020-05-04 15:00         ` Michal Hocko
2020-05-04 15:35           ` Shakeel Butt
2020-05-04 15:39             ` Yafang Shao
2020-05-04 16:06             ` Michal Hocko
2020-05-04 19:23               ` Shakeel Butt
2020-05-05  7:13                 ` Michal Hocko
2020-05-05 15:03                   ` Shakeel Butt
2020-05-05 16:57                     ` Johannes Weiner
2020-05-05 15:27                 ` Johannes Weiner
2020-05-05 15:35                   ` Shakeel Butt
2020-05-05 15:49                     ` Michal Hocko
2020-05-05 16:40                     ` Johannes Weiner [this message]
2020-05-04 14:20     ` Tetsuo Handa
2020-05-04 14:57       ` Shakeel Butt
2020-05-04 15:44         ` Tetsuo Handa
