linux-mm.kvack.org archive mirror
From: Roman Gushchin <guro@fb.com>
To: Shakeel Butt <shakeelb@google.com>
Cc: Michal Hocko <mhocko@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Linux MM <linux-mm@kvack.org>, Kernel Team <kernel-team@fb.com>,
	LKML <linux-kernel@vger.kernel.org>, Domas Mituzas <domas@fb.com>,
	Tejun Heo <tj@kernel.org>, Chris Down <chris@chrisdown.name>
Subject: Re: [PATCH] mm: memcontrol: avoid workload stalls when lowering memory.high
Date: Fri, 10 Jul 2020 11:42:05 -0700
Message-ID: <20200710184205.GB350256@carbon.dhcp.thefacebook.com>
In-Reply-To: <CALvZod6Yk8QoZjbNkGE8-qeOD187Nu-+VwasoROGZs_UsMgbEQ@mail.gmail.com>

On Fri, Jul 10, 2020 at 07:12:22AM -0700, Shakeel Butt wrote:
> On Fri, Jul 10, 2020 at 5:29 AM Michal Hocko <mhocko@kernel.org> wrote:
> >
> > On Thu 09-07-20 12:47:18, Roman Gushchin wrote:
> > > The memory.high limit is implemented in a way such that the kernel
> > > penalizes all threads which are allocating memory over the limit.
> > > Forcing all threads into synchronous reclaim and adding some
> > > artificial delays slows down the memory consumption and potentially
> > > gives some time for userspace oom handlers/resource control agents
> > > to react.
> > >
> > > It works nicely if the memory usage is hitting the limit from below;
> > > however, it works sub-optimally if a user adjusts memory.high to a
> > > value way below the current memory usage. It basically forces all
> > > workload threads (doing any memory allocations) into synchronous
> > > reclaim and sleep. This makes the workload completely unresponsive
> > > for a long period of time and can also lead to system-wide contention
> > > on LRU locks. It can happen even if the workload is not actually
> > > tight on memory and has, for example, a ton of cold pagecache.
> > >
> > > In the current implementation, writing to memory.high causes an
> > > atomic update of the page counter's high value, followed by an
> > > attempt to reclaim enough memory to fit into the new limit. To fix
> > > the problem described above, all we need is to change the order of
> > > execution: try to push the memory usage under the limit first, and
> > > only then set the new high limit.
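
For reference, the reordering looks roughly like this (a condensed
sketch of memory_high_write() in mm/memcontrol.c, not the literal
patch; the drain/retry details follow the v5.8-era code from memory
and may differ):

	static ssize_t memory_high_write(struct kernfs_open_file *of,
					 char *buf, size_t nbytes, loff_t off)
	{
		struct mem_cgroup *memcg = mem_cgroup_from_css(of_css(of));
		unsigned int nr_retries = MEM_CGROUP_RECLAIM_RETRIES;
		bool drained = false;
		unsigned long high;
		int err;

		buf = strstrip(buf);
		err = page_counter_memparse(buf, "max", &high);
		if (err)
			return err;

		/* Try to push the usage under the new limit first... */
		for (;;) {
			unsigned long nr_pages = page_counter_read(&memcg->memory);

			if (nr_pages <= high)
				break;

			if (signal_pending(current))
				break;

			if (!drained) {
				/* flush per-cpu charge caches for an accurate read */
				drain_all_stock(memcg);
				drained = true;
				continue;
			}

			if (!nr_retries)
				break;

			if (!try_to_free_mem_cgroup_pages(memcg, nr_pages - high,
							  GFP_KERNEL, true))
				nr_retries--;
		}

		/* ...and only then publish it, so that workload threads are
		 * not throttled on a limit far below the current usage. */
		page_counter_set_high(&memcg->memory, high);

		return nbytes;
	}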
> >
> > Shakeel, would this help with your proactive reclaim use-case? It
> > would require resetting the high limit right after the reclaim
> > returns, which is quite ugly, but it would at least not require a
> > completely new interface. You would simply do
> >         high=$(( $(cat memory.current) - to_reclaim ))
> >         echo $high > memory.high
> >         echo max > memory.high      # reset the limit to prevent
> >                                     # direct reclaim allocation stalls
> >
> 
> This will reduce the chance of stalls, but the interface is still
> non-delegatable, i.e. applications cannot change their own memory.high
> for use-cases like application-controlled proactive reclaim and
> uswapd.
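
(The use-case meant here is, roughly, a per-application "uswapd" loop
like the hypothetical sketch below; paths and numbers are invented,
and the loop assumes the application may write its own memory.high,
which is exactly the delegation problem being raised:)

	#!/bin/sh
	# Proactively reclaim ~64M from our own cgroup every 10 seconds.
	CG=/sys/fs/cgroup$(cut -d: -f3 /proc/self/cgroup)
	while sleep 10; do
		cur=$(cat "$CG/memory.current")
		target=$((cur - 64*1024*1024))
		[ "$target" -gt 0 ] || continue
		echo "$target" > "$CG/memory.high"	# trigger reclaim
		echo max > "$CG/memory.high"		# reset to avoid stalls
	done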

Can you please elaborate a bit more on this? I didn't understand
why.

> 
> One more ugly fix would be to add one more layer of cgroup and have
> the application use the memory.high of that layer to fulfill such
> use-cases.
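
(That layered workaround might look like the hypothetical setup below:
the outer cgroup stays under the resource agent's control, and only
the inner memory.high is handed to the application; all paths and
names are invented for illustration:)

	# Outer cgroup: real containment, owned by the resource agent.
	mkdir -p /sys/fs/cgroup/workload
	echo +memory > /sys/fs/cgroup/workload/cgroup.subtree_control
	echo 8G > /sys/fs/cgroup/workload/memory.max

	# Inner cgroup: the application runs here and is given write
	# access to this memory.high only, for proactive reclaim.
	mkdir /sys/fs/cgroup/workload/self
	chown app /sys/fs/cgroup/workload/self/cgroup.procs \
		  /sys/fs/cgroup/workload/self/memory.high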
> 
> I think providing a new interface would allow a much cleaner solution
> than settling on a bunch of ugly hacks.
> 
> > The primary reason to set the high limit in advance was to catch
> > potential runaways more effectively, because they would just get
> > throttled while memory_high_write is reclaiming. With this change the
> > reclaim here might just be playing never-ending catch-up. On the plus
> > side, breaking out of the reclaim loop would still enforce the limit,
> > so if the operation takes too long the reclaim burden will eventually
> > move over to the consumers. So I do not see any real danger.
> >
> > > Signed-off-by: Roman Gushchin <guro@fb.com>
> > > Reported-by: Domas Mituzas <domas@fb.com>
> > > Cc: Johannes Weiner <hannes@cmpxchg.org>
> > > Cc: Michal Hocko <mhocko@kernel.org>
> > > Cc: Tejun Heo <tj@kernel.org>
> > > Cc: Shakeel Butt <shakeelb@google.com>
> > > Cc: Chris Down <chris@chrisdown.name>
> >
> > Acked-by: Michal Hocko <mhocko@suse.com>
> >
> 
> This patch seems reasonable on its own.
> 
> Reviewed-by: Shakeel Butt <shakeelb@google.com>

Thank you!



Thread overview: 12+ messages
2020-07-09 19:47 [PATCH] mm: memcontrol: avoid workload stalls when lowering memory.high Roman Gushchin
2020-07-10 12:29 ` Michal Hocko
2020-07-10 14:12   ` Shakeel Butt
2020-07-10 18:42     ` Roman Gushchin [this message]
2020-07-10 19:19       ` Shakeel Butt
2020-07-10 19:41         ` Roman Gushchin
2020-07-14  8:41         ` Michal Hocko
2020-07-14 15:32           ` Shakeel Butt
2020-07-14 15:50             ` Michal Hocko
2020-07-14 15:38         ` Johannes Weiner
2020-07-14 17:06           ` Shakeel Butt
2020-07-15 16:54             ` Johannes Weiner
