linux-kernel.vger.kernel.org archive mirror
From: Johannes Weiner <hannes@cmpxchg.org>
To: Hillf Danton <hdanton@sina.com>
Cc: linux-mm <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Chris Down <chris@chrisdown.name>, Tejun Heo <tj@kernel.org>,
	Roman Gushchin <guro@fb.com>, Michal Hocko <mhocko@kernel.org>,
	Shakeel Butt <shakeelb@google.com>,
	Matthew Wilcox <willy@infradead.org>,
	Minchan Kim <minchan@kernel.org>, Mel Gorman <mgorman@suse.de>
Subject: Re: [RFC v2] memcg: add memcg lru for page reclaiming
Date: Tue, 29 Oct 2019 11:46:54 -0400
Message-ID: <20191029154654.GC33522@cmpxchg.org>
In-Reply-To: <20191026110745.12956-1-hdanton@sina.com>

On Sat, Oct 26, 2019 at 07:07:45PM +0800, Hillf Danton wrote:
> 
> Currently soft limit reclaim is frozen; see
> Documentation/admin-guide/cgroup-v2.rst for the reasons.
> 
> This work adds a memcg hook into kswapd's logic to bypass slr,
> laying the groundwork for its removal later.
> 
> After b23afb93d317 ("memcg: punt high overage reclaim to
> return-to-userland path"), high limit breachers are reclaimed one
> after another, spiraling up through the memcg hierarchy, before
> returning to userspace.
> 
> We cannot add the new hook if it is infeasible to defer that
> reclaim a bit further, until kswapd becomes active.
> 
> It can be deferred, however, because a high limit breach looks
> benign in the absence of memory pressure, and under pressure
> kswapd ensures it will be reclaimed soon.

I have no idea what this patch is actually trying to do. But both the
premise stated here and the implementation are seriously flawed.

memory.high needs to be enforced synchronously. Current users expect
workloads to be strictly contained or throttled by memory.high in
order to ensure consistent behavior regardless of the host
environment, as well as prevent interference with other workloads
whose startup time could be slowed down by this lack of containment.

On the implementation side, it appears you patched out reclaim but
left in the throttling that's supposed to make up for failing
reclaim. That means that once a cgroup tree's cache footprint grows
past its memory.high, instead of simply picking up the cold cache
pages, it'll get throttled heavily and see extreme memory pressure. It
could take ages for it to grow to the point where kswapd wakes up.
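[For context, the throttling referred to here scales a task's sleep with roughly the square of how far usage is over memory.high, so an over-limit cgroup that never gets reclaimed is penalized ever harder. A simplified user-space model of that shape; the constants and names are approximations for illustration, not the kernel's exact code:]

```c
#include <stdint.h>

/*
 * Simplified model of the memory.high throttle: the imposed sleep
 * grows quadratically with the relative overage and is clamped to a
 * maximum. Shift values and HZ are illustrative.
 */
#define DELAY_PRECISION_SHIFT	20
#define DELAY_SCALING_SHIFT	14
#define HZ			100
#define MAX_DELAY_JIFFIES	(2 * HZ)

static uint64_t penalty_jiffies(uint64_t usage, uint64_t high)
{
	uint64_t overage, penalty;

	if (high == 0 || usage <= high)
		return 0;

	/* fixed-point relative overage: (usage - high) / high */
	overage = ((usage - high) << DELAY_PRECISION_SHIFT) / high;

	/* quadratic penalty, scaled back down and clamped */
	penalty = (overage * overage * HZ)
		  >> (DELAY_PRECISION_SHIFT + DELAY_SCALING_SHIFT);

	return penalty > MAX_DELAY_JIFFIES ? MAX_DELAY_JIFFIES : penalty;
}
```

With reclaim patched out, usage only ratchets upward, so this penalty climbs toward its clamp and stays there.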

Nacked-by: Johannes Weiner <hannes@cmpxchg.org>


Thread overview: 3+ messages
     [not found] <20191026110745.12956-1-hdanton@sina.com>
2019-10-29  8:37 ` [RFC v2] memcg: add memcg lru for page reclaiming Michal Hocko
2019-10-29 15:46 ` Johannes Weiner [this message]
2019-11-07  9:02 ` [memcg] 1fc14cf673: invoked_oom-killer:gfp_mask=0x kernel test robot
