From: Johannes Weiner <hannes@cmpxchg.org>
To: Yang Shi <yang.shi@linux.alibaba.com>
Cc: mhocko@suse.com, shakeelb@google.com, akpm@linux-foundation.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC v3 PATCH 0/5] mm: memcontrol: do memory reclaim when offlining
Date: Wed, 9 Jan 2019 17:51:43 -0500	[thread overview]
Message-ID: <20190109225143.GA22252@cmpxchg.org> (raw)
In-Reply-To: <9de4bb4a-6bb7-e13a-0d9a-c1306e1b3e60@linux.alibaba.com>

On Wed, Jan 09, 2019 at 02:09:20PM -0800, Yang Shi wrote:
> On 1/9/19 1:23 PM, Johannes Weiner wrote:
> > On Wed, Jan 09, 2019 at 12:36:11PM -0800, Yang Shi wrote:
> > > As I mentioned above, if we know some page caches from some memcgs
> > > are referenced once and are unlikely to be shared, why keep them
> > > around to increase memory pressure?
> > It's just not clear to me that your scenarios are generic enough to
> > justify adding two interfaces that we have to maintain forever, and
> > that they couldn't be solved with existing mechanisms.
> > 
> > Please explain:
> > 
> > - Unmapped clean page cache isn't expensive to reclaim, certainly
> >    cheaper than the IO involved in new application startup. How could
> >    recycling clean cache be a prohibitive part of workload warmup?
> 
> It is not about recycling. Those page caches might be referenced by the
> memcg just once, and then nobody touches them until memory pressure
> hits. And they might not be accessed again any time soon.

I meant recycling the page frames, not the cache in them. So the new
workload, as it starts up, needs to take those pages from the LRU list
instead of straight off the allocator freelist. While that's obviously
not the same cost, it's not clear why the difference would be
prohibitive to application startup, especially since app startup tends
to be dominated by things like the IO to fault in executables etc.

> > - Why you couldn't set memory.high or memory.max to 0 after the
> >    application quits and before you call rmdir on the cgroup
> 
> I recall I explained this in the review email for the first version.
> Setting memory.high or memory.max to 0 would trigger direct reclaim,
> which may stall the offlining of the memcg. But we have "restarting the
> same-name job" logic in our usecase (I'm not quite sure why they do
> so). Basically, it means creating a memcg with the exact same name
> right after the old one is deleted, but possibly with different limits
> or other settings. The creation has to wait until rmdir is done.

This really needs a fix on your end. We cannot add new cgroup control
files because you cannot handle a delayed release in the cgroupfs
namespace while you're reclaiming associated memory. A simple serial
number would fix this.
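To make the serial-number suggestion concrete, here is a minimal sketch
in shell. The function name `next_cgroup_name` and the counter-file
convention are hypothetical, invented for illustration; the point is
only that the new group's name never collides with the one still being
torn down, so creation need not wait for the old rmdir to complete:

```shell
# Hypothetical sketch: generate a fresh, non-colliding cgroup name for
# each restart of a job instead of reusing the exact same name.
next_cgroup_name() {
    job="$1"            # base job name, e.g. "memcg-job"
    counter_file="$2"   # file holding the last serial number used
    n=$(cat "$counter_file" 2>/dev/null || echo 0)
    n=$((n + 1))
    echo "$n" > "$counter_file"     # persist for the next restart
    echo "${job}-${n}"              # e.g. memcg-job-1, memcg-job-2, ...
}
```

The restarted job then does `mkdir /sys/fs/cgroup/$(next_cgroup_name ...)`
while the previous instance's group is still draining.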

Whether others have asked for this knob or not, these patches should
come with a solid case in the cover letter and changelogs that explains
why this ABI is necessary to solve a generic cgroup usecase. But it
sounds to me that setting the limit to 0 once the group is empty would
meet the functional requirement of what you are trying to do (use
fork() if you don't want to wait).
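The existing-mechanism sequence being suggested can be sketched in a few
lines of shell. This is an illustrative sketch, not anything from the
patches: the helper name `drain_and_remove` is invented, it assumes
cgroup v2 semantics where writing to memory.max triggers reclaim, and
the backgrounded subshell stands in for the fork() suggestion so the
caller does not block on the drain:

```shell
# Hypothetical sketch: drain a memcg's remaining charges and remove it
# without blocking the caller, using only existing cgroup v2 knobs.
drain_and_remove() {
    cg="$1"     # e.g. /sys/fs/cgroup/memcg-job-1 (illustrative path)
    (
        # Dropping the hard limit to 0 forces reclaim of whatever clean
        # cache is still charged to the (now empty) group.
        echo 0 > "$cg/memory.max"
        # Removal may need retrying if charges are still draining;
        # failure is tolerated here for brevity.
        rmdir "$cg" 2>/dev/null || true
    ) &         # background the drain, approximating fork()
}
```

Combined with non-colliding group names, the restarted job never has to
wait on this background teardown.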

I don't think the new interface bar is met here.


Thread overview: 15+ messages
2019-01-09 19:14 [RFC v3 PATCH 0/5] mm: memcontrol: do memory reclaim when offlining Yang Shi
2019-01-09 19:14 ` [v3 PATCH 1/5] doc: memcontrol: fix the obsolete content about force empty Yang Shi
2019-01-09 19:14 ` [v3 PATCH 2/5] mm: memcontrol: add may_swap parameter to mem_cgroup_force_empty() Yang Shi
2019-01-09 19:14 ` [v3 PATCH 3/5] mm: memcontrol: introduce wipe_on_offline interface Yang Shi
2019-01-09 19:14 ` [v3 PATCH 4/5] mm: memcontrol: bring force_empty into default hierarchy Yang Shi
2019-01-09 19:14 ` [v3 PATCH 5/5] doc: memcontrol: add description for wipe_on_offline Yang Shi
2019-01-10 12:00   ` William Kucharski
2019-01-09 19:32 ` [RFC v3 PATCH 0/5] mm: memcontrol: do memory reclaim when offlining Johannes Weiner
2019-01-09 20:36   ` Yang Shi
2019-01-09 21:23     ` Johannes Weiner
2019-01-09 22:09       ` Yang Shi
2019-01-09 22:51         ` Johannes Weiner [this message]
2019-01-10  1:47           ` Yang Shi
2019-01-14 19:01             ` Johannes Weiner
2019-01-17 22:55               ` Yang Shi
