linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Johannes Weiner <hannes@cmpxchg.org>
To: Hugh Dickins <hughd@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Shakeel Butt <shakeelb@google.com>,
	Michal Hocko <mhocko@suse.com>,
	linux-mm@kvack.org, cgroups@vger.kernel.org,
	linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH 2/2] mm: memcontrol: deprecate swapaccounting=0 mode
Date: Mon, 22 Mar 2021 11:53:41 -0400	[thread overview]
Message-ID: <YFi9hVQM9maWyhWv@cmpxchg.org> (raw)
In-Reply-To: <alpine.LSU.2.11.2103191633350.1043@eggly.anvils>

On Fri, Mar 19, 2021 at 05:37:05PM -0700, Hugh Dickins wrote:
> On Fri, 19 Mar 2021, Johannes Weiner wrote:
> 
> > The swapaccounting= commandline option already does very little
> > today. To close a trivial containment failure case, the swap ownership
> > tracking part of the swap controller has recently become mandatory
> > (see commit 2d1c498072de ("mm: memcontrol: make swap tracking an
> > integral part of memory control") for details), which makes up the
> > majority of the work during swapout, swapin, and the swap slot map.
> > 
> > The only thing left under this flag is the page_counter operations and
> > the visibility of the swap control files in the first place, which are
> > rather meager savings. There also aren't many scenarios, if any, where
> > controlling the memory of a cgroup while allowing it unlimited access
> > to a global swap space is a workable resource isolation stragegy.
> > 
> > On the other hand, there have been several bugs and confusion around
> > the many possible swap controller states (cgroup1 vs cgroup2 behavior,
> > memory accounting without swap accounting, memcg runtime disabled).
> > 
> > This puts the maintenance overhead of retaining the toggle above its
> > practical benefits. Deprecate it.
> > 
> > Suggested-by: Shakeel Butt <shakeelb@google.com>
> > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> 
> This crashes, and needs a fix: see below (plus some nits).
> 
> But it's a very welcome cleanup: just getting rid of all those
> !cgroup_memory_noswap double negatives is a relief in itself.
> 
> It does suggest eliminating CONFIG_MEMCG_SWAP altogether (just
> using #ifdef CONFIG_SWAP instead, in those parts of CONFIG_MEMCG code);
> but you're right that's a separate cleanup, and not nearly so worthwhile
> as this one (I notice CONFIG_MEMCG_SWAP in some of the arch defconfigs,
> and don't know whether whoever removes CONFIG_MEMCG_SWAP would be
> obligated to remove those too).

2d1c498072de ("mm: memcontrol: make swap tracking an integral part of
memory control") made it invisible to the user already, I only kept
the symbol for convenience in the Makefile:

obj-$(CONFIG_MEMCG_SWAP) += swap_cgroup.o

But I guess we could replace it with

ifdef CONFIG_SWAP
obj-$(CONFIG_MEMCG) += swap_cgroup.c
endif

and I agree, everywhere else it's currently used would be nicer
without it. I'll send a separate patch.

> > @@ -99,7 +92,11 @@ static DECLARE_WAIT_QUEUE_HEAD(memcg_cgwb_frn_waitq);
> >  /* Whether legacy memory+swap accounting is active */
> >  static bool do_memsw_account(void)
> >  {
> > -	return !cgroup_subsys_on_dfl(memory_cgrp_subsys) && !cgroup_memory_noswap;
> > +	/* cgroup2 doesn't do mem+swap accounting */
> > +	if (cgroup_subsys_on_dfl(memory_cgrp_subsys))
> > +		return false;
> > +
> > +	return true;
> 
> Nit: I'm not fond of the "if (boolean()) return true; else return false;"
> codestyle, and would prefer the straightforward
> 
> 	return !cgroup_subsys_on_dfl(memory_cgrp_subsys);
> 
> but you've chosen otherwise, so, okay.

I prefer a series of branches if a single expression becomes unwieldy,
and individual conditions deserve their own comments.

But it's indeed pretty silly in this case, I'll make it a single line.

> > @@ -7124,7 +7121,7 @@ long mem_cgroup_get_nr_swap_pages(struct mem_cgroup *memcg)
> >  {
> >  	long nr_swap_pages = get_nr_swap_pages();
> >  
> > -	if (cgroup_memory_noswap || !cgroup_subsys_on_dfl(memory_cgrp_subsys))
> > +	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
> 
> That needs to check mem_cgroup_disabled() where it used to check
> cgroup_memory_noswap.  The convolutions are confusing (which is
> precisely why this is such a welcome cleanup), but without a
> mem_cgroup_disabled() (or NULL memcg) check there, the
> cgroup_disable=memory case oopses on NULLish pointer dereference
> when mem_cgroup_get_nr_swap_pages() is called from get_scan_count().
> 
> (This little function has been repeatedly troublesome that way.)

Sigh, yes, this will hopefully be the last instance of this bug...

Thanks for catching that, I'll fix it up.

> > @@ -7141,7 +7138,7 @@ bool mem_cgroup_swap_full(struct page *page)
> >  
> >  	if (vm_swap_full())
> >  		return true;
> > -	if (cgroup_memory_noswap || !cgroup_subsys_on_dfl(memory_cgrp_subsys))
> > +	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
> 
> Would it now be better to say "if (do_memsw_account())" there?

Yes, I'll change that.

> Or would it be better to eliminate do_memsw_account() altogether,
> and just use !cgroup_subsys_on_dfl(memory_cgrp_subsys) throughout?

I have found do_memsw_account() useful in the past to find those
related pieces. The details elude me know but I remember searching for
this string often during the recent work in this area.

> (Though I don't find "cgroup_subsys_on_dfl" very informative.)

This routinely bothers me as well, but I have not been able to come up
with a good replacement.

memcg_v1()? memcg_legacy_mode()? memory_cgroup_in_bytes()?

> > @@ -7291,27 +7287,13 @@ static struct cftype memsw_files[] = {
> >  	{ },	/* terminate */
> >  };
> >  
> > -/*
> > - * If mem_cgroup_swap_init() is implemented as a subsys_initcall()
> > - * instead of a core_initcall(), this could mean cgroup_memory_noswap still
> > - * remains set to false even when memcg is disabled via "cgroup_disable=memory"
> > - * boot parameter. This may result in premature OOPS inside
> > - * mem_cgroup_get_nr_swap_pages() function in corner cases.
> > - */
> >  static int __init mem_cgroup_swap_init(void)
> >  {
> > -	/* No memory control -> no swap control */
> > -	if (mem_cgroup_disabled())
> > -		cgroup_memory_noswap = true;
> > -
> > -	if (cgroup_memory_noswap)
> > -		return 0;
> > -
> 
> Shakeel suggested "if (mem_cgroup_disabled()) return 0;" here,
> and that was the first thing I tried when I got the crash.

It really isn't obviously safe. I'll put it back in.


  reply	other threads:[~2021-03-22 15:54 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-19  5:49 [PATCH 1/2] mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled Johannes Weiner
2021-03-19  5:49 ` [PATCH 2/2] mm: memcontrol: deprecate swapaccounting=0 mode Johannes Weiner
2021-03-19 13:49   ` Shakeel Butt
2021-03-19 17:36     ` Johannes Weiner
2021-03-19 18:20       ` Shakeel Butt
2021-03-20  0:37   ` Hugh Dickins
2021-03-22 15:53     ` Johannes Weiner [this message]
2021-03-19 14:00 ` [PATCH 1/2] mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled Shakeel Butt
2021-03-19 23:33 ` Hugh Dickins
2021-03-22  9:41 ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YFi9hVQM9maWyhWv@cmpxchg.org \
    --to=hannes@cmpxchg.org \
    --cc=akpm@linux-foundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=hughd@google.com \
    --cc=kernel-team@fb.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=shakeelb@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).