From: Andrew Morton <akpm@linux-foundation.org>
To: Shakeel Butt <shakeelb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Hugh Dickins <hughd@google.com>, Michal Hocko <mhocko@suse.com>,
	linux-mm@kvack.org, cgroups@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/3] mm: memcontrol: deprecate charge moving
Date: Wed, 7 Dec 2022 13:51:08 -0800
Message-ID: <20221207135108.fe1d51f7581f6ff86dbf9bc8@linux-foundation.org>
In-Reply-To: <CALvZod6WcBifeWJYG_QLr9Uy5aSbpLoCVyOp+FVx0ca1gzq4fA@mail.gmail.com>

On Tue, 6 Dec 2022 16:03:54 -0800 Shakeel Butt <shakeelb@google.com> wrote:

> On Tue, Dec 6, 2022 at 9:14 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
> >
> > Charge moving mode in cgroup1 allows memory to follow tasks as they
> > migrate between cgroups. This is, and always has been, a questionable
> > thing to do - for several reasons.
> >
> > First, it's expensive. Pages need to be identified, locked and
> > isolated from various MM operations, and reassigned, one by one.
> >
> > Second, it's unreliable. Once pages are charged to a cgroup, there
> > isn't always a clear owner task anymore. Cache isn't moved at all, for
> > example. Mapped memory is moved - but if trylocking or isolating a
> > page fails, it's arbitrarily left behind. Frequent moving between
> > domains may leave a task's memory scattered all over the place.
> >
> > Third, it isn't really needed. Launcher tasks can kick off workload
> > tasks directly in their target cgroup. Using dedicated per-workload
> > groups allows fine-grained policy adjustments - no need to move tasks
> > and their physical pages between control domains. The feature was
> > never forward-ported to cgroup2, and it hasn't been missed.
> >
> > Despite it being a niche usecase, the maintenance overhead of
> > supporting it is enormous. Because pages are moved while they are live
> > and subject to various MM operations, the synchronization rules are
> > complicated. There are lock_page_memcg() in MM and FS code, which
> > non-cgroup people don't understand. In some cases we've been able to
> > shift code and cgroup API calls around such that we can rely on native
> > locking as much as possible. But that's fragile, and sometimes we need
> > to hold MM locks for longer than we otherwise would (pte lock e.g.).
> >
> > Mark the feature deprecated. Hopefully we can remove it soon.
> >
> > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> 
> Acked-by: Shakeel Butt <shakeelb@google.com>
> 
> I would request this patch to be backported to stable kernels as well
> for early warnings to users which update to newer kernels very late.

Sounds reasonable, but the changelog should have a few words in it
explaining why we're requesting the backport.  I guess I can type those
in.

We're at -rc8 and I'm not planning on merging these up until after
6.2-rc1 is out.  Please feel free to argue with me on that score.

Thread overview: 25+ messages
2022-12-06 17:13 [PATCH v2 0/3] mm: push down lock_page_memcg() Johannes Weiner
2022-12-06 17:13 ` Johannes Weiner
2022-12-06 17:13 ` [PATCH 1/3] mm: memcontrol: skip moving non-present pages that are mapped elsewhere Johannes Weiner
2022-12-06 17:13   ` Johannes Weiner
2022-12-07  1:51   ` Hugh Dickins
2022-12-08  0:36   ` Shakeel Butt
2022-12-08  0:36     ` Shakeel Butt
2022-12-06 17:13 ` [PATCH 2/3] mm: rmap: remove lock_page_memcg() Johannes Weiner
2022-12-06 17:13   ` Johannes Weiner
2022-12-07  1:52   ` Hugh Dickins
2022-12-07  1:52     ` Hugh Dickins
2022-12-08  0:36   ` Shakeel Butt
2022-12-06 17:13 ` [PATCH 3/3] mm: memcontrol: deprecate charge moving Johannes Weiner
2022-12-06 17:13   ` Johannes Weiner
2022-12-07  0:03   ` Shakeel Butt
2022-12-07  0:03     ` Shakeel Butt
2022-12-07 21:51     ` Andrew Morton [this message]
2022-12-07 21:51       ` Andrew Morton
2022-12-07 22:15       ` Shakeel Butt
2022-12-07 22:15         ` Shakeel Butt
2022-12-07  1:58   ` Hugh Dickins
2022-12-07  1:58     ` Hugh Dickins
2022-12-07 13:00     ` Johannes Weiner
2022-12-07 13:00       ` Johannes Weiner
2022-12-07 14:07 ` [PATCH v2 0/3] mm: push down lock_page_memcg() Michal Hocko
