From: Michal Hocko <mhocko@suse.com>
To: Hillf Danton <hdanton@sina.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Leonardo Bras <leobras@redhat.com>,
	Frederic Weisbecker <fweisbecker@suse.de>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org, Frederic Weisbecker <frederic@kernel.org>
Subject: Re: [PATCH 2/2] memcg: do not drain charge pcp caches on remote isolated cpus
Date: Sat, 18 Mar 2023 09:08:57 +0100
Message-ID: <ZBVxmV78EyfDhvn/@dhcp22.suse.cz>
In-Reply-To: <20230318032350.2078-1-hdanton@sina.com>

On Sat 18-03-23 11:23:50, Hillf Danton wrote:
> On 17 Mar 2023 14:44:48 +0100 Michal Hocko <mhocko@suse.com>
> > Leonardo Bras has noticed that pcp charge cache draining might be
> > disruptive on workloads relying on 'isolated cpus', a feature commonly
> > used on workloads that are sensitive to interruption and context
> > switching such as vRAN and Industrial Control Systems.
> > 
> > There are essentially two ways to approach the issue. We can either
> > allow the pcp cache to be drained on a cpu other than the local one,
> > or avoid scheduling the flushing on isolated cpus altogether.
> > 
> > The current pcp charge cache is really optimized for high performance
> > and it always relies on operating on its local cpu. That means the
> > fast path only requires a local_lock (preempt_disable on !RT) and
> > draining is handed over to a per-cpu workqueue so that it happens
> > locally as well.
> > 
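
A minimal sketch of that fast path, loosely modelled on the memcg_stock
code in mm/memcontrol.c (fields and error handling are trimmed for
illustration, so treat it as a sketch rather than the exact kernel
source):

struct memcg_stock_pcp {
        local_lock_t stock_lock;
        struct mem_cgroup *cached;   /* memcg owning the cached charges */
        unsigned int nr_pages;       /* pre-charged pages, plain integer */
        struct work_struct work;     /* draining runs through this, locally */
        unsigned long flags;         /* FLUSHING_CACHED_CHARGE */
};
static DEFINE_PER_CPU(struct memcg_stock_pcp, memcg_stock);

static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
{
        struct memcg_stock_pcp *stock;
        unsigned long flags;
        bool ret = false;

        /* on !PREEMPT_RT this boils down to preempt_disable + irq save */
        local_lock_irqsave(&memcg_stock.stock_lock, flags);

        stock = this_cpu_ptr(&memcg_stock);
        if (memcg == READ_ONCE(stock->cached) && stock->nr_pages >= nr_pages) {
                stock->nr_pages -= nr_pages;  /* plain, cpu-local arithmetic */
                ret = true;
        }

        local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
        return ret;
}
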
> > The former solution (remote draining) would require adding locking to
> > prevent local charges from racing with the draining. This would add an
> > atomic operation to the otherwise simple arithmetic fast path in
> > try_charge. Another concern is that remote draining can cause lock
> > contention with the isolated workloads and therefore interfere with
> > them indirectly via user space interfaces.
> > 
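
For contrast, a hypothetical sketch of what the remote-draining variant
would force onto that fast path; the atomic field name and the cmpxchg
loop below are illustrative assumptions, not code from any posted patch:

/* hypothetical: nr_pages becomes atomic_t so a remote drainer can race */
static bool consume_stock_remote(struct mem_cgroup *memcg,
                                 unsigned int nr_pages)
{
        struct memcg_stock_pcp *stock = this_cpu_ptr(&memcg_stock);
        int old = atomic_read(&stock->nr_pages_atomic);

        if (memcg != READ_ONCE(stock->cached))
                return false;

        /* one atomic RMW loop replaces the plain subtraction */
        do {
                if (old < (int)nr_pages)
                        return false;
        } while (!atomic_try_cmpxchg(&stock->nr_pages_atomic, &old,
                                     old - nr_pages));
        return true;
}
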
> > Another option is to avoid scheduling the draining on isolated cpus
> > altogether. That means that those remote cpus would keep their charges
> > even after drain_all_stock returns. This is certainly not optimal either
> > but it shouldn't really cause any major problems. In the worst case
> > (many isolated cpus with charges - each of them holding up to
> > MEMCG_CHARGE_BATCH, i.e. 64, pages) the memory consumption of a memcg
> > would be artificially higher than what is immediately usable from
> > other cpus.
> > 
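
(For scale: 8 isolated cpus, each holding a full batch, would strand
8 * 64 pages * 4 KiB/page = 2 MiB of charges per memcg until those cpus
drain on their own; the cpu count and page size here are only
illustrative.)
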
> > Theoretically the memcg OOM killer could be triggered prematurely.
> > Currently it is not really clear whether this is a practical problem
> > though. A tight memcg limit would be counterproductive for cpu-isolated
> > workloads pretty much by definition, because any memory reclaim induced
> > by the limit could break user space timing expectations, as such
> > workloads usually expect to execute in user space most of the time.
> > 
> > Charges could also be left behind on memcg removal. Any future charge
> > on those isolated cpus will drain the pcp cache, so this is not a
> > permanent leak.
> > 
> > Considering the pros and cons of both approaches, this patch implements
> > the second option and simply does not schedule remote draining if the
> > target cpu is isolated. This solution is much simpler. It doesn't add
> > any new locking and it is more predictable from the user space POV.
> > Should the premature memcg OOM become a real life problem, we can
> > revisit this decision.
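
Condensed, the resulting drain_all_stock() loop looks roughly like the
sketch below; cpu_is_isolated() is the helper added in patch 1/2 of this
series, while stock_belongs_to() stands in for the real flush test and
is purely illustrative:

/* walk all cpus with cached charges, but never disturb isolated ones */
for_each_online_cpu(cpu) {
        struct memcg_stock_pcp *stock = &per_cpu(memcg_stock, cpu);
        bool flush = stock_belongs_to(stock, root_memcg);  /* simplified */

        if (flush &&
            !test_and_set_bit(FLUSHING_CACHED_CHARGE, &stock->flags)) {
                if (cpu == curcpu)
                        drain_local_stock(&stock->work);
                else if (!cpu_is_isolated(cpu))
                        /* the fix: do not kick the drain worker
                         * onto an isolated cpu */
                        schedule_work_on(cpu, &stock->work);
        }
}
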
> 
> JFYI, feel free to take a look at the handling of non-housekeeping
> CPUs [1].
> 
> [1] https://lore.kernel.org/lkml/20230223150624.GA29739@lst.de/

Such an approach would require remote draining, and I hope I have
explained why that is not the preferred way in this case. Other than
that, I do agree with Christoph that a generic approach would be really
nice.

-- 
Michal Hocko
SUSE Labs
