From: Roman Gushchin <roman.gushchin@linux.dev>
To: "Leonardo Brás" <leobras@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Shakeel Butt <shakeelb@google.com>,
	Muchun Song <muchun.song@linux.dev>,
	Andrew Morton <akpm@linux-foundation.org>,
	cgroups@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 0/5] Introduce memcg_stock_pcp remote draining
Date: Fri, 27 Jan 2023 15:50:31 -0800
Message-ID: <Y9RjRxe5Ao2/u+1Y@P9FQF9L96D.corp.robot.car>
In-Reply-To: <029147be35b5173d5eb10c182e124ac9d2f1f0ba.camel@redhat.com>

On Fri, Jan 27, 2023 at 04:29:37PM -0300, Leonardo Brás wrote:
> On Fri, 2023-01-27 at 10:29 +0100, Michal Hocko wrote:
> > On Fri 27-01-23 04:35:22, Leonardo Brás wrote:
> > > On Fri, 2023-01-27 at 08:20 +0100, Michal Hocko wrote:
> > > > On Fri 27-01-23 04:14:19, Leonardo Brás wrote:
> > > > > On Thu, 2023-01-26 at 15:12 -0800, Roman Gushchin wrote:
> > > > [...]
> > > > > > I'd rather opt out of stock draining for isolated cpus: it might slightly reduce
> > > > > > the accuracy of memory limits and slightly increase the memory footprint (all
> > > > > > those dying memcgs...), but the impact will be limited. Actually it is limited
> > > > > > by the number of cpus.
> > > > > 
> > > > > I was discussing this same idea with Marcelo yesterday morning.
> > > > > 
> > > > > The questions we had on the topic were:
> > > > > a - About how many pages will the pcp cache hold before draining them itself?
> > > > 
> > > > MEMCG_CHARGE_BATCH (64 currently). And one more clarification. The cache
> > > > doesn't really hold any pages. It is a mere counter of how many charges
> > > > have been accounted for the memcg page counter. So it is not really
> > > > consuming proportional amount of resources. It just pins the
> > > > corresponding memcg. Have a look at consume_stock and refill_stock
> > > 
> > > I see. Thanks for pointing that out!
> > > 
> > > So in worst case scenario the memcg would have reserved 64 pages * (numcpus - 1)
> > 
> > s@numcpus@num_isolated_cpus@
> 
> I was thinking of the worst-case scenario, with (ncpus - 1) cpus isolated.
> 
> > 
> > > that are not getting used, and may cause an 'earlier' OOM if this amount is
> > > needed but can't be freed.
> > 
> > s@OOM@memcg OOM@
>  
> > > Along the lines of the worst case, suppose a big powerpc machine with 256 CPUs,
> > > each holding 64 pages of 64k => 1GB of memory minus 4MB (one cpu using its resources).
> > > It's starting to get too big, but still ok for a machine this size.
> > 
> > It is more about the memcg limit rather than the size of the machine.
> > Again, let's focus on the actual use case. What is the usual memcg setup with
> > those isolcpus?
> 
> I understand it's about the limit, not actually allocated memory. When I point
> to the machine size, I mean what a user of that machine would be expected to
> find acceptable.
> 
> > 
> > > The thing is that it can present an odd behavior: 
> > > You have a cgroup created before, now empty, you try to run a given
> > > application, and it hits OOM.
> > 
> > The application would either consume those cached charges or flush them
> > if it is running in a different memcg. Or what do you have in mind?
> 
> 1 - Create a memcg with a VM inside, multiple vcpus pinned to isolated cpus. 
> 2 - Run a multi-cpu task inside the VM; it allocates memory on every CPU and
>     keeps the pcp cache
> 3 - Try to run a single-cpu task (pinned?) inside the VM, which uses almost all
>     the available memory.
> 4 - memcg OOM.
> 
> Does it make sense?

It can happen now as well, you just need a competing drain request.
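
To make the quoted steps concrete, here is a toy userspace model. It is not the
kernel code: toy_memcg, toy_charge_page() and toy_drain_all() are hypothetical
names, and the only value taken from this thread is the batch size of 64. It
assumes that a charge on one CPU never drains the stocks of the other CPUs.

```c
#include <stdbool.h>

#define CHARGE_BATCH 64u   /* pages; the MEMCG_CHARGE_BATCH value quoted above */
#define MAX_CPUS     256u  /* arbitrary size for this toy model */

/* Toy memcg: a page counter with a limit, plus one cached counter per CPU. */
struct toy_memcg {
	unsigned long limit;            /* pages */
	unsigned long charged;          /* pages accounted against the limit */
	unsigned int  stock[MAX_CPUS];  /* pre-charged pages cached per CPU */
};

/* Charge one page from `cpu`. Other CPUs' stocks are never touched here. */
static bool toy_charge_page(struct toy_memcg *m, unsigned int cpu)
{
	if (m->stock[cpu] > 0) {                    /* fast path: cached charge */
		m->stock[cpu]--;
		return true;
	}
	if (m->charged + CHARGE_BATCH <= m->limit) {
		m->charged += CHARGE_BATCH;         /* charge a whole batch */
		m->stock[cpu] = CHARGE_BATCH - 1;   /* keep the rest cached */
		return true;
	}
	if (m->charged + 1 <= m->limit) {           /* fall back to one page */
		m->charged++;
		return true;
	}
	return false;                               /* memcg OOM in this model */
}

/* Return every cached charge to the counter, as a full drain would. */
static void toy_drain_all(struct toy_memcg *m, unsigned int ncpus)
{
	for (unsigned int i = 0; i < ncpus; i++) {
		m->charged -= m->stock[i];
		m->stock[i] = 0;
	}
}
```

With a limit of 1024 pages and four CPUs each charging one page, a task that
then allocates only on CPU 0 gets 831 more pages before the model reports OOM,
although only 835 pages are in use. The other 189 sit in the three untouched
stocks until toy_drain_all() returns them.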

Honestly, I feel the probability of this scenario being a real problem is fairly low.
I don't recall any complaints about spurious OOMs caused by races in the draining code.
Machines that are tight on memory rarely have so many idle cpus.
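
For reference, the bound discussed in the quoted text can be written down as a
small sketch. The helper name max_stranded_bytes() is hypothetical; the batch
of 64 pages is the MEMCG_CHARGE_BATCH value quoted above, and the page size is
a parameter because the quoted example assumes 64k pages.

```c
#include <stdint.h>

/* Assumption: the MEMCG_CHARGE_BATCH value quoted in this thread. */
#define CHARGE_BATCH_PAGES 64u

/*
 * Hypothetical helper: upper bound, in bytes, on the charge that can sit in
 * the per-cpu stocks of CPUs that are never drained.
 */
static uint64_t max_stranded_bytes(unsigned int undrained_cpus,
				   uint64_t page_size)
{
	return (uint64_t)undrained_cpus * CHARGE_BATCH_PAGES * page_size;
}
```

For 255 undrained CPUs and 64k pages this gives 1069547520 bytes, i.e. 1GB
minus 4MB as in the quoted example. With 4k pages the same CPU count gives
66846720 bytes, a bit under 64MB.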

Thanks!
