From: Roman Gushchin <guro@fb.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Muchun Song <songmuchun@bytedance.com>, <willy@infradead.org>,
	<akpm@linux-foundation.org>, <hannes@cmpxchg.org>,
	<mhocko@kernel.org>, <vdavydov.dev@gmail.com>,
	<shakeelb@google.com>, <shy828301@gmail.com>, <alexs@kernel.org>,
	<alexander.h.duyck@linux.intel.com>, <richard.weiyang@gmail.com>,
	<linux-fsdevel@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<linux-mm@kvack.org>
Subject: Re: [PATCH 0/9] Shrink the list lru size on memory cgroup removal
Date: Thu, 29 Apr 2021 18:39:40 -0700
Message-ID: <YItf3GIUs2skeuyi@carbon.dhcp.thefacebook.com>
In-Reply-To: <20210430004903.GF1872259@dread.disaster.area>

On Fri, Apr 30, 2021 at 10:49:03AM +1000, Dave Chinner wrote:
> On Wed, Apr 28, 2021 at 05:49:40PM +0800, Muchun Song wrote:
> > On our server, we found a suspected memory leak problem. The kmalloc-32
> > cache consumes more than 6GB of memory. Other kmem_caches consume less
> > than 2GB of memory.
> > 
> > After our in-depth analysis, the memory consumption of the kmalloc-32 slab
> > cache is caused by list_lru_one allocations.
> > 
> >   crash> p memcg_nr_cache_ids
> >   memcg_nr_cache_ids = $2 = 24574
> > 
> > memcg_nr_cache_ids is very large and memory consumption of each list_lru
> > can be calculated with the following formula.
> > 
> >   num_numa_node * memcg_nr_cache_ids * 32 (kmalloc-32)
> > 
> > There are 4 numa nodes in our system, so each list_lru consumes ~3MB.
> > 
> >   crash> list super_blocks | wc -l
> >   952
> 
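
To make the arithmetic in the quoted report concrete, here is a rough sketch in Python that plugs the quoted numbers into the quoted formula. The function and variable names are made up for illustration. The total assumes that each superblock carries two memcg-aware list_lrus (the dentry and inode LRUs) and that 952 is the superblock count; both are assumptions layered on top of the report, not figures from it.

```python
# Back-of-the-envelope estimate using the numbers quoted in the report.
SLOT_BYTES = 32  # one kmalloc-32 object per (node, memcg id) slot


def list_lru_bytes(num_numa_nodes, memcg_nr_cache_ids, slot_bytes=SLOT_BYTES):
    """Formula from the report: num_numa_node * memcg_nr_cache_ids * 32."""
    return num_numa_nodes * memcg_nr_cache_ids * slot_bytes


per_lru = list_lru_bytes(4, 24574)
print(per_lru)  # 3145472 bytes, i.e. the "~3MB" per list_lru

# Assumption (not stated in the report): two list_lrus per superblock.
LRUS_PER_SUPERBLOCK = 2
NR_SUPERBLOCKS = 952
total_bytes = per_lru * LRUS_PER_SUPERBLOCK * NR_SUPERBLOCKS
print(total_bytes / 2**30)  # total in GiB
```

Under those assumptions the total lands in the same range as the reported kmalloc-32 footprint, which is consistent with list_lru_one allocations being the dominant consumer.
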
> The more I see people trying to work around this, the more I think
> that the way memcgs have been grafted into the list_lru is back to
> front.
> 
> We currently allocate scope for every memcg to be able to be tracked on
> every node on every superblock instantiated in the system, regardless
> of whether that superblock is even accessible to that memcg.
> 
> These huge memcg counts come from container hosts where memcgs are
> confined to just a small subset of the total number of superblocks
> that are instantiated at any given point in time.
> 
> IOWs, for these systems with huge container counts, list_lru does
> not need the capability of tracking every memcg on every superblock.
> 
> What it comes down to is that the list_lru is only needed for a
> given memcg if that memcg is instantiating and freeing objects on a
> given list_lru.
> 
> Which makes me think we should be moving more towards an "add the memcg
> to the list_lru at the first insert" model rather than "instantiate
> all at memcg init time just in case". The model we originally came
> up with for supporting memcgs is really starting to show its limits,
> and we should address those limitations rather than hack more
> complexity into the system that does nothing to remove the
> limitations that are causing the problems in the first place.

I totally agree.
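
To illustrate the difference between the two models quoted above, here is a toy sketch in Python. It is not kernel code: the class names, the dict-based storage, and the item strings are all hypothetical, and it ignores everything a real implementation would have to deal with on the insert path (locking, allocation failure, reparenting on cgroup removal).

```python
class EagerListLru:
    """'Instantiate all at memcg init time': one slot per (node, memcg id)."""

    def __init__(self, nr_nodes, nr_memcg_ids):
        # Every slot exists up front, whether or not the memcg ever uses it.
        self.lists = [[[] for _ in range(nr_memcg_ids)] for _ in range(nr_nodes)]

    def add(self, node, memcg_id, item):
        self.lists[node][memcg_id].append(item)

    def allocated_slots(self):
        return sum(len(per_node) for per_node in self.lists)


class LazyListLru:
    """'Add the memcg at the first insert': slots appear only when used."""

    def __init__(self, nr_nodes):
        self.lists = [{} for _ in range(nr_nodes)]

    def add(self, node, memcg_id, item):
        # The per-memcg list is created on the first insert for that memcg.
        self.lists[node].setdefault(memcg_id, []).append(item)

    def allocated_slots(self):
        return sum(len(per_node) for per_node in self.lists)


eager = EagerListLru(nr_nodes=4, nr_memcg_ids=24574)
lazy = LazyListLru(nr_nodes=4)
for lru in (eager, lazy):
    lru.add(0, 17, "dentry-a")
    lru.add(0, 17, "dentry-b")
    lru.add(1, 17, "inode-a")
    lru.add(2, 4242, "inode-b")

print(eager.allocated_slots())  # 98296
print(lazy.allocated_slots())   # 3
```

With the node count and memcg_nr_cache_ids from the report, the eager model holds 98296 slots for this one list_lru while the lazy one holds only the three that were actually touched.
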

It looks like the initial implementation of the whole kernel memory accounting
and memcg-aware shrinkers was based on the idea that the number of memory
cgroups is relatively small and stable. With systemd creating a separate cgroup
for everything, including short-lived processes, that is simply not true anymore.

Thanks!
