From: Roman Gushchin <guro@fb.com>
To: Muchun Song <songmuchun@bytedance.com>
Cc: <willy@infradead.org>, <akpm@linux-foundation.org>,
	<hannes@cmpxchg.org>, <mhocko@kernel.org>,
	<vdavydov.dev@gmail.com>, <shakeelb@google.com>,
	<shy828301@gmail.com>, <alexs@kernel.org>,
	<richard.weiyang@gmail.com>, <david@fromorbit.com>,
	<trond.myklebust@hammerspace.com>, <anna.schumaker@netapp.com>,
	<jaegeuk@kernel.org>, <chao@kernel.org>,
	<kari.argillander@gmail.com>, <linux-fsdevel@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>, <linux-mm@kvack.org>,
	<linux-nfs@vger.kernel.org>, <zhengqi.arch@bytedance.com>,
	<duanxiongchun@bytedance.com>, <fam.zheng@bytedance.com>,
	<smuchun@gmail.com>, <cl@linux.com>, <penberg@kernel.org>,
	<rientjes@google.com>, <vbabka@suse.cz>
Subject: Re: [PATCH v5 02/16] mm: introduce kmem_cache_alloc_lru
Date: Thu, 6 Jan 2022 19:04:53 -0800	[thread overview]
Message-ID: <Ydet1XmiY8SZPLUx@carbon.dhcp.thefacebook.com> (raw)
In-Reply-To: <20211220085649.8196-3-songmuchun@bytedance.com>

On Mon, Dec 20, 2021 at 04:56:35PM +0800, Muchun Song wrote:
> We currently allocate scope for every memcg to be able to be tracked
> on every superblock instantiated in the system, regardless of whether
> that superblock is even accessible to that memcg.
> 
> These huge memcg counts come from container hosts where memcgs are
> confined to just a small subset of the total number of superblocks
> that are instantiated at any given point in time.
> 
> For these systems with huge container counts, list_lru does not need
> the capability of tracking every memcg on every superblock. What it
> comes down to is that the memcg only needs to be added to the
> list_lru at the first insert. So introduce kmem_cache_alloc_lru to
> allocate objects and their list_lru. In later patches, we will convert
> all inode and dentry allocations from kmem_cache_alloc to
> kmem_cache_alloc_lru.
> 
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> ---
>  include/linux/list_lru.h   |   4 ++
>  include/linux/memcontrol.h |  14 ++++++
>  include/linux/slab.h       |   3 ++
>  mm/list_lru.c              | 104 +++++++++++++++++++++++++++++++++++++++++----
>  mm/memcontrol.c            |  14 ------
>  mm/slab.c                  |  39 +++++++++++------
>  mm/slab.h                  |  25 +++++++++--
>  mm/slob.c                  |   6 +++
>  mm/slub.c                  |  42 ++++++++++++------
>  9 files changed, 198 insertions(+), 53 deletions(-)
> 
> diff --git a/include/linux/list_lru.h b/include/linux/list_lru.h
> index 729a27b6ff53..ab912c49334f 100644
> --- a/include/linux/list_lru.h
> +++ b/include/linux/list_lru.h
> @@ -56,6 +56,8 @@ struct list_lru {
>  	struct list_head	list;
>  	int			shrinker_id;
>  	bool			memcg_aware;
> +	/* protects ->mlrus->mlru[i] */
> +	spinlock_t		lock;
>  	/* for cgroup aware lrus points to per cgroup lists, otherwise NULL */
>  	struct list_lru_memcg	__rcu *mlrus;
>  #endif
> @@ -72,6 +74,8 @@ int __list_lru_init(struct list_lru *lru, bool memcg_aware,
>  #define list_lru_init_memcg(lru, shrinker)		\
>  	__list_lru_init((lru), true, NULL, shrinker)
>  
> +int memcg_list_lru_alloc(struct mem_cgroup *memcg, struct list_lru *lru,
> +			 gfp_t gfp);
>  int memcg_update_all_list_lrus(int num_memcgs);
>  void memcg_drain_all_list_lrus(int src_idx, struct mem_cgroup *dst_memcg);
>  
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 0c5c403f4be6..561ba47760db 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -520,6 +520,20 @@ static inline struct mem_cgroup *page_memcg_check(struct page *page)
>  	return (struct mem_cgroup *)(memcg_data & ~MEMCG_DATA_FLAGS_MASK);
>  }
>  
> +static inline struct mem_cgroup *get_mem_cgroup_from_objcg(struct obj_cgroup *objcg)
> +{
> +	struct mem_cgroup *memcg;
> +
> +	rcu_read_lock();
> +retry:
> +	memcg = obj_cgroup_memcg(objcg);
> +	if (unlikely(!css_tryget(&memcg->css)))
> +		goto retry;
> +	rcu_read_unlock();
> +
> +	return memcg;
> +}
> +
>  #ifdef CONFIG_MEMCG_KMEM
>  /*
>   * folio_memcg_kmem - Check if the folio has the memcg_kmem flag set.
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 181045148b06..eccbd21d3753 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -135,6 +135,7 @@
>  
>  #include <linux/kasan.h>
>  
> +struct list_lru;
>  struct mem_cgroup;
>  /*
>   * struct kmem_cache related prototypes
> @@ -425,6 +426,8 @@ static __always_inline unsigned int __kmalloc_index(size_t size,
>  
>  void *__kmalloc(size_t size, gfp_t flags) __assume_kmalloc_alignment __alloc_size(1);
>  void *kmem_cache_alloc(struct kmem_cache *s, gfp_t flags) __assume_slab_alignment __malloc;
> +void *kmem_cache_alloc_lru(struct kmem_cache *s, struct list_lru *lru,
> +			   gfp_t gfpflags) __assume_slab_alignment __malloc;
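
(For context: later patches in the series, e.g. 07/16 for the dcache, convert
callers to the new prototype quoted above. The snippet below is illustrative
only, not quoted from those patches:)

	/* illustrative caller conversion: pass the superblock's dentry LRU
	 * so its per-memcg list can be allocated on first use */
	dentry = kmem_cache_alloc(dentry_cache, GFP_KERNEL);		/* before */
	dentry = kmem_cache_alloc_lru(dentry_cache, &sb->s_dentry_lru,
				      GFP_KERNEL);			/* after  */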

I'm not a big fan of this patch: I don't see why preparing the lru
infrastructure has to be integrated that deep into the slab code.

Why can't kmem_cache_alloc_lru() be a simple wrapper like (pseudo-code):
  void *kmem_cache_alloc_lru(struct kmem_cache *s, struct list_lru *lru,
			   gfp_t gfpflags) {
	if (necessary)
	   prepare_lru_infra();
	return kmem_cache_alloc(s, gfpflags);
  }
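
(A slightly more concrete sketch of that wrapper idea, assuming the
memcg_list_lru_alloc() and get_mem_cgroup_from_objcg() helpers introduced by
this patch; error handling and the exact hook point are glossed over, so this
is an illustration rather than what the patch actually implements:)

  void *kmem_cache_alloc_lru(struct kmem_cache *s, struct list_lru *lru,
			     gfp_t gfpflags)
  {
	if (lru && memcg_kmem_enabled()) {
		struct obj_cgroup *objcg = get_obj_cgroup_from_current();

		if (objcg) {
			struct mem_cgroup *memcg;

			memcg = get_mem_cgroup_from_objcg(objcg);
			/* allocate the per-memcg list on first use */
			memcg_list_lru_alloc(memcg, lru, gfpflags);
			css_put(&memcg->css);
			obj_cgroup_put(objcg);
		}
	}
	return kmem_cache_alloc(s, gfpflags);
  }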

In its current form the patch breaks the API layering. Maybe that's strictly
necessary, but we should have a __very__ strong reason for it.

Thanks!

cc Slab maintainers

Thread overview: 45+ messages
2021-12-20  8:56 [PATCH v5 00/16] Optimize list lru memory consumption Muchun Song
2021-12-20  8:56 ` [PATCH v5 01/16] mm: list_lru: optimize memory consumption of arrays of per cgroup lists Muchun Song
2022-01-07  0:05   ` Roman Gushchin
2022-01-09  4:49     ` Muchun Song
2022-01-10 18:42       ` Roman Gushchin
2022-01-11  3:19         ` Muchun Song
2021-12-20  8:56 ` [PATCH v5 02/16] mm: introduce kmem_cache_alloc_lru Muchun Song
2022-01-07  3:04   ` Roman Gushchin [this message]
2022-01-09  6:21     ` Muchun Song
2022-01-10 18:47       ` Roman Gushchin
2022-01-11 15:41         ` Vlastimil Babka
2022-01-11 17:54           ` Roman Gushchin
2021-12-20  8:56 ` [PATCH v5 03/16] fs: introduce alloc_inode_sb() to allocate filesystems specific inode Muchun Song
2022-01-11 18:55   ` Roman Gushchin
2022-01-12  2:54     ` Muchun Song
2021-12-20  8:56 ` [PATCH v5 04/16] fs: allocate inode by using alloc_inode_sb() Muchun Song
2022-01-11 18:58   ` Roman Gushchin
2022-01-12  2:55     ` Muchun Song
2021-12-20  8:56 ` [PATCH v5 05/16] f2fs: " Muchun Song
2022-01-11 19:03   ` Roman Gushchin
2021-12-20  8:56 ` [PATCH v5 06/16] nfs42: use a specific kmem_cache to allocate nfs4_xattr_entry Muchun Song
2021-12-20  8:56 ` [PATCH v5 07/16] mm: dcache: use kmem_cache_alloc_lru() to allocate dentry Muchun Song
2022-01-11 19:05   ` Roman Gushchin
2021-12-20  8:56 ` [PATCH v5 08/16] xarray: use kmem_cache_alloc_lru to allocate xa_node Muchun Song
2022-01-11 19:14   ` Roman Gushchin
2021-12-20  8:56 ` [PATCH v5 09/16] mm: memcontrol: move memcg_online_kmem() to mem_cgroup_css_online() Muchun Song
2022-01-11 19:17   ` Roman Gushchin
2021-12-20  8:56 ` [PATCH v5 10/16] mm: list_lru: allocate list_lru_one only when needed Muchun Song
2022-01-06 11:00   ` Michal Koutný
2022-01-12 13:22     ` Muchun Song
2022-01-13 13:32       ` Michal Koutný
2022-01-18 12:05         ` Muchun Song
2022-01-19  9:33           ` Michal Koutný
2022-01-21  5:28             ` Muchun Song
2022-01-11 20:00   ` Roman Gushchin
2022-01-12  4:48     ` Muchun Song
2021-12-20  8:56 ` [PATCH v5 11/16] mm: list_lru: rename memcg_drain_all_list_lrus to memcg_reparent_list_lrus Muchun Song
2021-12-20  8:56 ` [PATCH v5 12/16] mm: list_lru: replace linear array with xarray Muchun Song
2021-12-20  8:56 ` [PATCH v5 13/16] mm: memcontrol: reuse memory cgroup ID for kmem ID Muchun Song
2021-12-20  9:27   ` Mika Penttilä
2022-01-05 17:03   ` Michal Koutný
2022-01-06  3:34     ` Muchun Song
2021-12-20  8:56 ` [PATCH v5 14/16] mm: memcontrol: fix cannot alloc the maximum memcg ID Muchun Song
2021-12-20  8:56 ` [PATCH v5 15/16] mm: list_lru: rename list_lru_per_memcg to list_lru_memcg Muchun Song
2021-12-20  8:56 ` [PATCH v5 16/16] mm: memcontrol: rename memcg_cache_id to memcg_kmem_id Muchun Song
