linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v5.4.y, v4.19.y] mm: memcg/slab: fix memory leak at non-root kmem_cache destroy
@ 2020-07-07  6:27 Muchun Song
  2020-07-15 11:32 ` Vlastimil Babka
  2020-07-15 15:20 ` Shakeel Butt
  0 siblings, 2 replies; 9+ messages in thread
From: Muchun Song @ 2020-07-07  6:27 UTC (permalink / raw)
  To: cl, penberg, rientjes, iamjoonsoo.kim, akpm
  Cc: linux-mm, linux-kernel, shakeelb, Muchun Song

If the kmem_cache refcount is greater than one, we should not
mark the root kmem_cache as dying. If we mark the root kmem_cache
dying incorrectly, the non-root kmem_cache can never be destroyed.
It resulted in memory leak when memcg was destroyed. We can use the
following steps to reproduce.

  1) Use kmem_cache_create() to create a new kmem_cache named A.
  2) Coincidentally, the kmem_cache A is an alias for kmem_cache B,
     so the refcount of B is just increased.
  3) Use kmem_cache_destroy() to destroy the kmem_cache A, just
     decrease the B's refcount but mark the B as dying.
  4) Create a new memory cgroup and alloc memory from the kmem_cache
     A. It leads to create a non-root kmem_cache for allocating.
  5) When destroy the memory cgroup created in the step 4), the
     non-root kmem_cache can never be destroyed.

If we repeat steps 4) and 5), this will cause a lot of memory leak.
So only when refcount reach zero, we mark the root kmem_cache as dying.

Fixes: 92ee383f6daa ("mm: fix race between kmem_cache destroy, create and deactivate")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
 mm/slab_common.c | 43 +++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 41 insertions(+), 2 deletions(-)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index 8c1ffbf7de45..83ee6211aec7 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -258,6 +258,11 @@ static void memcg_unlink_cache(struct kmem_cache *s)
 		list_del(&s->memcg_params.kmem_caches_node);
 	}
 }
+
+static inline bool memcg_kmem_cache_dying(struct kmem_cache *s)
+{
+	return is_root_cache(s) && s->memcg_params.dying;
+}
 #else
 static inline int init_memcg_params(struct kmem_cache *s,
 				    struct kmem_cache *root_cache)
@@ -272,6 +277,11 @@ static inline void destroy_memcg_params(struct kmem_cache *s)
 static inline void memcg_unlink_cache(struct kmem_cache *s)
 {
 }
+
+static inline bool memcg_kmem_cache_dying(struct kmem_cache *s)
+{
+	return false;
+}
 #endif /* CONFIG_MEMCG_KMEM */
 
 /*
@@ -326,6 +336,13 @@ int slab_unmergeable(struct kmem_cache *s)
 	if (s->refcount < 0)
 		return 1;
 
+	/*
+	 * If the kmem_cache is dying. We should also skip this
+	 * kmem_cache.
+	 */
+	if (memcg_kmem_cache_dying(s))
+		return 1;
+
 	return 0;
 }
 
@@ -944,8 +961,6 @@ void kmem_cache_destroy(struct kmem_cache *s)
 	if (unlikely(!s))
 		return;
 
-	flush_memcg_workqueue(s);
-
 	get_online_cpus();
 	get_online_mems();
 
@@ -955,6 +970,30 @@ void kmem_cache_destroy(struct kmem_cache *s)
 	if (s->refcount)
 		goto out_unlock;
 
+#ifdef CONFIG_MEMCG_KMEM
+	mutex_unlock(&slab_mutex);
+
+	put_online_mems();
+	put_online_cpus();
+
+	flush_memcg_workqueue(s);
+
+	get_online_cpus();
+	get_online_mems();
+
+	mutex_lock(&slab_mutex);
+
+	if (WARN(s->refcount,
+		 "kmem_cache_destroy %s: Slab cache is still referenced\n",
+		 s->name)) {
+		/*
+		 * Reset the dying flag setted by flush_memcg_workqueue().
+		 */
+		s->memcg_params.dying = false;
+		goto out_unlock;
+	}
+#endif
+
 	err = shutdown_memcg_caches(s);
 	if (!err)
 		err = shutdown_cache(s);
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v5.4.y, v4.19.y] mm: memcg/slab: fix memory leak at non-root kmem_cache destroy
  2020-07-07  6:27 [PATCH v5.4.y, v4.19.y] mm: memcg/slab: fix memory leak at non-root kmem_cache destroy Muchun Song
@ 2020-07-15 11:32 ` Vlastimil Babka
  2020-07-15 15:13   ` [External] " Muchun Song
  2020-07-15 16:24   ` Roman Gushchin
  2020-07-15 15:20 ` Shakeel Butt
  1 sibling, 2 replies; 9+ messages in thread
From: Vlastimil Babka @ 2020-07-15 11:32 UTC (permalink / raw)
  To: Muchun Song, cl, penberg, rientjes, iamjoonsoo.kim, akpm
  Cc: linux-mm, linux-kernel, shakeelb, Roman Gushchin

On 7/7/20 8:27 AM, Muchun Song wrote:
> If the kmem_cache refcount is greater than one, we should not
> mark the root kmem_cache as dying. If we mark the root kmem_cache
> dying incorrectly, the non-root kmem_cache can never be destroyed.
> It resulted in memory leak when memcg was destroyed. We can use the
> following steps to reproduce.
> 
>   1) Use kmem_cache_create() to create a new kmem_cache named A.
>   2) Coincidentally, the kmem_cache A is an alias for kmem_cache B,
>      so the refcount of B is just increased.
>   3) Use kmem_cache_destroy() to destroy the kmem_cache A, just
>      decrease the B's refcount but mark the B as dying.
>   4) Create a new memory cgroup and alloc memory from the kmem_cache
>      A. It leads to create a non-root kmem_cache for allocating.
>   5) When destroy the memory cgroup created in the step 4), the
>      non-root kmem_cache can never be destroyed.
> 
> If we repeat steps 4) and 5), this will cause a lot of memory leak.
> So only when refcount reach zero, we mark the root kmem_cache as dying.
> 
> Fixes: 92ee383f6daa ("mm: fix race between kmem_cache destroy, create and deactivate")
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>

CC Roman, who worked in this area recently.

Also why is this marked "[PATCH v5.4.y, v4.19.y]"? Has it been fixed otherwise
in 5.5+ ?

> ---
>  mm/slab_common.c | 43 +++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 41 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index 8c1ffbf7de45..83ee6211aec7 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -258,6 +258,11 @@ static void memcg_unlink_cache(struct kmem_cache *s)
>  		list_del(&s->memcg_params.kmem_caches_node);
>  	}
>  }
> +
> +static inline bool memcg_kmem_cache_dying(struct kmem_cache *s)
> +{
> +	return is_root_cache(s) && s->memcg_params.dying;
> +}
>  #else
>  static inline int init_memcg_params(struct kmem_cache *s,
>  				    struct kmem_cache *root_cache)
> @@ -272,6 +277,11 @@ static inline void destroy_memcg_params(struct kmem_cache *s)
>  static inline void memcg_unlink_cache(struct kmem_cache *s)
>  {
>  }
> +
> +static inline bool memcg_kmem_cache_dying(struct kmem_cache *s)
> +{
> +	return false;
> +}
>  #endif /* CONFIG_MEMCG_KMEM */
>  
>  /*
> @@ -326,6 +336,13 @@ int slab_unmergeable(struct kmem_cache *s)
>  	if (s->refcount < 0)
>  		return 1;
>  
> +	/*
> +	 * If the kmem_cache is dying. We should also skip this
> +	 * kmem_cache.
> +	 */
> +	if (memcg_kmem_cache_dying(s))
> +		return 1;
> +
>  	return 0;
>  }
>  
> @@ -944,8 +961,6 @@ void kmem_cache_destroy(struct kmem_cache *s)
>  	if (unlikely(!s))
>  		return;
>  
> -	flush_memcg_workqueue(s);
> -
>  	get_online_cpus();
>  	get_online_mems();
>  
> @@ -955,6 +970,30 @@ void kmem_cache_destroy(struct kmem_cache *s)
>  	if (s->refcount)
>  		goto out_unlock;
>  
> +#ifdef CONFIG_MEMCG_KMEM
> +	mutex_unlock(&slab_mutex);
> +
> +	put_online_mems();
> +	put_online_cpus();
> +
> +	flush_memcg_workqueue(s);
> +
> +	get_online_cpus();
> +	get_online_mems();
> +
> +	mutex_lock(&slab_mutex);
> +
> +	if (WARN(s->refcount,
> +		 "kmem_cache_destroy %s: Slab cache is still referenced\n",
> +		 s->name)) {
> +		/*
> +		 * Reset the dying flag setted by flush_memcg_workqueue().
> +		 */
> +		s->memcg_params.dying = false;
> +		goto out_unlock;
> +	}
> +#endif
> +
>  	err = shutdown_memcg_caches(s);
>  	if (!err)
>  		err = shutdown_cache(s);
> 


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [External] Re: [PATCH v5.4.y, v4.19.y] mm: memcg/slab: fix memory leak at non-root kmem_cache destroy
  2020-07-15 11:32 ` Vlastimil Babka
@ 2020-07-15 15:13   ` Muchun Song
  2020-07-15 15:43     ` Vlastimil Babka
  2020-07-15 16:24   ` Roman Gushchin
  1 sibling, 1 reply; 9+ messages in thread
From: Muchun Song @ 2020-07-15 15:13 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, Linux Memory Management List, LKML, Shakeel Butt,
	Roman Gushchin

On Wed, Jul 15, 2020 at 7:32 PM Vlastimil Babka <vbabka@suse.cz> wrote:
>
> On 7/7/20 8:27 AM, Muchun Song wrote:
> > If the kmem_cache refcount is greater than one, we should not
> > mark the root kmem_cache as dying. If we mark the root kmem_cache
> > dying incorrectly, the non-root kmem_cache can never be destroyed.
> > It resulted in memory leak when memcg was destroyed. We can use the
> > following steps to reproduce.
> >
> >   1) Use kmem_cache_create() to create a new kmem_cache named A.
> >   2) Coincidentally, the kmem_cache A is an alias for kmem_cache B,
> >      so the refcount of B is just increased.
> >   3) Use kmem_cache_destroy() to destroy the kmem_cache A, just
> >      decrease the B's refcount but mark the B as dying.
> >   4) Create a new memory cgroup and alloc memory from the kmem_cache
> >      A. It leads to create a non-root kmem_cache for allocating.
> >   5) When destroy the memory cgroup created in the step 4), the
> >      non-root kmem_cache can never be destroyed.
> >
> > If we repeat steps 4) and 5), this will cause a lot of memory leak.
> > So only when refcount reach zero, we mark the root kmem_cache as dying.
> >
> > Fixes: 92ee383f6daa ("mm: fix race between kmem_cache destroy, create and deactivate")
> > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>
> CC Roman, who worked in this area recently.
>
> Also why is this marked "[PATCH v5.4.y, v4.19.y]"? Has it been fixed otherwise
> in 5.5+ ?

Because the memcg slab/slub is reworked by Roman since v5.8.
Therefore, this problem exists in v5.7 and below.

>
> > ---
> >  mm/slab_common.c | 43 +++++++++++++++++++++++++++++++++++++++++--
> >  1 file changed, 41 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/slab_common.c b/mm/slab_common.c
> > index 8c1ffbf7de45..83ee6211aec7 100644
> > --- a/mm/slab_common.c
> > +++ b/mm/slab_common.c
> > @@ -258,6 +258,11 @@ static void memcg_unlink_cache(struct kmem_cache *s)
> >               list_del(&s->memcg_params.kmem_caches_node);
> >       }
> >  }
> > +
> > +static inline bool memcg_kmem_cache_dying(struct kmem_cache *s)
> > +{
> > +     return is_root_cache(s) && s->memcg_params.dying;
> > +}
> >  #else
> >  static inline int init_memcg_params(struct kmem_cache *s,
> >                                   struct kmem_cache *root_cache)
> > @@ -272,6 +277,11 @@ static inline void destroy_memcg_params(struct kmem_cache *s)
> >  static inline void memcg_unlink_cache(struct kmem_cache *s)
> >  {
> >  }
> > +
> > +static inline bool memcg_kmem_cache_dying(struct kmem_cache *s)
> > +{
> > +     return false;
> > +}
> >  #endif /* CONFIG_MEMCG_KMEM */
> >
> >  /*
> > @@ -326,6 +336,13 @@ int slab_unmergeable(struct kmem_cache *s)
> >       if (s->refcount < 0)
> >               return 1;
> >
> > +     /*
> > +      * If the kmem_cache is dying. We should also skip this
> > +      * kmem_cache.
> > +      */
> > +     if (memcg_kmem_cache_dying(s))
> > +             return 1;
> > +
> >       return 0;
> >  }
> >
> > @@ -944,8 +961,6 @@ void kmem_cache_destroy(struct kmem_cache *s)
> >       if (unlikely(!s))
> >               return;
> >
> > -     flush_memcg_workqueue(s);
> > -
> >       get_online_cpus();
> >       get_online_mems();
> >
> > @@ -955,6 +970,30 @@ void kmem_cache_destroy(struct kmem_cache *s)
> >       if (s->refcount)
> >               goto out_unlock;
> >
> > +#ifdef CONFIG_MEMCG_KMEM
> > +     mutex_unlock(&slab_mutex);
> > +
> > +     put_online_mems();
> > +     put_online_cpus();
> > +
> > +     flush_memcg_workqueue(s);
> > +
> > +     get_online_cpus();
> > +     get_online_mems();
> > +
> > +     mutex_lock(&slab_mutex);
> > +
> > +     if (WARN(s->refcount,
> > +              "kmem_cache_destroy %s: Slab cache is still referenced\n",
> > +              s->name)) {
> > +             /*
> > +              * Reset the dying flag setted by flush_memcg_workqueue().
> > +              */
> > +             s->memcg_params.dying = false;
> > +             goto out_unlock;
> > +     }
> > +#endif
> > +
> >       err = shutdown_memcg_caches(s);
> >       if (!err)
> >               err = shutdown_cache(s);
> >
>


--
Yours,
Muchun

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v5.4.y, v4.19.y] mm: memcg/slab: fix memory leak at non-root kmem_cache destroy
  2020-07-07  6:27 [PATCH v5.4.y, v4.19.y] mm: memcg/slab: fix memory leak at non-root kmem_cache destroy Muchun Song
  2020-07-15 11:32 ` Vlastimil Babka
@ 2020-07-15 15:20 ` Shakeel Butt
  2020-07-15 15:28   ` [External] " Muchun Song
  1 sibling, 1 reply; 9+ messages in thread
From: Shakeel Butt @ 2020-07-15 15:20 UTC (permalink / raw)
  To: Muchun Song
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, Linux MM, LKML

Sorry I missed this email.


On Mon, Jul 6, 2020 at 11:28 PM Muchun Song <songmuchun@bytedance.com> wrote:
>
> If the kmem_cache refcount is greater than one, we should not
> mark the root kmem_cache as dying. If we mark the root kmem_cache
> dying incorrectly, the non-root kmem_cache can never be destroyed.
> It resulted in memory leak when memcg was destroyed. We can use the
> following steps to reproduce.
>
>   1) Use kmem_cache_create() to create a new kmem_cache named A.
>   2) Coincidentally, the kmem_cache A is an alias for kmem_cache B,
>      so the refcount of B is just increased.

I definitely missed the alias kmem cache case.

>   3) Use kmem_cache_destroy() to destroy the kmem_cache A, just
>      decrease the B's refcount but mark the B as dying.
>   4) Create a new memory cgroup and alloc memory from the kmem_cache
>      A. It leads to create a non-root kmem_cache for allocating.

I think in (4) you meant alloc memory from kmem_cache B instead of A.
There should not be any allocation from A after kmem_cache_destroy()
in (3).

>   5) When destroy the memory cgroup created in the step 4), the
>      non-root kmem_cache can never be destroyed.
>
> If we repeat steps 4) and 5), this will cause a lot of memory leak.
> So only when refcount reach zero, we mark the root kmem_cache as dying.
>
> Fixes: 92ee383f6daa ("mm: fix race between kmem_cache destroy, create and deactivate")
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>

The patch looks fine.

Reviewed-by: Shakeel Butt <shakeelb@google.com>

> ---
>  mm/slab_common.c | 43 +++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 41 insertions(+), 2 deletions(-)
>
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index 8c1ffbf7de45..83ee6211aec7 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -258,6 +258,11 @@ static void memcg_unlink_cache(struct kmem_cache *s)
>                 list_del(&s->memcg_params.kmem_caches_node);
>         }
>  }
> +
> +static inline bool memcg_kmem_cache_dying(struct kmem_cache *s)
> +{
> +       return is_root_cache(s) && s->memcg_params.dying;
> +}
>  #else
>  static inline int init_memcg_params(struct kmem_cache *s,
>                                     struct kmem_cache *root_cache)
> @@ -272,6 +277,11 @@ static inline void destroy_memcg_params(struct kmem_cache *s)
>  static inline void memcg_unlink_cache(struct kmem_cache *s)
>  {
>  }
> +
> +static inline bool memcg_kmem_cache_dying(struct kmem_cache *s)
> +{
> +       return false;
> +}
>  #endif /* CONFIG_MEMCG_KMEM */
>
>  /*
> @@ -326,6 +336,13 @@ int slab_unmergeable(struct kmem_cache *s)
>         if (s->refcount < 0)
>                 return 1;
>
> +       /*
> +        * If the kmem_cache is dying. We should also skip this
> +        * kmem_cache.
> +        */
> +       if (memcg_kmem_cache_dying(s))
> +               return 1;
> +
>         return 0;
>  }
>
> @@ -944,8 +961,6 @@ void kmem_cache_destroy(struct kmem_cache *s)
>         if (unlikely(!s))
>                 return;
>
> -       flush_memcg_workqueue(s);
> -
>         get_online_cpus();
>         get_online_mems();
>
> @@ -955,6 +970,30 @@ void kmem_cache_destroy(struct kmem_cache *s)
>         if (s->refcount)
>                 goto out_unlock;
>
> +#ifdef CONFIG_MEMCG_KMEM
> +       mutex_unlock(&slab_mutex);
> +
> +       put_online_mems();
> +       put_online_cpus();
> +
> +       flush_memcg_workqueue(s);
> +
> +       get_online_cpus();
> +       get_online_mems();
> +
> +       mutex_lock(&slab_mutex);
> +
> +       if (WARN(s->refcount,
> +                "kmem_cache_destroy %s: Slab cache is still referenced\n",
> +                s->name)) {
> +               /*
> +                * Reset the dying flag setted by flush_memcg_workqueue().
> +                */
> +               s->memcg_params.dying = false;
> +               goto out_unlock;
> +       }
> +#endif
> +
>         err = shutdown_memcg_caches(s);
>         if (!err)
>                 err = shutdown_cache(s);
> --
> 2.11.0
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [External] Re: [PATCH v5.4.y, v4.19.y] mm: memcg/slab: fix memory leak at non-root kmem_cache destroy
  2020-07-15 15:20 ` Shakeel Butt
@ 2020-07-15 15:28   ` Muchun Song
  0 siblings, 0 replies; 9+ messages in thread
From: Muchun Song @ 2020-07-15 15:28 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, Linux MM, LKML

On Wed, Jul 15, 2020 at 11:21 PM Shakeel Butt <shakeelb@google.com> wrote:
>
> Sorry I missed this email.
>
>
> On Mon, Jul 6, 2020 at 11:28 PM Muchun Song <songmuchun@bytedance.com> wrote:
> >
> > If the kmem_cache refcount is greater than one, we should not
> > mark the root kmem_cache as dying. If we mark the root kmem_cache
> > dying incorrectly, the non-root kmem_cache can never be destroyed.
> > It resulted in memory leak when memcg was destroyed. We can use the
> > following steps to reproduce.
> >
> >   1) Use kmem_cache_create() to create a new kmem_cache named A.
> >   2) Coincidentally, the kmem_cache A is an alias for kmem_cache B,
> >      so the refcount of B is just increased.
>
> I definitely missed the alias kmem cache case.
>
> >   3) Use kmem_cache_destroy() to destroy the kmem_cache A, just
> >      decrease the B's refcount but mark the B as dying.
> >   4) Create a new memory cgroup and alloc memory from the kmem_cache
> >      A. It leads to create a non-root kmem_cache for allocating.
>
> I think in (4) you meant alloc memory from kmem_cache B instead of A.
> There should not be any allocation from A after kmem_cache_destroy()
> in (3).

Yeah, here is a typo. I will fix it in the next version patch.

>
> >   5) When destroy the memory cgroup created in the step 4), the
> >      non-root kmem_cache can never be destroyed.
> >
> > If we repeat steps 4) and 5), this will cause a lot of memory leak.
> > So only when refcount reach zero, we mark the root kmem_cache as dying.
> >
> > Fixes: 92ee383f6daa ("mm: fix race between kmem_cache destroy, create and deactivate")
> > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>
> The patch looks fine.
>
> Reviewed-by: Shakeel Butt <shakeelb@google.com>
>
> > ---
> >  mm/slab_common.c | 43 +++++++++++++++++++++++++++++++++++++++++--
> >  1 file changed, 41 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/slab_common.c b/mm/slab_common.c
> > index 8c1ffbf7de45..83ee6211aec7 100644
> > --- a/mm/slab_common.c
> > +++ b/mm/slab_common.c
> > @@ -258,6 +258,11 @@ static void memcg_unlink_cache(struct kmem_cache *s)
> >                 list_del(&s->memcg_params.kmem_caches_node);
> >         }
> >  }
> > +
> > +static inline bool memcg_kmem_cache_dying(struct kmem_cache *s)
> > +{
> > +       return is_root_cache(s) && s->memcg_params.dying;
> > +}
> >  #else
> >  static inline int init_memcg_params(struct kmem_cache *s,
> >                                     struct kmem_cache *root_cache)
> > @@ -272,6 +277,11 @@ static inline void destroy_memcg_params(struct kmem_cache *s)
> >  static inline void memcg_unlink_cache(struct kmem_cache *s)
> >  {
> >  }
> > +
> > +static inline bool memcg_kmem_cache_dying(struct kmem_cache *s)
> > +{
> > +       return false;
> > +}
> >  #endif /* CONFIG_MEMCG_KMEM */
> >
> >  /*
> > @@ -326,6 +336,13 @@ int slab_unmergeable(struct kmem_cache *s)
> >         if (s->refcount < 0)
> >                 return 1;
> >
> > +       /*
> > +        * If the kmem_cache is dying. We should also skip this
> > +        * kmem_cache.
> > +        */
> > +       if (memcg_kmem_cache_dying(s))
> > +               return 1;
> > +
> >         return 0;
> >  }
> >
> > @@ -944,8 +961,6 @@ void kmem_cache_destroy(struct kmem_cache *s)
> >         if (unlikely(!s))
> >                 return;
> >
> > -       flush_memcg_workqueue(s);
> > -
> >         get_online_cpus();
> >         get_online_mems();
> >
> > @@ -955,6 +970,30 @@ void kmem_cache_destroy(struct kmem_cache *s)
> >         if (s->refcount)
> >                 goto out_unlock;
> >
> > +#ifdef CONFIG_MEMCG_KMEM
> > +       mutex_unlock(&slab_mutex);
> > +
> > +       put_online_mems();
> > +       put_online_cpus();
> > +
> > +       flush_memcg_workqueue(s);
> > +
> > +       get_online_cpus();
> > +       get_online_mems();
> > +
> > +       mutex_lock(&slab_mutex);
> > +
> > +       if (WARN(s->refcount,
> > +                "kmem_cache_destroy %s: Slab cache is still referenced\n",
> > +                s->name)) {
> > +               /*
> > +                * Reset the dying flag setted by flush_memcg_workqueue().
> > +                */
> > +               s->memcg_params.dying = false;
> > +               goto out_unlock;
> > +       }
> > +#endif
> > +
> >         err = shutdown_memcg_caches(s);
> >         if (!err)
> >                 err = shutdown_cache(s);
> > --
> > 2.11.0
> >



-- 
Yours,
Muchun

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [External] Re: [PATCH v5.4.y, v4.19.y] mm: memcg/slab: fix memory leak at non-root kmem_cache destroy
  2020-07-15 15:13   ` [External] " Muchun Song
@ 2020-07-15 15:43     ` Vlastimil Babka
  2020-07-15 15:55       ` Muchun Song
  0 siblings, 1 reply; 9+ messages in thread
From: Vlastimil Babka @ 2020-07-15 15:43 UTC (permalink / raw)
  To: Muchun Song
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, Linux Memory Management List, LKML, Shakeel Butt,
	Roman Gushchin

On 7/15/20 5:13 PM, Muchun Song wrote:
> On Wed, Jul 15, 2020 at 7:32 PM Vlastimil Babka <vbabka@suse.cz> wrote:
>>
>> On 7/7/20 8:27 AM, Muchun Song wrote:
>> > If the kmem_cache refcount is greater than one, we should not
>> > mark the root kmem_cache as dying. If we mark the root kmem_cache
>> > dying incorrectly, the non-root kmem_cache can never be destroyed.
>> > It resulted in memory leak when memcg was destroyed. We can use the
>> > following steps to reproduce.
>> >
>> >   1) Use kmem_cache_create() to create a new kmem_cache named A.
>> >   2) Coincidentally, the kmem_cache A is an alias for kmem_cache B,
>> >      so the refcount of B is just increased.
>> >   3) Use kmem_cache_destroy() to destroy the kmem_cache A, just
>> >      decrease the B's refcount but mark the B as dying.
>> >   4) Create a new memory cgroup and alloc memory from the kmem_cache
>> >      A. It leads to create a non-root kmem_cache for allocating.
>> >   5) When destroy the memory cgroup created in the step 4), the
>> >      non-root kmem_cache can never be destroyed.
>> >
>> > If we repeat steps 4) and 5), this will cause a lot of memory leak.
>> > So only when refcount reach zero, we mark the root kmem_cache as dying.
>> >
>> > Fixes: 92ee383f6daa ("mm: fix race between kmem_cache destroy, create and deactivate")
>> > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>>
>> CC Roman, who worked in this area recently.
>>
>> Also why is this marked "[PATCH v5.4.y, v4.19.y]"? Has it been fixed otherwise
>> in 5.5+ ?
> 
> Because the memcg slab/slub is reworked by Roman since v5.8.

That rework is in mmotm, so scheduled for 5.9, AFAIK. If you mean "The new
cgroup slab memory controller" series.

> Therefore, this problem exists in v5.7 and below.

Even 5.7 has a stable series, so no need to list only the LTS's.
To sum up, the patch (once reviewed) should be queued for mainline as usual,
perhaps sent before 5.8 is final, if deemed safe enough, and with added

Cc: <stable@vger.kernel.org>

and the Fixes: tag you provided, the applicable stable versions will pick it.

Vlastimil

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [External] Re: [PATCH v5.4.y, v4.19.y] mm: memcg/slab: fix memory leak at non-root kmem_cache destroy
  2020-07-15 15:43     ` Vlastimil Babka
@ 2020-07-15 15:55       ` Muchun Song
  0 siblings, 0 replies; 9+ messages in thread
From: Muchun Song @ 2020-07-15 15:55 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, Linux Memory Management List, LKML, Shakeel Butt,
	Roman Gushchin

On Wed, Jul 15, 2020 at 11:43 PM Vlastimil Babka <vbabka@suse.cz> wrote:
>
> On 7/15/20 5:13 PM, Muchun Song wrote:
> > On Wed, Jul 15, 2020 at 7:32 PM Vlastimil Babka <vbabka@suse.cz> wrote:
> >>
> >> On 7/7/20 8:27 AM, Muchun Song wrote:
> >> > If the kmem_cache refcount is greater than one, we should not
> >> > mark the root kmem_cache as dying. If we mark the root kmem_cache
> >> > dying incorrectly, the non-root kmem_cache can never be destroyed.
> >> > It resulted in memory leak when memcg was destroyed. We can use the
> >> > following steps to reproduce.
> >> >
> >> >   1) Use kmem_cache_create() to create a new kmem_cache named A.
> >> >   2) Coincidentally, the kmem_cache A is an alias for kmem_cache B,
> >> >      so the refcount of B is just increased.
> >> >   3) Use kmem_cache_destroy() to destroy the kmem_cache A, just
> >> >      decrease the B's refcount but mark the B as dying.
> >> >   4) Create a new memory cgroup and alloc memory from the kmem_cache
> >> >      A. It leads to create a non-root kmem_cache for allocating.
> >> >   5) When destroy the memory cgroup created in the step 4), the
> >> >      non-root kmem_cache can never be destroyed.
> >> >
> >> > If we repeat steps 4) and 5), this will cause a lot of memory leak.
> >> > So only when refcount reach zero, we mark the root kmem_cache as dying.
> >> >
> >> > Fixes: 92ee383f6daa ("mm: fix race between kmem_cache destroy, create and deactivate")
> >> > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> >>
> >> CC Roman, who worked in this area recently.
> >>
> >> Also why is this marked "[PATCH v5.4.y, v4.19.y]"? Has it been fixed otherwise
> >> in 5.5+ ?
> >
> > Because the memcg slab/slub is reworked by Roman since v5.8.
>
> That rework is in mmotm, so scheduled for 5.9, AFAIK. If you mean "The new
> cgroup slab memory controller" series.

Yeah, I mean "The new cgroup slab memory controller".

>
> > Therefore, this problem exists in v5.7 and below.
>
> Even 5.7 has a stable series, so no need to list only the LTS's.
> To sum up, the patch (once reviewed) should be queued for mainline as usual,
> perhaps sent before 5.8 is final, if deemed safe enough, and with added
>
> Cc: <stable@vger.kernel.org>
>
> and the Fixes: tag you provided, the applicable stable versions will pick it.

Got it. Thanks.

>
> Vlastimil



-- 
Yours,
Muchun

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v5.4.y, v4.19.y] mm: memcg/slab: fix memory leak at non-root kmem_cache destroy
  2020-07-15 11:32 ` Vlastimil Babka
  2020-07-15 15:13   ` [External] " Muchun Song
@ 2020-07-15 16:24   ` Roman Gushchin
  2020-07-15 16:31     ` [External] " Muchun Song
  1 sibling, 1 reply; 9+ messages in thread
From: Roman Gushchin @ 2020-07-15 16:24 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Muchun Song, cl, penberg, rientjes, iamjoonsoo.kim, akpm,
	linux-mm, linux-kernel, shakeelb

On Wed, Jul 15, 2020 at 01:32:00PM +0200, Vlastimil Babka wrote:
> On 7/7/20 8:27 AM, Muchun Song wrote:
> > If the kmem_cache refcount is greater than one, we should not
> > mark the root kmem_cache as dying. If we mark the root kmem_cache
> > dying incorrectly, the non-root kmem_cache can never be destroyed.
> > It resulted in memory leak when memcg was destroyed. We can use the
> > following steps to reproduce.
> > 
> >   1) Use kmem_cache_create() to create a new kmem_cache named A.
> >   2) Coincidentally, the kmem_cache A is an alias for kmem_cache B,
> >      so the refcount of B is just increased.
> >   3) Use kmem_cache_destroy() to destroy the kmem_cache A, just
> >      decrease the B's refcount but mark the B as dying.
> >   4) Create a new memory cgroup and alloc memory from the kmem_cache
> >      A. It leads to create a non-root kmem_cache for allocating.
> >   5) When destroy the memory cgroup created in the step 4), the
> >      non-root kmem_cache can never be destroyed.

Hello, Muchun!

If the scenario above is accurate, it means that somebody is allocating
from the kmem_cache A (or it's memcg counterparts, doesn't matter) after
calling kmem_cache_destroy()? If so, it's an API violation, and the following
memory leak is a non-issue on the slab side. No one should allocate memory
after calling kmem_cache_destroy(). It has to be called after all outstanding
allocations are freed, and it should be literally the last operation
with the kmem_cache.

Kmem_cache aliasing/sharing, as well as memcg accounting implementation are
implementation details and should not affect the picture.

I wonder, did you see the problem in the wild? How does it look like?
Which kmem_cache is involved? Etc.

BTW, Vlastimil is absolutely right about stable backports and rework planned
for 5.9, but let's figure out the problem first.

Thank you!

> > 
> > If we repeat steps 4) and 5), this will cause a lot of memory leak.
> > So only when refcount reach zero, we mark the root kmem_cache as dying.
> > 
> > Fixes: 92ee383f6daa ("mm: fix race between kmem_cache destroy, create and deactivate")
> > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> 
> CC Roman, who worked in this area recently.
> 
> Also why is this marked "[PATCH v5.4.y, v4.19.y]"? Has it been fixed otherwise
> in 5.5+ ?
> 
> > ---
> >  mm/slab_common.c | 43 +++++++++++++++++++++++++++++++++++++++++--
> >  1 file changed, 41 insertions(+), 2 deletions(-)
> > 
> > diff --git a/mm/slab_common.c b/mm/slab_common.c
> > index 8c1ffbf7de45..83ee6211aec7 100644
> > --- a/mm/slab_common.c
> > +++ b/mm/slab_common.c
> > @@ -258,6 +258,11 @@ static void memcg_unlink_cache(struct kmem_cache *s)
> >  		list_del(&s->memcg_params.kmem_caches_node);
> >  	}
> >  }
> > +
> > +static inline bool memcg_kmem_cache_dying(struct kmem_cache *s)
> > +{
> > +	return is_root_cache(s) && s->memcg_params.dying;
> > +}
> >  #else
> >  static inline int init_memcg_params(struct kmem_cache *s,
> >  				    struct kmem_cache *root_cache)
> > @@ -272,6 +277,11 @@ static inline void destroy_memcg_params(struct kmem_cache *s)
> >  static inline void memcg_unlink_cache(struct kmem_cache *s)
> >  {
> >  }
> > +
> > +static inline bool memcg_kmem_cache_dying(struct kmem_cache *s)
> > +{
> > +	return false;
> > +}
> >  #endif /* CONFIG_MEMCG_KMEM */
> >  
> >  /*
> > @@ -326,6 +336,13 @@ int slab_unmergeable(struct kmem_cache *s)
> >  	if (s->refcount < 0)
> >  		return 1;
> >  
> > +	/*
> > +	 * If the kmem_cache is dying. We should also skip this
> > +	 * kmem_cache.
> > +	 */
> > +	if (memcg_kmem_cache_dying(s))
> > +		return 1;
> > +
> >  	return 0;
> >  }
> >  
> > @@ -944,8 +961,6 @@ void kmem_cache_destroy(struct kmem_cache *s)
> >  	if (unlikely(!s))
> >  		return;
> >  
> > -	flush_memcg_workqueue(s);
> > -
> >  	get_online_cpus();
> >  	get_online_mems();
> >  
> > @@ -955,6 +970,30 @@ void kmem_cache_destroy(struct kmem_cache *s)
> >  	if (s->refcount)
> >  		goto out_unlock;
> >  
> > +#ifdef CONFIG_MEMCG_KMEM
> > +	mutex_unlock(&slab_mutex);
> > +
> > +	put_online_mems();
> > +	put_online_cpus();
> > +
> > +	flush_memcg_workqueue(s);
> > +
> > +	get_online_cpus();
> > +	get_online_mems();
> > +
> > +	mutex_lock(&slab_mutex);
> > +
> > +	if (WARN(s->refcount,
> > +		 "kmem_cache_destroy %s: Slab cache is still referenced\n",
> > +		 s->name)) {
> > +		/*
> > +		 * Reset the dying flag setted by flush_memcg_workqueue().
> > +		 */
> > +		s->memcg_params.dying = false;
> > +		goto out_unlock;
> > +	}
> > +#endif
> > +
> >  	err = shutdown_memcg_caches(s);
> >  	if (!err)
> >  		err = shutdown_cache(s);
> > 
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [External] Re: [PATCH v5.4.y, v4.19.y] mm: memcg/slab: fix memory leak at non-root kmem_cache destroy
  2020-07-15 16:24   ` Roman Gushchin
@ 2020-07-15 16:31     ` Muchun Song
  0 siblings, 0 replies; 9+ messages in thread
From: Muchun Song @ 2020-07-15 16:31 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: Vlastimil Babka, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, Linux Memory Management List, LKML,
	Shakeel Butt

On Thu, Jul 16, 2020 at 12:24 AM Roman Gushchin <guro@fb.com> wrote:
>
> On Wed, Jul 15, 2020 at 01:32:00PM +0200, Vlastimil Babka wrote:
> > On 7/7/20 8:27 AM, Muchun Song wrote:
> > > If the kmem_cache refcount is greater than one, we should not
> > > mark the root kmem_cache as dying. If we mark the root kmem_cache
> > > dying incorrectly, the non-root kmem_cache can never be destroyed.
> > > It resulted in memory leak when memcg was destroyed. We can use the
> > > following steps to reproduce.
> > >
> > >   1) Use kmem_cache_create() to create a new kmem_cache named A.
> > >   2) Coincidentally, the kmem_cache A is an alias for kmem_cache B,
> > >      so the refcount of B is just increased.
> > >   3) Use kmem_cache_destroy() to destroy the kmem_cache A, just
> > >      decrease the B's refcount but mark the B as dying.
> > >   4) Create a new memory cgroup and alloc memory from the kmem_cache
> > >      A. It leads to create a non-root kmem_cache for allocating.

Hi Roman,

I am sorry, here is a typo. I mean the step 4) allocates memory from
the kmem_cache
B instead of A.

> > >   5) When destroy the memory cgroup created in the step 4), the
> > >      non-root kmem_cache can never be destroyed.
>
> Hello, Muchun!
>
> If the scenario above is accurate, it means that somebody is allocating
> from the kmem_cache A (or it's memcg counterparts, doesn't matter) after
> calling kmem_cache_destroy()? If so, it's an API violation, and the following
> memory leak is a non-issue on the slab side. No one should allocate memory
> after calling kmem_cache_destroy(). It has to be called after all outstanding
> allocations are freed, and it should be literally the last operation
> with the kmem_cache.
>
> Kmem_cache aliasing/sharing, as well as memcg accounting implementation are
> implementation details and should not affect the picture.
>
> I wonder, did you see the problem in the wild? How does it look like?
> Which kmem_cache is involved? Etc.
>
> BTW, Vlastimil is absolutely right about stable backports and rework planned
> for 5.9, but let's figure out the problem first.
>
> Thank you!
>
> > >
> > > If we repeat steps 4) and 5), this will cause a lot of memory leak.
> > > So only when refcount reach zero, we mark the root kmem_cache as dying.
> > >
> > > Fixes: 92ee383f6daa ("mm: fix race between kmem_cache destroy, create and deactivate")
> > > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> >
> > CC Roman, who worked in this area recently.
> >
> > Also why is this marked "[PATCH v5.4.y, v4.19.y]"? Has it been fixed otherwise
> > in 5.5+ ?
> >
> > > ---
> > >  mm/slab_common.c | 43 +++++++++++++++++++++++++++++++++++++++++--
> > >  1 file changed, 41 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/mm/slab_common.c b/mm/slab_common.c
> > > index 8c1ffbf7de45..83ee6211aec7 100644
> > > --- a/mm/slab_common.c
> > > +++ b/mm/slab_common.c
> > > @@ -258,6 +258,11 @@ static void memcg_unlink_cache(struct kmem_cache *s)
> > >             list_del(&s->memcg_params.kmem_caches_node);
> > >     }
> > >  }
> > > +
> > > +static inline bool memcg_kmem_cache_dying(struct kmem_cache *s)
> > > +{
> > > +   return is_root_cache(s) && s->memcg_params.dying;
> > > +}
> > >  #else
> > >  static inline int init_memcg_params(struct kmem_cache *s,
> > >                                 struct kmem_cache *root_cache)
> > > @@ -272,6 +277,11 @@ static inline void destroy_memcg_params(struct kmem_cache *s)
> > >  static inline void memcg_unlink_cache(struct kmem_cache *s)
> > >  {
> > >  }
> > > +
> > > +static inline bool memcg_kmem_cache_dying(struct kmem_cache *s)
> > > +{
> > > +   return false;
> > > +}
> > >  #endif /* CONFIG_MEMCG_KMEM */
> > >
> > >  /*
> > > @@ -326,6 +336,13 @@ int slab_unmergeable(struct kmem_cache *s)
> > >     if (s->refcount < 0)
> > >             return 1;
> > >
> > > +   /*
> > > +    * If the kmem_cache is dying. We should also skip this
> > > +    * kmem_cache.
> > > +    */
> > > +   if (memcg_kmem_cache_dying(s))
> > > +           return 1;
> > > +
> > >     return 0;
> > >  }
> > >
> > > @@ -944,8 +961,6 @@ void kmem_cache_destroy(struct kmem_cache *s)
> > >     if (unlikely(!s))
> > >             return;
> > >
> > > -   flush_memcg_workqueue(s);
> > > -
> > >     get_online_cpus();
> > >     get_online_mems();
> > >
> > > @@ -955,6 +970,30 @@ void kmem_cache_destroy(struct kmem_cache *s)
> > >     if (s->refcount)
> > >             goto out_unlock;
> > >
> > > +#ifdef CONFIG_MEMCG_KMEM
> > > +   mutex_unlock(&slab_mutex);
> > > +
> > > +   put_online_mems();
> > > +   put_online_cpus();
> > > +
> > > +   flush_memcg_workqueue(s);
> > > +
> > > +   get_online_cpus();
> > > +   get_online_mems();
> > > +
> > > +   mutex_lock(&slab_mutex);
> > > +
> > > +   if (WARN(s->refcount,
> > > +            "kmem_cache_destroy %s: Slab cache is still referenced\n",
> > > +            s->name)) {
> > > +           /*
> > > +            * Reset the dying flag setted by flush_memcg_workqueue().
> > > +            */
> > > +           s->memcg_params.dying = false;
> > > +           goto out_unlock;
> > > +   }
> > > +#endif
> > > +
> > >     err = shutdown_memcg_caches(s);
> > >     if (!err)
> > >             err = shutdown_cache(s);
> > >
> >



-- 
Yours,
Muchun

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-07-15 16:31 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-07  6:27 [PATCH v5.4.y, v4.19.y] mm: memcg/slab: fix memory leak at non-root kmem_cache destroy Muchun Song
2020-07-15 11:32 ` Vlastimil Babka
2020-07-15 15:13   ` [External] " Muchun Song
2020-07-15 15:43     ` Vlastimil Babka
2020-07-15 15:55       ` Muchun Song
2020-07-15 16:24   ` Roman Gushchin
2020-07-15 16:31     ` [External] " Muchun Song
2020-07-15 15:20 ` Shakeel Butt
2020-07-15 15:28   ` [External] " Muchun Song

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).