From: Roman Gushchin <guro@fb.com>
To: Muchun Song <songmuchun@bytedance.com>
Cc: Vlastimil Babka <vbabka@suse.cz>,
	Christoph Lameter <cl@linux.com>,
	Pekka Enberg <penberg@kernel.org>,
	David Rientjes <rientjes@google.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Shakeel Butt <shakeelb@google.com>,
	Linux Memory Management List <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [External] Re: [PATCH v2] mm: memcg/slab: fix memory leak at non-root kmem_cache destroy
Date: Thu, 16 Jul 2020 09:16:18 -0700
Message-ID: <20200716161507.GA4217@carbon.lan>
In-Reply-To: <CAMZfGtVMY3DqH4XxwkfY0AekD9EFvAN2xaavRUjxYK_s3yA89w@mail.gmail.com>

On Thu, Jul 16, 2020 at 11:54:37PM +0800, Muchun Song wrote:
> On Thu, Jul 16, 2020 at 11:46 PM Roman Gushchin <guro@fb.com> wrote:
> >
> > On Thu, Jul 16, 2020 at 01:07:02PM +0800, Muchun Song wrote:
> > > On Thu, Jul 16, 2020 at 1:54 AM Roman Gushchin <guro@fb.com> wrote:
> > > >
> > > > On Thu, Jul 16, 2020 at 12:50:22AM +0800, Muchun Song wrote:
> > > > > > If the kmem_cache refcount is greater than one, we should not
> > > > > > mark the root kmem_cache as dying. If we incorrectly mark the root
> > > > > > kmem_cache as dying, the non-root kmem_cache can never be destroyed.
> > > > > > This results in a memory leak when the memcg is destroyed. The
> > > > > > following steps reproduce the problem.
> > > > >
> > > > >   1) Use kmem_cache_create() to create a new kmem_cache named A.
> > > > >   2) Coincidentally, the kmem_cache A is an alias for kmem_cache B,
> > > > >      so the refcount of B is just increased.
> > > > > >   3) Use kmem_cache_destroy() to destroy kmem_cache A. This only
> > > > > >      decrements B's refcount, but B is still marked as dying.
> > > > > >   4) Create a new memory cgroup and allocate memory from kmem_cache
> > > > > >      B. This creates a non-root kmem_cache for the allocation.
> > > > > >   5) When the memory cgroup created in step 4) is destroyed, the
> > > > > >      non-root kmem_cache can never be destroyed.
> > > > > >
> > > > > > Repeating steps 4) and 5) leaks a lot of memory. So mark the root
> > > > > > kmem_cache as dying only when its refcount reaches zero.
> > > > >
> > > > > Fixes: 92ee383f6daa ("mm: fix race between kmem_cache destroy, create and deactivate")
> > > > > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> > > > > Reviewed-by: Shakeel Butt <shakeelb@google.com>
> > > > > ---
> > > > >
> > > > > changelog in v2:
> > > > >  1) Fix a confusing typo in the commit log.
> > > >
> > > > Ok, now I see the problem. Thank you for fixing the commit log!
> > > >
> > > > >  2) Remove flush_memcg_workqueue() for !CONFIG_MEMCG_KMEM.
> > > > >  3) Introduce a new helper memcg_set_kmem_cache_dying() to fix a race
> > > > >     condition between flush_memcg_workqueue() and slab_unmergeable().
> > > > >
> > > > >  mm/slab_common.c | 54 +++++++++++++++++++++++++++++++++++++++++++++++-------
> > > > >  1 file changed, 47 insertions(+), 7 deletions(-)
> > > > >
> > > > > diff --git a/mm/slab_common.c b/mm/slab_common.c
> > > > > index 8c1ffbf7de45..c4958116e3fd 100644
> > > > > --- a/mm/slab_common.c
> > > > > +++ b/mm/slab_common.c
> > > > > @@ -258,6 +258,11 @@ static void memcg_unlink_cache(struct kmem_cache *s)
> > > > >               list_del(&s->memcg_params.kmem_caches_node);
> > > > >       }
> > > > >  }
> > > > > +
> > > > > +static inline bool memcg_kmem_cache_dying(struct kmem_cache *s)
> > > > > +{
> > > > > +     return is_root_cache(s) && s->memcg_params.dying;
> > > > > +}
> > > > >  #else
> > > > >  static inline int init_memcg_params(struct kmem_cache *s,
> > > > >                                   struct kmem_cache *root_cache)
> > > > > @@ -272,6 +277,11 @@ static inline void destroy_memcg_params(struct kmem_cache *s)
> > > > >  static inline void memcg_unlink_cache(struct kmem_cache *s)
> > > > >  {
> > > > >  }
> > > > > +
> > > > > +static inline bool memcg_kmem_cache_dying(struct kmem_cache *s)
> > > > > +{
> > > > > +     return false;
> > > > > +}
> > > > >  #endif /* CONFIG_MEMCG_KMEM */
> > > > >
> > > > >  /*
> > > > > @@ -326,6 +336,13 @@ int slab_unmergeable(struct kmem_cache *s)
> > > > >       if (s->refcount < 0)
> > > > >               return 1;
> > > > >
> > > > > +     /*
> > > > > +      * If the kmem_cache is dying. We should also skip this
> > > > > +      * kmem_cache.
> > > > > +      */
> > > > > +     if (memcg_kmem_cache_dying(s))
> > > > > +             return 1;
> > > > > +
> > > > >       return 0;
> > > > >  }
> > > > >
> > > > > @@ -886,12 +903,15 @@ static int shutdown_memcg_caches(struct kmem_cache *s)
> > > > >       return 0;
> > > > >  }
> > > > >
> > > > > -static void flush_memcg_workqueue(struct kmem_cache *s)
> > > > > +static void memcg_set_kmem_cache_dying(struct kmem_cache *s)
> > > > >  {
> > > > >       spin_lock_irq(&memcg_kmem_wq_lock);
> > > > >       s->memcg_params.dying = true;
> > > > >       spin_unlock_irq(&memcg_kmem_wq_lock);
> > > > > +}
> > > > >
> > > > > +static void flush_memcg_workqueue(struct kmem_cache *s)
> > > > > +{
> > > > >       /*
> > > > >        * SLAB and SLUB deactivate the kmem_caches through call_rcu. Make
> > > > >        * sure all registered rcu callbacks have been invoked.
> > > > > @@ -923,10 +943,6 @@ static inline int shutdown_memcg_caches(struct kmem_cache *s)
> > > > >  {
> > > > >       return 0;
> > > > >  }
> > > > > -
> > > > > -static inline void flush_memcg_workqueue(struct kmem_cache *s)
> > > > > -{
> > > > > -}
> > > > >  #endif /* CONFIG_MEMCG_KMEM */
> > > > >
> > > > >  void slab_kmem_cache_release(struct kmem_cache *s)
> > > > > @@ -944,8 +960,6 @@ void kmem_cache_destroy(struct kmem_cache *s)
> > > > >       if (unlikely(!s))
> > > > >               return;
> > > > >
> > > > > -     flush_memcg_workqueue(s);
> > > > > -
> > > > >       get_online_cpus();
> > > > >       get_online_mems();
> > > > >
> > > > > @@ -955,6 +969,32 @@ void kmem_cache_destroy(struct kmem_cache *s)
> > > > >       if (s->refcount)
> > > > >               goto out_unlock;
> > > > >
> > > > > +#ifdef CONFIG_MEMCG_KMEM
> > > > > +     memcg_set_kmem_cache_dying(s);
> > > > > +
> > > > > +     mutex_unlock(&slab_mutex);
> > > >
> > > > Hm, but in theory s->refcount can be increased here?
> > >
> > > I have tried my best to read all the code that operates on s->refcount.
> > > There is only one place that increases s->refcount: __kmem_cache_alias().
> > > If the kmem cache is dying, slab_unmergeable() always returns true for it,
> > > so the dying kmem cache can never be merged. Both run under the same
> > > slab_mutex protection, so I think that there is not a problem, right?
> >
> > So the problem is that you're checking s->refcount under slab_mutex,
> > then releasing the mutex and then setting the dying bit. But nothing prevents
> 
> Maybe you missed something. The dying bit is set in
> memcg_set_kmem_cache_dying(), which is called under slab_mutex
> protection. So I think there is no problem.

I'm sorry, I hadn't noticed that you fixed the race in v2.
But then we can probably drop the WARN() and the part that resets the dying flag, right?

And because it's a backport-only patch, I'd try to make it smaller
(e.g. avoid introducing new helpers) to simplify backporting to old kernels.
But it's up to you (and Andrew).

Please feel free to add
Acked-by: Roman Gushchin <guro@fb.com>
after dropping the dying flag reset part.

Thank you!
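
For readers following the thread: the invariant agreed on above (the
refcount check and the dying flag update happen under one hold of
slab_mutex, and a dying cache is never merged) can be illustrated with a
small userspace model. This is only a sketch under stated assumptions:
struct model_cache, model_slab_mutex, model_unmergeable(), model_alias()
and model_destroy() are hypothetical names, a pthread mutex stands in for
slab_mutex, and none of this is the kernel code or the patch itself.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the relevant kmem_cache fields. */
struct model_cache {
	int refcount;	/* number of users, including aliases */
	bool dying;	/* set once the last user destroys the cache */
};

/* Stand-in for slab_mutex; every access below happens under it. */
static pthread_mutex_t model_slab_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Models slab_unmergeable(): caller must hold model_slab_mutex. */
static bool model_unmergeable(const struct model_cache *s)
{
	return s->refcount < 0 || s->dying;
}

/* Models aliasing: take a reference unless the cache is unmergeable. */
static bool model_alias(struct model_cache *s)
{
	bool merged = false;

	pthread_mutex_lock(&model_slab_mutex);
	if (!model_unmergeable(s)) {
		s->refcount++;
		merged = true;
	}
	pthread_mutex_unlock(&model_slab_mutex);
	return merged;
}

/*
 * Models destroy: drop a reference and mark the cache dying only when
 * the count reaches zero, without releasing the mutex in between.
 * Returns true if this call marked the cache dying.
 */
static bool model_destroy(struct model_cache *s)
{
	bool last;

	pthread_mutex_lock(&model_slab_mutex);
	assert(s->refcount > 0);
	s->refcount--;
	last = (s->refcount == 0);
	if (last)
		s->dying = true;
	pthread_mutex_unlock(&model_slab_mutex);
	return last;
}
```

In this model, destroying an alias of a cache with refcount 2 leaves the
dying flag clear, and once the last reference is dropped a later
model_alias() call fails. Because the decrement, the zero check and the
flag update share one critical section, no alias can slip in between them.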
