From: Roman Gushchin <guro@fb.com>
To: Waiman Long <longman@redhat.com>
Cc: Shakeel Butt <shakeelb@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linux MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Kernel Team <Kernel-team@fb.com>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>, Rik van Riel <riel@surriel.com>,
	Christoph Lameter <cl@linux.com>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	Cgroups <cgroups@vger.kernel.org>
Subject: Re: [PATCH v4 5/7] mm: rework non-root kmem_cache lifecycle management
Date: Tue, 21 May 2019 19:23:28 +0000
Message-ID: <20190521192320.GA6658@tower.DHCP.thefacebook.com>
In-Reply-To: <7d06354d-4542-af42-d83d-2bc4639b56f2@redhat.com>

On Tue, May 21, 2019 at 02:39:50PM -0400, Waiman Long wrote:
> On 5/14/19 8:06 PM, Shakeel Butt wrote:
> >> @@ -2651,20 +2652,35 @@ struct kmem_cache *memcg_kmem_get_cache(struct kmem_cache *cachep)
> >>         struct mem_cgroup *memcg;
> >>         struct kmem_cache *memcg_cachep;
> >>         int kmemcg_id;
> >> +       struct memcg_cache_array *arr;
> >>
> >>         VM_BUG_ON(!is_root_cache(cachep));
> >>
> >>         if (memcg_kmem_bypass())
> >>                 return cachep;
> >>
> >> -       memcg = get_mem_cgroup_from_current();
> >> +       rcu_read_lock();
> >> +
> >> +       if (unlikely(current->active_memcg))
> >> +               memcg = current->active_memcg;
> >> +       else
> >> +               memcg = mem_cgroup_from_task(current);
> >> +
> >> +       if (!memcg || memcg == root_mem_cgroup)
> >> +               goto out_unlock;
> >> +
> >>         kmemcg_id = READ_ONCE(memcg->kmemcg_id);
> >>         if (kmemcg_id < 0)
> >> -               goto out;
> >> +               goto out_unlock;
> >>
> >> -       memcg_cachep = cache_from_memcg_idx(cachep, kmemcg_id);
> >> -       if (likely(memcg_cachep))
> >> -               return memcg_cachep;
> >> +       arr = rcu_dereference(cachep->memcg_params.memcg_caches);
> >> +
> >> +       /*
> >> +        * Make sure we will access the up-to-date value. The code updating
> >> +        * memcg_caches issues a write barrier to match this (see
> >> +        * memcg_create_kmem_cache()).
> >> +        */
> >> +       memcg_cachep = READ_ONCE(arr->entries[kmemcg_id]);
> >>
> >>         /*
> >>          * If we are in a safe context (can wait, and not in interrupt
> >> @@ -2677,10 +2693,20 @@ struct kmem_cache *memcg_kmem_get_cache(struct kmem_cache *cachep)
> >>          * memcg_create_kmem_cache, this means no further allocation
> >>          * could happen with the slab_mutex held. So it's better to
> >>          * defer everything.
> >> +        *
> >> +        * If the memcg is dying or memcg_cache is about to be released,
> >> +        * don't bother creating new kmem_caches. Because memcg_cachep
> >> +        * is ZEROed as the first step of kmem offlining, we don't need
> >> +        * percpu_ref_tryget() here. css_tryget_online() check in
> > *percpu_ref_tryget_live()
> >
> >> +        * memcg_schedule_kmem_cache_create() will prevent us from
> >> +        * creation of a new kmem_cache.
> >>          */
> >> -       memcg_schedule_kmem_cache_create(memcg, cachep);
> >> -out:
> >> -       css_put(&memcg->css);
> >> +       if (unlikely(!memcg_cachep))
> >> +               memcg_schedule_kmem_cache_create(memcg, cachep);
> >> +       else if (percpu_ref_tryget(&memcg_cachep->memcg_params.refcnt))
> >> +               cachep = memcg_cachep;
> >> +out_unlock:
> >> +       rcu_read_lock();
> 
> There is one more bug that causes the kernel to panic on bootup when I
> turned on debugging options.
> 
> [   49.871437] =============================
> [   49.875452] WARNING: suspicious RCU usage
> [   49.879476] 5.2.0-rc1.bz1699202_memcg_test+ #2 Not tainted
> [   49.884967] -----------------------------
> [   49.888991] include/linux/rcupdate.h:268 Illegal context switch in RCU read-side critical section!
> [   49.897950]
> [   49.897950] other info that might help us debug this:
> [   49.897950]
> [   49.905958]
> [   49.905958] rcu_scheduler_active = 2, debug_locks = 1
> [   49.912492] 3 locks held by systemd/1:
> [   49.916252]  #0: 00000000633673c5 (&type->i_mutex_dir_key#5){.+.+}, at: lookup_slow+0x42/0x70
> [   49.924788]  #1: 0000000029fa8c75 (rcu_read_lock){....}, at: memcg_kmem_get_cache+0x12b/0x910
> [   49.933316]  #2: 0000000029fa8c75 (rcu_read_lock){....}, at: memcg_kmem_get_cache+0x3da/0x910
> 
> It should be "rcu_read_unlock();" at the end.

Oops. Good catch, thanks Waiman!

I'm somewhat surprised it didn't show up in my tests, and that none of the
test bots caught it either. Anyway, I'll fix it and send v5.
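
To illustrate the imbalance, here is a minimal user-space sketch. It is not
kernel code: model_rcu_depth, model_rcu_read_lock(), model_rcu_read_unlock(),
lookup_v4() and lookup_fixed() are hypothetical stand-ins that only model the
read-side nesting depth, assuming each lock call adds one level and each
unlock call removes one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the RCU read-side nesting depth. */
static int model_rcu_depth;

static void model_rcu_read_lock(void)
{
	model_rcu_depth++;
}

static void model_rcu_read_unlock(void)
{
	assert(model_rcu_depth > 0);	/* unlock without a matching lock */
	model_rcu_depth--;
}

/* Mirrors the v4 exit path: the out_unlock label takes the lock again. */
static int lookup_v4(bool have_memcg)
{
	model_rcu_depth = 0;
	model_rcu_read_lock();
	if (!have_memcg)
		goto out_unlock;
	/* ... the cache lookup would happen here ... */
out_unlock:
	model_rcu_read_lock();		/* bug: should be the unlock */
	return model_rcu_depth;		/* depth left behind for the caller */
}

/* Same control flow with the exit path releasing the read-side section. */
static int lookup_fixed(bool have_memcg)
{
	model_rcu_depth = 0;
	model_rcu_read_lock();
	if (!have_memcg)
		goto out_unlock;
	/* ... the cache lookup would happen here ... */
out_unlock:
	model_rcu_read_unlock();
	return model_rcu_depth;
}
```

In this model lookup_v4() returns with a depth of 2 on both the early-exit and
the normal path, which lines up with the two rcu_read_lock entries held inside
memcg_kmem_get_cache() in the report above, while lookup_fixed() returns with
a depth of 0.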

Does the rest of the patchset look sane to you?

Thank you!

Roman


Thread overview: 22+ messages
2019-05-14 21:39 [PATCH v4 0/7] mm: reparent slab memory on cgroup removal Roman Gushchin
2019-05-14 21:39 ` [PATCH v4 1/7] mm: postpone kmem_cache memcg pointer initialization to memcg_link_cache() Roman Gushchin
2019-05-14 21:39 ` [PATCH v4 2/7] mm: generalize postponed non-root kmem_cache deactivation Roman Gushchin
2019-05-14 21:39 ` [PATCH v4 3/7] mm: introduce __memcg_kmem_uncharge_memcg() Roman Gushchin
2019-05-14 21:39 ` [PATCH v4 4/7] mm: unify SLAB and SLUB page accounting Roman Gushchin
2019-05-14 21:39 ` [PATCH v4 5/7] mm: rework non-root kmem_cache lifecycle management Roman Gushchin
2019-05-15  0:06   ` Shakeel Butt
2019-05-20 14:54     ` Waiman Long
2019-05-20 17:56       ` Roman Gushchin
2019-05-21 18:39     ` Waiman Long
2019-05-21 19:23       ` Roman Gushchin [this message]
2019-05-21 19:35         ` Waiman Long
2019-05-15 14:00   ` Christopher Lameter
2019-05-15 14:11     ` Shakeel Butt
2019-05-23  0:58   ` [mm] e52271917f: BUG:sleeping_function_called_from_invalid_context_at_mm/slab.h kernel test robot
2019-05-23 21:00     ` Roman Gushchin
2019-05-14 21:39 ` [PATCH v4 6/7] mm: reparent slab memory on cgroup removal Roman Gushchin
2019-05-15  0:10   ` Shakeel Butt
2019-05-14 21:39 ` [PATCH v4 7/7] mm: fix /proc/kpagecgroup interface for slab pages Roman Gushchin
2019-05-15  0:16   ` Shakeel Butt
2019-06-05  7:39 ` [PATCH v4 0/7] mm: reparent slab memory on cgroup removal Greg Thelen
2019-06-05 17:33   ` Roman Gushchin
