From: Waiman Long <longman@redhat.com>
To: Andrew Morton <akpm@linux-foundation.org>,
	Christoph Lameter <cl@linux.com>,
	Pekka Enberg <penberg@kernel.org>,
	David Rientjes <rientjes@google.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>,
	Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	cgroups@vger.kernel.org, Juri Lelli <juri.lelli@redhat.com>,
	Qian Cai <cai@lca.pw>, Waiman Long <longman@redhat.com>
Subject: [PATCH v2 3/4] mm/slub: Fix another circular locking dependency in slab_attr_store()
Date: Mon, 27 Apr 2020 19:56:20 -0400
Message-ID: <20200427235621.7823-4-longman@redhat.com>
In-Reply-To: <20200427235621.7823-1-longman@redhat.com>

It turns out that switching from slab_mutex to memcg_cache_ids_sem in
slab_attr_store() does not completely eliminate the circular locking
dependency, as shown by the following lockdep splat when the system is
shut down:

[ 2095.079697] Chain exists of:
[ 2095.079697]   kn->count#278 --> memcg_cache_ids_sem --> slab_mutex
[ 2095.079697]
[ 2095.090278]  Possible unsafe locking scenario:
[ 2095.090278]
[ 2095.096227]        CPU0                    CPU1
[ 2095.100779]        ----                    ----
[ 2095.105331]   lock(slab_mutex);
[ 2095.108486]                                lock(memcg_cache_ids_sem);
[ 2095.114961]                                lock(slab_mutex);
[ 2095.120649]   lock(kn->count#278);
[ 2095.124068]
[ 2095.124068]  *** DEADLOCK ***

To eliminate this possibility, we have to use trylock to acquire
memcg_cache_ids_sem. Unlike slab_mutex, which can be acquired in
many places, the memcg_cache_ids_sem write lock is only acquired
in memcg_alloc_cache_id() to double the size of memcg_nr_cache_ids.
So the chance of successive calls to memcg_alloc_cache_id() within
a short time is pretty low. As a result, we can retry the read lock
acquisition a few times if the first attempt fails. If the lock still
cannot be acquired after three retries spaced 100ms apart,
slab_attr_store() returns -EBUSY.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 include/linux/memcontrol.h |  1 +
 mm/memcontrol.c            |  5 +++++
 mm/slub.c                  | 25 +++++++++++++++++++++++--
 3 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index d275c72c4f8e..9285f14965b1 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -1379,6 +1379,7 @@ extern struct workqueue_struct *memcg_kmem_cache_wq;
 extern int memcg_nr_cache_ids;
 void memcg_get_cache_ids(void);
 void memcg_put_cache_ids(void);
+int  memcg_tryget_cache_ids(void);
 
 /*
  * Helper macro to loop through all memcg-specific caches. Callers must still
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 5beea03dd58a..9fa8535ff72a 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -279,6 +279,11 @@ void memcg_get_cache_ids(void)
 	down_read(&memcg_cache_ids_sem);
 }
 
+int memcg_tryget_cache_ids(void)
+{
+	return down_read_trylock(&memcg_cache_ids_sem);
+}
+
 void memcg_put_cache_ids(void)
 {
 	up_read(&memcg_cache_ids_sem);
diff --git a/mm/slub.c b/mm/slub.c
index 44cb5215c17f..cf2114ca27f7 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -34,6 +34,7 @@
 #include <linux/prefetch.h>
 #include <linux/memcontrol.h>
 #include <linux/random.h>
+#include <linux/delay.h>
 
 #include <trace/events/kmem.h>
 
@@ -5572,6 +5573,7 @@ static ssize_t slab_attr_store(struct kobject *kobj,
 	    !list_empty(&s->memcg_params.children)) {
 		struct kmem_cache *c, **pcaches;
 		int idx, max, cnt = 0;
+		int retries = 3;
 		size_t size, old = s->max_attr_size;
 		struct memcg_cache_array *arr;
 
@@ -5585,9 +5587,28 @@ static ssize_t slab_attr_store(struct kobject *kobj,
 			old = cmpxchg(&s->max_attr_size, size, len);
 		} while (old != size);
 
-		memcg_get_cache_ids();
-		max = memcg_nr_cache_ids;
+		/*
+		 * To avoid the following circular lock chain
+		 *
+		 *   kn->count#278 --> memcg_cache_ids_sem --> slab_mutex
+		 *
+		 * We need to use trylock to acquire memcg_cache_ids_sem.
+		 *
+		 * Since the write lock is acquired only in
+		 * memcg_alloc_cache_id() to double the size of
+		 * memcg_nr_cache_ids, the chance of successive
+		 * memcg_alloc_cache_id() calls within a short time is
+		 * very low except at the beginning where the number of
+		 * memory cgroups is low. So we retry a few times to get
+		 * the memcg_cache_ids_sem read lock.
+		 */
+		while (!memcg_tryget_cache_ids()) {
+			if (retries-- <= 0)
+				return -EBUSY;
+			msleep(100);
+		}
 
+		max = memcg_nr_cache_ids;
 		pcaches = kmalloc_array(max, sizeof(void *), GFP_KERNEL);
 		if (!pcaches) {
 			memcg_put_cache_ids();
-- 
2.18.1
