* [RFC 0/3] mm, slab, slub: remove cpu and memory hotplug locks
@ 2021-01-06 17:40 Vlastimil Babka
From: Vlastimil Babka @ 2021-01-06 17:40 UTC
  To: linux-mm
  Cc: linux-kernel, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Vladimir Davydov, Qian Cai, David Hildenbrand,
	Michal Hocko, Vlastimil Babka

Hi,

some related work caused me to look at how we use get/put_online_mems() and
get/put_online_cpus() during kmem cache creation/destruction/shrinking, and I
realized that it should actually be safe to remove all of that with rather
small effort (as e.g. Michal Hocko already suspected in some past
discussions). This has the benefit of avoiding rather heavy locks that have
caused lock ordering issues in the past. So this is the result: Patches 1 and
2 remove the memory hotplug and cpu hotplug locking, respectively. Patch 3
stems from the realization that some races in fact exist despite the locks
(even if they were not removed), but the sanest solution is not to introduce
more locking; instead, accept some wasted memory in scenarios that should be
rare anyway (full memory hot remove), as we already do the same in other
contexts. It's all RFC for now, as I might have missed some reason why it's
not safe.

Vlastimil Babka (3):
  mm, slab, slub: stop taking memory hotplug lock
  mm, slab, slub: stop taking cpu hotplug lock
  mm, slub: stop freeing kmem_cache_node structures on node offline

 mm/slab_common.c | 20 ++++--------------
 mm/slub.c        | 54 ++++++++++++++++++++++++++++++++----------------
 2 files changed, 40 insertions(+), 34 deletions(-)

-- 
2.29.2



* [RFC 1/3] mm, slab, slub: stop taking memory hotplug lock
From: Vlastimil Babka @ 2021-01-06 17:40 UTC
  To: linux-mm
  Cc: linux-kernel, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Vladimir Davydov, Qian Cai, David Hildenbrand,
	Michal Hocko, Vlastimil Babka

Since commit 03afc0e25f7f ("slab: get_online_mems for
kmem_cache_{create,destroy,shrink}") we take the memory hotplug lock for SLAB
and SLUB when creating, destroying or shrinking a cache. It is quite a heavy
lock and it's best to avoid it if possible, as we have had several issues in
the past with lockdep complaining about the ordering; see e.g. e4f8e513c3d3
("mm/slub: fix a deadlock in show_slab_objects()").

The problem scenario in 03afc0e25f7f (solved by the memory hotplug lock) can
be summarized as follows: even though slab_mutex synchronizes new kmem cache
creation with SLUB's MEM_GOING_ONLINE callback
slab_mem_going_online_callback(), we may miss the creation of kmem_cache_node
for the hotplugged node in the new kmem cache: the hotplug callback doesn't
yet see the new cache, and cache creation in init_kmem_cache_nodes() only
inits kmem_cache_node for nodes in the N_NORMAL_MEMORY nodemask, which may
not yet include the new node, as that happens only later, after the
MEM_GOING_ONLINE callback.

Instead of using get/put_online_mems(), the problem can be solved by SLUB
maintaining its own nodemask of nodes for which it has allocated the per-node
kmem_cache_node structures. This nodemask generally mirrors the
N_NORMAL_MEMORY nodemask, but is updated only under SLUB's control, in its
memory hotplug callbacks, under the slab_mutex. This patch adds such a
nodemask and its handling.

Commit 03afc0e25f7f mentions "issues like [the one above]", but there don't
appear to be further issues. All the paths (shared by SLAB and SLUB) that take
the memory hotplug locks also take the slab_mutex, except kmem_cache_shrink(),
where 03afc0e25f7f replaced slab_mutex with get/put_online_mems(). So this
patch restores slab_mutex in kmem_cache_shrink(). slab_mutex should otherwise
be sufficient, as all the memory hotplug callbacks in SLUB take it as well.

SLAB also doesn't need the memory hotplug locking, which it only gained
through 03afc0e25f7f via the shared paths in slab_common.c. Its memory hotplug
callbacks are also protected by slab_mutex against races with these paths. The
problem of SLUB relying on N_NORMAL_MEMORY doesn't apply to SLAB, as its
setup_kmem_cache_nodes() relies on N_ONLINE, and the new node is already set
there during the MEM_GOING_ONLINE callback, so no special care is needed
for SLAB.
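
To illustrate the difference, the per-node setup in the two allocators
iterates different nodemasks, roughly as follows (a simplified sketch, not
the exact code; init_node_structures() is a hypothetical stand-in for the
respective setup):

	/*
	 * SLUB before this patch: a node that is GOING_ONLINE but not
	 * yet in N_NORMAL_MEMORY is missed here.
	 */
	for_each_node_state(node, N_NORMAL_MEMORY)
		init_node_structures(node);

	/*
	 * SLAB: N_ONLINE already contains the new node during the
	 * MEM_GOING_ONLINE callback, so the race cannot happen.
	 */
	for_each_node_state(node, N_ONLINE)
		init_node_structures(node);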

As such, this patch removes all get/put_online_mems() usage by the slab
subsystem.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/slab_common.c | 10 ++++------
 mm/slub.c        | 28 +++++++++++++++++++++++++---
 2 files changed, 29 insertions(+), 9 deletions(-)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index 075b23ce94ec..e728265c8b7d 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -311,7 +311,6 @@ kmem_cache_create_usercopy(const char *name,
 	int err;
 
 	get_online_cpus();
-	get_online_mems();
 
 	mutex_lock(&slab_mutex);
 
@@ -361,7 +360,6 @@ kmem_cache_create_usercopy(const char *name,
 out_unlock:
 	mutex_unlock(&slab_mutex);
 
-	put_online_mems();
 	put_online_cpus();
 
 	if (err) {
@@ -490,7 +488,6 @@ void kmem_cache_destroy(struct kmem_cache *s)
 		return;
 
 	get_online_cpus();
-	get_online_mems();
 
 	mutex_lock(&slab_mutex);
 
@@ -507,7 +504,6 @@ void kmem_cache_destroy(struct kmem_cache *s)
 out_unlock:
 	mutex_unlock(&slab_mutex);
 
-	put_online_mems();
 	put_online_cpus();
 }
 EXPORT_SYMBOL(kmem_cache_destroy);
@@ -526,10 +522,12 @@ int kmem_cache_shrink(struct kmem_cache *cachep)
 	int ret;
 
 	get_online_cpus();
-	get_online_mems();
+	mutex_lock(&slab_mutex);
+
 	kasan_cache_shrink(cachep);
 	ret = __kmem_cache_shrink(cachep);
-	put_online_mems();
+
+	mutex_unlock(&slab_mutex);
 	put_online_cpus();
 	return ret;
 }
diff --git a/mm/slub.c b/mm/slub.c
index 1f4584954f4c..2e2edd5c9cfc 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -236,6 +236,14 @@ static inline void stat(const struct kmem_cache *s, enum stat_item si)
 #endif
 }
 
+/*
+ * Tracks for which NUMA nodes we have kmem_cache_nodes allocated.
+ * Corresponds to node_state[N_NORMAL_MEMORY], but can temporarily
+ * differ during memory hotplug/hotremove operations.
+ * Protected by slab_mutex.
+ */
+static nodemask_t slab_nodes;
+
 /********************************************************************
  * 			Core slab cache functions
  *******************************************************************/
@@ -2678,7 +2686,7 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
 		 * ignore the node constraint
 		 */
 		if (unlikely(node != NUMA_NO_NODE &&
-			     !node_state(node, N_NORMAL_MEMORY)))
+			     !node_isset(node, slab_nodes)))
 			node = NUMA_NO_NODE;
 		goto new_slab;
 	}
@@ -2689,7 +2697,7 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
 		 * same as above but node_match() being false already
 		 * implies node != NUMA_NO_NODE
 		 */
-		if (!node_state(node, N_NORMAL_MEMORY)) {
+		if (!node_isset(node, slab_nodes)) {
 			node = NUMA_NO_NODE;
 			goto redo;
 		} else {
@@ -3599,7 +3607,7 @@ static int init_kmem_cache_nodes(struct kmem_cache *s)
 {
 	int node;
 
-	for_each_node_state(node, N_NORMAL_MEMORY) {
+	for_each_node_mask(node, slab_nodes) {
 		struct kmem_cache_node *n;
 
 		if (slab_state == DOWN) {
@@ -4259,6 +4267,7 @@ static void slab_mem_offline_callback(void *arg)
 		return;
 
 	mutex_lock(&slab_mutex);
+	node_clear(offline_node, slab_nodes);
 	list_for_each_entry(s, &slab_caches, list) {
 		n = get_node(s, offline_node);
 		if (n) {
@@ -4312,6 +4321,11 @@ static int slab_mem_going_online_callback(void *arg)
 		init_kmem_cache_node(n);
 		s->node[nid] = n;
 	}
+	/*
+	 * Any cache created after this point will also have kmem_cache_node
+	 * initialized for the new node.
+	 */
+	node_set(nid, slab_nodes);
 out:
 	mutex_unlock(&slab_mutex);
 	return ret;
@@ -4392,6 +4406,7 @@ void __init kmem_cache_init(void)
 {
 	static __initdata struct kmem_cache boot_kmem_cache,
 		boot_kmem_cache_node;
+	int node;
 
 	if (debug_guardpage_minorder())
 		slub_max_order = 0;
@@ -4399,6 +4414,13 @@ void __init kmem_cache_init(void)
 	kmem_cache_node = &boot_kmem_cache_node;
 	kmem_cache = &boot_kmem_cache;
 
+	/*
+	 * Initialize the nodemask for which we will allocate per node
+	 * structures. Here we don't need taking slab_mutex yet.
+	 */
+	for_each_node_state(node, N_NORMAL_MEMORY)
+		node_set(node, slab_nodes);
+
 	create_boot_cache(kmem_cache_node, "kmem_cache_node",
 		sizeof(struct kmem_cache_node), SLAB_HWCACHE_ALIGN, 0, 0);
 
-- 
2.29.2



* [RFC 2/3] mm, slab, slub: stop taking cpu hotplug lock
From: Vlastimil Babka @ 2021-01-06 17:40 UTC
  To: linux-mm
  Cc: linux-kernel, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Vladimir Davydov, Qian Cai, David Hildenbrand,
	Michal Hocko, Vlastimil Babka

SLAB has been using get/put_online_cpus() around creating, destroying and
shrinking kmem caches since 95402b382901 ("cpu-hotplug: replace per-subsystem
mutexes with get_online_cpus()") in 2008. That commit was supposed to replace
a private mutex (cache_chain_mutex, called slab_mutex today) with a
system-wide mechanism, but in SLAB's case the hotplug lock is in fact used in
addition to the existing mutex, without explanation why.

SLUB appears to have avoided the cpu hotplug lock initially, but gained it due
to common code unification, such as 20cea9683ecc ("mm, sl[aou]b: Move
kmem_cache_create mutex handling to common code").

Regardless of the history, checking whether the hotplug lock is actually
needed today suggests that it's not, and therefore it's better to avoid this
system-wide lock and the ordering it imposes wrt other locks (such as
slab_mutex).

Specifically, in SLAB we have for_each_online_cpu() in do_tune_cpucache()
protected by slab_mutex, and cpu hotplug callbacks that also take the
slab_mutex, which is also taken by the common slab functions that currently
take the hotplug lock. Thus the slab_mutex protection should be sufficient.
Also, per-cpu array caches are allocated for each possible cpu, so they are
not affected by cpus going online or offline.

In SLUB we have for_each_online_cpu() in functions that show statistics;
these are already unprotected today, as racing with hotplug is not harmful.
Otherwise SLUB relies on the percpu allocator. The slub_cpu_dead() hotplug
callback takes the slab_mutex.
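
For reference, that callback looks roughly like this at the time of this
series (paraphrased from mm/slub.c; details may differ):

	static int slub_cpu_dead(unsigned int cpu)
	{
		struct kmem_cache *s;
		unsigned long flags;

		/*
		 * slab_mutex, not the cpu hotplug lock, is what serializes
		 * this against cache creation and destruction.
		 */
		mutex_lock(&slab_mutex);
		list_for_each_entry(s, &slab_caches, list) {
			local_irq_save(flags);
			__flush_cpu_slab(s, cpu);
			local_irq_restore(flags);
		}
		mutex_unlock(&slab_mutex);
		return 0;
	}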

To sum up, this patch removes the get/put_online_cpus() calls from the slab
subsystem, as it should be safe without further adjustments.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/slab_common.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index e728265c8b7d..0f29a2b59dac 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -310,8 +310,6 @@ kmem_cache_create_usercopy(const char *name,
 	const char *cache_name;
 	int err;
 
-	get_online_cpus();
-
 	mutex_lock(&slab_mutex);
 
 	err = kmem_cache_sanity_check(name, size);
@@ -360,8 +358,6 @@ kmem_cache_create_usercopy(const char *name,
 out_unlock:
 	mutex_unlock(&slab_mutex);
 
-	put_online_cpus();
-
 	if (err) {
 		if (flags & SLAB_PANIC)
 			panic("kmem_cache_create: Failed to create slab '%s'. Error %d\n",
@@ -487,8 +483,6 @@ void kmem_cache_destroy(struct kmem_cache *s)
 	if (unlikely(!s))
 		return;
 
-	get_online_cpus();
-
 	mutex_lock(&slab_mutex);
 
 	s->refcount--;
@@ -503,8 +497,6 @@ void kmem_cache_destroy(struct kmem_cache *s)
 	}
 out_unlock:
 	mutex_unlock(&slab_mutex);
-
-	put_online_cpus();
 }
 EXPORT_SYMBOL(kmem_cache_destroy);
 
@@ -521,14 +513,12 @@ int kmem_cache_shrink(struct kmem_cache *cachep)
 {
 	int ret;
 
-	get_online_cpus();
 	mutex_lock(&slab_mutex);
 
 	kasan_cache_shrink(cachep);
 	ret = __kmem_cache_shrink(cachep);
 
 	mutex_unlock(&slab_mutex);
-	put_online_cpus();
 	return ret;
 }
 EXPORT_SYMBOL(kmem_cache_shrink);
-- 
2.29.2



* [RFC 3/3] mm, slub: stop freeing kmem_cache_node structures on node offline
From: Vlastimil Babka @ 2021-01-06 17:40 UTC
  To: linux-mm
  Cc: linux-kernel, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Vladimir Davydov, Qian Cai, David Hildenbrand,
	Michal Hocko, Vlastimil Babka

Commit e4f8e513c3d3 ("mm/slub: fix a deadlock in show_slab_objects()") fixed
a problematic locking order by removing the memory hotplug lock
get/put_online_mems() from show_slab_objects(). During the discussion, it was
argued [1] that this is OK, because existing slabs on the node would prevent
a hotremove from proceeding.

That's true, but per-node kmem_cache_node structures are not necessarily
allocated on the same node, and they may exist even when the node has no
actual slab pages. Any path that uses get_node() directly or via
for_each_kmem_cache_node() (such as show_slab_objects()) can race with the
freeing of kmem_cache_node even with a !NULL check, resulting in a
use-after-free.

To that end, commit e4f8e513c3d3 argues in a comment that:

 * We don't really need mem_hotplug_lock (to hold off
 * slab_mem_going_offline_callback) here because slab's memory hot
 * unplug code doesn't destroy the kmem_cache->node[] data.

While it's true that slab_mem_going_offline_callback() doesn't free the
kmem_cache_node, the later callback slab_mem_offline_callback() actually does,
so the race and use-after-free exist. This affects not just
show_slab_objects() after commit e4f8e513c3d3, but also many other places that
are not under slab_mutex. And adding slab_mutex locking or other
synchronization to SLUB paths such as get_any_partial() would be bad for
performance and error-prone.
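
A sketch of the resulting use-after-free (hypothetical interleaving,
simplified):

	reader, e.g. show_slab_objects()     slab_mem_offline_callback()
	---------------------------------    -----------------------------
	n = get_node(s, node);
	/* n != NULL, check passes */
	                                     s->node[node] = NULL;
	                                     kmem_cache_free(kmem_cache_node, n);
	x = atomic_long_read(&n->nr_slabs);
	/* use-after-free */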

The easiest solution is therefore to make the abovementioned comment true and
stop freeing the kmem_cache_node structures, accepting some wasted memory in
the full memory node removal scenario. Analogously, we also don't free the
hotremoved pgdat as mentioned in [1], nor the similar per-node structures in
SLAB. Importantly, this approach will not block the hotremove: such nodes
generally have to be movable for the hotremove to succeed in the first place,
so the GFP_KERNEL-allocated kmem_cache_node will come from elsewhere.

[1] https://lore.kernel.org/linux-mm/20190924151147.GB23050@dhcp22.suse.cz/

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/slub.c | 26 +++++++++++---------------
 1 file changed, 11 insertions(+), 15 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 2e2edd5c9cfc..d7c4f08dcf39 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4268,21 +4268,11 @@ static void slab_mem_offline_callback(void *arg)
 
 	mutex_lock(&slab_mutex);
 	node_clear(offline_node, slab_nodes);
-	list_for_each_entry(s, &slab_caches, list) {
-		n = get_node(s, offline_node);
-		if (n) {
-			/*
-			 * if n->nr_slabs > 0, slabs still exist on the node
-			 * that is going down. We were unable to free them,
-			 * and offline_pages() function shouldn't call this
-			 * callback. So, we must fail.
-			 */
-			BUG_ON(slabs_node(s, offline_node));
-
-			s->node[offline_node] = NULL;
-			kmem_cache_free(kmem_cache_node, n);
-		}
-	}
+	/*
+	 * We no longer free kmem_cache_node structures here, as it would be
+	 * racy with all get_node() users, and infeasible to protect them with
+	 * slab_mutex.
+	 */
 	mutex_unlock(&slab_mutex);
 }
 
@@ -4308,6 +4298,12 @@ static int slab_mem_going_online_callback(void *arg)
 	 */
 	mutex_lock(&slab_mutex);
 	list_for_each_entry(s, &slab_caches, list) {
+		/*
+		 * The structure may already exist if the node was previously
+		 * onlined and offlined.
+		 */
+		if (get_node(s, nid))
+			continue;
 		/*
 		 * XXX: kmem_cache_alloc_node will fallback to other nodes
 		 *      since memory is not yet available from the node that
-- 
2.29.2



* Re: [RFC 0/3] mm, slab, slub: remove cpu and memory hotplug locks
From: Christoph Lameter @ 2021-01-06 19:09 UTC
  To: Vlastimil Babka
  Cc: linux-mm, linux-kernel, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Vladimir Davydov, Qian Cai, David Hildenbrand,
	Michal Hocko

On Wed, 6 Jan 2021, Vlastimil Babka wrote:

> rather accept some wasted memory in scenarios that should be rare anyway (full
> memory hot remove), as we do the same in other contexts already. It's all RFC
> for now, as I might have missed some reason why it's not safe.

Looks good to me. My only concern is the kernel that has hotplug disabled.
Current code allows the online/offline checks to be optimized away.

Can this patch be enhanced to do the same?




* Re: [RFC 3/3] mm, slub: stop freeing kmem_cache_node structures on node offline
From: kernel test robot @ 2021-01-07  0:49 UTC
  To: kbuild-all


Hi Vlastimil,

[FYI, it's a private test report for your RFC patch.]
[auto build test WARNING on linux/master]
[also build test WARNING on linus/master hnaz-linux-mm/master v5.11-rc2 next-20210104]
[cannot apply to mmotm/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Vlastimil-Babka/mm-slab-slub-remove-cpu-and-memory-hotplug-locks/20210107-014224
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 09162bc32c880a791c6c0668ce0745cf7958f576
compiler: c6x-elf-gcc (GCC) 9.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <rong.a.chen@intel.com>


cppcheck possible warnings: (new ones prefixed by >>, may not be real problems)

>> mm/slub.c:4223:26: warning: Unused variable: n [unusedVariable]
    struct kmem_cache_node *n;
                            ^
>> mm/slub.c:4224:21: warning: Unused variable: s [unusedVariable]
    struct kmem_cache *s;
                       ^
   mm/slub.c:5581:4: warning: Either the condition '!name' is redundant or there is pointer arithmetic with NULL pointer. [nullPointerArithmeticRedundantCheck]
    *p++ = ':';
      ^
   mm/slub.c:5579:9: note: Assuming that condition '!name' is not redundant
    BUG_ON(!name);
           ^
   mm/slub.c:5577:12: note: Assignment 'p=name', assigned value is 0
    char *p = name;
              ^
   mm/slub.c:5581:4: note: Null pointer addition
    *p++ = ':';
      ^

vim +4223 mm/slub.c

b9049e234401e1 Yasunori Goto     2007-10-21  4220  
b9049e234401e1 Yasunori Goto     2007-10-21  4221  static void slab_mem_offline_callback(void *arg)
b9049e234401e1 Yasunori Goto     2007-10-21  4222  {
b9049e234401e1 Yasunori Goto     2007-10-21 @4223  	struct kmem_cache_node *n;
b9049e234401e1 Yasunori Goto     2007-10-21 @4224  	struct kmem_cache *s;
b9049e234401e1 Yasunori Goto     2007-10-21  4225  	struct memory_notify *marg = arg;
b9049e234401e1 Yasunori Goto     2007-10-21  4226  	int offline_node;
b9049e234401e1 Yasunori Goto     2007-10-21  4227  
b9d5ab2562ecee Lai Jiangshan     2012-12-11  4228  	offline_node = marg->status_change_nid_normal;
b9049e234401e1 Yasunori Goto     2007-10-21  4229  
b9049e234401e1 Yasunori Goto     2007-10-21  4230  	/*
b9049e234401e1 Yasunori Goto     2007-10-21  4231  	 * If the node still has available memory. we need kmem_cache_node
b9049e234401e1 Yasunori Goto     2007-10-21  4232  	 * for it yet.
b9049e234401e1 Yasunori Goto     2007-10-21  4233  	 */
b9049e234401e1 Yasunori Goto     2007-10-21  4234  	if (offline_node < 0)
b9049e234401e1 Yasunori Goto     2007-10-21  4235  		return;
b9049e234401e1 Yasunori Goto     2007-10-21  4236  
18004c5d4084d9 Christoph Lameter 2012-07-06  4237  	mutex_lock(&slab_mutex);
27153f04e4a5f9 Vlastimil Babka   2021-01-06  4238  	node_clear(offline_node, slab_nodes);
b9049e234401e1 Yasunori Goto     2007-10-21  4239  	/*
9d5e878e997461 Vlastimil Babka   2021-01-06  4240  	 * We no longer free kmem_cache_node structures here, as it would be
9d5e878e997461 Vlastimil Babka   2021-01-06  4241  	 * racy with all get_node() users, and infeasible to protect them with
9d5e878e997461 Vlastimil Babka   2021-01-06  4242  	 * slab_mutex.
b9049e234401e1 Yasunori Goto     2007-10-21  4243  	 */
18004c5d4084d9 Christoph Lameter 2012-07-06  4244  	mutex_unlock(&slab_mutex);
b9049e234401e1 Yasunori Goto     2007-10-21  4245  }
b9049e234401e1 Yasunori Goto     2007-10-21  4246  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org


* Re: [RFC 0/3] mm, slab, slub: remove cpu and memory hotplug locks
From: Vlastimil Babka @ 2021-01-11 17:55 UTC
  To: Christoph Lameter
  Cc: linux-mm, linux-kernel, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Vladimir Davydov, Qian Cai, David Hildenbrand,
	Michal Hocko


On 1/6/21 8:09 PM, Christoph Lameter wrote:
> On Wed, 6 Jan 2021, Vlastimil Babka wrote:
> 
>> rather accept some wasted memory in scenarios that should be rare anyway (full
>> memory hot remove), as we do the same in other contexts already. It's all RFC
>> for now, as I might have missed some reason why it's not safe.
> 
> Looks good to me. My only concern is the kernel that has hotplug disabled.
> Current code allows the online/offline checks to be optimized away.
> 
> Can this patch be enhanced to do the same?

Thanks, indeed I didn't think about that.
But on closer inspection, there doesn't seem to be a need for such an enhancement:

- Patch 1 adds the new slab_nodes nodemask, which is basically a copy of
N_NORMAL_MEMORY. Without memory hotplug, the callbacks that would update it
don't occur (maybe they are even eliminated as dead code?). Other code that
uses the nodemask is unaffected wrt performance; it's just using a different
nodemask for the same operations. The extra memory usage of adding the
nodemask is negligible and not worth complicating the code to distinguish
between the new nodemask and N_NORMAL_MEMORY depending on hotplug being
disabled or enabled.

- Patch 1 also restores the slab_mutex lock in kmem_cache_shrink(). Commit
03afc0e25f7f removed it because the memory hotplug lock was deemed to be a
sufficient replacement, but probably didn't consider the case where hotplug is
disabled and thus the hotplug lock is a no-op. Maybe it's safe not to take
slab_mutex in kmem_cache_shrink() in that case, but kmem_cache_shrink() is only
called from a sysfs handler, thus a very cold path anyway.
But, I found out that lockdep complains about it, so I have to rethink this
part anyway: when kmem_cache_shrink() is called from a write to the 'shrink'
file, we already hold the kernfs lock "kn->active#28" and try to take
slab_mutex, but there's an existing dependency in the reverse order, where
kmem_cache_create() starts with slab_mutex and sysfs_slab_add() takes the
kernfs lock (see the sketch after this list). I wonder how this wasn't a
problem before 03afc0e25f7f.

- Patch 2 purely just removes calls to cpu hotplug lock.

- Patch 3 only affects memory hotplug callbacks so nothing to enhance if hotplug
is disabled
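
To make the lockdep inversion from the second point concrete, the two lock
orders are roughly as follows (a sketch based on the description above, not
actual lockdep output):

	write to /sys/kernel/slab/<cache>/shrink    kmem_cache_create()
	----------------------------------------    -------------------------
	kernfs takes the kn->active reference       mutex_lock(&slab_mutex)
	kmem_cache_shrink()                         sysfs_slab_add()
	  mutex_lock(&slab_mutex)                     takes the kernfs locks

i.e. kernfs lock -> slab_mutex on one side versus slab_mutex -> kernfs lock on
the other, which lockdep flags as a potential deadlock.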



