* [PATCH v2] mm/slub: fix a deadlock in show_slab_objects()
From: Qian Cai @ 2019-10-04 12:31 UTC (permalink / raw)
To: akpm
Cc: tj, vdavydov.dev, hannes, guro, mhocko, cl, penberg, rientjes,
linux-mm, linux-kernel, Qian Cai, stable
A long time ago, a similar deadlock in show_slab_objects() was fixed
[1]. However, apparently due to commits such as 01fb58bcba63 ("slab:
remove synchronous synchronize_sched() from memcg cache deactivation
path") and 03afc0e25f7f ("slab: get_online_mems for
kmem_cache_{create,destroy,shrink}"), this kind of deadlock is back;
merely reading files in /sys/kernel/slab will generate the lockdep
splat below.

Since "mem_hotplug_lock" is only taken here to obtain a stable online
node mask while racing with NUMA node hotplug, in the worst case the
results may be miscalculated during NUMA node hotplug, but they will be
corrected by later reads of the same files.
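To make the "miscalculated but self-correcting" claim concrete, here is an illustrative sketch in plain Python (not kernel code; the node numbers and counts are invented): a reader that samples per-node counters one at a time, the way show_slab_objects() walks nodes, can see a transiently inconsistent snapshot while objects migrate, but a later read of the settled state is correct.

```python
def snapshot(nodes):
    """Read the counters node by node, as the sysfs read path does."""
    return [nodes[n] for n in sorted(nodes)]

nodes = {0: 100, 1: 50}   # 150 objects total across two NUMA nodes

# Interleaving modeled by hand: the reader samples node 0, then the
# hotplug path migrates node 1's objects to node 0, then the reader
# samples node 1 -- so the unlocked read undercounts.
torn = [nodes[0]]                              # reader sees node 0 -> 100
nodes[0], nodes[1] = nodes[0] + nodes[1], 0    # "hotplug" migrates node 1
torn.append(nodes[1])                          # reader sees node 1 -> 0
assert sum(torn) == 100                        # transiently wrong: 150 exist

# A subsequent read of the same counters sees the corrected totals.
assert sum(snapshot(nodes)) == 150
```

This is the worst case the changelog accepts in exchange for dropping mem_hotplug_lock: a one-off stale total, never a crash.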
WARNING: possible circular locking dependency detected
------------------------------------------------------
cat/5224 is trying to acquire lock:
ffff900012ac3120 (mem_hotplug_lock.rw_sem){++++}, at:
show_slab_objects+0x94/0x3a8
but task is already holding lock:
b8ff009693eee398 (kn->count#45){++++}, at: kernfs_seq_start+0x44/0xf0
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (kn->count#45){++++}:
lock_acquire+0x31c/0x360
__kernfs_remove+0x290/0x490
kernfs_remove+0x30/0x44
sysfs_remove_dir+0x70/0x88
kobject_del+0x50/0xb0
sysfs_slab_unlink+0x2c/0x38
shutdown_cache+0xa0/0xf0
kmemcg_cache_shutdown_fn+0x1c/0x34
kmemcg_workfn+0x44/0x64
process_one_work+0x4f4/0x950
worker_thread+0x390/0x4bc
kthread+0x1cc/0x1e8
ret_from_fork+0x10/0x18
-> #1 (slab_mutex){+.+.}:
lock_acquire+0x31c/0x360
__mutex_lock_common+0x16c/0xf78
mutex_lock_nested+0x40/0x50
memcg_create_kmem_cache+0x38/0x16c
memcg_kmem_cache_create_func+0x3c/0x70
process_one_work+0x4f4/0x950
worker_thread+0x390/0x4bc
kthread+0x1cc/0x1e8
ret_from_fork+0x10/0x18
-> #0 (mem_hotplug_lock.rw_sem){++++}:
validate_chain+0xd10/0x2bcc
__lock_acquire+0x7f4/0xb8c
lock_acquire+0x31c/0x360
get_online_mems+0x54/0x150
show_slab_objects+0x94/0x3a8
total_objects_show+0x28/0x34
slab_attr_show+0x38/0x54
sysfs_kf_seq_show+0x198/0x2d4
kernfs_seq_show+0xa4/0xcc
seq_read+0x30c/0x8a8
kernfs_fop_read+0xa8/0x314
__vfs_read+0x88/0x20c
vfs_read+0xd8/0x10c
ksys_read+0xb0/0x120
__arm64_sys_read+0x54/0x88
el0_svc_handler+0x170/0x240
el0_svc+0x8/0xc
other info that might help us debug this:
Chain exists of:
mem_hotplug_lock.rw_sem --> slab_mutex --> kn->count#45
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(kn->count#45);
lock(slab_mutex);
lock(kn->count#45);
lock(mem_hotplug_lock.rw_sem);
*** DEADLOCK ***
3 locks held by cat/5224:
#0: 9eff00095b14b2a0 (&p->lock){+.+.}, at: seq_read+0x4c/0x8a8
#1: 0eff008997041480 (&of->mutex){+.+.}, at: kernfs_seq_start+0x34/0xf0
#2: b8ff009693eee398 (kn->count#45){++++}, at:
kernfs_seq_start+0x44/0xf0
stack backtrace:
Call trace:
dump_backtrace+0x0/0x248
show_stack+0x20/0x2c
dump_stack+0xd0/0x140
print_circular_bug+0x368/0x380
check_noncircular+0x248/0x250
validate_chain+0xd10/0x2bcc
__lock_acquire+0x7f4/0xb8c
lock_acquire+0x31c/0x360
get_online_mems+0x54/0x150
show_slab_objects+0x94/0x3a8
total_objects_show+0x28/0x34
slab_attr_show+0x38/0x54
sysfs_kf_seq_show+0x198/0x2d4
kernfs_seq_show+0xa4/0xcc
seq_read+0x30c/0x8a8
kernfs_fop_read+0xa8/0x314
__vfs_read+0x88/0x20c
vfs_read+0xd8/0x10c
ksys_read+0xb0/0x120
__arm64_sys_read+0x54/0x88
el0_svc_handler+0x170/0x240
el0_svc+0x8/0xc
[1] http://lkml.iu.edu/hypermail/linux/kernel/1101.0/02850.html
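The circular chain in the splat can be read as a directed graph of lock dependencies. As a hedged illustration (plain Python modeling the idea behind lockdep's check_noncircular(), not the kernel implementation), an edge A -> B below means "B was acquired while A was held"; the lock names come from the splat, everything else is invented for the sketch:

```python
def reachable(edges, src, dst):
    """Depth-first search: can dst be reached from src via recorded edges?"""
    stack, seen = [src], set()
    while stack:
        lock = stack.pop()
        if lock == dst:
            return True
        if lock not in seen:
            seen.add(lock)
            stack.extend(edges.get(lock, ()))
    return False

# Dependencies already recorded from the #1 and #2 chains in the splat:
edges = {
    "mem_hotplug_lock": ["slab_mutex"],  # slab_mutex taken under mem_hotplug_lock
    "slab_mutex": ["kn->count"],         # kn->count taken under slab_mutex
}

# Chain #0 (show_slab_objects) now wants to record the edge
# kn->count -> mem_hotplug_lock.  That edge closes a cycle exactly when
# kn->count is already reachable from mem_hotplug_lock -- and it is,
# which is why lockdep reports the circular dependency above.
assert reachable(edges, "mem_hotplug_lock", "kn->count")
```

The fix removes the #0 edge entirely by not taking mem_hotplug_lock in the sysfs read path, so no cycle can form.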
Fixes: 01fb58bcba63 ("slab: remove synchronous synchronize_sched() from memcg cache deactivation path")
Fixes: 03afc0e25f7f ("slab: get_online_mems for kmem_cache_{create,destroy,shrink}")
Cc: stable@vger.kernel.org
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Qian Cai <cai@lca.pw>
---
v2: fix the comment alignment and improve the changelog.
mm/slub.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/mm/slub.c b/mm/slub.c
index 42c1b3af3c98..86bfd9d98af5 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4838,7 +4838,13 @@ static ssize_t show_slab_objects(struct kmem_cache *s,
 		}
 	}
 
-	get_online_mems();
+	/*
+	 * It is impossible to take "mem_hotplug_lock" here with "kernfs_mutex"
+	 * already held which will conflict with an existing lock order:
+	 *
+	 * mem_hotplug_lock->slab_mutex->kernfs_mutex
+	 */
+
 #ifdef CONFIG_SLUB_DEBUG
 	if (flags & SO_ALL) {
 		struct kmem_cache_node *n;
@@ -4879,7 +4885,6 @@ static ssize_t show_slab_objects(struct kmem_cache *s,
 		x += sprintf(buf + x, " N%d=%lu",
 			     node, nodes[node]);
 #endif
-	put_online_mems();
 	kfree(nodes);
 	return x + sprintf(buf + x, "\n");
 }
--
1.8.3.1
* Re: [PATCH v2] mm/slub: fix a deadlock in show_slab_objects()
From: Michal Hocko @ 2019-10-04 12:57 UTC (permalink / raw)
To: Qian Cai
Cc: akpm, tj, vdavydov.dev, hannes, guro, cl, penberg, rientjes,
linux-mm, linux-kernel, stable
On Fri 04-10-19 08:31:49, Qian Cai wrote:
> A long time ago, a similar deadlock in show_slab_objects() was fixed
> [1]. However, apparently due to commits such as 01fb58bcba63 ("slab:
> remove synchronous synchronize_sched() from memcg cache deactivation
> path") and 03afc0e25f7f ("slab: get_online_mems for
> kmem_cache_{create,destroy,shrink}"), this kind of deadlock is back;
> merely reading files in /sys/kernel/slab will generate the lockdep
> splat below.
>
> Since "mem_hotplug_lock" is only taken here to obtain a stable online
> node mask while racing with NUMA node hotplug, in the worst case the
> results may be miscalculated during NUMA node hotplug, but they will
> be corrected by later reads of the same files.
I think it is important to mention that this doesn't expose
show_slab_objects() to a use-after-free. There is only a single path
that might really race here, and that is the slab hotplug notifier
callback __kmem_cache_shrink() (via slab_mem_going_offline_callback()),
but that path doesn't really destroy kmem_cache_node data structures.
Thanks!
--
Michal Hocko
SUSE Labs
* Re: [PATCH v2] mm/slub: fix a deadlock in show_slab_objects()
From: Michal Hocko @ 2019-10-07 8:16 UTC (permalink / raw)
To: Qian Cai
Cc: akpm, tj, vdavydov.dev, hannes, guro, cl, penberg, rientjes,
linux-mm, linux-kernel, stable
On Fri 04-10-19 14:57:01, Michal Hocko wrote:
> On Fri 04-10-19 08:31:49, Qian Cai wrote:
> > A long time ago, a similar deadlock in show_slab_objects() was fixed
> > [1]. However, apparently due to commits such as 01fb58bcba63 ("slab:
> > remove synchronous synchronize_sched() from memcg cache deactivation
> > path") and 03afc0e25f7f ("slab: get_online_mems for
> > kmem_cache_{create,destroy,shrink}"), this kind of deadlock is back;
> > merely reading files in /sys/kernel/slab will generate the lockdep
> > splat below.
> >
> > Since "mem_hotplug_lock" is only taken here to obtain a stable online
> > node mask while racing with NUMA node hotplug, in the worst case the
> > results may be miscalculated during NUMA node hotplug, but they will
> > be corrected by later reads of the same files.
>
> I think it is important to mention that this doesn't expose
> show_slab_objects() to a use-after-free. There is only a single path
> that might really race here, and that is the slab hotplug notifier
> callback __kmem_cache_shrink() (via slab_mem_going_offline_callback()),
> but that path doesn't really destroy kmem_cache_node data structures.
Andrew, please add this to the changelog so that we do not have to
scratch heads again when looking into that code.
Thanks!
--
Michal Hocko
SUSE Labs
* Re: [PATCH v2] mm/slub: fix a deadlock in show_slab_objects()
From: Andrew Morton @ 2019-10-07 21:59 UTC (permalink / raw)
To: Michal Hocko
Cc: Qian Cai, tj, vdavydov.dev, hannes, guro, cl, penberg, rientjes,
linux-mm, linux-kernel, stable
On Mon, 7 Oct 2019 10:16:21 +0200 Michal Hocko <mhocko@kernel.org> wrote:
> On Fri 04-10-19 14:57:01, Michal Hocko wrote:
> > On Fri 04-10-19 08:31:49, Qian Cai wrote:
> > > A long time ago, a similar deadlock in show_slab_objects() was fixed
> > > [1]. However, apparently due to commits such as 01fb58bcba63 ("slab:
> > > remove synchronous synchronize_sched() from memcg cache deactivation
> > > path") and 03afc0e25f7f ("slab: get_online_mems for
> > > kmem_cache_{create,destroy,shrink}"), this kind of deadlock is back;
> > > merely reading files in /sys/kernel/slab will generate the lockdep
> > > splat below.
> > >
> > > Since "mem_hotplug_lock" is only taken here to obtain a stable online
> > > node mask while racing with NUMA node hotplug, in the worst case the
> > > results may be miscalculated during NUMA node hotplug, but they will
> > > be corrected by later reads of the same files.
> >
> > I think it is important to mention that this doesn't expose
> > show_slab_objects() to a use-after-free. There is only a single path
> > that might really race here, and that is the slab hotplug notifier
> > callback __kmem_cache_shrink() (via slab_mem_going_offline_callback()),
> > but that path doesn't really destroy kmem_cache_node data structures.
Yes, I noted this during review. It's a bit subtle and is worthy of
more than a changelog note, I think. How about this?
--- a/mm/slub.c~mm-slub-fix-a-deadlock-in-show_slab_objects-fix
+++ a/mm/slub.c
@@ -4851,6 +4851,10 @@ static ssize_t show_slab_objects(struct
 	 * already held which will conflict with an existing lock order:
 	 *
 	 * mem_hotplug_lock->slab_mutex->kernfs_mutex
+	 *
+	 * We don't really need mem_hotplug_lock (to hold off
+	 * slab_mem_going_offline_callback()) here because slab's memory hot
+	 * unplug code doesn't destroy the kmem_cache->node[] data.
 	 */
 
 #ifdef CONFIG_SLUB_DEBUG
_
> Andrew, please add this to the changelog so that we do not have to
> scratch heads again when looking into that code.
I did that as well.
* Re: [PATCH v2] mm/slub: fix a deadlock in show_slab_objects()
From: Michal Hocko @ 2019-10-08 8:47 UTC (permalink / raw)
To: Andrew Morton
Cc: Qian Cai, tj, vdavydov.dev, hannes, guro, cl, penberg, rientjes,
linux-mm, linux-kernel, stable
On Mon 07-10-19 14:59:02, Andrew Morton wrote:
> On Mon, 7 Oct 2019 10:16:21 +0200 Michal Hocko <mhocko@kernel.org> wrote:
>
> > On Fri 04-10-19 14:57:01, Michal Hocko wrote:
> > > On Fri 04-10-19 08:31:49, Qian Cai wrote:
> > > > A long time ago, a similar deadlock in show_slab_objects() was fixed
> > > > [1]. However, apparently due to commits such as 01fb58bcba63 ("slab:
> > > > remove synchronous synchronize_sched() from memcg cache deactivation
> > > > path") and 03afc0e25f7f ("slab: get_online_mems for
> > > > kmem_cache_{create,destroy,shrink}"), this kind of deadlock is back;
> > > > merely reading files in /sys/kernel/slab will generate the lockdep
> > > > splat below.
> > > >
> > > > Since "mem_hotplug_lock" is only taken here to obtain a stable online
> > > > node mask while racing with NUMA node hotplug, in the worst case the
> > > > results may be miscalculated during NUMA node hotplug, but they will
> > > > be corrected by later reads of the same files.
> > >
> > > I think it is important to mention that this doesn't expose
> > > show_slab_objects() to a use-after-free. There is only a single path
> > > that might really race here, and that is the slab hotplug notifier
> > > callback __kmem_cache_shrink() (via slab_mem_going_offline_callback()),
> > > but that path doesn't really destroy kmem_cache_node data structures.
>
> Yes, I noted this during review. It's a bit subtle and is worthy of
> more than a changelog note, I think. How about this?
>
> --- a/mm/slub.c~mm-slub-fix-a-deadlock-in-show_slab_objects-fix
> +++ a/mm/slub.c
> @@ -4851,6 +4851,10 @@ static ssize_t show_slab_objects(struct
>  	 * already held which will conflict with an existing lock order:
>  	 *
>  	 * mem_hotplug_lock->slab_mutex->kernfs_mutex
> +	 *
> +	 * We don't really need mem_hotplug_lock (to hold off
> +	 * slab_mem_going_offline_callback()) here because slab's memory hot
> +	 * unplug code doesn't destroy the kmem_cache->node[] data.
>  	 */
Yes please!
> #ifdef CONFIG_SLUB_DEBUG
> _
>
> > Andrew, please add this to the changelog so that we do not have to
> > scratch heads again when looking into that code.
>
> I did that as well.
Thanks!
--
Michal Hocko
SUSE Labs