linux-mm.kvack.org archive mirror
* Re: [BUG] lockdep splat with kernfs lockdep annotations and slab mutex from drm patch??
From: Chris Wilson @ 2019-07-11  6:17 UTC
  To: Steven Rostedt, Tejun Heo
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, linux-mm, linux-kernel

Quoting Steven Rostedt (2019-07-11 03:57:20)
> On Fri, 14 Jun 2019 08:38:37 -0700
> Tejun Heo <tj@kernel.org> wrote:
> 
> > Hello,
> > 
> > On Fri, Jun 14, 2019 at 04:08:33PM +0100, Chris Wilson wrote:
> > > #ifdef CONFIG_MEMCG
> > >         if (slab_state >= FULL && err >= 0 && is_root_cache(s)) {
> > >                 struct kmem_cache *c;
> > > 
> > >                 mutex_lock(&slab_mutex);
> > > 
> > > so it happens to hit the error + FULL case with the additional slabcaches?
> > > 
> > > Anyway, according to lockdep, it is dangerous to use the slab_mutex inside
> > > slab_attr_store().  
> > 
> > Didn't really look into the code but it looks like slab_mutex is held
> > while trying to remove sysfs files.  sysfs file removal flushes
> > on-going accesses, so if a file operation then tries to grab a mutex
> > which is held during removal, it leads to a deadlock.
> > 
> 
> Looks like this never got fixed and now this bug is in 5.2.

git blame gives

commit 107dab5c92d5f9c3afe962036e47c207363255c7
Author: Glauber Costa <glommer@parallels.com>
Date:   Tue Dec 18 14:23:05 2012 -0800

    slub: slub-specific propagation changes

for adding the mutex underneath the sysfs write (slab_attr_store), and I think

commit d50d82faa0c964e31f7a946ba8aba7c715ca7ab0
Author: Mikulas Patocka <mpatocka@redhat.com>
Date:   Wed Jun 27 23:26:09 2018 -0700

    slub: fix failure when we delete and create a slab cache

added the sysfs removal underneath the slab_mutex.

> Just got this:
> 
>  ======================================================
>  WARNING: possible circular locking dependency detected
>  5.2.0-test #15 Not tainted
>  ------------------------------------------------------
>  slub_cpu_partia/899 is trying to acquire lock:
>  000000000f6f2dd7 (slab_mutex){+.+.}, at: slab_attr_store+0x6d/0xe0
>  
>  but task is already holding lock:
>  00000000b23ffe3d (kn->count#160){++++}, at: kernfs_fop_write+0x125/0x230
>  
>  which lock already depends on the new lock.
>  
>  
>  the existing dependency chain (in reverse order) is:
>  
>  -> #1 (kn->count#160){++++}:
>         __kernfs_remove+0x413/0x4a0
>         kernfs_remove_by_name_ns+0x40/0x80
>         sysfs_slab_add+0x1b5/0x2f0
>         __kmem_cache_create+0x511/0x560
>         create_cache+0xcd/0x1f0
>         kmem_cache_create_usercopy+0x18a/0x240
>         kmem_cache_create+0x12/0x20
>         is_active_nid+0xdb/0x230 [snd_hda_codec_generic]
>         snd_hda_get_path_idx+0x55/0x80 [snd_hda_codec_generic]
>         get_nid_path+0xc/0x170 [snd_hda_codec_generic]
>         do_one_initcall+0xa2/0x394
>         do_init_module+0xfd/0x370
>         load_module+0x38c6/0x3bd0
>         __do_sys_finit_module+0x11a/0x1b0
>         do_syscall_64+0x68/0x250
>         entry_SYSCALL_64_after_hwframe+0x49/0xbe
>  
>  -> #0 (slab_mutex){+.+.}:
>         lock_acquire+0xbd/0x1d0
>         __mutex_lock+0xfc/0xb70
>         slab_attr_store+0x6d/0xe0
>         kernfs_fop_write+0x170/0x230
>         vfs_write+0xe1/0x240
>         ksys_write+0xba/0x150
>         do_syscall_64+0x68/0x250
>         entry_SYSCALL_64_after_hwframe+0x49/0xbe
>  
>  other info that might help us debug this:
>  
>   Possible unsafe locking scenario:
>  
>         CPU0                    CPU1
>         ----                    ----
>    lock(kn->count#160);
>                                 lock(slab_mutex);
>                                 lock(kn->count#160);
>    lock(slab_mutex);
>  
>   *** DEADLOCK ***
>  
> 
> 
> Attached is a config and the full dmesg.
> 
> -- Steve
> 
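
The inversion in the splat above can be sketched in userspace. This is
only a rough illustration under stated assumptions, not kernel code and
not how lockdep is implemented: the names lock_class, record_order() and
replay_splat() are hypothetical, and the two lock classes are simply
labelled after slab_mutex and kn->count. It records "B was taken while
holding A" edges and reports when a new edge would close a cycle, which
is the condition the report is complaining about.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical lock classes, named after the two locks in the splat. */
enum lock_class { SLAB_MUTEX, KN_COUNT, NR_CLASSES };

static const char *const class_name[NR_CLASSES] = {
	"slab_mutex", "kn->count"
};

/* dep[a][b] is true once some path has acquired b while holding a. */
static bool dep[NR_CLASSES][NR_CLASSES];

/* Depth-first search: can 'to' be reached from 'from' via recorded edges? */
static bool reachable(enum lock_class from, enum lock_class to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_CLASSES; next++) {
		if (dep[from][next] && !seen[next] &&
		    reachable((enum lock_class)next, to, seen))
			return true;
	}
	return false;
}

/*
 * Record "new_lock acquired while 'held' is held".
 * Returns false (and records nothing) if that ordering would close a cycle.
 */
static bool record_order(enum lock_class held, enum lock_class new_lock)
{
	bool seen[NR_CLASSES] = { false };

	if (reachable(new_lock, held, seen)) {
		printf("possible circular locking dependency: %s taken under %s\n",
		       class_name[new_lock], class_name[held]);
		return false;
	}
	dep[held][new_lock] = true;
	return true;
}

/* Replay the two chains from the report; returns the number of violations. */
static int replay_splat(void)
{
	int violations = 0;

	/* Chain #1: cache creation holds slab_mutex, sysfs removal waits on kn->count. */
	if (!record_order(SLAB_MUTEX, KN_COUNT))
		violations++;
	/* Chain #0: sysfs write holds kn->count, slab_attr_store() takes slab_mutex. */
	if (!record_order(KN_COUNT, SLAB_MUTEX))
		violations++;
	return violations;
}
```

Calling replay_splat() once from a main() records the first ordering
silently and flags the second one, mirroring the order of chains #1 and
#0 in the report.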



* Re: [BUG] lockdep splat with kernfs lockdep annotations and slab mutex from drm patch??
From: David Rientjes @ 2019-07-11 21:22 UTC
  To: Chris Wilson
  Cc: Steven Rostedt, Tejun Heo, Christoph Lameter, Pekka Enberg,
	Joonsoo Kim, Andrew Morton, linux-mm, linux-kernel

On Thu, 11 Jul 2019, Chris Wilson wrote:

> Quoting Steven Rostedt (2019-07-11 03:57:20)
> > On Fri, 14 Jun 2019 08:38:37 -0700
> > Tejun Heo <tj@kernel.org> wrote:
> > 
> > > Hello,
> > > 
> > > On Fri, Jun 14, 2019 at 04:08:33PM +0100, Chris Wilson wrote:
> > > > #ifdef CONFIG_MEMCG
> > > >         if (slab_state >= FULL && err >= 0 && is_root_cache(s)) {
> > > >                 struct kmem_cache *c;
> > > > 
> > > >                 mutex_lock(&slab_mutex);
> > > > 
> > > > so it happens to hit the error + FULL case with the additional slabcaches?
> > > > 
> > > > Anyway, according to lockdep, it is dangerous to use the slab_mutex inside
> > > > slab_attr_store().  
> > > 
> > > Didn't really look into the code but it looks like slab_mutex is held
> > > while trying to remove sysfs files.  sysfs file removal flushes
> > > on-going accesses, so if a file operation then tries to grab a mutex
> > > which is held during removal, it leads to a deadlock.
> > > 
> > 
> > Looks like this never got fixed and now this bug is in 5.2.
> 
> git blame gives
> 
> commit 107dab5c92d5f9c3afe962036e47c207363255c7
> Author: Glauber Costa <glommer@parallels.com>
> Date:   Tue Dec 18 14:23:05 2012 -0800
> 
>     slub: slub-specific propagation changes
> 
> for adding the mutex underneath the sysfs write (slab_attr_store), and I think
> 
> commit d50d82faa0c964e31f7a946ba8aba7c715ca7ab0
> Author: Mikulas Patocka <mpatocka@redhat.com>
> Date:   Wed Jun 27 23:26:09 2018 -0700
> 
>     slub: fix failure when we delete and create a slab cache
> 
> added the sysfs removal underneath the slab_mutex.
> 
> > Just got this:
> > 
> >  ======================================================
> >  WARNING: possible circular locking dependency detected
> >  5.2.0-test #15 Not tainted
> >  ------------------------------------------------------
> >  slub_cpu_partia/899 is trying to acquire lock:
> >  000000000f6f2dd7 (slab_mutex){+.+.}, at: slab_attr_store+0x6d/0xe0
> >  
> >  but task is already holding lock:
> >  00000000b23ffe3d (kn->count#160){++++}, at: kernfs_fop_write+0x125/0x230
> >  
> >  which lock already depends on the new lock.
> >  
> >  
> >  the existing dependency chain (in reverse order) is:
> >  
> >  -> #1 (kn->count#160){++++}:
> >         __kernfs_remove+0x413/0x4a0
> >         kernfs_remove_by_name_ns+0x40/0x80
> >         sysfs_slab_add+0x1b5/0x2f0
> >         __kmem_cache_create+0x511/0x560
> >         create_cache+0xcd/0x1f0
> >         kmem_cache_create_usercopy+0x18a/0x240
> >         kmem_cache_create+0x12/0x20
> >         is_active_nid+0xdb/0x230 [snd_hda_codec_generic]
> >         snd_hda_get_path_idx+0x55/0x80 [snd_hda_codec_generic]
> >         get_nid_path+0xc/0x170 [snd_hda_codec_generic]
> >         do_one_initcall+0xa2/0x394
> >         do_init_module+0xfd/0x370
> >         load_module+0x38c6/0x3bd0
> >         __do_sys_finit_module+0x11a/0x1b0
> >         do_syscall_64+0x68/0x250
> >         entry_SYSCALL_64_after_hwframe+0x49/0xbe
> >  

Which slab cache is getting created here?  I assume that sysfs_slab_add() 
is only trying to do kernfs_remove_by_name_ns() because it's unmergeable 
with other slab caches.

> >  -> #0 (slab_mutex){+.+.}:
> >         lock_acquire+0xbd/0x1d0
> >         __mutex_lock+0xfc/0xb70
> >         slab_attr_store+0x6d/0xe0
> >         kernfs_fop_write+0x170/0x230
> >         vfs_write+0xe1/0x240
> >         ksys_write+0xba/0x150
> >         do_syscall_64+0x68/0x250
> >         entry_SYSCALL_64_after_hwframe+0x49/0xbe
> >  
> >  other info that might help us debug this:
> >  
> >   Possible unsafe locking scenario:
> >  
> >         CPU0                    CPU1
> >         ----                    ----
> >    lock(kn->count#160);
> >                                 lock(slab_mutex);
> >                                 lock(kn->count#160);
> >    lock(slab_mutex);
> >  
> >   *** DEADLOCK ***
> >  

