* lockdep warning on 4.11.0-rc5 kernel
@ 2017-04-05 20:52 Vivek Goyal
  2017-04-07 15:10 ` Brian Foster
  0 siblings, 1 reply; 5+ messages in thread
From: Vivek Goyal @ 2017-04-05 20:52 UTC (permalink / raw)
  To: linux-xfs

Hi,

I am running a 4.11.0-rc5 kernel and noticed the following lockdep
warning on the console during a kernel build. I have not analyzed it yet,
but there is a lot of xfs in the backtrace, so I am sending it to the
xfs mailing list.

Thanks
Vivek

login: [ 4931.174758] 
[ 4931.175065] =================================
[ 4931.175731] [ INFO: inconsistent lock state ]
[ 4931.176365] 4.11.0-rc5+ #87 Not tainted
[ 4931.176920] ---------------------------------
[ 4931.177537] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
[ 4931.178463] kswapd0/128 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ 4931.179198]  (&sb->s_type->i_mutex_key#12){++++?+}, at: [<ffffffffa01fcb0a>] xfs_ilock+0x13a/0x210 [xfs]
[ 4931.180584] {RECLAIM_FS-ON-W} state was registered at:
[ 4931.181320]   mark_held_locks+0x6f/0xa0
[ 4931.181878]   lockdep_trace_alloc+0x7d/0xe0
[ 4931.182474]   kmem_cache_alloc+0x2f/0x2a0
[ 4931.183083]   kmem_zone_alloc+0x81/0x120 [xfs]
[ 4931.183739]   xfs_trans_alloc+0x6c/0x130 [xfs]
[ 4931.184407]   xfs_setattr_nonsize+0x239/0x560 [xfs]
[ 4931.185135]   xfs_vn_setattr_nonsize+0x59/0x150 [xfs]
[ 4931.185890]   xfs_vn_setattr+0x22/0x70 [xfs]
[ 4931.186503]   notify_change+0x2ee/0x440
[ 4931.187058]   chmod_common+0xc5/0x150
[ 4931.187582]   SyS_fchmod+0x53/0x90
[ 4931.188077]   do_syscall_64+0x6c/0x1f0
[ 4931.188616]   return_from_SYSCALL_64+0x0/0x7a
[ 4931.189238] irq event stamp: 397343
[ 4931.189739] hardirqs last  enabled at (397343): [<ffffffff81135cca>] __call_rcu+0x1fa/0x340
[ 4931.190909] hardirqs last disabled at (397342): [<ffffffff81135b21>] __call_rcu+0x51/0x340
[ 4931.192070] softirqs last  enabled at (397192): [<ffffffff818fb86d>] __do_softirq+0x38d/0x4c3
[ 4931.193263] softirqs last disabled at (397185): [<ffffffff810bae27>] irq_exit+0xf7/0x100
[ 4931.194397] 
[ 4931.194397] other info that might help us debug this:
[ 4931.195318]  Possible unsafe locking scenario:
[ 4931.195318] 
[ 4931.196155]        CPU0
[ 4931.196511]        ----
[ 4931.196874]   lock(&sb->s_type->i_mutex_key#12);
[ 4931.197526]   <Interrupt>
[ 4931.197912]     lock(&sb->s_type->i_mutex_key#12);
[ 4931.198591] 
[ 4931.198591]  *** DEADLOCK ***
[ 4931.198591] 
[ 4931.199429] 2 locks held by kswapd0/128:
[ 4931.199990]  #0:  (shrinker_rwsem){++++..}, at: [<ffffffff8121a16e>] shrink_slab.part.46+0x5e/0x600
[ 4931.201261]  #1:  (&type->s_umount_key#48){++++++}, at: [<ffffffff812aa54b>] trylock_super+0x1b/0x50
[ 4931.202832] 
[ 4931.202832] stack backtrace:
[ 4931.203878] CPU: 2 PID: 128 Comm: kswapd0 Not tainted 4.11.0-rc5+ #87
[ 4931.204998] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.2-20150714_191134- 04/01/2014
[ 4931.206533] Call Trace:
[ 4931.207114]  dump_stack+0x86/0xc3
[ 4931.207807]  print_usage_bug+0x1d0/0x1e0
[ 4931.208573]  mark_lock+0x559/0x5c0
[ 4931.209274]  ? print_shortest_lock_dependencies+0x1a0/0x1a0
[ 4931.210274]  __lock_acquire+0x6ce/0x13c0
[ 4931.211049]  lock_acquire+0xe3/0x1d0
[ 4931.211776]  ? lock_acquire+0xe3/0x1d0
[ 4931.212556]  ? xfs_ilock+0x13a/0x210 [xfs]
[ 4931.213373]  ? xfs_inactive+0xec/0x130 [xfs]
[ 4931.214231]  down_write_nested+0x46/0x80
[ 4931.215038]  ? xfs_ilock+0x13a/0x210 [xfs]
[ 4931.215851]  xfs_ilock+0x13a/0x210 [xfs]
[ 4931.216634]  xfs_inactive+0xec/0x130 [xfs]
[ 4931.217699]  xfs_fs_destroy_inode+0xbb/0x2d0 [xfs]
[ 4931.218594]  destroy_inode+0x3b/0x60
[ 4931.219314]  evict+0x139/0x1c0
[ 4931.220061]  dispose_list+0x56/0x80
[ 4931.220765]  prune_icache_sb+0x5a/0x80
[ 4931.221498]  super_cache_scan+0x14e/0x1a0
[ 4931.222269]  shrink_slab.part.46+0x216/0x600
[ 4931.223075]  shrink_slab+0x29/0x30
[ 4931.223883]  shrink_node+0x108/0x320
[ 4931.224588]  kswapd+0x391/0x990
[ 4931.225246]  kthread+0x10c/0x140
[ 4931.225902]  ? mem_cgroup_shrink_node+0x300/0x300
[ 4931.226760]  ? kthread_create_on_node+0x70/0x70
[ 4931.227579]  ret_from_fork+0x31/0x40



* Re: lockdep warning on 4.11.0-rc5 kernel
  2017-04-05 20:52 lockdep warning on 4.11.0-rc5 kernel Vivek Goyal
@ 2017-04-07 15:10 ` Brian Foster
  2017-04-07 16:58   ` Darrick J. Wong
  0 siblings, 1 reply; 5+ messages in thread
From: Brian Foster @ 2017-04-07 15:10 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: linux-xfs, darrick.wong, hch

On Wed, Apr 05, 2017 at 04:52:11PM -0400, Vivek Goyal wrote:
> Hi,
> 
> I am running a 4.11.0-rc5 kernel and noticed the following lockdep
> warning on the console during a kernel build. I have not analyzed it yet,
> but there is a lot of xfs in the backtrace, so I am sending it to the
> xfs mailing list.
> 

Darrick pointed out on irc yesterday that this is likely due to the
inode_lock() call in chmod_common(). I was confused as to where the
iolock came into play here, but apparently we now reuse the core
inode->i_rwsem for that.

In any event, I was playing around with this and reproduced it pretty
easily by populating an fs with a bunch of files with speculative
preallocation and then generating some memory pressure. I reproduce it
with the following stack, however:

[  434.220605] =================================
[  434.222286] [ INFO: inconsistent lock state ]
[  434.224092] 4.11.0-rc4+ #36 Tainted: G           OE  
[  434.225839] ---------------------------------
[  434.227587] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
[  434.229995] kswapd0/59 [HC0[0]:SC0[0]:HE1:SE1] takes:
[  434.231851]  (&sb->s_type->i_mutex_key#17){+.+.?.}, at: [<ffffffffc078e0aa>] xfs_ilock+0x20a/0x300 [xfs]
[  434.235473] {RECLAIM_FS-ON-W} state was registered at:
[  434.237427]   mark_held_locks+0x76/0xa0
[  434.238840]   lockdep_trace_alloc+0x7d/0xe0
[  434.240362]   kmem_cache_alloc+0x2f/0x2d0
[  434.241871]   kmem_zone_alloc+0x81/0x120 [xfs]
[  434.243559]   xfs_trans_alloc+0x6c/0x130 [xfs]
[  434.245233]   xfs_vn_update_time+0x75/0x230 [xfs]
[  434.247031]   file_update_time+0xbc/0x110
[  434.248593]   xfs_file_aio_write_checks+0x19b/0x1c0 [xfs]
[  434.250762]   xfs_file_buffered_aio_write+0x75/0x350 [xfs]
[  434.252978]   xfs_file_write_iter+0x103/0x150 [xfs]
[  434.254935]   __vfs_write+0xe8/0x160
[  434.256325]   vfs_write+0xcb/0x1f0
[  434.257625]   SyS_pwrite64+0x98/0xc0
[  434.258963]   entry_SYSCALL_64_fastpath+0x1f/0xc2

... so this isn't just a chmod thing. OTOH, I think we agree that this
is not a real deadlock vector because the iolock is taken in the
destroy_inode() path and so there should be no other reference to the
inode.

That aside, the IOLOCK_EXCL was added to xfs_inactive() in commit
a36b926180 ("xfs: pull up iolock from xfs_free_eofblocks()") purely to
honor the cleaner call semantics that patch defined for
xfs_free_eofblocks(). We could probably either drop the iolock from here
(though we would then have to kill the assert in xfs_free_eofblocks()),
or use something like the diff below that quiets the lockdep splat for
me. Thoughts?

Brian

diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 7605d83..eb80d31 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -1908,7 +1908,11 @@ xfs_inactive(
 		 * broken free space accounting.
 		 */
 		if (xfs_can_free_eofblocks(ip, true)) {
-			xfs_ilock(ip, XFS_IOLOCK_EXCL);
+			/* trylock to quiet lockdep, iolock should be free */
+			if (!xfs_ilock_nowait(ip, XFS_IOLOCK_EXCL)) {
+				ASSERT(0);
+				xfs_ilock(ip, XFS_IOLOCK_EXCL);
+			}
 			xfs_free_eofblocks(ip);
 			xfs_iunlock(ip, XFS_IOLOCK_EXCL);
 		}

> Thanks
> Vivek
> 
> [...]
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: lockdep warning on 4.11.0-rc5 kernel
  2017-04-07 15:10 ` Brian Foster
@ 2017-04-07 16:58   ` Darrick J. Wong
  2017-04-07 17:28     ` Brian Foster
  0 siblings, 1 reply; 5+ messages in thread
From: Darrick J. Wong @ 2017-04-07 16:58 UTC (permalink / raw)
  To: Brian Foster; +Cc: Vivek Goyal, linux-xfs, hch

On Fri, Apr 07, 2017 at 11:10:09AM -0400, Brian Foster wrote:
> On Wed, Apr 05, 2017 at 04:52:11PM -0400, Vivek Goyal wrote:
> > Hi,
> > 
> > I am running a 4.11.0-rc5 kernel and noticed the following lockdep
> > warning on the console during a kernel build. I have not analyzed it yet,
> > but there is a lot of xfs in the backtrace, so I am sending it to the
> > xfs mailing list.
> > 
> 
> Darrick pointed out on irc yesterday that this is likely due to the
> inode_lock() call in chmod_common(). I was confused as to where the
> iolock came into play here, but apparently we now reuse the core
> inode->i_rwsem for that.
> 
> In any event, I was playing around with this and reproduced it pretty
> easily by populating an fs with a bunch of files with speculative
> preallocation and then generating some memory pressure. I reproduce it
> with the following stack, however:
> 
> [...]
> 
> ... so this isn't just a chmod thing. OTOH, I think we agree that this
> is not a real deadlock vector because the iolock is taken in the
> destroy_inode() path and so there should be no other reference to the
> inode.
> 
> That aside, the IOLOCK_EXCL was added to xfs_inactive() in commit
> a36b926180 ("xfs: pull up iolock from xfs_free_eofblocks()") purely to
> honor the cleaner call semantics that patch defined for
> xfs_free_eofblocks(). We could probably either drop the iolock from here
> (though we would then have to kill the assert in xfs_free_eofblocks()),
> or use something like the diff below that quiets the lockdep splat for
> me. Thoughts?

Well... the inode is inactive, which means there won't be any io to
protect ourselves against, so we don't need to take the iolock, right?
Why not remove the iolock grabbing and either change the iolock assert
in _free_eofblocks to detect that we're coming from _inactive and not
blow the assert, or just make a __xfs_free_eofblocks that skips the
assert.

(Or I guess teach lockdep better?  I'd rather just get rid of a lock
grab if we can feasibly do it...)

--D

> 
> Brian
> 
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index 7605d83..eb80d31 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -1908,7 +1908,11 @@ xfs_inactive(
>  		 * broken free space accounting.
>  		 */
>  		if (xfs_can_free_eofblocks(ip, true)) {
> -			xfs_ilock(ip, XFS_IOLOCK_EXCL);
> +			/* trylock to quiet lockdep, iolock should be free */
> +			if (!xfs_ilock_nowait(ip, XFS_IOLOCK_EXCL)) {
> +				ASSERT(0);
> +				xfs_ilock(ip, XFS_IOLOCK_EXCL);
> +			}
>  			xfs_free_eofblocks(ip);
>  			xfs_iunlock(ip, XFS_IOLOCK_EXCL);
>  		}
> 
> > Thanks
> > Vivek
> > 
> > [...]


* Re: lockdep warning on 4.11.0-rc5 kernel
  2017-04-07 16:58   ` Darrick J. Wong
@ 2017-04-07 17:28     ` Brian Foster
  2017-04-07 17:35       ` Darrick J. Wong
  0 siblings, 1 reply; 5+ messages in thread
From: Brian Foster @ 2017-04-07 17:28 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Vivek Goyal, linux-xfs, hch

On Fri, Apr 07, 2017 at 09:58:43AM -0700, Darrick J. Wong wrote:
> On Fri, Apr 07, 2017 at 11:10:09AM -0400, Brian Foster wrote:
> > On Wed, Apr 05, 2017 at 04:52:11PM -0400, Vivek Goyal wrote:
> > > Hi,
> > > 
> > > I am running a 4.11.0-rc5 kernel and noticed the following lockdep
> > > warning on the console during a kernel build. I have not analyzed it yet,
> > > but there is a lot of xfs in the backtrace, so I am sending it to the
> > > xfs mailing list.
> > > 
> > 
> > Darrick pointed out on irc yesterday that this is likely due to the
> > inode_lock() call in chmod_common(). I was confused as to where the
> > iolock came into play here, but apparently we now reuse the core
> > inode->i_rwsem for that.
> > 
> > In any event, I was playing around with this and reproduced it pretty
> > easily by populating an fs with a bunch of files with speculative
> > preallocation and then generating some memory pressure. I reproduce it
> > with the following stack, however:
> > 
> > [...]
> > 
> > ... so this isn't just a chmod thing. OTOH, I think we agree that this
> > is not a real deadlock vector because the iolock is taken in the
> > destroy_inode() path and so there should be no other reference to the
> > inode.
> > 
> > That aside, the IOLOCK_EXCL was added to xfs_inactive() in commit
> > a36b926180 ("xfs: pull up iolock from xfs_free_eofblocks()") purely to
> > honor the cleaner call semantics that patch defined for
> > xfs_free_eofblocks(). We could probably either drop the iolock from here
> > (though we would then have to kill the assert in xfs_free_eofblocks()),
> > or use something like the diff below that quiets the lockdep splat for
> > me. Thoughts?
> 
> Well... the inode is inactive, which means there won't be any io to
> protect ourselves against, so we don't need to take the iolock, right?
> Why not remove the iolock grabbing and either change the iolock assert
> in _free_eofblocks to detect that we're coming from _inactive and not
> blow the assert, or just make a __xfs_free_eofblocks that skips the
> assert.
> 

Correct, the iolock is not necessary here and historically was not
taken. It was added in the commit above to clean up the function, more
specifically to support the assert. I'd rather not go and uglify the
whole thing again (might as well just revert the patch in that case).

I guess we could just kill the assert then. How about the appended?

Brian

diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 4d1920e..de94798 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -903,9 +903,9 @@ xfs_can_free_eofblocks(struct xfs_inode *ip, bool force)
 }
 
 /*
- * This is called by xfs_inactive to free any blocks beyond eof
- * when the link count isn't zero and by xfs_dm_punch_hole() when
- * punching a hole to EOF.
+ * This is called to free any blocks beyond eof. The caller must hold
+ * IOLOCK_EXCL unless we are in the inode reclaim path and have the only
+ * reference to the inode.
  */
 int
 xfs_free_eofblocks(
@@ -920,8 +920,6 @@ xfs_free_eofblocks(
 	struct xfs_bmbt_irec	imap;
 	struct xfs_mount	*mp = ip->i_mount;
 
-	ASSERT(xfs_isilocked(ip, XFS_IOLOCK_EXCL));
-
 	/*
 	 * Figure out if there are any blocks beyond the end
 	 * of the file.  If not, then there is nothing to do.
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 7605d83..e36c2c3 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -1906,12 +1906,13 @@ xfs_inactive(
 		 * force is true because we are evicting an inode from the
 		 * cache. Post-eof blocks must be freed, lest we end up with
 		 * broken free space accounting.
+		 *
+		 * Note: don't bother with iolock here since lockdep complains
+		 * about acquiring it in reclaim context. We have the only
+		 * reference to the inode at this point anyways.
 		 */
-		if (xfs_can_free_eofblocks(ip, true)) {
-			xfs_ilock(ip, XFS_IOLOCK_EXCL);
+		if (xfs_can_free_eofblocks(ip, true))
 			xfs_free_eofblocks(ip);
-			xfs_iunlock(ip, XFS_IOLOCK_EXCL);
-		}
 
 		return;
 	}


* Re: lockdep warning on 4.11.0-rc5 kernel
  2017-04-07 17:28     ` Brian Foster
@ 2017-04-07 17:35       ` Darrick J. Wong
  0 siblings, 0 replies; 5+ messages in thread
From: Darrick J. Wong @ 2017-04-07 17:35 UTC (permalink / raw)
  To: Brian Foster; +Cc: Vivek Goyal, linux-xfs, hch

On Fri, Apr 07, 2017 at 01:28:31PM -0400, Brian Foster wrote:
> On Fri, Apr 07, 2017 at 09:58:43AM -0700, Darrick J. Wong wrote:
> > On Fri, Apr 07, 2017 at 11:10:09AM -0400, Brian Foster wrote:
> > > On Wed, Apr 05, 2017 at 04:52:11PM -0400, Vivek Goyal wrote:
> > > > Hi,
> > > > 
> > > > I am running a 4.11.0-rc5 kernel and noticed the following lockdep
> > > > warning on the console during a kernel build. I have not analyzed it yet,
> > > > but there is a lot of xfs in the backtrace, so I am sending it to the
> > > > xfs mailing list.
> > > > 
> > > 
> > > Darrick pointed out on irc yesterday that this is likely due to the
> > > inode_lock() call in chmod_common(). I was confused as to where the
> > > iolock came into play here, but apparently we now reuse the core
> > > inode->i_rwsem for that.
> > > 
> > > In any event, I was playing around with this and reproduced it pretty
> > > easily by populating an fs with a bunch of files with speculative
> > > preallocation and then generating some memory pressure. I reproduce it
> > > with the following stack, however:
> > > 
> > > [...]
> > > 
> > > ... so this isn't just a chmod thing. OTOH, I think we agree that this
> > > is not a real deadlock vector because the iolock is taken in the
> > > destroy_inode() path and so there should be no other reference to the
> > > inode.
> > > 
> > > That aside, the IOLOCK_EXCL was added to xfs_inactive() in commit
> > > a36b926180 ("xfs: pull up iolock from xfs_free_eofblocks()") purely to
> > > honor the cleaner call semantics that patch defined for
> > > xfs_free_eofblocks(). We could probably either drop the iolock from here
> > > (though we would then have to kill the assert in xfs_free_eofblocks()),
> > > or use something like the diff below that quiets the lockdep splat for
> > > me. Thoughts?
> > 
> > Well... the inode is inactive, which means there won't be any io to
> > protect ourselves against, so we don't need to take the iolock, right?
> > Why not remove the iolock grabbing and either change the iolock assert
> > in _free_eofblocks to detect that we're coming from _inactive and not
> > blow the assert, or just make a __xfs_free_eofblocks that skips the
> > assert.
> > 
> 
> Correct, the iolock is not necessary here and historically was not
> taken. It was added in the commit above to clean up the function, more
> specifically to support the assert. I'd rather not go and uglify the
> whole thing again (might as well just revert the patch in that case).
> 
> I guess we could just kill the assert then. How about the appended?

Looks reasonable.  Test, patchify, and send to list?

--D

> 
> Brian
> 
> [...]


end of thread, other threads:[~2017-04-07 17:36 UTC | newest]

Thread overview: 5+ messages
2017-04-05 20:52 lockdep warning on 4.11.0-rc5 kernel Vivek Goyal
2017-04-07 15:10 ` Brian Foster
2017-04-07 16:58   ` Darrick J. Wong
2017-04-07 17:28     ` Brian Foster
2017-04-07 17:35       ` Darrick J. Wong
