* Lockdep splat with xfs
@ 2023-01-19  1:28 Damien Le Moal
  2023-01-19  4:52 ` Dave Chinner
  0 siblings, 1 reply; 3+ messages in thread
From: Damien Le Moal @ 2023-01-19  1:28 UTC (permalink / raw)
  To: linux-xfs, Dave Chinner, Darrick J. Wong

I got the lockdep splat below while running 6.2-rc3.

The machine is currently running some SMR and CMR drive benchmarks, and xfs
is used only for the rootfs (on an M.2 SSD) to log test results, so nothing
is really exercising xfs.

My tests are still running (they take several days, so I do not want to
interrupt them), so I have not tried the latest Linus tree yet. Have you had
any reports of something similar? Is this fixed already? I did not dig into
the issue :)


======================================================
WARNING: possible circular locking dependency detected
6.2.0-rc3+ #1637 Not tainted
------------------------------------------------------
kswapd0/177 is trying to acquire lock:
ffff8881fe452118 (&xfs_dir_ilock_class){++++}-{3:3}, at:
xfs_icwalk_ag+0x9d8/0x11f0 [xfs]

but task is already holding lock:
ffffffff83b5d280 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x760/0xf90

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (fs_reclaim){+.+.}-{0:0}:
       fs_reclaim_acquire+0x122/0x170
       __alloc_pages+0x1b3/0x690
       __stack_depot_save+0x3b4/0x4b0
       kasan_save_stack+0x32/0x40
       kasan_set_track+0x25/0x30
       __kasan_kmalloc+0x88/0x90
       __kmalloc_node+0x5a/0xc0
       xfs_attr_copy_value+0xf2/0x170 [xfs]
       xfs_attr_get+0x36a/0x4b0 [xfs]
       xfs_get_acl+0x1a5/0x3f0 [xfs]
       __get_acl.part.0+0x1d5/0x2e0
       vfs_get_acl+0x11b/0x1a0
       do_get_acl+0x39/0x520
       do_getxattr+0xcb/0x330
       getxattr+0xde/0x140
       path_getxattr+0xc1/0x140
       do_syscall_64+0x38/0x80
       entry_SYSCALL_64_after_hwframe+0x46/0xb0

-> #0 (&xfs_dir_ilock_class){++++}-{3:3}:
       __lock_acquire+0x2b91/0x69e0
       lock_acquire+0x1a3/0x520
       down_write_nested+0x9c/0x240
       xfs_icwalk_ag+0x9d8/0x11f0 [xfs]
       xfs_icwalk+0x4c/0xd0 [xfs]
       xfs_reclaim_inodes_nr+0x148/0x1f0 [xfs]
       super_cache_scan+0x3a5/0x500
       do_shrink_slab+0x324/0x900
       shrink_slab+0x376/0x4f0
       shrink_node+0x80f/0x1ae0
       balance_pgdat+0x6e2/0xf90
       kswapd+0x312/0x9b0
       kthread+0x29f/0x340
       ret_from_fork+0x1f/0x30

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(fs_reclaim);
                               lock(&xfs_dir_ilock_class);
                               lock(fs_reclaim);
  lock(&xfs_dir_ilock_class);

 *** DEADLOCK ***

3 locks held by kswapd0/177:
 #0: ffffffff83b5d280 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x760/0xf90
 #1: ffffffff83b2b8b0 (shrinker_rwsem){++++}-{3:3}, at: shrink_slab+0x237/0x4f0
 #2: ffff8881a73cc0e0 (&type->s_umount_key#36){++++}-{3:3}, at:
super_cache_scan+0x58/0x500

stack backtrace:
CPU: 16 PID: 177 Comm: kswapd0 Not tainted 6.2.0-rc3+ #1637
Hardware name: Supermicro AS -2014CS-TR/H12SSW-AN6, BIOS 2.4 02/23/2022
Call Trace:
 <TASK>
 dump_stack_lvl+0x50/0x63
 check_noncircular+0x268/0x310
 ? print_circular_bug+0x440/0x440
 ? check_path.constprop.0+0x24/0x50
 ? save_trace+0x46/0xd00
 ? add_lock_to_list+0x188/0x5a0
 __lock_acquire+0x2b91/0x69e0
 ? lockdep_hardirqs_on_prepare+0x410/0x410
 lock_acquire+0x1a3/0x520
 ? xfs_icwalk_ag+0x9d8/0x11f0 [xfs]
 ? lock_downgrade+0x6d0/0x6d0
 ? lock_is_held_type+0xdc/0x130
 down_write_nested+0x9c/0x240
 ? xfs_icwalk_ag+0x9d8/0x11f0 [xfs]
 ? up_read+0x30/0x30
 ? xfs_icwalk_ag+0x9d8/0x11f0 [xfs]
 ? rcu_read_lock_sched_held+0x3f/0x70
 ? xfs_ilock+0x252/0x2f0 [xfs]
 xfs_icwalk_ag+0x9d8/0x11f0 [xfs]
 ? xfs_inode_free_cowblocks+0x1f0/0x1f0 [xfs]
 ? lock_is_held_type+0xdc/0x130
 ? find_held_lock+0x2d/0x110
 ? xfs_perag_get+0x2c0/0x2c0 [xfs]
 ? rwlock_bug.part.0+0x90/0x90
 xfs_icwalk+0x4c/0xd0 [xfs]
 xfs_reclaim_inodes_nr+0x148/0x1f0 [xfs]
 ? xfs_reclaim_inodes+0x1f0/0x1f0 [xfs]
 super_cache_scan+0x3a5/0x500
 do_shrink_slab+0x324/0x900
 shrink_slab+0x376/0x4f0
 ? set_shrinker_bit+0x230/0x230
 ? mem_cgroup_calculate_protection+0x4a/0x4e0
 shrink_node+0x80f/0x1ae0
 balance_pgdat+0x6e2/0xf90
 ? finish_task_switch.isra.0+0x218/0x920
 ? shrink_node+0x1ae0/0x1ae0
 ? lock_is_held_type+0xdc/0x130
 kswapd+0x312/0x9b0
 ? balance_pgdat+0xf90/0xf90
 ? prepare_to_swait_exclusive+0x250/0x250
 ? __kthread_parkme+0xc1/0x1f0
 ? schedule+0x151/0x230
 ? balance_pgdat+0xf90/0xf90
 kthread+0x29f/0x340
 ? kthread_complete_and_exit+0x30/0x30
 ret_from_fork+0x1f/0x30
 </TASK>


-- 
Damien Le Moal
Western Digital Research


* Re: Lockdep splat with xfs
  2023-01-19  1:28 Lockdep splat with xfs Damien Le Moal
@ 2023-01-19  4:52 ` Dave Chinner
  2023-01-19  5:13   ` Damien Le Moal
  0 siblings, 1 reply; 3+ messages in thread
From: Dave Chinner @ 2023-01-19  4:52 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: linux-xfs, Dave Chinner, Darrick J. Wong, kasan-dev, Andrey Ryabinin

[cc kasan list as this is a kasan bug]

On Thu, Jan 19, 2023 at 10:28:38AM +0900, Damien Le Moal wrote:
> I got the lockdep splat below while running 6.2-rc3.
> 
> The machine is currently running some SMR and CMR drive benchmarks, and xfs
> is used only for the rootfs (on an M.2 SSD) to log test results, so nothing
> is really exercising xfs.
> 
> My tests are still running (they take several days, so I do not want to
> interrupt them), so I have not tried the latest Linus tree yet. Have you had
> any reports of something similar? Is this fixed already? I did not dig into
> the issue :)
> 
> 
> ======================================================
> WARNING: possible circular locking dependency detected
> 6.2.0-rc3+ #1637 Not tainted
> ------------------------------------------------------
> kswapd0/177 is trying to acquire lock:
> ffff8881fe452118 (&xfs_dir_ilock_class){++++}-{3:3}, at:
> xfs_icwalk_ag+0x9d8/0x11f0 [xfs]
> 
> but task is already holding lock:
> ffffffff83b5d280 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x760/0xf90
> 
> which lock already depends on the new lock.
> 
> 
> the existing dependency chain (in reverse order) is:
> 
> -> #1 (fs_reclaim){+.+.}-{0:0}:
>        fs_reclaim_acquire+0x122/0x170
>        __alloc_pages+0x1b3/0x690
>        __stack_depot_save+0x3b4/0x4b0
>        kasan_save_stack+0x32/0x40
>        kasan_set_track+0x25/0x30
>        __kasan_kmalloc+0x88/0x90
>        __kmalloc_node+0x5a/0xc0
>        xfs_attr_copy_value+0xf2/0x170 [xfs]

It's a false positive, and the allocation context it comes from
in XFS is documented as needing to avoid lockdep tracking because
this path is known to trigger false-positive memory reclaim recursion
reports:

        if (!args->value) {
                args->value = kvmalloc(valuelen, GFP_KERNEL | __GFP_NOLOCKDEP);
                if (!args->value)
                        return -ENOMEM;
        }
        args->valuelen = valuelen;


XFS is telling the allocator not to track this allocation with
lockdep, and that request is honoured by the allocator, which does
not pass the allocation to lockdep (correct behaviour!). But then
KASAN tries to track the allocation, and doing so requires a memory
allocation of its own.  __stack_depot_save() is passed the gfp mask
from the allocation context, so it has __GFP_NOLOCKDEP right there,
but it does:

        if (unlikely(can_alloc && !smp_load_acquire(&next_slab_inited))) {
                /*
                 * Zero out zone modifiers, as we don't have specific zone
                 * requirements. Keep the flags related to allocation in atomic
                 * contexts and I/O.
                 */
                alloc_flags &= ~GFP_ZONEMASK;
>>>>>>>         alloc_flags &= (GFP_ATOMIC | GFP_KERNEL);
                alloc_flags |= __GFP_NOWARN;
                page = alloc_pages(alloc_flags, STACK_ALLOC_ORDER);

It masks out anything other than GFP_ATOMIC and GFP_KERNEL
related flags. This drops __GFP_NOLOCKDEP on the floor, so lockdep
tracks an allocation in a context we've explicitly said not to
track. Hence lockdep (correctly!) explodes later when the
false-positive "lock inode in reclaim context" situation triggers.

This is a KASAN bug. It should not be dropping __GFP_NOLOCKDEP from
the allocation context flags.
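
As an illustration of what I mean, the untested sketch below (not an
actual patch, and the real fix may need to touch more of the stack
depot code than this) simply lets __GFP_NOLOCKDEP survive that
masking step:

        if (unlikely(can_alloc && !smp_load_acquire(&next_slab_inited))) {
                /*
                 * Zero out zone modifiers, as we don't have specific zone
                 * requirements. Keep the flags related to allocation in atomic
                 * contexts and I/O, and preserve __GFP_NOLOCKDEP so that the
                 * caller's request to skip lockdep tracking is not lost.
                 */
                alloc_flags &= ~GFP_ZONEMASK;
                alloc_flags &= (GFP_ATOMIC | GFP_KERNEL | __GFP_NOLOCKDEP);
                alloc_flags |= __GFP_NOWARN;
                page = alloc_pages(alloc_flags, STACK_ALLOC_ORDER);

With the flag preserved, the stack depot's internal allocation
inherits the caller's "don't track this with lockdep" request, so the
bogus dependency recorded under the XFS inode lock never enters the
lockdep graph in the first place.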

-Dave.


-- 
Dave Chinner
david@fromorbit.com


* Re: Lockdep splat with xfs
  2023-01-19  4:52 ` Dave Chinner
@ 2023-01-19  5:13   ` Damien Le Moal
  0 siblings, 0 replies; 3+ messages in thread
From: Damien Le Moal @ 2023-01-19  5:13 UTC (permalink / raw)
  To: Dave Chinner
  Cc: linux-xfs, Dave Chinner, Darrick J. Wong, kasan-dev, Andrey Ryabinin

On 2023/01/19 13:52, Dave Chinner wrote:
> It's a false positive, and the allocation context it comes from
> in XFS is documented as needing to avoid lockdep tracking because
> this path is known to trigger false-positive memory reclaim recursion
> reports:
> 
>         if (!args->value) {
>                 args->value = kvmalloc(valuelen, GFP_KERNEL | __GFP_NOLOCKDEP);
>                 if (!args->value)
>                         return -ENOMEM;
>         }
>         args->valuelen = valuelen;
> 
> 
> XFS is telling the allocator not to track this allocation with
> lockdep, and that request is honoured by the allocator, which does
> not pass the allocation to lockdep (correct behaviour!). But then
> KASAN tries to track the allocation, and doing so requires a memory
> allocation of its own.  __stack_depot_save() is passed the gfp mask
> from the allocation context, so it has __GFP_NOLOCKDEP right there,
> but it does:
> 
>         if (unlikely(can_alloc && !smp_load_acquire(&next_slab_inited))) {
>                 /*
>                  * Zero out zone modifiers, as we don't have specific zone
>                  * requirements. Keep the flags related to allocation in atomic
>                  * contexts and I/O.
>                  */
>                 alloc_flags &= ~GFP_ZONEMASK;
>>>>>>>>         alloc_flags &= (GFP_ATOMIC | GFP_KERNEL);
>                 alloc_flags |= __GFP_NOWARN;
>                 page = alloc_pages(alloc_flags, STACK_ALLOC_ORDER);
> 
> It masks out anything other than GFP_ATOMIC and GFP_KERNEL
> related flags. This drops __GFP_NOLOCKDEP on the floor, so lockdep
> tracks an allocation in a context we've explicitly said not to
> track. Hence lockdep (correctly!) explodes later when the
> false-positive "lock inode in reclaim context" situation triggers.
> 
> This is a KASAN bug. It should not be dropping __GFP_NOLOCKDEP from
> the allocation context flags.

OK. Thanks for the explanation!

-- 
Damien Le Moal
Western Digital Research


