linux-btrfs.vger.kernel.org archive mirror
* bug in btrfs during low memory testing.
@ 2022-07-05 19:37 dai.ngo
  2022-07-05 20:26 ` Filipe Manana
  0 siblings, 1 reply; 7+ messages in thread
From: dai.ngo @ 2022-07-05 19:37 UTC (permalink / raw)
  To: linux-btrfs; +Cc: gniebler, linux-nfs

Hi,

During my low memory testing with 5.19.0-rc4, I ran into this
problem in btrfs where the thread tries to sleep on invalid
context:


Jul  3 04:08:44 nfsvmf24 kernel: BUG: sleeping function called from invalid context at include/linux/sched/mm.h:274
Jul  3 04:08:44 nfsvmf24 kernel: in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 4308, name: nfsd
Jul  3 04:08:44 nfsvmf24 kernel: preempt_count: 1, expected: 0
Jul  3 04:08:44 nfsvmf24 kernel: RCU nest depth: 0, expected: 0
Jul  3 04:08:44 nfsvmf24 kernel: 4 locks held by nfsd/4308:
Jul  3 04:08:44 nfsvmf24 kernel: #0: ffff8881052d4438 (sb_writers#14){.+.+}-{0:0}, at: nfsd4_setattr+0x17f/0x3c0 [nfsd]
Jul  3 04:08:44 nfsvmf24 kernel: #1: ffff888141e5aa50 (&sb->s_type->i_mutex_key#16){+.+.}-{3:3}, at:  nfsd_setattr+0x56a/0xe00 [nfsd]
Jul  3 04:08:44 nfsvmf24 kernel: #2: ffff8881052d4628 (sb_internal#2){.+.+}-{0:0}, at: btrfs_dirty_inode+0xda/0x1d0
Jul  3 04:08:44 nfsvmf24 kernel: #3: ffff888105ed6628 (&root->inode_lock){+.+.}-{2:2}, at: btrfs_get_or_create_delayed_node+0x27a/0x430

Jul  3 04:08:44 nfsvmf24 kernel: Preemption disabled at:
Jul  3 04:08:44 nfsvmf24 kernel: [<0000000000000000>] 0x0
Jul  3 04:08:44 nfsvmf24 kernel: CPU: 0 PID: 4308 Comm: nfsd Kdump: loaded Not tainted 5.19.0-rc4+ #1
Jul  3 04:08:44 nfsvmf24 kernel: Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
Jul  3 04:08:45 nfsvmf24 kernel: Call Trace:
Jul  3 04:08:45 nfsvmf24 kernel: <TASK>
Jul  3 04:08:45 nfsvmf24 kernel: dump_stack_lvl+0x57/0x7d
Jul  3 04:08:45 nfsvmf24 kernel: __might_resched.cold+0x222/0x26b
Jul  3 04:08:45 nfsvmf24 kernel: kmem_cache_alloc_lru+0x159/0x2c0
Jul  3 04:08:45 nfsvmf24 kernel: __xas_nomem+0x1a5/0x5d0
Jul  3 04:08:45 nfsvmf24 kernel: ? lock_acquire+0x1bb/0x500
Jul  3 04:08:45 nfsvmf24 kernel: __xa_insert+0xff/0x1d0
Jul  3 04:08:45 nfsvmf24 kernel: ? __xa_cmpxchg+0x1f0/0x1f0
Jul  3 04:08:45 nfsvmf24 kernel: ? lockdep_init_map_type+0x2c3/0x7b0
Jul  3 04:08:45 nfsvmf24 kernel: ? rwlock_bug.part.0+0x90/0x90
Jul  3 04:08:45 nfsvmf24 kernel: btrfs_get_or_create_delayed_node+0x295/0x430
Jul  3 04:08:45 nfsvmf24 kernel: btrfs_delayed_update_inode+0x24/0x670
Jul  3 04:08:45 nfsvmf24 kernel: ? btrfs_check_and_init_root_item+0x1f0/0x1f0
Jul  3 04:08:45 nfsvmf24 kernel: ? start_transaction+0x219/0x12d0
Jul  3 04:08:45 nfsvmf24 kernel: btrfs_update_inode+0x1aa/0x430
Jul  3 04:08:45 nfsvmf24 kernel: btrfs_dirty_inode+0xf7/0x1d0
Jul  3 04:08:45 nfsvmf24 kernel: btrfs_setattr+0x928/0x1400
Jul  3 04:08:45 nfsvmf24 kernel: ? mark_held_locks+0x9e/0xe0
Jul  3 04:08:45 nfsvmf24 kernel: ? _raw_spin_unlock+0x29/0x40
Jul  3 04:08:45 nfsvmf24 kernel: ? btrfs_cont_expand+0x9a0/0x9a0
Jul  3 04:08:45 nfsvmf24 kernel: ? fh_update+0x1e0/0x1e0 [nfsd]
Jul  3 04:08:45 nfsvmf24 kernel: ? current_time+0x88/0xd0
Jul  3 04:08:45 nfsvmf24 kernel: ? timestamp_truncate+0x230/0x230
Jul  3 04:08:45 nfsvmf24 kernel: notify_change+0x4b3/0xe40
Jul  3 04:08:45 nfsvmf24 kernel: ? down_write_nested+0xd4/0x130
Jul  3 04:08:45 nfsvmf24 kernel: ? nfsd_setattr+0x984/0xe00 [nfsd]
Jul  3 04:08:45 nfsvmf24 kernel: nfsd_setattr+0x984/0xe00 [nfsd]
Jul  3 04:08:45 nfsvmf24 kernel: ? lock_downgrade+0x680/0x680
Jul  3 04:08:45 nfsvmf24 kernel: ? nfsd_permission+0x310/0x310 [nfsd]
Jul  3 04:08:45 nfsvmf24 kernel: ? __mnt_want_write+0x192/0x270
Jul  3 04:08:45 nfsvmf24 kernel: nfsd4_setattr+0x1df/0x3c0 [nfsd]
Jul  3 04:08:45 nfsvmf24 kernel: ? percpu_counter_add_batch+0x79/0x130
Jul  3 04:08:45 nfsvmf24 kernel: nfsd4_proc_compound+0xce8/0x23e0 [nfsd]
Jul  3 04:08:45 nfsvmf24 kernel: nfsd_dispatch+0x4ed/0xc10 [nfsd]
Jul  3 04:08:45 nfsvmf24 kernel: ? svc_reserve+0xb5/0x130 [sunrpc]
Jul  3 04:08:45 nfsvmf24 kernel: svc_process_common+0xd3f/0x1b00 [sunrpc]
Jul  3 04:08:45 nfsvmf24 kernel: ? svc_generic_rpcbind_set+0x4d0/0x4d0 [sunrpc]
Jul  3 04:08:45 nfsvmf24 kernel: ? nfsd_svc+0xc50/0xc50 [nfsd]
Jul  3 04:08:45 nfsvmf24 kernel: ? svc_sock_secure_port+0x37/0x50 [sunrpc]
Jul  3 04:08:45 nfsvmf24 kernel: ? svc_recv+0x1096/0x2350 [sunrpc]
Jul  3 04:08:45 nfsvmf24 kernel: svc_process+0x361/0x4f0 [sunrpc]
Jul  3 04:08:45 nfsvmf24 kernel: nfsd+0x2d6/0x570 [nfsd]
Jul  3 04:08:45 nfsvmf24 kernel: ? nfsd_shutdown_threads+0x2a0/0x2a0 [nfsd]
Jul  3 04:08:45 nfsvmf24 kernel: kthread+0x2a1/0x340
Jul  3 04:08:45 nfsvmf24 kernel: ? kthread_complete_and_exit+0x20/0x20
Jul  3 04:08:45 nfsvmf24 kernel: ret_from_fork+0x22/0x30
Jul  3 04:08:45 nfsvmf24 kernel: </TASK>
Jul  3 04:08:45 nfsvmf24 kernel:


I think the problem was introduced by commit 253bf57555e
("btrfs: turn delayed_nodes_tree into an XArray").

The spin_lock(&root->inode_lock) is held while xa_insert() tries
to allocate memory, which may block.

-Dai


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: bug in btrfs during low memory testing.
  2022-07-05 19:37 bug in btrfs during low memory testing dai.ngo
@ 2022-07-05 20:26 ` Filipe Manana
  2022-07-05 20:29   ` Matthew Wilcox
  0 siblings, 1 reply; 7+ messages in thread
From: Filipe Manana @ 2022-07-05 20:26 UTC (permalink / raw)
  To: dai.ngo; +Cc: linux-btrfs, gniebler, linux-nfs, Matthew Wilcox

On Tue, Jul 5, 2022 at 9:18 PM <dai.ngo@oracle.com> wrote:
>
> Hi,
>
> During my low memory testing with 5.19.0-rc4, I ran into this
> problem in btrfs where the thread tries to sleep on invalid
> context:
>
>
> Jul  3 04:08:44 nfsvmf24 kernel: BUG: sleeping function called from invalid context at include/linux/sched/mm.h:274
> Jul  3 04:08:44 nfsvmf24 kernel: in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 4308, name: nfsd
> Jul  3 04:08:44 nfsvmf24 kernel: preempt_count: 1, expected: 0
> Jul  3 04:08:44 nfsvmf24 kernel: RCU nest depth: 0, expected: 0
> Jul  3 04:08:44 nfsvmf24 kernel: 4 locks held by nfsd/4308:
> Jul  3 04:08:44 nfsvmf24 kernel: #0: ffff8881052d4438 (sb_writers#14){.+.+}-{0:0}, at: nfsd4_setattr+0x17f/0x3c0 [nfsd]
> Jul  3 04:08:44 nfsvmf24 kernel: #1: ffff888141e5aa50 (&sb->s_type->i_mutex_key#16){+.+.}-{3:3}, at:  nfsd_setattr+0x56a/0xe00 [nfsd]
> Jul  3 04:08:44 nfsvmf24 kernel: #2: ffff8881052d4628 (sb_internal#2){.+.+}-{0:0}, at: btrfs_dirty_inode+0xda/0x1d0
> Jul  3 04:08:44 nfsvmf24 kernel: #3: ffff888105ed6628 (&root->inode_lock){+.+.}-{2:2}, at: btrfs_get_or_create_delayed_node+0x27a/0x430
>
> Jul  3 04:08:44 nfsvmf24 kernel: Preemption disabled at:
> Jul  3 04:08:44 nfsvmf24 kernel: [<0000000000000000>] 0x0
> Jul  3 04:08:44 nfsvmf24 kernel: CPU: 0 PID: 4308 Comm: nfsd Kdump: loaded Not tainted 5.19.0-rc4+ #1
> Jul  3 04:08:44 nfsvmf24 kernel: Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
> Jul  3 04:08:45 nfsvmf24 kernel: Call Trace:
> Jul  3 04:08:45 nfsvmf24 kernel: <TASK>
> Jul  3 04:08:45 nfsvmf24 kernel: dump_stack_lvl+0x57/0x7d
> Jul  3 04:08:45 nfsvmf24 kernel: __might_resched.cold+0x222/0x26b
> Jul  3 04:08:45 nfsvmf24 kernel: kmem_cache_alloc_lru+0x159/0x2c0
> Jul  3 04:08:45 nfsvmf24 kernel: __xas_nomem+0x1a5/0x5d0
> Jul  3 04:08:45 nfsvmf24 kernel: ? lock_acquire+0x1bb/0x500
> Jul  3 04:08:45 nfsvmf24 kernel: __xa_insert+0xff/0x1d0
> Jul  3 04:08:45 nfsvmf24 kernel: ? __xa_cmpxchg+0x1f0/0x1f0
> Jul  3 04:08:45 nfsvmf24 kernel: ? lockdep_init_map_type+0x2c3/0x7b0
> Jul  3 04:08:45 nfsvmf24 kernel: ? rwlock_bug.part.0+0x90/0x90
> Jul  3 04:08:45 nfsvmf24 kernel: btrfs_get_or_create_delayed_node+0x295/0x430
> Jul  3 04:08:45 nfsvmf24 kernel: btrfs_delayed_update_inode+0x24/0x670
> Jul  3 04:08:45 nfsvmf24 kernel: ? btrfs_check_and_init_root_item+0x1f0/0x1f0
> Jul  3 04:08:45 nfsvmf24 kernel: ? start_transaction+0x219/0x12d0
> Jul  3 04:08:45 nfsvmf24 kernel: btrfs_update_inode+0x1aa/0x430
> Jul  3 04:08:45 nfsvmf24 kernel: btrfs_dirty_inode+0xf7/0x1d0
> Jul  3 04:08:45 nfsvmf24 kernel: btrfs_setattr+0x928/0x1400
> Jul  3 04:08:45 nfsvmf24 kernel: ? mark_held_locks+0x9e/0xe0
> Jul  3 04:08:45 nfsvmf24 kernel: ? _raw_spin_unlock+0x29/0x40
> Jul  3 04:08:45 nfsvmf24 kernel: ? btrfs_cont_expand+0x9a0/0x9a0
> Jul  3 04:08:45 nfsvmf24 kernel: ? fh_update+0x1e0/0x1e0 [nfsd]
> Jul  3 04:08:45 nfsvmf24 kernel: ? current_time+0x88/0xd0
> Jul  3 04:08:45 nfsvmf24 kernel: ? timestamp_truncate+0x230/0x230
> Jul  3 04:08:45 nfsvmf24 kernel: notify_change+0x4b3/0xe40
> Jul  3 04:08:45 nfsvmf24 kernel: ? down_write_nested+0xd4/0x130
> Jul  3 04:08:45 nfsvmf24 kernel: ? nfsd_setattr+0x984/0xe00 [nfsd]
> Jul  3 04:08:45 nfsvmf24 kernel: nfsd_setattr+0x984/0xe00 [nfsd]
> Jul  3 04:08:45 nfsvmf24 kernel: ? lock_downgrade+0x680/0x680
> Jul  3 04:08:45 nfsvmf24 kernel: ? nfsd_permission+0x310/0x310 [nfsd]
> Jul  3 04:08:45 nfsvmf24 kernel: ? __mnt_want_write+0x192/0x270
> Jul  3 04:08:45 nfsvmf24 kernel: nfsd4_setattr+0x1df/0x3c0 [nfsd]
> Jul  3 04:08:45 nfsvmf24 kernel: ? percpu_counter_add_batch+0x79/0x130
> Jul  3 04:08:45 nfsvmf24 kernel: nfsd4_proc_compound+0xce8/0x23e0 [nfsd]
> Jul  3 04:08:45 nfsvmf24 kernel: nfsd_dispatch+0x4ed/0xc10 [nfsd]
> Jul  3 04:08:45 nfsvmf24 kernel: ? svc_reserve+0xb5/0x130 [sunrpc]
> Jul  3 04:08:45 nfsvmf24 kernel: svc_process_common+0xd3f/0x1b00 [sunrpc]
> Jul  3 04:08:45 nfsvmf24 kernel: ? svc_generic_rpcbind_set+0x4d0/0x4d0 [sunrpc]
> Jul  3 04:08:45 nfsvmf24 kernel: ? nfsd_svc+0xc50/0xc50 [nfsd]
> Jul  3 04:08:45 nfsvmf24 kernel: ? svc_sock_secure_port+0x37/0x50 [sunrpc]
> Jul  3 04:08:45 nfsvmf24 kernel: ? svc_recv+0x1096/0x2350 [sunrpc]
> Jul  3 04:08:45 nfsvmf24 kernel: svc_process+0x361/0x4f0 [sunrpc]
> Jul  3 04:08:45 nfsvmf24 kernel: nfsd+0x2d6/0x570 [nfsd]
> Jul  3 04:08:45 nfsvmf24 kernel: ? nfsd_shutdown_threads+0x2a0/0x2a0 [nfsd]
> Jul  3 04:08:45 nfsvmf24 kernel: kthread+0x2a1/0x340
> Jul  3 04:08:45 nfsvmf24 kernel: ? kthread_complete_and_exit+0x20/0x20
> Jul  3 04:08:45 nfsvmf24 kernel: ret_from_fork+0x22/0x30
> Jul  3 04:08:45 nfsvmf24 kernel: </TASK>
> Jul  3 04:08:45 nfsvmf24 kernel:
>
>
> I think the problem was introduced by commit 253bf57555e
> btrfs: turn delayed_nodes_tree into an XArray.
>
> The spin_lock(&root->inode_lock) was held while xa_insert tries
> to allocate memory and block.

Yep.

In this case we can actually call xa_insert() without holding that
spinlock, it's safe against other concurrent calls to
btrfs_get_or_create_delayed_node(), btrfs_get_delayed_node(),
btrfs_kill_delayed_inode_items(), etc.

However, looking at xa_insert() we have:

        xa_lock(xa);
        err = __xa_insert(xa, index, entry, gfp);
        xa_unlock(xa);

And xa_lock() is defined as:

        #define xa_lock(xa)             spin_lock(&(xa)->xa_lock)

So we'll always be under a spinlock even if we change btrfs to not
take the root->inode_lock spinlock.

This seems more like a general problem outside btrfs' control.
So CC'ing Willy to double check.

We have another place in btrfs that calls xa_insert() while holding some
btrfs spinlock too.

Thanks.

>
> -Dai
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: bug in btrfs during low memory testing.
  2022-07-05 20:26 ` Filipe Manana
@ 2022-07-05 20:29   ` Matthew Wilcox
  2022-07-05 20:33     ` Filipe Manana
  2022-07-06  6:36     ` Nikolay Borisov
  0 siblings, 2 replies; 7+ messages in thread
From: Matthew Wilcox @ 2022-07-05 20:29 UTC (permalink / raw)
  To: Filipe Manana; +Cc: dai.ngo, linux-btrfs, gniebler, linux-nfs

On Tue, Jul 05, 2022 at 09:26:47PM +0100, Filipe Manana wrote:
> In this case we can actually call xa_insert() without holding that
> spinlock, it's safe against other concurrent calls to
> btrfs_get_or_create_delayed_node(), btrfs_get_delayed_node(),
> btrfs_kill_delayed_inode_items(), etc.
> 
> However, looking at xa_insert() we have:
> 
>         xa_lock(xa);
>         err = __xa_insert(xa, index, entry, gfp);
>         xa_unlock(xa);
> 
> And xa_lock() is defined as:
> 
>         #define xa_lock(xa)             spin_lock(&(xa)->xa_lock)
> 
> So we'll always be under a spinlock even if we change btrfs to not
> take the root->inode_lock spinlock.
> 
> This seems more like a general problem outside btrfs' control.
> So CC'ing Willy to double check.

No, the XArray knows about its own spinlock.  It'll drop it if it needs
to allocate memory and the GFP flags indicate that the caller can sleep.
It doesn't know about your spinlock, so it can't do the same thing for
you ;-)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: bug in btrfs during low memory testing.
  2022-07-05 20:29   ` Matthew Wilcox
@ 2022-07-05 20:33     ` Filipe Manana
  2022-07-06  6:36     ` Nikolay Borisov
  1 sibling, 0 replies; 7+ messages in thread
From: Filipe Manana @ 2022-07-05 20:33 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: dai.ngo, linux-btrfs, gniebler, linux-nfs

On Tue, Jul 5, 2022 at 9:30 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Tue, Jul 05, 2022 at 09:26:47PM +0100, Filipe Manana wrote:
> > In this case we can actually call xa_insert() without holding that
> > spinlock, it's safe against other concurrent calls to
> > btrfs_get_or_create_delayed_node(), btrfs_get_delayed_node(),
> > btrfs_kill_delayed_inode_items(), etc.
> >
> > However, looking at xa_insert() we have:
> >
> >         xa_lock(xa);
> >         err = __xa_insert(xa, index, entry, gfp);
> >         xa_unlock(xa);
> >
> > And xa_lock() is defined as:
> >
> >         #define xa_lock(xa)             spin_lock(&(xa)->xa_lock)
> >
> > So we'll always be under a spinlock even if we change btrfs to not
> > take the root->inode_lock spinlock.
> >
> > This seems more like a general problem outside btrfs' control.
> > So CC'ing Willy to double check.
>
> No, the XArray knows about its own spinlock.  It'll drop it if it needs
> to allocate memory and the GFP flags indicate that the caller can sleep.
> It doesn't know about your spinlock, so it can't do the same thing for
> you ;-)

Ah, that's good to know.
Thanks Willy.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: bug in btrfs during low memory testing.
  2022-07-05 20:29   ` Matthew Wilcox
  2022-07-05 20:33     ` Filipe Manana
@ 2022-07-06  6:36     ` Nikolay Borisov
  2022-07-06 12:09       ` Matthew Wilcox
  1 sibling, 1 reply; 7+ messages in thread
From: Nikolay Borisov @ 2022-07-06  6:36 UTC (permalink / raw)
  To: Matthew Wilcox, Filipe Manana; +Cc: dai.ngo, linux-btrfs, gniebler, linux-nfs



On 5.07.22 г. 23:29 ч., Matthew Wilcox wrote:
> On Tue, Jul 05, 2022 at 09:26:47PM +0100, Filipe Manana wrote:
>> In this case we can actually call xa_insert() without holding that
>> spinlock, it's safe against other concurrent calls to
>> btrfs_get_or_create_delayed_node(), btrfs_get_delayed_node(),
>> btrfs_kill_delayed_inode_items(), etc.
>>
>> However, looking at xa_insert() we have:
>>
>>          xa_lock(xa);
>>          err = __xa_insert(xa, index, entry, gfp);
>>          xa_unlock(xa);
>>
>> And xa_lock() is defined as:
>>
>>          #define xa_lock(xa)             spin_lock(&(xa)->xa_lock)
>>
>> So we'll always be under a spinlock even if we change btrfs to not
>> take the root->inode_lock spinlock.
>>
>> This seems more like a general problem outside btrfs' control.
>> So CC'ing Willy to double check.
> 
> No, the XArray knows about its own spinlock.  It'll drop it if it needs
> to allocate memory and the GFP flags indicate that the caller can sleep.
> It doesn't know about your spinlock, so it can't do the same thing for
> you ;-)


In order to catch (and prevent) further offenders, can we perhaps have 
something like this in xa_insert():


diff --git a/include/linux/xarray.h b/include/linux/xarray.h
index c29e11b2c073..63c00b2945a2 100644
--- a/include/linux/xarray.h
+++ b/include/linux/xarray.h
@@ -770,6 +770,9 @@ static inline int __must_check xa_insert(struct xarray *xa,
 {
        int err;
 
+       if (gfpflags_allow_blocking(gfp))
+               might_sleep();
+
         xa_lock(xa);
         err = __xa_insert(xa, index, entry, gfp);
         xa_unlock(xa);


Because an alternative route to fix it is to have GFP_ATOMIC in the gfp flags.

Basically we'd want to bark on xa_insert() execution when we are 
in_atomic(), caused by something else apart from xa_lock?

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: bug in btrfs during low memory testing.
  2022-07-06  6:36     ` Nikolay Borisov
@ 2022-07-06 12:09       ` Matthew Wilcox
  2022-07-06 12:13         ` Nikolay Borisov
  0 siblings, 1 reply; 7+ messages in thread
From: Matthew Wilcox @ 2022-07-06 12:09 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: Filipe Manana, dai.ngo, linux-btrfs, gniebler, linux-nfs

On Wed, Jul 06, 2022 at 09:36:42AM +0300, Nikolay Borisov wrote:
> On 5.07.22 г. 23:29 ч., Matthew Wilcox wrote:
> > On Tue, Jul 05, 2022 at 09:26:47PM +0100, Filipe Manana wrote:
> > > In this case we can actually call xa_insert() without holding that
> > > spinlock, it's safe against other concurrent calls to
> > > btrfs_get_or_create_delayed_node(), btrfs_get_delayed_node(),
> > > btrfs_kill_delayed_inode_items(), etc.
> > > 
> > > However, looking at xa_insert() we have:
> > > 
> > >          xa_lock(xa);
> > >          err = __xa_insert(xa, index, entry, gfp);
> > >          xa_unlock(xa);
> > > 
> > > And xa_lock() is defined as:
> > > 
> > >          #define xa_lock(xa)             spin_lock(&(xa)->xa_lock)
> > > 
> > > So we'll always be under a spinlock even if we change btrfs to not
> > > take the root->inode_lock spinlock.
> > > 
> > > This seems more like a general problem outside btrfs' control.
> > > So CC'ing Willy to double check.
> > 
> > No, the XArray knows about its own spinlock.  It'll drop it if it needs
> > to allocate memory and the GFP flags indicate that the caller can sleep.
> > It doesn't know about your spinlock, so it can't do the same thing for
> > you ;-)
> 
> 
> In order to catch (and prevent) further offensive can we perhaps have
> something like that in xa_insert:
> 
> 
> diff --git a/include/linux/xarray.h b/include/linux/xarray.h
> index c29e11b2c073..63c00b2945a2 100644
> --- a/include/linux/xarray.h
> +++ b/include/linux/xarray.h
> @@ -770,6 +770,9 @@ static inline int __must_check xa_insert(struct xarray
> *xa,
>  {
>         int err;
> 
> +       if (gfpflags_allow_blocking(gfp))
> +               might_sleep()
> +
>         xa_lock(xa);
>         err = __xa_insert(xa, index, entry, gfp);
>         xa_unlock(xa);

I think you mean:

	might_alloc(gfp);

And yes, I think that makes a lot of sense.  Quite a few similar places
to do ... I'll take care of it.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: bug in btrfs during low memory testing.
  2022-07-06 12:09       ` Matthew Wilcox
@ 2022-07-06 12:13         ` Nikolay Borisov
  0 siblings, 0 replies; 7+ messages in thread
From: Nikolay Borisov @ 2022-07-06 12:13 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Filipe Manana, dai.ngo, linux-btrfs, gniebler, linux-nfs



On 6.07.22 г. 15:09 ч., Matthew Wilcox wrote:
> On Wed, Jul 06, 2022 at 09:36:42AM +0300, Nikolay Borisov wrote:
>> On 5.07.22 г. 23:29 ч., Matthew Wilcox wrote:
>>> On Tue, Jul 05, 2022 at 09:26:47PM +0100, Filipe Manana wrote:
>>>> In this case we can actually call xa_insert() without holding that
>>>> spinlock, it's safe against other concurrent calls to
>>>> btrfs_get_or_create_delayed_node(), btrfs_get_delayed_node(),
>>>> btrfs_kill_delayed_inode_items(), etc.
>>>>
>>>> However, looking at xa_insert() we have:
>>>>
>>>>           xa_lock(xa);
>>>>           err = __xa_insert(xa, index, entry, gfp);
>>>>           xa_unlock(xa);
>>>>
>>>> And xa_lock() is defined as:
>>>>
>>>>           #define xa_lock(xa)             spin_lock(&(xa)->xa_lock)
>>>>
>>>> So we'll always be under a spinlock even if we change btrfs to not
>>>> take the root->inode_lock spinlock.
>>>>
>>>> This seems more like a general problem outside btrfs' control.
>>>> So CC'ing Willy to double check.
>>>
>>> No, the XArray knows about its own spinlock.  It'll drop it if it needs
>>> to allocate memory and the GFP flags indicate that the caller can sleep.
>>> It doesn't know about your spinlock, so it can't do the same thing for
>>> you ;-)
>>
>>
>> In order to catch (and prevent) further offensive can we perhaps have
>> something like that in xa_insert:
>>
>>
>> diff --git a/include/linux/xarray.h b/include/linux/xarray.h
>> index c29e11b2c073..63c00b2945a2 100644
>> --- a/include/linux/xarray.h
>> +++ b/include/linux/xarray.h
>> @@ -770,6 +770,9 @@ static inline int __must_check xa_insert(struct xarray
>> *xa,
>>   {
>>          int err;
>>
>> +       if (gfpflags_allow_blocking(gfp))
>> +               might_sleep()
>> +
>>          xa_lock(xa);
>>          err = __xa_insert(xa, index, entry, gfp);
>>          xa_unlock(xa);
> 
> I think you mean:
> 
> 	might_alloc(gfp);
> 
> And yes, I think that makes a lot of sense.  Quite a few similar places
> to do ... I'll take care of it.

Actually I had in mind more along the lines of:

might_sleep_if(gfpflags_allow_blocking(gfp_mask));

but this is essentially what might_alloc() does plus lockdep annotations, so 
yeah, that would work. What I'd like to achieve with such a modification 
is for new users of the XArray to instantly get a splat when xa_insert() is 
executed, irrespective of whether the allocation happens or not, so that 
they have a chance to fix their code sooner rather than later.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-07-06 12:14 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-05 19:37 bug in btrfs during low memory testing dai.ngo
2022-07-05 20:26 ` Filipe Manana
2022-07-05 20:29   ` Matthew Wilcox
2022-07-05 20:33     ` Filipe Manana
2022-07-06  6:36     ` Nikolay Borisov
2022-07-06 12:09       ` Matthew Wilcox
2022-07-06 12:13         ` Nikolay Borisov
