linux-kernel.vger.kernel.org archive mirror
* inconsistent lock state in xa_destroy
@ 2020-10-08 15:00 syzbot
  2020-10-08 15:01 ` Jens Axboe
  2020-10-08 21:14 ` syzbot
  0 siblings, 2 replies; 9+ messages in thread
From: syzbot @ 2020-10-08 15:00 UTC
  To: axboe, io-uring, linux-fsdevel, linux-kernel, syzkaller-bugs, viro

Hello,

syzbot found the following issue on:

HEAD commit:    e4fb79c7 Add linux-next specific files for 20201008
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=12555227900000
kernel config:  https://syzkaller.appspot.com/x/.config?x=568d41fe4341ed0f
dashboard link: https://syzkaller.appspot.com/bug?extid=cdcbdc0bd42e559b52b9
compiler:       gcc (GCC) 10.1.0-syz 20200507

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+cdcbdc0bd42e559b52b9@syzkaller.appspotmail.com

================================
WARNING: inconsistent lock state
5.9.0-rc8-next-20201008-syzkaller #0 Not tainted
--------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
syz-executor.2/6913 [HC0[0]:SC1[1]:HE0:SE0] takes:
ffff888023003c18 (&xa->xa_lock#9){+.?.}-{2:2}, at: xa_destroy+0xaa/0x350 lib/xarray.c:2205
{SOFTIRQ-ON-W} state was registered at:
  lock_acquire+0x1f2/0xaa0 kernel/locking/lockdep.c:5419
  __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
  _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
  spin_lock include/linux/spinlock.h:354 [inline]
  io_uring_add_task_file fs/io_uring.c:8607 [inline]
  io_uring_add_task_file+0x207/0x430 fs/io_uring.c:8590
  io_uring_get_fd fs/io_uring.c:9116 [inline]
  io_uring_create fs/io_uring.c:9280 [inline]
  io_uring_setup+0x2727/0x3660 fs/io_uring.c:9314
  do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
irq event stamp: 362445
hardirqs last  enabled at (362444): [<ffffffff8847f0df>] __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline]
hardirqs last  enabled at (362444): [<ffffffff8847f0df>] _raw_spin_unlock_irqrestore+0x6f/0x90 kernel/locking/spinlock.c:191
hardirqs last disabled at (362445): [<ffffffff8847f6c9>] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:108 [inline]
hardirqs last disabled at (362445): [<ffffffff8847f6c9>] _raw_spin_lock_irqsave+0xa9/0xd0 kernel/locking/spinlock.c:159
softirqs last  enabled at (361998): [<ffffffff86db0172>] tcp_close+0x8d2/0x1220 net/ipv4/tcp.c:2576
softirqs last disabled at (362079): [<ffffffff88600f2f>] asm_call_irq_on_stack+0xf/0x20

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&xa->xa_lock#9);
  <Interrupt>
    lock(&xa->xa_lock#9);

 *** DEADLOCK ***

1 lock held by syz-executor.2/6913:
 #0: ffffffff8a554c80 (rcu_callback){....}-{0:0}, at: rcu_do_batch kernel/rcu/tree.c:2474 [inline]
 #0: ffffffff8a554c80 (rcu_callback){....}-{0:0}, at: rcu_core+0x5d8/0x1240 kernel/rcu/tree.c:2718

stack backtrace:
CPU: 0 PID: 6913 Comm: syz-executor.2 Not tainted 5.9.0-rc8-next-20201008-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x198/0x1fb lib/dump_stack.c:118
 print_usage_bug kernel/locking/lockdep.c:3715 [inline]
 valid_state kernel/locking/lockdep.c:3726 [inline]
 mark_lock_irq kernel/locking/lockdep.c:3929 [inline]
 mark_lock.cold+0x32/0x74 kernel/locking/lockdep.c:4396
 mark_usage kernel/locking/lockdep.c:4281 [inline]
 __lock_acquire+0x118a/0x56d0 kernel/locking/lockdep.c:4771
 lock_acquire+0x1f2/0xaa0 kernel/locking/lockdep.c:5419
 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
 _raw_spin_lock_irqsave+0x94/0xd0 kernel/locking/spinlock.c:159
 xa_destroy+0xaa/0x350 lib/xarray.c:2205
 __io_uring_free+0x60/0xc0 fs/io_uring.c:7693
 io_uring_free include/linux/io_uring.h:40 [inline]
 __put_task_struct+0xff/0x3f0 kernel/fork.c:732
 put_task_struct include/linux/sched/task.h:111 [inline]
 delayed_put_task_struct+0x1f6/0x340 kernel/exit.c:172
 rcu_do_batch kernel/rcu/tree.c:2484 [inline]
 rcu_core+0x645/0x1240 kernel/rcu/tree.c:2718
 __do_softirq+0x203/0xab6 kernel/softirq.c:298
 asm_call_irq_on_stack+0xf/0x20
 </IRQ>
 __run_on_irqstack arch/x86/include/asm/irq_stack.h:26 [inline]
 run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:77 [inline]
 do_softirq_own_stack+0x9b/0xd0 arch/x86/kernel/irq_64.c:77
 invoke_softirq kernel/softirq.c:393 [inline]
 __irq_exit_rcu kernel/softirq.c:423 [inline]
 irq_exit_rcu+0x235/0x280 kernel/softirq.c:435
 sysvec_apic_timer_interrupt+0x51/0xf0 arch/x86/kernel/apic/apic.c:1091
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:631
RIP: 0010:memset_erms+0x9/0x10 arch/x86/lib/memset_64.S:66
Code: c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 f3 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 <f3> aa 4c 89 c8 c3 90 49 89 fa 40 0f b6 ce 48 b8 01 01 01 01 01 01
RSP: 0018:ffffc900053c7b78 EFLAGS: 00010202
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000002040
RDX: 0000000000008000 RSI: 0000000000000000 RDI: ffffc900161a5fc0
RBP: ffffc900053c7d08 R08: 0000000000000001 R09: ffffc900161a0000
R10: fffff52002c34fff R11: 0000000000000000 R12: ffff88805b9f0380
R13: ffff888010ccae08 R14: 0000000001200000 R15: 0000000000000000
 memset include/linux/string.h:384 [inline]
 alloc_thread_stack_node kernel/fork.c:232 [inline]
 dup_task_struct kernel/fork.c:864 [inline]
 copy_process+0x68a/0x6e90 kernel/fork.c:1938
 kernel_clone+0xe5/0xae0 kernel/fork.c:2456
 __do_sys_clone+0xc8/0x110 kernel/fork.c:2573
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x45c3fa
Code: f7 d8 64 89 04 25 d4 02 00 00 64 4c 8b 0c 25 10 00 00 00 31 d2 4d 8d 91 d0 02 00 00 31 f6 bf 11 00 20 01 b8 38 00 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 f5 00 00 00 85 c0 41 89 c5 0f 85 fc 00 00
RSP: 002b:00007ffe5dc445b0 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
RAX: ffffffffffffffda RBX: 00007ffe5dc445b0 RCX: 000000000045c3fa
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
RBP: 00007ffe5dc445f0 R08: 0000000000000001 R09: 0000000002f46940
R10: 0000000002f46c10 R11: 0000000000000246 R12: 0000000000000001
R13: 0000000000000000 R14: 0000000000000001 R15: 00007ffe5dc44640


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.


* Re: inconsistent lock state in xa_destroy
  2020-10-08 15:00 inconsistent lock state in xa_destroy syzbot
@ 2020-10-08 15:01 ` Jens Axboe
  2020-10-08 15:05   ` Matthew Wilcox
  2020-10-08 21:14 ` syzbot
  1 sibling, 1 reply; 9+ messages in thread
From: Jens Axboe @ 2020-10-08 15:01 UTC
  To: syzbot, io-uring, linux-fsdevel, linux-kernel, syzkaller-bugs, viro

On 10/8/20 9:00 AM, syzbot wrote:
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:    e4fb79c7 Add linux-next specific files for 20201008
> git tree:       linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=12555227900000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=568d41fe4341ed0f
> dashboard link: https://syzkaller.appspot.com/bug?extid=cdcbdc0bd42e559b52b9
> compiler:       gcc (GCC) 10.1.0-syz 20200507
> 
> Unfortunately, I don't have any reproducer for this issue yet.
> 
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+cdcbdc0bd42e559b52b9@syzkaller.appspotmail.com

Already pushed out a fix for this, it's really an xarray issue where it just
assumes that destroy can irq grab the lock.

#syz fix: io_uring: no need to call xa_destroy() on empty xarray

-- 
Jens Axboe



* Re: inconsistent lock state in xa_destroy
  2020-10-08 15:01 ` Jens Axboe
@ 2020-10-08 15:05   ` Matthew Wilcox
  2020-10-08 15:06     ` Jens Axboe
  0 siblings, 1 reply; 9+ messages in thread
From: Matthew Wilcox @ 2020-10-08 15:05 UTC
  To: Jens Axboe
  Cc: syzbot, io-uring, linux-fsdevel, linux-kernel, syzkaller-bugs, viro

On Thu, Oct 08, 2020 at 09:01:57AM -0600, Jens Axboe wrote:
> On 10/8/20 9:00 AM, syzbot wrote:
> > Hello,
> > 
> > syzbot found the following issue on:
> > 
> > HEAD commit:    e4fb79c7 Add linux-next specific files for 20201008
> > git tree:       linux-next
> > console output: https://syzkaller.appspot.com/x/log.txt?x=12555227900000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=568d41fe4341ed0f
> > dashboard link: https://syzkaller.appspot.com/bug?extid=cdcbdc0bd42e559b52b9
> > compiler:       gcc (GCC) 10.1.0-syz 20200507
> > 
> > Unfortunately, I don't have any reproducer for this issue yet.
> > 
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+cdcbdc0bd42e559b52b9@syzkaller.appspotmail.com
> 
> Already pushed out a fix for this, it's really an xarray issue where it just
> assumes that destroy can irq grab the lock.

... nice of you to report the issue to the XArray maintainer.


* Re: inconsistent lock state in xa_destroy
  2020-10-08 15:05   ` Matthew Wilcox
@ 2020-10-08 15:06     ` Jens Axboe
  2020-10-08 15:28       ` Matthew Wilcox
  0 siblings, 1 reply; 9+ messages in thread
From: Jens Axboe @ 2020-10-08 15:06 UTC
  To: Matthew Wilcox
  Cc: syzbot, io-uring, linux-fsdevel, linux-kernel, syzkaller-bugs, viro

On 10/8/20 9:05 AM, Matthew Wilcox wrote:
> On Thu, Oct 08, 2020 at 09:01:57AM -0600, Jens Axboe wrote:
>> On 10/8/20 9:00 AM, syzbot wrote:
>>> Hello,
>>>
>>> syzbot found the following issue on:
>>>
>>> HEAD commit:    e4fb79c7 Add linux-next specific files for 20201008
>>> git tree:       linux-next
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=12555227900000
>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=568d41fe4341ed0f
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=cdcbdc0bd42e559b52b9
>>> compiler:       gcc (GCC) 10.1.0-syz 20200507
>>>
>>> Unfortunately, I don't have any reproducer for this issue yet.
>>>
>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>> Reported-by: syzbot+cdcbdc0bd42e559b52b9@syzkaller.appspotmail.com
>>
>> Already pushed out a fix for this, it's really an xarray issue where it just
>> assumes that destroy can irq grab the lock.
> 
> ... nice of you to report the issue to the XArray maintainer.

This is from not even 12h ago, 10h of which I was offline. It wasn't on
the top of my list of priority items to tackle this morning, but it
is/was on the list.

-- 
Jens Axboe



* Re: inconsistent lock state in xa_destroy
  2020-10-08 15:06     ` Jens Axboe
@ 2020-10-08 15:28       ` Matthew Wilcox
  2020-10-08 15:32         ` Jens Axboe
  0 siblings, 1 reply; 9+ messages in thread
From: Matthew Wilcox @ 2020-10-08 15:28 UTC
  To: Jens Axboe
  Cc: syzbot, io-uring, linux-fsdevel, linux-kernel, syzkaller-bugs, viro

On Thu, Oct 08, 2020 at 09:06:56AM -0600, Jens Axboe wrote:
> On 10/8/20 9:05 AM, Matthew Wilcox wrote:
> > On Thu, Oct 08, 2020 at 09:01:57AM -0600, Jens Axboe wrote:
> >> On 10/8/20 9:00 AM, syzbot wrote:
> >>> Hello,
> >>>
> >>> syzbot found the following issue on:
> >>>
> >>> HEAD commit:    e4fb79c7 Add linux-next specific files for 20201008
> >>> git tree:       linux-next
> >>> console output: https://syzkaller.appspot.com/x/log.txt?x=12555227900000
> >>> kernel config:  https://syzkaller.appspot.com/x/.config?x=568d41fe4341ed0f
> >>> dashboard link: https://syzkaller.appspot.com/bug?extid=cdcbdc0bd42e559b52b9
> >>> compiler:       gcc (GCC) 10.1.0-syz 20200507
> >>>
> >>> Unfortunately, I don't have any reproducer for this issue yet.
> >>>
> >>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> >>> Reported-by: syzbot+cdcbdc0bd42e559b52b9@syzkaller.appspotmail.com
> >>
> >> Already pushed out a fix for this, it's really an xarray issue where it just
> >> assumes that destroy can irq grab the lock.
> > 
> > ... nice of you to report the issue to the XArray maintainer.
> 
> This is from not even 12h ago, 10h of which I was offline. It wasn't on
> the top of my list of priority items to tackle this morning, but it
> is/was on the list.

How's this?

diff --git a/lib/xarray.c b/lib/xarray.c
index 1e4ed5bce5dc..d84cb98d5485 100644
--- a/lib/xarray.c
+++ b/lib/xarray.c
@@ -1999,21 +1999,32 @@ EXPORT_SYMBOL_GPL(xa_delete_node);	/* For the benefit of the test suite */
  * xa_destroy() - Free all internal data structures.
  * @xa: XArray.
  *
- * After calling this function, the XArray is empty and has freed all memory
- * allocated for its internal data structures.  You are responsible for
- * freeing the objects referenced by the XArray.
- *
- * Context: Any context.  Takes and releases the xa_lock, interrupt-safe.
+ * After calling this function, the XArray is empty and has freed all
+ * memory allocated for its internal data structures.  You are responsible
+ * for freeing the objects referenced by the XArray.
+ *
+ * You do not need to call xa_destroy() if you know the XArray is
+ * already empty.  The IDR used to require this, so you may see some
+ * old code calling idr_destroy() or xa_destroy() on arrays which we
+ * know to be empty, but new code should not do this.
+ *
+ * Context: If the XArray is protected by an IRQ-safe lock, this function
+ * must not be called from interrupt context or with interrupts disabled.
+ * Otherwise it may be called from any context.  It will take and release
+ * the xa_lock with the appropriate disabling & enabling of softirqs
+ * or interrupts.
  */
 void xa_destroy(struct xarray *xa)
 {
 	XA_STATE(xas, xa, 0);
-	unsigned long flags;
+	unsigned int lock_type = xa_lock_type(xa);
 	void *entry;
 
 	xas.xa_node = NULL;
-	xas_lock_irqsave(&xas, flags);
+	xas_lock_type(&xas, lock_type);
 	entry = xa_head_locked(xa);
+	if (!entry)
+		goto out;
 	RCU_INIT_POINTER(xa->xa_head, NULL);
 	xas_init_marks(&xas);
 	if (xa_zero_busy(xa))
@@ -2021,7 +2032,8 @@ void xa_destroy(struct xarray *xa)
 	/* lockdep checks we're still holding the lock in xas_free_nodes() */
 	if (xa_is_node(entry))
 		xas_free_nodes(&xas, xa_to_node(entry));
-	xas_unlock_irqrestore(&xas, flags);
+out:
+	xas_unlock_type(&xas, lock_type);
 }
 EXPORT_SYMBOL(xa_destroy);
 

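The patch above stops taking xa_lock with irqsave unconditionally and instead selects the locking primitive from the mode the XArray was initialised with. A rough userspace sketch of that dispatch follows. It is only an illustration: the model_* names and the numeric values are hypothetical stand-ins, not the kernel's definitions, and the strings merely name the spinlock call each mode would be expected to use.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the XArray locking modes; values are illustrative. */
enum model_lock_type {
	MODEL_LOCK_PLAIN = 0,	/* lock only taken from process context */
	MODEL_LOCK_IRQ   = 1,	/* lock also taken from hardirq context */
	MODEL_LOCK_BH    = 2,	/* lock also taken from softirq context */
};

struct model_xarray {
	unsigned int flags;	/* low two bits carry the locking mode */
};

/* Models the role of xa_lock_type(): read back the mode chosen at init time. */
unsigned int model_lock_type(const struct model_xarray *xa)
{
	return xa->flags & 3;
}

/* Models the role of xas_lock_type(): pick the primitive matching the mode. */
const char *model_lock_call(unsigned int lock_type)
{
	if (lock_type == MODEL_LOCK_IRQ)
		return "spin_lock_irq";
	if (lock_type == MODEL_LOCK_BH)
		return "spin_lock_bh";
	return "spin_lock";
}

/* Self-check: each mode maps to its own primitive. */
void model_lock_selftest(void)
{
	struct model_xarray plain = { .flags = MODEL_LOCK_PLAIN };
	struct model_xarray irq = { .flags = MODEL_LOCK_IRQ };
	struct model_xarray bh = { .flags = MODEL_LOCK_BH };

	assert(strcmp(model_lock_call(model_lock_type(&plain)), "spin_lock") == 0);
	assert(strcmp(model_lock_call(model_lock_type(&irq)), "spin_lock_irq") == 0);
	assert(strcmp(model_lock_call(model_lock_type(&bh)), "spin_lock_bh") == 0);
}
```

The point of the sketch is that an XArray initialised without an IRQ or BH flag ends up on the plain spin_lock path, so the way xa_destroy() takes the lock then matches the way every other operation on that array takes it.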

* Re: inconsistent lock state in xa_destroy
  2020-10-08 15:28       ` Matthew Wilcox
@ 2020-10-08 15:32         ` Jens Axboe
  0 siblings, 0 replies; 9+ messages in thread
From: Jens Axboe @ 2020-10-08 15:32 UTC
  To: Matthew Wilcox
  Cc: syzbot, io-uring, linux-fsdevel, linux-kernel, syzkaller-bugs, viro

On 10/8/20 9:28 AM, Matthew Wilcox wrote:
> On Thu, Oct 08, 2020 at 09:06:56AM -0600, Jens Axboe wrote:
>> On 10/8/20 9:05 AM, Matthew Wilcox wrote:
>>> On Thu, Oct 08, 2020 at 09:01:57AM -0600, Jens Axboe wrote:
>>>> On 10/8/20 9:00 AM, syzbot wrote:
>>>>> Hello,
>>>>>
>>>>> syzbot found the following issue on:
>>>>>
>>>>> HEAD commit:    e4fb79c7 Add linux-next specific files for 20201008
>>>>> git tree:       linux-next
>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=12555227900000
>>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=568d41fe4341ed0f
>>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=cdcbdc0bd42e559b52b9
>>>>> compiler:       gcc (GCC) 10.1.0-syz 20200507
>>>>>
>>>>> Unfortunately, I don't have any reproducer for this issue yet.
>>>>>
>>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>>>> Reported-by: syzbot+cdcbdc0bd42e559b52b9@syzkaller.appspotmail.com
>>>>
>>>> Already pushed out a fix for this, it's really an xarray issue where it just
>>>> assumes that destroy can irq grab the lock.
>>>
>>> ... nice of you to report the issue to the XArray maintainer.
>>
>> This is from not even 12h ago, 10h of which I was offline. It wasn't on
>> the top of my list of priority items to tackle this morning, but it
>> is/was on the list.
> 
> How's this?

Looks like that'll do the trick in avoiding similar future lockdep
splats for xa_destroy().

-- 
Jens Axboe



* Re: inconsistent lock state in xa_destroy
  2020-10-08 15:00 inconsistent lock state in xa_destroy syzbot
  2020-10-08 15:01 ` Jens Axboe
@ 2020-10-08 21:14 ` syzbot
  2020-10-08 22:27   ` Matthew Wilcox
  1 sibling, 1 reply; 9+ messages in thread
From: syzbot @ 2020-10-08 21:14 UTC
  To: axboe, io-uring, linux-fsdevel, linux-kernel, syzkaller-bugs,
	viro, willy

syzbot has found a reproducer for the following issue on:

HEAD commit:    e4fb79c7 Add linux-next specific files for 20201008
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=17dda29f900000
kernel config:  https://syzkaller.appspot.com/x/.config?x=568d41fe4341ed0f
dashboard link: https://syzkaller.appspot.com/bug?extid=cdcbdc0bd42e559b52b9
compiler:       gcc (GCC) 10.1.0-syz 20200507
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=14860568500000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=16367de7900000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+cdcbdc0bd42e559b52b9@syzkaller.appspotmail.com

================================
WARNING: inconsistent lock state
5.9.0-rc8-next-20201008-syzkaller #0 Not tainted
--------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
swapper/0/0 [HC0[0]:SC1[1]:HE0:SE0] takes:
ffff888025f65018 (&xa->xa_lock#7){+.?.}-{2:2}, at: xa_destroy+0xaa/0x350 lib/xarray.c:2205
{SOFTIRQ-ON-W} state was registered at:
  lock_acquire+0x1f2/0xaa0 kernel/locking/lockdep.c:5419
  __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
  _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
  spin_lock include/linux/spinlock.h:354 [inline]
  io_uring_add_task_file fs/io_uring.c:8607 [inline]
  io_uring_add_task_file+0x207/0x430 fs/io_uring.c:8590
  io_uring_get_fd fs/io_uring.c:9116 [inline]
  io_uring_create fs/io_uring.c:9280 [inline]
  io_uring_setup+0x2727/0x3660 fs/io_uring.c:9314
  do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
irq event stamp: 120141
hardirqs last  enabled at (120140): [<ffffffff8847f0df>] __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline]
hardirqs last  enabled at (120140): [<ffffffff8847f0df>] _raw_spin_unlock_irqrestore+0x6f/0x90 kernel/locking/spinlock.c:191
hardirqs last disabled at (120141): [<ffffffff8847f6c9>] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:108 [inline]
hardirqs last disabled at (120141): [<ffffffff8847f6c9>] _raw_spin_lock_irqsave+0xa9/0xd0 kernel/locking/spinlock.c:159
softirqs last  enabled at (119956): [<ffffffff814731af>] irq_enter_rcu+0xcf/0xf0 kernel/softirq.c:360
softirqs last disabled at (119957): [<ffffffff88600f2f>] asm_call_irq_on_stack+0xf/0x20

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&xa->xa_lock#7);
  <Interrupt>
    lock(&xa->xa_lock#7);

 *** DEADLOCK ***

1 lock held by swapper/0/0:
 #0: ffffffff8a554c80 (rcu_callback){....}-{0:0}, at: rcu_do_batch kernel/rcu/tree.c:2474 [inline]
 #0: ffffffff8a554c80 (rcu_callback){....}-{0:0}, at: rcu_core+0x5d8/0x1240 kernel/rcu/tree.c:2718

stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.0-rc8-next-20201008-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x198/0x1fb lib/dump_stack.c:118
 print_usage_bug kernel/locking/lockdep.c:3715 [inline]
 valid_state kernel/locking/lockdep.c:3726 [inline]
 mark_lock_irq kernel/locking/lockdep.c:3929 [inline]
 mark_lock.cold+0x32/0x74 kernel/locking/lockdep.c:4396
 mark_usage kernel/locking/lockdep.c:4281 [inline]
 __lock_acquire+0x118a/0x56d0 kernel/locking/lockdep.c:4771
 lock_acquire+0x1f2/0xaa0 kernel/locking/lockdep.c:5419
 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
 _raw_spin_lock_irqsave+0x94/0xd0 kernel/locking/spinlock.c:159
 xa_destroy+0xaa/0x350 lib/xarray.c:2205
 __io_uring_free+0x60/0xc0 fs/io_uring.c:7693
 io_uring_free include/linux/io_uring.h:40 [inline]
 __put_task_struct+0xff/0x3f0 kernel/fork.c:732
 put_task_struct include/linux/sched/task.h:111 [inline]
 delayed_put_task_struct+0x1f6/0x340 kernel/exit.c:172
 rcu_do_batch kernel/rcu/tree.c:2484 [inline]
 rcu_core+0x645/0x1240 kernel/rcu/tree.c:2718
 __do_softirq+0x203/0xab6 kernel/softirq.c:298
 asm_call_irq_on_stack+0xf/0x20
 </IRQ>
 __run_on_irqstack arch/x86/include/asm/irq_stack.h:26 [inline]
 run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:77 [inline]
 do_softirq_own_stack+0x9b/0xd0 arch/x86/kernel/irq_64.c:77
 invoke_softirq kernel/softirq.c:393 [inline]
 __irq_exit_rcu kernel/softirq.c:423 [inline]
 irq_exit_rcu+0x235/0x280 kernel/softirq.c:435
 sysvec_apic_timer_interrupt+0x51/0xf0 arch/x86/kernel/apic/apic.c:1091
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:631
RIP: 0010:native_safe_halt+0xe/0x10 arch/x86/include/asm/irqflags.h:61
Code: 89 ef e8 b5 62 6f f9 e9 86 fe ff ff 48 89 df e8 a8 62 6f f9 e9 7b ff ff ff cc cc cc e9 07 00 00 00 0f 00 2d 54 08 61 00 fb f4 <c3> 90 e9 07 00 00 00 0f 00 2d 44 08 61 00 f4 c3 cc cc 55 53 e8 09
RSP: 0018:ffffffff8a207d48 EFLAGS: 00000293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 1ffffffff176a7c1
RDX: ffffffff8a29ce40 RSI: ffffffff8847e5c3 RDI: 0000000000000000
RBP: ffff888012d2e064 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000001
R13: ffff888012d2e000 R14: ffff888012d2e064 R15: ffff8881339b2004
 arch_safe_halt arch/x86/include/asm/paravirt.h:150 [inline]
 acpi_safe_halt drivers/acpi/processor_idle.c:111 [inline]
 acpi_idle_do_entry+0x1e8/0x330 drivers/acpi/processor_idle.c:517
 acpi_idle_enter+0x35a/0x550 drivers/acpi/processor_idle.c:648
 cpuidle_enter_state+0x1ab/0xdb0 drivers/cpuidle/cpuidle.c:237
 cpuidle_enter+0x4a/0xa0 drivers/cpuidle/cpuidle.c:351
 call_cpuidle kernel/sched/idle.c:132 [inline]
 cpuidle_idle_call kernel/sched/idle.c:213 [inline]
 do_idle+0x48e/0x730 kernel/sched/idle.c:273
 cpu_startup_entry+0x14/0x20 kernel/sched/idle.c:369
 start_kernel+0x490/0x4b1 init/main.c:1049
 secondary_startup_64_no_verify+0xa6/0xab



* Re: inconsistent lock state in xa_destroy
  2020-10-08 21:14 ` syzbot
@ 2020-10-08 22:27   ` Matthew Wilcox
  2020-10-09  0:55     ` Jens Axboe
  0 siblings, 1 reply; 9+ messages in thread
From: Matthew Wilcox @ 2020-10-08 22:27 UTC
  To: syzbot; +Cc: axboe, io-uring, linux-fsdevel, linux-kernel, syzkaller-bugs, viro


If I understand the lockdep report here, this actually isn't an XArray
issue, although I do think there is one.

On Thu, Oct 08, 2020 at 02:14:20PM -0700, syzbot wrote:
> ================================
> WARNING: inconsistent lock state
> 5.9.0-rc8-next-20201008-syzkaller #0 Not tainted
> --------------------------------
> inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
> swapper/0/0 [HC0[0]:SC1[1]:HE0:SE0] takes:
> ffff888025f65018 (&xa->xa_lock#7){+.?.}-{2:2}, at: xa_destroy+0xaa/0x350 lib/xarray.c:2205
> {SOFTIRQ-ON-W} state was registered at:
>   lock_acquire+0x1f2/0xaa0 kernel/locking/lockdep.c:5419
>   __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
>   _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
>   spin_lock include/linux/spinlock.h:354 [inline]
>   io_uring_add_task_file fs/io_uring.c:8607 [inline]

You're using the XArray in a non-interrupt-disabling mode.

>  _raw_spin_lock_irqsave+0x94/0xd0 kernel/locking/spinlock.c:159
>  xa_destroy+0xaa/0x350 lib/xarray.c:2205
>  __io_uring_free+0x60/0xc0 fs/io_uring.c:7693
>  io_uring_free include/linux/io_uring.h:40 [inline]
>  __put_task_struct+0xff/0x3f0 kernel/fork.c:732
>  put_task_struct include/linux/sched/task.h:111 [inline]
>  delayed_put_task_struct+0x1f6/0x340 kernel/exit.c:172
>  rcu_do_batch kernel/rcu/tree.c:2484 [inline]

But you're calling xa_destroy() from in-interrupt context.
So (as far as lockdep is concerned), no matter what I do in
xa_destroy(), this potential deadlock is there.  You'd need to be
using xa_init_flags(XA_FLAGS_LOCK_IRQ) if you actually needed to call
xa_destroy() here.

Fortunately, it seems you don't need to call xa_destroy() at all, so
that problem is solved, but the patch I have here wouldn't help.
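
The reasoning above (a lock class taken with softirqs enabled in one place and from softirq context in another is flagged, regardless of how the second acquisition is performed) can be illustrated with a toy userspace model. This is a sketch under stated assumptions, not lockdep's implementation: the model_* names are hypothetical, and only the two usage bits relevant to this report ({SOFTIRQ-ON-W} and {IN-SOFTIRQ-W}) are tracked.

```c
#include <assert.h>
#include <stdbool.h>

/* Context in which a lock acquisition happens (toy model, hypothetical names). */
enum model_ctx {
	MODEL_PROCESS_SOFTIRQS_ON,	/* e.g. plain spin_lock() from a syscall */
	MODEL_PROCESS_SOFTIRQS_OFF,	/* e.g. a _bh or _irq lock variant from a syscall */
	MODEL_IN_SOFTIRQ,		/* e.g. an RCU callback run from softirq */
};

/* The two usage bits that matter for this report. */
struct model_lock_class {
	bool softirq_on_w;	/* ever taken with softirqs enabled */
	bool in_softirq_w;	/* ever taken from softirq context */
};

/* Record one acquisition; return false once the class is inconsistent. */
bool model_acquire(struct model_lock_class *c, enum model_ctx ctx)
{
	if (ctx == MODEL_PROCESS_SOFTIRQS_ON)
		c->softirq_on_w = true;
	else if (ctx == MODEL_IN_SOFTIRQ)
		c->in_softirq_w = true;
	/* A softirq could interrupt the softirqs-on holder and spin forever. */
	return !(c->softirq_on_w && c->in_softirq_w);
}

/* The reported pattern: plain lock at setup, then xa_destroy() from softirq. */
bool model_reported_scenario(void)
{
	struct model_lock_class xa_lock = { false, false };

	model_acquire(&xa_lock, MODEL_PROCESS_SOFTIRQS_ON);
	return model_acquire(&xa_lock, MODEL_IN_SOFTIRQ);
}

/* Same call sites, but process context always excludes softirqs first. */
bool model_irq_safe_scenario(void)
{
	struct model_lock_class xa_lock = { false, false };

	model_acquire(&xa_lock, MODEL_PROCESS_SOFTIRQS_OFF);
	return model_acquire(&xa_lock, MODEL_IN_SOFTIRQ);
}

/* Self-check: only the reported pattern is flagged. */
void model_lockdep_selftest(void)
{
	assert(!model_reported_scenario());
	assert(model_irq_safe_scenario());
}
```

In the model, nothing about the second acquisition can clear the first bit, which mirrors the observation that changing xa_destroy() alone would not silence this report. Either the process-context users must exclude softirqs, or the softirq-context call must go away.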


* Re: inconsistent lock state in xa_destroy
  2020-10-08 22:27   ` Matthew Wilcox
@ 2020-10-09  0:55     ` Jens Axboe
  0 siblings, 0 replies; 9+ messages in thread
From: Jens Axboe @ 2020-10-09  0:55 UTC
  To: Matthew Wilcox, syzbot
  Cc: io-uring, linux-fsdevel, linux-kernel, syzkaller-bugs, viro

On 10/8/20 4:27 PM, Matthew Wilcox wrote:
> 
> If I understand the lockdep report here, this actually isn't an XArray
> issue, although I do think there is one.
> 
> On Thu, Oct 08, 2020 at 02:14:20PM -0700, syzbot wrote:
>> ================================
>> WARNING: inconsistent lock state
>> 5.9.0-rc8-next-20201008-syzkaller #0 Not tainted
>> --------------------------------
>> inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
>> swapper/0/0 [HC0[0]:SC1[1]:HE0:SE0] takes:
>> ffff888025f65018 (&xa->xa_lock#7){+.?.}-{2:2}, at: xa_destroy+0xaa/0x350 lib/xarray.c:2205
>> {SOFTIRQ-ON-W} state was registered at:
>>   lock_acquire+0x1f2/0xaa0 kernel/locking/lockdep.c:5419
>>   __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
>>   _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
>>   spin_lock include/linux/spinlock.h:354 [inline]
>>   io_uring_add_task_file fs/io_uring.c:8607 [inline]
> 
> You're using the XArray in a non-interrupt-disabling mode.
> 
>>  _raw_spin_lock_irqsave+0x94/0xd0 kernel/locking/spinlock.c:159
>>  xa_destroy+0xaa/0x350 lib/xarray.c:2205
>>  __io_uring_free+0x60/0xc0 fs/io_uring.c:7693
>>  io_uring_free include/linux/io_uring.h:40 [inline]
>>  __put_task_struct+0xff/0x3f0 kernel/fork.c:732
>>  put_task_struct include/linux/sched/task.h:111 [inline]
>>  delayed_put_task_struct+0x1f6/0x340 kernel/exit.c:172
>>  rcu_do_batch kernel/rcu/tree.c:2484 [inline]
> 
> But you're calling xa_destroy() from in-interrupt context.
> So (as far as lockdep is concerned), no matter what I do in
> xa_destroy(), this potential deadlock is there.  You'd need to be
> using xa_init_flags(XA_FLAGS_LOCK_IRQ) if you actually needed to call
> xa_destroy() here.

Yeah, good point. I guess that last free runs in softirq context, from RCU.

> Fortunately, it seems you don't need to call xa_destroy() at all, so
> that problem is solved, but the patch I have here wouldn't help.

Right, it wouldn't have helped this case.

-- 
Jens Axboe

