linux-block.vger.kernel.org archive mirror
* [PATCH] block: fix possible bd_size_lock deadlock
@ 2021-03-11 12:11 yanfei.xu
  2021-03-12 19:37 ` Jens Axboe
  0 siblings, 1 reply; 3+ messages in thread
From: yanfei.xu @ 2021-03-11 12:11 UTC
  To: axboe, damien.lemoal; +Cc: linux-block, linux-kernel

From: Yanfei Xu <yanfei.xu@windriver.com>

The bd_size_lock spinlock can be taken in block softirq context, so
softirqs must be disabled before taking the lock.

WARNING: inconsistent lock state
5.12.0-rc2-syzkaller #0 Not tainted
--------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-R} usage.
kworker/u4:0/7 [HC0[0]:SC1[1]:HE0:SE0] takes:
8f87826c (&inode->i_size_seqcount){+.+-}-{0:0}, at: end_bio_bh_io_sync+0x38/0x54 fs/buffer.c:3006
{SOFTIRQ-ON-W} state was registered at:
  lock_acquire.part.0+0xf0/0x41c kernel/locking/lockdep.c:5510
  lock_acquire+0x6c/0x74 kernel/locking/lockdep.c:5483
  do_write_seqcount_begin_nested include/linux/seqlock.h:520 [inline]
  do_write_seqcount_begin include/linux/seqlock.h:545 [inline]
  i_size_write include/linux/fs.h:863 [inline]
  set_capacity+0x13c/0x1f8 block/genhd.c:50
  brd_alloc+0x130/0x180 drivers/block/brd.c:401
  brd_init+0xcc/0x1e0 drivers/block/brd.c:500
  do_one_initcall+0x8c/0x59c init/main.c:1226
  do_initcall_level init/main.c:1299 [inline]
  do_initcalls init/main.c:1315 [inline]
  do_basic_setup init/main.c:1335 [inline]
  kernel_init_freeable+0x2cc/0x330 init/main.c:1537
  kernel_init+0x10/0x120 init/main.c:1424
  ret_from_fork+0x14/0x20 arch/arm/kernel/entry-common.S:158
  0x0
irq event stamp: 2783413
hardirqs last  enabled at (2783412): [<802011ec>] __do_softirq+0xf4/0x7ac kernel/softirq.c:329
hardirqs last disabled at (2783413): [<8277d260>] __raw_read_lock_irqsave include/linux/rwlock_api_smp.h:157 [inline]
hardirqs last disabled at (2783413): [<8277d260>] _raw_read_lock_irqsave+0x84/0x88 kernel/locking/spinlock.c:231
softirqs last  enabled at (2783410): [<826b5050>] spin_unlock_bh include/linux/spinlock.h:399 [inline]
softirqs last  enabled at (2783410): [<826b5050>] batadv_nc_purge_paths+0x10c/0x148 net/batman-adv/network-coding.c:467
softirqs last disabled at (2783411): [<8024ddfc>] do_softirq_own_stack include/asm-generic/softirq_stack.h:10 [inline]
softirqs last disabled at (2783411): [<8024ddfc>] do_softirq kernel/softirq.c:248 [inline]
softirqs last disabled at (2783411): [<8024ddfc>] do_softirq+0xd8/0xe4 kernel/softirq.c:235

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&inode->i_size_seqcount);
  <Interrupt>
    lock(&inode->i_size_seqcount);

 *** DEADLOCK ***

3 locks held by kworker/u4:0/7:
 #0: 88c622a8 ((wq_completion)bat_events){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:615 [inline]
 #0: 88c622a8 ((wq_completion)bat_events){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
 #0: 88c622a8 ((wq_completion)bat_events){+.+.}-{0:0}, at: process_one_work+0x214/0x998 kernel/workqueue.c:2246
 #1: 85147ef8 ((work_completion)(&(&bat_priv->nc.work)->work)){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:615 [inline]
 #1: 85147ef8 ((work_completion)(&(&bat_priv->nc.work)->work)){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
 #1: 85147ef8 ((work_completion)(&(&bat_priv->nc.work)->work)){+.+.}-{0:0}, at: process_one_work+0x214/0x998 kernel/workqueue.c:2246
 #2: 8f878010 (&ni->size_lock){...-}-{2:2}, at: ntfs_end_buffer_async_read+0x6c/0x558 fs/ntfs/aops.c:66

Fixes: 0f47227705d8 (block: revert "block: fix bd_size_lock use")
Reported-by: syzbot+a464ba0296692a4d2692@syzkaller.appspotmail.com
Signed-off-by: Yanfei Xu <yanfei.xu@windriver.com>
---
 block/genhd.c           | 4 ++--
 block/partitions/core.c | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index c55e8f0fced1..a246fcbd6fc5 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -46,9 +46,9 @@ void set_capacity(struct gendisk *disk, sector_t sectors)
 {
 	struct block_device *bdev = disk->part0;
 
-	spin_lock(&bdev->bd_size_lock);
+	spin_lock_bh(&bdev->bd_size_lock);
 	i_size_write(bdev->bd_inode, (loff_t)sectors << SECTOR_SHIFT);
-	spin_unlock(&bdev->bd_size_lock);
+	spin_unlock_bh(&bdev->bd_size_lock);
 }
 EXPORT_SYMBOL(set_capacity);
 
diff --git a/block/partitions/core.c b/block/partitions/core.c
index 1a7558917c47..777db55debce 100644
--- a/block/partitions/core.c
+++ b/block/partitions/core.c
@@ -88,9 +88,9 @@ static int (*check_part[])(struct parsed_partitions *) = {
 
 static void bdev_set_nr_sectors(struct block_device *bdev, sector_t sectors)
 {
-	spin_lock(&bdev->bd_size_lock);
+	spin_lock_bh(&bdev->bd_size_lock);
 	i_size_write(bdev->bd_inode, (loff_t)sectors << SECTOR_SHIFT);
-	spin_unlock(&bdev->bd_size_lock);
+	spin_unlock_bh(&bdev->bd_size_lock);
 }
 
 static struct parsed_partitions *allocate_partitions(struct gendisk *hd)
-- 
2.27.0



* Re: [PATCH] block: fix possible bd_size_lock deadlock
  2021-03-11 12:11 [PATCH] block: fix possible bd_size_lock deadlock yanfei.xu
@ 2021-03-12 19:37 ` Jens Axboe
  2021-03-12 22:32   ` Damien Le Moal
  0 siblings, 1 reply; 3+ messages in thread
From: Jens Axboe @ 2021-03-12 19:37 UTC
  To: yanfei.xu, damien.lemoal; +Cc: linux-block, linux-kernel

On 3/11/21 5:11 AM, yanfei.xu@windriver.com wrote:
> From: Yanfei Xu <yanfei.xu@windriver.com>
> 
> The bd_size_lock spinlock can be taken in block softirq context, so
> softirqs must be disabled before taking the lock.
> 

Damien? We have that revert queued up for this for 5.12, but looking
at that, the state before that was kind of messy too.


-- 
Jens Axboe



* Re: [PATCH] block: fix possible bd_size_lock deadlock
  2021-03-12 19:37 ` Jens Axboe
@ 2021-03-12 22:32   ` Damien Le Moal
  0 siblings, 0 replies; 3+ messages in thread
From: Damien Le Moal @ 2021-03-12 22:32 UTC
  To: Jens Axboe, yanfei.xu; +Cc: linux-block, linux-kernel

On 2021/03/13 4:37, Jens Axboe wrote:
> On 3/11/21 5:11 AM, yanfei.xu@windriver.com wrote:
>> From: Yanfei Xu <yanfei.xu@windriver.com>
>>
>> The bd_size_lock spinlock can be taken in block softirq context, so
>> softirqs must be disabled before taking the lock.
>>
> 
> Damien? We have that revert queued up for this for 5.12, but looking
> at that, the state before that was kind of messy too.

Indeed... I was thinking about this and I think I am with Christoph on this:
drivers should not call set_capacity() from command completion context. I think
the best thing to do would be to fix the drivers that do that, but that may not
be -rc material?

Looking at this case in more detail, it is slightly different though. Here,
set_capacity() is not called from soft IRQ context. It looks like a regular
initialization, but one that seems way too early in the boot process, when a
secondary core is being initialized with IRQs not yet enabled... I think. And
the warnings come from i_size_write() calling preempt_disable() rather than
from set_capacity()'s use of spin_lock(&bdev->bd_size_lock).
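
A minimal userspace sketch of the scenario in the lockdep report, which names
&inode->i_size_seqcount: a writer opens a sequence-counter write section and is
then interrupted on the same CPU by a reader of the same counter. It assumes the
32-bit SMP case, where i_size_write() wraps the store in a seqcount write
section and readers retry while the count is odd. Every model_* name below is
hypothetical and is not kernel API; the "interrupt" is simulated by a plain
function-local loop.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model of a sequence counter (not seqcount_t). */
struct model_seqcount {
	unsigned int sequence;	/* odd while a write section is open */
};

/* Hypothetical stand-in for the inode fields involved. */
struct model_inode {
	struct model_seqcount size_seq;
	int64_t size;
};

static void model_write_begin(struct model_seqcount *s)
{
	s->sequence++;		/* count becomes odd: write in progress */
}

static void model_write_end(struct model_seqcount *s)
{
	s->sequence++;		/* count becomes even: write finished */
}

/*
 * One read attempt. Returns true and stores the size in *out only if no
 * write section was open and the count did not change during the read.
 * A real reader would loop until this succeeds.
 */
static bool model_size_read_once(const struct model_inode *inode, int64_t *out)
{
	unsigned int start = inode->size_seq.sequence;
	int64_t value;

	if (start & 1)
		return false;	/* writer active: the reader must retry */
	value = inode->size;
	if (inode->size_seq.sequence != start)
		return false;	/* a write raced with the read */
	*out = value;
	return true;
}

/*
 * Simulate a reader that interrupts the writer on the same CPU. The writer
 * cannot resume until the reader returns, so every attempt fails. Returns
 * the number of failed attempts out of max_attempts.
 */
static int model_reader_interrupts_writer(struct model_inode *inode,
					   int64_t new_size, int max_attempts)
{
	int failed = 0;
	int64_t tmp;

	model_write_begin(&inode->size_seq);
	/* Simulated interrupt: the reader runs inside the write section. */
	for (int i = 0; i < max_attempts; i++)
		if (!model_size_read_once(inode, &tmp))
			failed++;
	inode->size = new_size;
	model_write_end(&inode->size_seq);
	return failed;
}
```

In the model the reader gives up after max_attempts; a reader that retries
without bound would never return, which is the "*** DEADLOCK ***" case in the
report.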

I wonder how it is possible to have brd being initialized so early.
I am not sure how to fix that. It looks like arm arch code territory.

For now, we could revert the revert, as I do not think that Yanfei's patch is
enough: completions may come from hard IRQ context too, which is not covered by
the spin_lock_bh() variants (cf. a similar problem we are facing in scsi
completion [1]).
I do not have any good idea how to proceed though.

[1]
https://lore.kernel.org/linux-scsi/PH0PR04MB7416C8330459E92D8AA21A889B6F9@PH0PR04MB7416.namprd04.prod.outlook.com/T/#t
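
The point about spin_lock_bh() not covering hard IRQ context can be sketched as
a small lookup, assuming the usual semantics of the three spinlock variants on
the local CPU: spin_lock() masks nothing, spin_lock_bh() masks softirqs only,
and spin_lock_irqsave() masks hard IRQs (and softirqs do not run while hard
IRQs are off). The enum and function names are hypothetical and exist only for
this illustration; this is a model, not kernel code.

```c
#include <stdbool.h>

/* Hypothetical labels for the context a completion handler runs in. */
enum model_ctx {
	MODEL_CTX_SOFTIRQ,
	MODEL_CTX_HARDIRQ,
};

/* Hypothetical labels for the lock variant held by the interrupted code. */
enum model_lock {
	MODEL_LOCK_PLAIN,	/* models spin_lock() */
	MODEL_LOCK_BH,		/* models spin_lock_bh() */
	MODEL_LOCK_IRQSAVE,	/* models spin_lock_irqsave() */
};

/*
 * Returns true if a handler running in ctx can still interrupt, on the same
 * CPU, code that holds a lock taken with the given variant. A true result
 * means the handler could try to take the same lock and spin forever.
 */
static bool model_can_interrupt_holder(enum model_lock variant,
				       enum model_ctx ctx)
{
	switch (variant) {
	case MODEL_LOCK_PLAIN:
		return true;			/* nothing is masked */
	case MODEL_LOCK_BH:
		return ctx == MODEL_CTX_HARDIRQ;	/* only softirqs masked */
	case MODEL_LOCK_IRQSAVE:
		return false;			/* hard IRQs masked */
	}
	return true;
}
```

Under these assumptions, only the irqsave row is false for both contexts, which
matches the concern that the _bh variants leave hard IRQ completions uncovered.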

-- 
Damien Le Moal
Western Digital Research
