linux-block.vger.kernel.org archive mirror
* block: kernel panic in __bio_associate_blkg+0x1e
@ 2018-12-11  2:36 Ming Lei
  2018-12-11  3:09 ` Dennis Zhou
  0 siblings, 1 reply; 5+ messages in thread
From: Ming Lei @ 2018-12-11  2:36 UTC (permalink / raw)
  To: linux-block, Jens Axboe, Dennis Zhou; +Cc: Ming Lei

Hi Jens and Dennis,

I just hit the following issue while testing for-4.21/block, running
stress I/O together with device removal on scsi_debug. It is probably
caused by the recent blkcg changes.

[   37.144330] sd 8:0:0:15: [sds] Synchronizing SCSI cache
[   37.665644] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
[   37.674748] PGD 8000000269c3b067 P4D 8000000269c3b067 PUD 269c3c067 PMD 0
[   37.675703] Oops: 0000 [#1] PREEMPT SMP PTI
[   37.676294] CPU: 2 PID: 1270 Comm: fio Not tainted 4.20.0-rc6_f0ea84586b7c_for-next+ #1
[   37.677392] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-2.fc27 04/01/2014
[   37.678563] RIP: 0010:__bio_associate_blkg+0x1e/0x81
[   37.679255] Code: 00 00 5b 5d c3 0f 1f 44 00 00 eb 94 0f 1f 44 00 00 41 54 55 48 89 fd 53 48 89 f3 e8 80 ff ff ff bf 01 00 00 00 e8 94 37 d7 ff <48> 8b 43 48 a8 03 74 06 48 8b 53 40 eb 1b 65 48 ff 00 41 b4 01 eb
[   37.681801] RSP: 0018:ffffc9000169bb80 EFLAGS: 00010246
[   37.682525] RAX: ffff888269c40000 RBX: 0000000000000000 RCX: 0000000000000000
[   37.683506] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff8132c44f
[   37.684486] RBP: ffff888270fecb18 R08: 000000000000001e R09: ffffffffffffffff
[   37.685473] R10: 00000000ffffffca R11: 0000000000000000 R12: ffff88826630b758
[   37.686450] R13: ffff888270fecb18 R14: ffff888107811118 R15: ffff888270fecb18
[   37.687435] FS:  00007f7d486c6ec0(0000) GS:ffff888277b00000(0000) knlGS:0000000000000000
[   37.688548] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   37.689345] CR2: 0000000000000048 CR3: 0000000269c4a001 CR4: 0000000000760ee0
[   37.690333] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   37.691330] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   37.692320] PKRU: 55555554
[   37.692704] Call Trace:
[   37.693070]  bio_associate_blkg_from_css+0x4e/0x57
[   37.693734]  bio_associate_blkg+0x4d/0x53
[   37.694300]  blkdev_direct_IO+0x1d4/0x3c9
[   37.694861]  ? __switch_to_asm+0x34/0x70
[   37.695417]  ? aio_complete+0x2cc/0x2cc
[   37.695962]  ? __switch_to_asm+0x34/0x70
[   37.696511]  ? __switch_to_asm+0x40/0x70
[   37.697066]  ? __switch_to_asm+0x34/0x70
[   37.697614]  ? __switch_to_asm+0x40/0x70
[   37.698166]  ? __switch_to_asm+0x34/0x70
[   37.698717]  ? generic_file_read_iter+0x96/0x110
[   37.699366]  generic_file_read_iter+0x96/0x110
[   37.699991]  aio_read+0xe9/0x178
[   37.700448]  ? __switch_to_asm+0x34/0x70
[   37.701004]  ? __switch_to_asm+0x34/0x70
[   37.701552]  ? __switch_to_asm+0x40/0x70
[   37.702109]  ? __switch_to_asm+0x34/0x70
[   37.702659]  ? __switch_to_asm+0x40/0x70
[   37.703214]  ? __switch_to_asm+0x34/0x70
[   37.703762]  ? __switch_to_asm+0x40/0x70
[   37.704317]  ? __switch_to_asm+0x34/0x70
[   37.704867]  ? __switch_to_asm+0x40/0x70
[   37.705423]  ? __switch_to_asm+0x34/0x70
[   37.705973]  ? __switch_to_asm+0x40/0x70
[   37.706523]  ? __switch_to_asm+0x34/0x70
[   37.707088]  ? io_submit_one+0x2e1/0x67b
[   37.707638]  io_submit_one+0x2e1/0x67b
[   37.708171]  ? __se_sys_io_submit+0xc5/0x15e
[   37.708770]  __se_sys_io_submit+0xc5/0x15e
[   37.709348]  ? 0xffffffff81000000
[   37.709819]  ? do_syscall_64+0x84/0x13f
[   37.710362]  ? __se_sys_io_submit+0x15e/0x15e
[   37.710987]  do_syscall_64+0x84/0x13f
[   37.711505]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   37.712216] RIP: 0033:0x7f7d471c6687
[   37.712720] Code: 00 00 00 49 83 38 00 75 ed 49 83 78 08 00 75 e6 8b 47 0c 39 47 08 75 de 31 c0 c3 0f 1f 84 00 00 00 00 00 b8 d1 00 00 00 0f 05 <c3> 0f 1f 84 00 00 00 00 00 b8 d2 00 00 00 0f 05 c3 0f 1f 84 00 00
[   37.715280] RSP: 002b:00007ffcb7d448c8 EFLAGS: 00000202 ORIG_RAX: 00000000000000d1
[   37.716314] RAX: ffffffffffffffda RBX: 00007f7d22e90298 RCX: 00007f7d471c6687
[   37.717297] RDX: 00000000009f4b20 RSI: 0000000000000001 RDI: 00007f7d484ea000
[   37.718280] RBP: 0000000000003e70 R08: 0000000000000001 R09: 00000000008799e0
[   37.719264] R10: 0000000000000000 R11: 0000000000000202 R12: 00007f7d22e90298
[   37.720249] R13: 0000000000000000 R14: 00000000009f4cc0 R15: 00000000009e0c60
[   37.721232] Modules linked in: scsi_debug isofs iTCO_wdt i2c_i801 i2c_core iTCO_vendor_support lpc_ich mfd_core ip_tables sr_mod cdrom usb_storage sd_mod ahci libahci libata crc32c_intel qemu_fw_cfg virtio_scsi dm_mirror dm_region_hash dm_log dm_mod
[   37.724240] Dumping ftrace buffer:
[   37.724717]    (ftrace buffer empty)
[   37.725223] CR2: 0000000000000048
[   37.725692] ---[ end trace 4758725073447b42 ]---


Thanks,
Ming Lei


* Re: block: kernel panic in __bio_associate_blkg+0x1e
  2018-12-11  2:36 block: kernel panic in __bio_associate_blkg+0x1e Ming Lei
@ 2018-12-11  3:09 ` Dennis Zhou
  2018-12-11  3:22   ` Ming Lei
  0 siblings, 1 reply; 5+ messages in thread
From: Dennis Zhou @ 2018-12-11  3:09 UTC (permalink / raw)
  To: Ming Lei; +Cc: linux-block, Jens Axboe, Dennis Zhou, Ming Lei

Hi Ming,

On Tue, Dec 11, 2018 at 10:36:07AM +0800, Ming Lei wrote:
> Hi Jens and Dennis,
> 
> Just found the following issue when testing for-4.21/block when
> running stress io & device
> remove on scsi_debug, and it should be caused by recent blkcg changes.
> 

Thanks for reporting this to me. I'm not familiar with scsi_debug; would
you please explain how to reproduce this?

Thanks,
Dennis


* Re: block: kernel panic in __bio_associate_blkg+0x1e
  2018-12-11  3:09 ` Dennis Zhou
@ 2018-12-11  3:22   ` Ming Lei
  2018-12-11  6:20     ` Dennis Zhou
  0 siblings, 1 reply; 5+ messages in thread
From: Ming Lei @ 2018-12-11  3:22 UTC (permalink / raw)
  To: Dennis Zhou; +Cc: linux-block, Jens Axboe, Ming Lei

On Tue, Dec 11, 2018 at 11:09 AM Dennis Zhou <dennis@kernel.org> wrote:
>
> Hi Ming,
>
> On Tue, Dec 11, 2018 at 10:36:07AM +0800, Ming Lei wrote:
> > Hi Jens and Dennis,
> >
> > Just found the following issue when testing for-4.21/block when
> > running stress io & device
> > remove on scsi_debug, and it should be caused by recent blkcg changes.
> >
>
> Thanks for reporting this to me. I'm not familiar with scsi_debug would
> you please explain to me how to reproduce this?

Hi,

The issue can be reproduced reliably by passing '21' to the following
script and running it a couple of times.

http://people.redhat.com/minlei/tests/tools/scsi-stress-remove

Thanks,
Ming Lei


* Re: block: kernel panic in __bio_associate_blkg+0x1e
  2018-12-11  3:22   ` Ming Lei
@ 2018-12-11  6:20     ` Dennis Zhou
  2018-12-11  7:55       ` Ming Lei
  0 siblings, 1 reply; 5+ messages in thread
From: Dennis Zhou @ 2018-12-11  6:20 UTC (permalink / raw)
  To: Ming Lei; +Cc: Dennis Zhou, linux-block, Jens Axboe, Ming Lei

On Tue, Dec 11, 2018 at 11:22:18AM +0800, Ming Lei wrote:
> On Tue, Dec 11, 2018 at 11:09 AM Dennis Zhou <dennis@kernel.org> wrote:
> >
> > Hi Ming,
> >
> > On Tue, Dec 11, 2018 at 10:36:07AM +0800, Ming Lei wrote:
> > > Hi Jens and Dennis,
> > >
> > > Just found the following issue when testing for-4.21/block when
> > > running stress io & device
> > > remove on scsi_debug, and it should be caused by recent blkcg changes.
> > >
> >
> > Thanks for reporting this to me. I'm not familiar with scsi_debug would
> > you please explain to me how to reproduce this?
> 
> Hi,
> 
> The issue can be reproduced reliably by passing '21' to the following
> script, and
> run it for a couple of times.
> 
> http://people.redhat.com/minlei/tests/tools/scsi-stress-remove
> 

Thanks for the quick response. I'm having a little bit of trouble with
my qemu setup and will try and set it up with scsi_debug properly in the
morning.

However, it seems to me that the issue is with the request_queue going
away and me not handling that scenario properly when doing association.
I think the following should fix the issue, if you don't mind testing
it.

Thanks,
Dennis

---
diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h
index bf13ecb0fe4f..f025fd1e22e6 100644
--- a/include/linux/blk-cgroup.h
+++ b/include/linux/blk-cgroup.h
@@ -511,7 +511,7 @@ static inline bool blkg_tryget(struct blkcg_gq *blkg)
  */
 static inline struct blkcg_gq *blkg_tryget_closest(struct blkcg_gq *blkg)
 {
-   while (!percpu_ref_tryget(&blkg->refcnt))
+   while (blkg && !percpu_ref_tryget(&blkg->refcnt))
        blkg = blkg->parent;
 
    return blkg;


* Re: block: kernel panic in __bio_associate_blkg+0x1e
  2018-12-11  6:20     ` Dennis Zhou
@ 2018-12-11  7:55       ` Ming Lei
  0 siblings, 0 replies; 5+ messages in thread
From: Ming Lei @ 2018-12-11  7:55 UTC (permalink / raw)
  To: Dennis Zhou; +Cc: Ming Lei, linux-block, Jens Axboe

On Tue, Dec 11, 2018 at 01:20:30AM -0500, Dennis Zhou wrote:
> On Tue, Dec 11, 2018 at 11:22:18AM +0800, Ming Lei wrote:
> > On Tue, Dec 11, 2018 at 11:09 AM Dennis Zhou <dennis@kernel.org> wrote:
> > >
> > > Hi Ming,
> > >
> > > On Tue, Dec 11, 2018 at 10:36:07AM +0800, Ming Lei wrote:
> > > > Hi Jens and Dennis,
> > > >
> > > > Just found the following issue when testing for-4.21/block when
> > > > running stress io & device
> > > > remove on scsi_debug, and it should be caused by recent blkcg changes.
> > > >
> > >
> > > Thanks for reporting this to me. I'm not familiar with scsi_debug would
> > > you please explain to me how to reproduce this?
> > 
> > Hi,
> > 
> > The issue can be reproduced reliably by passing '21' to the following
> > script, and
> > run it for a couple of times.
> > 
> > http://people.redhat.com/minlei/tests/tools/scsi-stress-remove
> > 
> 
> Thanks for the quick response. I'm having a little bit of trouble with
> my qemu setup and will try and set it up with scsi_debug properly in the
> morning.

You can run the test over scsi_debug on any machine; it is not limited to qemu.

> 
> However, it seems to me that the issue is with the request_queue going
> away and me not handling that scenario properly when doing association.
> I think the following should fix the issue, if you don't mind testing
> it.
> 
> Thanks,
> Dennis
> 
> ---
> diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h
> index bf13ecb0fe4f..f025fd1e22e6 100644
> --- a/include/linux/blk-cgroup.h
> +++ b/include/linux/blk-cgroup.h
> @@ -511,7 +511,7 @@ static inline bool blkg_tryget(struct blkcg_gq *blkg)
>   */
>  static inline struct blkcg_gq *blkg_tryget_closest(struct blkcg_gq *blkg)
>  {
> -   while (!percpu_ref_tryget(&blkg->refcnt))
> +   while (blkg && !percpu_ref_tryget(&blkg->refcnt))
>         blkg = blkg->parent;
>  
>     return blkg;

With the above patch applied, the 'scsi-stress-remove' test mentioned earlier
now passes without any panic.

Thanks,
Ming
