* block: kernel panic in __bio_associate_blkg+0x1e
@ 2018-12-11 2:36 Ming Lei
2018-12-11 3:09 ` Dennis Zhou
0 siblings, 1 reply; 5+ messages in thread
From: Ming Lei @ 2018-12-11 2:36 UTC
To: linux-block, Jens Axboe, Dennis Zhou; +Cc: Ming Lei
Hi Jens and Dennis,
I just hit the following issue while testing for-4.21/block with stress I/O
and device removal on scsi_debug. It is likely caused by the recent blkcg
changes.
[ 37.144330] sd 8:0:0:15: [sds] Synchronizing SCSI cache
[ 37.665644] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
[ 37.674748] PGD 8000000269c3b067 P4D 8000000269c3b067 PUD 269c3c067 PMD 0
[ 37.675703] Oops: 0000 [#1] PREEMPT SMP PTI
[ 37.676294] CPU: 2 PID: 1270 Comm: fio Not tainted 4.20.0-rc6_f0ea84586b7c_for-next+ #1
[ 37.677392] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-2.fc27 04/01/2014
[ 37.678563] RIP: 0010:__bio_associate_blkg+0x1e/0x81
[ 37.679255] Code: 00 00 5b 5d c3 0f 1f 44 00 00 eb 94 0f 1f 44 00 00 41 54 55 48 89 fd 53 48 89 f3 e8 80 ff ff ff bf 01 00 00 00 e8 94 37 d7 ff <48> 8b 43 48 a8 03 74 06 48 8b 53 40 eb 1b 65 48 ff 00 41 b4 01 eb
[ 37.681801] RSP: 0018:ffffc9000169bb80 EFLAGS: 00010246
[ 37.682525] RAX: ffff888269c40000 RBX: 0000000000000000 RCX: 0000000000000000
[ 37.683506] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff8132c44f
[ 37.684486] RBP: ffff888270fecb18 R08: 000000000000001e R09: ffffffffffffffff
[ 37.685473] R10: 00000000ffffffca R11: 0000000000000000 R12: ffff88826630b758
[ 37.686450] R13: ffff888270fecb18 R14: ffff888107811118 R15: ffff888270fecb18
[ 37.687435] FS: 00007f7d486c6ec0(0000) GS:ffff888277b00000(0000) knlGS:0000000000000000
[ 37.688548] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 37.689345] CR2: 0000000000000048 CR3: 0000000269c4a001 CR4: 0000000000760ee0
[ 37.690333] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 37.691330] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 37.692320] PKRU: 55555554
[ 37.692704] Call Trace:
[ 37.693070] bio_associate_blkg_from_css+0x4e/0x57
[ 37.693734] bio_associate_blkg+0x4d/0x53
[ 37.694300] blkdev_direct_IO+0x1d4/0x3c9
[ 37.694861] ? __switch_to_asm+0x34/0x70
[ 37.695417] ? aio_complete+0x2cc/0x2cc
[ 37.695962] ? __switch_to_asm+0x34/0x70
[ 37.696511] ? __switch_to_asm+0x40/0x70
[ 37.697066] ? __switch_to_asm+0x34/0x70
[ 37.697614] ? __switch_to_asm+0x40/0x70
[ 37.698166] ? __switch_to_asm+0x34/0x70
[ 37.698717] ? generic_file_read_iter+0x96/0x110
[ 37.699366] generic_file_read_iter+0x96/0x110
[ 37.699991] aio_read+0xe9/0x178
[ 37.700448] ? __switch_to_asm+0x34/0x70
[ 37.701004] ? __switch_to_asm+0x34/0x70
[ 37.701552] ? __switch_to_asm+0x40/0x70
[ 37.702109] ? __switch_to_asm+0x34/0x70
[ 37.702659] ? __switch_to_asm+0x40/0x70
[ 37.703214] ? __switch_to_asm+0x34/0x70
[ 37.703762] ? __switch_to_asm+0x40/0x70
[ 37.704317] ? __switch_to_asm+0x34/0x70
[ 37.704867] ? __switch_to_asm+0x40/0x70
[ 37.705423] ? __switch_to_asm+0x34/0x70
[ 37.705973] ? __switch_to_asm+0x40/0x70
[ 37.706523] ? __switch_to_asm+0x34/0x70
[ 37.707088] ? io_submit_one+0x2e1/0x67b
[ 37.707638] io_submit_one+0x2e1/0x67b
[ 37.708171] ? __se_sys_io_submit+0xc5/0x15e
[ 37.708770] __se_sys_io_submit+0xc5/0x15e
[ 37.709348] ? 0xffffffff81000000
[ 37.709819] ? do_syscall_64+0x84/0x13f
[ 37.710362] ? __se_sys_io_submit+0x15e/0x15e
[ 37.710987] do_syscall_64+0x84/0x13f
[ 37.711505] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 37.712216] RIP: 0033:0x7f7d471c6687
[ 37.712720] Code: 00 00 00 49 83 38 00 75 ed 49 83 78 08 00 75 e6 8b 47 0c 39 47 08 75 de 31 c0 c3 0f 1f 84 00 00 00 00 00 b8 d1 00 00 00 0f 05 <c3> 0f 1f 84 00 00 00 00 00 b8 d2 00 00 00 0f 05 c3 0f 1f 84 00 00
[ 37.715280] RSP: 002b:00007ffcb7d448c8 EFLAGS: 00000202 ORIG_RAX: 00000000000000d1
[ 37.716314] RAX: ffffffffffffffda RBX: 00007f7d22e90298 RCX: 00007f7d471c6687
[ 37.717297] RDX: 00000000009f4b20 RSI: 0000000000000001 RDI: 00007f7d484ea000
[ 37.718280] RBP: 0000000000003e70 R08: 0000000000000001 R09: 00000000008799e0
[ 37.719264] R10: 0000000000000000 R11: 0000000000000202 R12: 00007f7d22e90298
[ 37.720249] R13: 0000000000000000 R14: 00000000009f4cc0 R15: 00000000009e0c60
[ 37.721232] Modules linked in: scsi_debug isofs iTCO_wdt i2c_i801 i2c_core iTCO_vendor_support lpc_ich mfd_core ip_tables sr_mod cdrom usb_storage sd_mod ahci libahci libata crc32c_intel qemu_fw_cfg virtio_scsi dm_mirror dm_region_hash dm_log dm_mod
[ 37.724240] Dumping ftrace buffer:
[ 37.724717] (ftrace buffer empty)
[ 37.725223] CR2: 0000000000000048
[ 37.725692] ---[ end trace 4758725073447b42 ]---
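A note on reading the oops above: the faulting bytes at RIP are 48 8b 43 48, which decode to mov 0x48(%rbx),%rax, and RBX is 0, which matches CR2 = 0x48. So this looks like a structure member at offset 0x48 being read through a NULL pointer. A minimal sketch of that arithmetic follows; struct example, its padding and fault_address() are made up for illustration and are not the real struct blkcg_gq layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: padding chosen so that 'word' lands at offset 0x48. */
struct example {
	unsigned char pad[0x48];
	unsigned long word;
};

/*
 * Address that would be accessed when reading 'word' through a pointer
 * whose numeric value is 'base'. With base == 0 (a NULL pointer) this is
 * simply the member offset, which is what shows up in CR2.
 */
static uintptr_t fault_address(uintptr_t base)
{
	return base + offsetof(struct example, word);
}
```

With base set to 0, fault_address() gives 0x48, the same value the oops reports in CR2.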
Thanks,
Ming Lei
* Re: block: kernel panic in __bio_associate_blkg+0x1e
2018-12-11 2:36 block: kernel panic in __bio_associate_blkg+0x1e Ming Lei
@ 2018-12-11 3:09 ` Dennis Zhou
2018-12-11 3:22 ` Ming Lei
0 siblings, 1 reply; 5+ messages in thread
From: Dennis Zhou @ 2018-12-11 3:09 UTC
To: Ming Lei; +Cc: linux-block, Jens Axboe, Dennis Zhou, Ming Lei
Hi Ming,
On Tue, Dec 11, 2018 at 10:36:07AM +0800, Ming Lei wrote:
> Hi Jens and Dennis,
>
> Just found the following issue when testing for-4.21/block when
> running stress io & device
> remove on scsi_debug, and it should be caused by recent blkcg changes.
>
> [...]
>
Thanks for reporting this. I'm not familiar with scsi_debug; could you
please explain how to reproduce this?
Thanks,
Dennis
* Re: block: kernel panic in __bio_associate_blkg+0x1e
2018-12-11 3:09 ` Dennis Zhou
@ 2018-12-11 3:22 ` Ming Lei
2018-12-11 6:20 ` Dennis Zhou
0 siblings, 1 reply; 5+ messages in thread
From: Ming Lei @ 2018-12-11 3:22 UTC
To: Dennis Zhou; +Cc: linux-block, Jens Axboe, Ming Lei
On Tue, Dec 11, 2018 at 11:09 AM Dennis Zhou <dennis@kernel.org> wrote:
>
> Hi Ming,
>
> On Tue, Dec 11, 2018 at 10:36:07AM +0800, Ming Lei wrote:
> > Hi Jens and Dennis,
> >
> > Just found the following issue when testing for-4.21/block when
> > running stress io & device
> > remove on scsi_debug, and it should be caused by recent blkcg changes.
> >
> > [...]
> >
>
> Thanks for reporting this to me. I'm not familiar with scsi_debug would
> you please explain to me how to reproduce this?
Hi,
The issue can be reproduced reliably by passing '21' to the following
script and running it a couple of times:
http://people.redhat.com/minlei/tests/tools/scsi-stress-remove
Thanks,
Ming Lei
* Re: block: kernel panic in __bio_associate_blkg+0x1e
2018-12-11 3:22 ` Ming Lei
@ 2018-12-11 6:20 ` Dennis Zhou
2018-12-11 7:55 ` Ming Lei
0 siblings, 1 reply; 5+ messages in thread
From: Dennis Zhou @ 2018-12-11 6:20 UTC
To: Ming Lei; +Cc: Dennis Zhou, linux-block, Jens Axboe, Ming Lei
On Tue, Dec 11, 2018 at 11:22:18AM +0800, Ming Lei wrote:
> On Tue, Dec 11, 2018 at 11:09 AM Dennis Zhou <dennis@kernel.org> wrote:
> >
> > Hi Ming,
> >
> > On Tue, Dec 11, 2018 at 10:36:07AM +0800, Ming Lei wrote:
> > > Hi Jens and Dennis,
> > >
> > > Just found the following issue when testing for-4.21/block when
> > > running stress io & device
> > > remove on scsi_debug, and it should be caused by recent blkcg changes.
> > >
> > > [...]
> > >
> >
> > Thanks for reporting this to me. I'm not familiar with scsi_debug would
> > you please explain to me how to reproduce this?
>
> Hi,
>
> The issue can be reproduced reliably by passing '21' to the following
> script, and
> run it for a couple of times.
>
> http://people.redhat.com/minlei/tests/tools/scsi-stress-remove
>
Thanks for the quick response. I'm having a little bit of trouble with
my qemu setup and will try and set it up with scsi_debug properly in the
morning.
However, it looks like the problem is that the request_queue is going
away and I am not handling that case properly when associating a bio
with a blkg. I think the following should fix it, if you don't mind
testing it.
Thanks,
Dennis
---
diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h
index bf13ecb0fe4f..f025fd1e22e6 100644
--- a/include/linux/blk-cgroup.h
+++ b/include/linux/blk-cgroup.h
@@ -511,7 +511,7 @@ static inline bool blkg_tryget(struct blkcg_gq *blkg)
  */
 static inline struct blkcg_gq *blkg_tryget_closest(struct blkcg_gq *blkg)
 {
-	while (!percpu_ref_tryget(&blkg->refcnt))
+	while (blkg && !percpu_ref_tryget(&blkg->refcnt))
 		blkg = blkg->parent;
 
 	return blkg;
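
To illustrate the intent of the change, here is a minimal user-space sketch of the same parent walk with the NULL guard. It is not kernel code: struct node, its alive flag, node_tryget() and tryget_closest() are hypothetical stand-ins for struct blkcg_gq, the percpu refcount, percpu_ref_tryget() and blkg_tryget_closest().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct blkcg_gq: only what the walk needs. */
struct node {
	struct node *parent;	/* NULL at the top of the hierarchy */
	bool alive;		/* stand-in for "a reference can still be taken" */
};

/* Stand-in for percpu_ref_tryget(): succeeds only while the node is alive. */
static bool node_tryget(const struct node *n)
{
	return n->alive;
}

/*
 * Walk towards the root until a reference can be taken.
 * Returns NULL when the starting pointer is NULL or when every ancestor
 * is dead, instead of dereferencing a NULL pointer.
 */
static struct node *tryget_closest(struct node *n)
{
	while (n && !node_tryget(n))
		n = n->parent;

	return n;
}
```

In this sketch a NULL input, or a chain in which no ancestor is alive, yields NULL rather than a crash, so a caller would presumably still need to cope with a NULL result.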
* Re: block: kernel panic in __bio_associate_blkg+0x1e
2018-12-11 6:20 ` Dennis Zhou
@ 2018-12-11 7:55 ` Ming Lei
0 siblings, 0 replies; 5+ messages in thread
From: Ming Lei @ 2018-12-11 7:55 UTC
To: Dennis Zhou; +Cc: Ming Lei, linux-block, Jens Axboe
On Tue, Dec 11, 2018 at 01:20:30AM -0500, Dennis Zhou wrote:
> On Tue, Dec 11, 2018 at 11:22:18AM +0800, Ming Lei wrote:
> > On Tue, Dec 11, 2018 at 11:09 AM Dennis Zhou <dennis@kernel.org> wrote:
> > >
> > > Hi Ming,
> > >
> > > On Tue, Dec 11, 2018 at 10:36:07AM +0800, Ming Lei wrote:
> > > > Hi Jens and Dennis,
> > > >
> > > > Just found the following issue when testing for-4.21/block when
> > > > running stress io & device
> > > > remove on scsi_debug, and it should be caused by recent blkcg changes.
> > > >
> > > > [...]
> > > >
> > >
> > > Thanks for reporting this to me. I'm not familiar with scsi_debug would
> > > you please explain to me how to reproduce this?
> >
> > Hi,
> >
> > The issue can be reproduced reliably by passing '21' to the following
> > script, and
> > run it for a couple of times.
> >
> > http://people.redhat.com/minlei/tests/tools/scsi-stress-remove
> >
>
> Thanks for the quick response. I'm having a little bit of trouble with
> my qemu setup and will try and set it up with scsi_debug properly in the
> morning.
You can run the test on scsi_debug on any machine; it is not limited to qemu.
>
> However, it seems to me that the issue is with the request_queue going
> away and me not handling that scenario properly when doing association.
> I think the following should fix the issue, if you don't mind testing
> it.
>
> Thanks,
> Dennis
>
> ---
> diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h
> index bf13ecb0fe4f..f025fd1e22e6 100644
> --- a/include/linux/blk-cgroup.h
> +++ b/include/linux/blk-cgroup.h
> @@ -511,7 +511,7 @@ static inline bool blkg_tryget(struct blkcg_gq *blkg)
>   */
>  static inline struct blkcg_gq *blkg_tryget_closest(struct blkcg_gq *blkg)
>  {
> -	while (!percpu_ref_tryget(&blkg->refcnt))
> +	while (blkg && !percpu_ref_tryget(&blkg->refcnt))
>  		blkg = blkg->parent;
>  
>  	return blkg;
With the above patch applied, the 'scsi-stress-remove' test mentioned earlier
now passes, without any panic.
Thanks,
Ming
Thread overview: 5+ messages
2018-12-11 2:36 block: kernel panic in __bio_associate_blkg+0x1e Ming Lei
2018-12-11 3:09 ` Dennis Zhou
2018-12-11 3:22 ` Ming Lei
2018-12-11 6:20 ` Dennis Zhou
2018-12-11 7:55 ` Ming Lei