* [block regression] kernel oops triggered by removing scsi device during IO
From: Ming Lei @ 2018-04-08 4:21 UTC
To: Jens Axboe; +Cc: linux-block, Bart Van Assche, Joseph Qi
Hi,
The following kernel oops is triggered by 'removing scsi device' during
heavy IO.
'git bisect' shows that commit a063057d7c731cffa7d10740 ("block: Fix a race
between request queue removal and the block cgroup controller")
introduced this regression:
[ 42.268257] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
[ 42.269339] PGD 26bd9f067 P4D 26bd9f067 PUD 26bfec067 PMD 0
[ 42.270077] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 42.270681] Dumping ftrace buffer:
[ 42.271141] (ftrace buffer empty)
[ 42.271641] Modules linked in: scsi_debug iTCO_wdt iTCO_vendor_support crc32c_intel i2c_i801 i2c_core lpc_ich mfd_core usb_storage nvme shpchp nvme_core virtio_scsi qemu_fw_cfg ip_tables
[ 42.273770] CPU: 5 PID: 1076 Comm: fio Not tainted 4.16.0+ #49
[ 42.274530] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-2.fc27 04/01/2014
[ 42.275634] RIP: 0010:blk_throtl_bio+0x41/0x904
[ 42.276225] RSP: 0018:ffffc900033cfaa0 EFLAGS: 00010246
[ 42.276907] RAX: 0000000080000000 RBX: ffff8801bdcc5118 RCX: 0000000000000001
[ 42.277818] RDX: ffff8801bdcc5118 RSI: 0000000000000000 RDI: ffff8802641f8870
[ 42.278733] RBP: 0000000000000000 R08: 0000000000000001 R09: ffffc900033cfb94
[ 42.279651] R10: ffffc900033cfc00 R11: 0000000006ea0000 R12: ffff8802641f8870
[ 42.280567] R13: ffff88026f34f000 R14: 0000000000000000 R15: ffff8801bdcc5118
[ 42.281489] FS: 00007fc123922d40(0000) GS:ffff880272f40000(0000) knlGS:0000000000000000
[ 42.282525] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 42.283270] CR2: 0000000000000028 CR3: 000000026d7ac004 CR4: 00000000007606e0
[ 42.284194] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 42.285116] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 42.286036] PKRU: 55555554
[ 42.286393] Call Trace:
[ 42.286725] ? try_to_wake_up+0x3a3/0x3c9
[ 42.287255] ? blk_mq_hctx_notify_dead+0x135/0x135
[ 42.287880] ? gup_pud_range+0xb5/0x7e1
[ 42.288381] generic_make_request_checks+0x3cf/0x539
[ 42.289027] ? gup_pgd_range+0x8e/0xaa
[ 42.289515] generic_make_request+0x38/0x25b
[ 42.290078] ? submit_bio+0x103/0x11f
[ 42.290555] submit_bio+0x103/0x11f
[ 42.291018] ? bio_iov_iter_get_pages+0xe4/0x104
[ 42.291620] blkdev_direct_IO+0x2a3/0x3af
[ 42.292151] ? kiocb_free+0x34/0x34
[ 42.292607] ? ___preempt_schedule+0x16/0x18
[ 42.293168] ? preempt_schedule_common+0x4c/0x65
[ 42.293771] ? generic_file_read_iter+0x96/0x110
[ 42.294377] generic_file_read_iter+0x96/0x110
[ 42.294962] aio_read+0xca/0x13b
[ 42.295388] ? preempt_count_add+0x6d/0x8c
[ 42.295926] ? aio_read_events+0x287/0x2d6
[ 42.296460] ? do_io_submit+0x4d2/0x62c
[ 42.296964] do_io_submit+0x4d2/0x62c
[ 42.297446] ? do_syscall_64+0x9d/0x15e
[ 42.297950] do_syscall_64+0x9d/0x15e
[ 42.298431] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 42.299090] RIP: 0033:0x7fc12244e687
[ 42.299556] RSP: 002b:00007ffe18388a68 EFLAGS: 00000202 ORIG_RAX: 00000000000000d1
[ 42.300528] RAX: ffffffffffffffda RBX: 00007fc0fde08670 RCX: 00007fc12244e687
[ 42.301442] RDX: 0000000001d1b388 RSI: 0000000000000001 RDI: 00007fc123782000
[ 42.302359] RBP: 00000000000022d8 R08: 0000000000000001 R09: 0000000001c461e0
[ 42.303275] R10: 0000000000000000 R11: 0000000000000202 R12: 00007fc0fde08670
[ 42.304195] R13: 0000000000000000 R14: 0000000001d1d0c0 R15: 0000000001b872f0
[ 42.305117] Code: 48 85 f6 48 89 7c 24 10 75 0e 48 8b b7 b8 05 00 00 31 ed 48 85 f6 74 0f 48 63 05 75 a4 e4 00 48 8b ac c6 28 02 00 00 f6 43 15 02 <48> 8b 45 28 48 89 04 24 0f 85 28 08 00 00 8b 43 10 45 31 e4 83
[ 42.307553] RIP: blk_throtl_bio+0x41/0x904 RSP: ffffc900033cfaa0
[ 42.308328] CR2: 0000000000000028
[ 42.308920] ---[ end trace f53a144979f63b29 ]---
[ 42.309520] Kernel panic - not syncing: Fatal exception
[ 42.310635] Dumping ftrace buffer:
[ 42.311087] (ftrace buffer empty)
[ 42.311583] Kernel Offset: disabled
[ 42.312163] ---[ end Kernel panic - not syncing: Fatal exception ]---
--
Ming
* Re: [block regression] kernel oops triggered by removing scsi device during IO
From: Joseph Qi @ 2018-04-08 8:11 UTC
To: Ming Lei, Jens Axboe, Bart Van Assche; +Cc: linux-block
This is because scsi_remove_device() calls blk_cleanup_queue(), after
which all blkgs have been destroyed and root_blkg is NULL.
Thus tg is NULL, and getting td from tg (tg->td) triggers a NULL pointer
dereference.
It seems that we cannot simply move blkcg_exit_queue() up to
blk_cleanup_queue().
Thanks,
Joseph
On 18/4/8 12:21, Ming Lei wrote:
> Hi,
>
> The following kernel oops is triggered by 'removing scsi device' during
> heavy IO.
>
> 'git bisect' shows that commit a063057d7c731cffa7d10740(block: Fix a race
> between request queue removal and the block cgroup controller)
> introduced this regression:
>
> [oops trace snipped; same as the trace quoted at the top of the thread]
>
* Re: [block regression] kernel oops triggered by removing scsi device during IO
From: Ming Lei @ 2018-04-08 9:25 UTC
To: Joseph Qi; +Cc: Jens Axboe, Bart Van Assche, linux-block
On Sun, Apr 08, 2018 at 04:11:51PM +0800, Joseph Qi wrote:
> This is because scsi_remove_device() will call blk_cleanup_queue(), and
> then all blkgs have been destroyed and root_blkg is NULL.
> Thus tg is NULL and trigger NULL pointer dereference when get td from
> tg (tg->td).
> It seems that we cannot simply move blkcg_exit_queue() up to
> blk_cleanup_queue().
Maybe one per-queue blkcg should be introduced, which seems reasonable
too.
Thanks,
Ming
* Re: [block regression] kernel oops triggered by removing scsi device during IO
From: Ming Lei @ 2018-04-08 10:31 UTC
To: Joseph Qi; +Cc: Jens Axboe, Bart Van Assche, linux-block
On Sun, Apr 08, 2018 at 05:25:42PM +0800, Ming Lei wrote:
> On Sun, Apr 08, 2018 at 04:11:51PM +0800, Joseph Qi wrote:
> > This is because scsi_remove_device() will call blk_cleanup_queue(), and
> > then all blkgs have been destroyed and root_blkg is NULL.
> > Thus tg is NULL and trigger NULL pointer dereference when get td from
> > tg (tg->td).
> > It seems that we cannot simply move blkcg_exit_queue() up to
> > blk_cleanup_queue().
>
> Maybe one per-queue blkcg should be introduced, which seems reasonable
> too.
Sorry, I mean one per-queue blkcg lock.
--
Ming
* Re: [block regression] kernel oops triggered by removing scsi device during IO
From: Bart Van Assche @ 2018-04-08 14:50 UTC
To: ming.lei, axboe; +Cc: linux-block, joseph.qi
On Sun, 2018-04-08 at 12:21 +0800, Ming Lei wrote:
> The following kernel oops is triggered by 'removing scsi device' during
> heavy IO.

How did you trigger this oops?

Bart.
* Re: [block regression] kernel oops triggered by removing scsi device during IO
From: Bart Van Assche @ 2018-04-08 14:58 UTC
To: axboe, joseph.qi, ming.lei; +Cc: linux-block
On Sun, 2018-04-08 at 16:11 +0800, Joseph Qi wrote:
> This is because scsi_remove_device() will call blk_cleanup_queue(), and
> then all blkgs have been destroyed and root_blkg is NULL.
> Thus tg is NULL and trigger NULL pointer dereference when get td from
> tg (tg->td).
> It seems that we cannot simply move blkcg_exit_queue() up to
> blk_cleanup_queue().

Had you considered to add a blk_queue_enter() / blk_queue_exit() pair in
generic_make_request()? blk_queue_enter() namely checks the DYING flag.

Thanks,

Bart.
* Re: [block regression] kernel oops triggered by removing scsi device during IO
From: Joseph Qi @ 2018-04-09 1:33 UTC
To: Bart Van Assche, ming.lei, axboe; +Cc: linux-block
Hi Bart,
On 18/4/8 22:50, Bart Van Assche wrote:
> On Sun, 2018-04-08 at 12:21 +0800, Ming Lei wrote:
>> The following kernel oops is triggered by 'removing scsi device' during
>> heavy IO.
>
> How did you trigger this oops?
>
I can reproduce this oops with the following steps:
1) start a fio job doing buffered writes;
2) remove the scsi device that fio writes to:
echo "scsi remove-single-device ${dev}" > /proc/scsi/scsi
* Re: [block regression] kernel oops triggered by removing scsi device during IO
From: Ming Lei @ 2018-04-09 2:48 UTC
To: Joseph Qi; +Cc: Bart Van Assche, axboe, linux-block
On Mon, Apr 09, 2018 at 09:33:08AM +0800, Joseph Qi wrote:
> Hi Bart,
>
> On 18/4/8 22:50, Bart Van Assche wrote:
> > On Sun, 2018-04-08 at 12:21 +0800, Ming Lei wrote:
> >> The following kernel oops is triggered by 'removing scsi device' during
> >> heavy IO.
> >
> > How did you trigger this oops?
> >
>
> I can reproduce this oops by the following steps:
> 1) start a fio job with buffered write;
> 2) remove the scsi device fio write to:
> echo "scsi remove-single-device ${dev}" > /proc/scsi/scsi
Yeah, it can be reproduced easily, and I usually remove the scsi
device via 'echo 1 > /sys/block/sda/device/delete'.
Thanks,
Ming
* Re: [block regression] kernel oops triggered by removing scsi device during IO
From: Bart Van Assche @ 2018-04-09 4:47 UTC
To: ming.lei, axboe; +Cc: linux-block, joseph.qi
On Sun, 2018-04-08 at 12:21 +0800, Ming Lei wrote:
> The following kernel oops is triggered by 'removing scsi device' during
> heavy IO.

Is the below patch sufficient to fix this?

Thanks,

Bart.


Subject: blk-mq: Avoid that submitting a bio concurrently with device removal triggers a crash

Because blkcg_exit_queue() is now called from inside blk_cleanup_queue()
it is no longer safe to access cgroup information during or after the
blk_cleanup_queue() call. Hence check earlier in generic_make_request()
whether the queue has been marked as "dying".
---
 block/blk-core.c | 72 +++++++++++++++++++++++++++++---------------------------
 1 file changed, 37 insertions(+), 35 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index aa8c99fae527..3ac9dd25e04e 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2385,10 +2385,21 @@ blk_qc_t generic_make_request(struct bio *bio)
 	 * yet.
 	 */
 	struct bio_list bio_list_on_stack[2];
+	blk_mq_req_flags_t flags = bio->bi_opf & REQ_NOWAIT ?
+		BLK_MQ_REQ_NOWAIT : 0;
+	struct request_queue *q = bio->bi_disk->queue;
 	blk_qc_t ret = BLK_QC_T_NONE;
 
 	if (!generic_make_request_checks(bio))
-		goto out;
+		return ret;
+
+	if (blk_queue_enter(q, flags) < 0) {
+		if (unlikely(!blk_queue_dying(q) && (bio->bi_opf & REQ_NOWAIT)))
+			bio_wouldblock_error(bio);
+		else
+			bio_io_error(bio);
+		return ret;
+	}
 
 	/*
 	 * We only want one ->make_request_fn to be active at a time, else
@@ -2423,46 +2434,37 @@ blk_qc_t generic_make_request(struct bio *bio)
 	bio_list_init(&bio_list_on_stack[0]);
 	current->bio_list = bio_list_on_stack;
 	do {
-		struct request_queue *q = bio->bi_disk->queue;
-		blk_mq_req_flags_t flags = bio->bi_opf & REQ_NOWAIT ?
-			BLK_MQ_REQ_NOWAIT : 0;
-
-		if (likely(blk_queue_enter(q, flags) == 0)) {
-			struct bio_list lower, same;
-
-			/* Create a fresh bio_list for all subordinate requests */
-			bio_list_on_stack[1] = bio_list_on_stack[0];
-			bio_list_init(&bio_list_on_stack[0]);
-			ret = q->make_request_fn(q, bio);
-
-			blk_queue_exit(q);
-
-			/* sort new bios into those for a lower level
-			 * and those for the same level
-			 */
-			bio_list_init(&lower);
-			bio_list_init(&same);
-			while ((bio = bio_list_pop(&bio_list_on_stack[0])) != NULL)
-				if (q == bio->bi_disk->queue)
-					bio_list_add(&same, bio);
-				else
-					bio_list_add(&lower, bio);
-			/* now assemble so we handle the lowest level first */
-			bio_list_merge(&bio_list_on_stack[0], &lower);
-			bio_list_merge(&bio_list_on_stack[0], &same);
-			bio_list_merge(&bio_list_on_stack[0], &bio_list_on_stack[1]);
-		} else {
-			if (unlikely(!blk_queue_dying(q) &&
-					(bio->bi_opf & REQ_NOWAIT)))
-				bio_wouldblock_error(bio);
+		struct bio_list lower, same;
+
+		WARN_ON_ONCE(!(flags & BLK_MQ_REQ_NOWAIT) &&
+			     (bio->bi_opf & REQ_NOWAIT));
+		WARN_ON_ONCE(q != bio->bi_disk->queue);
+		q = bio->bi_disk->queue;
+		/* Create a fresh bio_list for all subordinate requests */
+		bio_list_on_stack[1] = bio_list_on_stack[0];
+		bio_list_init(&bio_list_on_stack[0]);
+		ret = q->make_request_fn(q, bio);
+
+		/* sort new bios into those for a lower level
+		 * and those for the same level
+		 */
+		bio_list_init(&lower);
+		bio_list_init(&same);
+		while ((bio = bio_list_pop(&bio_list_on_stack[0])) != NULL)
+			if (q == bio->bi_disk->queue)
+				bio_list_add(&same, bio);
 			else
-				bio_io_error(bio);
-		}
+				bio_list_add(&lower, bio);
+		/* now assemble so we handle the lowest level first */
+		bio_list_merge(&bio_list_on_stack[0], &lower);
+		bio_list_merge(&bio_list_on_stack[0], &same);
+		bio_list_merge(&bio_list_on_stack[0], &bio_list_on_stack[1]);
 		bio = bio_list_pop(&bio_list_on_stack[0]);
 	} while (bio);
 	current->bio_list = NULL; /* deactivate */
 
 out:
+	blk_queue_exit(q);
 	return ret;
 }
 EXPORT_SYMBOL(generic_make_request);
-- 
2.16.2
* Re: [block regression] kernel oops triggered by removing scsi device during IO
From: Joseph Qi @ 2018-04-09 6:54 UTC
To: Bart Van Assche, ming.lei, axboe; +Cc: linux-block
Hi Bart,
On 18/4/9 12:47, Bart Van Assche wrote:
> On Sun, 2018-04-08 at 12:21 +0800, Ming Lei wrote:
>> The following kernel oops is triggered by 'removing scsi device' during
>> heavy IO.
>
> Is the below patch sufficient to fix this?
>
> Thanks,
>
> Bart.
>
>
> Subject: blk-mq: Avoid that submitting a bio concurrently with device removal triggers a crash
>
> Because blkcg_exit_queue() is now called from inside blk_cleanup_queue()
> it is no longer safe to access cgroup information during or after the
> blk_cleanup_queue() call. Hence check earlier in generic_make_request()
> whether the queue has been marked as "dying".
The oops happens during generic_make_request_checks(), specifically in
blk_throtl_bio().
So if we want to bypass a dying queue, we have to check this before
generic_make_request_checks(), I think.
Thanks,
Joseph
> ---
> block/blk-core.c | 72 +++++++++++++++++++++++++++++---------------------------
> 1 file changed, 37 insertions(+), 35 deletions(-)
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index aa8c99fae527..3ac9dd25e04e 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -2385,10 +2385,21 @@ blk_qc_t generic_make_request(struct bio *bio)
> * yet.
> */
> struct bio_list bio_list_on_stack[2];
> + blk_mq_req_flags_t flags = bio->bi_opf & REQ_NOWAIT ?
> + BLK_MQ_REQ_NOWAIT : 0;
> + struct request_queue *q = bio->bi_disk->queue;
> blk_qc_t ret = BLK_QC_T_NONE;
>
> if (!generic_make_request_checks(bio))
> - goto out;
> + return ret;
> +
> + if (blk_queue_enter(q, flags) < 0) {
> + if (unlikely(!blk_queue_dying(q) && (bio->bi_opf & REQ_NOWAIT)))
> + bio_wouldblock_error(bio);
> + else
> + bio_io_error(bio);
> + return ret;
> + }
>
> /*
> * We only want one ->make_request_fn to be active at a time, else
> @@ -2423,46 +2434,37 @@ blk_qc_t generic_make_request(struct bio *bio)
> bio_list_init(&bio_list_on_stack[0]);
> current->bio_list = bio_list_on_stack;
> do {
> - struct request_queue *q = bio->bi_disk->queue;
> - blk_mq_req_flags_t flags = bio->bi_opf & REQ_NOWAIT ?
> - BLK_MQ_REQ_NOWAIT : 0;
> -
> - if (likely(blk_queue_enter(q, flags) == 0)) {
> - struct bio_list lower, same;
> -
> - /* Create a fresh bio_list for all subordinate requests */
> - bio_list_on_stack[1] = bio_list_on_stack[0];
> - bio_list_init(&bio_list_on_stack[0]);
> - ret = q->make_request_fn(q, bio);
> -
> - blk_queue_exit(q);
> -
> - /* sort new bios into those for a lower level
> - * and those for the same level
> - */
> - bio_list_init(&lower);
> - bio_list_init(&same);
> - while ((bio = bio_list_pop(&bio_list_on_stack[0])) != NULL)
> - if (q == bio->bi_disk->queue)
> - bio_list_add(&same, bio);
> - else
> - bio_list_add(&lower, bio);
> - /* now assemble so we handle the lowest level first */
> - bio_list_merge(&bio_list_on_stack[0], &lower);
> - bio_list_merge(&bio_list_on_stack[0], &same);
> - bio_list_merge(&bio_list_on_stack[0], &bio_list_on_stack[1]);
> - } else {
> - if (unlikely(!blk_queue_dying(q) &&
> - (bio->bi_opf & REQ_NOWAIT)))
> - bio_wouldblock_error(bio);
> + struct bio_list lower, same;
> +
> + WARN_ON_ONCE(!(flags & BLK_MQ_REQ_NOWAIT) &&
> + (bio->bi_opf & REQ_NOWAIT));
> + WARN_ON_ONCE(q != bio->bi_disk->queue);
> + q = bio->bi_disk->queue;
> + /* Create a fresh bio_list for all subordinate requests */
> + bio_list_on_stack[1] = bio_list_on_stack[0];
> + bio_list_init(&bio_list_on_stack[0]);
> + ret = q->make_request_fn(q, bio);
> +
> + /* sort new bios into those for a lower level
> + * and those for the same level
> + */
> + bio_list_init(&lower);
> + bio_list_init(&same);
> + while ((bio = bio_list_pop(&bio_list_on_stack[0])) != NULL)
> + if (q == bio->bi_disk->queue)
> + bio_list_add(&same, bio);
> else
> - bio_io_error(bio);
> - }
> + bio_list_add(&lower, bio);
> + /* now assemble so we handle the lowest level first */
> + bio_list_merge(&bio_list_on_stack[0], &lower);
> + bio_list_merge(&bio_list_on_stack[0], &same);
> + bio_list_merge(&bio_list_on_stack[0], &bio_list_on_stack[1]);
> bio = bio_list_pop(&bio_list_on_stack[0]);
> } while (bio);
> current->bio_list = NULL; /* deactivate */
>
> out:
> + blk_queue_exit(q);
> return ret;
> }
> EXPORT_SYMBOL(generic_make_request);
>
* Re: [block regression] kernel oops triggered by removing scsi device during IO
From: Bart Van Assche @ 2018-04-09 22:54 UTC
To: axboe, joseph.qi, ming.lei; +Cc: linux-block
On Mon, 2018-04-09 at 14:54 +0800, Joseph Qi wrote:
> The oops happens during generic_make_request_checks(), in
> blk_throtl_bio() exactly.
> So if we want to bypass dying queue, we have to check this before
> generic_make_request_checks(), I think.

How about something like the patch below?

Thanks,

Bart.

Subject: [PATCH] blk-mq: Avoid that submitting a bio concurrently with device
 removal triggers a crash

Because blkcg_exit_queue() is now called from inside blk_cleanup_queue()
it is no longer safe to access cgroup information during or after the
blk_cleanup_queue() call. Hence protect the generic_make_request_checks()
call with a blk_queue_enter() / blk_queue_exit() pair.

---
 block/blk-core.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index d69888ff52f0..0c48bef8490f 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2388,9 +2388,24 @@ blk_qc_t generic_make_request(struct bio *bio)
 	 * yet.
 	 */
 	struct bio_list bio_list_on_stack[2];
+	blk_mq_req_flags_t flags = bio->bi_opf & REQ_NOWAIT ?
+		BLK_MQ_REQ_NOWAIT : 0;
+	struct request_queue *q = bio->bi_disk->queue;
+	bool check_result;
 	blk_qc_t ret = BLK_QC_T_NONE;
 
-	if (!generic_make_request_checks(bio))
+	if (blk_queue_enter(q, flags) < 0) {
+		if (!blk_queue_dying(q) && (bio->bi_opf & REQ_NOWAIT))
+			bio_wouldblock_error(bio);
+		else
+			bio_io_error(bio);
+		return ret;
+	}
+
+	check_result = generic_make_request_checks(bio);
+	blk_queue_exit(q);
+
+	if (!check_result)
 		goto out;
 
 	/*
-- 
2.16.2
* Re: [block regression] kernel oops triggered by removing scsi device during IO
From: Jens Axboe @ 2018-04-09 22:58 UTC
To: Bart Van Assche, joseph.qi, ming.lei; +Cc: linux-block
On 4/9/18 4:54 PM, Bart Van Assche wrote:
> On Mon, 2018-04-09 at 14:54 +0800, Joseph Qi wrote:
>> The oops happens during generic_make_request_checks(), in
>> blk_throtl_bio() exactly.
>> So if we want to bypass dying queue, we have to check this before
>> generic_make_request_checks(), I think.
>
> How about something like the patch below?
>
> Thanks,
>
> Bart.
>
> Subject: [PATCH] blk-mq: Avoid that submitting a bio concurrently with device
> removal triggers a crash
>
> Because blkcg_exit_queue() is now called from inside blk_cleanup_queue()
> it is no longer safe to access cgroup information during or after the
> blk_cleanup_queue() call. Hence protect the generic_make_request_checks()
> call with a blk_queue_enter() / blk_queue_exit() pair.
>
> ---
> block/blk-core.c | 17 ++++++++++++++++-
> 1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index d69888ff52f0..0c48bef8490f 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -2388,9 +2388,24 @@ blk_qc_t generic_make_request(struct bio *bio)
> * yet.
> */
> struct bio_list bio_list_on_stack[2];
> + blk_mq_req_flags_t flags = bio->bi_opf & REQ_NOWAIT ?
> + BLK_MQ_REQ_NOWAIT : 0;
> + struct request_queue *q = bio->bi_disk->queue;
> + bool check_result;
> blk_qc_t ret = BLK_QC_T_NONE;
>
> - if (!generic_make_request_checks(bio))
> + if (blk_queue_enter(q, flags) < 0) {
> + if (!blk_queue_dying(q) && (bio->bi_opf & REQ_NOWAIT))
> + bio_wouldblock_error(bio);
> + else
> + bio_io_error(bio);
> + return ret;
> + }
> +
> + check_result = generic_make_request_checks(bio);
> + blk_queue_exit(q);
This ends up being nutty in the generic_make_request() case, where we
do the exact same enter/exit logic right after. That needs to get unified.
Maybe move the queue enter into generic_make_request_checks(), and exit
in the caller?
--
Jens Axboe
* Re: [block regression] kernel oops triggered by removing scsi device during IO
From: Bart Van Assche @ 2018-04-09 23:07 UTC
To: ming.lei, axboe, joseph.qi; +Cc: linux-block
On Mon, 2018-04-09 at 16:58 -0600, Jens Axboe wrote:
> This ends up being nutty in the generic_make_request() case, where we
> do the exact same enter/exit logic right after. That needs to get unified.
> Maybe move the queue enter into generic_make_request_checks(), and exit
> in the caller?

Hello Jens,

There is a challenge: generic_make_request() supports bio chains in which
different bio's apply to different request queues, and it also supports bio
chains in which some bio's have the flag REQ_NOWAIT set and others not. Is
it safe to drop that support?

Thanks,

Bart.
* Re: [block regression] kernel oops triggered by removing scsi device during IO
From: Ming Lei @ 2018-04-10 1:30 UTC
To: Bart Van Assche; +Cc: axboe, joseph.qi, linux-block
On Mon, Apr 09, 2018 at 10:54:57PM +0000, Bart Van Assche wrote:
> On Mon, 2018-04-09 at 14:54 +0800, Joseph Qi wrote:
> > The oops happens during generic_make_request_checks(), in
> > blk_throtl_bio() exactly.
> > So if we want to bypass dying queue, we have to check this before
> > generic_make_request_checks(), I think.
>
> How about something like the patch below?
>
> Thanks,
>
> Bart.
>
> Subject: [PATCH] blk-mq: Avoid that submitting a bio concurrently with device
> removal triggers a crash
>
> Because blkcg_exit_queue() is now called from inside blk_cleanup_queue()
> it is no longer safe to access cgroup information during or after the
> blk_cleanup_queue() call. Hence protect the generic_make_request_checks()
> call with a blk_queue_enter() / blk_queue_exit() pair.
>
> ---
> block/blk-core.c | 17 ++++++++++++++++-
> 1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index d69888ff52f0..0c48bef8490f 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -2388,9 +2388,24 @@ blk_qc_t generic_make_request(struct bio *bio)
> * yet.
> */
> struct bio_list bio_list_on_stack[2];
> + blk_mq_req_flags_t flags = bio->bi_opf & REQ_NOWAIT ?
> + BLK_MQ_REQ_NOWAIT : 0;
> + struct request_queue *q = bio->bi_disk->queue;
> + bool check_result;
> blk_qc_t ret = BLK_QC_T_NONE;
>
> - if (!generic_make_request_checks(bio))
> + if (blk_queue_enter(q, flags) < 0) {
The queue pointer needs to be checked before calling blk_queue_enter(),
since that check is done in generic_make_request_checks().
Also, is it possible to see the queue freed here?
--
Ming
* Re: [block regression] kernel oops triggered by removing scsi device during IO
From: Bart Van Assche @ 2018-04-10 1:34 UTC
To: ming.lei; +Cc: linux-block, axboe, joseph.qi
On Tue, 2018-04-10 at 09:30 +0800, Ming Lei wrote:
> Also is it possible to see queue freed here?

I think the caller should keep a reference on the request queue. Otherwise
we have a much bigger problem than a race between submitting a bio and
removing a request queue from the cgroup controller in blk_cleanup_queue().

Bart.
Thread overview: 15+ messages
2018-04-08 4:21 [block regression] kernel oops triggered by removing scsi device during IO Ming Lei
2018-04-08 8:11 ` Joseph Qi
2018-04-08 9:25 ` Ming Lei
2018-04-08 10:31 ` Ming Lei
2018-04-08 14:58 ` Bart Van Assche
2018-04-08 14:50 ` Bart Van Assche
2018-04-09 1:33 ` Joseph Qi
2018-04-09 2:48 ` Ming Lei
2018-04-09 4:47 ` Bart Van Assche
2018-04-09 6:54 ` Joseph Qi
2018-04-09 22:54 ` Bart Van Assche
2018-04-09 22:58 ` Jens Axboe
2018-04-09 23:07 ` Bart Van Assche
2018-04-10 1:30 ` Ming Lei
2018-04-10 1:34 ` Bart Van Assche