* [block regression] kernel oops triggered by removing scsi device during IO
@ 2018-04-08  4:21 Ming Lei
  2018-04-08  8:11 ` Joseph Qi
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Ming Lei @ 2018-04-08  4:21 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Bart Van Assche, Joseph Qi

Hi,

The following kernel oops is triggered by 'removing scsi device' during
heavy IO.

'git bisect' shows that commit a063057d7c731cffa7d10740 ("block: Fix a race
between request queue removal and the block cgroup controller")
introduced this regression:

[   42.268257] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
[   42.269339] PGD 26bd9f067 P4D 26bd9f067 PUD 26bfec067 PMD 0 
[   42.270077] Oops: 0000 [#1] PREEMPT SMP NOPTI
[   42.270681] Dumping ftrace buffer:
[   42.271141]    (ftrace buffer empty)
[   42.271641] Modules linked in: scsi_debug iTCO_wdt iTCO_vendor_support crc32c_intel i2c_i801 i2c_core lpc_ich mfd_core usb_storage nvme shpchp nvme_core virtio_scsi qemu_fw_cfg ip_tables
[   42.273770] CPU: 5 PID: 1076 Comm: fio Not tainted 4.16.0+ #49
[   42.274530] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-2.fc27 04/01/2014
[   42.275634] RIP: 0010:blk_throtl_bio+0x41/0x904
[   42.276225] RSP: 0018:ffffc900033cfaa0 EFLAGS: 00010246
[   42.276907] RAX: 0000000080000000 RBX: ffff8801bdcc5118 RCX: 0000000000000001
[   42.277818] RDX: ffff8801bdcc5118 RSI: 0000000000000000 RDI: ffff8802641f8870
[   42.278733] RBP: 0000000000000000 R08: 0000000000000001 R09: ffffc900033cfb94
[   42.279651] R10: ffffc900033cfc00 R11: 0000000006ea0000 R12: ffff8802641f8870
[   42.280567] R13: ffff88026f34f000 R14: 0000000000000000 R15: ffff8801bdcc5118
[   42.281489] FS:  00007fc123922d40(0000) GS:ffff880272f40000(0000) knlGS:0000000000000000
[   42.282525] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   42.283270] CR2: 0000000000000028 CR3: 000000026d7ac004 CR4: 00000000007606e0
[   42.284194] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   42.285116] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   42.286036] PKRU: 55555554
[   42.286393] Call Trace:
[   42.286725]  ? try_to_wake_up+0x3a3/0x3c9
[   42.287255]  ? blk_mq_hctx_notify_dead+0x135/0x135
[   42.287880]  ? gup_pud_range+0xb5/0x7e1
[   42.288381]  generic_make_request_checks+0x3cf/0x539
[   42.289027]  ? gup_pgd_range+0x8e/0xaa
[   42.289515]  generic_make_request+0x38/0x25b
[   42.290078]  ? submit_bio+0x103/0x11f
[   42.290555]  submit_bio+0x103/0x11f
[   42.291018]  ? bio_iov_iter_get_pages+0xe4/0x104
[   42.291620]  blkdev_direct_IO+0x2a3/0x3af
[   42.292151]  ? kiocb_free+0x34/0x34
[   42.292607]  ? ___preempt_schedule+0x16/0x18
[   42.293168]  ? preempt_schedule_common+0x4c/0x65
[   42.293771]  ? generic_file_read_iter+0x96/0x110
[   42.294377]  generic_file_read_iter+0x96/0x110
[   42.294962]  aio_read+0xca/0x13b
[   42.295388]  ? preempt_count_add+0x6d/0x8c
[   42.295926]  ? aio_read_events+0x287/0x2d6
[   42.296460]  ? do_io_submit+0x4d2/0x62c
[   42.296964]  do_io_submit+0x4d2/0x62c
[   42.297446]  ? do_syscall_64+0x9d/0x15e
[   42.297950]  do_syscall_64+0x9d/0x15e
[   42.298431]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[   42.299090] RIP: 0033:0x7fc12244e687
[   42.299556] RSP: 002b:00007ffe18388a68 EFLAGS: 00000202 ORIG_RAX: 00000000000000d1
[   42.300528] RAX: ffffffffffffffda RBX: 00007fc0fde08670 RCX: 00007fc12244e687
[   42.301442] RDX: 0000000001d1b388 RSI: 0000000000000001 RDI: 00007fc123782000
[   42.302359] RBP: 00000000000022d8 R08: 0000000000000001 R09: 0000000001c461e0
[   42.303275] R10: 0000000000000000 R11: 0000000000000202 R12: 00007fc0fde08670
[   42.304195] R13: 0000000000000000 R14: 0000000001d1d0c0 R15: 0000000001b872f0
[   42.305117] Code: 48 85 f6 48 89 7c 24 10 75 0e 48 8b b7 b8 05 00 00 31 ed 48 85 f6 74 0f 48 63 05 75 a4 e4 00 48 8b ac c6 28 02 00 00 f6 43 15 02 <48> 8b 45 28 48 89 04 24 0f 85 28 08 00 00 8b 43 10 45 31 e4 83 
[   42.307553] RIP: blk_throtl_bio+0x41/0x904 RSP: ffffc900033cfaa0
[   42.308328] CR2: 0000000000000028
[   42.308920] ---[ end trace f53a144979f63b29 ]---
[   42.309520] Kernel panic - not syncing: Fatal exception
[   42.310635] Dumping ftrace buffer:
[   42.311087]    (ftrace buffer empty)
[   42.311583] Kernel Offset: disabled
[   42.312163] ---[ end Kernel panic - not syncing: Fatal exception ]---

-- 
Ming


* Re: [block regression] kernel oops triggered by removing scsi device during IO
  2018-04-08  4:21 [block regression] kernel oops triggered by removing scsi device during IO Ming Lei
@ 2018-04-08  8:11 ` Joseph Qi
  2018-04-08  9:25   ` Ming Lei
  2018-04-08 14:58   ` Bart Van Assche
  2018-04-08 14:50 ` Bart Van Assche
  2018-04-09  4:47 ` Bart Van Assche
  2 siblings, 2 replies; 15+ messages in thread
From: Joseph Qi @ 2018-04-08  8:11 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe, Bart Van Assche; +Cc: linux-block

This is because scsi_remove_device() calls blk_cleanup_queue(), after
which all blkgs have been destroyed and root_blkg is NULL.
Thus tg is NULL, and fetching td from tg (tg->td) triggers a NULL
pointer dereference.
It seems that we cannot simply move blkcg_exit_queue() up to
blk_cleanup_queue().

Thanks,
Joseph

On 18/4/8 12:21, Ming Lei wrote:
> Hi,
> 
> The following kernel oops is triggered by 'removing scsi device' during
> heavy IO.
> 
> 'git bisect' shows that commit a063057d7c731cffa7d10740 ("block: Fix a race
> between request queue removal and the block cgroup controller")
> introduced this regression:
> 
> [   42.268257] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
> [   42.269339] PGD 26bd9f067 P4D 26bd9f067 PUD 26bfec067 PMD 0 
> [   42.270077] Oops: 0000 [#1] PREEMPT SMP NOPTI
> [   42.270681] Dumping ftrace buffer:
> [   42.271141]    (ftrace buffer empty)
> [   42.271641] Modules linked in: scsi_debug iTCO_wdt iTCO_vendor_support crc32c_intel i2c_i801 i2c_core lpc_ich mfd_core usb_storage nvme shpchp nvme_core virtio_scsi qemu_fw_cfg ip_tables
> [   42.273770] CPU: 5 PID: 1076 Comm: fio Not tainted 4.16.0+ #49
> [   42.274530] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-2.fc27 04/01/2014
> [   42.275634] RIP: 0010:blk_throtl_bio+0x41/0x904
> [   42.276225] RSP: 0018:ffffc900033cfaa0 EFLAGS: 00010246
> [   42.276907] RAX: 0000000080000000 RBX: ffff8801bdcc5118 RCX: 0000000000000001
> [   42.277818] RDX: ffff8801bdcc5118 RSI: 0000000000000000 RDI: ffff8802641f8870
> [   42.278733] RBP: 0000000000000000 R08: 0000000000000001 R09: ffffc900033cfb94
> [   42.279651] R10: ffffc900033cfc00 R11: 0000000006ea0000 R12: ffff8802641f8870
> [   42.280567] R13: ffff88026f34f000 R14: 0000000000000000 R15: ffff8801bdcc5118
> [   42.281489] FS:  00007fc123922d40(0000) GS:ffff880272f40000(0000) knlGS:0000000000000000
> [   42.282525] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   42.283270] CR2: 0000000000000028 CR3: 000000026d7ac004 CR4: 00000000007606e0
> [   42.284194] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   42.285116] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   42.286036] PKRU: 55555554
> [   42.286393] Call Trace:
> [   42.286725]  ? try_to_wake_up+0x3a3/0x3c9
> [   42.287255]  ? blk_mq_hctx_notify_dead+0x135/0x135
> [   42.287880]  ? gup_pud_range+0xb5/0x7e1
> [   42.288381]  generic_make_request_checks+0x3cf/0x539
> [   42.289027]  ? gup_pgd_range+0x8e/0xaa
> [   42.289515]  generic_make_request+0x38/0x25b
> [   42.290078]  ? submit_bio+0x103/0x11f
> [   42.290555]  submit_bio+0x103/0x11f
> [   42.291018]  ? bio_iov_iter_get_pages+0xe4/0x104
> [   42.291620]  blkdev_direct_IO+0x2a3/0x3af
> [   42.292151]  ? kiocb_free+0x34/0x34
> [   42.292607]  ? ___preempt_schedule+0x16/0x18
> [   42.293168]  ? preempt_schedule_common+0x4c/0x65
> [   42.293771]  ? generic_file_read_iter+0x96/0x110
> [   42.294377]  generic_file_read_iter+0x96/0x110
> [   42.294962]  aio_read+0xca/0x13b
> [   42.295388]  ? preempt_count_add+0x6d/0x8c
> [   42.295926]  ? aio_read_events+0x287/0x2d6
> [   42.296460]  ? do_io_submit+0x4d2/0x62c
> [   42.296964]  do_io_submit+0x4d2/0x62c
> [   42.297446]  ? do_syscall_64+0x9d/0x15e
> [   42.297950]  do_syscall_64+0x9d/0x15e
> [   42.298431]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
> [   42.299090] RIP: 0033:0x7fc12244e687
> [   42.299556] RSP: 002b:00007ffe18388a68 EFLAGS: 00000202 ORIG_RAX: 00000000000000d1
> [   42.300528] RAX: ffffffffffffffda RBX: 00007fc0fde08670 RCX: 00007fc12244e687
> [   42.301442] RDX: 0000000001d1b388 RSI: 0000000000000001 RDI: 00007fc123782000
> [   42.302359] RBP: 00000000000022d8 R08: 0000000000000001 R09: 0000000001c461e0
> [   42.303275] R10: 0000000000000000 R11: 0000000000000202 R12: 00007fc0fde08670
> [   42.304195] R13: 0000000000000000 R14: 0000000001d1d0c0 R15: 0000000001b872f0
> [   42.305117] Code: 48 85 f6 48 89 7c 24 10 75 0e 48 8b b7 b8 05 00 00 31 ed 48 85 f6 74 0f 48 63 05 75 a4 e4 00 48 8b ac c6 28 02 00 00 f6 43 15 02 <48> 8b 45 28 48 89 04 24 0f 85 28 08 00 00 8b 43 10 45 31 e4 83 
> [   42.307553] RIP: blk_throtl_bio+0x41/0x904 RSP: ffffc900033cfaa0
> [   42.308328] CR2: 0000000000000028
> [   42.308920] ---[ end trace f53a144979f63b29 ]---
> [   42.309520] Kernel panic - not syncing: Fatal exception
> [   42.310635] Dumping ftrace buffer:
> [   42.311087]    (ftrace buffer empty)
> [   42.311583] Kernel Offset: disabled
> [   42.312163] ---[ end Kernel panic - not syncing: Fatal exception ]---
> 


* Re: [block regression] kernel oops triggered by removing scsi device during IO
  2018-04-08  8:11 ` Joseph Qi
@ 2018-04-08  9:25   ` Ming Lei
  2018-04-08 10:31     ` Ming Lei
  2018-04-08 14:58   ` Bart Van Assche
  1 sibling, 1 reply; 15+ messages in thread
From: Ming Lei @ 2018-04-08  9:25 UTC (permalink / raw)
  To: Joseph Qi; +Cc: Jens Axboe, Bart Van Assche, linux-block

On Sun, Apr 08, 2018 at 04:11:51PM +0800, Joseph Qi wrote:
> This is because scsi_remove_device() calls blk_cleanup_queue(), after
> which all blkgs have been destroyed and root_blkg is NULL.
> Thus tg is NULL, and fetching td from tg (tg->td) triggers a NULL
> pointer dereference.
> It seems that we cannot simply move blkcg_exit_queue() up to
> blk_cleanup_queue().

Maybe one per-queue blkcg should be introduced, which seems reasonable
too.

Thanks,
Ming


* Re: [block regression] kernel oops triggered by removing scsi device during IO
  2018-04-08  9:25   ` Ming Lei
@ 2018-04-08 10:31     ` Ming Lei
  0 siblings, 0 replies; 15+ messages in thread
From: Ming Lei @ 2018-04-08 10:31 UTC (permalink / raw)
  To: Joseph Qi; +Cc: Jens Axboe, Bart Van Assche, linux-block

On Sun, Apr 08, 2018 at 05:25:42PM +0800, Ming Lei wrote:
> On Sun, Apr 08, 2018 at 04:11:51PM +0800, Joseph Qi wrote:
> > This is because scsi_remove_device() calls blk_cleanup_queue(), after
> > which all blkgs have been destroyed and root_blkg is NULL.
> > Thus tg is NULL, and fetching td from tg (tg->td) triggers a NULL
> > pointer dereference.
> > It seems that we cannot simply move blkcg_exit_queue() up to
> > blk_cleanup_queue().
> 
> Maybe one per-queue blkcg should be introduced, which seems reasonable
> too.

Sorry, I mean one per-queue blkcg lock.

-- 
Ming


* Re: [block regression] kernel oops triggered by removing scsi device during IO
  2018-04-08  4:21 [block regression] kernel oops triggered by removing scsi device during IO Ming Lei
  2018-04-08  8:11 ` Joseph Qi
@ 2018-04-08 14:50 ` Bart Van Assche
  2018-04-09  1:33   ` Joseph Qi
  2018-04-09  4:47 ` Bart Van Assche
  2 siblings, 1 reply; 15+ messages in thread
From: Bart Van Assche @ 2018-04-08 14:50 UTC (permalink / raw)
  To: ming.lei, axboe; +Cc: linux-block, joseph.qi

On Sun, 2018-04-08 at 12:21 +0800, Ming Lei wrote:
> The following kernel oops is triggered by 'removing scsi device' during
> heavy IO.

How did you trigger this oops?

Bart.


* Re: [block regression] kernel oops triggered by removing scsi device during IO
  2018-04-08  8:11 ` Joseph Qi
  2018-04-08  9:25   ` Ming Lei
@ 2018-04-08 14:58   ` Bart Van Assche
  1 sibling, 0 replies; 15+ messages in thread
From: Bart Van Assche @ 2018-04-08 14:58 UTC (permalink / raw)
  To: axboe, joseph.qi, ming.lei; +Cc: linux-block

On Sun, 2018-04-08 at 16:11 +0800, Joseph Qi wrote:
> This is because scsi_remove_device() calls blk_cleanup_queue(), after
> which all blkgs have been destroyed and root_blkg is NULL.
> Thus tg is NULL, and fetching td from tg (tg->td) triggers a NULL
> pointer dereference.
> It seems that we cannot simply move blkcg_exit_queue() up to
> blk_cleanup_queue().

Had you considered adding a blk_queue_enter() / blk_queue_exit() pair in
generic_make_request()? blk_queue_enter() checks the DYING flag.

Thanks,

Bart.


* Re: [block regression] kernel oops triggered by removing scsi device during IO
  2018-04-08 14:50 ` Bart Van Assche
@ 2018-04-09  1:33   ` Joseph Qi
  2018-04-09  2:48     ` Ming Lei
  0 siblings, 1 reply; 15+ messages in thread
From: Joseph Qi @ 2018-04-09  1:33 UTC (permalink / raw)
  To: Bart Van Assche, ming.lei, axboe; +Cc: linux-block

Hi Bart,

On 18/4/8 22:50, Bart Van Assche wrote:
> On Sun, 2018-04-08 at 12:21 +0800, Ming Lei wrote:
>> The following kernel oops is triggered by 'removing scsi device' during
>> heavy IO.
> 
> How did you trigger this oops?
> 

I can reproduce this oops by the following steps:
1) start a fio job with buffered write;
2) remove the scsi device that fio writes to:
echo "scsi remove-single-device ${dev}" > /proc/scsi/scsi


* Re: [block regression] kernel oops triggered by removing scsi device during IO
  2018-04-09  1:33   ` Joseph Qi
@ 2018-04-09  2:48     ` Ming Lei
  0 siblings, 0 replies; 15+ messages in thread
From: Ming Lei @ 2018-04-09  2:48 UTC (permalink / raw)
  To: Joseph Qi; +Cc: Bart Van Assche, axboe, linux-block

On Mon, Apr 09, 2018 at 09:33:08AM +0800, Joseph Qi wrote:
> Hi Bart,
> 
> On 18/4/8 22:50, Bart Van Assche wrote:
> > On Sun, 2018-04-08 at 12:21 +0800, Ming Lei wrote:
> >> The following kernel oops is triggered by 'removing scsi device' during
> >> heavy IO.
> > 
> > How did you trigger this oops?
> > 
> 
> I can reproduce this oops by the following steps:
> 1) start a fio job with buffered write;
> 2) remove the scsi device that fio writes to:
> echo "scsi remove-single-device ${dev}" > /proc/scsi/scsi

Yeah, it can be reproduced easily, and I usually remove the scsi
device via 'echo 1 > /sys/block/sda/device/delete'.

Thanks,
Ming


* Re: [block regression] kernel oops triggered by removing scsi device during IO
  2018-04-08  4:21 [block regression] kernel oops triggered by removing scsi device during IO Ming Lei
  2018-04-08  8:11 ` Joseph Qi
  2018-04-08 14:50 ` Bart Van Assche
@ 2018-04-09  4:47 ` Bart Van Assche
  2018-04-09  6:54   ` Joseph Qi
  2 siblings, 1 reply; 15+ messages in thread
From: Bart Van Assche @ 2018-04-09  4:47 UTC (permalink / raw)
  To: ming.lei, axboe; +Cc: linux-block, joseph.qi

On Sun, 2018-04-08 at 12:21 +0800, Ming Lei wrote:
> The following kernel oops is triggered by 'removing scsi device' during
> heavy IO.

Is the below patch sufficient to fix this?

Thanks,

Bart.


Subject: blk-mq: Avoid that submitting a bio concurrently with device removal triggers a crash

Because blkcg_exit_queue() is now called from inside blk_cleanup_queue()
it is no longer safe to access cgroup information during or after the
blk_cleanup_queue() call. Hence check earlier in generic_make_request()
whether the queue has been marked as "dying".
---
 block/blk-core.c | 72 +++++++++++++++++++++++++++++---------------------------
 1 file changed, 37 insertions(+), 35 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index aa8c99fae527..3ac9dd25e04e 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2385,10 +2385,21 @@ blk_qc_t generic_make_request(struct bio *bio)
 	 * yet.
 	 */
 	struct bio_list bio_list_on_stack[2];
+	blk_mq_req_flags_t flags = bio->bi_opf & REQ_NOWAIT ?
+		BLK_MQ_REQ_NOWAIT : 0;
+	struct request_queue *q = bio->bi_disk->queue;
 	blk_qc_t ret = BLK_QC_T_NONE;
 
 	if (!generic_make_request_checks(bio))
-		goto out;
+		return ret;
+
+	if (blk_queue_enter(q, flags) < 0) {
+		if (unlikely(!blk_queue_dying(q) && (bio->bi_opf & REQ_NOWAIT)))
+			bio_wouldblock_error(bio);
+		else
+			bio_io_error(bio);
+		return ret;
+	}
 
 	/*
 	 * We only want one ->make_request_fn to be active at a time, else
@@ -2423,46 +2434,37 @@ blk_qc_t generic_make_request(struct bio *bio)
 	bio_list_init(&bio_list_on_stack[0]);
 	current->bio_list = bio_list_on_stack;
 	do {
-		struct request_queue *q = bio->bi_disk->queue;
-		blk_mq_req_flags_t flags = bio->bi_opf & REQ_NOWAIT ?
-			BLK_MQ_REQ_NOWAIT : 0;
-
-		if (likely(blk_queue_enter(q, flags) == 0)) {
-			struct bio_list lower, same;
-
-			/* Create a fresh bio_list for all subordinate requests */
-			bio_list_on_stack[1] = bio_list_on_stack[0];
-			bio_list_init(&bio_list_on_stack[0]);
-			ret = q->make_request_fn(q, bio);
-
-			blk_queue_exit(q);
-
-			/* sort new bios into those for a lower level
-			 * and those for the same level
-			 */
-			bio_list_init(&lower);
-			bio_list_init(&same);
-			while ((bio = bio_list_pop(&bio_list_on_stack[0])) != NULL)
-				if (q == bio->bi_disk->queue)
-					bio_list_add(&same, bio);
-				else
-					bio_list_add(&lower, bio);
-			/* now assemble so we handle the lowest level first */
-			bio_list_merge(&bio_list_on_stack[0], &lower);
-			bio_list_merge(&bio_list_on_stack[0], &same);
-			bio_list_merge(&bio_list_on_stack[0], &bio_list_on_stack[1]);
-		} else {
-			if (unlikely(!blk_queue_dying(q) &&
-					(bio->bi_opf & REQ_NOWAIT)))
-				bio_wouldblock_error(bio);
+		struct bio_list lower, same;
+
+		WARN_ON_ONCE(!(flags & BLK_MQ_REQ_NOWAIT) &&
+			     (bio->bi_opf & REQ_NOWAIT));
+		WARN_ON_ONCE(q != bio->bi_disk->queue);
+		q = bio->bi_disk->queue;
+		/* Create a fresh bio_list for all subordinate requests */
+		bio_list_on_stack[1] = bio_list_on_stack[0];
+		bio_list_init(&bio_list_on_stack[0]);
+		ret = q->make_request_fn(q, bio);
+
+		/* sort new bios into those for a lower level
+		 * and those for the same level
+		 */
+		bio_list_init(&lower);
+		bio_list_init(&same);
+		while ((bio = bio_list_pop(&bio_list_on_stack[0])) != NULL)
+			if (q == bio->bi_disk->queue)
+				bio_list_add(&same, bio);
 			else
-				bio_io_error(bio);
-		}
+				bio_list_add(&lower, bio);
+		/* now assemble so we handle the lowest level first */
+		bio_list_merge(&bio_list_on_stack[0], &lower);
+		bio_list_merge(&bio_list_on_stack[0], &same);
+		bio_list_merge(&bio_list_on_stack[0], &bio_list_on_stack[1]);
 		bio = bio_list_pop(&bio_list_on_stack[0]);
 	} while (bio);
 	current->bio_list = NULL; /* deactivate */
 
 out:
+	blk_queue_exit(q);
 	return ret;
 }
 EXPORT_SYMBOL(generic_make_request);
-- 
2.16.2


* Re: [block regression] kernel oops triggered by removing scsi device during IO
  2018-04-09  4:47 ` Bart Van Assche
@ 2018-04-09  6:54   ` Joseph Qi
  2018-04-09 22:54     ` Bart Van Assche
  0 siblings, 1 reply; 15+ messages in thread
From: Joseph Qi @ 2018-04-09  6:54 UTC (permalink / raw)
  To: Bart Van Assche, ming.lei, axboe; +Cc: linux-block

Hi Bart,

On 18/4/9 12:47, Bart Van Assche wrote:
> On Sun, 2018-04-08 at 12:21 +0800, Ming Lei wrote:
>> The following kernel oops is triggered by 'removing scsi device' during
>> heavy IO.
> 
> Is the below patch sufficient to fix this?
> 
> Thanks,
> 
> Bart.
> 
> 
> Subject: blk-mq: Avoid that submitting a bio concurrently with device removal triggers a crash
> 
> Because blkcg_exit_queue() is now called from inside blk_cleanup_queue()
> it is no longer safe to access cgroup information during or after the
> blk_cleanup_queue() call. Hence check earlier in generic_make_request()
> whether the queue has been marked as "dying".

The oops happens during generic_make_request_checks(), in
blk_throtl_bio() to be exact.
So if we want to bypass a dying queue, we have to check this before
generic_make_request_checks(), I think.

Thanks,
Joseph

> ---
>  block/blk-core.c | 72 +++++++++++++++++++++++++++++---------------------------
>  1 file changed, 37 insertions(+), 35 deletions(-)
> 
> diff --git a/block/blk-core.c b/block/blk-core.c
> index aa8c99fae527..3ac9dd25e04e 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -2385,10 +2385,21 @@ blk_qc_t generic_make_request(struct bio *bio)
>  	 * yet.
>  	 */
>  	struct bio_list bio_list_on_stack[2];
> +	blk_mq_req_flags_t flags = bio->bi_opf & REQ_NOWAIT ?
> +		BLK_MQ_REQ_NOWAIT : 0;
> +	struct request_queue *q = bio->bi_disk->queue;
>  	blk_qc_t ret = BLK_QC_T_NONE;
>  
>  	if (!generic_make_request_checks(bio))
> -		goto out;
> +		return ret;
> +
> +	if (blk_queue_enter(q, flags) < 0) {
> +		if (unlikely(!blk_queue_dying(q) && (bio->bi_opf & REQ_NOWAIT)))
> +			bio_wouldblock_error(bio);
> +		else
> +			bio_io_error(bio);
> +		return ret;
> +	}
>  
>  	/*
>  	 * We only want one ->make_request_fn to be active at a time, else
> @@ -2423,46 +2434,37 @@ blk_qc_t generic_make_request(struct bio *bio)
>  	bio_list_init(&bio_list_on_stack[0]);
>  	current->bio_list = bio_list_on_stack;
>  	do {
> -		struct request_queue *q = bio->bi_disk->queue;
> -		blk_mq_req_flags_t flags = bio->bi_opf & REQ_NOWAIT ?
> -			BLK_MQ_REQ_NOWAIT : 0;
> -
> -		if (likely(blk_queue_enter(q, flags) == 0)) {
> -			struct bio_list lower, same;
> -
> -			/* Create a fresh bio_list for all subordinate requests */
> -			bio_list_on_stack[1] = bio_list_on_stack[0];
> -			bio_list_init(&bio_list_on_stack[0]);
> -			ret = q->make_request_fn(q, bio);
> -
> -			blk_queue_exit(q);
> -
> -			/* sort new bios into those for a lower level
> -			 * and those for the same level
> -			 */
> -			bio_list_init(&lower);
> -			bio_list_init(&same);
> -			while ((bio = bio_list_pop(&bio_list_on_stack[0])) != NULL)
> -				if (q == bio->bi_disk->queue)
> -					bio_list_add(&same, bio);
> -				else
> -					bio_list_add(&lower, bio);
> -			/* now assemble so we handle the lowest level first */
> -			bio_list_merge(&bio_list_on_stack[0], &lower);
> -			bio_list_merge(&bio_list_on_stack[0], &same);
> -			bio_list_merge(&bio_list_on_stack[0], &bio_list_on_stack[1]);
> -		} else {
> -			if (unlikely(!blk_queue_dying(q) &&
> -					(bio->bi_opf & REQ_NOWAIT)))
> -				bio_wouldblock_error(bio);
> +		struct bio_list lower, same;
> +
> +		WARN_ON_ONCE(!(flags & BLK_MQ_REQ_NOWAIT) &&
> +			     (bio->bi_opf & REQ_NOWAIT));
> +		WARN_ON_ONCE(q != bio->bi_disk->queue);
> +		q = bio->bi_disk->queue;
> +		/* Create a fresh bio_list for all subordinate requests */
> +		bio_list_on_stack[1] = bio_list_on_stack[0];
> +		bio_list_init(&bio_list_on_stack[0]);
> +		ret = q->make_request_fn(q, bio);
> +
> +		/* sort new bios into those for a lower level
> +		 * and those for the same level
> +		 */
> +		bio_list_init(&lower);
> +		bio_list_init(&same);
> +		while ((bio = bio_list_pop(&bio_list_on_stack[0])) != NULL)
> +			if (q == bio->bi_disk->queue)
> +				bio_list_add(&same, bio);
>  			else
> -				bio_io_error(bio);
> -		}
> +				bio_list_add(&lower, bio);
> +		/* now assemble so we handle the lowest level first */
> +		bio_list_merge(&bio_list_on_stack[0], &lower);
> +		bio_list_merge(&bio_list_on_stack[0], &same);
> +		bio_list_merge(&bio_list_on_stack[0], &bio_list_on_stack[1]);
>  		bio = bio_list_pop(&bio_list_on_stack[0]);
>  	} while (bio);
>  	current->bio_list = NULL; /* deactivate */
>  
>  out:
> +	blk_queue_exit(q);
>  	return ret;
>  }
>  EXPORT_SYMBOL(generic_make_request);
> 


* Re: [block regression] kernel oops triggered by removing scsi device during IO
  2018-04-09  6:54   ` Joseph Qi
@ 2018-04-09 22:54     ` Bart Van Assche
  2018-04-09 22:58       ` Jens Axboe
  2018-04-10  1:30       ` Ming Lei
  0 siblings, 2 replies; 15+ messages in thread
From: Bart Van Assche @ 2018-04-09 22:54 UTC (permalink / raw)
  To: axboe, joseph.qi, ming.lei; +Cc: linux-block

On Mon, 2018-04-09 at 14:54 +0800, Joseph Qi wrote:
> The oops happens during generic_make_request_checks(), in
> blk_throtl_bio() to be exact.
> So if we want to bypass a dying queue, we have to check this before
> generic_make_request_checks(), I think.

How about something like the patch below?

Thanks,

Bart.

Subject: [PATCH] blk-mq: Avoid that submitting a bio concurrently with device
 removal triggers a crash

Because blkcg_exit_queue() is now called from inside blk_cleanup_queue()
it is no longer safe to access cgroup information during or after the
blk_cleanup_queue() call. Hence protect the generic_make_request_checks()
call with a blk_queue_enter() / blk_queue_exit() pair.

---
 block/blk-core.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index d69888ff52f0..0c48bef8490f 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2388,9 +2388,24 @@ blk_qc_t generic_make_request(struct bio *bio)
 	 * yet.
 	 */
 	struct bio_list bio_list_on_stack[2];
+	blk_mq_req_flags_t flags = bio->bi_opf & REQ_NOWAIT ?
+		BLK_MQ_REQ_NOWAIT : 0;
+	struct request_queue *q = bio->bi_disk->queue;
+	bool check_result;
 	blk_qc_t ret = BLK_QC_T_NONE;
 
-	if (!generic_make_request_checks(bio))
+	if (blk_queue_enter(q, flags) < 0) {
+		if (!blk_queue_dying(q) && (bio->bi_opf & REQ_NOWAIT))
+			bio_wouldblock_error(bio);
+		else
+			bio_io_error(bio);
+		return ret;
+	}
+
+	check_result = generic_make_request_checks(bio);
+	blk_queue_exit(q);
+
+	if (!check_result)
 		goto out;
 
 	/*
-- 
2.16.2


* Re: [block regression] kernel oops triggered by removing scsi device during IO
  2018-04-09 22:54     ` Bart Van Assche
@ 2018-04-09 22:58       ` Jens Axboe
  2018-04-09 23:07         ` Bart Van Assche
  2018-04-10  1:30       ` Ming Lei
  1 sibling, 1 reply; 15+ messages in thread
From: Jens Axboe @ 2018-04-09 22:58 UTC (permalink / raw)
  To: Bart Van Assche, joseph.qi, ming.lei; +Cc: linux-block

On 4/9/18 4:54 PM, Bart Van Assche wrote:
> On Mon, 2018-04-09 at 14:54 +0800, Joseph Qi wrote:
>> The oops happens during generic_make_request_checks(), in
>> blk_throtl_bio() to be exact.
>> So if we want to bypass a dying queue, we have to check this before
>> generic_make_request_checks(), I think.
> 
> How about something like the patch below?
> 
> Thanks,
> 
> Bart.
> 
> Subject: [PATCH] blk-mq: Avoid that submitting a bio concurrently with device
>  removal triggers a crash
> 
> Because blkcg_exit_queue() is now called from inside blk_cleanup_queue()
> it is no longer safe to access cgroup information during or after the
> blk_cleanup_queue() call. Hence protect the generic_make_request_checks()
> call with a blk_queue_enter() / blk_queue_exit() pair.
> 
> ---
>  block/blk-core.c | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
> 
> diff --git a/block/blk-core.c b/block/blk-core.c
> index d69888ff52f0..0c48bef8490f 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -2388,9 +2388,24 @@ blk_qc_t generic_make_request(struct bio *bio)
>  	 * yet.
>  	 */
>  	struct bio_list bio_list_on_stack[2];
> +	blk_mq_req_flags_t flags = bio->bi_opf & REQ_NOWAIT ?
> +		BLK_MQ_REQ_NOWAIT : 0;
> +	struct request_queue *q = bio->bi_disk->queue;
> +	bool check_result;
>  	blk_qc_t ret = BLK_QC_T_NONE;
>  
> -	if (!generic_make_request_checks(bio))
> +	if (blk_queue_enter(q, flags) < 0) {
> +		if (!blk_queue_dying(q) && (bio->bi_opf & REQ_NOWAIT))
> +			bio_wouldblock_error(bio);
> +		else
> +			bio_io_error(bio);
> +		return ret;
> +	}
> +
> +	check_result = generic_make_request_checks(bio);
> +	blk_queue_exit(q);

This ends up being nutty in the generic_make_request() case, where we
do the exact same enter/exit logic right after. That needs to get unified.
Maybe move the queue enter into generic_make_request_checks(), and exit
in the caller?

-- 
Jens Axboe


* Re: [block regression] kernel oops triggered by removing scsi device during IO
  2018-04-09 22:58       ` Jens Axboe
@ 2018-04-09 23:07         ` Bart Van Assche
  0 siblings, 0 replies; 15+ messages in thread
From: Bart Van Assche @ 2018-04-09 23:07 UTC (permalink / raw)
  To: ming.lei, axboe, joseph.qi; +Cc: linux-block

On Mon, 2018-04-09 at 16:58 -0600, Jens Axboe wrote:
> This ends up being nutty in the generic_make_request() case, where we
> do the exact same enter/exit logic right after. That needs to get unified.
> Maybe move the queue enter into generic_make_request_checks(), and exit
> in the caller?

Hello Jens,

There is a challenge: generic_make_request() supports bio chains in which
different bio's apply to different request queues, and it also supports bio
chains in which some bio's have the flag REQ_NOWAIT set and others not. Is
it safe to drop that support?

Thanks,

Bart.


* Re: [block regression] kernel oops triggered by removing scsi device during IO
  2018-04-09 22:54     ` Bart Van Assche
  2018-04-09 22:58       ` Jens Axboe
@ 2018-04-10  1:30       ` Ming Lei
  2018-04-10  1:34         ` Bart Van Assche
  1 sibling, 1 reply; 15+ messages in thread
From: Ming Lei @ 2018-04-10  1:30 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: axboe, joseph.qi, linux-block

On Mon, Apr 09, 2018 at 10:54:57PM +0000, Bart Van Assche wrote:
> On Mon, 2018-04-09 at 14:54 +0800, Joseph Qi wrote:
> > The oops happens during generic_make_request_checks(), in
> > blk_throtl_bio() exactly.
> > So if we want to bypass dying queue, we have to check this before
> > generic_make_request_checks(), I think.
> 
> How about something like the patch below?
> 
> Thanks,
> 
> Bart.
> 
> Subject: [PATCH] blk-mq: Avoid that submitting a bio concurrently with device
>  removal triggers a crash
> 
> Because blkcg_exit_queue() is now called from inside blk_cleanup_queue()
> it is no longer safe to access cgroup information during or after the
> blk_cleanup_queue() call. Hence protect the generic_make_request_checks()
> call with a blk_queue_enter() / blk_queue_exit() pair.
> 
> ---
>  block/blk-core.c | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
> 
> diff --git a/block/blk-core.c b/block/blk-core.c
> index d69888ff52f0..0c48bef8490f 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -2388,9 +2388,24 @@ blk_qc_t generic_make_request(struct bio *bio)
>  	 * yet.
>  	 */
>  	struct bio_list bio_list_on_stack[2];
> +	blk_mq_req_flags_t flags = bio->bi_opf & REQ_NOWAIT ?
> +		BLK_MQ_REQ_NOWAIT : 0;
> +	struct request_queue *q = bio->bi_disk->queue;
> +	bool check_result;
>  	blk_qc_t ret = BLK_QC_T_NONE;
>  
> -	if (!generic_make_request_checks(bio))
> +	if (blk_queue_enter(q, flags) < 0) {

The queue pointer needs to be checked for NULL before calling
blk_queue_enter(), since that check is currently done inside
generic_make_request_checks().
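
For instance, something like the following (untested) on top of your patch:

```diff
@@ blk_qc_t generic_make_request(struct bio *bio)
 	blk_qc_t ret = BLK_QC_T_NONE;
 
+	/* generic_make_request_checks() used to catch a NULL queue;
+	 * with blk_queue_enter() moved in front of it, the check has
+	 * to happen here first. */
+	if (unlikely(!q)) {
+		bio_io_error(bio);
+		return ret;
+	}
+
 	if (blk_queue_enter(q, flags) < 0) {
```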

Also, is it possible to see the queue freed here?

-- 
Ming


* Re: [block regression] kernel oops triggered by removing scsi device during IO
  2018-04-10  1:30       ` Ming Lei
@ 2018-04-10  1:34         ` Bart Van Assche
  0 siblings, 0 replies; 15+ messages in thread
From: Bart Van Assche @ 2018-04-10  1:34 UTC (permalink / raw)
  To: ming.lei; +Cc: linux-block, axboe, joseph.qi

On Tue, 2018-04-10 at 09:30 +0800, Ming Lei wrote:
> Also, is it possible to see the queue freed here?

I think the caller should keep a reference on the request queue. Otherwise
we have a much bigger problem than a race between submitting a bio and
removing a request queue from the cgroup controller in blk_cleanup_queue().

Bart.


end of thread, other threads:[~2018-04-10  1:34 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-08  4:21 [block regression] kernel oops triggered by removing scsi device during IO Ming Lei
2018-04-08  8:11 ` Joseph Qi
2018-04-08  9:25   ` Ming Lei
2018-04-08 10:31     ` Ming Lei
2018-04-08 14:58   ` Bart Van Assche
2018-04-08 14:50 ` Bart Van Assche
2018-04-09  1:33   ` Joseph Qi
2018-04-09  2:48     ` Ming Lei
2018-04-09  4:47 ` Bart Van Assche
2018-04-09  6:54   ` Joseph Qi
2018-04-09 22:54     ` Bart Van Assche
2018-04-09 22:58       ` Jens Axboe
2018-04-09 23:07         ` Bart Van Assche
2018-04-10  1:30       ` Ming Lei
2018-04-10  1:34         ` Bart Van Assche
