* kernel oops with blk-mq-sched latest
@ 2017-01-17 8:57 Hannes Reinecke
2017-01-17 12:47 ` Jens Axboe
0 siblings, 1 reply; 7+ messages in thread
From: Hannes Reinecke @ 2017-01-17 8:57 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-block, Christoph Hellwig, device-mapper development
Hi Jens,
I gave your latest patchset from
git.kernel.dk/linux-block blk-mq-sched
I see a kernel oops when shutting down:
[ 2132.708929] systemd-shutdown[1]: Detaching DM devices.
[ 2132.965107] BUG: unable to handle kernel NULL pointer dereference at
00000000
00000001
[ 2133.037182] IP: dd_merged_requests+0x6/0x60
[ 2133.077816] PGD 0
[ 2133.077818]
[ 2133.113087] Oops: 0000 [#1] SMP
[ list of modules removed ]
[ 2133.925265] CPU: 20 PID: 1 Comm: systemd-shutdow Not tainted
4.10.0-rc4+ #543
[ 2133.990034] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 09/18/2013
[ 2134.050522] task: ffff88042d614040 task.stack: ffffc90003150000
[ 2134.106915] RIP: 0010:dd_merged_requests+0x6/0x60
[ 2134.150593] RSP: 0018:ffffc90003153b18 EFLAGS: 00010002
[ 2134.198740] RAX: ffffffff81cc6de0 RBX: ffff8804296d5040 RCX:
0000000000000001
[ 2134.262708] RDX: 0000000000000001 RSI: 0000000000000001 RDI:
ffff8804296d5040
[ 2134.326987] RBP: ffffc90003153b30 R08: 0000000000000000 R09:
0000000000000000
[ 2134.391054] R10: 0000000000000000 R11: 0001f8180001f815 R12:
0000000000000000
[ 2134.456095] R13: ffff8804296d5040 R14: ffff8804099801f0 R15:
0000000000000004
[ 2134.521196] FS: 00007fd64d3bf840(0000) GS:ffff88042f900000(0000)
knlGS:0000000000000000
[ 2134.595178] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2134.648637] CR2: 0000000000000001 CR3: 000000081b892000 CR4:
00000000000406e0
[ 2134.713349] Call Trace:
[ 2134.737168] ? elv_drain_elevator+0x29/0xa0
[ 2134.775821] __blk_drain_queue+0x52/0x1a0
[ 2134.812473] blk_queue_bypass_start+0x6e/0xa0
[ 2134.854009] blkcg_deactivate_policy+0x30/0xf0
[ 2134.894993] blk_throtl_exit+0x34/0x50
[ 2134.929450] blkcg_exit_queue+0x35/0x40
[ 2134.965089] blk_release_queue+0x33/0xe0
[ 2135.001364] kobject_cleanup+0x63/0x170
[ 2135.037412] kobject_put+0x25/0x50
[ 2135.068622] blk_cleanup_queue+0x198/0x260
[ 2135.107780] cleanup_mapped_device+0xb5/0xf0 [dm_mod]
[ 2135.154741] __dm_destroy+0x1a5/0x290 [dm_mod]
[ 2135.195009] dm_destroy+0x13/0x20 [dm_mod]
[ 2135.232149] dev_remove+0xde/0x120 [dm_mod]
[ 2135.270184] ? dev_suspend+0x210/0x210 [dm_mod]
[ 2135.311478] ctl_ioctl+0x20b/0x510 [dm_mod]
[ 2135.349680] ? terminate_walk+0xc3/0x140
[ 2135.385442] ? kmem_cache_free+0x10/0x260
[ 2135.422315] dm_ctl_ioctl+0x13/0x20 [dm_mod]
Known issue?
Cheers,
Hannes
--
Dr. Hannes Reinecke Teamlead Storage & Networking
hare@suse.de +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: kernel oops with blk-mq-sched latest
2017-01-17 8:57 kernel oops with blk-mq-sched latest Hannes Reinecke
@ 2017-01-17 12:47 ` Jens Axboe
0 siblings, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2017-01-17 12:47 UTC (permalink / raw)
To: Hannes Reinecke; +Cc: linux-block, Christoph Hellwig, device-mapper development
On 01/17/2017 12:57 AM, Hannes Reinecke wrote:
> Hi Jens,
>
> I gave your latest patchset from
>
> git.kernel.dk/linux-block blk-mq-sched
>
> I see a kernel oops when shutting down:
>
> [ 2132.708929] systemd-shutdown[1]: Detaching DM devices.
> [ 2132.965107] BUG: unable to handle kernel NULL pointer dereference at
> 00000000
> 00000001
> [ 2133.037182] IP: dd_merged_requests+0x6/0x60
> [ 2133.077816] PGD 0
> [ 2133.077818]
> [ 2133.113087] Oops: 0000 [#1] SMP
> [ list of modules removed ]
> [ 2133.925265] CPU: 20 PID: 1 Comm: systemd-shutdow Not tainted
> 4.10.0-rc4+ #543
> [ 2133.990034] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 09/18/2013
> [ 2134.050522] task: ffff88042d614040 task.stack: ffffc90003150000
> [ 2134.106915] RIP: 0010:dd_merged_requests+0x6/0x60
> [ 2134.150593] RSP: 0018:ffffc90003153b18 EFLAGS: 00010002
> [ 2134.198740] RAX: ffffffff81cc6de0 RBX: ffff8804296d5040 RCX:
> 0000000000000001
> [ 2134.262708] RDX: 0000000000000001 RSI: 0000000000000001 RDI:
> ffff8804296d5040
> [ 2134.326987] RBP: ffffc90003153b30 R08: 0000000000000000 R09:
> 0000000000000000
> [ 2134.391054] R10: 0000000000000000 R11: 0001f8180001f815 R12:
> 0000000000000000
> [ 2134.456095] R13: ffff8804296d5040 R14: ffff8804099801f0 R15:
> 0000000000000004
> [ 2134.521196] FS: 00007fd64d3bf840(0000) GS:ffff88042f900000(0000)
> knlGS:0000000000000000
> [ 2134.595178] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 2134.648637] CR2: 0000000000000001 CR3: 000000081b892000 CR4:
> 00000000000406e0
> [ 2134.713349] Call Trace:
> [ 2134.737168] ? elv_drain_elevator+0x29/0xa0
> [ 2134.775821] __blk_drain_queue+0x52/0x1a0
> [ 2134.812473] blk_queue_bypass_start+0x6e/0xa0
> [ 2134.854009] blkcg_deactivate_policy+0x30/0xf0
> [ 2134.894993] blk_throtl_exit+0x34/0x50
> [ 2134.929450] blkcg_exit_queue+0x35/0x40
> [ 2134.965089] blk_release_queue+0x33/0xe0
> [ 2135.001364] kobject_cleanup+0x63/0x170
> [ 2135.037412] kobject_put+0x25/0x50
> [ 2135.068622] blk_cleanup_queue+0x198/0x260
> [ 2135.107780] cleanup_mapped_device+0xb5/0xf0 [dm_mod]
> [ 2135.154741] __dm_destroy+0x1a5/0x290 [dm_mod]
> [ 2135.195009] dm_destroy+0x13/0x20 [dm_mod]
> [ 2135.232149] dev_remove+0xde/0x120 [dm_mod]
> [ 2135.270184] ? dev_suspend+0x210/0x210 [dm_mod]
> [ 2135.311478] ctl_ioctl+0x20b/0x510 [dm_mod]
> [ 2135.349680] ? terminate_walk+0xc3/0x140
> [ 2135.385442] ? kmem_cache_free+0x10/0x260
> [ 2135.422315] dm_ctl_ioctl+0x13/0x20 [dm_mod]
>
> Known issue?
Try with this, looks like we're calling the old bypass path for the new
path, which is now triggering because q->elevator is true.
diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 8ba0af780e88..1dd14c5b637a 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1284,7 +1284,12 @@ void blkcg_deactivate_policy(struct request_queue *q,
if (!blkcg_policy_enabled(q, pol))
return;
- blk_queue_bypass_start(q);
+ if (q->mq_ops) {
+ blk_mq_freeze_queue(q);
+ blk_mq_quiesce_queue(q);
+ } else
+ blk_queue_bypass_start(q);
+
spin_lock_irq(q->queue_lock);
__clear_bit(pol->plid, q->blkcg_pols);
@@ -1304,7 +1309,11 @@ void blkcg_deactivate_policy(struct request_queue *q,
}
spin_unlock_irq(q->queue_lock);
- blk_queue_bypass_end(q);
+
+ if (q->mq_ops)
+ blk_mq_unfreeze_queue(q);
+ else
+ blk_queue_bypass_end(q);
}
EXPORT_SYMBOL_GPL(blkcg_deactivate_policy);
diff --git a/block/elevator.c b/block/elevator.c
index d14cb87e6564..464372840774 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -613,6 +613,9 @@ void elv_drain_elevator(struct request_queue *q)
{
static int printed;
+ if (WARN_ON_ONCE(q->mq_ops))
+ return;
+
lockdep_assert_held(q->queue_lock);
while (q->elevator->type->ops.sq.elevator_dispatch_fn(q, 1))
--
Jens Axboe
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: kernel oops with blk-mq-sched latest
@ 2017-01-17 12:47 ` Jens Axboe
0 siblings, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2017-01-17 12:47 UTC (permalink / raw)
To: Hannes Reinecke; +Cc: linux-block, device-mapper development, Christoph Hellwig
On 01/17/2017 12:57 AM, Hannes Reinecke wrote:
> Hi Jens,
>
> I gave your latest patchset from
>
> git.kernel.dk/linux-block blk-mq-sched
>
> I see a kernel oops when shutting down:
>
> [ 2132.708929] systemd-shutdown[1]: Detaching DM devices.
> [ 2132.965107] BUG: unable to handle kernel NULL pointer dereference at
> 00000000
> 00000001
> [ 2133.037182] IP: dd_merged_requests+0x6/0x60
> [ 2133.077816] PGD 0
> [ 2133.077818]
> [ 2133.113087] Oops: 0000 [#1] SMP
> [ list of modules removed ]
> [ 2133.925265] CPU: 20 PID: 1 Comm: systemd-shutdow Not tainted
> 4.10.0-rc4+ #543
> [ 2133.990034] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 09/18/2013
> [ 2134.050522] task: ffff88042d614040 task.stack: ffffc90003150000
> [ 2134.106915] RIP: 0010:dd_merged_requests+0x6/0x60
> [ 2134.150593] RSP: 0018:ffffc90003153b18 EFLAGS: 00010002
> [ 2134.198740] RAX: ffffffff81cc6de0 RBX: ffff8804296d5040 RCX:
> 0000000000000001
> [ 2134.262708] RDX: 0000000000000001 RSI: 0000000000000001 RDI:
> ffff8804296d5040
> [ 2134.326987] RBP: ffffc90003153b30 R08: 0000000000000000 R09:
> 0000000000000000
> [ 2134.391054] R10: 0000000000000000 R11: 0001f8180001f815 R12:
> 0000000000000000
> [ 2134.456095] R13: ffff8804296d5040 R14: ffff8804099801f0 R15:
> 0000000000000004
> [ 2134.521196] FS: 00007fd64d3bf840(0000) GS:ffff88042f900000(0000)
> knlGS:0000000000000000
> [ 2134.595178] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 2134.648637] CR2: 0000000000000001 CR3: 000000081b892000 CR4:
> 00000000000406e0
> [ 2134.713349] Call Trace:
> [ 2134.737168] ? elv_drain_elevator+0x29/0xa0
> [ 2134.775821] __blk_drain_queue+0x52/0x1a0
> [ 2134.812473] blk_queue_bypass_start+0x6e/0xa0
> [ 2134.854009] blkcg_deactivate_policy+0x30/0xf0
> [ 2134.894993] blk_throtl_exit+0x34/0x50
> [ 2134.929450] blkcg_exit_queue+0x35/0x40
> [ 2134.965089] blk_release_queue+0x33/0xe0
> [ 2135.001364] kobject_cleanup+0x63/0x170
> [ 2135.037412] kobject_put+0x25/0x50
> [ 2135.068622] blk_cleanup_queue+0x198/0x260
> [ 2135.107780] cleanup_mapped_device+0xb5/0xf0 [dm_mod]
> [ 2135.154741] __dm_destroy+0x1a5/0x290 [dm_mod]
> [ 2135.195009] dm_destroy+0x13/0x20 [dm_mod]
> [ 2135.232149] dev_remove+0xde/0x120 [dm_mod]
> [ 2135.270184] ? dev_suspend+0x210/0x210 [dm_mod]
> [ 2135.311478] ctl_ioctl+0x20b/0x510 [dm_mod]
> [ 2135.349680] ? terminate_walk+0xc3/0x140
> [ 2135.385442] ? kmem_cache_free+0x10/0x260
> [ 2135.422315] dm_ctl_ioctl+0x13/0x20 [dm_mod]
>
> Known issue?
Try with this, looks like we're calling the old bypass path for the new
path, which is now triggering because q->elevator is true.
diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 8ba0af780e88..1dd14c5b637a 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1284,7 +1284,12 @@ void blkcg_deactivate_policy(struct request_queue *q,
if (!blkcg_policy_enabled(q, pol))
return;
- blk_queue_bypass_start(q);
+ if (q->mq_ops) {
+ blk_mq_freeze_queue(q);
+ blk_mq_quiesce_queue(q);
+ } else
+ blk_queue_bypass_start(q);
+
spin_lock_irq(q->queue_lock);
__clear_bit(pol->plid, q->blkcg_pols);
@@ -1304,7 +1309,11 @@ void blkcg_deactivate_policy(struct request_queue *q,
}
spin_unlock_irq(q->queue_lock);
- blk_queue_bypass_end(q);
+
+ if (q->mq_ops)
+ blk_mq_unfreeze_queue(q);
+ else
+ blk_queue_bypass_end(q);
}
EXPORT_SYMBOL_GPL(blkcg_deactivate_policy);
diff --git a/block/elevator.c b/block/elevator.c
index d14cb87e6564..464372840774 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -613,6 +613,9 @@ void elv_drain_elevator(struct request_queue *q)
{
static int printed;
+ if (WARN_ON_ONCE(q->mq_ops))
+ return;
+
lockdep_assert_held(q->queue_lock);
while (q->elevator->type->ops.sq.elevator_dispatch_fn(q, 1))
--
Jens Axboe
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: kernel oops with blk-mq-sched latest
2017-01-17 12:47 ` Jens Axboe
(?)
@ 2017-01-17 13:00 ` Jens Axboe
2017-01-18 11:48 ` Hannes Reinecke
-1 siblings, 1 reply; 7+ messages in thread
From: Jens Axboe @ 2017-01-17 13:00 UTC (permalink / raw)
To: Hannes Reinecke; +Cc: linux-block, Christoph Hellwig, device-mapper development
On 01/17/2017 04:47 AM, Jens Axboe wrote:
> On 01/17/2017 12:57 AM, Hannes Reinecke wrote:
>> Hi Jens,
>>
>> I gave your latest patchset from
>>
>> git.kernel.dk/linux-block blk-mq-sched
>>
>> I see a kernel oops when shutting down:
>>
>> [ 2132.708929] systemd-shutdown[1]: Detaching DM devices.
>> [ 2132.965107] BUG: unable to handle kernel NULL pointer dereference at
>> 00000000
>> 00000001
>> [ 2133.037182] IP: dd_merged_requests+0x6/0x60
>> [ 2133.077816] PGD 0
>> [ 2133.077818]
>> [ 2133.113087] Oops: 0000 [#1] SMP
>> [ list of modules removed ]
>> [ 2133.925265] CPU: 20 PID: 1 Comm: systemd-shutdow Not tainted
>> 4.10.0-rc4+ #543
>> [ 2133.990034] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 09/18/2013
>> [ 2134.050522] task: ffff88042d614040 task.stack: ffffc90003150000
>> [ 2134.106915] RIP: 0010:dd_merged_requests+0x6/0x60
>> [ 2134.150593] RSP: 0018:ffffc90003153b18 EFLAGS: 00010002
>> [ 2134.198740] RAX: ffffffff81cc6de0 RBX: ffff8804296d5040 RCX:
>> 0000000000000001
>> [ 2134.262708] RDX: 0000000000000001 RSI: 0000000000000001 RDI:
>> ffff8804296d5040
>> [ 2134.326987] RBP: ffffc90003153b30 R08: 0000000000000000 R09:
>> 0000000000000000
>> [ 2134.391054] R10: 0000000000000000 R11: 0001f8180001f815 R12:
>> 0000000000000000
>> [ 2134.456095] R13: ffff8804296d5040 R14: ffff8804099801f0 R15:
>> 0000000000000004
>> [ 2134.521196] FS: 00007fd64d3bf840(0000) GS:ffff88042f900000(0000)
>> knlGS:0000000000000000
>> [ 2134.595178] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 2134.648637] CR2: 0000000000000001 CR3: 000000081b892000 CR4:
>> 00000000000406e0
>> [ 2134.713349] Call Trace:
>> [ 2134.737168] ? elv_drain_elevator+0x29/0xa0
>> [ 2134.775821] __blk_drain_queue+0x52/0x1a0
>> [ 2134.812473] blk_queue_bypass_start+0x6e/0xa0
>> [ 2134.854009] blkcg_deactivate_policy+0x30/0xf0
>> [ 2134.894993] blk_throtl_exit+0x34/0x50
>> [ 2134.929450] blkcg_exit_queue+0x35/0x40
>> [ 2134.965089] blk_release_queue+0x33/0xe0
>> [ 2135.001364] kobject_cleanup+0x63/0x170
>> [ 2135.037412] kobject_put+0x25/0x50
>> [ 2135.068622] blk_cleanup_queue+0x198/0x260
>> [ 2135.107780] cleanup_mapped_device+0xb5/0xf0 [dm_mod]
>> [ 2135.154741] __dm_destroy+0x1a5/0x290 [dm_mod]
>> [ 2135.195009] dm_destroy+0x13/0x20 [dm_mod]
>> [ 2135.232149] dev_remove+0xde/0x120 [dm_mod]
>> [ 2135.270184] ? dev_suspend+0x210/0x210 [dm_mod]
>> [ 2135.311478] ctl_ioctl+0x20b/0x510 [dm_mod]
>> [ 2135.349680] ? terminate_walk+0xc3/0x140
>> [ 2135.385442] ? kmem_cache_free+0x10/0x260
>> [ 2135.422315] dm_ctl_ioctl+0x13/0x20 [dm_mod]
>>
>> Known issue?
>
> Try with this, looks like we're calling the old bypass path for the new
> path, which is now triggering because q->elevator is true.
Should have grepped, there's one more path that uses the old bypass
path for q->mq_ops, potentially. This one handles that one, too.
diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 8ba0af780e88..2630f64bed19 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1223,7 +1223,11 @@ int blkcg_activate_policy(struct request_queue *q,
if (blkcg_policy_enabled(q, pol))
return 0;
- blk_queue_bypass_start(q);
+ if (q->mq_ops) {
+ blk_mq_freeze_queue(q);
+ blk_mq_quiesce_queue(q);
+ } else
+ blk_queue_bypass_start(q);
pd_prealloc:
if (!pd_prealloc) {
pd_prealloc = pol->pd_alloc_fn(GFP_KERNEL, q->node);
@@ -1261,7 +1265,10 @@ int blkcg_activate_policy(struct request_queue *q,
spin_unlock_irq(q->queue_lock);
out_bypass_end:
- blk_queue_bypass_end(q);
+ if (q->mq_ops)
+ blk_mq_unfreeze_queue(q);
+ else
+ blk_queue_bypass_end(q);
if (pd_prealloc)
pol->pd_free_fn(pd_prealloc);
return ret;
@@ -1284,7 +1291,12 @@ void blkcg_deactivate_policy(struct request_queue *q,
if (!blkcg_policy_enabled(q, pol))
return;
- blk_queue_bypass_start(q);
+ if (q->mq_ops) {
+ blk_mq_freeze_queue(q);
+ blk_mq_quiesce_queue(q);
+ } else
+ blk_queue_bypass_start(q);
+
spin_lock_irq(q->queue_lock);
__clear_bit(pol->plid, q->blkcg_pols);
@@ -1304,7 +1316,11 @@ void blkcg_deactivate_policy(struct request_queue *q,
}
spin_unlock_irq(q->queue_lock);
- blk_queue_bypass_end(q);
+
+ if (q->mq_ops)
+ blk_mq_unfreeze_queue(q);
+ else
+ blk_queue_bypass_end(q);
}
EXPORT_SYMBOL_GPL(blkcg_deactivate_policy);
diff --git a/block/elevator.c b/block/elevator.c
index d14cb87e6564..464372840774 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -613,6 +613,9 @@ void elv_drain_elevator(struct request_queue *q)
{
static int printed;
+ if (WARN_ON_ONCE(q->mq_ops))
+ return;
+
lockdep_assert_held(q->queue_lock);
while (q->elevator->type->ops.sq.elevator_dispatch_fn(q, 1))
--
Jens Axboe
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: kernel oops with blk-mq-sched latest
2017-01-17 13:00 ` Jens Axboe
@ 2017-01-18 11:48 ` Hannes Reinecke
0 siblings, 0 replies; 7+ messages in thread
From: Hannes Reinecke @ 2017-01-18 11:48 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-block, Christoph Hellwig, device-mapper development
On 01/17/2017 02:00 PM, Jens Axboe wrote:
> On 01/17/2017 04:47 AM, Jens Axboe wrote:
>> On 01/17/2017 12:57 AM, Hannes Reinecke wrote:
>>> Hi Jens,
>>>
>>> I gave your latest patchset from
>>>
>>> git.kernel.dk/linux-block blk-mq-sched
>>>
>>> I see a kernel oops when shutting down:
>>>
>>> [ 2132.708929] systemd-shutdown[1]: Detaching DM devices.
>>> [ 2132.965107] BUG: unable to handle kernel NULL pointer dereference at
>>> 00000000
>>> 00000001
>>> [ 2133.037182] IP: dd_merged_requests+0x6/0x60
>>> [ 2133.077816] PGD 0
>>> [ 2133.077818]
>>> [ 2133.113087] Oops: 0000 [#1] SMP
>>> [ list of modules removed ]
>>> [ 2133.925265] CPU: 20 PID: 1 Comm: systemd-shutdow Not tainted
>>> 4.10.0-rc4+ #543
>>> [ 2133.990034] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 09/18/2013
>>> [ 2134.050522] task: ffff88042d614040 task.stack: ffffc90003150000
>>> [ 2134.106915] RIP: 0010:dd_merged_requests+0x6/0x60
>>> [ 2134.150593] RSP: 0018:ffffc90003153b18 EFLAGS: 00010002
>>> [ 2134.198740] RAX: ffffffff81cc6de0 RBX: ffff8804296d5040 RCX:
>>> 0000000000000001
>>> [ 2134.262708] RDX: 0000000000000001 RSI: 0000000000000001 RDI:
>>> ffff8804296d5040
>>> [ 2134.326987] RBP: ffffc90003153b30 R08: 0000000000000000 R09:
>>> 0000000000000000
>>> [ 2134.391054] R10: 0000000000000000 R11: 0001f8180001f815 R12:
>>> 0000000000000000
>>> [ 2134.456095] R13: ffff8804296d5040 R14: ffff8804099801f0 R15:
>>> 0000000000000004
>>> [ 2134.521196] FS: 00007fd64d3bf840(0000) GS:ffff88042f900000(0000)
>>> knlGS:0000000000000000
>>> [ 2134.595178] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [ 2134.648637] CR2: 0000000000000001 CR3: 000000081b892000 CR4:
>>> 00000000000406e0
>>> [ 2134.713349] Call Trace:
>>> [ 2134.737168] ? elv_drain_elevator+0x29/0xa0
>>> [ 2134.775821] __blk_drain_queue+0x52/0x1a0
>>> [ 2134.812473] blk_queue_bypass_start+0x6e/0xa0
>>> [ 2134.854009] blkcg_deactivate_policy+0x30/0xf0
>>> [ 2134.894993] blk_throtl_exit+0x34/0x50
>>> [ 2134.929450] blkcg_exit_queue+0x35/0x40
>>> [ 2134.965089] blk_release_queue+0x33/0xe0
>>> [ 2135.001364] kobject_cleanup+0x63/0x170
>>> [ 2135.037412] kobject_put+0x25/0x50
>>> [ 2135.068622] blk_cleanup_queue+0x198/0x260
>>> [ 2135.107780] cleanup_mapped_device+0xb5/0xf0 [dm_mod]
>>> [ 2135.154741] __dm_destroy+0x1a5/0x290 [dm_mod]
>>> [ 2135.195009] dm_destroy+0x13/0x20 [dm_mod]
>>> [ 2135.232149] dev_remove+0xde/0x120 [dm_mod]
>>> [ 2135.270184] ? dev_suspend+0x210/0x210 [dm_mod]
>>> [ 2135.311478] ctl_ioctl+0x20b/0x510 [dm_mod]
>>> [ 2135.349680] ? terminate_walk+0xc3/0x140
>>> [ 2135.385442] ? kmem_cache_free+0x10/0x260
>>> [ 2135.422315] dm_ctl_ioctl+0x13/0x20 [dm_mod]
>>>
>>> Known issue?
>>
>> Try with this, looks like we're calling the old bypass path for the new
>> path, which is now triggering because q->elevator is true.
>
> Should have grepped, there's one more path that uses the old bypass
> path for q->mq_ops, potentially. This one handles that one, too.
>
>
> diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
> index 8ba0af780e88..2630f64bed19 100644
> --- a/block/blk-cgroup.c
> +++ b/block/blk-cgroup.c
> @@ -1223,7 +1223,11 @@ int blkcg_activate_policy(struct request_queue *q,
> if (blkcg_policy_enabled(q, pol))
> return 0;
>
> - blk_queue_bypass_start(q);
> + if (q->mq_ops) {
> + blk_mq_freeze_queue(q);
> + blk_mq_quiesce_queue(q);
> + } else
> + blk_queue_bypass_start(q);
> pd_prealloc:
> if (!pd_prealloc) {
> pd_prealloc = pol->pd_alloc_fn(GFP_KERNEL, q->node);
> @@ -1261,7 +1265,10 @@ int blkcg_activate_policy(struct request_queue *q,
>
> spin_unlock_irq(q->queue_lock);
> out_bypass_end:
> - blk_queue_bypass_end(q);
> + if (q->mq_ops)
> + blk_mq_unfreeze_queue(q);
> + else
> + blk_queue_bypass_end(q);
> if (pd_prealloc)
> pol->pd_free_fn(pd_prealloc);
> return ret;
> @@ -1284,7 +1291,12 @@ void blkcg_deactivate_policy(struct request_queue *q,
> if (!blkcg_policy_enabled(q, pol))
> return;
>
> - blk_queue_bypass_start(q);
> + if (q->mq_ops) {
> + blk_mq_freeze_queue(q);
> + blk_mq_quiesce_queue(q);
> + } else
> + blk_queue_bypass_start(q);
> +
> spin_lock_irq(q->queue_lock);
>
> __clear_bit(pol->plid, q->blkcg_pols);
> @@ -1304,7 +1316,11 @@ void blkcg_deactivate_policy(struct request_queue *q,
> }
>
> spin_unlock_irq(q->queue_lock);
> - blk_queue_bypass_end(q);
> +
> + if (q->mq_ops)
> + blk_mq_unfreeze_queue(q);
> + else
> + blk_queue_bypass_end(q);
> }
> EXPORT_SYMBOL_GPL(blkcg_deactivate_policy);
>
> diff --git a/block/elevator.c b/block/elevator.c
> index d14cb87e6564..464372840774 100644
> --- a/block/elevator.c
> +++ b/block/elevator.c
> @@ -613,6 +613,9 @@ void elv_drain_elevator(struct request_queue *q)
> {
> static int printed;
>
> + if (WARN_ON_ONCE(q->mq_ops))
> + return;
> +
> lockdep_assert_held(q->queue_lock);
>
> while (q->elevator->type->ops.sq.elevator_dispatch_fn(q, 1))
>
Nearly there.
You're missing a 'blk_mq_start_hw_queues(q)' after
blk_mq_unfreeze_queue(); without it the queue will stall after switching
the scheduler.
Also what's quite suspicious is this:
struct blkcg_gq *blkg_lookup_create(struct blkcg *blkcg,
struct request_queue *q)
{
struct blkcg_gq *blkg;
WARN_ON_ONCE(!rcu_read_lock_held());
lockdep_assert_held(q->queue_lock);
/*
* This could be the first entry point of blkcg implementation and
* we shouldn't allow anything to go through for a bypassing queue.
*/
if (unlikely(blk_queue_bypass(q)))
return ERR_PTR(blk_queue_dying(q) ? -ENODEV : -EBUSY);
which now won't work as the respective flags aren't set anymore.
Not sure if that's a problem, though.
But you might want to look at that, too.
Nevertheless, with the mentioned modifications to your patch the crashes
don't occur anymore.
Sad news is that it doesn't help _that_ much on spinning rust mpt3sas;
there I still see a ~50% performance penalty on reads.
Write's slightly better than sq performance, though.
Cheers,
Hannes
--
Dr. Hannes Reinecke Teamlead Storage & Networking
hare@suse.de +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: kernel oops with blk-mq-sched latest
@ 2017-01-18 11:48 ` Hannes Reinecke
0 siblings, 0 replies; 7+ messages in thread
From: Hannes Reinecke @ 2017-01-18 11:48 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-block, device-mapper development, Christoph Hellwig
On 01/17/2017 02:00 PM, Jens Axboe wrote:
> On 01/17/2017 04:47 AM, Jens Axboe wrote:
>> On 01/17/2017 12:57 AM, Hannes Reinecke wrote:
>>> Hi Jens,
>>>
>>> I gave your latest patchset from
>>>
>>> git.kernel.dk/linux-block blk-mq-sched
>>>
>>> I see a kernel oops when shutting down:
>>>
>>> [ 2132.708929] systemd-shutdown[1]: Detaching DM devices.
>>> [ 2132.965107] BUG: unable to handle kernel NULL pointer dereference at
>>> 00000000
>>> 00000001
>>> [ 2133.037182] IP: dd_merged_requests+0x6/0x60
>>> [ 2133.077816] PGD 0
>>> [ 2133.077818]
>>> [ 2133.113087] Oops: 0000 [#1] SMP
>>> [ list of modules removed ]
>>> [ 2133.925265] CPU: 20 PID: 1 Comm: systemd-shutdow Not tainted
>>> 4.10.0-rc4+ #543
>>> [ 2133.990034] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 09/18/2013
>>> [ 2134.050522] task: ffff88042d614040 task.stack: ffffc90003150000
>>> [ 2134.106915] RIP: 0010:dd_merged_requests+0x6/0x60
>>> [ 2134.150593] RSP: 0018:ffffc90003153b18 EFLAGS: 00010002
>>> [ 2134.198740] RAX: ffffffff81cc6de0 RBX: ffff8804296d5040 RCX:
>>> 0000000000000001
>>> [ 2134.262708] RDX: 0000000000000001 RSI: 0000000000000001 RDI:
>>> ffff8804296d5040
>>> [ 2134.326987] RBP: ffffc90003153b30 R08: 0000000000000000 R09:
>>> 0000000000000000
>>> [ 2134.391054] R10: 0000000000000000 R11: 0001f8180001f815 R12:
>>> 0000000000000000
>>> [ 2134.456095] R13: ffff8804296d5040 R14: ffff8804099801f0 R15:
>>> 0000000000000004
>>> [ 2134.521196] FS: 00007fd64d3bf840(0000) GS:ffff88042f900000(0000)
>>> knlGS:0000000000000000
>>> [ 2134.595178] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [ 2134.648637] CR2: 0000000000000001 CR3: 000000081b892000 CR4:
>>> 00000000000406e0
>>> [ 2134.713349] Call Trace:
>>> [ 2134.737168] ? elv_drain_elevator+0x29/0xa0
>>> [ 2134.775821] __blk_drain_queue+0x52/0x1a0
>>> [ 2134.812473] blk_queue_bypass_start+0x6e/0xa0
>>> [ 2134.854009] blkcg_deactivate_policy+0x30/0xf0
>>> [ 2134.894993] blk_throtl_exit+0x34/0x50
>>> [ 2134.929450] blkcg_exit_queue+0x35/0x40
>>> [ 2134.965089] blk_release_queue+0x33/0xe0
>>> [ 2135.001364] kobject_cleanup+0x63/0x170
>>> [ 2135.037412] kobject_put+0x25/0x50
>>> [ 2135.068622] blk_cleanup_queue+0x198/0x260
>>> [ 2135.107780] cleanup_mapped_device+0xb5/0xf0 [dm_mod]
>>> [ 2135.154741] __dm_destroy+0x1a5/0x290 [dm_mod]
>>> [ 2135.195009] dm_destroy+0x13/0x20 [dm_mod]
>>> [ 2135.232149] dev_remove+0xde/0x120 [dm_mod]
>>> [ 2135.270184] ? dev_suspend+0x210/0x210 [dm_mod]
>>> [ 2135.311478] ctl_ioctl+0x20b/0x510 [dm_mod]
>>> [ 2135.349680] ? terminate_walk+0xc3/0x140
>>> [ 2135.385442] ? kmem_cache_free+0x10/0x260
>>> [ 2135.422315] dm_ctl_ioctl+0x13/0x20 [dm_mod]
>>>
>>> Known issue?
>>
>> Try with this, looks like we're calling the old bypass path for the new
>> path, which is now triggering because q->elevator is true.
>
> Should have grepped, there's one more path that uses the old bypass
> path for q->mq_ops, potentially. This one handles that one, too.
>
>
> diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
> index 8ba0af780e88..2630f64bed19 100644
> --- a/block/blk-cgroup.c
> +++ b/block/blk-cgroup.c
> @@ -1223,7 +1223,11 @@ int blkcg_activate_policy(struct request_queue *q,
> if (blkcg_policy_enabled(q, pol))
> return 0;
>
> - blk_queue_bypass_start(q);
> + if (q->mq_ops) {
> + blk_mq_freeze_queue(q);
> + blk_mq_quiesce_queue(q);
> + } else
> + blk_queue_bypass_start(q);
> pd_prealloc:
> if (!pd_prealloc) {
> pd_prealloc = pol->pd_alloc_fn(GFP_KERNEL, q->node);
> @@ -1261,7 +1265,10 @@ int blkcg_activate_policy(struct request_queue *q,
>
> spin_unlock_irq(q->queue_lock);
> out_bypass_end:
> - blk_queue_bypass_end(q);
> + if (q->mq_ops)
> + blk_mq_unfreeze_queue(q);
> + else
> + blk_queue_bypass_end(q);
> if (pd_prealloc)
> pol->pd_free_fn(pd_prealloc);
> return ret;
> @@ -1284,7 +1291,12 @@ void blkcg_deactivate_policy(struct request_queue *q,
> if (!blkcg_policy_enabled(q, pol))
> return;
>
> - blk_queue_bypass_start(q);
> + if (q->mq_ops) {
> + blk_mq_freeze_queue(q);
> + blk_mq_quiesce_queue(q);
> + } else
> + blk_queue_bypass_start(q);
> +
> spin_lock_irq(q->queue_lock);
>
> __clear_bit(pol->plid, q->blkcg_pols);
> @@ -1304,7 +1316,11 @@ void blkcg_deactivate_policy(struct request_queue *q,
> }
>
> spin_unlock_irq(q->queue_lock);
> - blk_queue_bypass_end(q);
> +
> + if (q->mq_ops)
> + blk_mq_unfreeze_queue(q);
> + else
> + blk_queue_bypass_end(q);
> }
> EXPORT_SYMBOL_GPL(blkcg_deactivate_policy);
>
> diff --git a/block/elevator.c b/block/elevator.c
> index d14cb87e6564..464372840774 100644
> --- a/block/elevator.c
> +++ b/block/elevator.c
> @@ -613,6 +613,9 @@ void elv_drain_elevator(struct request_queue *q)
> {
> static int printed;
>
> + if (WARN_ON_ONCE(q->mq_ops))
> + return;
> +
> lockdep_assert_held(q->queue_lock);
>
> while (q->elevator->type->ops.sq.elevator_dispatch_fn(q, 1))
>
Nearly there.
You're missing a 'blk_mq_start_hw_queues(q)' after
blk_mq_unfreeze_queue(); without it the queue will stall after switching
the scheduler.
Also what's quite suspicious is this:
struct blkcg_gq *blkg_lookup_create(struct blkcg *blkcg,
struct request_queue *q)
{
struct blkcg_gq *blkg;
WARN_ON_ONCE(!rcu_read_lock_held());
lockdep_assert_held(q->queue_lock);
/*
* This could be the first entry point of blkcg implementation and
* we shouldn't allow anything to go through for a bypassing queue.
*/
if (unlikely(blk_queue_bypass(q)))
return ERR_PTR(blk_queue_dying(q) ? -ENODEV : -EBUSY);
which now won't work as the respective flags aren't set anymore.
Not sure if that's a problem, though.
But you might want to look at that, too.
Nevertheless, with the mentioned modifications to your patch the crashes
don't occur anymore.
Sad news is that it doesn't help _that_ much on spinning rust mpt3sas;
there I still see a ~50% performance penalty on reads.
Write's slightly better than sq performance, though.
Cheers,
Hannes
--
Dr. Hannes Reinecke Teamlead Storage & Networking
hare@suse.de +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: kernel oops with blk-mq-sched latest
2017-01-18 11:48 ` Hannes Reinecke
(?)
@ 2017-01-18 14:40 ` Jens Axboe
-1 siblings, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2017-01-18 14:40 UTC (permalink / raw)
To: Hannes Reinecke; +Cc: linux-block, Christoph Hellwig, device-mapper development
On 01/18/2017 03:48 AM, Hannes Reinecke wrote:
> Nearly there.
> You're missing a 'blk_mq_start_hw_queues(q)' after
> blk_mq_unfreeze_queue(); without it the queue will stall after switching
> the scheduler.
Yes indeed, forgot that. Needed after the quiesce.
> Also what's quite suspicious is this:
>
> struct blkcg_gq *blkg_lookup_create(struct blkcg *blkcg,
> struct request_queue *q)
> {
> struct blkcg_gq *blkg;
>
> WARN_ON_ONCE(!rcu_read_lock_held());
> lockdep_assert_held(q->queue_lock);
>
> /*
> * This could be the first entry point of blkcg implementation and
> * we shouldn't allow anything to go through for a bypassing queue.
> */
> if (unlikely(blk_queue_bypass(q)))
> return ERR_PTR(blk_queue_dying(q) ? -ENODEV : -EBUSY);
>
> which now won't work as the respective flags aren't set anymore.
> Not sure if that's a problem, though.
> But you might want to look at that, too.
dying is still used on blk-mq, but yes, the bypass check should now be
frozen for blk-mq. Not really directly related to the above change,
but it should be fixed up.
> Nevertheless, with the mentioned modifications to your patch the crashes
> don't occur anymore.
Great
> Sad news is that it doesn't help _that_ much on spinning rust mpt3sas;
> there I still see a ~50% performance penalty on reads.
> Write's slightly better than sq performance, though.
What is the test case? Full details please, from hardware to what you
are running. As I've mentioned before, I don't necessarily think your
performance issues are related to scheduling. Would be nice to get
to the bottom of it, though. And for that, I need more details.
--
Jens Axboe
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2017-01-18 14:40 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-17 8:57 kernel oops with blk-mq-sched latest Hannes Reinecke
2017-01-17 12:47 ` Jens Axboe
2017-01-17 12:47 ` Jens Axboe
2017-01-17 13:00 ` Jens Axboe
2017-01-18 11:48 ` Hannes Reinecke
2017-01-18 11:48 ` Hannes Reinecke
2017-01-18 14:40 ` Jens Axboe
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.