linux-kernel.vger.kernel.org archive mirror
* blk-mq: WARN at block/blk-mq.c:585 __blk_mq_run_hw_queue
@ 2014-05-07 15:37 Sasha Levin
  2014-05-07 15:45 ` Jens Axboe
  0 siblings, 1 reply; 8+ messages in thread
From: Sasha Levin @ 2014-05-07 15:37 UTC (permalink / raw)
  To: axboe; +Cc: LKML, Dave Jones

Hi all,

While fuzzing with trinity inside a KVM tools guest running the latest -next
kernel I've stumbled on the following spew:

[  986.962569] WARNING: CPU: 41 PID: 41607 at block/blk-mq.c:585 __blk_mq_run_hw_queue+0x90/0x500()
[  986.964364] Modules linked in:
[  986.964996] CPU: 41 PID: 41607 Comm: kworker/u147:1 Not tainted 3.15.0-rc4-next-20140506-sasha-00021-gc164334-dirty #447
[  986.967025] Workqueue: kblockd blk_mq_run_work_fn
[  986.967939]  0000000000000009 ffff8802410e1c68 ffffffff94536f8a 0000000000000001
[  986.970333]  0000000000000000 ffff8802410e1ca8 ffffffff9115febc ffffffff91183ef7
[  986.973173]  ffff8801b478ef60 0000000000000000 ffff8802410e1ce8 ffff8801b3f07978
[  986.975749] Call Trace:
[  986.976700] dump_stack (lib/dump_stack.c:52)
[  986.978875] warn_slowpath_common (kernel/panic.c:430)
[  986.981924] ? process_one_work (kernel/workqueue.c:2224)
[  986.984245] warn_slowpath_null (kernel/panic.c:465)
[  986.986670] __blk_mq_run_hw_queue (arch/x86/include/asm/bitops.h:311 (discriminator 65) block/blk-mq.c:587 (discriminator 65))
[  986.989000] ? process_one_work (include/linux/workqueue.h:186 kernel/workqueue.c:611 kernel/workqueue.c:638 kernel/workqueue.c:2220)
[  986.991584] ? get_parent_ip (kernel/sched/core.c:2485)
[  986.994168] blk_mq_run_work_fn (block/blk-mq.c:777)
[  986.996379] process_one_work (kernel/workqueue.c:2227 include/linux/jump_label.h:105 include/trace/events/workqueue.h:111 kernel/workqueue.c:2232)
[  986.998641] ? process_one_work (include/linux/workqueue.h:186 kernel/workqueue.c:611 kernel/workqueue.c:638 kernel/workqueue.c:2220)
[  987.001352] worker_thread (kernel/workqueue.c:2354)
[  987.003676] ? rescuer_thread (kernel/workqueue.c:2303)
[  987.006703] kthread (kernel/kthread.c:210)
[  987.009161] ? kthread_create_on_node (kernel/kthread.c:176)
[  987.012536] ret_from_fork (arch/x86/kernel/entry_64.S:553)
[  987.015272] ? kthread_create_on_node (kernel/kthread.c:176)
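
For reference, the check that fires looks to be the cpumask sanity test
at the top of __blk_mq_run_hw_queue(). A sketch, not a verbatim quote of
the tree I'm on, but the bitops.h frame in the trace is consistent with
the cpumask_test_cpu() call:

	/* block/blk-mq.c, around line 585 (sketch) */
	WARN_ON(!cpumask_test_cpu(raw_smp_processor_id(), hctx->cpumask));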


Thanks,
Sasha


* Re: blk-mq: WARN at block/blk-mq.c:585 __blk_mq_run_hw_queue
  2014-05-07 15:37 blk-mq: WARN at block/blk-mq.c:585 __blk_mq_run_hw_queue Sasha Levin
@ 2014-05-07 15:45 ` Jens Axboe
  2014-05-07 15:53   ` Sasha Levin
  0 siblings, 1 reply; 8+ messages in thread
From: Jens Axboe @ 2014-05-07 15:45 UTC (permalink / raw)
  To: Sasha Levin; +Cc: LKML, Dave Jones

On 05/07/2014 09:37 AM, Sasha Levin wrote:
> Hi all,
> 
> While fuzzing with trinity inside a KVM tools guest running the latest -next
> kernel I've stumbled on the following spew:
> 
> [  986.962569] WARNING: CPU: 41 PID: 41607 at block/blk-mq.c:585 __blk_mq_run_hw_queue+0x90/0x500()

I'm going to need more info than this. What were you running? How was kvm
invoked (nr cpus)?




* Re: blk-mq: WARN at block/blk-mq.c:585 __blk_mq_run_hw_queue
  2014-05-07 15:45 ` Jens Axboe
@ 2014-05-07 15:53   ` Sasha Levin
  2014-05-07 15:55     ` Jens Axboe
  0 siblings, 1 reply; 8+ messages in thread
From: Sasha Levin @ 2014-05-07 15:53 UTC (permalink / raw)
  To: Jens Axboe; +Cc: LKML, Dave Jones

On 05/07/2014 11:45 AM, Jens Axboe wrote:
> On 05/07/2014 09:37 AM, Sasha Levin wrote:
>> Hi all,
>>
>> While fuzzing with trinity inside a KVM tools guest running the latest -next
>> kernel I've stumbled on the following spew:
>>
>> [  986.962569] WARNING: CPU: 41 PID: 41607 at block/blk-mq.c:585 __blk_mq_run_hw_queue+0x90/0x500()
> 
> I'm going to need more info than this. What were you running? How was kvm
> invoked (nr cpus)?

Sure!

It's running in a KVM tools guest (not qemu), with the following options:

'--rng --balloon -m 28000 -c 48 -p "numa=fake=32 init=/virt/init zcache ftrace_dump_on_oops debugpat kvm.mmu_audit=1 slub_debug=FZPU rcutorture.rcutorture_runnable=0 loop.max_loop=64 zram.num_devices=4 rcutorture.nreaders=8 oops=panic nr_hugepages=1000 numa_balancing=enable"'.

So basically 48 vcpus (the host has 128 physical ones), and ~28G of RAM.

I've been running trinity as a fuzzer, which doesn't handle logging too well,
so I can't reproduce its actions easily.

There was an additional stress of hotplugging CPUs and memory during this recent
fuzzing run, so it's fair to suspect that this happened as a result of that.
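
The hotplug stress amounts to toggling CPUs through sysfs in a loop
while trinity runs. A hypothetical stand-in (not my actual script, just
the same idea):

	/* offline/online CPUs 1..47 in a loop via sysfs (run as root) */
	#include <stdio.h>

	static void set_cpu_online(int cpu, int online)
	{
		char path[64];
		FILE *f;

		snprintf(path, sizeof(path),
			 "/sys/devices/system/cpu/cpu%d/online", cpu);
		f = fopen(path, "w");
		if (!f)
			return;	/* not hotpluggable, or already gone */
		fprintf(f, "%d\n", online);
		fclose(f);
	}

	int main(void)
	{
		int cpu;

		for (;;) {
			for (cpu = 1; cpu < 48; cpu++) {
				set_cpu_online(cpu, 0);
				set_cpu_online(cpu, 1);
			}
		}
	}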

Anything else that might be helpful?


Thanks,
Sasha


* Re: blk-mq: WARN at block/blk-mq.c:585 __blk_mq_run_hw_queue
  2014-05-07 15:53   ` Sasha Levin
@ 2014-05-07 15:55     ` Jens Axboe
  2014-05-09  3:22       ` Sasha Levin
  0 siblings, 1 reply; 8+ messages in thread
From: Jens Axboe @ 2014-05-07 15:55 UTC (permalink / raw)
  To: Sasha Levin; +Cc: LKML, Dave Jones

On 05/07/2014 09:53 AM, Sasha Levin wrote:
> On 05/07/2014 11:45 AM, Jens Axboe wrote:
>> On 05/07/2014 09:37 AM, Sasha Levin wrote:
>>> Hi all,
>>>
>>> While fuzzing with trinity inside a KVM tools guest running the latest -next
>>> kernel I've stumbled on the following spew:
>>>
>>> [  986.962569] WARNING: CPU: 41 PID: 41607 at block/blk-mq.c:585 __blk_mq_run_hw_queue+0x90/0x500()
>>
>> I'm going to need more info than this. What were you running? How was kvm
>> invoked (nr cpus)?
> 
> Sure!
> 
> It's running in a KVM tools guest (not qemu), with the following options:
> 
> '--rng --balloon -m 28000 -c 48 -p "numa=fake=32 init=/virt/init zcache ftrace_dump_on_oops debugpat kvm.mmu_audit=1 slub_debug=FZPU rcutorture.rcutorture_runnable=0 loop.max_loop=64 zram.num_devices=4 rcutorture.nreaders=8 oops=panic nr_hugepages=1000 numa_balancing=enable"'.
> 
> So basically 48 vcpus (the host has 128 physical ones), and ~28G of RAM.
> 
> I've been running trinity as a fuzzer, which doesn't handle logging too well,
> so I can't reproduce its actions easily.
> 
> There was an additional stress of hotplugging CPUs and memory during this recent
> fuzzing run, so it's fair to suspect that this happened as a result of that.

Aha!

> Anything else that might be helpful?

No, not too surprising given the info that cpu hotplug was being
stressed at the same time. blk-mq doesn't quiesce when this happens, so
it's quite likely that there are races between updating the cpu masks
and flushing out the previously queued work.
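
Roughly the interleaving I have in mind (schematic; the function names
are from Sasha's trace, the rest is illustrative rather than verbatim):

	/*
	 * CPU A                            hotplug path
	 * -----                            ------------
	 * blk_mq_run_hw_queue(hctx)
	 *   queues the hctx run work on
	 *   a CPU picked from hctx->cpumask
	 *                                   hctx->cpumask is rebuilt for
	 *                                   the new cpu map; the work
	 *                                   already queued is not flushed
	 * blk_mq_run_work_fn() later runs
	 * on a CPU no longer set in
	 * hctx->cpumask
	 *   -> __blk_mq_run_hw_queue()
	 *      WARN_ON(cpumask test) fires
	 */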



* Re: blk-mq: WARN at block/blk-mq.c:585 __blk_mq_run_hw_queue
  2014-05-07 15:55     ` Jens Axboe
@ 2014-05-09  3:22       ` Sasha Levin
  2014-05-09  3:27         ` Jens Axboe
  0 siblings, 1 reply; 8+ messages in thread
From: Sasha Levin @ 2014-05-09  3:22 UTC (permalink / raw)
  To: Jens Axboe; +Cc: LKML, Dave Jones

On 05/07/2014 11:55 AM, Jens Axboe wrote:
> On 05/07/2014 09:53 AM, Sasha Levin wrote:
>> On 05/07/2014 11:45 AM, Jens Axboe wrote:
>>> On 05/07/2014 09:37 AM, Sasha Levin wrote:
>>>> Hi all,
>>>>
>>>> While fuzzing with trinity inside a KVM tools guest running the latest -next
>>>> kernel I've stumbled on the following spew:
>>>>
>>>> [  986.962569] WARNING: CPU: 41 PID: 41607 at block/blk-mq.c:585 __blk_mq_run_hw_queue+0x90/0x500()
>>>
>>> I'm going to need more info than this. What were you running? How was kvm
>>> invoked (nr cpus)?
>>
>> Sure!
>>
>> It's running in a KVM tools guest (not qemu), with the following options:
>>
>> '--rng --balloon -m 28000 -c 48 -p "numa=fake=32 init=/virt/init zcache ftrace_dump_on_oops debugpat kvm.mmu_audit=1 slub_debug=FZPU rcutorture.rcutorture_runnable=0 loop.max_loop=64 zram.num_devices=4 rcutorture.nreaders=8 oops=panic nr_hugepages=1000 numa_balancing=enable"'.
>>
>> So basically 48 vcpus (the host has 128 physical ones), and ~28G of RAM.
>>
>> I've been running trinity as a fuzzer, which doesn't handle logging too well,
>> so I can't reproduce its actions easily.
>>
>> There was an additional stress of hotplugging CPUs and memory during this recent
>> fuzzing run, so it's fair to suspect that this happened as a result of that.
> 
> Aha!
> 
>> Anything else that might be helpful?
> 
> No, not too surprising given the info that cpu hotplug was being
> stressed at the same time. blk-mq doesn't quiesce when this happens, so
> it's quite likely that there are races between updating the cpu masks
> and flushing out the previously queued work.

So this warning is something you'd expect when CPUs go up/down?


Thanks,
Sasha



* Re: blk-mq: WARN at block/blk-mq.c:585 __blk_mq_run_hw_queue
  2014-05-09  3:22       ` Sasha Levin
@ 2014-05-09  3:27         ` Jens Axboe
  2014-05-09 12:12           ` Shaohua Li
  0 siblings, 1 reply; 8+ messages in thread
From: Jens Axboe @ 2014-05-09  3:27 UTC (permalink / raw)
  To: Sasha Levin; +Cc: LKML, Dave Jones

On 2014-05-08 21:22, Sasha Levin wrote:
> On 05/07/2014 11:55 AM, Jens Axboe wrote:
>> On 05/07/2014 09:53 AM, Sasha Levin wrote:
>>> On 05/07/2014 11:45 AM, Jens Axboe wrote:
>>>> On 05/07/2014 09:37 AM, Sasha Levin wrote:
>>>>> Hi all,
>>>>>
>>>>> While fuzzing with trinity inside a KVM tools guest running the latest -next
>>>>> kernel I've stumbled on the following spew:
>>>>>
>>>>> [  986.962569] WARNING: CPU: 41 PID: 41607 at block/blk-mq.c:585 __blk_mq_run_hw_queue+0x90/0x500()
>>>>
>>>> I'm going to need more info than this. What were you running? How was kvm
>>>> invoked (nr cpus)?
>>>
>>> Sure!
>>>
>>> It's running in a KVM tools guest (not qemu), with the following options:
>>>
>>> '--rng --balloon -m 28000 -c 48 -p "numa=fake=32 init=/virt/init zcache ftrace_dump_on_oops debugpat kvm.mmu_audit=1 slub_debug=FZPU rcutorture.rcutorture_runnable=0 loop.max_loop=64 zram.num_devices=4 rcutorture.nreaders=8 oops=panic nr_hugepages=1000 numa_balancing=enable"'.
>>>
>>> So basically 48 vcpus (the host has 128 physical ones), and ~28G of RAM.
>>>
>>> I've been running trinity as a fuzzer, which doesn't handle logging too well,
>>> so I can't reproduce its actions easily.
>>>
>>> There was an additional stress of hotplugging CPUs and memory during this recent
>>> fuzzing run, so it's fair to suspect that this happened as a result of that.
>>
>> Aha!
>>
>>> Anything else that might be helpful?
>>
>> No, not too surprising given the info that cpu hotplug was being
>> stressed at the same time. blk-mq doesn't quiesce when this happens, so
>> it's quite likely that there are races between updating the cpu masks
>> and flushing out the previously queued work.
>
> So this warning is something you'd expect when CPUs go up/down?

Let me put it this way - I'm not surprised that it triggered, but it 
will of course be fixed up.

-- 
Jens Axboe



* Re: blk-mq: WARN at block/blk-mq.c:585 __blk_mq_run_hw_queue
  2014-05-09  3:27         ` Jens Axboe
@ 2014-05-09 12:12           ` Shaohua Li
  2014-05-09 14:22             ` Jens Axboe
  0 siblings, 1 reply; 8+ messages in thread
From: Shaohua Li @ 2014-05-09 12:12 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Sasha Levin, LKML, Dave Jones

On Thu, May 08, 2014 at 09:27:42PM -0600, Jens Axboe wrote:
> On 2014-05-08 21:22, Sasha Levin wrote:
> >On 05/07/2014 11:55 AM, Jens Axboe wrote:
> >>On 05/07/2014 09:53 AM, Sasha Levin wrote:
> >>>On 05/07/2014 11:45 AM, Jens Axboe wrote:
> >>>>On 05/07/2014 09:37 AM, Sasha Levin wrote:
> >>>>>Hi all,
> >>>>>
> >>>>>While fuzzing with trinity inside a KVM tools guest running the latest -next
> >>>>>kernel I've stumbled on the following spew:
> >>>>>
> >>>>>[  986.962569] WARNING: CPU: 41 PID: 41607 at block/blk-mq.c:585 __blk_mq_run_hw_queue+0x90/0x500()
> >>>>
> >>>>I'm going to need more info than this. What were you running? How was kvm
> >>>>invoked (nr cpus)?
> >>>
> >>>Sure!
> >>>
> >>>It's running in a KVM tools guest (not qemu), with the following options:
> >>>
> >>>'--rng --balloon -m 28000 -c 48 -p "numa=fake=32 init=/virt/init zcache ftrace_dump_on_oops debugpat kvm.mmu_audit=1 slub_debug=FZPU rcutorture.rcutorture_runnable=0 loop.max_loop=64 zram.num_devices=4 rcutorture.nreaders=8 oops=panic nr_hugepages=1000 numa_balancing=enable"'.
> >>>
> >>>So basically 48 vcpus (the host has 128 physical ones), and ~28G of RAM.
> >>>
> >>>I've been running trinity as a fuzzer, which doesn't handle logging too well,
> >>>so I can't reproduce its actions easily.
> >>>
> >>>There was an additional stress of hotplugging CPUs and memory during this recent
> >>>fuzzing run, so it's fair to suspect that this happened as a result of that.
> >>
> >>Aha!
> >>
> >>>Anything else that might be helpful?
> >>
> >>No, not too surprising given the info that cpu hotplug was being
> >>stressed at the same time. blk-mq doesn't quiesce when this happens, so
> >>it's quite likely that there are races between updating the cpu masks
> >>and flushing out the previously queued work.
> >
> >So this warning is something you'd expect when CPUs go up/down?
> 
> Let me put it this way - I'm not surprised that it triggered, but it
> will of course be fixed up.

Does reverting 1eaade629f5c47 change anything?

The ctx->online flag isn't changed immediately when a cpu goes offline, so
there is likely something wrong there. I'm wondering why we need that patch?

Thanks,
Shaohua


* Re: blk-mq: WARN at block/blk-mq.c:585 __blk_mq_run_hw_queue
  2014-05-09 12:12           ` Shaohua Li
@ 2014-05-09 14:22             ` Jens Axboe
  0 siblings, 0 replies; 8+ messages in thread
From: Jens Axboe @ 2014-05-09 14:22 UTC (permalink / raw)
  To: Shaohua Li; +Cc: Sasha Levin, LKML, Dave Jones

On 05/09/2014 06:12 AM, Shaohua Li wrote:
> On Thu, May 08, 2014 at 09:27:42PM -0600, Jens Axboe wrote:
>> On 2014-05-08 21:22, Sasha Levin wrote:
>>> On 05/07/2014 11:55 AM, Jens Axboe wrote:
>>>> On 05/07/2014 09:53 AM, Sasha Levin wrote:
>>>>> On 05/07/2014 11:45 AM, Jens Axboe wrote:
>>>>>> On 05/07/2014 09:37 AM, Sasha Levin wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> While fuzzing with trinity inside a KVM tools guest running the latest -next
>>>>>>> kernel I've stumbled on the following spew:
>>>>>>>
>>>>>>> [  986.962569] WARNING: CPU: 41 PID: 41607 at block/blk-mq.c:585 __blk_mq_run_hw_queue+0x90/0x500()
>>>>>>
>>>>>> I'm going to need more info than this. What were you running? How was kvm
>>>>>> invoked (nr cpus)?
>>>>>
>>>>> Sure!
>>>>>
>>>>> It's running in a KVM tools guest (not qemu), with the following options:
>>>>>
>>>>> '--rng --balloon -m 28000 -c 48 -p "numa=fake=32 init=/virt/init zcache ftrace_dump_on_oops debugpat kvm.mmu_audit=1 slub_debug=FZPU rcutorture.rcutorture_runnable=0 loop.max_loop=64 zram.num_devices=4 rcutorture.nreaders=8 oops=panic nr_hugepages=1000 numa_balancing=enable"'.
>>>>>
>>>>> So basically 48 vcpus (the host has 128 physical ones), and ~28G of RAM.
>>>>>
>>>>> I've been running trinity as a fuzzer, which doesn't handle logging too well,
>>>>> so I can't reproduce its actions easily.
>>>>>
>>>>> There was an additional stress of hotplugging CPUs and memory during this recent
>>>>> fuzzing run, so it's fair to suspect that this happened as a result of that.
>>>>
>>>> Aha!
>>>>
>>>>> Anything else that might be helpful?
>>>>
>>>> No, not too surprising given the info that cpu hotplug was being
>>>> stressed at the same time. blk-mq doesn't quiesce when this happens, so
>>>> it's quite likely that there are races between updating the cpu masks
>>>> and flushing out the previously queued work.
>>>
>>> So this warning is something you'd expect when CPUs go up/down?
>>
>> Let me put it this way - I'm not surprised that it triggered, but it
>> will of course be fixed up.
> 
> Does reverting 1eaade629f5c47 change anything?
> 
> The ctx->online flag isn't changed immediately when a cpu goes offline, so
> there is likely something wrong there. I'm wondering why we need that patch?

We don't strictly need it. That commit isn't in what Sasha tested, however.



