* Lockdep splat involving all_q_mutex
@ 2017-05-10 22:34 Paul E. McKenney
From: Paul E. McKenney @ 2017-05-10 22:34 UTC (permalink / raw)
To: peterz, axboe; +Cc: linux-kernel, rostedt, tglx
Hello!
I got the lockdep splat shown below during some rcutorture testing (which
does CPU hotplug operations) on mainline at commit dc9edaab90de ("Merge
tag 'acpi-extra-4.12-rc1' of git://git.kernel.org/.../rafael/linux-pm").
My kneejerk reaction was just to reverse the "mutex_lock(&all_q_mutex);"
and "get_online_cpus();" in blk_mq_init_allocated_queue(), but then
I noticed that commit eabe06595d62 ("block/mq: Cure cpu hotplug lock
inversion") just got done moving these two statements in the other
direction.
Acquiring the update-side CPU-hotplug lock across sched_feat_write()
seems like it might be an alternative to eabe06595d62, but I figured
I should check first. Another approach would be to do the work in
blk_mq_queue_reinit_dead() asynchronously, for example, from a workqueue,
but I would have to know much more about blk_mq to know what effects
that would have.
Thoughts?
Thanx, Paul
[ 32.808758] ======================================================
[ 32.810110] [ INFO: possible circular locking dependency detected ]
[ 32.811468] 4.11.0+ #1 Not tainted
[ 32.812190] -------------------------------------------------------
[ 32.813626] torture_onoff/769 is trying to acquire lock:
[ 32.814769] (all_q_mutex){+.+...}, at: [<ffffffff93b884c3>] blk_mq_queue_reinit_work+0x13/0x110
[ 32.816655]
[ 32.816655] but task is already holding lock:
[ 32.817926] (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff9386014b>] cpu_hotplug_begin+0x6b/0xb0
[ 32.819754]
[ 32.819754] which lock already depends on the new lock.
[ 32.819754]
[ 32.821898]
[ 32.821898] the existing dependency chain (in reverse order) is:
[ 32.823518]
[ 32.823518] -> #1 (cpu_hotplug.lock){+.+.+.}:
[ 32.824788] lock_acquire+0xd1/0x1b0
[ 32.825728] __mutex_lock+0x54/0x8a0
[ 32.826629] mutex_lock_nested+0x16/0x20
[ 32.827585] get_online_cpus+0x47/0x60
[ 32.828519] blk_mq_init_allocated_queue+0x398/0x4d0
[ 32.829754] blk_mq_init_queue+0x35/0x60
[ 32.830766] loop_add+0xe0/0x270
[ 32.831595] loop_init+0x10d/0x14b
[ 32.832454] do_one_initcall+0xef/0x160
[ 32.833449] kernel_init_freeable+0x1b6/0x23e
[ 32.834517] kernel_init+0x9/0x100
[ 32.835363] ret_from_fork+0x2e/0x40
[ 32.836254]
[ 32.836254] -> #0 (all_q_mutex){+.+...}:
[ 32.837635] __lock_acquire+0x10bf/0x1350
[ 32.838714] lock_acquire+0xd1/0x1b0
[ 32.839702] __mutex_lock+0x54/0x8a0
[ 32.840667] mutex_lock_nested+0x16/0x20
[ 32.841767] blk_mq_queue_reinit_work+0x13/0x110
[ 32.842987] blk_mq_queue_reinit_dead+0x17/0x20
[ 32.844164] cpuhp_invoke_callback+0x1d1/0x770
[ 32.845379] cpuhp_down_callbacks+0x3d/0x80
[ 32.846484] _cpu_down+0xad/0xe0
[ 32.847388] do_cpu_down+0x39/0x50
[ 32.848316] cpu_down+0xb/0x10
[ 32.849236] torture_offline+0x75/0x140
[ 32.850258] torture_onoff+0x102/0x1e0
[ 32.851278] kthread+0x104/0x140
[ 32.852158] ret_from_fork+0x2e/0x40
[ 32.853167]
[ 32.853167] other info that might help us debug this:
[ 32.853167]
[ 32.855052] Possible unsafe locking scenario:
[ 32.855052]
[ 32.856442]        CPU0                    CPU1
[ 32.857366]        ----                    ----
[ 32.858429]   lock(cpu_hotplug.lock);
[ 32.859289]                                lock(all_q_mutex);
[ 32.860649]                                lock(cpu_hotplug.lock);
[ 32.862148]   lock(all_q_mutex);
[ 32.862910]
[ 32.862910] *** DEADLOCK ***
[ 32.862910]
[ 32.864289] 3 locks held by torture_onoff/769:
[ 32.865386] #0: (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff938601f2>] do_cpu_down+0x22/0x50
[ 32.867429] #1: (cpu_hotplug.dep_map){++++++}, at: [<ffffffff938600e0>] cpu_hotplug_begin+0x0/0xb0
[ 32.869612] #2: (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff9386014b>] cpu_hotplug_begin+0x6b/0xb0
[ 32.871700]
[ 32.871700] stack backtrace:
[ 32.872727] CPU: 1 PID: 769 Comm: torture_onoff Not tainted 4.11.0+ #1
[ 32.874299] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 32.876447] Call Trace:
[ 32.877053] dump_stack+0x67/0x97
[ 32.877828] print_circular_bug+0x1e3/0x250
[ 32.878789] __lock_acquire+0x10bf/0x1350
[ 32.879701] ? retint_kernel+0x10/0x10
[ 32.880567] lock_acquire+0xd1/0x1b0
[ 32.881453] ? lock_acquire+0xd1/0x1b0
[ 32.882305] ? blk_mq_queue_reinit_work+0x13/0x110
[ 32.883400] __mutex_lock+0x54/0x8a0
[ 32.884215] ? blk_mq_queue_reinit_work+0x13/0x110
[ 32.885390] ? kernfs_put+0x103/0x1a0
[ 32.886227] ? kernfs_put+0x103/0x1a0
[ 32.887063] ? blk_mq_queue_reinit_work+0x13/0x110
[ 32.888158] ? rcu_read_lock_sched_held+0x58/0x60
[ 32.889287] ? kmem_cache_free+0x1f7/0x260
[ 32.890224] ? anon_transport_class_unregister+0x20/0x20
[ 32.891443] ? kernfs_put+0x103/0x1a0
[ 32.892274] ? blk_mq_queue_reinit_work+0x110/0x110
[ 32.893436] mutex_lock_nested+0x16/0x20
[ 32.894328] ? mutex_lock_nested+0x16/0x20
[ 32.895260] blk_mq_queue_reinit_work+0x13/0x110
[ 32.896307] blk_mq_queue_reinit_dead+0x17/0x20
[ 32.897425] cpuhp_invoke_callback+0x1d1/0x770
[ 32.898443] ? __flow_cache_shrink+0x130/0x130
[ 32.899453] cpuhp_down_callbacks+0x3d/0x80
[ 32.900402] _cpu_down+0xad/0xe0
[ 32.901213] do_cpu_down+0x39/0x50
[ 32.902002] cpu_down+0xb/0x10
[ 32.902716] torture_offline+0x75/0x140
[ 32.903603] torture_onoff+0x102/0x1e0
[ 32.904459] kthread+0x104/0x140
[ 32.905243] ? torture_kthread_stopping+0x70/0x70
[ 32.906316] ? kthread_create_on_node+0x40/0x40
[ 32.907351] ret_from_fork+0x2e/0x40
* Re: Lockdep splat involving all_q_mutex
From: Jens Axboe @ 2017-05-11 2:55 UTC (permalink / raw)
To: paulmck, peterz; +Cc: linux-kernel, rostedt, tglx
On 05/10/2017 04:34 PM, Paul E. McKenney wrote:
> Hello!
>
> I got the lockdep splat shown below during some rcutorture testing (which
> does CPU hotplug operations) on mainline at commit dc9edaab90de ("Merge
> tag 'acpi-extra-4.12-rc1' of git://git.kernel.org/.../rafael/linux-pm").
> My kneejerk reaction was just to reverse the "mutex_lock(&all_q_mutex);"
> and "get_online_cpus();" in blk_mq_init_allocated_queue(), but then
> I noticed that commit eabe06595d62 ("block/mq: Cure cpu hotplug lock
> inversion") just got done moving these two statements in the other
> direction.
The problem is that that patch got merged too early, as it only
fixes a lockdep splat with the cpu hotplug rework. Fix is coming Linus'
way, it's in my for-linus tree.
--
Jens Axboe
* Re: Lockdep splat involving all_q_mutex
From: Paul E. McKenney @ 2017-05-11 3:13 UTC (permalink / raw)
To: Jens Axboe; +Cc: peterz, linux-kernel, rostedt, tglx
On Wed, May 10, 2017 at 08:55:54PM -0600, Jens Axboe wrote:
> On 05/10/2017 04:34 PM, Paul E. McKenney wrote:
> > Hello!
> >
> > I got the lockdep splat shown below during some rcutorture testing (which
> > does CPU hotplug operations) on mainline at commit dc9edaab90de ("Merge
> > tag 'acpi-extra-4.12-rc1' of git://git.kernel.org/.../rafael/linux-pm").
> > My kneejerk reaction was just to reverse the "mutex_lock(&all_q_mutex);"
> > and "get_online_cpus();" in blk_mq_init_allocated_queue(), but then
> > I noticed that commit eabe06595d62 ("block/mq: Cure cpu hotplug lock
> > inversion") just got done moving these two statements in the other
> > direction.
>
> The problem is that that patch got merged too early, as it only
> fixes a lockdep splat with the cpu hotplug rework. Fix is coming Linus'
> way, it's in my for-linus tree.
Thank you for the update, looking forward to the fix.
Thanx, Paul
* Re: Lockdep splat involving all_q_mutex
From: Jens Axboe @ 2017-05-11 20:12 UTC (permalink / raw)
To: paulmck; +Cc: peterz, linux-kernel, rostedt, tglx
On 05/10/2017 09:13 PM, Paul E. McKenney wrote:
> On Wed, May 10, 2017 at 08:55:54PM -0600, Jens Axboe wrote:
>> On 05/10/2017 04:34 PM, Paul E. McKenney wrote:
>>> Hello!
>>>
>>> I got the lockdep splat shown below during some rcutorture testing (which
>>> does CPU hotplug operations) on mainline at commit dc9edaab90de ("Merge
>>> tag 'acpi-extra-4.12-rc1' of git://git.kernel.org/.../rafael/linux-pm").
>>> My kneejerk reaction was just to reverse the "mutex_lock(&all_q_mutex);"
>>> and "get_online_cpus();" in blk_mq_init_allocated_queue(), but then
>>> I noticed that commit eabe06595d62 ("block/mq: Cure cpu hotplug lock
>>> inversion") just got done moving these two statements in the other
>>> direction.
>>
>> The problem is that that patch got merged too early, as it only
>> fixes a lockdep splat with the cpu hotplug rework. Fix is coming Linus'
>> way, it's in my for-linus tree.
>
> Thank you for the update, looking forward to the fix.
It's upstream now.
--
Jens Axboe
* Re: Lockdep splat involving all_q_mutex
From: Paul E. McKenney @ 2017-05-11 20:23 UTC (permalink / raw)
To: Jens Axboe; +Cc: peterz, linux-kernel, rostedt, tglx
On Thu, May 11, 2017 at 02:12:39PM -0600, Jens Axboe wrote:
> On 05/10/2017 09:13 PM, Paul E. McKenney wrote:
> > On Wed, May 10, 2017 at 08:55:54PM -0600, Jens Axboe wrote:
> >> On 05/10/2017 04:34 PM, Paul E. McKenney wrote:
> >>> Hello!
> >>>
> >>> I got the lockdep splat shown below during some rcutorture testing (which
> >>> does CPU hotplug operations) on mainline at commit dc9edaab90de ("Merge
> >>> tag 'acpi-extra-4.12-rc1' of git://git.kernel.org/.../rafael/linux-pm").
> >>> My kneejerk reaction was just to reverse the "mutex_lock(&all_q_mutex);"
> >>> and "get_online_cpus();" in blk_mq_init_allocated_queue(), but then
> >>> I noticed that commit eabe06595d62 ("block/mq: Cure cpu hotplug lock
> >>> inversion") just got done moving these two statements in the other
> >>> direction.
> >>
> >> The problem is that that patch got merged too early, as it only
> >> fixes a lockdep splat with the cpu hotplug rework. Fix is coming Linus'
> >> way, it's in my for-linus tree.
> >
> > Thank you for the update, looking forward to the fix.
>
> It's upstream now.
Thank you, Jens! I will test it this evening, Pacific Time.
Thanx, Paul
* Re: Lockdep splat involving all_q_mutex
From: Paul E. McKenney @ 2017-05-12 5:02 UTC (permalink / raw)
To: Jens Axboe; +Cc: peterz, linux-kernel, rostedt, tglx
On Thu, May 11, 2017 at 01:23:44PM -0700, Paul E. McKenney wrote:
> On Thu, May 11, 2017 at 02:12:39PM -0600, Jens Axboe wrote:
> > On 05/10/2017 09:13 PM, Paul E. McKenney wrote:
> > > On Wed, May 10, 2017 at 08:55:54PM -0600, Jens Axboe wrote:
> > >> On 05/10/2017 04:34 PM, Paul E. McKenney wrote:
> > >>> Hello!
> > >>>
> > >>> I got the lockdep splat shown below during some rcutorture testing (which
> > >>> does CPU hotplug operations) on mainline at commit dc9edaab90de ("Merge
> > >>> tag 'acpi-extra-4.12-rc1' of git://git.kernel.org/.../rafael/linux-pm").
> > >>> My kneejerk reaction was just to reverse the "mutex_lock(&all_q_mutex);"
> > >>> and "get_online_cpus();" in blk_mq_init_allocated_queue(), but then
> > >>> I noticed that commit eabe06595d62 ("block/mq: Cure cpu hotplug lock
> > >>> inversion") just got done moving these two statements in the other
> > >>> direction.
> > >>
> > >> The problem is that that patch got merged too early, as it only
> > >> fixes a lockdep splat with the cpu hotplug rework. Fix is coming Linus'
> > >> way, it's in my for-linus tree.
> > >
> > > Thank you for the update, looking forward to the fix.
> >
> > It's upstream now.
>
> Thank you, Jens! I will test it this evening, Pacific Time.
And no more lockdep splats, thank you!
Thanx, Paul