* possible circular locking dependency
@ 2012-05-03 20:02 Sergey Senozhatsky
  2012-05-06  8:55 ` Avi Kivity
  0 siblings, 1 reply; 9+ messages in thread
From: Sergey Senozhatsky @ 2012-05-03 20:02 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, Paul E. McKenney, kvm, linux-kernel

Hello,
3.4-rc5

[32881.212463] kvm: VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL does not work properly. Using workaround
[32882.360505] 
[32882.360509] ======================================================
[32882.360511] [ INFO: possible circular locking dependency detected ]
[32882.360515] 3.4.0-rc5-dbg-00932-gfabccd4-dirty #1107 Not tainted
[32882.360517] -------------------------------------------------------
[32882.360519] qemu-system-x86/15168 is trying to acquire lock:
[32882.360521]  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81033a60>] get_online_cpus+0x41/0x55
[32882.360532] 
[32882.360532] but task is already holding lock:
[32882.360534]  (&sp->mutex){+.+...}, at: [<ffffffff81057f91>] __synchronize_srcu+0x9a/0x104
[32882.360542] 
[32882.360542] which lock already depends on the new lock.
[32882.360543] 
[32882.360545] 
[32882.360545] the existing dependency chain (in reverse order) is:
[32882.360547] 
[32882.360547] -> #3 (&sp->mutex){+.+...}:
[32882.360552]        [<ffffffff81085b36>] lock_acquire+0x148/0x1c3
[32882.360557]        [<ffffffff814aa040>] mutex_lock_nested+0x6e/0x2d1
[32882.360562]        [<ffffffff81057f91>] __synchronize_srcu+0x9a/0x104
[32882.360566]        [<ffffffff81058027>] synchronize_srcu+0x15/0x17
[32882.360569]        [<ffffffff810586e8>] srcu_notifier_chain_unregister+0x5b/0x69
[32882.360573]        [<ffffffff813e3110>] cpufreq_unregister_notifier+0x22/0x3c
[32882.360580]        [<ffffffff813e3e42>] cpufreq_governor_dbs+0x322/0x3ac
[32882.360584]        [<ffffffff813e2075>] __cpufreq_governor+0x6b/0xa8
[32882.360587]        [<ffffffff813e21a5>] __cpufreq_set_policy+0xf3/0x145
[32882.360591]        [<ffffffff813e2def>] store_scaling_governor+0x173/0x1a9
[32882.360594]        [<ffffffff813e1f71>] store+0x5a/0x86
[32882.360597]        [<ffffffff81181e83>] sysfs_write_file+0xee/0x126
[32882.360603]        [<ffffffff8111f6b1>] vfs_write+0xa3/0x14c
[32882.360607]        [<ffffffff8111f959>] sys_write+0x43/0x73
[32882.360610]        [<ffffffff814b432d>] system_call_fastpath+0x1a/0x1f
[32882.360614] 
[32882.360615] -> #2 (dbs_mutex){+.+.+.}:
[32882.360619]        [<ffffffff81085b36>] lock_acquire+0x148/0x1c3
[32882.360622]        [<ffffffff814aa040>] mutex_lock_nested+0x6e/0x2d1
[32882.360625]        [<ffffffff813e3b9c>] cpufreq_governor_dbs+0x7c/0x3ac
[32882.360629]        [<ffffffff813e2075>] __cpufreq_governor+0x6b/0xa8
[32882.360632]        [<ffffffff813e21bb>] __cpufreq_set_policy+0x109/0x145
[32882.360636]        [<ffffffff813e244e>] cpufreq_add_dev_interface+0x257/0x288
[32882.360639]        [<ffffffff813e2889>] cpufreq_add_dev+0x40a/0x42a
[32882.360643]        [<ffffffff81398694>] subsys_interface_register+0x9b/0xdc
[32882.360648]        [<ffffffff813e1935>] cpufreq_register_driver+0xa0/0x14b
[32882.360652]        [<ffffffffa00a1086>] store_up_threshold+0x3a/0x50 [cpufreq_ondemand]
[32882.360657]        [<ffffffff8100020f>] do_one_initcall+0x7f/0x140
[32882.360663]        [<ffffffff8109158c>] sys_init_module+0x1818/0x1aec
[32882.360667]        [<ffffffff814b432d>] system_call_fastpath+0x1a/0x1f
[32882.360671] 
[32882.360671] -> #1 (&per_cpu(cpu_policy_rwsem, cpu)){+++++.}:
[32882.360675]        [<ffffffff81085b36>] lock_acquire+0x148/0x1c3
[32882.360679]        [<ffffffff814aa661>] down_write+0x49/0x6c
[32882.360682]        [<ffffffff813e1eaa>] lock_policy_rwsem_write+0x47/0x78
[32882.360685]        [<ffffffff814a18c9>] cpufreq_cpu_callback+0x57/0x81
[32882.360692]        [<ffffffff814b0643>] notifier_call_chain+0xac/0xd9
[32882.360697]        [<ffffffff810582b6>] __raw_notifier_call_chain+0xe/0x10
[32882.360701]        [<ffffffff81033950>] __cpu_notify+0x20/0x37
[32882.360705]        [<ffffffff8149091c>] _cpu_down+0x7b/0x25d
[32882.360709]        [<ffffffff81033b2f>] disable_nonboot_cpus+0x5f/0x10b
[32882.360712]        [<ffffffff81072e61>] suspend_devices_and_enter+0x197/0x401
[32882.360719]        [<ffffffff810731cf>] pm_suspend+0x104/0x1bd
[32882.360722]        [<ffffffff8107213a>] state_store+0xa0/0xc9
[32882.360726]        [<ffffffff8127115a>] kobj_attr_store+0xf/0x1b
[32882.360730]        [<ffffffff81181e83>] sysfs_write_file+0xee/0x126
[32882.360733]        [<ffffffff8111f6b1>] vfs_write+0xa3/0x14c
[32882.360736]        [<ffffffff8111f959>] sys_write+0x43/0x73
[32882.360739]        [<ffffffff814b432d>] system_call_fastpath+0x1a/0x1f
[32882.360743] 
[32882.360743] -> #0 (cpu_hotplug.lock){+.+.+.}:
[32882.360747]        [<ffffffff81084e93>] __lock_acquire+0xf6b/0x1612
[32882.360751]        [<ffffffff81085b36>] lock_acquire+0x148/0x1c3
[32882.360754]        [<ffffffff814aa040>] mutex_lock_nested+0x6e/0x2d1
[32882.360757]        [<ffffffff81033a60>] get_online_cpus+0x41/0x55
[32882.360760]        [<ffffffff810ab15b>] synchronize_sched_expedited+0x26/0xfa
[32882.360766]        [<ffffffff81057f9f>] __synchronize_srcu+0xa8/0x104
[32882.360769]        [<ffffffff81058010>] synchronize_srcu_expedited+0x15/0x17
[32882.360773]        [<ffffffffa01df109>] __kvm_set_memory_region+0x3d8/0x46a [kvm]
[32882.360789]        [<ffffffffa01df1d2>] kvm_set_memory_region+0x37/0x50 [kvm]
[32882.360798]        [<ffffffffa0258a89>] vmx_set_tss_addr+0x4c/0x200 [kvm_intel]
[32882.360803]        [<ffffffffa01ef732>] kvm_arch_vm_ioctl+0x160/0x9df [kvm]
[32882.360816]        [<ffffffffa01df571>] kvm_vm_ioctl+0x36a/0x39c [kvm]
[32882.360825]        [<ffffffff8112fc68>] vfs_ioctl+0x24/0x2f
[32882.360829]        [<ffffffff81130566>] do_vfs_ioctl+0x412/0x455
[32882.360832]        [<ffffffff811305ff>] sys_ioctl+0x56/0x7b
[32882.360835]        [<ffffffff814b432d>] system_call_fastpath+0x1a/0x1f
[32882.360839] 
[32882.360839] other info that might help us debug this:
[32882.360840] 
[32882.360842] Chain exists of:
[32882.360842]   cpu_hotplug.lock --> dbs_mutex --> &sp->mutex
[32882.360847] 
[32882.360848]  Possible unsafe locking scenario:
[32882.360849] 
[32882.360851]        CPU0                    CPU1
[32882.360852]        ----                    ----
[32882.360854]   lock(&sp->mutex);
[32882.360856]                                lock(dbs_mutex);
[32882.360859]                                lock(&sp->mutex);
[32882.360862]   lock(cpu_hotplug.lock);
[32882.360865] 
[32882.360865]  *** DEADLOCK ***
[32882.360866] 
[32882.360868] 2 locks held by qemu-system-x86/15168:
[32882.360870]  #0:  (&kvm->slots_lock){+.+.+.}, at: [<ffffffffa01df1c4>] kvm_set_memory_region+0x29/0x50 [kvm]
[32882.360882]  #1:  (&sp->mutex){+.+...}, at: [<ffffffff81057f91>] __synchronize_srcu+0x9a/0x104
[32882.360888] 
[32882.360889] stack backtrace:
[32882.360892] Pid: 15168, comm: qemu-system-x86 Not tainted 3.4.0-rc5-dbg-00932-gfabccd4-dirty #1107
[32882.360894] Call Trace:
[32882.360898]  [<ffffffff814a3d1d>] print_circular_bug+0x29f/0x2b0
[32882.360901]  [<ffffffff81084e93>] __lock_acquire+0xf6b/0x1612
[32882.360905]  [<ffffffff81085b36>] lock_acquire+0x148/0x1c3
[32882.360908]  [<ffffffff81033a60>] ? get_online_cpus+0x41/0x55
[32882.360911]  [<ffffffff814aa040>] mutex_lock_nested+0x6e/0x2d1
[32882.360914]  [<ffffffff81033a60>] ? get_online_cpus+0x41/0x55
[32882.360917]  [<ffffffff810ab135>] ? synchronize_sched+0xa8/0xa8
[32882.360921]  [<ffffffff81033a60>] get_online_cpus+0x41/0x55
[32882.360923]  [<ffffffff810ab15b>] synchronize_sched_expedited+0x26/0xfa
[32882.360927]  [<ffffffff810ab135>] ? synchronize_sched+0xa8/0xa8
[32882.360930]  [<ffffffff81057f9f>] __synchronize_srcu+0xa8/0x104
[32882.360933]  [<ffffffff81058010>] synchronize_srcu_expedited+0x15/0x17
[32882.360942]  [<ffffffffa01df109>] __kvm_set_memory_region+0x3d8/0x46a [kvm]
[32882.360945]  [<ffffffff81086268>] ? mark_held_locks+0xbe/0xea
[32882.360954]  [<ffffffffa01df1d2>] kvm_set_memory_region+0x37/0x50 [kvm]
[32882.360959]  [<ffffffffa0258a89>] vmx_set_tss_addr+0x4c/0x200 [kvm_intel]
[32882.360971]  [<ffffffffa01ef732>] kvm_arch_vm_ioctl+0x160/0x9df [kvm]
[32882.360980]  [<ffffffffa01df571>] kvm_vm_ioctl+0x36a/0x39c [kvm]
[32882.360984]  [<ffffffff810855c5>] ? lock_release_non_nested+0x8b/0x241
[32882.360987]  [<ffffffff8112fc68>] vfs_ioctl+0x24/0x2f
[32882.360990]  [<ffffffff81130566>] do_vfs_ioctl+0x412/0x455
[32882.360993]  [<ffffffff81120f8f>] ? fget_light+0x120/0x39b
[32882.360996]  [<ffffffff811305ff>] sys_ioctl+0x56/0x7b
[32882.360999]  [<ffffffff814b432d>] system_call_fastpath+0x1a/0x1f



	-ss


* Re: possible circular locking dependency
  2012-05-03 20:02 possible circular locking dependency Sergey Senozhatsky
@ 2012-05-06  8:55 ` Avi Kivity
  2012-05-06 16:42   ` Paul E. McKenney
  0 siblings, 1 reply; 9+ messages in thread
From: Avi Kivity @ 2012-05-06  8:55 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Marcelo Tosatti, Paul E. McKenney, kvm, linux-kernel, Dave Jones

On 05/03/2012 11:02 PM, Sergey Senozhatsky wrote:
> Hello,
> 3.4-rc5

Whoa.

Looks like inconsistent locking between cpufreq and
synchronize_srcu_expedited().  kvm triggered this because it is one of
the few users of synchronize_srcu_expedited(), but I don't think it is
doing anything wrong directly.

Dave, Paul?
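
The cycle is easier to see with each dependency link pulled out of the
backtraces.  Below is a rough standalone sketch (userspace pthreads, not
kernel code; the lock and function names only mirror the report, and the
per-cpu policy rwsem is modelled as a plain mutex) in which each thread
reproduces one link of the lockdep chain:

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t cpu_hotplug_lock = PTHREAD_MUTEX_INITIALIZER; /* cpu_hotplug.lock */
static pthread_mutex_t cpu_policy_rwsem = PTHREAD_MUTEX_INITIALIZER; /* per-cpu policy rwsem */
static pthread_mutex_t dbs_mutex        = PTHREAD_MUTEX_INITIALIZER; /* ondemand dbs_mutex */
static pthread_mutex_t sp_mutex         = PTHREAD_MUTEX_INITIALIZER; /* srcu_struct ->mutex */

/* Take b while already holding a: one edge in the dependency graph. */
static void take_pair(pthread_mutex_t *a, pthread_mutex_t *b, const char *what)
{
	pthread_mutex_lock(a);
	pthread_mutex_lock(b);
	printf("%s\n", what);
	pthread_mutex_unlock(b);
	pthread_mutex_unlock(a);
}

/* #1: suspend -> _cpu_down() -> cpufreq_cpu_callback() */
static void *hotplug_path(void *arg)
{
	take_pair(&cpu_hotplug_lock, &cpu_policy_rwsem,
		  "cpu_hotplug.lock -> cpu_policy_rwsem");
	return NULL;
}

/* #2: cpufreq_add_dev() -> cpufreq_governor_dbs() */
static void *add_dev_path(void *arg)
{
	take_pair(&cpu_policy_rwsem, &dbs_mutex, "cpu_policy_rwsem -> dbs_mutex");
	return NULL;
}

/* #3: governor stop -> srcu_notifier_chain_unregister() -> synchronize_srcu() */
static void *governor_path(void *arg)
{
	take_pair(&dbs_mutex, &sp_mutex, "dbs_mutex -> sp->mutex");
	return NULL;
}

/* #0: kvm_set_memory_region() -> synchronize_srcu_expedited() ->
 *     synchronize_sched_expedited() -> get_online_cpus()          */
static void *kvm_path(void *arg)
{
	take_pair(&sp_mutex, &cpu_hotplug_lock, "sp->mutex -> cpu_hotplug.lock");
	return NULL;
}

int main(void)
{
	void *(*fn[4])(void *) = { hotplug_path, add_dev_path, governor_path, kvm_path };
	pthread_t t[4];
	int i;

	for (i = 0; i < 4; i++)
		pthread_create(&t[i], NULL, fn[i], NULL);
	for (i = 0; i < 4; i++)
		pthread_join(t[i], NULL);
	return 0;	/* with the wrong interleaving, all four threads deadlock */
}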

> [32881.212463] kvm: VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL does not work properly. Using workaround
> [32882.360505] 
> [32882.360509] ======================================================
> [32882.360511] [ INFO: possible circular locking dependency detected ]
> [32882.360515] 3.4.0-rc5-dbg-00932-gfabccd4-dirty #1107 Not tainted
> [32882.360517] -------------------------------------------------------
> [32882.360519] qemu-system-x86/15168 is trying to acquire lock:
> [32882.360521]  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81033a60>] get_online_cpus+0x41/0x55
> [32882.360532] 
> [32882.360532] but task is already holding lock:
> [32882.360534]  (&sp->mutex){+.+...}, at: [<ffffffff81057f91>] __synchronize_srcu+0x9a/0x104
> [32882.360542] 
> [32882.360542] which lock already depends on the new lock.
> [32882.360543] 
> [32882.360545] 
> [32882.360545] the existing dependency chain (in reverse order) is:
> [32882.360547] 
> [32882.360547] -> #3 (&sp->mutex){+.+...}:
> [32882.360552]        [<ffffffff81085b36>] lock_acquire+0x148/0x1c3
> [32882.360557]        [<ffffffff814aa040>] mutex_lock_nested+0x6e/0x2d1
> [32882.360562]        [<ffffffff81057f91>] __synchronize_srcu+0x9a/0x104
> [32882.360566]        [<ffffffff81058027>] synchronize_srcu+0x15/0x17
> [32882.360569]        [<ffffffff810586e8>] srcu_notifier_chain_unregister+0x5b/0x69
> [32882.360573]        [<ffffffff813e3110>] cpufreq_unregister_notifier+0x22/0x3c
> [32882.360580]        [<ffffffff813e3e42>] cpufreq_governor_dbs+0x322/0x3ac
> [32882.360584]        [<ffffffff813e2075>] __cpufreq_governor+0x6b/0xa8
> [32882.360587]        [<ffffffff813e21a5>] __cpufreq_set_policy+0xf3/0x145
> [32882.360591]        [<ffffffff813e2def>] store_scaling_governor+0x173/0x1a9
> [32882.360594]        [<ffffffff813e1f71>] store+0x5a/0x86
> [32882.360597]        [<ffffffff81181e83>] sysfs_write_file+0xee/0x126
> [32882.360603]        [<ffffffff8111f6b1>] vfs_write+0xa3/0x14c
> [32882.360607]        [<ffffffff8111f959>] sys_write+0x43/0x73
> [32882.360610]        [<ffffffff814b432d>] system_call_fastpath+0x1a/0x1f
> [32882.360614] 
> [32882.360615] -> #2 (dbs_mutex){+.+.+.}:
> [32882.360619]        [<ffffffff81085b36>] lock_acquire+0x148/0x1c3
> [32882.360622]        [<ffffffff814aa040>] mutex_lock_nested+0x6e/0x2d1
> [32882.360625]        [<ffffffff813e3b9c>] cpufreq_governor_dbs+0x7c/0x3ac
> [32882.360629]        [<ffffffff813e2075>] __cpufreq_governor+0x6b/0xa8
> [32882.360632]        [<ffffffff813e21bb>] __cpufreq_set_policy+0x109/0x145
> [32882.360636]        [<ffffffff813e244e>] cpufreq_add_dev_interface+0x257/0x288
> [32882.360639]        [<ffffffff813e2889>] cpufreq_add_dev+0x40a/0x42a
> [32882.360643]        [<ffffffff81398694>] subsys_interface_register+0x9b/0xdc
> [32882.360648]        [<ffffffff813e1935>] cpufreq_register_driver+0xa0/0x14b
> [32882.360652]        [<ffffffffa00a1086>] store_up_threshold+0x3a/0x50 [cpufreq_ondemand]
> [32882.360657]        [<ffffffff8100020f>] do_one_initcall+0x7f/0x140
> [32882.360663]        [<ffffffff8109158c>] sys_init_module+0x1818/0x1aec
> [32882.360667]        [<ffffffff814b432d>] system_call_fastpath+0x1a/0x1f
> [32882.360671] 
> [32882.360671] -> #1 (&per_cpu(cpu_policy_rwsem, cpu)){+++++.}:
> [32882.360675]        [<ffffffff81085b36>] lock_acquire+0x148/0x1c3
> [32882.360679]        [<ffffffff814aa661>] down_write+0x49/0x6c
> [32882.360682]        [<ffffffff813e1eaa>] lock_policy_rwsem_write+0x47/0x78
> [32882.360685]        [<ffffffff814a18c9>] cpufreq_cpu_callback+0x57/0x81
> [32882.360692]        [<ffffffff814b0643>] notifier_call_chain+0xac/0xd9
> [32882.360697]        [<ffffffff810582b6>] __raw_notifier_call_chain+0xe/0x10
> [32882.360701]        [<ffffffff81033950>] __cpu_notify+0x20/0x37
> [32882.360705]        [<ffffffff8149091c>] _cpu_down+0x7b/0x25d
> [32882.360709]        [<ffffffff81033b2f>] disable_nonboot_cpus+0x5f/0x10b
> [32882.360712]        [<ffffffff81072e61>] suspend_devices_and_enter+0x197/0x401
> [32882.360719]        [<ffffffff810731cf>] pm_suspend+0x104/0x1bd
> [32882.360722]        [<ffffffff8107213a>] state_store+0xa0/0xc9
> [32882.360726]        [<ffffffff8127115a>] kobj_attr_store+0xf/0x1b
> [32882.360730]        [<ffffffff81181e83>] sysfs_write_file+0xee/0x126
> [32882.360733]        [<ffffffff8111f6b1>] vfs_write+0xa3/0x14c
> [32882.360736]        [<ffffffff8111f959>] sys_write+0x43/0x73
> [32882.360739]        [<ffffffff814b432d>] system_call_fastpath+0x1a/0x1f
> [32882.360743] 
> [32882.360743] -> #0 (cpu_hotplug.lock){+.+.+.}:
> [32882.360747]        [<ffffffff81084e93>] __lock_acquire+0xf6b/0x1612
> [32882.360751]        [<ffffffff81085b36>] lock_acquire+0x148/0x1c3
> [32882.360754]        [<ffffffff814aa040>] mutex_lock_nested+0x6e/0x2d1
> [32882.360757]        [<ffffffff81033a60>] get_online_cpus+0x41/0x55
> [32882.360760]        [<ffffffff810ab15b>] synchronize_sched_expedited+0x26/0xfa
> [32882.360766]        [<ffffffff81057f9f>] __synchronize_srcu+0xa8/0x104
> [32882.360769]        [<ffffffff81058010>] synchronize_srcu_expedited+0x15/0x17
> [32882.360773]        [<ffffffffa01df109>] __kvm_set_memory_region+0x3d8/0x46a [kvm]
> [32882.360789]        [<ffffffffa01df1d2>] kvm_set_memory_region+0x37/0x50 [kvm]
> [32882.360798]        [<ffffffffa0258a89>] vmx_set_tss_addr+0x4c/0x200 [kvm_intel]
> [32882.360803]        [<ffffffffa01ef732>] kvm_arch_vm_ioctl+0x160/0x9df [kvm]
> [32882.360816]        [<ffffffffa01df571>] kvm_vm_ioctl+0x36a/0x39c [kvm]
> [32882.360825]        [<ffffffff8112fc68>] vfs_ioctl+0x24/0x2f
> [32882.360829]        [<ffffffff81130566>] do_vfs_ioctl+0x412/0x455
> [32882.360832]        [<ffffffff811305ff>] sys_ioctl+0x56/0x7b
> [32882.360835]        [<ffffffff814b432d>] system_call_fastpath+0x1a/0x1f
> [32882.360839] 
> [32882.360839] other info that might help us debug this:
> [32882.360840] 
> [32882.360842] Chain exists of:
> [32882.360842]   cpu_hotplug.lock --> dbs_mutex --> &sp->mutex
> [32882.360847] 
> [32882.360848]  Possible unsafe locking scenario:
> [32882.360849] 
> [32882.360851]        CPU0                    CPU1
> [32882.360852]        ----                    ----
> [32882.360854]   lock(&sp->mutex);
> [32882.360856]                                lock(dbs_mutex);
> [32882.360859]                                lock(&sp->mutex);
> [32882.360862]   lock(cpu_hotplug.lock);
> [32882.360865] 
> [32882.360865]  *** DEADLOCK ***
> [32882.360866] 
> [32882.360868] 2 locks held by qemu-system-x86/15168:
> [32882.360870]  #0:  (&kvm->slots_lock){+.+.+.}, at: [<ffffffffa01df1c4>] kvm_set_memory_region+0x29/0x50 [kvm]
> [32882.360882]  #1:  (&sp->mutex){+.+...}, at: [<ffffffff81057f91>] __synchronize_srcu+0x9a/0x104
> [32882.360888] 
> [32882.360889] stack backtrace:
> [32882.360892] Pid: 15168, comm: qemu-system-x86 Not tainted 3.4.0-rc5-dbg-00932-gfabccd4-dirty #1107
> [32882.360894] Call Trace:
> [32882.360898]  [<ffffffff814a3d1d>] print_circular_bug+0x29f/0x2b0
> [32882.360901]  [<ffffffff81084e93>] __lock_acquire+0xf6b/0x1612
> [32882.360905]  [<ffffffff81085b36>] lock_acquire+0x148/0x1c3
> [32882.360908]  [<ffffffff81033a60>] ? get_online_cpus+0x41/0x55
> [32882.360911]  [<ffffffff814aa040>] mutex_lock_nested+0x6e/0x2d1
> [32882.360914]  [<ffffffff81033a60>] ? get_online_cpus+0x41/0x55
> [32882.360917]  [<ffffffff810ab135>] ? synchronize_sched+0xa8/0xa8
> [32882.360921]  [<ffffffff81033a60>] get_online_cpus+0x41/0x55
> [32882.360923]  [<ffffffff810ab15b>] synchronize_sched_expedited+0x26/0xfa
> [32882.360927]  [<ffffffff810ab135>] ? synchronize_sched+0xa8/0xa8
> [32882.360930]  [<ffffffff81057f9f>] __synchronize_srcu+0xa8/0x104
> [32882.360933]  [<ffffffff81058010>] synchronize_srcu_expedited+0x15/0x17
> [32882.360942]  [<ffffffffa01df109>] __kvm_set_memory_region+0x3d8/0x46a [kvm]
> [32882.360945]  [<ffffffff81086268>] ? mark_held_locks+0xbe/0xea
> [32882.360954]  [<ffffffffa01df1d2>] kvm_set_memory_region+0x37/0x50 [kvm]
> [32882.360959]  [<ffffffffa0258a89>] vmx_set_tss_addr+0x4c/0x200 [kvm_intel]
> [32882.360971]  [<ffffffffa01ef732>] kvm_arch_vm_ioctl+0x160/0x9df [kvm]
> [32882.360980]  [<ffffffffa01df571>] kvm_vm_ioctl+0x36a/0x39c [kvm]
> [32882.360984]  [<ffffffff810855c5>] ? lock_release_non_nested+0x8b/0x241
> [32882.360987]  [<ffffffff8112fc68>] vfs_ioctl+0x24/0x2f
> [32882.360990]  [<ffffffff81130566>] do_vfs_ioctl+0x412/0x455
> [32882.360993]  [<ffffffff81120f8f>] ? fget_light+0x120/0x39b
> [32882.360996]  [<ffffffff811305ff>] sys_ioctl+0x56/0x7b
> [32882.360999]  [<ffffffff814b432d>] system_call_fastpath+0x1a/0x1f
>
>
>
> 	-ss


-- 
error compiling committee.c: too many arguments to function



* Re: possible circular locking dependency
  2012-05-06  8:55 ` Avi Kivity
@ 2012-05-06 16:42   ` Paul E. McKenney
  2012-05-06 20:34     ` Sergey Senozhatsky
  0 siblings, 1 reply; 9+ messages in thread
From: Paul E. McKenney @ 2012-05-06 16:42 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Sergey Senozhatsky, Marcelo Tosatti, kvm, linux-kernel, Dave Jones

On Sun, May 06, 2012 at 11:55:30AM +0300, Avi Kivity wrote:
> On 05/03/2012 11:02 PM, Sergey Senozhatsky wrote:
> > Hello,
> > 3.4-rc5
> 
> Whoa.
> 
> Looks like inconsistent locking between cpufreq and
> synchronize_srcu_expedited().  kvm triggered this because it is one of
> the few users of synchronize_srcu_expedited(), but I don't think it is
> doing anything wrong directly.
> 
> Dave, Paul?

SRCU hasn't changed much in mainline for quite some time.  Holding
the hotplug mutex across a synchronize_srcu() is a bad idea, though.

However, there is a reworked implementation (courtesy of Lai Jiangshan)
in -rcu that does not acquire the hotplug mutex.  Could you try that out?
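
For reference, the 3.4-rc5 structure in question looks roughly like this
(a compilable userspace stub with prints, not the real kernel/srcu.c or
kernel/rcutree.c code): __synchronize_srcu() takes sp->mutex and, in the
expedited case, then calls synchronize_sched_expedited(), which does
get_online_cpus() and so needs cpu_hotplug.lock.  The reworked SRCU waits
for readers without that call, which removes the sp->mutex ->
cpu_hotplug.lock edge from the chain above:

#include <stdio.h>

static void get_online_cpus(void)
{
	puts("  get_online_cpus(): mutex_lock(&cpu_hotplug.lock)");
}

static void synchronize_sched_expedited(void)
{
	get_online_cpus();		/* expedited sched grace period */
	puts("  ...wait for all CPUs...");
}

static void __synchronize_srcu(void)
{
	puts("__synchronize_srcu(): mutex_lock(&sp->mutex)");
	synchronize_sched_expedited();	/* cpu_hotplug.lock taken under sp->mutex */
	puts("__synchronize_srcu(): mutex_unlock(&sp->mutex)");
}

int main(void)
{
	__synchronize_srcu();		/* establishes sp->mutex --> cpu_hotplug.lock */
	return 0;
}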

							Thanx, Paul

> > [32881.212463] kvm: VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL does not work properly. Using workaround
> > [32882.360505] 
> > [32882.360509] ======================================================
> > [32882.360511] [ INFO: possible circular locking dependency detected ]
> > [32882.360515] 3.4.0-rc5-dbg-00932-gfabccd4-dirty #1107 Not tainted
> > [32882.360517] -------------------------------------------------------
> > [32882.360519] qemu-system-x86/15168 is trying to acquire lock:
> > [32882.360521]  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81033a60>] get_online_cpus+0x41/0x55
> > [32882.360532] 
> > [32882.360532] but task is already holding lock:
> > [32882.360534]  (&sp->mutex){+.+...}, at: [<ffffffff81057f91>] __synchronize_srcu+0x9a/0x104
> > [32882.360542] 
> > [32882.360542] which lock already depends on the new lock.
> > [32882.360543] 
> > [32882.360545] 
> > [32882.360545] the existing dependency chain (in reverse order) is:
> > [32882.360547] 
> > [32882.360547] -> #3 (&sp->mutex){+.+...}:
> > [32882.360552]        [<ffffffff81085b36>] lock_acquire+0x148/0x1c3
> > [32882.360557]        [<ffffffff814aa040>] mutex_lock_nested+0x6e/0x2d1
> > [32882.360562]        [<ffffffff81057f91>] __synchronize_srcu+0x9a/0x104
> > [32882.360566]        [<ffffffff81058027>] synchronize_srcu+0x15/0x17
> > [32882.360569]        [<ffffffff810586e8>] srcu_notifier_chain_unregister+0x5b/0x69
> > [32882.360573]        [<ffffffff813e3110>] cpufreq_unregister_notifier+0x22/0x3c
> > [32882.360580]        [<ffffffff813e3e42>] cpufreq_governor_dbs+0x322/0x3ac
> > [32882.360584]        [<ffffffff813e2075>] __cpufreq_governor+0x6b/0xa8
> > [32882.360587]        [<ffffffff813e21a5>] __cpufreq_set_policy+0xf3/0x145
> > [32882.360591]        [<ffffffff813e2def>] store_scaling_governor+0x173/0x1a9
> > [32882.360594]        [<ffffffff813e1f71>] store+0x5a/0x86
> > [32882.360597]        [<ffffffff81181e83>] sysfs_write_file+0xee/0x126
> > [32882.360603]        [<ffffffff8111f6b1>] vfs_write+0xa3/0x14c
> > [32882.360607]        [<ffffffff8111f959>] sys_write+0x43/0x73
> > [32882.360610]        [<ffffffff814b432d>] system_call_fastpath+0x1a/0x1f
> > [32882.360614] 
> > [32882.360615] -> #2 (dbs_mutex){+.+.+.}:
> > [32882.360619]        [<ffffffff81085b36>] lock_acquire+0x148/0x1c3
> > [32882.360622]        [<ffffffff814aa040>] mutex_lock_nested+0x6e/0x2d1
> > [32882.360625]        [<ffffffff813e3b9c>] cpufreq_governor_dbs+0x7c/0x3ac
> > [32882.360629]        [<ffffffff813e2075>] __cpufreq_governor+0x6b/0xa8
> > [32882.360632]        [<ffffffff813e21bb>] __cpufreq_set_policy+0x109/0x145
> > [32882.360636]        [<ffffffff813e244e>] cpufreq_add_dev_interface+0x257/0x288
> > [32882.360639]        [<ffffffff813e2889>] cpufreq_add_dev+0x40a/0x42a
> > [32882.360643]        [<ffffffff81398694>] subsys_interface_register+0x9b/0xdc
> > [32882.360648]        [<ffffffff813e1935>] cpufreq_register_driver+0xa0/0x14b
> > [32882.360652]        [<ffffffffa00a1086>] store_up_threshold+0x3a/0x50 [cpufreq_ondemand]
> > [32882.360657]        [<ffffffff8100020f>] do_one_initcall+0x7f/0x140
> > [32882.360663]        [<ffffffff8109158c>] sys_init_module+0x1818/0x1aec
> > [32882.360667]        [<ffffffff814b432d>] system_call_fastpath+0x1a/0x1f
> > [32882.360671] 
> > [32882.360671] -> #1 (&per_cpu(cpu_policy_rwsem, cpu)){+++++.}:
> > [32882.360675]        [<ffffffff81085b36>] lock_acquire+0x148/0x1c3
> > [32882.360679]        [<ffffffff814aa661>] down_write+0x49/0x6c
> > [32882.360682]        [<ffffffff813e1eaa>] lock_policy_rwsem_write+0x47/0x78
> > [32882.360685]        [<ffffffff814a18c9>] cpufreq_cpu_callback+0x57/0x81
> > [32882.360692]        [<ffffffff814b0643>] notifier_call_chain+0xac/0xd9
> > [32882.360697]        [<ffffffff810582b6>] __raw_notifier_call_chain+0xe/0x10
> > [32882.360701]        [<ffffffff81033950>] __cpu_notify+0x20/0x37
> > [32882.360705]        [<ffffffff8149091c>] _cpu_down+0x7b/0x25d
> > [32882.360709]        [<ffffffff81033b2f>] disable_nonboot_cpus+0x5f/0x10b
> > [32882.360712]        [<ffffffff81072e61>] suspend_devices_and_enter+0x197/0x401
> > [32882.360719]        [<ffffffff810731cf>] pm_suspend+0x104/0x1bd
> > [32882.360722]        [<ffffffff8107213a>] state_store+0xa0/0xc9
> > [32882.360726]        [<ffffffff8127115a>] kobj_attr_store+0xf/0x1b
> > [32882.360730]        [<ffffffff81181e83>] sysfs_write_file+0xee/0x126
> > [32882.360733]        [<ffffffff8111f6b1>] vfs_write+0xa3/0x14c
> > [32882.360736]        [<ffffffff8111f959>] sys_write+0x43/0x73
> > [32882.360739]        [<ffffffff814b432d>] system_call_fastpath+0x1a/0x1f
> > [32882.360743] 
> > [32882.360743] -> #0 (cpu_hotplug.lock){+.+.+.}:
> > [32882.360747]        [<ffffffff81084e93>] __lock_acquire+0xf6b/0x1612
> > [32882.360751]        [<ffffffff81085b36>] lock_acquire+0x148/0x1c3
> > [32882.360754]        [<ffffffff814aa040>] mutex_lock_nested+0x6e/0x2d1
> > [32882.360757]        [<ffffffff81033a60>] get_online_cpus+0x41/0x55
> > [32882.360760]        [<ffffffff810ab15b>] synchronize_sched_expedited+0x26/0xfa
> > [32882.360766]        [<ffffffff81057f9f>] __synchronize_srcu+0xa8/0x104
> > [32882.360769]        [<ffffffff81058010>] synchronize_srcu_expedited+0x15/0x17
> > [32882.360773]        [<ffffffffa01df109>] __kvm_set_memory_region+0x3d8/0x46a [kvm]
> > [32882.360789]        [<ffffffffa01df1d2>] kvm_set_memory_region+0x37/0x50 [kvm]
> > [32882.360798]        [<ffffffffa0258a89>] vmx_set_tss_addr+0x4c/0x200 [kvm_intel]
> > [32882.360803]        [<ffffffffa01ef732>] kvm_arch_vm_ioctl+0x160/0x9df [kvm]
> > [32882.360816]        [<ffffffffa01df571>] kvm_vm_ioctl+0x36a/0x39c [kvm]
> > [32882.360825]        [<ffffffff8112fc68>] vfs_ioctl+0x24/0x2f
> > [32882.360829]        [<ffffffff81130566>] do_vfs_ioctl+0x412/0x455
> > [32882.360832]        [<ffffffff811305ff>] sys_ioctl+0x56/0x7b
> > [32882.360835]        [<ffffffff814b432d>] system_call_fastpath+0x1a/0x1f
> > [32882.360839] 
> > [32882.360839] other info that might help us debug this:
> > [32882.360840] 
> > [32882.360842] Chain exists of:
> > [32882.360842]   cpu_hotplug.lock --> dbs_mutex --> &sp->mutex
> > [32882.360847] 
> > [32882.360848]  Possible unsafe locking scenario:
> > [32882.360849] 
> > [32882.360851]        CPU0                    CPU1
> > [32882.360852]        ----                    ----
> > [32882.360854]   lock(&sp->mutex);
> > [32882.360856]                                lock(dbs_mutex);
> > [32882.360859]                                lock(&sp->mutex);
> > [32882.360862]   lock(cpu_hotplug.lock);
> > [32882.360865] 
> > [32882.360865]  *** DEADLOCK ***
> > [32882.360866] 
> > [32882.360868] 2 locks held by qemu-system-x86/15168:
> > [32882.360870]  #0:  (&kvm->slots_lock){+.+.+.}, at: [<ffffffffa01df1c4>] kvm_set_memory_region+0x29/0x50 [kvm]
> > [32882.360882]  #1:  (&sp->mutex){+.+...}, at: [<ffffffff81057f91>] __synchronize_srcu+0x9a/0x104
> > [32882.360888] 
> > [32882.360889] stack backtrace:
> > [32882.360892] Pid: 15168, comm: qemu-system-x86 Not tainted 3.4.0-rc5-dbg-00932-gfabccd4-dirty #1107
> > [32882.360894] Call Trace:
> > [32882.360898]  [<ffffffff814a3d1d>] print_circular_bug+0x29f/0x2b0
> > [32882.360901]  [<ffffffff81084e93>] __lock_acquire+0xf6b/0x1612
> > [32882.360905]  [<ffffffff81085b36>] lock_acquire+0x148/0x1c3
> > [32882.360908]  [<ffffffff81033a60>] ? get_online_cpus+0x41/0x55
> > [32882.360911]  [<ffffffff814aa040>] mutex_lock_nested+0x6e/0x2d1
> > [32882.360914]  [<ffffffff81033a60>] ? get_online_cpus+0x41/0x55
> > [32882.360917]  [<ffffffff810ab135>] ? synchronize_sched+0xa8/0xa8
> > [32882.360921]  [<ffffffff81033a60>] get_online_cpus+0x41/0x55
> > [32882.360923]  [<ffffffff810ab15b>] synchronize_sched_expedited+0x26/0xfa
> > [32882.360927]  [<ffffffff810ab135>] ? synchronize_sched+0xa8/0xa8
> > [32882.360930]  [<ffffffff81057f9f>] __synchronize_srcu+0xa8/0x104
> > [32882.360933]  [<ffffffff81058010>] synchronize_srcu_expedited+0x15/0x17
> > [32882.360942]  [<ffffffffa01df109>] __kvm_set_memory_region+0x3d8/0x46a [kvm]
> > [32882.360945]  [<ffffffff81086268>] ? mark_held_locks+0xbe/0xea
> > [32882.360954]  [<ffffffffa01df1d2>] kvm_set_memory_region+0x37/0x50 [kvm]
> > [32882.360959]  [<ffffffffa0258a89>] vmx_set_tss_addr+0x4c/0x200 [kvm_intel]
> > [32882.360971]  [<ffffffffa01ef732>] kvm_arch_vm_ioctl+0x160/0x9df [kvm]
> > [32882.360980]  [<ffffffffa01df571>] kvm_vm_ioctl+0x36a/0x39c [kvm]
> > [32882.360984]  [<ffffffff810855c5>] ? lock_release_non_nested+0x8b/0x241
> > [32882.360987]  [<ffffffff8112fc68>] vfs_ioctl+0x24/0x2f
> > [32882.360990]  [<ffffffff81130566>] do_vfs_ioctl+0x412/0x455
> > [32882.360993]  [<ffffffff81120f8f>] ? fget_light+0x120/0x39b
> > [32882.360996]  [<ffffffff811305ff>] sys_ioctl+0x56/0x7b
> > [32882.360999]  [<ffffffff814b432d>] system_call_fastpath+0x1a/0x1f
> >
> >
> >
> > 	-ss
> 
> 
> -- 
> error compiling committee.c: too many arguments to function
> 



* Re: possible circular locking dependency
  2012-05-06 16:42   ` Paul E. McKenney
@ 2012-05-06 20:34     ` Sergey Senozhatsky
  2012-05-07  3:47       ` Paul E. McKenney
  0 siblings, 1 reply; 9+ messages in thread
From: Sergey Senozhatsky @ 2012-05-06 20:34 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Avi Kivity, Marcelo Tosatti, kvm, linux-kernel, Dave Jones

On (05/06/12 09:42), Paul E. McKenney wrote:
> On Sun, May 06, 2012 at 11:55:30AM +0300, Avi Kivity wrote:
> > On 05/03/2012 11:02 PM, Sergey Senozhatsky wrote:
> > > Hello,
> > > 3.4-rc5
> > 
> > Whoa.
> > 
> > Looks like inconsistent locking between cpufreq and
> > synchronize_srcu_expedited().  kvm triggered this because it is one of
> > the few users of synchronize_srcu_expedited(), but I don't think it is
> > doing anything wrong directly.
> > 
> > Dave, Paul?
> 
> SRCU hasn't changed much in mainline for quite some time.  Holding
> the hotplug mutex across a synchronize_srcu() is a bad idea, though.
> 
> However, there is a reworked implementation (courtesy of Lai Jiangshan)
> in -rcu that does not acquire the hotplug mutex.  Could you try that out?
>

Paul, should I try solely -rcu, or are there several commits to pick up and apply
on top of the -linus tree?

	-ss

 
> 
> > > [32881.212463] kvm: VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL does not work properly. Using workaround
> > > [32882.360505] 
> > > [32882.360509] ======================================================
> > > [32882.360511] [ INFO: possible circular locking dependency detected ]
> > > [32882.360515] 3.4.0-rc5-dbg-00932-gfabccd4-dirty #1107 Not tainted
> > > [32882.360517] -------------------------------------------------------
> > > [32882.360519] qemu-system-x86/15168 is trying to acquire lock:
> > > [32882.360521]  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81033a60>] get_online_cpus+0x41/0x55
> > > [32882.360532] 
> > > [32882.360532] but task is already holding lock:
> > > [32882.360534]  (&sp->mutex){+.+...}, at: [<ffffffff81057f91>] __synchronize_srcu+0x9a/0x104
> > > [32882.360542] 
> > > [32882.360542] which lock already depends on the new lock.
> > > [32882.360543] 
> > > [32882.360545] 
> > > [32882.360545] the existing dependency chain (in reverse order) is:
> > > [32882.360547] 
> > > [32882.360547] -> #3 (&sp->mutex){+.+...}:
> > > [32882.360552]        [<ffffffff81085b36>] lock_acquire+0x148/0x1c3
> > > [32882.360557]        [<ffffffff814aa040>] mutex_lock_nested+0x6e/0x2d1
> > > [32882.360562]        [<ffffffff81057f91>] __synchronize_srcu+0x9a/0x104
> > > [32882.360566]        [<ffffffff81058027>] synchronize_srcu+0x15/0x17
> > > [32882.360569]        [<ffffffff810586e8>] srcu_notifier_chain_unregister+0x5b/0x69
> > > [32882.360573]        [<ffffffff813e3110>] cpufreq_unregister_notifier+0x22/0x3c
> > > [32882.360580]        [<ffffffff813e3e42>] cpufreq_governor_dbs+0x322/0x3ac
> > > [32882.360584]        [<ffffffff813e2075>] __cpufreq_governor+0x6b/0xa8
> > > [32882.360587]        [<ffffffff813e21a5>] __cpufreq_set_policy+0xf3/0x145
> > > [32882.360591]        [<ffffffff813e2def>] store_scaling_governor+0x173/0x1a9
> > > [32882.360594]        [<ffffffff813e1f71>] store+0x5a/0x86
> > > [32882.360597]        [<ffffffff81181e83>] sysfs_write_file+0xee/0x126
> > > [32882.360603]        [<ffffffff8111f6b1>] vfs_write+0xa3/0x14c
> > > [32882.360607]        [<ffffffff8111f959>] sys_write+0x43/0x73
> > > [32882.360610]        [<ffffffff814b432d>] system_call_fastpath+0x1a/0x1f
> > > [32882.360614] 
> > > [32882.360615] -> #2 (dbs_mutex){+.+.+.}:
> > > [32882.360619]        [<ffffffff81085b36>] lock_acquire+0x148/0x1c3
> > > [32882.360622]        [<ffffffff814aa040>] mutex_lock_nested+0x6e/0x2d1
> > > [32882.360625]        [<ffffffff813e3b9c>] cpufreq_governor_dbs+0x7c/0x3ac
> > > [32882.360629]        [<ffffffff813e2075>] __cpufreq_governor+0x6b/0xa8
> > > [32882.360632]        [<ffffffff813e21bb>] __cpufreq_set_policy+0x109/0x145
> > > [32882.360636]        [<ffffffff813e244e>] cpufreq_add_dev_interface+0x257/0x288
> > > [32882.360639]        [<ffffffff813e2889>] cpufreq_add_dev+0x40a/0x42a
> > > [32882.360643]        [<ffffffff81398694>] subsys_interface_register+0x9b/0xdc
> > > [32882.360648]        [<ffffffff813e1935>] cpufreq_register_driver+0xa0/0x14b
> > > [32882.360652]        [<ffffffffa00a1086>] store_up_threshold+0x3a/0x50 [cpufreq_ondemand]
> > > [32882.360657]        [<ffffffff8100020f>] do_one_initcall+0x7f/0x140
> > > [32882.360663]        [<ffffffff8109158c>] sys_init_module+0x1818/0x1aec
> > > [32882.360667]        [<ffffffff814b432d>] system_call_fastpath+0x1a/0x1f
> > > [32882.360671] 
> > > [32882.360671] -> #1 (&per_cpu(cpu_policy_rwsem, cpu)){+++++.}:
> > > [32882.360675]        [<ffffffff81085b36>] lock_acquire+0x148/0x1c3
> > > [32882.360679]        [<ffffffff814aa661>] down_write+0x49/0x6c
> > > [32882.360682]        [<ffffffff813e1eaa>] lock_policy_rwsem_write+0x47/0x78
> > > [32882.360685]        [<ffffffff814a18c9>] cpufreq_cpu_callback+0x57/0x81
> > > [32882.360692]        [<ffffffff814b0643>] notifier_call_chain+0xac/0xd9
> > > [32882.360697]        [<ffffffff810582b6>] __raw_notifier_call_chain+0xe/0x10
> > > [32882.360701]        [<ffffffff81033950>] __cpu_notify+0x20/0x37
> > > [32882.360705]        [<ffffffff8149091c>] _cpu_down+0x7b/0x25d
> > > [32882.360709]        [<ffffffff81033b2f>] disable_nonboot_cpus+0x5f/0x10b
> > > [32882.360712]        [<ffffffff81072e61>] suspend_devices_and_enter+0x197/0x401
> > > [32882.360719]        [<ffffffff810731cf>] pm_suspend+0x104/0x1bd
> > > [32882.360722]        [<ffffffff8107213a>] state_store+0xa0/0xc9
> > > [32882.360726]        [<ffffffff8127115a>] kobj_attr_store+0xf/0x1b
> > > [32882.360730]        [<ffffffff81181e83>] sysfs_write_file+0xee/0x126
> > > [32882.360733]        [<ffffffff8111f6b1>] vfs_write+0xa3/0x14c
> > > [32882.360736]        [<ffffffff8111f959>] sys_write+0x43/0x73
> > > [32882.360739]        [<ffffffff814b432d>] system_call_fastpath+0x1a/0x1f
> > > [32882.360743] 
> > > [32882.360743] -> #0 (cpu_hotplug.lock){+.+.+.}:
> > > [32882.360747]        [<ffffffff81084e93>] __lock_acquire+0xf6b/0x1612
> > > [32882.360751]        [<ffffffff81085b36>] lock_acquire+0x148/0x1c3
> > > [32882.360754]        [<ffffffff814aa040>] mutex_lock_nested+0x6e/0x2d1
> > > [32882.360757]        [<ffffffff81033a60>] get_online_cpus+0x41/0x55
> > > [32882.360760]        [<ffffffff810ab15b>] synchronize_sched_expedited+0x26/0xfa
> > > [32882.360766]        [<ffffffff81057f9f>] __synchronize_srcu+0xa8/0x104
> > > [32882.360769]        [<ffffffff81058010>] synchronize_srcu_expedited+0x15/0x17
> > > [32882.360773]        [<ffffffffa01df109>] __kvm_set_memory_region+0x3d8/0x46a [kvm]
> > > [32882.360789]        [<ffffffffa01df1d2>] kvm_set_memory_region+0x37/0x50 [kvm]
> > > [32882.360798]        [<ffffffffa0258a89>] vmx_set_tss_addr+0x4c/0x200 [kvm_intel]
> > > [32882.360803]        [<ffffffffa01ef732>] kvm_arch_vm_ioctl+0x160/0x9df [kvm]
> > > [32882.360816]        [<ffffffffa01df571>] kvm_vm_ioctl+0x36a/0x39c [kvm]
> > > [32882.360825]        [<ffffffff8112fc68>] vfs_ioctl+0x24/0x2f
> > > [32882.360829]        [<ffffffff81130566>] do_vfs_ioctl+0x412/0x455
> > > [32882.360832]        [<ffffffff811305ff>] sys_ioctl+0x56/0x7b
> > > [32882.360835]        [<ffffffff814b432d>] system_call_fastpath+0x1a/0x1f
> > > [32882.360839] 
> > > [32882.360839] other info that might help us debug this:
> > > [32882.360840] 
> > > [32882.360842] Chain exists of:
> > > [32882.360842]   cpu_hotplug.lock --> dbs_mutex --> &sp->mutex
> > > [32882.360847] 
> > > [32882.360848]  Possible unsafe locking scenario:
> > > [32882.360849] 
> > > [32882.360851]        CPU0                    CPU1
> > > [32882.360852]        ----                    ----
> > > [32882.360854]   lock(&sp->mutex);
> > > [32882.360856]                                lock(dbs_mutex);
> > > [32882.360859]                                lock(&sp->mutex);
> > > [32882.360862]   lock(cpu_hotplug.lock);
> > > [32882.360865] 
> > > [32882.360865]  *** DEADLOCK ***
> > > [32882.360866] 
> > > [32882.360868] 2 locks held by qemu-system-x86/15168:
> > > [32882.360870]  #0:  (&kvm->slots_lock){+.+.+.}, at: [<ffffffffa01df1c4>] kvm_set_memory_region+0x29/0x50 [kvm]
> > > [32882.360882]  #1:  (&sp->mutex){+.+...}, at: [<ffffffff81057f91>] __synchronize_srcu+0x9a/0x104
> > > [32882.360888] 
> > > [32882.360889] stack backtrace:
> > > [32882.360892] Pid: 15168, comm: qemu-system-x86 Not tainted 3.4.0-rc5-dbg-00932-gfabccd4-dirty #1107
> > > [32882.360894] Call Trace:
> > > [32882.360898]  [<ffffffff814a3d1d>] print_circular_bug+0x29f/0x2b0
> > > [32882.360901]  [<ffffffff81084e93>] __lock_acquire+0xf6b/0x1612
> > > [32882.360905]  [<ffffffff81085b36>] lock_acquire+0x148/0x1c3
> > > [32882.360908]  [<ffffffff81033a60>] ? get_online_cpus+0x41/0x55
> > > [32882.360911]  [<ffffffff814aa040>] mutex_lock_nested+0x6e/0x2d1
> > > [32882.360914]  [<ffffffff81033a60>] ? get_online_cpus+0x41/0x55
> > > [32882.360917]  [<ffffffff810ab135>] ? synchronize_sched+0xa8/0xa8
> > > [32882.360921]  [<ffffffff81033a60>] get_online_cpus+0x41/0x55
> > > [32882.360923]  [<ffffffff810ab15b>] synchronize_sched_expedited+0x26/0xfa
> > > [32882.360927]  [<ffffffff810ab135>] ? synchronize_sched+0xa8/0xa8
> > > [32882.360930]  [<ffffffff81057f9f>] __synchronize_srcu+0xa8/0x104
> > > [32882.360933]  [<ffffffff81058010>] synchronize_srcu_expedited+0x15/0x17
> > > [32882.360942]  [<ffffffffa01df109>] __kvm_set_memory_region+0x3d8/0x46a [kvm]
> > > [32882.360945]  [<ffffffff81086268>] ? mark_held_locks+0xbe/0xea
> > > [32882.360954]  [<ffffffffa01df1d2>] kvm_set_memory_region+0x37/0x50 [kvm]
> > > [32882.360959]  [<ffffffffa0258a89>] vmx_set_tss_addr+0x4c/0x200 [kvm_intel]
> > > [32882.360971]  [<ffffffffa01ef732>] kvm_arch_vm_ioctl+0x160/0x9df [kvm]
> > > [32882.360980]  [<ffffffffa01df571>] kvm_vm_ioctl+0x36a/0x39c [kvm]
> > > [32882.360984]  [<ffffffff810855c5>] ? lock_release_non_nested+0x8b/0x241
> > > [32882.360987]  [<ffffffff8112fc68>] vfs_ioctl+0x24/0x2f
> > > [32882.360990]  [<ffffffff81130566>] do_vfs_ioctl+0x412/0x455
> > > [32882.360993]  [<ffffffff81120f8f>] ? fget_light+0x120/0x39b
> > > [32882.360996]  [<ffffffff811305ff>] sys_ioctl+0x56/0x7b
> > > [32882.360999]  [<ffffffff814b432d>] system_call_fastpath+0x1a/0x1f
> > >
> > >
> > >
> > > 	-ss
> > 
> > 
> > -- 
> > error compiling committee.c: too many arguments to function
> > 
> 


* Re: possible circular locking dependency
  2012-05-06 20:34     ` Sergey Senozhatsky
@ 2012-05-07  3:47       ` Paul E. McKenney
  2012-05-07  7:52         ` Avi Kivity
  0 siblings, 1 reply; 9+ messages in thread
From: Paul E. McKenney @ 2012-05-07  3:47 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Avi Kivity, Marcelo Tosatti, kvm, linux-kernel, Dave Jones

On Sun, May 06, 2012 at 11:34:39PM +0300, Sergey Senozhatsky wrote:
> On (05/06/12 09:42), Paul E. McKenney wrote:
> > On Sun, May 06, 2012 at 11:55:30AM +0300, Avi Kivity wrote:
> > > On 05/03/2012 11:02 PM, Sergey Senozhatsky wrote:
> > > > Hello,
> > > > 3.4-rc5
> > > 
> > > Whoa.
> > > 
> > > Looks like inconsistent locking between cpufreq and
> > > synchronize_srcu_expedited().  kvm triggered this because it is one of
> > > the few users of synchronize_srcu_expedited(), but I don't think it is
> > > doing anything wrong directly.
> > > 
> > > Dave, Paul?
> > 
> > SRCU hasn't changed much in mainline for quite some time.  Holding
> > the hotplug mutex across a synchronize_srcu() is a bad idea, though.
> > 
> > However, there is a reworked implementation (courtesy of Lai Jiangshan)
> > in -rcu that does not acquire the hotplug mutex.  Could you try that out?
> >
> 
> Paul, should I try solely -rcu, or are there several commits to pick up and apply
> on top of the -linus tree?

If you want the smallest possible change, take the rcu/srcu branch of -rcu.
If you want the works, take the rcu/next branch of -rcu.

You can find -rcu at:

git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git

								Thanx, Paul



* Re: possible circular locking dependency
  2012-05-07  3:47       ` Paul E. McKenney
@ 2012-05-07  7:52         ` Avi Kivity
  2012-05-07 22:10           ` Sergey Senozhatsky
  0 siblings, 1 reply; 9+ messages in thread
From: Avi Kivity @ 2012-05-07  7:52 UTC (permalink / raw)
  To: paulmck
  Cc: Sergey Senozhatsky, Marcelo Tosatti, kvm, linux-kernel, Dave Jones

On 05/07/2012 06:47 AM, Paul E. McKenney wrote:
> On Sun, May 06, 2012 at 11:34:39PM +0300, Sergey Senozhatsky wrote:
> > On (05/06/12 09:42), Paul E. McKenney wrote:
> > > On Sun, May 06, 2012 at 11:55:30AM +0300, Avi Kivity wrote:
> > > > On 05/03/2012 11:02 PM, Sergey Senozhatsky wrote:
> > > > > Hello,
> > > > > 3.4-rc5
> > > > 
> > > > Whoa.
> > > > 
> > > > Looks like inconsistent locking between cpufreq and
> > > > synchronize_srcu_expedited().  kvm triggered this because it is one of
> > > > the few users of synchronize_srcu_expedited(), but I don't think it is
> > > > doing anything wrong directly.
> > > > 
> > > > Dave, Paul?
> > > 
> > > SRCU hasn't changed much in mainline for quite some time.  Holding
> > > the hotplug mutex across a synchronize_srcu() is a bad idea, though.
> > > 
> > > However, there is a reworked implementation (courtesy of Lai Jiangshan)
> > > in -rcu that does not acquire the hotplug mutex.  Could you try that out?
> > >
> > 
> > Paul, should I try solely -rcu, or are there several commits to pick up and apply
> > on top of the -linus tree?
>
> If you want the smallest possible change, take the rcu/srcu branch of -rcu.
> If you want the works, take the rcu/next branch of -rcu.
>
> You can find -rcu at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git

To make the difference even smaller, merge the above branch with v3.4-rc5.

-- 
error compiling committee.c: too many arguments to function



* Re: possible circular locking dependency
  2012-05-07  7:52         ` Avi Kivity
@ 2012-05-07 22:10           ` Sergey Senozhatsky
  0 siblings, 0 replies; 9+ messages in thread
From: Sergey Senozhatsky @ 2012-05-07 22:10 UTC (permalink / raw)
  To: Avi Kivity; +Cc: paulmck, Marcelo Tosatti, kvm, linux-kernel, Dave Jones

On (05/07/12 10:52), Avi Kivity wrote:
> On 05/07/2012 06:47 AM, Paul E. McKenney wrote:
> > On Sun, May 06, 2012 at 11:34:39PM +0300, Sergey Senozhatsky wrote:
> > > On (05/06/12 09:42), Paul E. McKenney wrote:
> > > > On Sun, May 06, 2012 at 11:55:30AM +0300, Avi Kivity wrote:
> > > > > On 05/03/2012 11:02 PM, Sergey Senozhatsky wrote:
> > > > > > Hello,
> > > > > > 3.4-rc5
> > > > > 
> > > > > Whoa.
> > > > > 
> > > > > Looks like inconsistent locking between cpufreq and
> > > > > synchronize_srcu_expedited().  kvm triggered this because it is one of
> > > > > the few users of synchronize_srcu_expedited(), but I don't think it is
> > > > > doing anything wrong directly.
> > > > > 
> > > > > Dave, Paul?
> > > > 
> > > > SRCU hasn't changed much in mainline for quite some time.  Holding
> > > > the hotplug mutex across a synchronize_srcu() is a bad idea, though.
> > > > 
> > > > However, there is a reworked implementation (courtesy of Lai Jiangshan)
> > > > in -rcu that does not acquire the hotplug mutex.  Could you try that out?
> > > >
> > > 
> > > Paul, should I try solely -rcu, or are there several commits to pick up and apply
> > > on top of the -linus tree?
> >
> > If you want the smallest possible change, take the rcu/srcu branch of -rcu.
> > If you want the works, take the rcu/next branch of -rcu.
> >
> > You can find -rcu at:
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
> 
> To make the difference even smaller, merge the above branch with v3.4-rc5.
> 

I'm unable to reproduce the issue on 3.4-rc6 so far. So I guess this will 
take some time.


	Sergey


* Re: possible circular locking dependency
  2009-09-21 14:00 Christof Schmitt
@ 2009-11-10 13:33 ` Christof Schmitt
  0 siblings, 0 replies; 9+ messages in thread
From: Christof Schmitt @ 2009-11-10 13:33 UTC (permalink / raw)
  To: linux-scsi

On Mon, Sep 21, 2009 at 04:00:50PM +0200, Christof Schmitt wrote:
> The lock dependency checker found this circular lock dependency
> warning on the 2.6.31 kernel plus some s390 patches. But the problem
> occurs in common SCSI code in 5 steps:
> 
> #4 first acquires scan_mutex in scsi_remove_device,
>    then sd_ref_mutex in scsi_disk_get_from_dev
> 
> #3 first acquires rport_delete_work in run_workqueue (inlined in worker_thread),
>    then scan_mutex in scsi_remove_device
> 
> #2 first acquires fc_host->work_q in run_workqueue,
>    then rport_delete_work also in run_workqueue
> 
> #1 first acquires cpu_add_remove_lock in destroy_workqueue,
>    then fc_host->work_q in cleanup_workqueue_thread
> 
> #0 first acquires sd_ref_mutex in scsi_disk_put,
>    then cpu_add_remove_lock in destroy_workqueue
> 
> I think this is only a theoretical warning which will be very hard or
> impossible to trigger in reality. But at least the warning should be
> fixed to keep the lock dependency checker useful.
> 
> Does anybody have an idea how to break this dependency chain?

This still happens with 2.6.32. I think it boils down to:

#4: The work function acquiring the sd_ref_mutex gives:
    cpu_add_remove_lock -> sd_ref_mutex

#0: Calling destroy_workqueue from scsi_host_dev_release introduces
    the dependency
    sd_ref_mutex -> cpu_add_remove_lock

But the sd_ref_mutex is required for the scsi_disk references. So far,
I don't see a good way to approach this.
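
Spelled out as call chains (a compilable stub with prints, not the real
SCSI/workqueue code; the names only mirror the lockdep report quoted
below), the two directions look like this:

#include <stdio.h>

/* Direction 1 (the cpu_add_remove_lock -> sd_ref_mutex side, chains
 * #1-#4): destroy_workqueue() holds cpu_add_remove_lock while flushing
 * fc_host->work_q, and the pending fc_rport_final_delete() work ends up
 * taking sd_ref_mutex. */
static void workqueue_teardown(void)
{
	puts("destroy_workqueue(): mutex_lock(&cpu_add_remove_lock)");
	puts("  flush fc_host->work_q -> fc_rport_final_delete()");
	puts("    scsi_remove_device(): shost->scan_mutex");
	puts("      sd_shutdown() -> scsi_disk_get_from_dev(): sd_ref_mutex");
}

/* Direction 2 (chain #0): dropping the last scsi_disk reference with
 * sd_ref_mutex held releases the Scsi_Host, whose release handler calls
 * destroy_workqueue() and therefore needs cpu_add_remove_lock. */
static void last_disk_put(void)
{
	puts("scsi_disk_put(): mutex_lock(&sd_ref_mutex)");
	puts("  kref_put() -> ... -> scsi_host_dev_release()");
	puts("    destroy_workqueue(): mutex_lock(&cpu_add_remove_lock)");
}

int main(void)
{
	workqueue_teardown();
	last_disk_put();
	return 0;
}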

> 
> The complete output of the lock dependency checker:
> 
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.31 #12
> -------------------------------------------------------
> multipathd/2285 is trying to acquire lock:
>  (cpu_add_remove_lock){+.+.+.}, at: [<000000000006a38e>] destroy_workqueue+0x3a/0x274
> 
> but task is already holding lock:
>  (sd_ref_mutex){+.+.+.}, at: [<0000000000284202>] scsi_disk_put+0x36/0x5c
> 
> which lock already depends on the new lock.
> 
> 
> the existing dependency chain (in reverse order) is:
> 
> -> #4 (sd_ref_mutex){+.+.+.}:
>        [<0000000000086782>] __lock_acquire+0xe76/0x1940
>        [<00000000000872dc>] lock_acquire+0x90/0xb8
>        [<000000000046fccc>] mutex_lock_nested+0x80/0x41c
>        [<0000000000284190>] scsi_disk_get_from_dev+0x30/0x6c
>        [<0000000000284830>] sd_shutdown+0x28/0x160
>        [<0000000000284ca4>] sd_remove+0x68/0xac
>        [<0000000000257450>] __device_release_driver+0x98/0x108
>        [<00000000002575e8>] device_release_driver+0x38/0x48
>        [<000000000025674a>] bus_remove_device+0xd6/0x11c
>        [<000000000025458c>] device_del+0x160/0x218
>        [<0000000000272650>] __scsi_remove_device+0x6c/0xb4
>        [<00000000002726da>] scsi_remove_device+0x42/0x54
>        [<00000000002727c6>] __scsi_remove_target+0xce/0x108
>        [<00000000002728ae>] __remove_child+0x3a/0x4c
>        [<0000000000253b0e>] device_for_each_child+0x72/0xbc
>        [<000000000027284e>] scsi_remove_target+0x4e/0x74
>        [<000000000027929a>] fc_rport_final_delete+0xb2/0x20c
>        [<0000000000069ed0>] worker_thread+0x25c/0x318
>        [<000000000006ff62>] kthread+0x9a/0xa4
>        [<000000000001c952>] kernel_thread_starter+0x6/0xc
>        [<000000000001c94c>] kernel_thread_starter+0x0/0xc
> 
> -> #3 (&shost->scan_mutex){+.+.+.}:
>        [<0000000000086782>] __lock_acquire+0xe76/0x1940
>        [<00000000000872dc>] lock_acquire+0x90/0xb8
>        [<000000000046fccc>] mutex_lock_nested+0x80/0x41c
>        [<00000000002726d0>] scsi_remove_device+0x38/0x54
>        [<00000000002727c6>] __scsi_remove_target+0xce/0x108
>        [<00000000002728ae>] __remove_child+0x3a/0x4c
>        [<0000000000253b0e>] device_for_each_child+0x72/0xbc
>        [<000000000027284e>] scsi_remove_target+0x4e/0x74
>        [<000000000027929a>] fc_rport_final_delete+0xb2/0x20c
>        [<0000000000069ed0>] worker_thread+0x25c/0x318
>        [<000000000006ff62>] kthread+0x9a/0xa4
>        [<000000000001c952>] kernel_thread_starter+0x6/0xc
>        [<000000000001c94c>] kernel_thread_starter+0x0/0xc
> 
> -> #2 (&rport->rport_delete_work){+.+.+.}:
>        [<0000000000086782>] __lock_acquire+0xe76/0x1940
>        [<00000000000872dc>] lock_acquire+0x90/0xb8
>        [<0000000000069eca>] worker_thread+0x256/0x318
>        [<000000000006ff62>] kthread+0x9a/0xa4
>        [<000000000001c952>] kernel_thread_starter+0x6/0xc
>        [<000000000001c94c>] kernel_thread_starter+0x0/0xc
> 
> -> #1 ((fc_host->work_q_name)){+.+.+.}:
>        [<0000000000086782>] __lock_acquire+0xe76/0x1940
>        [<00000000000872dc>] lock_acquire+0x90/0xb8
>        [<000000000006a2ae>] cleanup_workqueue_thread+0x62/0xac
>        [<000000000006a420>] destroy_workqueue+0xcc/0x274
>        [<0000000000279c4a>] fc_remove_host+0x1de/0x210
>        [<000000000034556e>] zfcp_adapter_scsi_unregister+0x96/0xc4
>        [<0000000000343df0>] zfcp_ccw_remove+0x9c/0x370
>        [<00000000002c2a6a>] ccw_device_remove+0x3e/0x1a8
>        [<0000000000257450>] __device_release_driver+0x98/0x108
>        [<00000000002575e8>] device_release_driver+0x38/0x48
>        [<000000000025674a>] bus_remove_device+0xd6/0x11c
>        [<000000000025458c>] device_del+0x160/0x218
>        [<00000000002c3404>] ccw_device_unregister+0x5c/0x7c
>        [<00000000002c3490>] io_subchannel_remove+0x6c/0x9c
>        [<00000000002be32e>] css_remove+0x3e/0x7c
>        [<0000000000257450>] __device_release_driver+0x98/0x108
>        [<00000000002575e8>] device_release_driver+0x38/0x48
>        [<000000000025674a>] bus_remove_device+0xd6/0x11c
>        [<000000000025458c>] device_del+0x160/0x218
>        [<000000000025466a>] device_unregister+0x26/0x38
>        [<00000000002be4bc>] css_sch_device_unregister+0x44/0x54
>        [<00000000002c435e>] ccw_device_call_sch_unregister+0x4e/0x78
>        [<0000000000069ed0>] worker_thread+0x25c/0x318
>        [<000000000006ff62>] kthread+0x9a/0xa4
>        [<000000000001c952>] kernel_thread_starter+0x6/0xc
>        [<000000000001c94c>] kernel_thread_starter+0x0/0xc
> 
> -> #0 (cpu_add_remove_lock){+.+.+.}:
>        [<0000000000086e5a>] __lock_acquire+0x154e/0x1940
>        [<00000000000872dc>] lock_acquire+0x90/0xb8
>        [<000000000046fccc>] mutex_lock_nested+0x80/0x41c
>        [<000000000006a38e>] destroy_workqueue+0x3a/0x274
>        [<0000000000265bb0>] scsi_host_dev_release+0x88/0x104
>        [<000000000025396a>] device_release+0x36/0xa0
>        [<000000000022ae92>] kobject_release+0x62/0xa8
>        [<000000000022c11c>] kref_put+0x74/0x94
>        [<00000000002771cc>] fc_rport_dev_release+0x2c/0x40
>        [<000000000025396a>] device_release+0x36/0xa0
>        [<000000000022ae92>] kobject_release+0x62/0xa8
>        [<000000000022c11c>] kref_put+0x74/0x94
>        [<000000000025396a>] device_release+0x36/0xa0
>        [<000000000022ae92>] kobject_release+0x62/0xa8
>        [<000000000022c11c>] kref_put+0x74/0x94
>        [<000000000006ba9c>] execute_in_process_context+0xa4/0xbc
>        [<000000000025396a>] device_release+0x36/0xa0
>        [<000000000022ae92>] kobject_release+0x62/0xa8
>        [<000000000022c11c>] kref_put+0x74/0x94
>        [<0000000000284216>] scsi_disk_put+0x4a/0x5c
>        [<0000000000285560>] sd_release+0x6c/0x108
>        [<0000000000126364>] __blkdev_put+0x1b8/0x1cc
>        [<00000000000f224e>] __fput+0x12a/0x240
>        [<00000000000ee4c0>] filp_close+0x78/0xa8
>        [<00000000000ee5d0>] SyS_close+0xe0/0x148
>        [<000000000002a042>] sysc_noemu+0x10/0x16
>        [<0000020000041160>] 0x20000041160
> 
> other info that might help us debug this:
> 
> 2 locks held by multipathd/2285:
>  #0:  (&bdev->bd_mutex){+.+.+.}, at: [<00000000001261f2>] __blkdev_put+0x46/0x1cc
>  #1:  (sd_ref_mutex){+.+.+.}, at: [<0000000000284202>] scsi_disk_put+0x36/0x5c
> 
> stack backtrace:
> CPU: 1 Not tainted 2.6.31 #12
> Process multipathd (pid: 2285, task: 000000002d87b900, ksp: 000000002eca7800)
> 0000000000000000 000000002eca7770 0000000000000002 0000000000000000 
>        000000002eca7810 000000002eca7788 000000002eca7788 000000000046db82 
>        0000000000000000 0000000000000001 000000002d87bfd0 0000000000000000 
>        000000000000000d 0000000000000000 000000002eca77d8 000000000000000e 
>        000000000047fc30 0000000000017d80 000000002eca7770 000000002eca77b8 
> Call Trace:
> ([<0000000000017c82>] show_trace+0xee/0x144)
>  [<000000000008532e>] print_circular_bug_tail+0x10a/0x110
>  [<0000000000086e5a>] __lock_acquire+0x154e/0x1940
>  [<00000000000872dc>] lock_acquire+0x90/0xb8
>  [<000000000046fccc>] mutex_lock_nested+0x80/0x41c
>  [<000000000006a38e>] destroy_workqueue+0x3a/0x274
>  [<0000000000265bb0>] scsi_host_dev_release+0x88/0x104
>  [<000000000025396a>] device_release+0x36/0xa0
>  [<000000000022ae92>] kobject_release+0x62/0xa8
>  [<000000000022c11c>] kref_put+0x74/0x94
>  [<00000000002771cc>] fc_rport_dev_release+0x2c/0x40
>  [<000000000025396a>] device_release+0x36/0xa0
>  [<000000000022ae92>] kobject_release+0x62/0xa8
>  [<000000000022c11c>] kref_put+0x74/0x94
>  [<000000000025396a>] device_release+0x36/0xa0
>  [<000000000022ae92>] kobject_release+0x62/0xa8
>  [<000000000022c11c>] kref_put+0x74/0x94
>  [<000000000006ba9c>] execute_in_process_context+0xa4/0xbc
>  [<000000000025396a>] device_release+0x36/0xa0
>  [<000000000022ae92>] kobject_release+0x62/0xa8
>  [<000000000022c11c>] kref_put+0x74/0x94
>  [<0000000000284216>] scsi_disk_put+0x4a/0x5c
>  [<0000000000285560>] sd_release+0x6c/0x108
>  [<0000000000126364>] __blkdev_put+0x1b8/0x1cc
>  [<00000000000f224e>] __fput+0x12a/0x240
>  [<00000000000ee4c0>] filp_close+0x78/0xa8
>  [<00000000000ee5d0>] SyS_close+0xe0/0x148
>  [<000000000002a042>] sysc_noemu+0x10/0x16
>  [<0000020000041160>] 0x20000041160
> INFO: lockdep is turned off.
> 
> --
> Christof Schmitt
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* possible circular locking dependency
@ 2009-09-21 14:00 Christof Schmitt
  2009-11-10 13:33 ` Christof Schmitt
  0 siblings, 1 reply; 9+ messages in thread
From: Christof Schmitt @ 2009-09-21 14:00 UTC (permalink / raw)
  To: linux-scsi

The lock dependency checker found this circular lock dependency
warning on the 2.6.31 kernel plus some s390 patches. But the problem
occurs in common SCSI code in 5 steps:

#4 first acquires scan_mutex in scsi_remove_device,
   then sd_ref_mutex in scsi_disk_get_from_dev

#3 first acquires rport_delete_work in run_workqueue (inlined in worker_thread),
   then scan_mutex in scsi_remove_device

#2 first acquires fc_host->work_q in run_workqueue,
   then rport_delete_work also in run_workqueue

#1 first acquires cpu_add_remove_lock in destroy_workqueue,
   then fc_host->work_q in cleanup_workqueue_thread

#0 first acquires sd_ref_mutex in scsi_disk_put,
   then cpu_add_remove_lock in destroy_workqueue

I think this is only a theoretical warning which will be very hard or
impossible to trigger in reality. But at least the warning should be
fixed to keep the lock dependency checker useful.

Does anybody have an idea how to break this dependency chain?

The complete output of the lock dependency checker:

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.31 #12
-------------------------------------------------------
multipathd/2285 is trying to acquire lock:
 (cpu_add_remove_lock){+.+.+.}, at: [<000000000006a38e>] destroy_workqueue+0x3a/0x274

but task is already holding lock:
 (sd_ref_mutex){+.+.+.}, at: [<0000000000284202>] scsi_disk_put+0x36/0x5c

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #4 (sd_ref_mutex){+.+.+.}:
       [<0000000000086782>] __lock_acquire+0xe76/0x1940
       [<00000000000872dc>] lock_acquire+0x90/0xb8
       [<000000000046fccc>] mutex_lock_nested+0x80/0x41c
       [<0000000000284190>] scsi_disk_get_from_dev+0x30/0x6c
       [<0000000000284830>] sd_shutdown+0x28/0x160
       [<0000000000284ca4>] sd_remove+0x68/0xac
       [<0000000000257450>] __device_release_driver+0x98/0x108
       [<00000000002575e8>] device_release_driver+0x38/0x48
       [<000000000025674a>] bus_remove_device+0xd6/0x11c
       [<000000000025458c>] device_del+0x160/0x218
       [<0000000000272650>] __scsi_remove_device+0x6c/0xb4
       [<00000000002726da>] scsi_remove_device+0x42/0x54
       [<00000000002727c6>] __scsi_remove_target+0xce/0x108
       [<00000000002728ae>] __remove_child+0x3a/0x4c
       [<0000000000253b0e>] device_for_each_child+0x72/0xbc
       [<000000000027284e>] scsi_remove_target+0x4e/0x74
       [<000000000027929a>] fc_rport_final_delete+0xb2/0x20c
       [<0000000000069ed0>] worker_thread+0x25c/0x318
       [<000000000006ff62>] kthread+0x9a/0xa4
       [<000000000001c952>] kernel_thread_starter+0x6/0xc
       [<000000000001c94c>] kernel_thread_starter+0x0/0xc

-> #3 (&shost->scan_mutex){+.+.+.}:
       [<0000000000086782>] __lock_acquire+0xe76/0x1940
       [<00000000000872dc>] lock_acquire+0x90/0xb8
       [<000000000046fccc>] mutex_lock_nested+0x80/0x41c
       [<00000000002726d0>] scsi_remove_device+0x38/0x54
       [<00000000002727c6>] __scsi_remove_target+0xce/0x108
       [<00000000002728ae>] __remove_child+0x3a/0x4c
       [<0000000000253b0e>] device_for_each_child+0x72/0xbc
       [<000000000027284e>] scsi_remove_target+0x4e/0x74
       [<000000000027929a>] fc_rport_final_delete+0xb2/0x20c
       [<0000000000069ed0>] worker_thread+0x25c/0x318
       [<000000000006ff62>] kthread+0x9a/0xa4
       [<000000000001c952>] kernel_thread_starter+0x6/0xc
       [<000000000001c94c>] kernel_thread_starter+0x0/0xc

-> #2 (&rport->rport_delete_work){+.+.+.}:
       [<0000000000086782>] __lock_acquire+0xe76/0x1940
       [<00000000000872dc>] lock_acquire+0x90/0xb8
       [<0000000000069eca>] worker_thread+0x256/0x318
       [<000000000006ff62>] kthread+0x9a/0xa4
       [<000000000001c952>] kernel_thread_starter+0x6/0xc
       [<000000000001c94c>] kernel_thread_starter+0x0/0xc

-> #1 ((fc_host->work_q_name)){+.+.+.}:
       [<0000000000086782>] __lock_acquire+0xe76/0x1940
       [<00000000000872dc>] lock_acquire+0x90/0xb8
       [<000000000006a2ae>] cleanup_workqueue_thread+0x62/0xac
       [<000000000006a420>] destroy_workqueue+0xcc/0x274
       [<0000000000279c4a>] fc_remove_host+0x1de/0x210
       [<000000000034556e>] zfcp_adapter_scsi_unregister+0x96/0xc4
       [<0000000000343df0>] zfcp_ccw_remove+0x9c/0x370
       [<00000000002c2a6a>] ccw_device_remove+0x3e/0x1a8
       [<0000000000257450>] __device_release_driver+0x98/0x108
       [<00000000002575e8>] device_release_driver+0x38/0x48
       [<000000000025674a>] bus_remove_device+0xd6/0x11c
       [<000000000025458c>] device_del+0x160/0x218
       [<00000000002c3404>] ccw_device_unregister+0x5c/0x7c
       [<00000000002c3490>] io_subchannel_remove+0x6c/0x9c
       [<00000000002be32e>] css_remove+0x3e/0x7c
       [<0000000000257450>] __device_release_driver+0x98/0x108
       [<00000000002575e8>] device_release_driver+0x38/0x48
       [<000000000025674a>] bus_remove_device+0xd6/0x11c
       [<000000000025458c>] device_del+0x160/0x218
       [<000000000025466a>] device_unregister+0x26/0x38
       [<00000000002be4bc>] css_sch_device_unregister+0x44/0x54
       [<00000000002c435e>] ccw_device_call_sch_unregister+0x4e/0x78
       [<0000000000069ed0>] worker_thread+0x25c/0x318
       [<000000000006ff62>] kthread+0x9a/0xa4
       [<000000000001c952>] kernel_thread_starter+0x6/0xc
       [<000000000001c94c>] kernel_thread_starter+0x0/0xc

-> #0 (cpu_add_remove_lock){+.+.+.}:
       [<0000000000086e5a>] __lock_acquire+0x154e/0x1940
       [<00000000000872dc>] lock_acquire+0x90/0xb8
       [<000000000046fccc>] mutex_lock_nested+0x80/0x41c
       [<000000000006a38e>] destroy_workqueue+0x3a/0x274
       [<0000000000265bb0>] scsi_host_dev_release+0x88/0x104
       [<000000000025396a>] device_release+0x36/0xa0
       [<000000000022ae92>] kobject_release+0x62/0xa8
       [<000000000022c11c>] kref_put+0x74/0x94
       [<00000000002771cc>] fc_rport_dev_release+0x2c/0x40
       [<000000000025396a>] device_release+0x36/0xa0
       [<000000000022ae92>] kobject_release+0x62/0xa8
       [<000000000022c11c>] kref_put+0x74/0x94
       [<000000000025396a>] device_release+0x36/0xa0
       [<000000000022ae92>] kobject_release+0x62/0xa8
       [<000000000022c11c>] kref_put+0x74/0x94
       [<000000000006ba9c>] execute_in_process_context+0xa4/0xbc
       [<000000000025396a>] device_release+0x36/0xa0
       [<000000000022ae92>] kobject_release+0x62/0xa8
       [<000000000022c11c>] kref_put+0x74/0x94
       [<0000000000284216>] scsi_disk_put+0x4a/0x5c
       [<0000000000285560>] sd_release+0x6c/0x108
       [<0000000000126364>] __blkdev_put+0x1b8/0x1cc
       [<00000000000f224e>] __fput+0x12a/0x240
       [<00000000000ee4c0>] filp_close+0x78/0xa8
       [<00000000000ee5d0>] SyS_close+0xe0/0x148
       [<000000000002a042>] sysc_noemu+0x10/0x16
       [<0000020000041160>] 0x20000041160

other info that might help us debug this:

2 locks held by multipathd/2285:
 #0:  (&bdev->bd_mutex){+.+.+.}, at: [<00000000001261f2>] __blkdev_put+0x46/0x1cc
 #1:  (sd_ref_mutex){+.+.+.}, at: [<0000000000284202>] scsi_disk_put+0x36/0x5c

stack backtrace:
CPU: 1 Not tainted 2.6.31 #12
Process multipathd (pid: 2285, task: 000000002d87b900, ksp: 000000002eca7800)
0000000000000000 000000002eca7770 0000000000000002 0000000000000000 
       000000002eca7810 000000002eca7788 000000002eca7788 000000000046db82 
       0000000000000000 0000000000000001 000000002d87bfd0 0000000000000000 
       000000000000000d 0000000000000000 000000002eca77d8 000000000000000e 
       000000000047fc30 0000000000017d80 000000002eca7770 000000002eca77b8 
Call Trace:
([<0000000000017c82>] show_trace+0xee/0x144)
 [<000000000008532e>] print_circular_bug_tail+0x10a/0x110
 [<0000000000086e5a>] __lock_acquire+0x154e/0x1940
 [<00000000000872dc>] lock_acquire+0x90/0xb8
 [<000000000046fccc>] mutex_lock_nested+0x80/0x41c
 [<000000000006a38e>] destroy_workqueue+0x3a/0x274
 [<0000000000265bb0>] scsi_host_dev_release+0x88/0x104
 [<000000000025396a>] device_release+0x36/0xa0
 [<000000000022ae92>] kobject_release+0x62/0xa8
 [<000000000022c11c>] kref_put+0x74/0x94
 [<00000000002771cc>] fc_rport_dev_release+0x2c/0x40
 [<000000000025396a>] device_release+0x36/0xa0
 [<000000000022ae92>] kobject_release+0x62/0xa8
 [<000000000022c11c>] kref_put+0x74/0x94
 [<000000000025396a>] device_release+0x36/0xa0
 [<000000000022ae92>] kobject_release+0x62/0xa8
 [<000000000022c11c>] kref_put+0x74/0x94
 [<000000000006ba9c>] execute_in_process_context+0xa4/0xbc
 [<000000000025396a>] device_release+0x36/0xa0
 [<000000000022ae92>] kobject_release+0x62/0xa8
 [<000000000022c11c>] kref_put+0x74/0x94
 [<0000000000284216>] scsi_disk_put+0x4a/0x5c
 [<0000000000285560>] sd_release+0x6c/0x108
 [<0000000000126364>] __blkdev_put+0x1b8/0x1cc
 [<00000000000f224e>] __fput+0x12a/0x240
 [<00000000000ee4c0>] filp_close+0x78/0xa8
 [<00000000000ee5d0>] SyS_close+0xe0/0x148
 [<000000000002a042>] sysc_noemu+0x10/0x16
 [<0000020000041160>] 0x20000041160
INFO: lockdep is turned off.

--
Christof Schmitt



Thread overview: 9+ messages
2012-05-03 20:02 possible circular locking dependency Sergey Senozhatsky
2012-05-06  8:55 ` Avi Kivity
2012-05-06 16:42   ` Paul E. McKenney
2012-05-06 20:34     ` Sergey Senozhatsky
2012-05-07  3:47       ` Paul E. McKenney
2012-05-07  7:52         ` Avi Kivity
2012-05-07 22:10           ` Sergey Senozhatsky
  -- strict thread matches above, loose matches on Subject: below --
2009-09-21 14:00 Christof Schmitt
2009-11-10 13:33 ` Christof Schmitt
