linux-wireless.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [linuxwifi] x86/thermal: AB-BA dependency between mvm->mutex and tz->lock
       [not found]   ` <1501753405.15969.43.camel@intel.com>
@ 2017-08-03 10:02     ` Kalle Valo
  2017-08-03 11:28       ` Coelho, Luciano
  0 siblings, 1 reply; 17+ messages in thread
From: Kalle Valo @ 2017-08-03 10:02 UTC (permalink / raw)
  To: Coelho, Luciano
  Cc: jikos, Zhang, Rui, edubezval, Sharon, Sara, Berg, Johannes,
	Grumbach, Emmanuel, linuxwifi, linux-kernel, linux-pm, Weinehall,
	David, linux-wireless

"Coelho, Luciano" <luciano.coelho@intel.com> writes:

> On Thu, 2017-08-03 at 11:10 +0200, Jiri Kosina wrote:
>> On Mon, 31 Jul 2017, Jiri Kosina wrote:
>>=20
>> > Hi,
>> >=20
>> > booting current Linus' tree, I'm seeing lockdep splat (see the end of =
this=20
>> > mail).
>> >=20
>> > Apparently, there is AB-BA between tz->lock and mvm->mutex through the=
 CPU=20
>> > hotplug lock.
>> >=20
>> > The obivous depency is: thermal_zone_get_temp() acquires tz->lock, and=
=20
>> > then calls iwl_mvm_tzone_get_temp() (through tz->ops->get_temp()=20
>> > callback), which acquires mvm->mutex
>> >=20
>> > The less obvious dependency is primarily caused by iwl_op_mode_mvm_sta=
rt()=20
>> > allocating workqueue (#2 stacktrace) while holding mvm->mutex (which i=
s=20
>> > broken, because that mutex is being taken also from CPU hotplug callba=
ck=20
>> > path, hence the AB-BA).
>>=20
>> As the "central" part of the dependency is being added by iwlwifi driver=
=20
>> (_iwl_pcie_rx_init() allocating workqueue while holding=20
>> trans_pcie->mutex), I'm adding iwlwifi folks as well to CC.
>>=20
>> >=20
>> >  =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D
>> >  WARNING: possible circular locking dependency detected
>> >  4.13.0-rc2-00110-g0b5477d #347 Not tainted
>> >  ------------------------------------------------------
>> >  modprobe/881 is trying to acquire lock:
>> >   (&mvm->mutex){+.+.+.}, at: [<ffffffffc0c504d2>] iwl_mvm_tzone_get_te=
mp+0x32/0x80 [iwlmvm]
>> >=20=20
>> >  but task is already holding lock:
>> >   (&tz->lock){+.+.+.}, at: [<ffffffff926a9351>] thermal_zone_get_temp+=
0x41/0x70
>> >=20=20
>> >  which lock already depends on the new lock.
>> >=20
>> >=20=20
>> >  the existing dependency chain (in reverse order) is:
>> >=20=20
>> >  -> #5 (&tz->lock){+.+.+.}:
>> >         lock_acquire+0xbd/0x220
>> >         __mutex_lock+0x6e/0x900
>> >         mutex_lock_nested+0x1b/0x20
>> >         thermal_zone_get_temp+0x41/0x70
>> >         thermal_zone_device_update+0x3c/0x280
>> >         thermal_zone_device_register+0x3b8/0x610
>> >         pkg_thermal_cpu_online+0x20b/0x284 [x86_pkg_temp_thermal]
>> >         cpuhp_invoke_callback+0xac/0x900
>> >         cpuhp_thread_fun+0x79/0x160
>> >         smpboot_thread_fn+0x156/0x220
>> >         kthread+0x114/0x150
>> >         ret_from_fork+0x2a/0x40
>> >=20=20
>> >  -> #4 (cpuhp_state){+.+.+.}:
>> >         lock_acquire+0xbd/0x220
>> >         cpuhp_issue_call+0xea/0x170
>> >         __cpuhp_setup_state_cpuslocked+0x12a/0x190
>> >         __cpuhp_setup_state+0x46/0xc0
>> >         page_writeback_init+0x43/0x67
>> >         pagecache_init+0x39/0x3c
>> >         start_kernel+0x45a/0x4ae
>> >         x86_64_start_reservations+0x24/0x26
>> >         x86_64_start_kernel+0x13d/0x14c
>> >         verify_cpu+0x0/0xf1
>> >=20=20
>> >  -> #3 (cpuhp_state_mutex){+.+.+.}:
>> >         lock_acquire+0xbd/0x220
>> >         __mutex_lock+0x6e/0x900
>> >         mutex_lock_nested+0x1b/0x20
>> >         __cpuhp_setup_state_cpuslocked+0x4f/0x190
>> >         __cpuhp_setup_state+0x46/0xc0
>> >         page_alloc_init+0x28/0x30
>> >         start_kernel+0x186/0x4ae
>> >         x86_64_start_reservations+0x24/0x26
>> >         x86_64_start_kernel+0x13d/0x14c
>> >         verify_cpu+0x0/0xf1
>> >=20=20
>> >  -> #2 (cpu_hotplug_lock.rw_sem){++++++}:
>> >         lock_acquire+0xbd/0x220
>> >         cpus_read_lock+0x46/0x90
>> >         apply_workqueue_attrs+0x17/0x50
>> >         __alloc_workqueue_key+0x195/0x4d0
>> >         _iwl_pcie_rx_init+0x384/0x390 [iwlwifi]
>> >         iwl_pcie_rx_init+0x1e/0x380 [iwlwifi]
>> >         iwl_trans_pcie_start_fw+0x295/0x6f0 [iwlwifi]
>> >         iwl_mvm_load_ucode_wait_alive+0xe7/0x390 [iwlmvm]
>> >         iwl_run_init_mvm_ucode+0x84/0x320 [iwlmvm]
>> >         iwl_op_mode_mvm_start+0x964/0xd30 [iwlmvm]
>> >         _iwl_op_mode_start.isra.9+0x47/0xa0 [iwlwifi]
>> >         iwl_opmode_register+0xaa/0xd0 [iwlwifi]
>> >         iwl_mvm_init+0x37/0x1000 [iwlmvm]
>> >         do_one_initcall+0x51/0x1a9
>> >         do_init_module+0x60/0x20e
>> >         load_module+0x203f/0x2b50
>> >         SYSC_finit_module+0x96/0xd0
>> >         SyS_finit_module+0xe/0x10
>> >         entry_SYSCALL_64_fastpath+0x23/0xc2
>> >=20=20
>> >  -> #1 (&trans_pcie->mutex){+.+.+.}:
>> >         lock_acquire+0xbd/0x220
>> >         __mutex_lock+0x6e/0x900
>> >         mutex_lock_nested+0x1b/0x20
>> >         iwl_trans_pcie_start_fw+0x130/0x6f0 [iwlwifi]
>> >         iwl_mvm_load_ucode_wait_alive+0xe7/0x390 [iwlmvm]
>> >         iwl_run_init_mvm_ucode+0x84/0x320 [iwlmvm]
>> >         iwl_op_mode_mvm_start+0x964/0xd30 [iwlmvm]
>> >         _iwl_op_mode_start.isra.9+0x47/0xa0 [iwlwifi]
>> >         iwl_opmode_register+0xaa/0xd0 [iwlwifi]
>> >         iwl_mvm_init+0x37/0x1000 [iwlmvm]
>> >         do_one_initcall+0x51/0x1a9
>> >         do_init_module+0x60/0x20e
>> >         load_module+0x203f/0x2b50
>> >         SYSC_finit_module+0x96/0xd0
>> >         SyS_finit_module+0xe/0x10
>> >         entry_SYSCALL_64_fastpath+0x23/0xc2
>> >=20=20
>> >  -> #0 (&mvm->mutex){+.+.+.}:
>> >         __lock_acquire+0x13e1/0x1400
>> >         lock_acquire+0xbd/0x220
>> >         __mutex_lock+0x6e/0x900
>> >         mutex_lock_nested+0x1b/0x20
>> >         iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm]
>> >         thermal_zone_get_temp+0x51/0x70
>> >         thermal_zone_device_update+0x3c/0x280
>> >         thermal_zone_device_register+0x3b8/0x610
>> >         iwl_mvm_thermal_initialize+0x1d1/0x3a0 [iwlmvm]
>> >         iwl_op_mode_mvm_start+0xa1d/0xd30 [iwlmvm]
>> >         _iwl_op_mode_start.isra.9+0x47/0xa0 [iwlwifi]
>> >         iwl_opmode_register+0xaa/0xd0 [iwlwifi]
>> >         iwl_mvm_init+0x37/0x1000 [iwlmvm]
>> >         do_one_initcall+0x51/0x1a9
>> >         do_init_module+0x60/0x20e
>> >         load_module+0x203f/0x2b50
>> >         SYSC_finit_module+0x96/0xd0
>> >         SyS_finit_module+0xe/0x10
>> >         entry_SYSCALL_64_fastpath+0x23/0xc2
>> >=20=20
>> >  other info that might help us debug this:
>> >=20
>> >  Chain exists of:
>> >    &mvm->mutex --> cpuhp_state --> &tz->lock
>> >=20
>> >   Possible unsafe locking scenario:
>> >=20
>> > =C2=A0       CPU0        =C2=A0           CPU1
>> >         ----                    ----
>> >    lock(&tz->lock);
>> >                                 lock(cpuhp_state);
>> >                                 lock(&tz->lock);
>> >    lock(&mvm->mutex);
>> >=20=20
>> >   *** DEADLOCK ***
>> >=20
>> >  2 locks held by modprobe/881:
>> >   #0:  (&iwlwifi_opmode_table_mtx){+.+.+.}, at: [<ffffffffc0aa2df4>] i=
wl_opmode_register+0x24/0xd0 [iwlwifi]
>> >   #1:  (&tz->lock){+.+.+.}, at: [<ffffffff926a9351>] thermal_zone_get_=
temp+0x41/0x70
>> >=20=20
>> >  stack backtrace:
>> >  CPU: 3 PID: 881 Comm: modprobe Not tainted 4.13.0-rc2-00110-g0b5477d =
#347
>> >  Hardware name: LENOVO 20K5S22R00/20K5S22R00, BIOS R0IET38W (1.16 ) 05=
/31/2017
>> >  Call Trace:
>> >   dump_stack+0x85/0xc9
>> >   print_circular_bug+0x1f9/0x207
>> >   __lock_acquire+0x13e1/0x1400
>> >   lock_acquire+0xbd/0x220
>> >   ? iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm]
>> >   ? iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm]
>> >   __mutex_lock+0x6e/0x900
>> >   ? iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm]
>> >   ? thermal_zone_get_temp+0x41/0x70
>> >   ? iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm]
>> >   ? thermal_zone_get_temp+0x41/0x70
>> >   ? find_held_lock+0x39/0xb0
>> >   mutex_lock_nested+0x1b/0x20
>> >   iwl_mvm_tzone_get_temp+0x32/0x80 [iwlmvm]
>> >   thermal_zone_get_temp+0x51/0x70
>> >   thermal_zone_device_update+0x3c/0x280
>> >   thermal_zone_device_register+0x3b8/0x610
>> >   iwl_mvm_thermal_initialize+0x1d1/0x3a0 [iwlmvm]
>> >   iwl_op_mode_mvm_start+0xa1d/0xd30 [iwlmvm]
>> >   _iwl_op_mode_start.isra.9+0x47/0xa0 [iwlwifi]
>> >   iwl_opmode_register+0xaa/0xd0 [iwlwifi]
>> >   iwl_mvm_init+0x37/0x1000 [iwlmvm]
>> >   ? 0xffffffffc0c87000
>> >   do_one_initcall+0x51/0x1a9
>> >   ? rcu_read_lock_sched_held+0x98/0xa0
>> >   ? kmem_cache_alloc_trace+0x2a5/0x340
>> >   do_init_module+0x60/0x20e
>> >   load_module+0x203f/0x2b50
>> >   ? __symbol_put+0x50/0x50
>> >   SYSC_finit_module+0x96/0xd0
>> >   SyS_finit_module+0xe/0x10
>> >   entry_SYSCALL_64_fastpath+0x23/0xc2
>> >  RIP: 0033:0x7f2ef067cc89
>> >  RSP: 002b:00007ffea2ea3d78 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>> >  RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f2ef067cc89
>> >  RDX: 0000000000000000 RSI: 000000000041af06 RDI: 0000000000000001
>> >  RBP: 0000000000000005 R08: 0000000000000000 R09: 000000000096b230
>> >  R10: 0000000000000001 R11: 0000000000000246 R12: 00007ffea2ea2d80
>> >  R13: 00007ffea2ea2d60 R14: 0000000000000005 R15: 000000000096f3c0
>> >  thermal thermal_zone3: failed to read out thermal zone (-5)
>
> CCing David Weinehall who also just reported this to me.
>
> We'll check this ASAP.  Thanks for reporting!

Adding linux-wireless also to the loop.

--=20
Kalle Valo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [linuxwifi] x86/thermal: AB-BA dependency between mvm->mutex and tz->lock
  2017-08-03 10:02     ` [linuxwifi] x86/thermal: AB-BA dependency between mvm->mutex and tz->lock Kalle Valo
@ 2017-08-03 11:28       ` Coelho, Luciano
  2017-08-03 11:34         ` Jiri Kosina
  0 siblings, 1 reply; 17+ messages in thread
From: Coelho, Luciano @ 2017-08-03 11:28 UTC (permalink / raw)
  To: jikos
  Cc: linux-kernel, linuxwifi, Zhang, Rui, edubezval, linux-pm,
	Weinehall, David, Berg, Johannes, kvalo, Sharon, Sara,
	linux-wireless, Grumbach, Emmanuel

T24gVGh1LCAyMDE3LTA4LTAzIGF0IDEzOjAyICswMzAwLCBLYWxsZSBWYWxvIHdyb3RlOg0KPiAi
Q29lbGhvLCBMdWNpYW5vIiA8bHVjaWFuby5jb2VsaG9AaW50ZWwuY29tPiB3cml0ZXM6DQo+IA0K
PiA+IE9uIFRodSwgMjAxNy0wOC0wMyBhdCAxMToxMCArMDIwMCwgSmlyaSBLb3NpbmEgd3JvdGU6
DQo+ID4gPiBPbiBNb24sIDMxIEp1bCAyMDE3LCBKaXJpIEtvc2luYSB3cm90ZToNCj4gPiA+IA0K
PiA+ID4gPiBIaSwNCj4gPiA+ID4gDQo+ID4gPiA+IGJvb3RpbmcgY3VycmVudCBMaW51cycgdHJl
ZSwgSSdtIHNlZWluZyBsb2NrZGVwIHNwbGF0IChzZWUgdGhlIGVuZCBvZiB0aGlzIA0KPiA+ID4g
PiBtYWlsKS4NCj4gPiA+ID4gDQo+ID4gPiA+IEFwcGFyZW50bHksIHRoZXJlIGlzIEFCLUJBIGJl
dHdlZW4gdHotPmxvY2sgYW5kIG12bS0+bXV0ZXggdGhyb3VnaCB0aGUgQ1BVIA0KPiA+ID4gPiBo
b3RwbHVnIGxvY2suDQo+ID4gPiA+IA0KPiA+ID4gPiBUaGUgb2Jpdm91cyBkZXBlbmN5IGlzOiB0
aGVybWFsX3pvbmVfZ2V0X3RlbXAoKSBhY3F1aXJlcyB0ei0+bG9jaywgYW5kIA0KPiA+ID4gPiB0
aGVuIGNhbGxzIGl3bF9tdm1fdHpvbmVfZ2V0X3RlbXAoKSAodGhyb3VnaCB0ei0+b3BzLT5nZXRf
dGVtcCgpIA0KPiA+ID4gPiBjYWxsYmFjayksIHdoaWNoIGFjcXVpcmVzIG12bS0+bXV0ZXgNCj4g
PiA+ID4gDQo+ID4gPiA+IFRoZSBsZXNzIG9idmlvdXMgZGVwZW5kZW5jeSBpcyBwcmltYXJpbHkg
Y2F1c2VkIGJ5IGl3bF9vcF9tb2RlX212bV9zdGFydCgpIA0KPiA+ID4gPiBhbGxvY2F0aW5nIHdv
cmtxdWV1ZSAoIzIgc3RhY2t0cmFjZSkgd2hpbGUgaG9sZGluZyBtdm0tPm11dGV4ICh3aGljaCBp
cyANCj4gPiA+ID4gYnJva2VuLCBiZWNhdXNlIHRoYXQgbXV0ZXggaXMgYmVpbmcgdGFrZW4gYWxz
byBmcm9tIENQVSBob3RwbHVnIGNhbGxiYWNrIA0KPiA+ID4gPiBwYXRoLCBoZW5jZSB0aGUgQUIt
QkEpLg0KPiA+ID4gDQo+ID4gPiBBcyB0aGUgImNlbnRyYWwiIHBhcnQgb2YgdGhlIGRlcGVuZGVu
Y3kgaXMgYmVpbmcgYWRkZWQgYnkgaXdsd2lmaSBkcml2ZXIgDQo+ID4gPiAoX2l3bF9wY2llX3J4
X2luaXQoKSBhbGxvY2F0aW5nIHdvcmtxdWV1ZSB3aGlsZSBob2xkaW5nIA0KPiA+ID4gdHJhbnNf
cGNpZS0+bXV0ZXgpLCBJJ20gYWRkaW5nIGl3bHdpZmkgZm9sa3MgYXMgd2VsbCB0byBDQy4NCg0K
Wy4uLl0NCg0KPiA+ID4gPiAgLT4gIzIgKGNwdV9ob3RwbHVnX2xvY2sucndfc2VtKXsrKysrKyt9
Og0KPiA+ID4gPiAgICAgICAgIGxvY2tfYWNxdWlyZSsweGJkLzB4MjIwDQo+ID4gPiA+ICAgICAg
ICAgY3B1c19yZWFkX2xvY2srMHg0Ni8weDkwDQo+ID4gPiA+ICAgICAgICAgYXBwbHlfd29ya3F1
ZXVlX2F0dHJzKzB4MTcvMHg1MA0KPiA+ID4gPiAgICAgICAgIF9fYWxsb2Nfd29ya3F1ZXVlX2tl
eSsweDE5NS8weDRkMA0KPiA+ID4gPiAgICAgICAgIF9pd2xfcGNpZV9yeF9pbml0KzB4Mzg0LzB4
MzkwIFtpd2x3aWZpXQ0KPiA+ID4gPiAgICAgICAgIGl3bF9wY2llX3J4X2luaXQrMHgxZS8weDM4
MCBbaXdsd2lmaV0NCj4gPiA+ID4gICAgICAgICBpd2xfdHJhbnNfcGNpZV9zdGFydF9mdysweDI5
NS8weDZmMCBbaXdsd2lmaV0NCj4gPiA+ID4gICAgICAgICBpd2xfbXZtX2xvYWRfdWNvZGVfd2Fp
dF9hbGl2ZSsweGU3LzB4MzkwIFtpd2xtdm1dDQo+ID4gPiA+ICAgICAgICAgaXdsX3J1bl9pbml0
X212bV91Y29kZSsweDg0LzB4MzIwIFtpd2xtdm1dDQo+ID4gPiA+ICAgICAgICAgaXdsX29wX21v
ZGVfbXZtX3N0YXJ0KzB4OTY0LzB4ZDMwIFtpd2xtdm1dDQo+ID4gPiA+ICAgICAgICAgX2l3bF9v
cF9tb2RlX3N0YXJ0LmlzcmEuOSsweDQ3LzB4YTAgW2l3bHdpZmldDQo+ID4gPiA+ICAgICAgICAg
aXdsX29wbW9kZV9yZWdpc3RlcisweGFhLzB4ZDAgW2l3bHdpZmldDQo+ID4gPiA+ICAgICAgICAg
aXdsX212bV9pbml0KzB4MzcvMHgxMDAwIFtpd2xtdm1dDQo+ID4gPiA+ICAgICAgICAgZG9fb25l
X2luaXRjYWxsKzB4NTEvMHgxYTkNCj4gPiA+ID4gICAgICAgICBkb19pbml0X21vZHVsZSsweDYw
LzB4MjBlDQo+ID4gPiA+ICAgICAgICAgbG9hZF9tb2R1bGUrMHgyMDNmLzB4MmI1MA0KPiA+ID4g
PiAgICAgICAgIFNZU0NfZmluaXRfbW9kdWxlKzB4OTYvMHhkMA0KPiA+ID4gPiAgICAgICAgIFN5
U19maW5pdF9tb2R1bGUrMHhlLzB4MTANCj4gPiA+ID4gICAgICAgICBlbnRyeV9TWVNDQUxMXzY0
X2Zhc3RwYXRoKzB4MjMvMHhjMg0KDQpPa2F5LCBzbyBhcyBJIHVuZGVyc3RhbmQgaXQgdGhlIHBy
b2JsZW0gaGFzIGJlZW4gdGhlcmUgZm9yIGEgbG9uZyB0aW1lLA0KYnV0IHRoZSBzcGxhdCBpcyBv
bmx5IGNvbWluZyB1cCBub3cgYmVjYXVzZSBvZiBUaG9tYXMnIHBhdGNoIHRoYXQgYWRkcw0KdGhl
IGxvY2tkZXAgbWFwWzFdLCByaWdodD8NCg0KSSBzZWUgdGhlIHdvcmtxdWV1ZSBhbGxvY2F0aW9u
IHlvdSBtZW50aW9uZWQuICBJJ2xsIHRyeSB0byBtb3ZlIHRoaXMNCmFsbG9jYXRpb24gb3V0IG9m
IHRoZSBtdXRleCBhbmQgc2VlIGhvdyBpdCBnb2VzLg0KDQpbMV0gaHR0cDovL2xrbWwua2VybmVs
Lm9yZy9yLzIwMTcwNTI0MDgxNTQ5LjcwOTM3NTg0NUBsaW51dHJvbml4LmRlDQoNCi0tDQpDaGVl
cnMsDQpMdWNhLg==

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [linuxwifi] x86/thermal: AB-BA dependency between mvm->mutex and tz->lock
  2017-08-03 11:28       ` Coelho, Luciano
@ 2017-08-03 11:34         ` Jiri Kosina
  2017-08-03 13:09           ` Jiri Kosina
  0 siblings, 1 reply; 17+ messages in thread
From: Jiri Kosina @ 2017-08-03 11:34 UTC (permalink / raw)
  To: Coelho, Luciano
  Cc: linux-kernel, linuxwifi, Zhang, Rui, edubezval, linux-pm,
	Weinehall, David, Berg, Johannes, kvalo, Sharon, Sara,
	linux-wireless, Grumbach, Emmanuel

On Thu, 3 Aug 2017, Coelho, Luciano wrote:

> Okay, so as I understand it the problem has been there for a long time,
> but the splat is only coming up now because of Thomas' patch that adds
> the lockdep map[1], right?

Yeah, sorry, forgot to mention that pre-49dfe2a67797 kernels wouldn't 
produce this, as there would not be aware of the fact that 
cpus_read_lock() is actually semantically a lock.

> I see the workqueue allocation you mentioned.  I'll try to move this 
> allocation out of the mutex and see how it goes.

I have been briefly looking into this as well -- it'll basically have to 
be moved out of the trans_pcie->mutex context, but

(a) I'm not sure whether that's actually safe
(b) iwl_pcie_rx_reuse_rbd() (which is where corresponding work is being 
    queued) is not a proper context either (it's atomic context)

Thanks,

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [linuxwifi] x86/thermal: AB-BA dependency between mvm->mutex and tz->lock
  2017-08-03 11:34         ` Jiri Kosina
@ 2017-08-03 13:09           ` Jiri Kosina
  2017-08-17 13:38             ` Jiri Kosina
  0 siblings, 1 reply; 17+ messages in thread
From: Jiri Kosina @ 2017-08-03 13:09 UTC (permalink / raw)
  To: Coelho, Luciano
  Cc: linux-kernel, linuxwifi, Zhang, Rui, edubezval, linux-pm,
	Weinehall, David, Berg, Johannes, kvalo, Sharon, Sara,
	linux-wireless, Grumbach, Emmanuel

On Thu, 3 Aug 2017, Jiri Kosina wrote:

> > I see the workqueue allocation you mentioned.  I'll try to move this 
> > allocation out of the mutex and see how it goes.
> 
> I have been briefly looking into this as well -- it'll basically have to 
> be moved out of the trans_pcie->mutex context, but
> 
> (a) I'm not sure whether that's actually safe
> (b) iwl_pcie_rx_reuse_rbd() (which is where corresponding work is being 
>     queued) is not a proper context either (it's atomic context)

Actually moving it out of trans_pcie->mutex is likely not to be enough, 
the dependency would still be there, just the graph will have one vertex 
less, with the dependency going directly from mvm mutex to 
cpu_hotplug_lock.

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [linuxwifi] x86/thermal: AB-BA dependency between mvm->mutex and tz->lock
  2017-08-03 13:09           ` Jiri Kosina
@ 2017-08-17 13:38             ` Jiri Kosina
  2017-08-17 14:02               ` Coelho, Luciano
                                 ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Jiri Kosina @ 2017-08-17 13:38 UTC (permalink / raw)
  To: Coelho, Luciano
  Cc: linux-kernel, linuxwifi, Zhang, Rui, edubezval, linux-pm,
	Weinehall, David, Berg, Johannes, kvalo, Sharon, Sara,
	linux-wireless, Grumbach, Emmanuel

Hi,

anything new on this front please?

The splat (and therefore deadlock potential) is still there with current 
Linus' tree.

Thanks,

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [linuxwifi] x86/thermal: AB-BA dependency between mvm->mutex and tz->lock
  2017-08-17 13:38             ` Jiri Kosina
@ 2017-08-17 14:02               ` Coelho, Luciano
  2017-08-22  7:32               ` [PATCH] iwlwifi: pcie: move rx workqueue initialization to iwl_trans_pcie_alloc() Luca Coelho
  2017-08-22  7:37               ` [PATCH v2] " Luca Coelho
  2 siblings, 0 replies; 17+ messages in thread
From: Coelho, Luciano @ 2017-08-17 14:02 UTC (permalink / raw)
  To: jikos
  Cc: linux-kernel, linuxwifi, Zhang, Rui, Weinehall, David, linux-pm,
	edubezval, Berg, Johannes, kvalo, Sharon, Sara, linux-wireless,
	Grumbach, Emmanuel

T24gVGh1LCAyMDE3LTA4LTE3IGF0IDE1OjM4ICswMjAwLCBKaXJpIEtvc2luYSB3cm90ZToNCj4g
SGksDQo+IA0KPiBhbnl0aGluZyBuZXcgb24gdGhpcyBmcm9udCBwbGVhc2U/DQo+IA0KPiBUaGUg
c3BsYXQgKGFuZCB0aGVyZWZvcmUgZGVhZGxvY2sgcG90ZW50aWFsKSBpcyBzdGlsbCB0aGVyZSB3
aXRoIGN1cnJlbnQgDQo+IExpbnVzJyB0cmVlLg0KDQpTb3JyeSwgaGF2ZW4ndCBoYWQgbW9yZSB0
aW1lIHRvIHNwZW5kIG9uIGl0LiAgSSdsbCBkbyBpdCB0aGlzIGV2ZW5pbmcuDQoNCkJ1dCwganVz
dCB0byBjbGFyaWZ5LCB0aGUgZGVhZGxvY2sgcG90ZW50aWFsIGhhcyBiZWVuIHRoZXJlIGZvciBh
IHdoaWxlLA0KcmlnaHQ/IFRoZSBvbmx5IGRpZmZlcmVuY2UgaXMgdGhhdCBub3cgd2UgZ2V0IHRo
ZSBzcGxhdC4NCg0KTm90IHNheWluZyB3ZSBzaG91bGRuJ3QgZml4IGl0IHRob3VnaC4gOykNCg0K
LS0NCkx1Y2Eu

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH] iwlwifi: pcie: move rx workqueue initialization to iwl_trans_pcie_alloc()
  2017-08-17 13:38             ` Jiri Kosina
  2017-08-17 14:02               ` Coelho, Luciano
@ 2017-08-22  7:32               ` Luca Coelho
  2017-08-22  7:37               ` [PATCH v2] " Luca Coelho
  2 siblings, 0 replies; 17+ messages in thread
From: Luca Coelho @ 2017-08-22  7:32 UTC (permalink / raw)
  To: jikos, linux-wireless
  Cc: luciano.coelho, linux-kernel, linuxwifi, rui.zhang, edubezval,
	linux-pm, david.weinehall, johannes.berg, kvalo, sara.sharon,
	emmanuel.grumbach

From: Luca Coelho <luciano.coelho@intel.com>

Work queues cannot be allocated in when a mutex is held because the
mutex may be in use and that would make it sleep.  Doing so generates
the following splat with 4.13+:

[   19.513298] ======================================================
[   19.513429] WARNING: possible circular locking dependency detected
[   19.513557] 4.13.0-rc5+ #6 Not tainted
[   19.513638] ------------------------------------------------------
[   19.513767] cpuhp/0/12 is trying to acquire lock:
[   19.513867]  (&tz->lock){+.+.+.}, at: [<ffffffff924afebb>] thermal_zone_get_temp+0x5b/0xb0
[   19.514047]
[   19.514047] but task is already holding lock:
[   19.514166]  (cpuhp_state){+.+.+.}, at: [<ffffffff91cc4baa>] cpuhp_thread_fun+0x3a/0x210
[   19.514338]
[   19.514338] which lock already depends on the new lock.

This lock dependency already existed with previous kernel versions,
but it was not detected until commit ... was introduced.

Reported-by: David Weinehall <david.weinehall@intel.com>
Reported-by: Jiri Kosina <jikos@kernel.org>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
---
 drivers/net/wireless/intel/iwlwifi/pcie/internal.h |  2 ++
 drivers/net/wireless/intel/iwlwifi/pcie/rx.c       | 10 +---------
 drivers/net/wireless/intel/iwlwifi/pcie/trans.c    |  9 +++++++++
 3 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h
index fa315d84e98e..a1ea9ef97ed9 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h
@@ -787,6 +787,8 @@ int iwl_pci_fw_enter_d0i3(struct iwl_trans *trans);
 
 void iwl_pcie_enable_rx_wake(struct iwl_trans *trans, bool enable);
 
+void iwl_pcie_rx_allocator_work(struct work_struct *data);
+
 /* common functions that are used by gen2 transport */
 void iwl_pcie_apm_config(struct iwl_trans *trans);
 int iwl_pcie_prepare_card_hw(struct iwl_trans *trans);
diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c
index 351c4423125a..942736d3fa75 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c
@@ -597,7 +597,7 @@ static void iwl_pcie_rx_allocator_get(struct iwl_trans *trans,
 	rxq->free_count += RX_CLAIM_REQ_ALLOC;
 }
 
-static void iwl_pcie_rx_allocator_work(struct work_struct *data)
+void iwl_pcie_rx_allocator_work(struct work_struct *data)
 {
 	struct iwl_rb_allocator *rba_p =
 		container_of(data, struct iwl_rb_allocator, rx_alloc);
@@ -900,10 +900,6 @@ static int _iwl_pcie_rx_init(struct iwl_trans *trans)
 			return err;
 	}
 	def_rxq = trans_pcie->rxq;
-	if (!rba->alloc_wq)
-		rba->alloc_wq = alloc_workqueue("rb_allocator",
-						WQ_HIGHPRI | WQ_UNBOUND, 1);
-	INIT_WORK(&rba->rx_alloc, iwl_pcie_rx_allocator_work);
 
 	spin_lock(&rba->lock);
 	atomic_set(&rba->req_pending, 0);
@@ -1017,10 +1013,6 @@ void iwl_pcie_rx_free(struct iwl_trans *trans)
 	}
 
 	cancel_work_sync(&rba->rx_alloc);
-	if (rba->alloc_wq) {
-		destroy_workqueue(rba->alloc_wq);
-		rba->alloc_wq = NULL;
-	}
 
 	iwl_pcie_free_rbs_pool(trans);
 
diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/trans.c b/drivers/net/wireless/intel/iwlwifi/pcie/trans.c
index f95eec52508e..3927bbf04f72 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/trans.c
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/trans.c
@@ -1786,6 +1786,11 @@ void iwl_trans_pcie_free(struct iwl_trans *trans)
 		iwl_pcie_tx_free(trans);
 	iwl_pcie_rx_free(trans);
 
+	if (trans_pcie->rba.alloc_wq) {
+		destroy_workqueue(trans_pcie->rba.alloc_wq);
+		trans_pcie->rba.alloc_wq = NULL;
+	}
+
 	if (trans_pcie->msix_enabled) {
 		for (i = 0; i < trans_pcie->alloc_vecs; i++) {
 			irq_set_affinity_hint(
@@ -3169,6 +3174,10 @@ struct iwl_trans *iwl_trans_pcie_alloc(struct pci_dev *pdev,
 		trans_pcie->inta_mask = CSR_INI_SET_MASK;
 	 }
 
+	trans_pcie->rba.alloc_wq = alloc_workqueue("rb_allocator",
+						   WQ_HIGHPRI | WQ_UNBOUND, 1);
+	INIT_WORK(&trans_pcie->rba.rx_alloc, iwl_pcie_rx_allocator_work);
+
 #ifdef CONFIG_IWLWIFI_PCIE_RTPM
 	trans->runtime_pm_mode = IWL_PLAT_PM_MODE_D0I3;
 #else
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2] iwlwifi: pcie: move rx workqueue initialization to iwl_trans_pcie_alloc()
  2017-08-17 13:38             ` Jiri Kosina
  2017-08-17 14:02               ` Coelho, Luciano
  2017-08-22  7:32               ` [PATCH] iwlwifi: pcie: move rx workqueue initialization to iwl_trans_pcie_alloc() Luca Coelho
@ 2017-08-22  7:37               ` Luca Coelho
  2017-08-24  6:00                 ` Luca Coelho
                                   ` (2 more replies)
  2 siblings, 3 replies; 17+ messages in thread
From: Luca Coelho @ 2017-08-22  7:37 UTC (permalink / raw)
  To: jikos, linux-wireless
  Cc: luciano.coelho, linux-kernel, linuxwifi, rui.zhang, edubezval,
	linux-pm, david.weinehall, johannes.berg, kvalo, sara.sharon,
	emmanuel.grumbach

From: Luca Coelho <luciano.coelho@intel.com>

Work queues cannot be allocated when a mutex is held because the mutex
may be in use and that would make it sleep.  Doing so generates the
following splat with 4.13+:

[   19.513298] ======================================================
[   19.513429] WARNING: possible circular locking dependency detected
[   19.513557] 4.13.0-rc5+ #6 Not tainted
[   19.513638] ------------------------------------------------------
[   19.513767] cpuhp/0/12 is trying to acquire lock:
[   19.513867]  (&tz->lock){+.+.+.}, at: [<ffffffff924afebb>] thermal_zone_get_temp+0x5b/0xb0
[   19.514047]
[   19.514047] but task is already holding lock:
[   19.514166]  (cpuhp_state){+.+.+.}, at: [<ffffffff91cc4baa>] cpuhp_thread_fun+0x3a/0x210
[   19.514338]
[   19.514338] which lock already depends on the new lock.

This lock dependency already existed with previous kernel versions,
but it was not detected until commit 49dfe2a67797 ("cpuhotplug: Link
lock stacks for hotplug callbacks") was introduced.

Reported-by: David Weinehall <david.weinehall@intel.com>
Reported-by: Jiri Kosina <jikos@kernel.org>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
---
In v2:
   - updated the commit message to a new version, with a grammar fix
     and the actual commit that exposed the problem;

 drivers/net/wireless/intel/iwlwifi/pcie/internal.h |  2 ++
 drivers/net/wireless/intel/iwlwifi/pcie/rx.c       | 10 +---------
 drivers/net/wireless/intel/iwlwifi/pcie/trans.c    |  9 +++++++++
 3 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h
index fa315d84e98e..a1ea9ef97ed9 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h
@@ -787,6 +787,8 @@ int iwl_pci_fw_enter_d0i3(struct iwl_trans *trans);
 
 void iwl_pcie_enable_rx_wake(struct iwl_trans *trans, bool enable);
 
+void iwl_pcie_rx_allocator_work(struct work_struct *data);
+
 /* common functions that are used by gen2 transport */
 void iwl_pcie_apm_config(struct iwl_trans *trans);
 int iwl_pcie_prepare_card_hw(struct iwl_trans *trans);
diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c
index 351c4423125a..942736d3fa75 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c
@@ -597,7 +597,7 @@ static void iwl_pcie_rx_allocator_get(struct iwl_trans *trans,
 	rxq->free_count += RX_CLAIM_REQ_ALLOC;
 }
 
-static void iwl_pcie_rx_allocator_work(struct work_struct *data)
+void iwl_pcie_rx_allocator_work(struct work_struct *data)
 {
 	struct iwl_rb_allocator *rba_p =
 		container_of(data, struct iwl_rb_allocator, rx_alloc);
@@ -900,10 +900,6 @@ static int _iwl_pcie_rx_init(struct iwl_trans *trans)
 			return err;
 	}
 	def_rxq = trans_pcie->rxq;
-	if (!rba->alloc_wq)
-		rba->alloc_wq = alloc_workqueue("rb_allocator",
-						WQ_HIGHPRI | WQ_UNBOUND, 1);
-	INIT_WORK(&rba->rx_alloc, iwl_pcie_rx_allocator_work);
 
 	spin_lock(&rba->lock);
 	atomic_set(&rba->req_pending, 0);
@@ -1017,10 +1013,6 @@ void iwl_pcie_rx_free(struct iwl_trans *trans)
 	}
 
 	cancel_work_sync(&rba->rx_alloc);
-	if (rba->alloc_wq) {
-		destroy_workqueue(rba->alloc_wq);
-		rba->alloc_wq = NULL;
-	}
 
 	iwl_pcie_free_rbs_pool(trans);
 
diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/trans.c b/drivers/net/wireless/intel/iwlwifi/pcie/trans.c
index f95eec52508e..3927bbf04f72 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/trans.c
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/trans.c
@@ -1786,6 +1786,11 @@ void iwl_trans_pcie_free(struct iwl_trans *trans)
 		iwl_pcie_tx_free(trans);
 	iwl_pcie_rx_free(trans);
 
+	if (trans_pcie->rba.alloc_wq) {
+		destroy_workqueue(trans_pcie->rba.alloc_wq);
+		trans_pcie->rba.alloc_wq = NULL;
+	}
+
 	if (trans_pcie->msix_enabled) {
 		for (i = 0; i < trans_pcie->alloc_vecs; i++) {
 			irq_set_affinity_hint(
@@ -3169,6 +3174,10 @@ struct iwl_trans *iwl_trans_pcie_alloc(struct pci_dev *pdev,
 		trans_pcie->inta_mask = CSR_INI_SET_MASK;
 	 }
 
+	trans_pcie->rba.alloc_wq = alloc_workqueue("rb_allocator",
+						   WQ_HIGHPRI | WQ_UNBOUND, 1);
+	INIT_WORK(&trans_pcie->rba.rx_alloc, iwl_pcie_rx_allocator_work);
+
 #ifdef CONFIG_IWLWIFI_PCIE_RTPM
 	trans->runtime_pm_mode = IWL_PLAT_PM_MODE_D0I3;
 #else
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v2] iwlwifi: pcie: move rx workqueue initialization to iwl_trans_pcie_alloc()
  2017-08-22  7:37               ` [PATCH v2] " Luca Coelho
@ 2017-08-24  6:00                 ` Luca Coelho
  2017-08-24 19:56                   ` Jiri Kosina
  2017-08-24 13:50                 ` [v2] " Kalle Valo
  2017-08-30 14:57                 ` [PATCH v2] " David Weinehall
  2 siblings, 1 reply; 17+ messages in thread
From: Luca Coelho @ 2017-08-24  6:00 UTC (permalink / raw)
  To: jikos, linux-wireless
  Cc: linux-kernel, linuxwifi, rui.zhang, edubezval, linux-pm,
	david.weinehall, johannes.berg, kvalo, sara.sharon,
	emmanuel.grumbach

On Tue, 2017-08-22 at 10:37 +0300, Luca Coelho wrote:
> From: Luca Coelho <luciano.coelho@intel.com>
> 
> Work queues cannot be allocated when a mutex is held because the mutex
> may be in use and that would make it sleep.  Doing so generates the
> following splat with 4.13+:
> 
> [   19.513298] ======================================================
> [   19.513429] WARNING: possible circular locking dependency detected
> [   19.513557] 4.13.0-rc5+ #6 Not tainted
> [   19.513638] ------------------------------------------------------
> [   19.513767] cpuhp/0/12 is trying to acquire lock:
> [   19.513867]  (&tz->lock){+.+.+.}, at: [<ffffffff924afebb>] thermal_zone_get_temp+0x5b/0xb0
> [   19.514047]
> [   19.514047] but task is already holding lock:
> [   19.514166]  (cpuhp_state){+.+.+.}, at: [<ffffffff91cc4baa>] cpuhp_thread_fun+0x3a/0x210
> [   19.514338]
> [   19.514338] which lock already depends on the new lock.
> 
> This lock dependency already existed with previous kernel versions,
> but it was not detected until commit 49dfe2a67797 ("cpuhotplug: Link
> lock stacks for hotplug callbacks") was introduced.
> 
> Reported-by: David Weinehall <david.weinehall@intel.com>
> Reported-by: Jiri Kosina <jikos@kernel.org>
> Signed-off-by: Luca Coelho <luciano.coelho@intel.com>

Jiri, did you have a chance to try this out? I'm about to ask Kalle to
merge this so it gets in in time for 4.13-rc7.

--
Cheers,
Luca.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [v2] iwlwifi: pcie: move rx workqueue initialization to iwl_trans_pcie_alloc()
  2017-08-22  7:37               ` [PATCH v2] " Luca Coelho
  2017-08-24  6:00                 ` Luca Coelho
@ 2017-08-24 13:50                 ` Kalle Valo
  2017-08-30 14:57                 ` [PATCH v2] " David Weinehall
  2 siblings, 0 replies; 17+ messages in thread
From: Kalle Valo @ 2017-08-24 13:50 UTC (permalink / raw)
  To: Luciano Coelho
  Cc: jikos, linux-wireless, luciano.coelho, linux-kernel, linuxwifi,
	rui.zhang, edubezval, linux-pm, david.weinehall, johannes.berg,
	sara.sharon, emmanuel.grumbach

Luciano Coelho <luca@coelho.fi> wrote:

> From: Luca Coelho <luciano.coelho@intel.com>
> 
> Work queues cannot be allocated when a mutex is held because the mutex
> may be in use and that would make it sleep.  Doing so generates the
> following splat with 4.13+:
> 
> [   19.513298] ======================================================
> [   19.513429] WARNING: possible circular locking dependency detected
> [   19.513557] 4.13.0-rc5+ #6 Not tainted
> [   19.513638] ------------------------------------------------------
> [   19.513767] cpuhp/0/12 is trying to acquire lock:
> [   19.513867]  (&tz->lock){+.+.+.}, at: [<ffffffff924afebb>] thermal_zone_get_temp+0x5b/0xb0
> [   19.514047]
> [   19.514047] but task is already holding lock:
> [   19.514166]  (cpuhp_state){+.+.+.}, at: [<ffffffff91cc4baa>] cpuhp_thread_fun+0x3a/0x210
> [   19.514338]
> [   19.514338] which lock already depends on the new lock.
> 
> This lock dependency already existed with previous kernel versions,
> but it was not detected until commit 49dfe2a67797 ("cpuhotplug: Link
> lock stacks for hotplug callbacks") was introduced.
> 
> Reported-by: David Weinehall <david.weinehall@intel.com>
> Reported-by: Jiri Kosina <jikos@kernel.org>
> Signed-off-by: Luca Coelho <luciano.coelho@intel.com>

Patch applied to wireless-drivers.git, thanks.

10a54d8196d1 iwlwifi: pcie: move rx workqueue initialization to iwl_trans_pcie_alloc()

-- 
https://patchwork.kernel.org/patch/9914349/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2] iwlwifi: pcie: move rx workqueue initialization to iwl_trans_pcie_alloc()
  2017-08-24  6:00                 ` Luca Coelho
@ 2017-08-24 19:56                   ` Jiri Kosina
  2017-08-24 20:00                     ` Luca Coelho
  0 siblings, 1 reply; 17+ messages in thread
From: Jiri Kosina @ 2017-08-24 19:56 UTC (permalink / raw)
  To: Luca Coelho
  Cc: linux-wireless, linux-kernel, linuxwifi, rui.zhang, edubezval,
	linux-pm, david.weinehall, johannes.berg, kvalo, sara.sharon,
	emmanuel.grumbach

On Thu, 24 Aug 2017, Luca Coelho wrote:

> > Work queues cannot be allocated when a mutex is held because the mutex
> > may be in use and that would make it sleep.  Doing so generates the
> > following splat with 4.13+:
> > 
> > [   19.513298] ======================================================
> > [   19.513429] WARNING: possible circular locking dependency detected
> > [   19.513557] 4.13.0-rc5+ #6 Not tainted
> > [   19.513638] ------------------------------------------------------
> > [   19.513767] cpuhp/0/12 is trying to acquire lock:
> > [   19.513867]  (&tz->lock){+.+.+.}, at: [<ffffffff924afebb>] thermal_zone_get_temp+0x5b/0xb0
> > [   19.514047]
> > [   19.514047] but task is already holding lock:
> > [   19.514166]  (cpuhp_state){+.+.+.}, at: [<ffffffff91cc4baa>] cpuhp_thread_fun+0x3a/0x210
> > [   19.514338]
> > [   19.514338] which lock already depends on the new lock.
> > 
> > This lock dependency already existed with previous kernel versions,
> > but it was not detected until commit 49dfe2a67797 ("cpuhotplug: Link
> > lock stacks for hotplug callbacks") was introduced.
> > 
> > Reported-by: David Weinehall <david.weinehall@intel.com>
> > Reported-by: Jiri Kosina <jikos@kernel.org>
> > Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
> 
> Jiri, did you have a chance to try this out? I'm about to ask Kalle to
> merge this so it gets in in time for 4.13-rc7.

Sorry, I am almost completely offline for one more week (vacation), and 
will not have access to the affected system before that. But this indeed 
looks like a correct fix to me, so feel free to add

	Acked-by: Jiri Kosina <jkosina@suse.cz>

I'll be able to provide my Tested-by: eventually only in ~10 days.

Thanks,

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2] iwlwifi: pcie: move rx workqueue initialization to iwl_trans_pcie_alloc()
  2017-08-24 19:56                   ` Jiri Kosina
@ 2017-08-24 20:00                     ` Luca Coelho
  2017-08-29  8:19                       ` Kalle Valo
  2017-09-04 11:43                       ` Jiri Kosina
  0 siblings, 2 replies; 17+ messages in thread
From: Luca Coelho @ 2017-08-24 20:00 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: linux-wireless, linux-kernel, linuxwifi, rui.zhang, edubezval,
	linux-pm, david.weinehall, johannes.berg, kvalo, sara.sharon,
	emmanuel.grumbach

On Thu, 2017-08-24 at 21:56 +0200, Jiri Kosina wrote:
> On Thu, 24 Aug 2017, Luca Coelho wrote:
> 
> > > Work queues cannot be allocated when a mutex is held because the mutex
> > > may be in use and that would make it sleep.  Doing so generates the
> > > following splat with 4.13+:
> > > 
> > > [   19.513298] ======================================================
> > > [   19.513429] WARNING: possible circular locking dependency detected
> > > [   19.513557] 4.13.0-rc5+ #6 Not tainted
> > > [   19.513638] ------------------------------------------------------
> > > [   19.513767] cpuhp/0/12 is trying to acquire lock:
> > > [   19.513867]  (&tz->lock){+.+.+.}, at: [<ffffffff924afebb>] thermal_zone_get_temp+0x5b/0xb0
> > > [   19.514047]
> > > [   19.514047] but task is already holding lock:
> > > [   19.514166]  (cpuhp_state){+.+.+.}, at: [<ffffffff91cc4baa>] cpuhp_thread_fun+0x3a/0x210
> > > [   19.514338]
> > > [   19.514338] which lock already depends on the new lock.
> > > 
> > > This lock dependency already existed with previous kernel versions,
> > > but it was not detected until commit 49dfe2a67797 ("cpuhotplug: Link
> > > lock stacks for hotplug callbacks") was introduced.
> > > 
> > > Reported-by: David Weinehall <david.weinehall@intel.com>
> > > Reported-by: Jiri Kosina <jikos@kernel.org>
> > > Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
> > 
> > Jiri, did you have a chance to try this out? I'm about to ask Kalle to
> > merge this so it gets in in time for 4.13-rc7.
> 
> Sorry, I am almost completely offline for one more week (vacation), and 
> will not have access to the affected system before that.

Sounds good! Enjoy! ;)


>  But this indeed 
> looks like a correct fix to me, so feel free to add
> 
> 	Acked-by: Jiri Kosina <jkosina@suse.cz>
> 
> I'll be able to provide my Tested-by: eventually only in ~10 days.


Kalle already picked it up in wireless-drivers and this should make it's
way to 4.13-rc7 (we hope).

In any case, thanks for reporting and the help debugging it.

--
Cheers,
Luca.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2] iwlwifi: pcie: move rx workqueue initialization to iwl_trans_pcie_alloc()
  2017-08-24 20:00                     ` Luca Coelho
@ 2017-08-29  8:19                       ` Kalle Valo
  2017-09-04 11:43                       ` Jiri Kosina
  1 sibling, 0 replies; 17+ messages in thread
From: Kalle Valo @ 2017-08-29  8:19 UTC (permalink / raw)
  To: Luca Coelho
  Cc: Jiri Kosina, linux-wireless, linux-kernel, linuxwifi, rui.zhang,
	edubezval, linux-pm, david.weinehall, johannes.berg, sara.sharon,
	emmanuel.grumbach

Luca Coelho <luca@coelho.fi> writes:

> On Thu, 2017-08-24 at 21:56 +0200, Jiri Kosina wrote:
>> On Thu, 24 Aug 2017, Luca Coelho wrote:
>> 
>> > > Work queues cannot be allocated when a mutex is held because the mutex
>> > > may be in use and that would make it sleep.  Doing so generates the
>> > > following splat with 4.13+:
>> > > 
>> > > [   19.513298] ======================================================
>> > > [   19.513429] WARNING: possible circular locking dependency detected
>> > > [   19.513557] 4.13.0-rc5+ #6 Not tainted
>> > > [   19.513638] ------------------------------------------------------
>> > > [   19.513767] cpuhp/0/12 is trying to acquire lock:
>> > > [ 19.513867] (&tz->lock){+.+.+.}, at: [<ffffffff924afebb>]
>> > > thermal_zone_get_temp+0x5b/0xb0
>> > > [   19.514047]
>> > > [   19.514047] but task is already holding lock:
>> > > [ 19.514166] (cpuhp_state){+.+.+.}, at: [<ffffffff91cc4baa>]
>> > > cpuhp_thread_fun+0x3a/0x210
>> > > [   19.514338]
>> > > [   19.514338] which lock already depends on the new lock.
>> > > 
>> > > This lock dependency already existed with previous kernel versions,
>> > > but it was not detected until commit 49dfe2a67797 ("cpuhotplug: Link
>> > > lock stacks for hotplug callbacks") was introduced.
>> > > 
>> > > Reported-by: David Weinehall <david.weinehall@intel.com>
>> > > Reported-by: Jiri Kosina <jikos@kernel.org>
>> > > Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
>> > 
>> > Jiri, did you have a chance to try this out? I'm about to ask Kalle to
>> > merge this so it gets in in time for 4.13-rc7.
>> 
>> Sorry, I am almost completely offline for one more week (vacation), and 
>> will not have access to the affected system before that.
>
> Sounds good! Enjoy! ;)
>
>
>>  But this indeed 
>> looks like a correct fix to me, so feel free to add
>> 
>> 	Acked-by: Jiri Kosina <jkosina@suse.cz>
>> 
>> I'll be able to provide my Tested-by: eventually only in ~10 days.
>
>
> Kalle already picked it up in wireless-drivers and this should make it's
> way to 4.13-rc7 (we hope).

It (10a54d8196d1) didn't make it to -rc7 but it's in net tree now and
should make it to the next release on Sunday (either -rc8 or the final).

-- 
Kalle Valo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2] iwlwifi: pcie: move rx workqueue initialization to iwl_trans_pcie_alloc()
  2017-08-22  7:37               ` [PATCH v2] " Luca Coelho
  2017-08-24  6:00                 ` Luca Coelho
  2017-08-24 13:50                 ` [v2] " Kalle Valo
@ 2017-08-30 14:57                 ` David Weinehall
  2017-08-31  6:11                   ` Luca Coelho
  2 siblings, 1 reply; 17+ messages in thread
From: David Weinehall @ 2017-08-30 14:57 UTC (permalink / raw)
  To: Luca Coelho
  Cc: jikos, linux-wireless, luciano.coelho, linux-kernel, linuxwifi,
	rui.zhang, edubezval, linux-pm, david.weinehall, johannes.berg,
	kvalo, sara.sharon, emmanuel.grumbach

On Tue, Aug 22, 2017 at 10:37:29AM +0300, Luca Coelho wrote:
> From: Luca Coelho <luciano.coelho@intel.com>
> 
> Work queues cannot be allocated when a mutex is held because the mutex
> may be in use and that would make it sleep.  Doing so generates the
> following splat with 4.13+:
> 
> [   19.513298] ======================================================
> [   19.513429] WARNING: possible circular locking dependency detected
> [   19.513557] 4.13.0-rc5+ #6 Not tainted
> [   19.513638] ------------------------------------------------------
> [   19.513767] cpuhp/0/12 is trying to acquire lock:
> [   19.513867]  (&tz->lock){+.+.+.}, at: [<ffffffff924afebb>] thermal_zone_get_temp+0x5b/0xb0
> [   19.514047]
> [   19.514047] but task is already holding lock:
> [   19.514166]  (cpuhp_state){+.+.+.}, at: [<ffffffff91cc4baa>] cpuhp_thread_fun+0x3a/0x210
> [   19.514338]
> [   19.514338] which lock already depends on the new lock.
> 
> This lock dependency already existed with previous kernel versions,
> but it was not detected until commit 49dfe2a67797 ("cpuhotplug: Link
> lock stacks for hotplug callbacks") was introduced.
> 
> Reported-by: David Weinehall <david.weinehall@intel.com>
> Reported-by: Jiri Kosina <jikos@kernel.org>
> Signed-off-by: Luca Coelho <luciano.coelho@intel.com>

With this patch I no longer get the lockdep warning,
and the driver seems to work just as well as before.

Thanks!

Tested-by: David Weinehall <david.weinehall@linux.intel.com>

> ---
> In v2:
>    - updated the commit message to a new version, with a grammar fix
>      and the actual commit that exposed the problem;
> 
>  drivers/net/wireless/intel/iwlwifi/pcie/internal.h |  2 ++
>  drivers/net/wireless/intel/iwlwifi/pcie/rx.c       | 10 +---------
>  drivers/net/wireless/intel/iwlwifi/pcie/trans.c    |  9 +++++++++
>  3 files changed, 12 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h
> index fa315d84e98e..a1ea9ef97ed9 100644
> --- a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h
> +++ b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h
> @@ -787,6 +787,8 @@ int iwl_pci_fw_enter_d0i3(struct iwl_trans *trans);
>  
>  void iwl_pcie_enable_rx_wake(struct iwl_trans *trans, bool enable);
>  
> +void iwl_pcie_rx_allocator_work(struct work_struct *data);
> +
>  /* common functions that are used by gen2 transport */
>  void iwl_pcie_apm_config(struct iwl_trans *trans);
>  int iwl_pcie_prepare_card_hw(struct iwl_trans *trans);
> diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c
> index 351c4423125a..942736d3fa75 100644
> --- a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c
> +++ b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c
> @@ -597,7 +597,7 @@ static void iwl_pcie_rx_allocator_get(struct iwl_trans *trans,
>  	rxq->free_count += RX_CLAIM_REQ_ALLOC;
>  }
>  
> -static void iwl_pcie_rx_allocator_work(struct work_struct *data)
> +void iwl_pcie_rx_allocator_work(struct work_struct *data)
>  {
>  	struct iwl_rb_allocator *rba_p =
>  		container_of(data, struct iwl_rb_allocator, rx_alloc);
> @@ -900,10 +900,6 @@ static int _iwl_pcie_rx_init(struct iwl_trans *trans)
>  			return err;
>  	}
>  	def_rxq = trans_pcie->rxq;
> -	if (!rba->alloc_wq)
> -		rba->alloc_wq = alloc_workqueue("rb_allocator",
> -						WQ_HIGHPRI | WQ_UNBOUND, 1);
> -	INIT_WORK(&rba->rx_alloc, iwl_pcie_rx_allocator_work);
>  
>  	spin_lock(&rba->lock);
>  	atomic_set(&rba->req_pending, 0);
> @@ -1017,10 +1013,6 @@ void iwl_pcie_rx_free(struct iwl_trans *trans)
>  	}
>  
>  	cancel_work_sync(&rba->rx_alloc);
> -	if (rba->alloc_wq) {
> -		destroy_workqueue(rba->alloc_wq);
> -		rba->alloc_wq = NULL;
> -	}
>  
>  	iwl_pcie_free_rbs_pool(trans);
>  
> diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/trans.c b/drivers/net/wireless/intel/iwlwifi/pcie/trans.c
> index f95eec52508e..3927bbf04f72 100644
> --- a/drivers/net/wireless/intel/iwlwifi/pcie/trans.c
> +++ b/drivers/net/wireless/intel/iwlwifi/pcie/trans.c
> @@ -1786,6 +1786,11 @@ void iwl_trans_pcie_free(struct iwl_trans *trans)
>  		iwl_pcie_tx_free(trans);
>  	iwl_pcie_rx_free(trans);
>  
> +	if (trans_pcie->rba.alloc_wq) {
> +		destroy_workqueue(trans_pcie->rba.alloc_wq);
> +		trans_pcie->rba.alloc_wq = NULL;
> +	}
> +
>  	if (trans_pcie->msix_enabled) {
>  		for (i = 0; i < trans_pcie->alloc_vecs; i++) {
>  			irq_set_affinity_hint(
> @@ -3169,6 +3174,10 @@ struct iwl_trans *iwl_trans_pcie_alloc(struct pci_dev *pdev,
>  		trans_pcie->inta_mask = CSR_INI_SET_MASK;
>  	 }
>  
> +	trans_pcie->rba.alloc_wq = alloc_workqueue("rb_allocator",
> +						   WQ_HIGHPRI | WQ_UNBOUND, 1);
> +	INIT_WORK(&trans_pcie->rba.rx_alloc, iwl_pcie_rx_allocator_work);
> +
>  #ifdef CONFIG_IWLWIFI_PCIE_RTPM
>  	trans->runtime_pm_mode = IWL_PLAT_PM_MODE_D0I3;
>  #else
> -- 
> 2.14.1
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2] iwlwifi: pcie: move rx workqueue initialization to iwl_trans_pcie_alloc()
  2017-08-30 14:57                 ` [PATCH v2] " David Weinehall
@ 2017-08-31  6:11                   ` Luca Coelho
  0 siblings, 0 replies; 17+ messages in thread
From: Luca Coelho @ 2017-08-31  6:11 UTC (permalink / raw)
  To: David Weinehall
  Cc: jikos, linux-wireless, linux-kernel, linuxwifi, rui.zhang,
	edubezval, linux-pm, david.weinehall, johannes.berg, kvalo,
	sara.sharon, emmanuel.grumbach

On Wed, 2017-08-30 at 17:57 +0300, David Weinehall wrote:
> On Tue, Aug 22, 2017 at 10:37:29AM +0300, Luca Coelho wrote:
> > From: Luca Coelho <luciano.coelho@intel.com>
> > 
> > Work queues cannot be allocated when a mutex is held because the mutex
> > may be in use and that would make it sleep.  Doing so generates the
> > following splat with 4.13+:
> > 
> > [   19.513298] ======================================================
> > [   19.513429] WARNING: possible circular locking dependency detected
> > [   19.513557] 4.13.0-rc5+ #6 Not tainted
> > [   19.513638] ------------------------------------------------------
> > [   19.513767] cpuhp/0/12 is trying to acquire lock:
> > [   19.513867]  (&tz->lock){+.+.+.}, at: [<ffffffff924afebb>] thermal_zone_get_temp+0x5b/0xb0
> > [   19.514047]
> > [   19.514047] but task is already holding lock:
> > [   19.514166]  (cpuhp_state){+.+.+.}, at: [<ffffffff91cc4baa>] cpuhp_thread_fun+0x3a/0x210
> > [   19.514338]
> > [   19.514338] which lock already depends on the new lock.
> > 
> > This lock dependency already existed with previous kernel versions,
> > but it was not detected until commit 49dfe2a67797 ("cpuhotplug: Link
> > lock stacks for hotplug callbacks") was introduced.
> > 
> > Reported-by: David Weinehall <david.weinehall@intel.com>
> > Reported-by: Jiri Kosina <jikos@kernel.org>
> > Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
> 
> With this patch I no longer get the lockdep warning,
> and the driver seems to work just as well as before.

Great! Thanks for reporting and testing, David!


> Thanks!
> 
> Tested-by: David Weinehall <david.weinehall@linux.intel.com>

Thanks for this too, but unfortunately it's too late to add it, since
the patch is already in net tree.

--
Cheers,
Luca.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2] iwlwifi: pcie: move rx workqueue initialization to iwl_trans_pcie_alloc()
  2017-08-24 20:00                     ` Luca Coelho
  2017-08-29  8:19                       ` Kalle Valo
@ 2017-09-04 11:43                       ` Jiri Kosina
  2017-09-04 11:44                         ` Luca Coelho
  1 sibling, 1 reply; 17+ messages in thread
From: Jiri Kosina @ 2017-09-04 11:43 UTC (permalink / raw)
  To: Luca Coelho
  Cc: linux-wireless, linux-kernel, linuxwifi, rui.zhang, edubezval,
	linux-pm, david.weinehall, johannes.berg, kvalo, sara.sharon,
	emmanuel.grumbach

On Thu, 24 Aug 2017, Luca Coelho wrote:

> > looks like a correct fix to me, so feel free to add
> > 
> > 	Acked-by: Jiri Kosina <jkosina@suse.cz>
> > 
> > I'll be able to provide my Tested-by: eventually only in ~10 days.
> 
> 
> Kalle already picked it up in wireless-drivers and this should make it's
> way to 4.13-rc7 (we hope).
> 
> In any case, thanks for reporting and the help debugging it.

I know it's pretty late for this to be added to the commit, but I don't 
want this to be left in the void, so for the sake of completness:

	Tested-by: Jiri Kosina <jkosina@suse.cz>

Thanks,

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2] iwlwifi: pcie: move rx workqueue initialization to iwl_trans_pcie_alloc()
  2017-09-04 11:43                       ` Jiri Kosina
@ 2017-09-04 11:44                         ` Luca Coelho
  0 siblings, 0 replies; 17+ messages in thread
From: Luca Coelho @ 2017-09-04 11:44 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: linux-wireless, linux-kernel, linuxwifi, rui.zhang, edubezval,
	linux-pm, david.weinehall, johannes.berg, kvalo, sara.sharon,
	emmanuel.grumbach

On Mon, 2017-09-04 at 13:43 +0200, Jiri Kosina wrote:
> On Thu, 24 Aug 2017, Luca Coelho wrote:
> 
> > > looks like a correct fix to me, so feel free to add
> > > 
> > > 	Acked-by: Jiri Kosina <jkosina@suse.cz>
> > > 
> > > I'll be able to provide my Tested-by: eventually only in ~10 days.
> > 
> > 
> > Kalle already picked it up in wireless-drivers and this should make it's
> > way to 4.13-rc7 (we hope).
> > 
> > In any case, thanks for reporting and the help debugging it.
> 
> I know it's pretty late for this to be added to the commit, but I don't 
> want this to be left in the void, so for the sake of completness:
> 
> 	Tested-by: Jiri Kosina <jkosina@suse.cz>

Thanks, Jiri, for reporting, debugging and testing!

--
Cheers,
Luca.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2017-09-04 11:45 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <alpine.LSU.2.20.1707311300440.4705@cbobk.fhfr.pm>
     [not found] ` <alpine.LSU.2.20.1708031107230.30597@cbobk.fhfr.pm>
     [not found]   ` <1501753405.15969.43.camel@intel.com>
2017-08-03 10:02     ` [linuxwifi] x86/thermal: AB-BA dependency between mvm->mutex and tz->lock Kalle Valo
2017-08-03 11:28       ` Coelho, Luciano
2017-08-03 11:34         ` Jiri Kosina
2017-08-03 13:09           ` Jiri Kosina
2017-08-17 13:38             ` Jiri Kosina
2017-08-17 14:02               ` Coelho, Luciano
2017-08-22  7:32               ` [PATCH] iwlwifi: pcie: move rx workqueue initialization to iwl_trans_pcie_alloc() Luca Coelho
2017-08-22  7:37               ` [PATCH v2] " Luca Coelho
2017-08-24  6:00                 ` Luca Coelho
2017-08-24 19:56                   ` Jiri Kosina
2017-08-24 20:00                     ` Luca Coelho
2017-08-29  8:19                       ` Kalle Valo
2017-09-04 11:43                       ` Jiri Kosina
2017-09-04 11:44                         ` Luca Coelho
2017-08-24 13:50                 ` [v2] " Kalle Valo
2017-08-30 14:57                 ` [PATCH v2] " David Weinehall
2017-08-31  6:11                   ` Luca Coelho

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).