linux-wireless.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* leaking mutex and a deadlock in brcmfmac
@ 2019-03-04 14:55 Piotr Figiel
  0 siblings, 0 replies; only message in thread
From: Piotr Figiel @ 2019-03-04 14:55 UTC (permalink / raw)
  To: linux-wireless
  Cc: arend.vanspriel, franky.lin, hante.meuleman, chi-hsien.lin,
	wright.feng, kvalo, davem, brcm80211-dev-list

Hi,
I noticed the kernel reports a leaking mutex BUG and a deadlock in brcmfmac
driver.
The issue is observable when the Raspberry Pi WiFi USB dongle is plugged in
and the kernel has following options enabled (step 12 from the
submit-checklist.rst): CONFIG_PREEMPT, CONFIG_DEBUG_PREEMPT,
CONFIG_DEBUG_SLAB, CONFIG_DEBUG_PAGEALLOC, CONFIG_DEBUG_MUTEXES,
CONFIG_DEBUG_SPINLOCK, CONFIG_DEBUG_ATOMIC_SLEEP, CONFIG_PROVE_RCU and
CONFIG_DEBUG_OBJECTS_RCU_HEAD.

I'm pretty sure the deadlock can manifest itself in practice as I observed
once an issue that if the dongle is disconnected shortly after connection
the USB stack stopped responding to connect/disconnect events. I wasn't able
to reproduce it but I gathered the stack traces via sysrq and noticed the
usb workqueue was waiting in disconnect on that leaked mutex
(dev_init_lock).

Is this maybe a known problem or could you provide some hints how to tackle
it?  The critical section protected by the leaking mutex seems pretty huge
and I couldn't find any straightforward approach to fix this.

Here are the logs when reprodicing this on recent wireless-drivers-next
master:

usb 1-1.3.1.1: new high-speed USB device number 9 using ci_hdrc
brcmfmac: brcmf_fw_alloc_request: using brcm/brcmfmac43143 for chip BCM43143/2
usbcore: registered new interface driver brcmfmac
BUG: workqueue leaked lock or atomic: kworker/1:1/0x00000000/135
     last function: brcmf_driver_register [brcmfmac]
1 lock held by kworker/1:1/135:
 #0: 68d3f0a3 (&devinfo->dev_init_lock){+.+.}, at: brcmf_usb_probe+0x78/0x570 [brcmfmac]
CPU: 1 PID: 135 Comm: kworker/1:1 Not tainted 5.0.0-rc7-00073-g837ea0d #112
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
Workqueue: events brcmf_driver_register [brcmfmac]
[<80111c1c>] (unwind_backtrace) from [<8010d0e8>] (show_stack+0x10/0x14)
[<8010d0e8>] (show_stack) from [<809e64ac>] (dump_stack+0xa8/0xd4)
[<809e64ac>] (dump_stack) from [<801417d8>] (process_one_work+0x718/0x810)
[<801417d8>] (process_one_work) from [<801418fc>] (worker_thread+0x2c/0x564)
[<801418fc>] (worker_thread) from [<80147a80>] (kthread+0x13c/0x16c)
[<80147a80>] (kthread) from [<801010b4>] (ret_from_fork+0x14/0x20)
Exception stack(0xede63fb0 to 0xede63ff8)
3fa0:                                     00000000 00000000 00000000 00000000
3fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
3fe0: 00000000 00000000 00000000 00000000 00000013 00000000

======================================================
WARNING: possible circular locking dependency detected
5.0.0-rc7-00073-g837ea0d #112 Not tainted
------------------------------------------------------
kworker/1:1/135 is trying to acquire lock:
38feea71 ((wq_completion)"events"){+.+.}, at: process_one_work+0x174/0x810

but task is already holding lock:
68d3f0a3 (&devinfo->dev_init_lock){+.+.}, at: brcmf_usb_probe+0x78/0x570 [brcmfmac]

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 (&devinfo->dev_init_lock){+.+.}:
       mutex_lock_nested+0x1c/0x24
       brcmf_usb_probe+0x78/0x570 [brcmfmac]
       usb_probe_interface+0xc0/0x1bc
       really_probe+0x214/0x2dc
       __driver_attach+0xe4/0xe8
       bus_for_each_dev+0x68/0xb4
       bus_add_driver+0x19c/0x214
       driver_register+0x78/0x110
       usb_register_driver+0x84/0x148
       brcmf_usb_register+0x20/0x54 [brcmfmac]
       process_one_work+0x228/0x810
       worker_thread+0x2c/0x564
       kthread+0x13c/0x16c
       ret_from_fork+0x14/0x20
         (null)

-> #1 (brcmf_driver_work){+.+.}:
       worker_thread+0x2c/0x564
       kthread+0x13c/0x16c
       ret_from_fork+0x14/0x20
         (null)

-> #0 ((wq_completion)"events"){+.+.}:
       process_one_work+0x1b8/0x810
       worker_thread+0x2c/0x564
       kthread+0x13c/0x16c
       ret_from_fork+0x14/0x20
         (null)

other info that might help us debug this:

Chain exists of:
  (wq_completion)"events" --> brcmf_driver_work --> &devinfo->dev_init_lock

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&devinfo->dev_init_lock);
                               lock(brcmf_driver_work);
                               lock(&devinfo->dev_init_lock);
  lock((wq_completion)"events");

 *** DEADLOCK ***

1 lock held by kworker/1:1/135:
 #0: 68d3f0a3 (&devinfo->dev_init_lock){+.+.}, at: brcmf_usb_probe+0x78/0x570 [brcmfmac]

stack backtrace:
CPU: 1 PID: 135 Comm: kworker/1:1 Not tainted 5.0.0-rc7-00073-g837ea0d #112
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
Workqueue: events request_firmware_work_func
[<80111c1c>] (unwind_backtrace) from [<8010d0e8>] (show_stack+0x10/0x14)
[<8010d0e8>] (show_stack) from [<809e64ac>] (dump_stack+0xa8/0xd4)
[<809e64ac>] (dump_stack) from [<8017352c>] (print_circular_bug+0x210/0x330)
[<8017352c>] (print_circular_bug) from [<801762cc>] (__lock_acquire+0x12b0/0x1a04)
[<801762cc>] (__lock_acquire) from [<80177448>] (lock_acquire+0xe0/0x268)
[<80177448>] (lock_acquire) from [<80141278>] (process_one_work+0x1b8/0x810)
[<80141278>] (process_one_work) from [<801418fc>] (worker_thread+0x2c/0x564)
[<801418fc>] (worker_thread) from [<80147a80>] (kthread+0x13c/0x16c)
[<80147a80>] (kthread) from [<801010b4>] (ret_from_fork+0x14/0x20)
Exception stack(0xede63fb0 to 0xede63ff8)
3fa0:                                     00000000 00000000 00000000 00000000
3fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
3fe0: 00000000 00000000 00000000 00000000 00000013 00000000
brcmfmac: brcmf_fw_alloc_request: using brcm/brcmfmac43143 for chip BCM43143/2
brcmfmac: brcmf_c_process_clm_blob: no clm_blob available (err=-2), device may have limited channels available
brcmfmac: brcmf_c_preinit_dcmds: Firmware: BCM43143/2 wl0: Apr  3 2014 04:43:32 version 6.10.198.66 (r467479) FWID 01-32bd010e
ieee80211 phy0: brcmf_inetaddr_changed: fail to get arp ip table err:-52
IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready

Best regards,
Piotr Figiel
PS. when replying please keep in To:/CC: as I'm not subscribed to the list.

^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2019-03-04 14:55 UTC | newest]

Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-04 14:55 leaking mutex and a deadlock in brcmfmac Piotr Figiel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).