linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jiri Kosina <jikos@kernel.org>
To: Johannes Berg <johannes@sipsolutions.net>
Cc: Luca Coelho <luciano.coelho@intel.com>,
	Heiner Kallweit <hkallweit1@gmail.com>,
	linux-wireless@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH] iwlwifi: don't call netif_napi_add() with rxq->lock held (was Re: Lockdep warning in iwl_pcie_rx_handle())
Date: Tue, 2 Mar 2021 10:27:34 +0100 (CET)	[thread overview]
Message-ID: <nycvar.YFH.7.76.2103021025410.12405@cbobk.fhfr.pm> (raw)
In-Reply-To: <2db8f779b4b37d4498cfeaed77d5ede54e429a6e.camel@sipsolutions.net>

On Mon, 1 Mar 2021, Johannes Berg wrote:

> > I am getting the splat below with Linus' tree as of today (5.11-rc1, 
> > fe07bfda2fb). I haven't started to look into the code yet, but apparently 
> > this has been already reported by Heiner here:
> > 
> > 	https://www.spinics.net/lists/linux-wireless/msg208353.html
> > 
> > so before I start digging deep into it (the previous kernel this 
> > particular machine had is 5.9, so I'd rather avoid lenghty bisect for now 
> > in case someone has already looked into it and has ideas where the problem 
> > is), I thought I'd ask whether this has been root-caused elsewhere 
> > already.
> 
> Yeah, I'm pretty sure we have a fix for this, though I'm not sure right
> now where it is in the pipeline.
> 
> It's called "iwlwifi: pcie: don't add NAPI under rxq->lock" but right
> now I can't find it in any of the public archives.

I was not able to find that patch anywhere indeed, but in the meantime I 
fixed it by the patch below. Please consider merging either of the fixes.

Also I am still seeing

	WARNING: CPU: 2 PID: 1138 at kernel/softirq.c:178 __local_bh_enable_ip+0xa5/0xf0

on the

	[   21.902173]  iwl_pcie_enqueue_hcmd+0x5d9/0xa00 [iwlwifi]
	[   21.906445]  iwl_trans_txq_send_hcmd+0x6c/0x430 [iwlwifi]
	[   21.910757]  iwl_trans_send_cmd+0x88/0x170 [iwlwifi]
	[   21.915074]  ? lock_acquire+0x277/0x3d0
	[   21.919327]  iwl_mvm_send_cmd+0x32/0x80 [iwlmvm]
	[   21.923616]  iwl_mvm_led_set+0xc2/0xe0 [iwlmvm]

codepath, but I'll look into it separately.



From: Jiri Kosina <jkosina@suse.cz>
Subject: [PATCH] iwlwifi: don't call netif_napi_add() with rxq->lock held

We can't call netif_napi_add() with rxq-lock held, as there is a potential
for deadlock as spotted by lockdep (see below). rxq->lock is not
protecting anything over the netif_napi_add() codepath anyway, so let's
drop it just before calling into NAPI.

 ========================================================
 WARNING: possible irq lock inversion dependency detected
 5.12.0-rc1-00002-gbada49429032 #5 Not tainted
 --------------------------------------------------------
 irq/136-iwlwifi/565 just changed the state of lock:
 ffff89f28433b0b0 (&rxq->lock){+.-.}-{2:2}, at: iwl_pcie_rx_handle+0x7f/0x960 [iwlwifi]
 but this lock took another, SOFTIRQ-unsafe lock in the past:
  (napi_hash_lock){+.+.}-{2:2}

 and interrupts could create inverse lock ordering between them.

 other info that might help us debug this:
  Possible interrupt unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(napi_hash_lock);
                                local_irq_disable();
                                lock(&rxq->lock);
                                lock(napi_hash_lock);
   <Interrupt>
     lock(&rxq->lock);

  *** DEADLOCK ***

 1 lock held by irq/136-iwlwifi/565:
  #0: ffff89f2b1440170 (sync_cmd_lockdep_map){+.+.}-{0:0}, at: iwl_pcie_irq_handler+0x5/0xb30

 the shortest dependencies between 2nd lock and 1st lock:
  -> (napi_hash_lock){+.+.}-{2:2} {
     HARDIRQ-ON-W at:
                       lock_acquire+0x277/0x3d0
                       _raw_spin_lock+0x2c/0x40
                       netif_napi_add+0x14b/0x270
                       e1000_probe+0x2fe/0xee0 [e1000e]
                       local_pci_probe+0x42/0x90
                       pci_device_probe+0x10b/0x1c0
                       really_probe+0xef/0x4b0
                       driver_probe_device+0xde/0x150
                       device_driver_attach+0x4f/0x60
                       __driver_attach+0x9c/0x140
                       bus_for_each_dev+0x79/0xc0
                       bus_add_driver+0x18d/0x220
                       driver_register+0x5b/0xf0
                       do_one_initcall+0x5b/0x300
                       do_init_module+0x5b/0x21c
                       load_module+0x1dae/0x22c0
                       __do_sys_finit_module+0xad/0x110
                       do_syscall_64+0x33/0x80
                       entry_SYSCALL_64_after_hwframe+0x44/0xae
     SOFTIRQ-ON-W at:
                       lock_acquire+0x277/0x3d0
                       _raw_spin_lock+0x2c/0x40
                       netif_napi_add+0x14b/0x270
                       e1000_probe+0x2fe/0xee0 [e1000e]
                       local_pci_probe+0x42/0x90
                       pci_device_probe+0x10b/0x1c0
                       really_probe+0xef/0x4b0
                       driver_probe_device+0xde/0x150
                       device_driver_attach+0x4f/0x60
                       __driver_attach+0x9c/0x140
                       bus_for_each_dev+0x79/0xc0
                       bus_add_driver+0x18d/0x220
                       driver_register+0x5b/0xf0
                       do_one_initcall+0x5b/0x300
                       do_init_module+0x5b/0x21c
                       load_module+0x1dae/0x22c0
                       __do_sys_finit_module+0xad/0x110
                       do_syscall_64+0x33/0x80
                       entry_SYSCALL_64_after_hwframe+0x44/0xae
     INITIAL USE at:
                      lock_acquire+0x277/0x3d0
                      _raw_spin_lock+0x2c/0x40
                      netif_napi_add+0x14b/0x270
                      e1000_probe+0x2fe/0xee0 [e1000e]
                      local_pci_probe+0x42/0x90
                      pci_device_probe+0x10b/0x1c0
                      really_probe+0xef/0x4b0
                      driver_probe_device+0xde/0x150
                      device_driver_attach+0x4f/0x60
                      __driver_attach+0x9c/0x140
                      bus_for_each_dev+0x79/0xc0
                      bus_add_driver+0x18d/0x220
                      driver_register+0x5b/0xf0
                      do_one_initcall+0x5b/0x300
                      do_init_module+0x5b/0x21c
                      load_module+0x1dae/0x22c0
                      __do_sys_finit_module+0xad/0x110
                      do_syscall_64+0x33/0x80
                      entry_SYSCALL_64_after_hwframe+0x44/0xae
   }
   ... key      at: [<ffffffffae84ef38>] napi_hash_lock+0x18/0x40
   ... acquired at:
    _raw_spin_lock+0x2c/0x40
    netif_napi_add+0x14b/0x270
    _iwl_pcie_rx_init+0x1f4/0x710 [iwlwifi]
    iwl_pcie_rx_init+0x1b/0x3b0 [iwlwifi]
    iwl_trans_pcie_start_fw+0x2ac/0x6a0 [iwlwifi]
    iwl_mvm_load_ucode_wait_alive+0x116/0x460 [iwlmvm]
    iwl_run_init_mvm_ucode+0xa4/0x3a0 [iwlmvm]
    iwl_op_mode_mvm_start+0x9ed/0xbf0 [iwlmvm]
    _iwl_op_mode_start.isra.4+0x42/0x80 [iwlwifi]
    iwl_opmode_register+0x71/0xe0 [iwlwifi]
    iwl_mvm_init+0x34/0x1000 [iwlmvm]
    do_one_initcall+0x5b/0x300
    do_init_module+0x5b/0x21c
    load_module+0x1dae/0x22c0
    __do_sys_finit_module+0xad/0x110
    do_syscall_64+0x33/0x80
    entry_SYSCALL_64_after_hwframe+0x44/0xae

[ ... lockdep output trimmed .... ]

Fixes: 25edc8f259c7106 ("iwlwifi: pcie: properly implement NAPI")
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
---
 drivers/net/wireless/intel/iwlwifi/pcie/rx.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c
index 1bccaa034284..6262dd9043aa 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c
@@ -1063,11 +1063,12 @@ static int _iwl_pcie_rx_init(struct iwl_trans *trans)
 
 		iwl_pcie_rx_init_rxb_lists(rxq);
 
+		spin_unlock_bh(&rxq->lock);
+
 		if (!rxq->napi.poll)
 			netif_napi_add(&trans_pcie->napi_dev, &rxq->napi,
 				       iwl_pcie_dummy_napi_poll, 64);
 
-		spin_unlock_bh(&rxq->lock);
 	}
 
 	/* move the pool to the default queue and allocator ownerships */


-- 
Jiri Kosina
SUSE Labs


  reply	other threads:[~2021-03-02  9:49 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-01 20:44 Lockdep warning in iwl_pcie_rx_handle() Jiri Kosina
2021-03-01 21:09 ` Jiri Kosina
2021-03-01 21:12 ` Johannes Berg
2021-03-02  9:27   ` Jiri Kosina [this message]
2021-03-02 10:34     ` [PATCH v2] iwlwifi: don't call netif_napi_add() with rxq->lock held (was Re: Lockdep warning in iwl_pcie_rx_handle()) Jiri Kosina
2021-03-02 10:49       ` Coelho, Luciano
2021-03-02 12:17         ` Kalle Valo
2021-03-02 14:55           ` Jiri Kosina
2021-03-03 21:39           ` Sedat Dilek
2021-03-03 15:59       ` Kalle Valo
     [not found]       ` <20210303155941.25521C43463@smtp.codeaurora.org>
2021-03-03 18:01         ` Jiri Kosina
2021-03-03 18:33           ` Kalle Valo
2021-03-03 19:19             ` Jiri Kosina
2021-03-02 10:48     ` [PATCH] " Coelho, Luciano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=nycvar.YFH.7.76.2103021025410.12405@cbobk.fhfr.pm \
    --to=jikos@kernel.org \
    --cc=hkallweit1@gmail.com \
    --cc=johannes@sipsolutions.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-wireless@vger.kernel.org \
    --cc=luciano.coelho@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).