From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Lijian Zhang <Lijian.Zhang@arm.com>,
	Jia He <justin.he@arm.com>,
	"David S. Miller" <davem@davemloft.net>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH 5.13 21/35] qed: fix possible unpaired spin_{un}lock_bh in _qed_mcp_cmd_and_union()
Date: Fri,  6 Aug 2021 10:17:04 +0200
Message-ID: <20210806081114.428051803@linuxfoundation.org>
In-Reply-To: <20210806081113.718626745@linuxfoundation.org>

From: Jia He <justin.he@arm.com>

[ Upstream commit 6206b7981a36476f4695d661ae139f7db36a802d ]

Lijian reported a BUG_ON() hit on a ThunderX2 arm64 server with a FastLinQ
QL41000 Ethernet controller:
 BUG: scheduling while atomic: kworker/0:4/531/0x00000200
  [qed_probe:488()]hw prepare failed
  kernel BUG at mm/vmalloc.c:2355!
  Internal error: Oops - BUG: 0 [#1] SMP
  CPU: 0 PID: 531 Comm: kworker/0:4 Tainted: G W 5.4.0-77-generic #86-Ubuntu
  pstate: 00400009 (nzcv daif +PAN -UAO)
 Call trace:
  vunmap+0x4c/0x50
  iounmap+0x48/0x58
  qed_free_pci+0x60/0x80 [qed]
  qed_probe+0x35c/0x688 [qed]
  __qede_probe+0x88/0x5c8 [qede]
  qede_probe+0x60/0xe0 [qede]
  local_pci_probe+0x48/0xa0
  work_for_cpu_fn+0x24/0x38
  process_one_work+0x1d0/0x468
  worker_thread+0x238/0x4e0
  kthread+0xf0/0x118
  ret_from_fork+0x10/0x18

In this case, qed_hw_prepare() returns an error due to a hw/fw error, but in
theory the work queue should run in process context rather than interrupt
context.

The root cause might be the unpaired spin_{un}lock_bh() in
_qed_mcp_cmd_and_union(), which leaves bottom halves incorrectly disabled.
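
The locking shape the patch moves to can be sketched in userspace as follows.
This is only an illustrative model under stated assumptions, not the driver
code: bh_depth, model_lock_bh(), model_unlock_bh() and model_poll() are
hypothetical names, and the counter merely stands in for the bottom-half
disable depth that spin_lock_bh() raises and spin_unlock_bh() lowers. The
point is that every exit from the polling loop drops the lock, and the lock is
taken again explicitly before the critical section, so the lock state after
the loop does not depend on which exit path was taken.

```c
#include <assert.h>

/* Hypothetical stand-in for the bottom-half disable depth. */
static int bh_depth;

static void model_lock_bh(void)
{
	bh_depth++;
}

static void model_unlock_bh(void)
{
	assert(bh_depth > 0);	/* catch an unlock without a matching lock */
	bh_depth--;
}

/*
 * Poll up to max_retries times. ready_at is a hypothetical parameter: the
 * iteration at which the modelled mailbox becomes free.
 * Returns 0 on success, -1 on timeout.
 */
static int model_poll(int ready_at, int max_retries)
{
	int cnt = 0;

	do {
		model_lock_bh();

		if (cnt >= ready_at) {
			model_unlock_bh();	/* release before leaving the loop */
			break;
		}

		model_unlock_bh();
	} while (++cnt < max_retries);

	if (cnt >= max_retries)
		return -1;		/* timeout: the lock is not held here */

	model_lock_bh();		/* re-acquire for the critical section */
	/* ... the mailbox command would be sent here ... */
	model_unlock_bh();

	return 0;
}
```

In this model, model_poll(3, 10) returns 0 and model_poll(20, 10) returns -1,
and bh_depth is back to 0 after either call.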

Reported-by: Lijian Zhang <Lijian.Zhang@arm.com>
Signed-off-by: Jia He <justin.he@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/qlogic/qed/qed_mcp.c | 23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_mcp.c b/drivers/net/ethernet/qlogic/qed/qed_mcp.c
index cd882c453394..caeef25c89bb 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_mcp.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_mcp.c
@@ -474,14 +474,18 @@ _qed_mcp_cmd_and_union(struct qed_hwfn *p_hwfn,
 
 		spin_lock_bh(&p_hwfn->mcp_info->cmd_lock);
 
-		if (!qed_mcp_has_pending_cmd(p_hwfn))
+		if (!qed_mcp_has_pending_cmd(p_hwfn)) {
+			spin_unlock_bh(&p_hwfn->mcp_info->cmd_lock);
 			break;
+		}
 
 		rc = qed_mcp_update_pending_cmd(p_hwfn, p_ptt);
-		if (!rc)
+		if (!rc) {
+			spin_unlock_bh(&p_hwfn->mcp_info->cmd_lock);
 			break;
-		else if (rc != -EAGAIN)
+		} else if (rc != -EAGAIN) {
 			goto err;
+		}
 
 		spin_unlock_bh(&p_hwfn->mcp_info->cmd_lock);
 
@@ -498,6 +502,8 @@ _qed_mcp_cmd_and_union(struct qed_hwfn *p_hwfn,
 		return -EAGAIN;
 	}
 
+	spin_lock_bh(&p_hwfn->mcp_info->cmd_lock);
+
 	/* Send the mailbox command */
 	qed_mcp_reread_offsets(p_hwfn, p_ptt);
 	seq_num = ++p_hwfn->mcp_info->drv_mb_seq;
@@ -524,14 +530,18 @@ _qed_mcp_cmd_and_union(struct qed_hwfn *p_hwfn,
 
 		spin_lock_bh(&p_hwfn->mcp_info->cmd_lock);
 
-		if (p_cmd_elem->b_is_completed)
+		if (p_cmd_elem->b_is_completed) {
+			spin_unlock_bh(&p_hwfn->mcp_info->cmd_lock);
 			break;
+		}
 
 		rc = qed_mcp_update_pending_cmd(p_hwfn, p_ptt);
-		if (!rc)
+		if (!rc) {
+			spin_unlock_bh(&p_hwfn->mcp_info->cmd_lock);
 			break;
-		else if (rc != -EAGAIN)
+		} else if (rc != -EAGAIN) {
 			goto err;
+		}
 
 		spin_unlock_bh(&p_hwfn->mcp_info->cmd_lock);
 	} while (++cnt < max_retries);
@@ -554,6 +564,7 @@ _qed_mcp_cmd_and_union(struct qed_hwfn *p_hwfn,
 		return -EAGAIN;
 	}
 
+	spin_lock_bh(&p_hwfn->mcp_info->cmd_lock);
 	qed_mcp_cmd_del_elem(p_hwfn, p_cmd_elem);
 	spin_unlock_bh(&p_hwfn->mcp_info->cmd_lock);
 
-- 
2.30.2



