From: Bart Van Assche <bvanassche@acm.org>
To: "Martin K . Petersen" <martin.petersen@oracle.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>,
linux-scsi@vger.kernel.org,
Adrian Hunter <adrian.hunter@intel.com>,
Bart Van Assche <bvanassche@acm.org>,
Avri Altman <avri.altman@wdc.com>, Bean Huo <beanhuo@micron.com>,
Stanley Chu <stanley.chu@mediatek.com>,
"James E.J. Bottomley" <jejb@linux.ibm.com>,
Matthias Brugger <matthias.bgg@gmail.com>,
James Bottomley <James.Bottomley@HansenPartnership.com>,
Sujit Reddy Thumma <sthumma@codeaurora.org>,
Santosh Y <santoshsy@gmail.com>
Subject: [PATCH v2] scsi: ufs: Fix a race condition related to device management
Date: Wed, 20 Jul 2022 10:02:23 -0700 [thread overview]
Message-ID: <20220720170228.1598842-1-bvanassche@acm.org> (raw)
If a device management command completion happens after
wait_for_completion_timeout() times out and before ufshcd_clear_cmds() is
called then the completion code may crash on the complete() call in
__ufshcd_transfer_req_compl(). This patch fixes the following crash:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
Call trace:
complete+0x64/0x178
__ufshcd_transfer_req_compl+0x30c/0x9c0
ufshcd_poll+0xf0/0x208
ufshcd_sl_intr+0xb8/0xf0
ufshcd_intr+0x168/0x2f4
__handle_irq_event_percpu+0xa0/0x30c
handle_irq_event+0x84/0x178
handle_fasteoi_irq+0x150/0x2e8
__handle_domain_irq+0x114/0x1e4
gic_handle_irq.31846+0x58/0x300
el1_irq+0xe4/0x1c0
efi_header_end+0x110/0x680
__irq_exit_rcu+0x108/0x124
__handle_domain_irq+0x118/0x1e4
gic_handle_irq.31846+0x58/0x300
el1_irq+0xe4/0x1c0
cpuidle_enter_state+0x3ac/0x8c4
do_idle+0x2fc/0x55c
cpu_startup_entry+0x84/0x90
kernel_init+0x0/0x310
start_kernel+0x0/0x608
start_kernel+0x4ec/0x608
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Avri Altman <avri.altman@wdc.com>
Cc: Bean Huo <beanhuo@micron.com>
Cc: Stanley Chu <stanley.chu@mediatek.com>
Fixes: 5a0b0cb9bee7 ("[SCSI] ufs: Add support for sending NOP OUT UPIU")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
Changes compared to v1: made source code comments more clear.
drivers/ufs/core/ufshcd.c | 58 +++++++++++++++++++++++++++------------
1 file changed, 40 insertions(+), 18 deletions(-)
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index decadd227b96..36b7212e9cb5 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -2992,37 +2992,59 @@ ufshcd_dev_cmd_completion(struct ufs_hba *hba, struct ufshcd_lrb *lrbp)
static int ufshcd_wait_for_dev_cmd(struct ufs_hba *hba,
struct ufshcd_lrb *lrbp, int max_timeout)
{
- int err = 0;
- unsigned long time_left;
+ unsigned long time_left = msecs_to_jiffies(max_timeout);
unsigned long flags;
+ bool pending;
+ int err;
+retry:
time_left = wait_for_completion_timeout(hba->dev_cmd.complete,
- msecs_to_jiffies(max_timeout));
+ time_left);
- spin_lock_irqsave(hba->host->host_lock, flags);
- hba->dev_cmd.complete = NULL;
if (likely(time_left)) {
+ /*
+ * The completion handler called complete() and the caller of
+ * this function still owns the @lrbp tag so the code below does
+ * not trigger any race conditions.
+ */
+ hba->dev_cmd.complete = NULL;
err = ufshcd_get_tr_ocs(lrbp);
if (!err)
err = ufshcd_dev_cmd_completion(hba, lrbp);
- }
- spin_unlock_irqrestore(hba->host->host_lock, flags);
-
- if (!time_left) {
+ } else {
err = -ETIMEDOUT;
dev_dbg(hba->dev, "%s: dev_cmd request timedout, tag %d\n",
__func__, lrbp->task_tag);
- if (!ufshcd_clear_cmds(hba, 1U << lrbp->task_tag))
+ if (ufshcd_clear_cmds(hba, 1U << lrbp->task_tag) == 0) {
/* successfully cleared the command, retry if needed */
err = -EAGAIN;
- /*
- * in case of an error, after clearing the doorbell,
- * we also need to clear the outstanding_request
- * field in hba
- */
- spin_lock_irqsave(&hba->outstanding_lock, flags);
- __clear_bit(lrbp->task_tag, &hba->outstanding_reqs);
- spin_unlock_irqrestore(&hba->outstanding_lock, flags);
+ /*
+ * Since clearing the command succeeded we also need to
+ * clear the task tag bit from the outstanding_reqs
+ * variable.
+ */
+ spin_lock_irqsave(&hba->outstanding_lock, flags);
+ pending = test_bit(lrbp->task_tag,
+ &hba->outstanding_reqs);
+ if (pending) {
+ hba->dev_cmd.complete = NULL;
+ __clear_bit(lrbp->task_tag,
+ &hba->outstanding_reqs);
+ }
+ spin_unlock_irqrestore(&hba->outstanding_lock, flags);
+
+ if (!pending) {
+ /*
+ * The completion handler ran while we tried to
+ * clear the command.
+ */
+ time_left = 1;
+ goto retry;
+ }
+ } else {
+ dev_err(hba->dev, "%s: failed to clear tag %d\n",
+ __func__, lrbp->task_tag);
+ }
}
return err;
next reply other threads:[~2022-07-20 17:02 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-07-20 17:02 Bart Van Assche [this message]
2022-07-27 3:15 ` [PATCH v2] scsi: ufs: Fix a race condition related to device management Martin K. Petersen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220720170228.1598842-1-bvanassche@acm.org \
--to=bvanassche@acm.org \
--cc=James.Bottomley@HansenPartnership.com \
--cc=adrian.hunter@intel.com \
--cc=avri.altman@wdc.com \
--cc=beanhuo@micron.com \
--cc=jaegeuk@kernel.org \
--cc=jejb@linux.ibm.com \
--cc=linux-scsi@vger.kernel.org \
--cc=martin.petersen@oracle.com \
--cc=matthias.bgg@gmail.com \
--cc=santoshsy@gmail.com \
--cc=stanley.chu@mediatek.com \
--cc=sthumma@codeaurora.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.