From: Douglas Gilbert <dgilbert@interlog.com>
To: linux-scsi@vger.kernel.org
Cc: martin.petersen@oracle.com, jejb@linux.vnet.ibm.com,
	hare@suse.de, john.garry@huawei.com
Subject: [PATCH] scsi_debug: fix scp is NULL errors
Date: Thu, 13 Aug 2020 11:57:38 -0400
Message-ID: <20200813155738.109298-1-dgilbert@interlog.com>

John Garry reported 'sdebug_q_cmd_complete: scp is NULL' failures
that were mainly seen on aarch64 machines (e.g. RPi 4 with four
A72 CPUs). The problem was tracked down to a missing critical
section on a "short circuit" path. That path is taken when the time
already spent processing the current command equals or exceeds the
requested command duration (i.e. the number of nanoseconds in the
ndelay parameter).
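
As an illustration only, the pattern the fix restores can be sketched in
user space. This is not the scsi_debug code: a pthread mutex stands in for
spin_lock_irqsave(&sqp->qc_lock, iflags), plain ints and bools stand in for
the atomic counter and the in_use_bm bitmap, and every demo_* name is
hypothetical. The claim side is written on the assumption that a slot is
claimed under the same lock.

```c
/* User-space analogy of the critical section; NOT the kernel code.
 * All demo_* names are hypothetical stand-ins. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_QUEUE_DEPTH 8

struct demo_queue {
	pthread_mutex_t lock;              /* plays the role of qc_lock */
	void *a_cmnd[DEMO_QUEUE_DEPTH];    /* per-slot command pointer */
	bool in_use[DEMO_QUEUE_DEPTH];     /* plays the role of in_use_bm */
	int num_in_q;                      /* plays the role of num_in_q */
};

static void demo_queue_init(struct demo_queue *q)
{
	pthread_mutex_init(&q->lock, NULL);
	for (int k = 0; k < DEMO_QUEUE_DEPTH; k++) {
		q->a_cmnd[k] = NULL;
		q->in_use[k] = false;
	}
	q->num_in_q = 0;
}

/* Claim the first free slot for cmnd; returns its index, or -1 if full. */
static int demo_claim_slot(struct demo_queue *q, void *cmnd)
{
	int found = -1;

	pthread_mutex_lock(&q->lock);
	for (int k = 0; k < DEMO_QUEUE_DEPTH; k++) {
		if (!q->in_use[k]) {
			q->in_use[k] = true;
			q->a_cmnd[k] = cmnd;
			q->num_in_q++;
			found = k;
			break;
		}
	}
	pthread_mutex_unlock(&q->lock);
	return found;
}

/* "Short circuit" release: the pointer, the counter and the in-use flag
 * change together under the lock, so another thread that takes the lock
 * never sees a slot marked in use with a NULL command pointer. */
static void demo_release_slot(struct demo_queue *q, int k)
{
	pthread_mutex_lock(&q->lock);
	assert(q->in_use[k]);
	q->a_cmnd[k] = NULL;
	q->num_in_q--;
	q->in_use[k] = false;
	pthread_mutex_unlock(&q->lock);
}
```

A caller would run demo_queue_init() once, then pair each
demo_claim_slot() with a demo_release_slot() on the returned index.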

The random=1 parameter setting was pivotal in finding this error.
The failure scenario involved first taking that "short circuit"
path (due to a very short command duration) and then taking the
more likely hrtimer_start() path (due to a longer command
duration). With random=1 each command's duration is taken from
the uniformly distributed [0..ndelay) interval.
The fio utility also helped by reliably generating the error
scenario at about once per minute on a RPi 4 (64 bit OS).
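
To make the two paths concrete, here is a small user-space sketch of the
decision described above. It is an assumption-laden illustration, not the
driver's implementation: demo_random_duration() is a hypothetical helper
that maps a 32-bit random value onto [0..ndelay), and demo_choose_path()
mirrors only the "kt <= d" comparison visible in the diff below.

```c
/* Illustrative sketch only; the demo_* names are hypothetical. */
#include <assert.h>
#include <stdint.h>

enum demo_path {
	DEMO_SHORT_CIRCUIT,   /* complete the command from the calling thread */
	DEMO_HRTIMER          /* arm a timer for the remaining duration */
};

/* Scale a 32-bit random value into the half-open interval [0, ndelay). */
static uint32_t demo_random_duration(uint32_t rnd, uint32_t ndelay)
{
	return (uint32_t)(((uint64_t)rnd * ndelay) >> 32);
}

/* kt_ns: requested duration; elapsed_ns: time already spent on the command. */
static enum demo_path demo_choose_path(uint64_t kt_ns, uint64_t elapsed_ns)
{
	return (kt_ns <= elapsed_ns) ? DEMO_SHORT_CIRCUIT : DEMO_HRTIMER;
}
```

With a small drawn duration the elapsed time has usually passed it already
and the short-circuit path is chosen; a larger draw falls through to the
timer path, so random=1 mixes the two paths on the same queue.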

Reported-by: John Garry <john.garry@huawei.com>
Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
---
 drivers/scsi/scsi_debug.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index d95822dceeb6..4b4e31af22bd 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -5471,9 +5471,11 @@ static int schedule_resp(struct scsi_cmnd *cmnd, struct sdebug_dev_info *devip,
 				u64 d = ktime_get_boottime_ns() - ns_from_boot;
 
 				if (kt <= d) {	/* elapsed duration >= kt */
+					spin_lock_irqsave(&sqp->qc_lock, iflags);
 					sqcp->a_cmnd = NULL;
 					atomic_dec(&devip->num_in_q);
 					clear_bit(k, sqp->in_use_bm);
+					spin_unlock_irqrestore(&sqp->qc_lock, iflags);
 					if (new_sd_dp)
 						kfree(sd_dp);
 					/* call scsi_done() from this thread */
-- 
2.25.1


Thread overview: 3+ messages
2020-08-13 15:57 Douglas Gilbert [this message]
2020-08-15 17:18 ` [PATCH] scsi_debug: fix scp is NULL errors Lee Duncan
2020-08-18  3:12 ` Martin K. Petersen
