All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/8] Fix a deadlock in the UFS driver
@ 2022-09-29 22:00 Bart Van Assche
  2022-09-29 22:00 ` [PATCH v3 1/8] scsi: core: Fix a race between scsi_done() and scsi_timeout() Bart Van Assche
                   ` (7 more replies)
  0 siblings, 8 replies; 27+ messages in thread
From: Bart Van Assche @ 2022-09-29 22:00 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: Jaegeuk Kim, linux-scsi, Adrian Hunter, Bart Van Assche

Hi Martin,

This patch series fixes a deadlock in the UFS driver between the suspend/resume
code and the SCSI error handler. Please consider this patch series for the next
merge window.

Thanks,

Bart.

Changes compared to v2:
* Replaced the custom error handling code in ufshcd_eh_timed_out() with a call
  to ufshcd_link_recovery().

Changes compared to v1:
* Added support in the SCSI core for failing SCSI commands quickly during host
  recovery.
* Removed the patch that splits the ufshcd_err_handler() function.
* Fixed the code in ufshcd_eh_timed_out() for handling command timeouts.
* Removed the power management notifier again.

Bart Van Assche (8):
  scsi: core: Fix a race between scsi_done() and scsi_timeout()
  scsi: core: Change the return type of .eh_timed_out()
  scsi: core: Support failing requests while recovering
  scsi: ufs: Remove an outdated comment
  scsi: ufs: Use 'else' in ufshcd_set_dev_pwr_mode()
  scsi: ufs: Try harder to change the power mode
  scsi: ufs: Track system suspend / resume activity
  scsi: ufs: Fix a deadlock between PM and the SCSI error handler

 Documentation/scsi/scsi_eh.rst            |  7 ++--
 drivers/message/fusion/mptsas.c           |  8 ++---
 drivers/scsi/libiscsi.c                   | 26 +++++++-------
 drivers/scsi/megaraid/megaraid_sas_base.c |  7 ++--
 drivers/scsi/mvumi.c                      |  4 +--
 drivers/scsi/qla4xxx/ql4_os.c             |  8 ++---
 drivers/scsi/scsi_error.c                 | 41 +++++++++++------------
 drivers/scsi/scsi_lib.c                   |  8 +++--
 drivers/scsi/scsi_transport_fc.c          |  6 ++--
 drivers/scsi/scsi_transport_srp.c         |  8 ++---
 drivers/scsi/storvsc_drv.c                |  4 +--
 drivers/scsi/virtio_scsi.c                |  4 +--
 drivers/ufs/core/ufshcd.c                 | 37 +++++++++++++++-----
 include/scsi/libiscsi.h                   |  2 +-
 include/scsi/scsi_host.h                  | 14 +++++++-
 include/scsi/scsi_transport_fc.h          |  2 +-
 include/scsi/scsi_transport_srp.h         |  2 +-
 include/ufs/ufshcd.h                      |  5 ++-
 18 files changed, 115 insertions(+), 78 deletions(-)


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v3 1/8] scsi: core: Fix a race between scsi_done() and scsi_timeout()
  2022-09-29 22:00 [PATCH v3 0/8] Fix a deadlock in the UFS driver Bart Van Assche
@ 2022-09-29 22:00 ` Bart Van Assche
  2022-09-30  0:17   ` Mike Christie
  2022-09-29 22:00 ` [PATCH v3 2/8] scsi: core: Change the return type of .eh_timed_out() Bart Van Assche
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 27+ messages in thread
From: Bart Van Assche @ 2022-09-29 22:00 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: Jaegeuk Kim, linux-scsi, Adrian Hunter, Bart Van Assche,
	Keith Busch, Christoph Hellwig, Ming Lei, John Garry,
	Mike Christie, Hannes Reinecke, James E.J. Bottomley, Jens Axboe

If there is a race between scsi_done() and scsi_timeout() and if
scsi_timeout() loses the race, scsi_timeout() should not reset the
request timer. Hence change the return value for this case from
BLK_EH_RESET_TIMER into BLK_EH_DONE.

Although the block layer holds a reference on a request (req->ref) while
calling a timeout handler, restarting the timer (blk_add_timer()) while
a request is being completed is racy.

Cc: Keith Busch <kbusch@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Hannes Reinecke <hare@suse.de>
Reported-by: Adrian Hunter <adrian.hunter@intel.com>
Fixes: 065990bd198e ("scsi: set timed out out mq requests to complete")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/scsi_error.c | 14 +++-----------
 1 file changed, 3 insertions(+), 11 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 16bd0adc2339..d1b07ff64a96 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -343,19 +343,11 @@ enum blk_eh_timer_return scsi_timeout(struct request *req)
 
 	if (rtn == BLK_EH_DONE) {
 		/*
-		 * Set the command to complete first in order to prevent a real
-		 * completion from releasing the command while error handling
-		 * is using it. If the command was already completed, then the
-		 * lower level driver beat the timeout handler, and it is safe
-		 * to return without escalating error recovery.
-		 *
-		 * If timeout handling lost the race to a real completion, the
-		 * block layer may ignore that due to a fake timeout injection,
-		 * so return RESET_TIMER to allow error handling another shot
-		 * at this command.
+		 * If scsi_done() has already set SCMD_STATE_COMPLETE, do not
+		 * modify *scmd.
 		 */
 		if (test_and_set_bit(SCMD_STATE_COMPLETE, &scmd->state))
-			return BLK_EH_RESET_TIMER;
+			return BLK_EH_DONE;
 		if (scsi_abort_command(scmd) != SUCCESS) {
 			set_host_byte(scmd, DID_TIME_OUT);
 			scsi_eh_scmd_add(scmd);

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v3 2/8] scsi: core: Change the return type of .eh_timed_out()
  2022-09-29 22:00 [PATCH v3 0/8] Fix a deadlock in the UFS driver Bart Van Assche
  2022-09-29 22:00 ` [PATCH v3 1/8] scsi: core: Fix a race between scsi_done() and scsi_timeout() Bart Van Assche
@ 2022-09-29 22:00 ` Bart Van Assche
  2022-09-29 22:00 ` [PATCH v3 3/8] scsi: core: Support failing requests while recovering Bart Van Assche
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 27+ messages in thread
From: Bart Van Assche @ 2022-09-29 22:00 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: Jaegeuk Kim, linux-scsi, Adrian Hunter, Bart Van Assche,
	Lee Duncan, Christoph Hellwig, Ming Lei, John Garry,
	Mike Christie, Hannes Reinecke, Jonathan Corbet, Sathya Prakash,
	Sreekanth Reddy, Suganath Prabu Subramani, Chris Leech,
	James E.J. Bottomley, Kashyap Desai, Sumit Saxena, Shivasharan S,
	Nilesh Javali, Manish Rangankar, GR-QLogic-Storage-Upstream,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger, Wei Liu,
	Dexuan Cui, Michael S. Tsirkin, Jason Wang, Jens Axboe,
	Khazhismel Kumykov

Commit 6600593cbd93 ("block: rename BLK_EH_NOT_HANDLED to BLK_EH_DONE")
made it impossible for .eh_timed_out() implementations to call
scsi_done() without causing a crash. Restore support for SCSI timeout
handlers to call scsi_done() as follows:
* Change all .eh_timed_out() handlers as follows:
  - Change the return type into enum scsi_timeout_action.
  - Change BLK_EH_RESET_TIMER into SCSI_EH_RESET_TIMER.
  - Change BLK_EH_DONE into SCSI_EH_NOT_HANDLED.
* In scsi_timeout(), convert the SCSI_EH_* values into BLK_EH_* values.

Reviewed-by: Lee Duncan <lduncan@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 Documentation/scsi/scsi_eh.rst            |  7 +++--
 drivers/message/fusion/mptsas.c           |  8 +++---
 drivers/scsi/libiscsi.c                   | 26 +++++++++---------
 drivers/scsi/megaraid/megaraid_sas_base.c |  7 +++--
 drivers/scsi/mvumi.c                      |  4 +--
 drivers/scsi/qla4xxx/ql4_os.c             |  8 +++---
 drivers/scsi/scsi_error.c                 | 33 +++++++++++++----------
 drivers/scsi/scsi_transport_fc.c          |  6 ++---
 drivers/scsi/scsi_transport_srp.c         |  8 +++---
 drivers/scsi/storvsc_drv.c                |  4 +--
 drivers/scsi/virtio_scsi.c                |  4 +--
 include/scsi/libiscsi.h                   |  2 +-
 include/scsi/scsi_host.h                  | 14 +++++++++-
 include/scsi/scsi_transport_fc.h          |  2 +-
 include/scsi/scsi_transport_srp.h         |  2 +-
 15 files changed, 77 insertions(+), 58 deletions(-)

diff --git a/Documentation/scsi/scsi_eh.rst b/Documentation/scsi/scsi_eh.rst
index bad624fab823..104d09e9af09 100644
--- a/Documentation/scsi/scsi_eh.rst
+++ b/Documentation/scsi/scsi_eh.rst
@@ -92,14 +92,17 @@ The timeout handler is scsi_timeout().  When a timeout occurs, this function
  1. invokes optional hostt->eh_timed_out() callback.  Return value can
     be one of
 
-    - BLK_EH_RESET_TIMER
+    - SCSI_EH_RESET_TIMER
 	This indicates that more time is required to finish the
 	command.  Timer is restarted.
 
-    - BLK_EH_DONE
+    - SCSI_EH_NOT_HANDLED
         eh_timed_out() callback did not handle the command.
 	Step #2 is taken.
 
+    - SCSI_EH_DONE
+        eh_timed_out() completed the command.
+
  2. scsi_abort_command() is invoked to schedule an asynchronous abort which may
     issue a retry scmd->allowed + 1 times.  Asynchronous aborts are not invoked
     for commands for which the SCSI_EH_ABORT_SCHEDULED flag is set (this
diff --git a/drivers/message/fusion/mptsas.c b/drivers/message/fusion/mptsas.c
index 34901bcd1ce8..88fe4a860ae5 100644
--- a/drivers/message/fusion/mptsas.c
+++ b/drivers/message/fusion/mptsas.c
@@ -1952,12 +1952,12 @@ mptsas_qcmd(struct Scsi_Host *shost, struct scsi_cmnd *SCpnt)
  *	@sc: scsi command that the midlayer is about to time out
  *
  **/
-static enum blk_eh_timer_return mptsas_eh_timed_out(struct scsi_cmnd *sc)
+static enum scsi_timeout_action mptsas_eh_timed_out(struct scsi_cmnd *sc)
 {
 	MPT_SCSI_HOST *hd;
 	MPT_ADAPTER   *ioc;
 	VirtDevice    *vdevice;
-	enum blk_eh_timer_return rc = BLK_EH_DONE;
+	enum scsi_timeout_action rc = SCSI_EH_NOT_HANDLED;
 
 	hd = shost_priv(sc->device->host);
 	if (hd == NULL) {
@@ -1980,7 +1980,7 @@ static enum blk_eh_timer_return mptsas_eh_timed_out(struct scsi_cmnd *sc)
 		dtmprintk(ioc, printk(MYIOC_s_WARN_FMT ": %s: ioc is in reset,"
 		    "SML need to reset the timer (sc=%p)\n",
 		    ioc->name, __func__, sc));
-		rc = BLK_EH_RESET_TIMER;
+		rc = SCSI_EH_RESET_TIMER;
 	}
 	vdevice = sc->device->hostdata;
 	if (vdevice && vdevice->vtarget && (vdevice->vtarget->inDMD
@@ -1988,7 +1988,7 @@ static enum blk_eh_timer_return mptsas_eh_timed_out(struct scsi_cmnd *sc)
 		dtmprintk(ioc, printk(MYIOC_s_WARN_FMT ": %s: target removed "
 		    "or in device removal delay (sc=%p)\n",
 		    ioc->name, __func__, sc));
-		rc = BLK_EH_RESET_TIMER;
+		rc = SCSI_EH_RESET_TIMER;
 		goto done;
 	}
 
diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index d95f4bcdeb2e..ef2fc860257e 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -2071,9 +2071,9 @@ static int iscsi_has_ping_timed_out(struct iscsi_conn *conn)
 		return 0;
 }
 
-enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd *sc)
+enum scsi_timeout_action iscsi_eh_cmd_timed_out(struct scsi_cmnd *sc)
 {
-	enum blk_eh_timer_return rc = BLK_EH_DONE;
+	enum scsi_timeout_action rc = SCSI_EH_NOT_HANDLED;
 	struct iscsi_task *task = NULL, *running_task;
 	struct iscsi_cls_session *cls_session;
 	struct iscsi_session *session;
@@ -2093,7 +2093,7 @@ enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd *sc)
 		 * Raced with completion. Blk layer has taken ownership
 		 * so let timeout code complete it now.
 		 */
-		rc = BLK_EH_DONE;
+		rc = SCSI_EH_NOT_HANDLED;
 		spin_unlock(&session->back_lock);
 		goto done;
 	}
@@ -2102,7 +2102,7 @@ enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd *sc)
 		 * Racing with the completion path right now, so give it more
 		 * time so that path can complete it like normal.
 		 */
-		rc = BLK_EH_RESET_TIMER;
+		rc = SCSI_EH_RESET_TIMER;
 		task = NULL;
 		spin_unlock(&session->back_lock);
 		goto done;
@@ -2120,21 +2120,21 @@ enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd *sc)
 		if (unlikely(system_state != SYSTEM_RUNNING)) {
 			sc->result = DID_NO_CONNECT << 16;
 			ISCSI_DBG_EH(session, "sc on shutdown, handled\n");
-			rc = BLK_EH_DONE;
+			rc = SCSI_EH_NOT_HANDLED;
 			goto done;
 		}
 		/*
 		 * We are probably in the middle of iscsi recovery so let
 		 * that complete and handle the error.
 		 */
-		rc = BLK_EH_RESET_TIMER;
+		rc = SCSI_EH_RESET_TIMER;
 		goto done;
 	}
 
 	conn = session->leadconn;
 	if (!conn) {
 		/* In the middle of shuting down */
-		rc = BLK_EH_RESET_TIMER;
+		rc = SCSI_EH_RESET_TIMER;
 		goto done;
 	}
 
@@ -2151,7 +2151,7 @@ enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd *sc)
 			     "Last data xfer at %lu. Last timeout was at "
 			     "%lu\n.", task->last_xfer, task->last_timeout);
 		task->have_checked_conn = false;
-		rc = BLK_EH_RESET_TIMER;
+		rc = SCSI_EH_RESET_TIMER;
 		goto done;
 	}
 
@@ -2162,7 +2162,7 @@ enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd *sc)
 	 * and can let the iscsi eh handle it
 	 */
 	if (iscsi_has_ping_timed_out(conn)) {
-		rc = BLK_EH_RESET_TIMER;
+		rc = SCSI_EH_RESET_TIMER;
 		goto done;
 	}
 
@@ -2200,7 +2200,7 @@ enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd *sc)
 				     task->last_xfer, running_task->last_xfer,
 				     task->last_timeout);
 			spin_unlock(&session->back_lock);
-			rc = BLK_EH_RESET_TIMER;
+			rc = SCSI_EH_RESET_TIMER;
 			goto done;
 		}
 	}
@@ -2216,14 +2216,14 @@ enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd *sc)
 	 */
 	if (READ_ONCE(conn->ping_task)) {
 		task->have_checked_conn = true;
-		rc = BLK_EH_RESET_TIMER;
+		rc = SCSI_EH_RESET_TIMER;
 		goto done;
 	}
 
 	/* Make sure there is a transport check done */
 	iscsi_send_nopout(conn, NULL);
 	task->have_checked_conn = true;
-	rc = BLK_EH_RESET_TIMER;
+	rc = SCSI_EH_RESET_TIMER;
 
 done:
 	spin_unlock_bh(&session->frwd_lock);
@@ -2232,7 +2232,7 @@ enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd *sc)
 		task->last_timeout = jiffies;
 		iscsi_put_task(task);
 	}
-	ISCSI_DBG_EH(session, "return %s\n", rc == BLK_EH_RESET_TIMER ?
+	ISCSI_DBG_EH(session, "return %s\n", rc == SCSI_EH_RESET_TIMER ?
 		     "timer reset" : "shutdown or nh");
 	return rc;
 }
diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index ae6b9a570fa9..643b1a9a3480 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -2927,15 +2927,14 @@ static int megasas_generic_reset(struct scsi_cmnd *scmd)
  * Sets the FW busy flag and reduces the host->can_queue if the
  * cmd has not been completed within the timeout period.
  */
-static enum
-blk_eh_timer_return megasas_reset_timer(struct scsi_cmnd *scmd)
+static enum scsi_timeout_action megasas_reset_timer(struct scsi_cmnd *scmd)
 {
 	struct megasas_instance *instance;
 	unsigned long flags;
 
 	if (time_after(jiffies, scmd->jiffies_at_alloc +
 				(scmd_timeout * 2) * HZ)) {
-		return BLK_EH_DONE;
+		return SCSI_EH_NOT_HANDLED;
 	}
 
 	instance = (struct megasas_instance *)scmd->device->host->hostdata;
@@ -2949,7 +2948,7 @@ blk_eh_timer_return megasas_reset_timer(struct scsi_cmnd *scmd)
 
 		spin_unlock_irqrestore(instance->host->host_lock, flags);
 	}
-	return BLK_EH_RESET_TIMER;
+	return SCSI_EH_RESET_TIMER;
 }
 
 /**
diff --git a/drivers/scsi/mvumi.c b/drivers/scsi/mvumi.c
index 05d3ce9b72db..b3dcb8918618 100644
--- a/drivers/scsi/mvumi.c
+++ b/drivers/scsi/mvumi.c
@@ -2109,7 +2109,7 @@ static int mvumi_queue_command(struct Scsi_Host *shost,
 	return 0;
 }
 
-static enum blk_eh_timer_return mvumi_timed_out(struct scsi_cmnd *scmd)
+static enum scsi_timeout_action mvumi_timed_out(struct scsi_cmnd *scmd)
 {
 	struct mvumi_cmd *cmd = mvumi_priv(scmd)->cmd_priv;
 	struct Scsi_Host *host = scmd->device->host;
@@ -2137,7 +2137,7 @@ static enum blk_eh_timer_return mvumi_timed_out(struct scsi_cmnd *scmd)
 	mvumi_return_cmd(mhba, cmd);
 	spin_unlock_irqrestore(mhba->shost->host_lock, flags);
 
-	return BLK_EH_DONE;
+	return SCSI_EH_NOT_HANDLED;
 }
 
 static int
diff --git a/drivers/scsi/qla4xxx/ql4_os.c b/drivers/scsi/qla4xxx/ql4_os.c
index 9e849f6b0d0f..005502125b27 100644
--- a/drivers/scsi/qla4xxx/ql4_os.c
+++ b/drivers/scsi/qla4xxx/ql4_os.c
@@ -116,7 +116,7 @@ static int qla4xxx_iface_set_param(struct Scsi_Host *shost, void *data,
 static int qla4xxx_get_iface_param(struct iscsi_iface *iface,
 				   enum iscsi_param_type param_type,
 				   int param, char *buf);
-static enum blk_eh_timer_return qla4xxx_eh_cmd_timed_out(struct scsi_cmnd *sc);
+static enum scsi_timeout_action qla4xxx_eh_cmd_timed_out(struct scsi_cmnd *sc);
 static struct iscsi_endpoint *qla4xxx_ep_connect(struct Scsi_Host *shost,
 						 struct sockaddr *dst_addr,
 						 int non_blocking);
@@ -1871,17 +1871,17 @@ static void qla4xxx_conn_get_stats(struct iscsi_cls_conn *cls_conn,
 	return;
 }
 
-static enum blk_eh_timer_return qla4xxx_eh_cmd_timed_out(struct scsi_cmnd *sc)
+static enum scsi_timeout_action qla4xxx_eh_cmd_timed_out(struct scsi_cmnd *sc)
 {
 	struct iscsi_cls_session *session;
 	unsigned long flags;
-	enum blk_eh_timer_return ret = BLK_EH_DONE;
+	enum scsi_timeout_action ret = SCSI_EH_NOT_HANDLED;
 
 	session = starget_to_session(scsi_target(sc->device));
 
 	spin_lock_irqsave(&session->lock, flags);
 	if (session->state == ISCSI_SESSION_FAILED)
-		ret = BLK_EH_RESET_TIMER;
+		ret = SCSI_EH_RESET_TIMER;
 	spin_unlock_irqrestore(&session->lock, flags);
 
 	return ret;
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index d1b07ff64a96..d3eee1435e47 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -328,7 +328,6 @@ void scsi_eh_scmd_add(struct scsi_cmnd *scmd)
 enum blk_eh_timer_return scsi_timeout(struct request *req)
 {
 	struct scsi_cmnd *scmd = blk_mq_rq_to_pdu(req);
-	enum blk_eh_timer_return rtn = BLK_EH_DONE;
 	struct Scsi_Host *host = scmd->device->host;
 
 	trace_scsi_dispatch_cmd_timeout(scmd);
@@ -338,23 +337,29 @@ enum blk_eh_timer_return scsi_timeout(struct request *req)
 	if (host->eh_deadline != -1 && !host->last_reset)
 		host->last_reset = jiffies;
 
-	if (host->hostt->eh_timed_out)
-		rtn = host->hostt->eh_timed_out(scmd);
-
-	if (rtn == BLK_EH_DONE) {
-		/*
-		 * If scsi_done() has already set SCMD_STATE_COMPLETE, do not
-		 * modify *scmd.
-		 */
-		if (test_and_set_bit(SCMD_STATE_COMPLETE, &scmd->state))
+	if (host->hostt->eh_timed_out) {
+		switch (host->hostt->eh_timed_out(scmd)) {
+		case SCSI_EH_DONE:
 			return BLK_EH_DONE;
-		if (scsi_abort_command(scmd) != SUCCESS) {
-			set_host_byte(scmd, DID_TIME_OUT);
-			scsi_eh_scmd_add(scmd);
+		case SCSI_EH_RESET_TIMER:
+			return BLK_EH_RESET_TIMER;
+		case SCSI_EH_NOT_HANDLED:
+			break;
 		}
 	}
 
-	return rtn;
+	/*
+	 * If scsi_done() has already set SCMD_STATE_COMPLETE, do not modify
+	 * *scmd.
+	 */
+	if (test_and_set_bit(SCMD_STATE_COMPLETE, &scmd->state))
+		return BLK_EH_DONE;
+	if (scsi_abort_command(scmd) != SUCCESS) {
+		set_host_byte(scmd, DID_TIME_OUT);
+		scsi_eh_scmd_add(scmd);
+	}
+
+	return BLK_EH_DONE;
 }
 
 /**
diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index 8934160c4a33..7294bb98df92 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -2530,15 +2530,15 @@ static int fc_vport_match(struct attribute_container *cont,
  * Notes:
  *	This routine assumes no locks are held on entry.
  */
-enum blk_eh_timer_return
+enum scsi_timeout_action
 fc_eh_timed_out(struct scsi_cmnd *scmd)
 {
 	struct fc_rport *rport = starget_to_rport(scsi_target(scmd->device));
 
 	if (rport->port_state == FC_PORTSTATE_BLOCKED)
-		return BLK_EH_RESET_TIMER;
+		return SCSI_EH_RESET_TIMER;
 
-	return BLK_EH_DONE;
+	return SCSI_EH_NOT_HANDLED;
 }
 EXPORT_SYMBOL(fc_eh_timed_out);
 
diff --git a/drivers/scsi/scsi_transport_srp.c b/drivers/scsi/scsi_transport_srp.c
index 98a34ed10f1a..87d0fb8dc503 100644
--- a/drivers/scsi/scsi_transport_srp.c
+++ b/drivers/scsi/scsi_transport_srp.c
@@ -594,13 +594,13 @@ EXPORT_SYMBOL(srp_reconnect_rport);
  * @scmd: SCSI command.
  *
  * If a timeout occurs while an rport is in the blocked state, ask the SCSI
- * EH to continue waiting (BLK_EH_RESET_TIMER). Otherwise let the SCSI core
- * handle the timeout (BLK_EH_DONE).
+ * EH to continue waiting (SCSI_EH_RESET_TIMER). Otherwise let the SCSI core
+ * handle the timeout (SCSI_EH_NOT_HANDLED).
  *
  * Note: This function is called from soft-IRQ context and with the request
  * queue lock held.
  */
-enum blk_eh_timer_return srp_timed_out(struct scsi_cmnd *scmd)
+enum scsi_timeout_action srp_timed_out(struct scsi_cmnd *scmd)
 {
 	struct scsi_device *sdev = scmd->device;
 	struct Scsi_Host *shost = sdev->host;
@@ -611,7 +611,7 @@ enum blk_eh_timer_return srp_timed_out(struct scsi_cmnd *scmd)
 	return rport && rport->fast_io_fail_tmo < 0 &&
 		rport->dev_loss_tmo < 0 &&
 		i->f->reset_timer_if_blocked && scsi_device_blocked(sdev) ?
-		BLK_EH_RESET_TIMER : BLK_EH_DONE;
+		SCSI_EH_RESET_TIMER : SCSI_EH_NOT_HANDLED;
 }
 EXPORT_SYMBOL(srp_timed_out);
 
diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index 25c44c87c972..169bbe510fca 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -1643,13 +1643,13 @@ static int storvsc_host_reset_handler(struct scsi_cmnd *scmnd)
  * be unbounded on Azure.  Reset the timer unconditionally to give the host a
  * chance to perform EH.
  */
-static enum blk_eh_timer_return storvsc_eh_timed_out(struct scsi_cmnd *scmnd)
+static enum scsi_timeout_action storvsc_eh_timed_out(struct scsi_cmnd *scmnd)
 {
 #if IS_ENABLED(CONFIG_SCSI_FC_ATTRS)
 	if (scmnd->device->host->transportt == fc_transport_template)
 		return fc_eh_timed_out(scmnd);
 #endif
-	return BLK_EH_RESET_TIMER;
+	return SCSI_EH_RESET_TIMER;
 }
 
 static bool storvsc_scsi_cmd_ok(struct scsi_cmnd *scmnd)
diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 00cf6743db8c..0eda846b7365 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -731,9 +731,9 @@ static void virtscsi_commit_rqs(struct Scsi_Host *shost, u16 hwq)
  * latencies might be higher than on bare metal.  Reset the timer
  * unconditionally to give the host a chance to perform EH.
  */
-static enum blk_eh_timer_return virtscsi_eh_timed_out(struct scsi_cmnd *scmnd)
+static enum scsi_timeout_action virtscsi_eh_timed_out(struct scsi_cmnd *scmnd)
 {
-	return BLK_EH_RESET_TIMER;
+	return SCSI_EH_RESET_TIMER;
 }
 
 static struct scsi_host_template virtscsi_host_template = {
diff --git a/include/scsi/libiscsi.h b/include/scsi/libiscsi.h
index 654cc3918c94..695eebc6f2c8 100644
--- a/include/scsi/libiscsi.h
+++ b/include/scsi/libiscsi.h
@@ -393,7 +393,7 @@ extern int iscsi_eh_recover_target(struct scsi_cmnd *sc);
 extern int iscsi_eh_session_reset(struct scsi_cmnd *sc);
 extern int iscsi_eh_device_reset(struct scsi_cmnd *sc);
 extern int iscsi_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *sc);
-extern enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd *sc);
+extern enum scsi_timeout_action iscsi_eh_cmd_timed_out(struct scsi_cmnd *sc);
 
 /*
  * iSCSI host helpers.
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index aa7b7496c93a..85b2c4986dea 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -27,6 +27,18 @@ struct scsi_transport_template;
 #define MODE_INITIATOR 0x01
 #define MODE_TARGET 0x02
 
+/**
+ * enum scsi_timeout_action - How to handle a command that timed out.
+ * @SCSI_EH_DONE: The command has already been completed.
+ * @SCSI_EH_RESET_TIMER: Reset the timer and continue waiting for completion.
+ * @SCSI_EH_NOT_HANDLED: The command has not yet finished. Abort the command.
+ */
+enum scsi_timeout_action {
+	SCSI_EH_DONE,
+	SCSI_EH_RESET_TIMER,
+	SCSI_EH_NOT_HANDLED,
+};
+
 struct scsi_host_template {
 	/*
 	 * Put fields referenced in IO submission path together in
@@ -331,7 +343,7 @@ struct scsi_host_template {
 	 *
 	 * Status: OPTIONAL
 	 */
-	enum blk_eh_timer_return (*eh_timed_out)(struct scsi_cmnd *);
+	enum scsi_timeout_action (*eh_timed_out)(struct scsi_cmnd *);
 	/*
 	 * Optional routine that allows the transport to decide if a cmd
 	 * is retryable. Return true if the transport is in a state the
diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h
index e80a7c542c88..3dcda19d3520 100644
--- a/include/scsi/scsi_transport_fc.h
+++ b/include/scsi/scsi_transport_fc.h
@@ -862,7 +862,7 @@ struct fc_vport *fc_vport_create(struct Scsi_Host *shost, int channel,
 int fc_vport_terminate(struct fc_vport *vport);
 int fc_block_rport(struct fc_rport *rport);
 int fc_block_scsi_eh(struct scsi_cmnd *cmnd);
-enum blk_eh_timer_return fc_eh_timed_out(struct scsi_cmnd *scmd);
+enum scsi_timeout_action fc_eh_timed_out(struct scsi_cmnd *scmd);
 bool fc_eh_should_retry_cmd(struct scsi_cmnd *scmd);
 
 static inline struct Scsi_Host *fc_bsg_to_shost(struct bsg_job *job)
diff --git a/include/scsi/scsi_transport_srp.h b/include/scsi/scsi_transport_srp.h
index d22df12584f9..dfc78aa112ad 100644
--- a/include/scsi/scsi_transport_srp.h
+++ b/include/scsi/scsi_transport_srp.h
@@ -118,7 +118,7 @@ extern int srp_reconnect_rport(struct srp_rport *rport);
 extern void srp_start_tl_fail_timers(struct srp_rport *rport);
 extern void srp_remove_host(struct Scsi_Host *);
 extern void srp_stop_rport_timers(struct srp_rport *rport);
-enum blk_eh_timer_return srp_timed_out(struct scsi_cmnd *scmd);
+enum scsi_timeout_action srp_timed_out(struct scsi_cmnd *scmd);
 
 /**
  * srp_chkready() - evaluate the transport layer state before I/O

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v3 3/8] scsi: core: Support failing requests while recovering
  2022-09-29 22:00 [PATCH v3 0/8] Fix a deadlock in the UFS driver Bart Van Assche
  2022-09-29 22:00 ` [PATCH v3 1/8] scsi: core: Fix a race between scsi_done() and scsi_timeout() Bart Van Assche
  2022-09-29 22:00 ` [PATCH v3 2/8] scsi: core: Change the return type of .eh_timed_out() Bart Van Assche
@ 2022-09-29 22:00 ` Bart Van Assche
  2022-10-03 17:27   ` Mike Christie
  2022-09-29 22:00 ` [PATCH v3 4/8] scsi: ufs: Remove an outdated comment Bart Van Assche
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 27+ messages in thread
From: Bart Van Assche @ 2022-09-29 22:00 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: Jaegeuk Kim, linux-scsi, Adrian Hunter, Bart Van Assche,
	Christoph Hellwig, Ming Lei, John Garry, Mike Christie,
	Hannes Reinecke, James E.J. Bottomley

The current behavior for SCSI commands submitted while error recovery
is ongoing is to retry command submission after error recovery has
finished. See also the scsi_host_in_recovery() check in
scsi_host_queue_ready(). Add support for failing SCSI commands while
host recovery is in progress. This functionality will be used to fix a
deadlock in the UFS driver.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/scsi_lib.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 473d9403f0c1..ecd2ce3815df 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1337,9 +1337,6 @@ static inline int scsi_host_queue_ready(struct request_queue *q,
 				   struct scsi_device *sdev,
 				   struct scsi_cmnd *cmd)
 {
-	if (scsi_host_in_recovery(shost))
-		return 0;
-
 	if (atomic_read(&shost->host_blocked) > 0) {
 		if (scsi_host_busy(shost) > 0)
 			goto starved;
@@ -1727,6 +1724,11 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
 	ret = BLK_STS_RESOURCE;
 	if (!scsi_target_queue_ready(shost, sdev))
 		goto out_put_budget;
+	if (unlikely(scsi_host_in_recovery(shost))) {
+		if (req->cmd_flags & REQ_FAILFAST_MASK)
+			ret = BLK_STS_OFFLINE;
+		goto out_dec_target_busy;
+	}
 	if (!scsi_host_queue_ready(q, shost, sdev, cmd))
 		goto out_dec_target_busy;
 

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v3 4/8] scsi: ufs: Remove an outdated comment
  2022-09-29 22:00 [PATCH v3 0/8] Fix a deadlock in the UFS driver Bart Van Assche
                   ` (2 preceding siblings ...)
  2022-09-29 22:00 ` [PATCH v3 3/8] scsi: core: Support failing requests while recovering Bart Van Assche
@ 2022-09-29 22:00 ` Bart Van Assche
  2022-09-30 10:26   ` Bean Huo
  2022-10-03  5:55   ` Adrian Hunter
  2022-09-29 22:00 ` [PATCH v3 5/8] scsi: ufs: Use 'else' in ufshcd_set_dev_pwr_mode() Bart Van Assche
                   ` (3 subsequent siblings)
  7 siblings, 2 replies; 27+ messages in thread
From: Bart Van Assche @ 2022-09-29 22:00 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: Jaegeuk Kim, linux-scsi, Adrian Hunter, Bart Van Assche,
	James E.J. Bottomley, Bean Huo, Avri Altman, Jinyoung Choi

Although the host lock had to be held by ufshcd_clk_scaling_start_busy()
callers when that function was introduced, that is no longer the case
today. Hence remove the comment that claims that callers of this function
must hold the host lock.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/ufs/core/ufshcd.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index 7c15cbc737b4..78c980585dc3 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -2013,7 +2013,6 @@ static void ufshcd_exit_clk_gating(struct ufs_hba *hba)
 	destroy_workqueue(hba->clk_gating.clk_gating_workq);
 }
 
-/* Must be called with host lock acquired */
 static void ufshcd_clk_scaling_start_busy(struct ufs_hba *hba)
 {
 	bool queue_resume_work = false;

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v3 5/8] scsi: ufs: Use 'else' in ufshcd_set_dev_pwr_mode()
  2022-09-29 22:00 [PATCH v3 0/8] Fix a deadlock in the UFS driver Bart Van Assche
                   ` (3 preceding siblings ...)
  2022-09-29 22:00 ` [PATCH v3 4/8] scsi: ufs: Remove an outdated comment Bart Van Assche
@ 2022-09-29 22:00 ` Bart Van Assche
  2022-09-30 10:28   ` Bean Huo
  2022-10-03  5:58   ` Adrian Hunter
  2022-09-29 22:00 ` [PATCH v3 6/8] scsi: ufs: Try harder to change the power mode Bart Van Assche
                   ` (2 subsequent siblings)
  7 siblings, 2 replies; 27+ messages in thread
From: Bart Van Assche @ 2022-09-29 22:00 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: Jaegeuk Kim, linux-scsi, Adrian Hunter, Bart Van Assche,
	James E.J. Bottomley, Bean Huo, Avri Altman, Jinyoung Choi

Convert if (ret) { ... } if (!ret) { ... } into
if (ret) { ... } else { ... }.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/ufs/core/ufshcd.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index 78c980585dc3..02e73208b921 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -8798,10 +8798,9 @@ static int ufshcd_set_dev_pwr_mode(struct ufs_hba *hba,
 				scsi_print_sense_hdr(sdp, NULL, &sshdr);
 			ret = -EIO;
 		}
-	}
-
-	if (!ret)
+	} else {
 		hba->curr_dev_pwr_mode = pwr_mode;
+	}
 
 	scsi_device_put(sdp);
 	hba->host->eh_noresume = 0;

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v3 6/8] scsi: ufs: Try harder to change the power mode
  2022-09-29 22:00 [PATCH v3 0/8] Fix a deadlock in the UFS driver Bart Van Assche
                   ` (4 preceding siblings ...)
  2022-09-29 22:00 ` [PATCH v3 5/8] scsi: ufs: Use 'else' in ufshcd_set_dev_pwr_mode() Bart Van Assche
@ 2022-09-29 22:00 ` Bart Van Assche
  2022-09-30 12:14   ` Bean Huo
  2022-10-03  6:10   ` Adrian Hunter
  2022-09-29 22:00 ` [PATCH v3 7/8] scsi: ufs: Track system suspend / resume activity Bart Van Assche
  2022-09-29 22:00 ` [PATCH v3 8/8] scsi: ufs: Fix a deadlock between PM and the SCSI error handler Bart Van Assche
  7 siblings, 2 replies; 27+ messages in thread
From: Bart Van Assche @ 2022-09-29 22:00 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: Jaegeuk Kim, linux-scsi, Adrian Hunter, Bart Van Assche,
	James E.J. Bottomley, Bean Huo, Avri Altman, Jinyoung Choi

Instead of only retrying the START STOP UNIT command if a unit attention
is reported, repeat it if any SCSI error is reported by the device or if
the command timed out.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/ufs/core/ufshcd.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index 02e73208b921..e8c0504e9e83 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -8784,9 +8784,9 @@ static int ufshcd_set_dev_pwr_mode(struct ufs_hba *hba,
 	for (retries = 3; retries > 0; --retries) {
 		ret = scsi_execute(sdp, cmd, DMA_NONE, NULL, 0, NULL, &sshdr,
 				START_STOP_TIMEOUT, 0, 0, RQF_PM, NULL);
-		if (!scsi_status_is_check_condition(ret) ||
-				!scsi_sense_valid(&sshdr) ||
-				sshdr.sense_key != UNIT_ATTENTION)
+		if (ret < 0)
+			break;
+		if (host_byte(ret) == 0 && scsi_status_is_good(ret))
 			break;
 	}
 	if (ret) {

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v3 7/8] scsi: ufs: Track system suspend / resume activity
  2022-09-29 22:00 [PATCH v3 0/8] Fix a deadlock in the UFS driver Bart Van Assche
                   ` (5 preceding siblings ...)
  2022-09-29 22:00 ` [PATCH v3 6/8] scsi: ufs: Try harder to change the power mode Bart Van Assche
@ 2022-09-29 22:00 ` Bart Van Assche
  2022-10-03  6:10   ` Adrian Hunter
  2022-09-29 22:00 ` [PATCH v3 8/8] scsi: ufs: Fix a deadlock between PM and the SCSI error handler Bart Van Assche
  7 siblings, 1 reply; 27+ messages in thread
From: Bart Van Assche @ 2022-09-29 22:00 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: Jaegeuk Kim, linux-scsi, Adrian Hunter, Bart Van Assche,
	James E.J. Bottomley, Bean Huo, Avri Altman, Jinyoung Choi,
	Stanley Chu, jongmin jeong, Eric Biggers

Add a new boolean variable that tracks whether the system is suspending,
suspended or resuming. This information will be used in a later patch to
fix a deadlock between the SCSI error handler and the suspend code.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/ufs/core/ufshcd.c | 2 ++
 include/ufs/ufshcd.h      | 5 ++++-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index e8c0504e9e83..5507d93a4bba 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -9253,6 +9253,7 @@ static int ufshcd_wl_suspend(struct device *dev)
 
 	hba = shost_priv(sdev->host);
 	down(&hba->host_sem);
+	hba->system_suspending = true;
 
 	if (pm_runtime_suspended(dev))
 		goto out;
@@ -9294,6 +9295,7 @@ static int ufshcd_wl_resume(struct device *dev)
 		hba->curr_dev_pwr_mode, hba->uic_link_state);
 	if (!ret)
 		hba->is_sys_suspended = false;
+	hba->system_suspending = false;
 	up(&hba->host_sem);
 	return ret;
 }
diff --git a/include/ufs/ufshcd.h b/include/ufs/ufshcd.h
index 9f28349ebcff..96538eb3a6c0 100644
--- a/include/ufs/ufshcd.h
+++ b/include/ufs/ufshcd.h
@@ -802,7 +802,9 @@ struct ufs_hba_monitor {
  * @caps: bitmask with information about UFS controller capabilities
  * @devfreq: frequency scaling information owned by the devfreq core
  * @clk_scaling: frequency scaling information owned by the UFS driver
- * @is_sys_suspended: whether or not the entire system has been suspended
+ * @system_suspending: system suspend has been started and system resume has
+ *	not yet finished.
+ * @is_sys_suspended: UFS device has been suspended because of system suspend
  * @urgent_bkops_lvl: keeps track of urgent bkops level for device
  * @is_urgent_bkops_lvl_checked: keeps track if the urgent bkops level for
  *  device is known or not.
@@ -943,6 +945,7 @@ struct ufs_hba {
 
 	struct devfreq *devfreq;
 	struct ufs_clk_scaling clk_scaling;
+	bool system_suspending;
 	bool is_sys_suspended;
 
 	enum bkops_status urgent_bkops_lvl;

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v3 8/8] scsi: ufs: Fix a deadlock between PM and the SCSI error handler
  2022-09-29 22:00 [PATCH v3 0/8] Fix a deadlock in the UFS driver Bart Van Assche
                   ` (6 preceding siblings ...)
  2022-09-29 22:00 ` [PATCH v3 7/8] scsi: ufs: Track system suspend / resume activity Bart Van Assche
@ 2022-09-29 22:00 ` Bart Van Assche
  2022-09-30 15:03   ` Bean Huo
  2022-10-03  6:18   ` Adrian Hunter
  7 siblings, 2 replies; 27+ messages in thread
From: Bart Van Assche @ 2022-09-29 22:00 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: Jaegeuk Kim, linux-scsi, Adrian Hunter, Bart Van Assche,
	James E.J. Bottomley, Bean Huo, Avri Altman, Jinyoung Choi

The following deadlock has been observed on multiple test setups:
* ufshcd_wl_suspend() is waiting for blk_execute_rq() to complete while it
  holds host_sem.
* ufshcd_eh_host_reset_handler() invokes ufshcd_err_handler() and the
  latter function tries to obtain host_sem.
This is a deadlock because blk_execute_rq() can't execute SCSI commands
while the host is in the SHOST_RECOVERY state and because the error
handler cannot make progress either.

Fix this deadlock as follows:
* Fail attempts to suspend the system while the SCSI error handler is in
  progress.
* If the system is suspending and a START STOP UNIT command times out,
  handle the SCSI command timeout from inside the context of the SCSI
  timeout handler instead of activating the SCSI error handler.
* Reduce the START STOP UNIT command timeout to one second since on
  Android devices a kernel panic is triggered if an attempt to suspend
  the system takes more than 20 seconds. One second should be enough for
  the START STOP UNIT command since this command completes in less than
  a millisecond for the UFS devices I have access to.

The runtime power management code is not affected by this deadlock since
hba->host_sem is not touched by the runtime power management functions
in the UFS driver.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/ufs/core/ufshcd.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index 5507d93a4bba..0294b51e776a 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -8295,6 +8295,26 @@ static void ufshcd_async_scan(void *data, async_cookie_t cookie)
 	}
 }
 
+static enum scsi_timeout_action ufshcd_eh_timed_out(struct scsi_cmnd *scmd)
+{
+	struct ufs_hba *hba = shost_priv(scmd->device->host);
+
+	if (!hba->system_suspending) {
+		/* Activate the error handler in the SCSI core. */
+		return SCSI_EH_NOT_HANDLED;
+	}
+
+	/*
+	 * Handle errors directly to prevent a deadlock between
+	 * ufshcd_set_dev_pwr_mode() and ufshcd_err_handler().
+	 */
+	ufshcd_link_recovery(hba);
+	dev_info(hba->dev, "%s() finished; outstanding_tasks = %#lx.\n",
+		 __func__, hba->outstanding_tasks);
+
+	return hba->outstanding_tasks ? SCSI_EH_RESET_TIMER : SCSI_EH_DONE;
+}
+
 static const struct attribute_group *ufshcd_driver_groups[] = {
 	&ufs_sysfs_unit_descriptor_group,
 	&ufs_sysfs_lun_attributes_group,
@@ -8329,6 +8349,7 @@ static struct scsi_host_template ufshcd_driver_template = {
 	.eh_abort_handler	= ufshcd_abort,
 	.eh_device_reset_handler = ufshcd_eh_device_reset_handler,
 	.eh_host_reset_handler   = ufshcd_eh_host_reset_handler,
+	.eh_timed_out		= ufshcd_eh_timed_out,
 	.this_id		= -1,
 	.sg_tablesize		= SG_ALL,
 	.cmd_per_lun		= UFSHCD_CMD_PER_LUN,
@@ -8783,7 +8804,7 @@ static int ufshcd_set_dev_pwr_mode(struct ufs_hba *hba,
 	 */
 	for (retries = 3; retries > 0; --retries) {
 		ret = scsi_execute(sdp, cmd, DMA_NONE, NULL, 0, NULL, &sshdr,
-				START_STOP_TIMEOUT, 0, 0, RQF_PM, NULL);
+				   1 * HZ, 0, REQ_FAILFAST_DEV, RQF_PM, NULL);
 		if (ret < 0)
 			break;
 		if (host_byte(ret) == 0 && scsi_status_is_good(ret))

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 1/8] scsi: core: Fix a race between scsi_done() and scsi_timeout()
  2022-09-29 22:00 ` [PATCH v3 1/8] scsi: core: Fix a race between scsi_done() and scsi_timeout() Bart Van Assche
@ 2022-09-30  0:17   ` Mike Christie
  2022-09-30  0:32     ` Bart Van Assche
  0 siblings, 1 reply; 27+ messages in thread
From: Mike Christie @ 2022-09-30  0:17 UTC (permalink / raw)
  To: Bart Van Assche, Martin K . Petersen
  Cc: Jaegeuk Kim, linux-scsi, Adrian Hunter, Keith Busch,
	Christoph Hellwig, Ming Lei, John Garry, Hannes Reinecke,
	James E.J. Bottomley, Jens Axboe

On 9/29/22 5:00 PM, Bart Van Assche wrote:
> If there is a race between scsi_done() and scsi_timeout() and if
> scsi_timeout() loses the race, scsi_timeout() should not reset the
> request timer. Hence change the return value for this case from
> BLK_EH_RESET_TIMER into BLK_EH_DONE.
> 
> Although the block layer holds a reference on a request (req->ref) while
> calling a timeout handler, restarting the timer (blk_add_timer()) while
> a request is being completed is racy.
> 
> Cc: Keith Busch <kbusch@kernel.org>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Ming Lei <ming.lei@redhat.com>
> Cc: John Garry <john.garry@huawei.com>
> Cc: Mike Christie <michael.christie@oracle.com>
> Cc: Hannes Reinecke <hare@suse.de>
> Reported-by: Adrian Hunter <adrian.hunter@intel.com>
> Fixes: 065990bd198e ("scsi: set timed out out mq requests to complete")
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  drivers/scsi/scsi_error.c | 14 +++-----------
>  1 file changed, 3 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 16bd0adc2339..d1b07ff64a96 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -343,19 +343,11 @@ enum blk_eh_timer_return scsi_timeout(struct request *req)
>  
>  	if (rtn == BLK_EH_DONE) {
>  		/*
> -		 * Set the command to complete first in order to prevent a real
> -		 * completion from releasing the command while error handling
> -		 * is using it. If the command was already completed, then the
> -		 * lower level driver beat the timeout handler, and it is safe
> -		 * to return without escalating error recovery.
> -		 *
> -		 * If timeout handling lost the race to a real completion, the
> -		 * block layer may ignore that due to a fake timeout injection,
> -		 * so return RESET_TIMER to allow error handling another shot

I've been wondering about this code too.

I think the patch is correct for the normal cases, but I didn't understand the
old fake timeout comment case. From the comment it seemed like that was the reason
we did the RESET_TIMER. Does that not exist anymore or was it just bogus?

The commit you referenced actually was returning BLK_EH_DONE like we want. This
commit:

commit f1342709d18af97b0e71449d5696b8873d1a456c
Author: Keith Busch <keith.busch@intel.com>
Date:   Mon Nov 26 09:54:29 2018 -0700

    scsi: Do not rely on blk-mq for double completions


changed it to BLK_EH_RESET_TIMER and changed the above comment to mention
the fake timeout case. However, the commit message mentioned the patch was done
because we didn't want scsi digging the block layer.

If the fake injection thingy is bogus, then it seems ok to me.

Reviewed-by: Mike Christie <michael.christie@oracle.com>


> -		 * at this command.
> +		 * If scsi_done() has already set SCMD_STATE_COMPLETE, do not
> +		 * modify *scmd.
>  		 */
>  		if (test_and_set_bit(SCMD_STATE_COMPLETE, &scmd->state))
> -			return BLK_EH_RESET_TIMER;
> +			return BLK_EH_DONE;
>  		if (scsi_abort_command(scmd) != SUCCESS) {
>  			set_host_byte(scmd, DID_TIME_OUT);
>  			scsi_eh_scmd_add(scmd);


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 1/8] scsi: core: Fix a race between scsi_done() and scsi_timeout()
  2022-09-30  0:17   ` Mike Christie
@ 2022-09-30  0:32     ` Bart Van Assche
  0 siblings, 0 replies; 27+ messages in thread
From: Bart Van Assche @ 2022-09-30  0:32 UTC (permalink / raw)
  To: Mike Christie, Martin K . Petersen
  Cc: Jaegeuk Kim, linux-scsi, Adrian Hunter, Keith Busch,
	Christoph Hellwig, Ming Lei, John Garry, Hannes Reinecke,
	James E.J. Bottomley, Jens Axboe

On 9/29/22 17:17, Mike Christie wrote:
> On 9/29/22 5:00 PM, Bart Van Assche wrote:
>>   	if (rtn == BLK_EH_DONE) {
>>   		/*
>> -		 * Set the command to complete first in order to prevent a real
>> -		 * completion from releasing the command while error handling
>> -		 * is using it. If the command was already completed, then the
>> -		 * lower level driver beat the timeout handler, and it is safe
>> -		 * to return without escalating error recovery.
>> -		 *
>> -		 * If timeout handling lost the race to a real completion, the
>> -		 * block layer may ignore that due to a fake timeout injection,
>> -		 * so return RESET_TIMER to allow error handling another shot
> 
> I've been wondering about this code too.
> 
> I think the patch is correct for the normal cases, but I didn't understand the
> old fake timeout comment case. From the comment it seemed like that was the reason
> we did the RESET_TIMER. Does that not exist anymore or was it just bogus?

Before commit 15f73f5b3e59 ("blk-mq: move failure injection out of
blk_mq_complete_request") the scsi_mq_done() function cleared the
SCMD_STATE_COMPLETE bit in case of fake timeout injection. I think
that commit made the above comment incorrect.

> The commit you referenced actually was returning BLK_EH_DONE like we want. This
> commit:
> 
> commit f1342709d18af97b0e71449d5696b8873d1a456c
> Author: Keith Busch <keith.busch@intel.com>
> Date:   Mon Nov 26 09:54:29 2018 -0700
> 
>      scsi: Do not rely on blk-mq for double completions
> 
> 
> changed it to BLK_EH_RESET_TIMER and changed the above comment to mention
> the fake timeout case. However, the commit message mentioned the patch was done
> because we didn't want scsi digging the block layer.
> 
> If the fake injection thingy is bogus, then it seems ok to me.

Hmm ... I probably should modify the Fixes tag.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 4/8] scsi: ufs: Remove an outdated comment
  2022-09-29 22:00 ` [PATCH v3 4/8] scsi: ufs: Remove an outdated comment Bart Van Assche
@ 2022-09-30 10:26   ` Bean Huo
  2022-10-03  5:55   ` Adrian Hunter
  1 sibling, 0 replies; 27+ messages in thread
From: Bean Huo @ 2022-09-30 10:26 UTC (permalink / raw)
  To: Bart Van Assche, Martin K . Petersen
  Cc: Jaegeuk Kim, linux-scsi, Adrian Hunter, James E.J. Bottomley,
	Bean Huo, Avri Altman, Jinyoung Choi

On Thu, 2022-09-29 at 15:00 -0700, Bart Van Assche wrote:
> Although the host lock had to be held by
> ufshcd_clk_scaling_start_busy()
> callers when that function was introduced, that is no longer the case
> today. Hence remove the comment that claims that callers of this
> function
> must hold the host lock.
> 
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>

Reviewed-by: Bean Huo <beanhuo@micron.com>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 5/8] scsi: ufs: Use 'else' in ufshcd_set_dev_pwr_mode()
  2022-09-29 22:00 ` [PATCH v3 5/8] scsi: ufs: Use 'else' in ufshcd_set_dev_pwr_mode() Bart Van Assche
@ 2022-09-30 10:28   ` Bean Huo
  2022-10-03  5:58   ` Adrian Hunter
  1 sibling, 0 replies; 27+ messages in thread
From: Bean Huo @ 2022-09-30 10:28 UTC (permalink / raw)
  To: Bart Van Assche, Martin K . Petersen
  Cc: Jaegeuk Kim, linux-scsi, Adrian Hunter, James E.J. Bottomley,
	Bean Huo, Avri Altman, Jinyoung Choi

On Thu, 2022-09-29 at 15:00 -0700, Bart Van Assche wrote:
> Convert if (ret) { ... } if (!ret) { ... } into
> if (ret) { ... } else { ... }.
> 
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>

Reviewed-by: Bean Huo <beanhuo@micron.com>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 6/8] scsi: ufs: Try harder to change the power mode
  2022-09-29 22:00 ` [PATCH v3 6/8] scsi: ufs: Try harder to change the power mode Bart Van Assche
@ 2022-09-30 12:14   ` Bean Huo
  2022-10-03  6:10   ` Adrian Hunter
  1 sibling, 0 replies; 27+ messages in thread
From: Bean Huo @ 2022-09-30 12:14 UTC (permalink / raw)
  To: Bart Van Assche, Martin K . Petersen
  Cc: Jaegeuk Kim, linux-scsi, Adrian Hunter, James E.J. Bottomley,
	Bean Huo, Avri Altman, Jinyoung Choi

On Thu, 2022-09-29 at 15:00 -0700, Bart Van Assche wrote:
> Instead of only retrying the START STOP UNIT command if a unit
> attention
> is reported, repeat it if any SCSI error is reported by the device or
> if
> the command timed out.
> 
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>

Reviewed-by: Bean Huo <beanhuo@micron.com>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 8/8] scsi: ufs: Fix a deadlock between PM and the SCSI error handler
  2022-09-29 22:00 ` [PATCH v3 8/8] scsi: ufs: Fix a deadlock between PM and the SCSI error handler Bart Van Assche
@ 2022-09-30 15:03   ` Bean Huo
  2022-09-30 17:15     ` Bart Van Assche
  2022-10-03  6:18   ` Adrian Hunter
  1 sibling, 1 reply; 27+ messages in thread
From: Bean Huo @ 2022-09-30 15:03 UTC (permalink / raw)
  To: Bart Van Assche, Martin K . Petersen
  Cc: Jaegeuk Kim, linux-scsi, Adrian Hunter, James E.J. Bottomley,
	Bean Huo, Avri Altman, Jinyoung Choi

On Thu, 2022-09-29 at 15:00 -0700, Bart Van Assche wrote:
> +       ufshcd_link_recovery(hba);
> +       dev_info(hba->dev, "%s() finished; outstanding_tasks =
> %#lx.\n",
> +                __func__, hba->outstanding_tasks);
> +
> +       return hba->outstanding_tasks ? SCSI_EH_RESET_TIMER :
> SCSI_EH_DONE;

Bart,

you have reset the device and host,  in the case, there are pending
TMs, Should be cleared locally, just like ufshcd_err_handler() does?

Kind regards,
Bean

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 8/8] scsi: ufs: Fix a deadlock between PM and the SCSI error handler
  2022-09-30 15:03   ` Bean Huo
@ 2022-09-30 17:15     ` Bart Van Assche
  2022-10-02 21:21       ` Bean Huo
  0 siblings, 1 reply; 27+ messages in thread
From: Bart Van Assche @ 2022-09-30 17:15 UTC (permalink / raw)
  To: Bean Huo, Martin K . Petersen
  Cc: Jaegeuk Kim, linux-scsi, Adrian Hunter, James E.J. Bottomley,
	Bean Huo, Avri Altman, Jinyoung Choi

On 9/30/22 08:03, Bean Huo wrote:
> On Thu, 2022-09-29 at 15:00 -0700, Bart Van Assche wrote:
>> +       ufshcd_link_recovery(hba);
>> +       dev_info(hba->dev, "%s() finished; outstanding_tasks =
>> %#lx.\n",
>> +                __func__, hba->outstanding_tasks);
>> +
>> +       return hba->outstanding_tasks ? SCSI_EH_RESET_TIMER :
>> SCSI_EH_DONE;
> 
> Bart,
> 
> you have reset the device and host,  in the case, there are pending
> TMs, Should be cleared locally, just like ufshcd_err_handler() does?

Hi Bean,

Do you agree with the following?
* SCSI task management functions are only submitted by the SCSI error
   handler.
* The ufshcd_link_recovery() call added by this patch can only be
   invoked during system suspend.
* System suspend only happens after processes and kernel threads have
   been frozen and after sync() finished. Hence, no I/O should be in
   progress when ufshcd_wl_suspend() is called and the SCSI error handler
   should not be running either.
* Hence, no SCSI commands other than START STOP UNIT should be in
   progress when the ufshcd_link_recovery() call added by this
   patch is invoked. No TMFs should be in progress either.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 8/8] scsi: ufs: Fix a deadlock between PM and the SCSI error handler
  2022-09-30 17:15     ` Bart Van Assche
@ 2022-10-02 21:21       ` Bean Huo
  2022-10-04  0:21         ` Bart Van Assche
  0 siblings, 1 reply; 27+ messages in thread
From: Bean Huo @ 2022-10-02 21:21 UTC (permalink / raw)
  To: Bart Van Assche, Martin K . Petersen
  Cc: Jaegeuk Kim, linux-scsi, Adrian Hunter, James E.J. Bottomley,
	Bean Huo, Avri Altman, Jinyoung Choi

On Fri, 2022-09-30 at 10:15 -0700, Bart Van Assche wrote:
> > > +       ufshcd_link_recovery(hba);
> > > +       dev_info(hba->dev, "%s() finished; outstanding_tasks =
> > > %#lx.\n",
> > > +                __func__, hba->outstanding_tasks);
> > > +
> > > +       return hba->outstanding_tasks ? SCSI_EH_RESET_TIMER :
> > > SCSI_EH_DONE;
> > 
> > Bart,
> > 
> > you have reset the device and host,  in the case, there are pending
> > TMs, Should be cleared locally, just like ufshcd_err_handler()
> > does?
> 
> Hi Bean,
> 
> Do you agree with the following?
> * SCSI task management functions are only submitted by the SCSI error
>    handler.
> * The ufshcd_link_recovery() call added by this patch can only be
>    invoked during system suspend.
> * System suspend only happens after processes and kernel threads have
>    been frozen and after sync() finished. Hence, no I/O should be in
>    progress when ufshcd_wl_suspend() is called and the SCSI error
> handler
>    should not be running either.
> * Hence, no SCSI commands other than START STOP UNIT should be in
>    progress when the ufshcd_link_recovery() call added by this
>    patch is invoked. No TMFs should be in progress either.
> 
> Thanks,
> 
> Bart.

Hi Bart,

Yes, install the whole series of changes, ufshcd_link_recovery() will
be called during the system is in suspending in the case of SSU
timeouts. in this case, outstanding_tasks should be 0? 

Kind regards,
Bean

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 4/8] scsi: ufs: Remove an outdated comment
  2022-09-29 22:00 ` [PATCH v3 4/8] scsi: ufs: Remove an outdated comment Bart Van Assche
  2022-09-30 10:26   ` Bean Huo
@ 2022-10-03  5:55   ` Adrian Hunter
  1 sibling, 0 replies; 27+ messages in thread
From: Adrian Hunter @ 2022-10-03  5:55 UTC (permalink / raw)
  To: Bart Van Assche, Martin K . Petersen
  Cc: Jaegeuk Kim, linux-scsi, James E.J. Bottomley, Bean Huo,
	Avri Altman, Jinyoung Choi

On 30/09/22 01:00, Bart Van Assche wrote:
> Although the host lock had to be held by ufshcd_clk_scaling_start_busy()
> callers when that function was introduced, that is no longer the case
> today. Hence remove the comment that claims that callers of this function
> must hold the host lock.
> 
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>

> ---
>  drivers/ufs/core/ufshcd.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
> index 7c15cbc737b4..78c980585dc3 100644
> --- a/drivers/ufs/core/ufshcd.c
> +++ b/drivers/ufs/core/ufshcd.c
> @@ -2013,7 +2013,6 @@ static void ufshcd_exit_clk_gating(struct ufs_hba *hba)
>  	destroy_workqueue(hba->clk_gating.clk_gating_workq);
>  }
>  
> -/* Must be called with host lock acquired */
>  static void ufshcd_clk_scaling_start_busy(struct ufs_hba *hba)
>  {
>  	bool queue_resume_work = false;


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 5/8] scsi: ufs: Use 'else' in ufshcd_set_dev_pwr_mode()
  2022-09-29 22:00 ` [PATCH v3 5/8] scsi: ufs: Use 'else' in ufshcd_set_dev_pwr_mode() Bart Van Assche
  2022-09-30 10:28   ` Bean Huo
@ 2022-10-03  5:58   ` Adrian Hunter
  1 sibling, 0 replies; 27+ messages in thread
From: Adrian Hunter @ 2022-10-03  5:58 UTC (permalink / raw)
  To: Bart Van Assche, Martin K . Petersen
  Cc: Jaegeuk Kim, linux-scsi, James E.J. Bottomley, Bean Huo,
	Avri Altman, Jinyoung Choi

On 30/09/22 01:00, Bart Van Assche wrote:
> Convert if (ret) { ... } if (!ret) { ... } into
> if (ret) { ... } else { ... }.
> 
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>

> ---
>  drivers/ufs/core/ufshcd.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
> index 78c980585dc3..02e73208b921 100644
> --- a/drivers/ufs/core/ufshcd.c
> +++ b/drivers/ufs/core/ufshcd.c
> @@ -8798,10 +8798,9 @@ static int ufshcd_set_dev_pwr_mode(struct ufs_hba *hba,
>  				scsi_print_sense_hdr(sdp, NULL, &sshdr);
>  			ret = -EIO;
>  		}
> -	}
> -
> -	if (!ret)
> +	} else {
>  		hba->curr_dev_pwr_mode = pwr_mode;
> +	}
>  
>  	scsi_device_put(sdp);
>  	hba->host->eh_noresume = 0;


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 6/8] scsi: ufs: Try harder to change the power mode
  2022-09-29 22:00 ` [PATCH v3 6/8] scsi: ufs: Try harder to change the power mode Bart Van Assche
  2022-09-30 12:14   ` Bean Huo
@ 2022-10-03  6:10   ` Adrian Hunter
  2022-10-03 16:38     ` Bart Van Assche
  1 sibling, 1 reply; 27+ messages in thread
From: Adrian Hunter @ 2022-10-03  6:10 UTC (permalink / raw)
  To: Bart Van Assche, Martin K . Petersen
  Cc: Jaegeuk Kim, linux-scsi, James E.J. Bottomley, Bean Huo,
	Avri Altman, Jinyoung Choi

On 30/09/22 01:00, Bart Van Assche wrote:
> Instead of only retrying the START STOP UNIT command if a unit attention
> is reported, repeat it if any SCSI error is reported by the device or if
> the command timed out.
> 
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  drivers/ufs/core/ufshcd.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
> index 02e73208b921..e8c0504e9e83 100644
> --- a/drivers/ufs/core/ufshcd.c
> +++ b/drivers/ufs/core/ufshcd.c
> @@ -8784,9 +8784,9 @@ static int ufshcd_set_dev_pwr_mode(struct ufs_hba *hba,
>  	for (retries = 3; retries > 0; --retries) {
>  		ret = scsi_execute(sdp, cmd, DMA_NONE, NULL, 0, NULL, &sshdr,
>  				START_STOP_TIMEOUT, 0, 0, RQF_PM, NULL);
> -		if (!scsi_status_is_check_condition(ret) ||
> -				!scsi_sense_valid(&sshdr) ||
> -				sshdr.sense_key != UNIT_ATTENTION)
> +		if (ret < 0)
> +			break;

Could use a comment to answer: why not retry if ret < 0 ?

> +		if (host_byte(ret) == 0 && scsi_status_is_good(ret))
>  			break;

Also could use a comment: why not just "if (!ret)" ?

>  	}
>  	if (ret) {


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 7/8] scsi: ufs: Track system suspend / resume activity
  2022-09-29 22:00 ` [PATCH v3 7/8] scsi: ufs: Track system suspend / resume activity Bart Van Assche
@ 2022-10-03  6:10   ` Adrian Hunter
  0 siblings, 0 replies; 27+ messages in thread
From: Adrian Hunter @ 2022-10-03  6:10 UTC (permalink / raw)
  To: Bart Van Assche, Martin K . Petersen
  Cc: Jaegeuk Kim, linux-scsi, James E.J. Bottomley, Bean Huo,
	Avri Altman, Jinyoung Choi, Stanley Chu, jongmin jeong,
	Eric Biggers

On 30/09/22 01:00, Bart Van Assche wrote:
> Add a new boolean variable that tracks whether the system is suspending,
> suspended or resuming. This information will be used in a later patch to
> fix a deadlock between the SCSI error handler and the suspend code.
> 
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>

> ---
>  drivers/ufs/core/ufshcd.c | 2 ++
>  include/ufs/ufshcd.h      | 5 ++++-
>  2 files changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
> index e8c0504e9e83..5507d93a4bba 100644
> --- a/drivers/ufs/core/ufshcd.c
> +++ b/drivers/ufs/core/ufshcd.c
> @@ -9253,6 +9253,7 @@ static int ufshcd_wl_suspend(struct device *dev)
>  
>  	hba = shost_priv(sdev->host);
>  	down(&hba->host_sem);
> +	hba->system_suspending = true;
>  
>  	if (pm_runtime_suspended(dev))
>  		goto out;
> @@ -9294,6 +9295,7 @@ static int ufshcd_wl_resume(struct device *dev)
>  		hba->curr_dev_pwr_mode, hba->uic_link_state);
>  	if (!ret)
>  		hba->is_sys_suspended = false;
> +	hba->system_suspending = false;
>  	up(&hba->host_sem);
>  	return ret;
>  }
> diff --git a/include/ufs/ufshcd.h b/include/ufs/ufshcd.h
> index 9f28349ebcff..96538eb3a6c0 100644
> --- a/include/ufs/ufshcd.h
> +++ b/include/ufs/ufshcd.h
> @@ -802,7 +802,9 @@ struct ufs_hba_monitor {
>   * @caps: bitmask with information about UFS controller capabilities
>   * @devfreq: frequency scaling information owned by the devfreq core
>   * @clk_scaling: frequency scaling information owned by the UFS driver
> - * @is_sys_suspended: whether or not the entire system has been suspended
> + * @system_suspending: system suspend has been started and system resume has
> + *	not yet finished.
> + * @is_sys_suspended: UFS device has been suspended because of system suspend
>   * @urgent_bkops_lvl: keeps track of urgent bkops level for device
>   * @is_urgent_bkops_lvl_checked: keeps track if the urgent bkops level for
>   *  device is known or not.
> @@ -943,6 +945,7 @@ struct ufs_hba {
>  
>  	struct devfreq *devfreq;
>  	struct ufs_clk_scaling clk_scaling;
> +	bool system_suspending;
>  	bool is_sys_suspended;
>  
>  	enum bkops_status urgent_bkops_lvl;


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 8/8] scsi: ufs: Fix a deadlock between PM and the SCSI error handler
  2022-09-29 22:00 ` [PATCH v3 8/8] scsi: ufs: Fix a deadlock between PM and the SCSI error handler Bart Van Assche
  2022-09-30 15:03   ` Bean Huo
@ 2022-10-03  6:18   ` Adrian Hunter
  2022-10-03 16:45     ` Bart Van Assche
  1 sibling, 1 reply; 27+ messages in thread
From: Adrian Hunter @ 2022-10-03  6:18 UTC (permalink / raw)
  To: Bart Van Assche, Martin K . Petersen
  Cc: Jaegeuk Kim, linux-scsi, James E.J. Bottomley, Bean Huo,
	Avri Altman, Jinyoung Choi

On 30/09/22 01:00, Bart Van Assche wrote:
> The following deadlock has been observed on multiple test setups:
> * ufshcd_wl_suspend() is waiting for blk_execute_rq() to complete while it
>   holds host_sem.
> * ufshcd_eh_host_reset_handler() invokes ufshcd_err_handler() and the
>   latter function tries to obtain host_sem.
> This is a deadlock because blk_execute_rq() can't execute SCSI commands
> while the host is in the SHOST_RECOVERY state and because the error
> handler cannot make progress either.
> 
> Fix this deadlock as follows:
> * Fail attempts to suspend the system while the SCSI error handler is in
>   progress.
> * If the system is suspending and a START STOP UNIT command times out,
>   handle the SCSI command timeout from inside the context of the SCSI
>   timeout handler instead of activating the SCSI error handler.
> * Reduce the START STOP UNIT command timeout to one second since on
>   Android devices a kernel panic is triggered if an attempt to suspend
>   the system takes more than 20 seconds. One second should be enough for
>   the START STOP UNIT command since this command completes in less than
>   a millisecond for the UFS devices I have access to.
> 
> The runtime power management code is not affected by this deadlock since
> hba->host_sem is not touched by the runtime power management functions
> in the UFS driver.
> 
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>

Comment below.  Nevertheless,

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>

> ---
>  drivers/ufs/core/ufshcd.c | 23 ++++++++++++++++++++++-
>  1 file changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
> index 5507d93a4bba..0294b51e776a 100644
> --- a/drivers/ufs/core/ufshcd.c
> +++ b/drivers/ufs/core/ufshcd.c
> @@ -8295,6 +8295,26 @@ static void ufshcd_async_scan(void *data, async_cookie_t cookie)
>  	}
>  }
>  
> +static enum scsi_timeout_action ufshcd_eh_timed_out(struct scsi_cmnd *scmd)
> +{
> +	struct ufs_hba *hba = shost_priv(scmd->device->host);
> +
> +	if (!hba->system_suspending) {
> +		/* Activate the error handler in the SCSI core. */
> +		return SCSI_EH_NOT_HANDLED;
> +	}
> +
> +	/*
> +	 * Handle errors directly to prevent a deadlock between
> +	 * ufshcd_set_dev_pwr_mode() and ufshcd_err_handler().
> +	 */
> +	ufshcd_link_recovery(hba);

This is OK for now, but ufshcd_link_recovery() doesn't check for errors
*after* reset, and retry error handling - in which case it might need also
the error handling that ufshcd_err_handler() does.

It might be better to have only 1 error handling function.

> +	dev_info(hba->dev, "%s() finished; outstanding_tasks = %#lx.\n",
> +		 __func__, hba->outstanding_tasks);
> +
> +	return hba->outstanding_tasks ? SCSI_EH_RESET_TIMER : SCSI_EH_DONE;
> +}
> +
>  static const struct attribute_group *ufshcd_driver_groups[] = {
>  	&ufs_sysfs_unit_descriptor_group,
>  	&ufs_sysfs_lun_attributes_group,
> @@ -8329,6 +8349,7 @@ static struct scsi_host_template ufshcd_driver_template = {
>  	.eh_abort_handler	= ufshcd_abort,
>  	.eh_device_reset_handler = ufshcd_eh_device_reset_handler,
>  	.eh_host_reset_handler   = ufshcd_eh_host_reset_handler,
> +	.eh_timed_out		= ufshcd_eh_timed_out,
>  	.this_id		= -1,
>  	.sg_tablesize		= SG_ALL,
>  	.cmd_per_lun		= UFSHCD_CMD_PER_LUN,
> @@ -8783,7 +8804,7 @@ static int ufshcd_set_dev_pwr_mode(struct ufs_hba *hba,
>  	 */
>  	for (retries = 3; retries > 0; --retries) {
>  		ret = scsi_execute(sdp, cmd, DMA_NONE, NULL, 0, NULL, &sshdr,
> -				START_STOP_TIMEOUT, 0, 0, RQF_PM, NULL);
> +				   1 * HZ, 0, REQ_FAILFAST_DEV, RQF_PM, NULL);
>  		if (ret < 0)
>  			break;
>  		if (host_byte(ret) == 0 && scsi_status_is_good(ret))


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 6/8] scsi: ufs: Try harder to change the power mode
  2022-10-03  6:10   ` Adrian Hunter
@ 2022-10-03 16:38     ` Bart Van Assche
  0 siblings, 0 replies; 27+ messages in thread
From: Bart Van Assche @ 2022-10-03 16:38 UTC (permalink / raw)
  To: Adrian Hunter, Martin K . Petersen
  Cc: Jaegeuk Kim, linux-scsi, James E.J. Bottomley, Bean Huo,
	Avri Altman, Jinyoung Choi

On 10/2/22 23:10, Adrian Hunter wrote:
> On 30/09/22 01:00, Bart Van Assche wrote:
>> Instead of only retrying the START STOP UNIT command if a unit attention
>> is reported, repeat it if any SCSI error is reported by the device or if
>> the command timed out.
>>
>> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
>> ---
>>   drivers/ufs/core/ufshcd.c | 6 +++---
>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
>> index 02e73208b921..e8c0504e9e83 100644
>> --- a/drivers/ufs/core/ufshcd.c
>> +++ b/drivers/ufs/core/ufshcd.c
>> @@ -8784,9 +8784,9 @@ static int ufshcd_set_dev_pwr_mode(struct ufs_hba *hba,
>>   	for (retries = 3; retries > 0; --retries) {
>>   		ret = scsi_execute(sdp, cmd, DMA_NONE, NULL, 0, NULL, &sshdr,
>>   				START_STOP_TIMEOUT, 0, 0, RQF_PM, NULL);
>> -		if (!scsi_status_is_check_condition(ret) ||
>> -				!scsi_sense_valid(&sshdr) ||
>> -				sshdr.sense_key != UNIT_ATTENTION)
>> +		if (ret < 0)
>> +			break;
> 
> Could use a comment to answer: why not retry if ret < 0 ?

OK, I will add a comment above that if-test.

>> +		if (host_byte(ret) == 0 && scsi_status_is_good(ret))
>>   			break;
> 
> Also could use a comment: why not just "if (!ret)" ?

The UFS spec says "CONDITION MET - Not used for UFS" so changing the 
above test into "if (!ret)" would be valid. If nobody objects I will 
change the above test into "if (ret == 0)" instead of adding a comment.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 8/8] scsi: ufs: Fix a deadlock between PM and the SCSI error handler
  2022-10-03  6:18   ` Adrian Hunter
@ 2022-10-03 16:45     ` Bart Van Assche
  0 siblings, 0 replies; 27+ messages in thread
From: Bart Van Assche @ 2022-10-03 16:45 UTC (permalink / raw)
  To: Adrian Hunter, Martin K . Petersen
  Cc: Jaegeuk Kim, linux-scsi, James E.J. Bottomley, Bean Huo,
	Avri Altman, Jinyoung Choi

On 10/2/22 23:18, Adrian Hunter wrote:
> This is OK for now, but ufshcd_link_recovery() doesn't check for errors
> *after* reset, and retry error handling - in which case it might need also
> the error handling that ufshcd_err_handler() does.
> 
> It might be better to have only 1 error handling function.

Hi Adrian,

I'm fine with sharing code between error handling contexts but I'm not 
sure a single error handling function is the best solution. Having a 
single error handling function and calling it from the timeout function 
would require to add so many if-tests into ufshcd_err_handler() that it 
would make that function unreadable.

Thanks,

Bart.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 3/8] scsi: core: Support failing requests while recovering
  2022-09-29 22:00 ` [PATCH v3 3/8] scsi: core: Support failing requests while recovering Bart Van Assche
@ 2022-10-03 17:27   ` Mike Christie
  2022-10-03 19:20     ` Bart Van Assche
  0 siblings, 1 reply; 27+ messages in thread
From: Mike Christie @ 2022-10-03 17:27 UTC (permalink / raw)
  To: Bart Van Assche, Martin K . Petersen
  Cc: Jaegeuk Kim, linux-scsi, Adrian Hunter, Christoph Hellwig,
	Ming Lei, John Garry, Hannes Reinecke, James E.J. Bottomley

On 9/29/22 5:00 PM, Bart Van Assche wrote:
> The current behavior for SCSI commands submitted while error recovery
> is ongoing is to retry command submission after error recovery has
> finished. See also the scsi_host_in_recovery() check in
> scsi_host_queue_ready(). Add support for failing SCSI commands while
> host recovery is in progress. This functionality will be used to fix a
> deadlock in the UFS driver.
> 
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Ming Lei <ming.lei@redhat.com>
> Cc: John Garry <john.garry@huawei.com>
> Cc: Mike Christie <michael.christie@oracle.com>
> Cc: Hannes Reinecke <hare@suse.de>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  drivers/scsi/scsi_lib.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 473d9403f0c1..ecd2ce3815df 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1337,9 +1337,6 @@ static inline int scsi_host_queue_ready(struct request_queue *q,
>  				   struct scsi_device *sdev,
>  				   struct scsi_cmnd *cmd)
>  {
> -	if (scsi_host_in_recovery(shost))
> -		return 0;
> -
>  	if (atomic_read(&shost->host_blocked) > 0) {
>  		if (scsi_host_busy(shost) > 0)
>  			goto starved;
> @@ -1727,6 +1724,11 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
>  	ret = BLK_STS_RESOURCE;
>  	if (!scsi_target_queue_ready(shost, sdev))
>  		goto out_put_budget;
> +	if (unlikely(scsi_host_in_recovery(shost))) {
> +		if (req->cmd_flags & REQ_FAILFAST_MASK)
> +			ret = BLK_STS_OFFLINE;
> +		goto out_dec_target_busy;
> +	}
>  	if (!scsi_host_queue_ready(q, shost, sdev, cmd))
>  		goto out_dec_target_busy;
>  

This might add a regression to dm-multipath or it at least makes
the behavior difficult to understand for users and support people.

If there is a transport issue and the cmd times out and the abort
does as well or fails, then we would start the scsi eh recovery. When the
driver/transport class figures out it's a transport issue we will block
the scsi_device. So before this patch we would requeue the cmd then we
would wait until the fast_io_fail (FC) or replacement_timer (iscsi) to
fire before failing commands upwards and failing paths.

With this patch we can end up doing 2 or 3 things depending on timing:
1. If the cmd hits the code above we will fail the command and cause a
path failure.

2. If driver/transport blocks the scsi_device first then we would hit the
scsi_device state check in scsi_queue_rq and not fail the path like before.

3. With or without your patch, if dm_mq_queue_rq calls the busy callback
first (this does blk_lld_busy -> scsi_mq_lld_busy -> scsi_host_in_recovery)
then it might see all the paths in recovery. It considers this a temp condition
and will requeue cmds. So in this case we will not fail the path.

I'm not sure what the correct fix is. Maybe not fast fail REQ_FAILFAST_TRANSPORT
above, or maybe add a new FAILFAST type that you could use?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 3/8] scsi: core: Support failing requests while recovering
  2022-10-03 17:27   ` Mike Christie
@ 2022-10-03 19:20     ` Bart Van Assche
  0 siblings, 0 replies; 27+ messages in thread
From: Bart Van Assche @ 2022-10-03 19:20 UTC (permalink / raw)
  To: Mike Christie, Martin K . Petersen
  Cc: Jaegeuk Kim, linux-scsi, Adrian Hunter, Christoph Hellwig,
	Ming Lei, John Garry, Hannes Reinecke, James E.J. Bottomley

On 10/3/22 10:27, Mike Christie wrote:
> On 9/29/22 5:00 PM, Bart Van Assche wrote:
>> The current behavior for SCSI commands submitted while error recovery
>> is ongoing is to retry command submission after error recovery has
>> finished. See also the scsi_host_in_recovery() check in
>> scsi_host_queue_ready(). Add support for failing SCSI commands while
>> host recovery is in progress. This functionality will be used to fix a
>> deadlock in the UFS driver.
>>
>> Cc: Christoph Hellwig <hch@lst.de>
>> Cc: Ming Lei <ming.lei@redhat.com>
>> Cc: John Garry <john.garry@huawei.com>
>> Cc: Mike Christie <michael.christie@oracle.com>
>> Cc: Hannes Reinecke <hare@suse.de>
>> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
>> ---
>>   drivers/scsi/scsi_lib.c | 8 +++++---
>>   1 file changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
>> index 473d9403f0c1..ecd2ce3815df 100644
>> --- a/drivers/scsi/scsi_lib.c
>> +++ b/drivers/scsi/scsi_lib.c
>> @@ -1337,9 +1337,6 @@ static inline int scsi_host_queue_ready(struct request_queue *q,
>>   				   struct scsi_device *sdev,
>>   				   struct scsi_cmnd *cmd)
>>   {
>> -	if (scsi_host_in_recovery(shost))
>> -		return 0;
>> -
>>   	if (atomic_read(&shost->host_blocked) > 0) {
>>   		if (scsi_host_busy(shost) > 0)
>>   			goto starved;
>> @@ -1727,6 +1724,11 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
>>   	ret = BLK_STS_RESOURCE;
>>   	if (!scsi_target_queue_ready(shost, sdev))
>>   		goto out_put_budget;
>> +	if (unlikely(scsi_host_in_recovery(shost))) {
>> +		if (req->cmd_flags & REQ_FAILFAST_MASK)
>> +			ret = BLK_STS_OFFLINE;
>> +		goto out_dec_target_busy;
>> +	}
>>   	if (!scsi_host_queue_ready(q, shost, sdev, cmd))
>>   		goto out_dec_target_busy;
>>   
> 
> This might add a regression to dm-multipath or it at least makes
> the behavior difficult to understand for users and support people.
> 
> If there is a transport issue and the cmd times out and the abort
> does as well or fails, then we would start the scsi eh recovery. When the
> driver/transport class figures out it's a transport issue we will block
> the scsi_device. So before this patch we would requeue the cmd then we
> would wait until the fast_io_fail (FC) or replacement_timer (iscsi) to
> fire before failing commands upwards and failing paths.
> 
> With this patch we can end up doing 2 or 3 things depending on timing:
> 1. If the cmd hits the code above we will fail the command and cause a
> path failure.
> 
> 2. If driver/transport blocks the scsi_device first then we would hit the
> scsi_device state check in scsi_queue_rq and not fail the path like before.
> 
> 3. With or without your patch, if dm_mq_queue_rq calls the busy callback
> first (this does blk_lld_busy -> scsi_mq_lld_busy -> scsi_host_in_recovery)
> then it might see all the paths in recovery. It considers this a temp condition
> and will requeue cmds. So in this case we will not fail the path.
> 
> I'm not sure what the correct fix is. Maybe not fast fail REQ_FAILFAST_TRANSPORT
> above, or maybe add a new FAILFAST type that you could use?

Hi Mike,

I will look into introducing a new request flag for this purpose.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 8/8] scsi: ufs: Fix a deadlock between PM and the SCSI error handler
  2022-10-02 21:21       ` Bean Huo
@ 2022-10-04  0:21         ` Bart Van Assche
  0 siblings, 0 replies; 27+ messages in thread
From: Bart Van Assche @ 2022-10-04  0:21 UTC (permalink / raw)
  To: Bean Huo, Martin K . Petersen
  Cc: Jaegeuk Kim, linux-scsi, Adrian Hunter, James E.J. Bottomley,
	Bean Huo, Avri Altman, Jinyoung Choi

On 10/2/22 14:21, Bean Huo wrote:
> Yes, install the whole series of changes, ufshcd_link_recovery() will
> be called during the system is in suspending in the case of SSU
> timeouts. in this case, outstanding_tasks should be 0?
  Hi Bean,

'outstanding_tasks' needs to be changed into 'outstanding_reqs' in patch 
8/8. My understanding is that ufshcd_link_recovery() calls 
ufshcd_complete_requests() indirectly. That causes the bits 
corresponding to completed requests to be cleared from outstanding_reqs 
but does not guarantee that outstanding_reqs will be zero.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2022-10-04  0:21 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-29 22:00 [PATCH v3 0/8] Fix a deadlock in the UFS driver Bart Van Assche
2022-09-29 22:00 ` [PATCH v3 1/8] scsi: core: Fix a race between scsi_done() and scsi_timeout() Bart Van Assche
2022-09-30  0:17   ` Mike Christie
2022-09-30  0:32     ` Bart Van Assche
2022-09-29 22:00 ` [PATCH v3 2/8] scsi: core: Change the return type of .eh_timed_out() Bart Van Assche
2022-09-29 22:00 ` [PATCH v3 3/8] scsi: core: Support failing requests while recovering Bart Van Assche
2022-10-03 17:27   ` Mike Christie
2022-10-03 19:20     ` Bart Van Assche
2022-09-29 22:00 ` [PATCH v3 4/8] scsi: ufs: Remove an outdated comment Bart Van Assche
2022-09-30 10:26   ` Bean Huo
2022-10-03  5:55   ` Adrian Hunter
2022-09-29 22:00 ` [PATCH v3 5/8] scsi: ufs: Use 'else' in ufshcd_set_dev_pwr_mode() Bart Van Assche
2022-09-30 10:28   ` Bean Huo
2022-10-03  5:58   ` Adrian Hunter
2022-09-29 22:00 ` [PATCH v3 6/8] scsi: ufs: Try harder to change the power mode Bart Van Assche
2022-09-30 12:14   ` Bean Huo
2022-10-03  6:10   ` Adrian Hunter
2022-10-03 16:38     ` Bart Van Assche
2022-09-29 22:00 ` [PATCH v3 7/8] scsi: ufs: Track system suspend / resume activity Bart Van Assche
2022-10-03  6:10   ` Adrian Hunter
2022-09-29 22:00 ` [PATCH v3 8/8] scsi: ufs: Fix a deadlock between PM and the SCSI error handler Bart Van Assche
2022-09-30 15:03   ` Bean Huo
2022-09-30 17:15     ` Bart Van Assche
2022-10-02 21:21       ` Bean Huo
2022-10-04  0:21         ` Bart Van Assche
2022-10-03  6:18   ` Adrian Hunter
2022-10-03 16:45     ` Bart Van Assche

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.