All of lore.kernel.org
 help / color / mirror / Atom feed
* BLK_EH_HANDLED
@ 2018-05-23 12:19 Christoph Hellwig
  2018-05-23 12:19 ` [PATCH 01/14] libata: remove ata_scsi_timed_out Christoph Hellwig
                   ` (14 more replies)
  0 siblings, 15 replies; 39+ messages in thread
From: Christoph Hellwig @ 2018-05-23 12:19 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, Rafael David Tinoco, linux-block, linux-scsi

Hi all,

this series removes the BLK_EH_HANDLED return value, and instead places
responsibility of timed out commands entirely on the drivers.  Except
for some odd layering vilations in libiscsi this actually is
surprisingly simple.  The last two pathes contain a respin of Keith'
series to refcount struct request in the eye of timeouts.  Once the
BLK_EH_HANDLED special case is gone this actually seems the EH fixing
approach that seems the most sensible to me.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH 01/14] libata: remove ata_scsi_timed_out
  2018-05-23 12:19 BLK_EH_HANDLED Christoph Hellwig
@ 2018-05-23 12:19 ` Christoph Hellwig
  2018-05-23 13:21   ` Hannes Reinecke
  2018-05-23 12:19 ` [PATCH 02/14] block: rename BLK_EH_NOT_HANDLED to BLK_EH_DONE Christoph Hellwig
                   ` (13 subsequent siblings)
  14 siblings, 1 reply; 39+ messages in thread
From: Christoph Hellwig @ 2018-05-23 12:19 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, Rafael David Tinoco, linux-block, linux-scsi

As far as I can tell this function can't even be called any more, given
that ATA implements its own eh_strategy_handler with ata_scsi_error, which
never calls ->eh_timed_out.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/ata/libata-eh.c | 51 -----------------------------------------
 include/linux/libata.h  |  2 --
 2 files changed, 53 deletions(-)

diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index c016829a38fd..e87785dec151 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -500,57 +500,6 @@ void ata_eh_release(struct ata_port *ap)
 	mutex_unlock(&ap->host->eh_mutex);
 }
 
-/**
- *	ata_scsi_timed_out - SCSI layer time out callback
- *	@cmd: timed out SCSI command
- *
- *	Handles SCSI layer timeout.  We race with normal completion of
- *	the qc for @cmd.  If the qc is already gone, we lose and let
- *	the scsi command finish (EH_HANDLED).  Otherwise, the qc has
- *	timed out and EH should be invoked.  Prevent ata_qc_complete()
- *	from finishing it by setting EH_SCHEDULED and return
- *	EH_NOT_HANDLED.
- *
- *	TODO: kill this function once old EH is gone.
- *
- *	LOCKING:
- *	Called from timer context
- *
- *	RETURNS:
- *	EH_HANDLED or EH_NOT_HANDLED
- */
-enum blk_eh_timer_return ata_scsi_timed_out(struct scsi_cmnd *cmd)
-{
-	struct Scsi_Host *host = cmd->device->host;
-	struct ata_port *ap = ata_shost_to_port(host);
-	unsigned long flags;
-	struct ata_queued_cmd *qc;
-	enum blk_eh_timer_return ret;
-
-	DPRINTK("ENTER\n");
-
-	if (ap->ops->error_handler) {
-		ret = BLK_EH_NOT_HANDLED;
-		goto out;
-	}
-
-	ret = BLK_EH_HANDLED;
-	spin_lock_irqsave(ap->lock, flags);
-	qc = ata_qc_from_tag(ap, ap->link.active_tag);
-	if (qc) {
-		WARN_ON(qc->scsicmd != cmd);
-		qc->flags |= ATA_QCFLAG_EH_SCHEDULED;
-		qc->err_mask |= AC_ERR_TIMEOUT;
-		ret = BLK_EH_NOT_HANDLED;
-	}
-	spin_unlock_irqrestore(ap->lock, flags);
-
- out:
-	DPRINTK("EXIT, ret=%d\n", ret);
-	return ret;
-}
-EXPORT_SYMBOL(ata_scsi_timed_out);
-
 static void ata_eh_unload(struct ata_port *ap)
 {
 	struct ata_link *link;
diff --git a/include/linux/libata.h b/include/linux/libata.h
index 1795fecdea17..1c113134c98f 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -1133,7 +1133,6 @@ extern int ata_sas_port_start(struct ata_port *ap);
 extern void ata_sas_port_stop(struct ata_port *ap);
 extern int ata_sas_slave_configure(struct scsi_device *, struct ata_port *);
 extern int ata_sas_queuecmd(struct scsi_cmnd *cmd, struct ata_port *ap);
-extern enum blk_eh_timer_return ata_scsi_timed_out(struct scsi_cmnd *cmd);
 extern int sata_scr_valid(struct ata_link *link);
 extern int sata_scr_read(struct ata_link *link, int reg, u32 *val);
 extern int sata_scr_write(struct ata_link *link, int reg, u32 val);
@@ -1359,7 +1358,6 @@ extern struct device_attribute *ata_common_sdev_attrs[];
 	.proc_name		= drv_name,			\
 	.slave_configure	= ata_scsi_slave_config,	\
 	.slave_destroy		= ata_scsi_slave_destroy,	\
-	.eh_timed_out		= ata_scsi_timed_out,		\
 	.bios_param		= ata_std_bios_param,		\
 	.unlock_native_capacity	= ata_scsi_unlock_native_capacity, \
 	.sdev_attrs		= ata_common_sdev_attrs
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 02/14] block: rename BLK_EH_NOT_HANDLED to BLK_EH_DONE
  2018-05-23 12:19 BLK_EH_HANDLED Christoph Hellwig
  2018-05-23 12:19 ` [PATCH 01/14] libata: remove ata_scsi_timed_out Christoph Hellwig
@ 2018-05-23 12:19 ` Christoph Hellwig
  2018-05-23 12:41   ` Johannes Thumshirn
  2018-05-23 13:22   ` Hannes Reinecke
  2018-05-23 12:19 ` [PATCH 03/14] nvme: return BLK_EH_DONE from ->timeout Christoph Hellwig
                   ` (12 subsequent siblings)
  14 siblings, 2 replies; 39+ messages in thread
From: Christoph Hellwig @ 2018-05-23 12:19 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, Rafael David Tinoco, linux-block, linux-scsi

The BLK_EH_NOT_HANDLED implies nothing happen, but very often that
is not what is happening - instead the driver already completed the
command.  Fix the symbolic name to reflect that a little better.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 Documentation/scsi/scsi_eh.txt            | 4 ++--
 block/blk-mq.c                            | 2 +-
 block/blk-timeout.c                       | 2 +-
 drivers/block/nbd.c                       | 2 +-
 drivers/message/fusion/mptsas.c           | 2 +-
 drivers/s390/block/dasd.c                 | 6 +++---
 drivers/scsi/gdth.c                       | 2 +-
 drivers/scsi/libiscsi.c                   | 2 +-
 drivers/scsi/megaraid/megaraid_sas_base.c | 2 +-
 drivers/scsi/mvumi.c                      | 2 +-
 drivers/scsi/qla4xxx/ql4_os.c             | 2 +-
 drivers/scsi/scsi_error.c                 | 4 ++--
 drivers/scsi/scsi_transport_fc.c          | 4 ++--
 drivers/scsi/scsi_transport_srp.c         | 4 ++--
 drivers/scsi/ufs/ufshcd.c                 | 6 +++---
 include/linux/blkdev.h                    | 2 +-
 include/scsi/scsi_host.h                  | 2 +-
 17 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/Documentation/scsi/scsi_eh.txt b/Documentation/scsi/scsi_eh.txt
index 11e447bdb3a5..3ae8419e72cf 100644
--- a/Documentation/scsi/scsi_eh.txt
+++ b/Documentation/scsi/scsi_eh.txt
@@ -97,9 +97,9 @@ function
 	This indicates that more time is required to finish the
 	command.  Timer is restarted.  This action is counted as a
 	retry and only allowed scmd->allowed + 1(!) times.  Once the
-	limit is reached, action for BLK_EH_NOT_HANDLED is taken instead.
+	limit is reached, action for BLK_EH_DONE is taken instead.
 
-    - BLK_EH_NOT_HANDLED
+    - BLK_EH_DONE
         eh_timed_out() callback did not handle the command.
 	Step #2 is taken.
 
diff --git a/block/blk-mq.c b/block/blk-mq.c
index df928200b17e..546acc8cb201 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -856,7 +856,7 @@ static void blk_mq_rq_timed_out(struct request *req, bool reserved)
 		blk_mq_rq_update_aborted_gstate(req, 0);
 		blk_add_timer(req);
 		break;
-	case BLK_EH_NOT_HANDLED:
+	case BLK_EH_DONE:
 		break;
 	default:
 		printk(KERN_ERR "block: bad eh return: %d\n", ret);
diff --git a/block/blk-timeout.c b/block/blk-timeout.c
index 652d4d4d3e97..3c622335811f 100644
--- a/block/blk-timeout.c
+++ b/block/blk-timeout.c
@@ -93,7 +93,7 @@ static void blk_rq_timed_out(struct request *req)
 		blk_add_timer(req);
 		blk_clear_rq_complete(req);
 		break;
-	case BLK_EH_NOT_HANDLED:
+	case BLK_EH_DONE:
 		/*
 		 * LLD handles this for now but in the future
 		 * we can send a request msg to abort the command
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 1147e4680c80..09d9959287fe 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -328,7 +328,7 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
 			}
 			blk_mq_requeue_request(req, true);
 			nbd_config_put(nbd);
-			return BLK_EH_NOT_HANDLED;
+			return BLK_EH_DONE;
 		}
 	} else {
 		dev_err_ratelimited(nbd_to_dev(nbd),
diff --git a/drivers/message/fusion/mptsas.c b/drivers/message/fusion/mptsas.c
index 86503f60468f..19a5aa70ecda 100644
--- a/drivers/message/fusion/mptsas.c
+++ b/drivers/message/fusion/mptsas.c
@@ -1929,7 +1929,7 @@ static enum blk_eh_timer_return mptsas_eh_timed_out(struct scsi_cmnd *sc)
 	MPT_SCSI_HOST *hd;
 	MPT_ADAPTER   *ioc;
 	VirtDevice    *vdevice;
-	enum blk_eh_timer_return rc = BLK_EH_NOT_HANDLED;
+	enum blk_eh_timer_return rc = BLK_EH_DONE;
 
 	hd = shost_priv(sc->device->host);
 	if (hd == NULL) {
diff --git a/drivers/s390/block/dasd.c b/drivers/s390/block/dasd.c
index 04143c08bd6e..b0e89ca48a3c 100644
--- a/drivers/s390/block/dasd.c
+++ b/drivers/s390/block/dasd.c
@@ -3053,7 +3053,7 @@ static blk_status_t do_dasd_request(struct blk_mq_hw_ctx *hctx,
  *
  * Return values:
  * BLK_EH_RESET_TIMER if the request should be left running
- * BLK_EH_NOT_HANDLED if the request is handled or terminated
+ * BLK_EH_DONE if the request is handled or terminated
  *		      by the driver.
  */
 enum blk_eh_timer_return dasd_times_out(struct request *req, bool reserved)
@@ -3065,7 +3065,7 @@ enum blk_eh_timer_return dasd_times_out(struct request *req, bool reserved)
 	int rc = 0;
 
 	if (!cqr)
-		return BLK_EH_NOT_HANDLED;
+		return BLK_EH_DONE;
 
 	spin_lock_irqsave(&cqr->dq->lock, flags);
 	device = cqr->startdev ? cqr->startdev : block->base;
@@ -3124,7 +3124,7 @@ enum blk_eh_timer_return dasd_times_out(struct request *req, bool reserved)
 	spin_unlock(&block->queue_lock);
 	spin_unlock_irqrestore(&cqr->dq->lock, flags);
 
-	return rc ? BLK_EH_RESET_TIMER : BLK_EH_NOT_HANDLED;
+	return rc ? BLK_EH_RESET_TIMER : BLK_EH_DONE;
 }
 
 static int dasd_init_hctx(struct blk_mq_hw_ctx *hctx, void *data,
diff --git a/drivers/scsi/gdth.c b/drivers/scsi/gdth.c
index c35f05c4c6bb..85604795d8ee 100644
--- a/drivers/scsi/gdth.c
+++ b/drivers/scsi/gdth.c
@@ -3882,7 +3882,7 @@ static enum blk_eh_timer_return gdth_timed_out(struct scsi_cmnd *scp)
 	struct gdth_cmndinfo *cmndinfo = gdth_cmnd_priv(scp);
 	u8 b, t;
 	unsigned long flags;
-	enum blk_eh_timer_return retval = BLK_EH_NOT_HANDLED;
+	enum blk_eh_timer_return retval = BLK_EH_DONE;
 
 	TRACE(("%s() cmd 0x%x\n", scp->cmnd[0], __func__));
 	b = scp->device->channel;
diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 15a2fef51e38..eee43ba83a60 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -1963,7 +1963,7 @@ static int iscsi_has_ping_timed_out(struct iscsi_conn *conn)
 
 enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd *sc)
 {
-	enum blk_eh_timer_return rc = BLK_EH_NOT_HANDLED;
+	enum blk_eh_timer_return rc = BLK_EH_DONE;
 	struct iscsi_task *task = NULL, *running_task;
 	struct iscsi_cls_session *cls_session;
 	struct iscsi_session *session;
diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index b89c6e6c0589..ce656c466ca9 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -2772,7 +2772,7 @@ blk_eh_timer_return megasas_reset_timer(struct scsi_cmnd *scmd)
 
 	if (time_after(jiffies, scmd->jiffies_at_alloc +
 				(scmd_timeout * 2) * HZ)) {
-		return BLK_EH_NOT_HANDLED;
+		return BLK_EH_DONE;
 	}
 
 	instance = (struct megasas_instance *)scmd->device->host->hostdata;
diff --git a/drivers/scsi/mvumi.c b/drivers/scsi/mvumi.c
index fe97401ad192..afd27165cd93 100644
--- a/drivers/scsi/mvumi.c
+++ b/drivers/scsi/mvumi.c
@@ -2155,7 +2155,7 @@ static enum blk_eh_timer_return mvumi_timed_out(struct scsi_cmnd *scmd)
 	mvumi_return_cmd(mhba, cmd);
 	spin_unlock_irqrestore(mhba->shost->host_lock, flags);
 
-	return BLK_EH_NOT_HANDLED;
+	return BLK_EH_DONE;
 }
 
 static int
diff --git a/drivers/scsi/qla4xxx/ql4_os.c b/drivers/scsi/qla4xxx/ql4_os.c
index 94c14ce94da2..0e13349dce57 100644
--- a/drivers/scsi/qla4xxx/ql4_os.c
+++ b/drivers/scsi/qla4xxx/ql4_os.c
@@ -1848,7 +1848,7 @@ static enum blk_eh_timer_return qla4xxx_eh_cmd_timed_out(struct scsi_cmnd *sc)
 	struct iscsi_cls_session *session;
 	struct iscsi_session *sess;
 	unsigned long flags;
-	enum blk_eh_timer_return ret = BLK_EH_NOT_HANDLED;
+	enum blk_eh_timer_return ret = BLK_EH_DONE;
 
 	session = starget_to_session(scsi_target(sc->device));
 	sess = session->dd_data;
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index b36e73090018..9c02ba2e7ef3 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -282,7 +282,7 @@ void scsi_eh_scmd_add(struct scsi_cmnd *scmd)
 enum blk_eh_timer_return scsi_times_out(struct request *req)
 {
 	struct scsi_cmnd *scmd = blk_mq_rq_to_pdu(req);
-	enum blk_eh_timer_return rtn = BLK_EH_NOT_HANDLED;
+	enum blk_eh_timer_return rtn = BLK_EH_DONE;
 	struct Scsi_Host *host = scmd->device->host;
 
 	trace_scsi_dispatch_cmd_timeout(scmd);
@@ -294,7 +294,7 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
 	if (host->hostt->eh_timed_out)
 		rtn = host->hostt->eh_timed_out(scmd);
 
-	if (rtn == BLK_EH_NOT_HANDLED) {
+	if (rtn == BLK_EH_DONE) {
 		if (scsi_abort_command(scmd) != SUCCESS) {
 			set_host_byte(scmd, DID_TIME_OUT);
 			scsi_eh_scmd_add(scmd);
diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index be3be0f9cb2d..90075a0ddcfe 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -2087,7 +2087,7 @@ fc_eh_timed_out(struct scsi_cmnd *scmd)
 	if (rport->port_state == FC_PORTSTATE_BLOCKED)
 		return BLK_EH_RESET_TIMER;
 
-	return BLK_EH_NOT_HANDLED;
+	return BLK_EH_DONE;
 }
 EXPORT_SYMBOL(fc_eh_timed_out);
 
@@ -3592,7 +3592,7 @@ fc_bsg_job_timeout(struct request *req)
 
 	/* the blk_end_sync_io() doesn't check the error */
 	if (!inflight)
-		return BLK_EH_NOT_HANDLED;
+		return BLK_EH_DONE;
 	else
 		return BLK_EH_HANDLED;
 }
diff --git a/drivers/scsi/scsi_transport_srp.c b/drivers/scsi/scsi_transport_srp.c
index 36f6190931bc..a9c1c991da35 100644
--- a/drivers/scsi/scsi_transport_srp.c
+++ b/drivers/scsi/scsi_transport_srp.c
@@ -587,7 +587,7 @@ EXPORT_SYMBOL(srp_reconnect_rport);
  *
  * If a timeout occurs while an rport is in the blocked state, ask the SCSI
  * EH to continue waiting (BLK_EH_RESET_TIMER). Otherwise let the SCSI core
- * handle the timeout (BLK_EH_NOT_HANDLED).
+ * handle the timeout (BLK_EH_DONE).
  *
  * Note: This function is called from soft-IRQ context and with the request
  * queue lock held.
@@ -602,7 +602,7 @@ enum blk_eh_timer_return srp_timed_out(struct scsi_cmnd *scmd)
 	pr_debug("timeout for sdev %s\n", dev_name(&sdev->sdev_gendev));
 	return rport->fast_io_fail_tmo < 0 && rport->dev_loss_tmo < 0 &&
 		i->f->reset_timer_if_blocked && scsi_device_blocked(sdev) ?
-		BLK_EH_RESET_TIMER : BLK_EH_NOT_HANDLED;
+		BLK_EH_RESET_TIMER : BLK_EH_DONE;
 }
 EXPORT_SYMBOL(srp_timed_out);
 
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 00e79057f870..d0a1674915a1 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -6497,12 +6497,12 @@ static enum blk_eh_timer_return ufshcd_eh_timed_out(struct scsi_cmnd *scmd)
 	bool found = false;
 
 	if (!scmd || !scmd->device || !scmd->device->host)
-		return BLK_EH_NOT_HANDLED;
+		return BLK_EH_DONE;
 
 	host = scmd->device->host;
 	hba = shost_priv(host);
 	if (!hba)
-		return BLK_EH_NOT_HANDLED;
+		return BLK_EH_DONE;
 
 	spin_lock_irqsave(host->host_lock, flags);
 
@@ -6520,7 +6520,7 @@ static enum blk_eh_timer_return ufshcd_eh_timed_out(struct scsi_cmnd *scmd)
 	 * SCSI command was not actually dispatched to UFS driver, otherwise
 	 * let SCSI layer handle the error as usual.
 	 */
-	return found ? BLK_EH_NOT_HANDLED : BLK_EH_RESET_TIMER;
+	return found ? BLK_EH_DONE : BLK_EH_RESET_TIMER;
 }
 
 static const struct attribute_group *ufshcd_driver_groups[] = {
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index f3999719f828..ba0844c176f2 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -337,7 +337,7 @@ typedef int (init_rq_fn)(struct request_queue *, struct request *, gfp_t);
 typedef void (exit_rq_fn)(struct request_queue *, struct request *);
 
 enum blk_eh_timer_return {
-	BLK_EH_NOT_HANDLED,
+	BLK_EH_DONE,
 	BLK_EH_HANDLED,
 	BLK_EH_RESET_TIMER,
 };
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index 12f454cb6f61..53b485fe9b67 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -307,7 +307,7 @@ struct scsi_host_template {
 	 * EH_HANDLED:		I fixed the error, please complete the command
 	 * EH_RESET_TIMER:	I need more time, reset the timer and
 	 *			begin counting again
-	 * EH_NOT_HANDLED	Begin normal error recovery
+	 * EH_DONE:		Begin normal error recovery
 	 *
 	 * Status: OPTIONAL
 	 */
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 03/14] nvme: return BLK_EH_DONE from ->timeout
  2018-05-23 12:19 BLK_EH_HANDLED Christoph Hellwig
  2018-05-23 12:19 ` [PATCH 01/14] libata: remove ata_scsi_timed_out Christoph Hellwig
  2018-05-23 12:19 ` [PATCH 02/14] block: rename BLK_EH_NOT_HANDLED to BLK_EH_DONE Christoph Hellwig
@ 2018-05-23 12:19 ` Christoph Hellwig
  2018-05-23 12:42   ` Johannes Thumshirn
                     ` (2 more replies)
  2018-05-23 12:19 ` [PATCH 04/14] nbd: complete requests " Christoph Hellwig
                   ` (11 subsequent siblings)
  14 siblings, 3 replies; 39+ messages in thread
From: Christoph Hellwig @ 2018-05-23 12:19 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, Rafael David Tinoco, linux-block, linux-scsi

NVMe always completes the request before returning from ->timeout, either
by polling for it, or by disabling the controller.  Return BLK_EH_DONE so
that the block layer doesn't even try to complete it again.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/nvme/host/pci.c    | 14 +++++---------
 drivers/nvme/host/rdma.c   |  2 +-
 drivers/nvme/target/loop.c |  2 +-
 3 files changed, 7 insertions(+), 11 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 917e1714f7d9..31525324b79f 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1205,7 +1205,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
 		nvme_warn_reset(dev, csts);
 		nvme_dev_disable(dev, false);
 		nvme_reset_ctrl(&dev->ctrl);
-		return BLK_EH_HANDLED;
+		return BLK_EH_DONE;
 	}
 
 	/*
@@ -1215,14 +1215,14 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
 		dev_warn(dev->ctrl.device,
 			 "I/O %d QID %d timeout, completion polled\n",
 			 req->tag, nvmeq->qid);
-		return BLK_EH_HANDLED;
+		return BLK_EH_DONE;
 	}
 
 	/*
 	 * Shutdown immediately if controller times out while starting. The
 	 * reset work will see the pci device disabled when it gets the forced
 	 * cancellation error. All outstanding requests are completed on
-	 * shutdown, so we return BLK_EH_HANDLED.
+	 * shutdown, so we return BLK_EH_DONE.
 	 */
 	switch (dev->ctrl.state) {
 	case NVME_CTRL_CONNECTING:
@@ -1232,7 +1232,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
 			 req->tag, nvmeq->qid);
 		nvme_dev_disable(dev, false);
 		nvme_req(req)->flags |= NVME_REQ_CANCELLED;
-		return BLK_EH_HANDLED;
+		return BLK_EH_DONE;
 	default:
 		break;
 	}
@@ -1249,12 +1249,8 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
 		nvme_dev_disable(dev, false);
 		nvme_reset_ctrl(&dev->ctrl);
 
-		/*
-		 * Mark the request as handled, since the inline shutdown
-		 * forces all outstanding requests to complete.
-		 */
 		nvme_req(req)->flags |= NVME_REQ_CANCELLED;
-		return BLK_EH_HANDLED;
+		return BLK_EH_DONE;
 	}
 
 	if (atomic_dec_return(&dev->ctrl.abort_limit) < 0) {
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 1eb4438a8763..ac7462cd7f0f 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -1598,7 +1598,7 @@ nvme_rdma_timeout(struct request *rq, bool reserved)
 	/* fail with DNR on cmd timeout */
 	nvme_req(rq)->status = NVME_SC_ABORT_REQ | NVME_SC_DNR;
 
-	return BLK_EH_HANDLED;
+	return BLK_EH_DONE;
 }
 
 static blk_status_t nvme_rdma_queue_rq(struct blk_mq_hw_ctx *hctx,
diff --git a/drivers/nvme/target/loop.c b/drivers/nvme/target/loop.c
index 27a8561c0cb9..22e3627bf16b 100644
--- a/drivers/nvme/target/loop.c
+++ b/drivers/nvme/target/loop.c
@@ -146,7 +146,7 @@ nvme_loop_timeout(struct request *rq, bool reserved)
 	/* fail with DNR on admin cmd timeout */
 	nvme_req(rq)->status = NVME_SC_ABORT_REQ | NVME_SC_DNR;
 
-	return BLK_EH_HANDLED;
+	return BLK_EH_DONE;
 }
 
 static blk_status_t nvme_loop_queue_rq(struct blk_mq_hw_ctx *hctx,
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 04/14] nbd: complete requests from ->timeout
  2018-05-23 12:19 BLK_EH_HANDLED Christoph Hellwig
                   ` (2 preceding siblings ...)
  2018-05-23 12:19 ` [PATCH 03/14] nvme: return BLK_EH_DONE from ->timeout Christoph Hellwig
@ 2018-05-23 12:19 ` Christoph Hellwig
  2018-05-23 13:28   ` Hannes Reinecke
  2018-05-23 12:19 ` [PATCH 05/14] mtip32xx: " Christoph Hellwig
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 39+ messages in thread
From: Christoph Hellwig @ 2018-05-23 12:19 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, Rafael David Tinoco, linux-block, linux-scsi

By completing the request entirely in the driver we can remove the
BLK_EH_HANDLED return value and thus the split responsibility between the
driver and the block layer that has been causing trouble.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/block/nbd.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 09d9959287fe..30a851b8a99c 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -300,7 +300,7 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
 
 	if (!refcount_inc_not_zero(&nbd->config_refs)) {
 		cmd->status = BLK_STS_TIMEOUT;
-		return BLK_EH_HANDLED;
+		goto done;
 	}
 	config = nbd->config;
 
@@ -338,8 +338,9 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
 	cmd->status = BLK_STS_IOERR;
 	sock_shutdown(nbd);
 	nbd_config_put(nbd);
-
-	return BLK_EH_HANDLED;
+done:
+	blk_mq_complete_request(req);
+	return BLK_EH_DONE;
 }
 
 /*
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 05/14] mtip32xx: complete requests from ->timeout
  2018-05-23 12:19 BLK_EH_HANDLED Christoph Hellwig
                   ` (3 preceding siblings ...)
  2018-05-23 12:19 ` [PATCH 04/14] nbd: complete requests " Christoph Hellwig
@ 2018-05-23 12:19 ` Christoph Hellwig
  2018-05-23 12:46   ` Johannes Thumshirn
  2018-05-23 13:29   ` Hannes Reinecke
  2018-05-23 12:19 ` [PATCH 06/14] null_blk: " Christoph Hellwig
                   ` (9 subsequent siblings)
  14 siblings, 2 replies; 39+ messages in thread
From: Christoph Hellwig @ 2018-05-23 12:19 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, Rafael David Tinoco, linux-block, linux-scsi

By completing the request entirely in the driver we can remove the
BLK_EH_HANDLED return value and thus the split responsibility between the
driver and the block layer that has been causing trouble.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/block/mtip32xx/mtip32xx.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/block/mtip32xx/mtip32xx.c b/drivers/block/mtip32xx/mtip32xx.c
index 6df5b0b1517a..beace13effe4 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -3720,7 +3720,8 @@ static enum blk_eh_timer_return mtip_cmd_timeout(struct request *req,
 		struct mtip_cmd *cmd = blk_mq_rq_to_pdu(req);
 
 		cmd->status = BLK_STS_TIMEOUT;
-		return BLK_EH_HANDLED;
+		blk_mq_complete_request(req);
+		return BLK_EH_DONE;
 	}
 
 	if (test_bit(req->tag, dd->port->cmds_to_issue))
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 06/14] null_blk: complete requests from ->timeout
  2018-05-23 12:19 BLK_EH_HANDLED Christoph Hellwig
                   ` (4 preceding siblings ...)
  2018-05-23 12:19 ` [PATCH 05/14] mtip32xx: " Christoph Hellwig
@ 2018-05-23 12:19 ` Christoph Hellwig
  2018-05-23 12:46   ` Johannes Thumshirn
  2018-05-23 13:29   ` Hannes Reinecke
  2018-05-23 12:19 ` [PATCH 07/14] scsi_transport_fc: " Christoph Hellwig
                   ` (8 subsequent siblings)
  14 siblings, 2 replies; 39+ messages in thread
From: Christoph Hellwig @ 2018-05-23 12:19 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, Rafael David Tinoco, linux-block, linux-scsi

By completing the request entirely in the driver we can remove the
BLK_EH_HANDLED return value and thus the split responsibility between the
driver and the block layer that has been causing trouble.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/block/null_blk.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/block/null_blk.c b/drivers/block/null_blk.c
index a76553293a31..f86e7af1dc8e 100644
--- a/drivers/block/null_blk.c
+++ b/drivers/block/null_blk.c
@@ -1365,7 +1365,8 @@ static blk_qc_t null_queue_bio(struct request_queue *q, struct bio *bio)
 static enum blk_eh_timer_return null_rq_timed_out_fn(struct request *rq)
 {
 	pr_info("null: rq %p timed out\n", rq);
-	return BLK_EH_HANDLED;
+	blk_mq_complete_request(rq);
+	return BLK_EH_DONE;
 }
 
 static int null_rq_prep_fn(struct request_queue *q, struct request *req)
@@ -1427,7 +1428,8 @@ static void null_request_fn(struct request_queue *q)
 static enum blk_eh_timer_return null_timeout_rq(struct request *rq, bool res)
 {
 	pr_info("null: rq %p timed out\n", rq);
-	return BLK_EH_HANDLED;
+	blk_mq_complete_request(rq);
+	return BLK_EH_DONE;
 }
 
 static blk_status_t null_queue_rq(struct blk_mq_hw_ctx *hctx,
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 07/14] scsi_transport_fc: complete requests from ->timeout
  2018-05-23 12:19 BLK_EH_HANDLED Christoph Hellwig
                   ` (5 preceding siblings ...)
  2018-05-23 12:19 ` [PATCH 06/14] null_blk: " Christoph Hellwig
@ 2018-05-23 12:19 ` Christoph Hellwig
  2018-05-23 12:48   ` Johannes Thumshirn
  2018-05-23 13:30   ` Hannes Reinecke
  2018-05-23 12:19 ` [PATCH 08/14] mmc: " Christoph Hellwig
                   ` (7 subsequent siblings)
  14 siblings, 2 replies; 39+ messages in thread
From: Christoph Hellwig @ 2018-05-23 12:19 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, Rafael David Tinoco, linux-block, linux-scsi

By completing the request entirely in the driver we can remove the
BLK_EH_HANDLED return value and thus the split responsibility between the
driver and the block layer that has been causing trouble.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/scsi_transport_fc.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index 90075a0ddcfe..7a9a65588a1b 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -3591,10 +3591,9 @@ fc_bsg_job_timeout(struct request *req)
 	}
 
 	/* the blk_end_sync_io() doesn't check the error */
-	if (!inflight)
-		return BLK_EH_DONE;
-	else
-		return BLK_EH_HANDLED;
+	if (inflight)
+		blk_mq_complete_request(req);
+	return BLK_EH_DONE;
 }
 
 /**
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 08/14] mmc: complete requests from ->timeout
  2018-05-23 12:19 BLK_EH_HANDLED Christoph Hellwig
                   ` (6 preceding siblings ...)
  2018-05-23 12:19 ` [PATCH 07/14] scsi_transport_fc: " Christoph Hellwig
@ 2018-05-23 12:19 ` Christoph Hellwig
  2018-05-23 13:31   ` Hannes Reinecke
  2018-05-23 12:19 ` [PATCH 09/14] libiscsi: don't try to bypass SCSI EH Christoph Hellwig
                   ` (6 subsequent siblings)
  14 siblings, 1 reply; 39+ messages in thread
From: Christoph Hellwig @ 2018-05-23 12:19 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, Rafael David Tinoco, linux-block, linux-scsi

By completing the request entirely in the driver we can remove the
BLK_EH_HANDLED return value and thus the split responsibility between the
driver and the block layer that has been causing trouble.

[While this keeps existing behavior it seems to mismatch the comment,
 maintainers please chime in!]

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/mmc/core/queue.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index 56e9a803db21..648eb6743ed5 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -111,8 +111,9 @@ static enum blk_eh_timer_return mmc_cqe_timed_out(struct request *req)
 				__mmc_cqe_recovery_notifier(mq);
 			return BLK_EH_RESET_TIMER;
 		}
-		/* No timeout */
-		return BLK_EH_HANDLED;
+		/* No timeout (XXX: huh? comment doesn't make much sense) */
+		blk_mq_complete_request(req);
+		return BLK_EH_DONE;
 	default:
 		/* Timeout is handled by mmc core */
 		return BLK_EH_RESET_TIMER;
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 09/14] libiscsi: don't try to bypass SCSI EH
  2018-05-23 12:19 BLK_EH_HANDLED Christoph Hellwig
                   ` (7 preceding siblings ...)
  2018-05-23 12:19 ` [PATCH 08/14] mmc: " Christoph Hellwig
@ 2018-05-23 12:19 ` Christoph Hellwig
  2018-05-23 13:35   ` Hannes Reinecke
  2018-05-23 12:19 ` [PATCH 10/14] block: remove BLK_EH_HANDLED Christoph Hellwig
                   ` (5 subsequent siblings)
  14 siblings, 1 reply; 39+ messages in thread
From: Christoph Hellwig @ 2018-05-23 12:19 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, Rafael David Tinoco, linux-block, linux-scsi

libiscsi is the only SCSI code that return BLK_EH_HANDLED, thus trying to
bypass the normal SCSI EH code.  We are going to remove this return value
at the block layer, and at least from a quick look it doesn't look too
harmful to try to send an abort for these cases, especially as the first
one should not actually be possible.  If this doesn't work out iscsi
will probably need its own eh_strategy_handler instead to just do the
right thing.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/libiscsi.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index eee43ba83a60..71bdc0b52cf9 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -1982,7 +1982,7 @@ enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd *sc)
 		 * Raced with completion. Blk layer has taken ownership
 		 * so let timeout code complete it now.
 		 */
-		rc = BLK_EH_HANDLED;
+		rc = BLK_EH_DONE;
 		goto done;
 	}
 
@@ -1997,7 +1997,7 @@ enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd *sc)
 		if (unlikely(system_state != SYSTEM_RUNNING)) {
 			sc->result = DID_NO_CONNECT << 16;
 			ISCSI_DBG_EH(session, "sc on shutdown, handled\n");
-			rc = BLK_EH_HANDLED;
+			rc = BLK_EH_DONE;
 			goto done;
 		}
 		/*
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 10/14] block: remove BLK_EH_HANDLED
  2018-05-23 12:19 BLK_EH_HANDLED Christoph Hellwig
                   ` (8 preceding siblings ...)
  2018-05-23 12:19 ` [PATCH 09/14] libiscsi: don't try to bypass SCSI EH Christoph Hellwig
@ 2018-05-23 12:19 ` Christoph Hellwig
  2018-05-23 13:36   ` Hannes Reinecke
  2018-05-23 12:19 ` [PATCH 11/14] block: document the blk_eh_timer_return values Christoph Hellwig
                   ` (4 subsequent siblings)
  14 siblings, 1 reply; 39+ messages in thread
From: Christoph Hellwig @ 2018-05-23 12:19 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, Rafael David Tinoco, linux-block, linux-scsi

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 Documentation/scsi/scsi_eh.txt | 11 -----------
 block/blk-mq.c                 |  3 ---
 block/blk-timeout.c            |  3 ---
 include/linux/blkdev.h         |  1 -
 4 files changed, 18 deletions(-)

diff --git a/Documentation/scsi/scsi_eh.txt b/Documentation/scsi/scsi_eh.txt
index 3ae8419e72cf..1b7436932a2b 100644
--- a/Documentation/scsi/scsi_eh.txt
+++ b/Documentation/scsi/scsi_eh.txt
@@ -82,17 +82,6 @@ function
  1. invokes optional hostt->eh_timed_out() callback.  Return value can
     be one of
 
-    - BLK_EH_HANDLED
-	This indicates that eh_timed_out() dealt with the timeout.
-	The command is passed back to the block layer and completed
-	via __blk_complete_requests().
-
-	*NOTE* After returning BLK_EH_HANDLED the SCSI layer is
-	assumed to be finished with the command, and no other
-	functions from the SCSI layer will be called. So this
-	should typically only be returned if the eh_timed_out()
-	handler raced with normal completion.
-
     - BLK_EH_RESET_TIMER
 	This indicates that more time is required to finish the
 	command.  Timer is restarted.  This action is counted as a
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 546acc8cb201..22ab2a148a9b 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -844,9 +844,6 @@ static void blk_mq_rq_timed_out(struct request *req, bool reserved)
 		ret = ops->timeout(req, reserved);
 
 	switch (ret) {
-	case BLK_EH_HANDLED:
-		__blk_mq_complete_request(req);
-		break;
 	case BLK_EH_RESET_TIMER:
 		/*
 		 * As nothing prevents from completion happening while
diff --git a/block/blk-timeout.c b/block/blk-timeout.c
index 3c622335811f..aae8f233341b 100644
--- a/block/blk-timeout.c
+++ b/block/blk-timeout.c
@@ -86,9 +86,6 @@ static void blk_rq_timed_out(struct request *req)
 	if (q->rq_timed_out_fn)
 		ret = q->rq_timed_out_fn(req);
 	switch (ret) {
-	case BLK_EH_HANDLED:
-		__blk_complete_request(req);
-		break;
 	case BLK_EH_RESET_TIMER:
 		blk_add_timer(req);
 		blk_clear_rq_complete(req);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index ba0844c176f2..3d9d4da4dedd 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -338,7 +338,6 @@ typedef void (exit_rq_fn)(struct request_queue *, struct request *);
 
 enum blk_eh_timer_return {
 	BLK_EH_DONE,
-	BLK_EH_HANDLED,
 	BLK_EH_RESET_TIMER,
 };
 
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 11/14] block: document the blk_eh_timer_return values
  2018-05-23 12:19 BLK_EH_HANDLED Christoph Hellwig
                   ` (9 preceding siblings ...)
  2018-05-23 12:19 ` [PATCH 10/14] block: remove BLK_EH_HANDLED Christoph Hellwig
@ 2018-05-23 12:19 ` Christoph Hellwig
  2018-05-23 13:36   ` Hannes Reinecke
  2018-05-23 12:19 ` [PATCH 12/14] blk-mq: Fix timeout and state order Christoph Hellwig
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 39+ messages in thread
From: Christoph Hellwig @ 2018-05-23 12:19 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, Rafael David Tinoco, linux-block, linux-scsi

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 include/linux/blkdev.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 3d9d4da4dedd..3815d9dcfbe0 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -337,8 +337,8 @@ typedef int (init_rq_fn)(struct request_queue *, struct request *, gfp_t);
 typedef void (exit_rq_fn)(struct request_queue *, struct request *);
 
 enum blk_eh_timer_return {
-	BLK_EH_DONE,
-	BLK_EH_RESET_TIMER,
+	BLK_EH_DONE,		/* drivers has completed the command */
+	BLK_EH_RESET_TIMER,	/* reset timer and try again */
 };
 
 typedef enum blk_eh_timer_return (rq_timed_out_fn)(struct request *);
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 12/14] blk-mq: Fix timeout and state order
  2018-05-23 12:19 BLK_EH_HANDLED Christoph Hellwig
                   ` (10 preceding siblings ...)
  2018-05-23 12:19 ` [PATCH 11/14] block: document the blk_eh_timer_return values Christoph Hellwig
@ 2018-05-23 12:19 ` Christoph Hellwig
  2018-05-23 13:37   ` Hannes Reinecke
  2018-05-23 12:19 ` [PATCH 13/14] blk-mq: Remove generation seqeunce Christoph Hellwig
                   ` (2 subsequent siblings)
  14 siblings, 1 reply; 39+ messages in thread
From: Christoph Hellwig @ 2018-05-23 12:19 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, Rafael David Tinoco, linux-block, linux-scsi

From: Keith Busch <keith.busch@intel.com>

The block layer had been setting the state to in-flight prior to updating
the timer. This is the wrong order since the timeout handler could observe
the in-flight state with the older timeout, believing the request had
expired when in fact it is just getting started.

Signed-off-by: Keith Busch <keith.busch@intel.com>
---
 block/blk-mq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 22ab2a148a9b..614cb03732ed 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -697,8 +697,8 @@ void blk_mq_start_request(struct request *rq)
 	preempt_disable();
 	write_seqcount_begin(&rq->gstate_seq);
 
-	blk_mq_rq_update_state(rq, MQ_RQ_IN_FLIGHT);
 	blk_add_timer(rq);
+	blk_mq_rq_update_state(rq, MQ_RQ_IN_FLIGHT);
 
 	write_seqcount_end(&rq->gstate_seq);
 	preempt_enable();
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 13/14] blk-mq: Remove generation seqeunce
  2018-05-23 12:19 BLK_EH_HANDLED Christoph Hellwig
                   ` (11 preceding siblings ...)
  2018-05-23 12:19 ` [PATCH 12/14] blk-mq: Fix timeout and state order Christoph Hellwig
@ 2018-05-23 12:19 ` Christoph Hellwig
  2018-05-23 14:24   ` Keith Busch
  2018-05-23 12:19 ` [PATCH 14/14] blk-mq: simplify blk_mq_rq_timed_out Christoph Hellwig
  2018-05-23 13:20 ` BLK_EH_HANDLED Hannes Reinecke
  14 siblings, 1 reply; 39+ messages in thread
From: Christoph Hellwig @ 2018-05-23 12:19 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, Rafael David Tinoco, linux-block, linux-scsi

From: Keith Busch <keith.busch@intel.com>

This patch simplifies the timeout handling by relying on the request
reference counting to ensure the iterator is operating on an inflight
and truly timed out request. Since the reference counting prevents the
tag from being reallocated, the block layer no longer needs to prevent
drivers from completing their requests while the timeout handler is
operating on it: a driver completing a request is allowed to proceed to
the next state without additional syncronization with the block layer.

This also removes any need for generation sequence numbers since the
request lifetime is prevented from being reallocated as a new sequence
while timeout handling is operating on it.

To enables this a refcount is added to struct request so that request
users can be sure they're operating on the same request without it
changing while they're processing it.  The request's tag won't be
released for reuse until both the timeout handler and the completion
are done with it.

[hch: slight cleanups]

Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-core.c       |   6 -
 block/blk-mq-debugfs.c |   1 -
 block/blk-mq.c         | 318 +++++++++++------------------------------
 block/blk-mq.h         |  42 +-----
 block/blk-timeout.c    |   1 -
 include/linux/blkdev.h |  35 ++---
 6 files changed, 98 insertions(+), 305 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 43370faee935..cee03cad99f2 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -198,12 +198,6 @@ void blk_rq_init(struct request_queue *q, struct request *rq)
 	rq->internal_tag = -1;
 	rq->start_time_ns = ktime_get_ns();
 	rq->part = NULL;
-	seqcount_init(&rq->gstate_seq);
-	u64_stats_init(&rq->aborted_gstate_sync);
-	/*
-	 * See comment of blk_mq_init_request
-	 */
-	WRITE_ONCE(rq->gstate, MQ_RQ_GEN_INC);
 }
 EXPORT_SYMBOL(blk_rq_init);
 
diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index 3080e18cb859..ffa622366922 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -344,7 +344,6 @@ static const char *const rqf_name[] = {
 	RQF_NAME(STATS),
 	RQF_NAME(SPECIAL_PAYLOAD),
 	RQF_NAME(ZONE_WRITE_LOCKED),
-	RQF_NAME(MQ_TIMEOUT_EXPIRED),
 	RQF_NAME(MQ_POLL_SLEPT),
 };
 #undef RQF_NAME
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 614cb03732ed..8c7b1803b7e9 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -332,6 +332,7 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
 #endif
 
 	data->ctx->rq_dispatched[op_is_sync(op)]++;
+	refcount_set(&rq->ref, 1);
 	return rq;
 }
 
@@ -465,13 +466,27 @@ struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
 }
 EXPORT_SYMBOL_GPL(blk_mq_alloc_request_hctx);
 
+static void __blk_mq_free_request(struct request *rq)
+{
+	struct request_queue *q = rq->q;
+	struct blk_mq_ctx *ctx = rq->mq_ctx;
+	struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, ctx->cpu);
+	const int sched_tag = rq->internal_tag;
+
+	if (rq->tag != -1)
+		blk_mq_put_tag(hctx, hctx->tags, ctx, rq->tag);
+	if (sched_tag != -1)
+		blk_mq_put_tag(hctx, hctx->sched_tags, ctx, sched_tag);
+	blk_mq_sched_restart(hctx);
+	blk_queue_exit(q);
+}
+
 void blk_mq_free_request(struct request *rq)
 {
 	struct request_queue *q = rq->q;
 	struct elevator_queue *e = q->elevator;
 	struct blk_mq_ctx *ctx = rq->mq_ctx;
 	struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, ctx->cpu);
-	const int sched_tag = rq->internal_tag;
 
 	if (rq->rq_flags & RQF_ELVPRIV) {
 		if (e && e->type->ops.mq.finish_request)
@@ -494,13 +509,9 @@ void blk_mq_free_request(struct request *rq)
 	if (blk_rq_rl(rq))
 		blk_put_rl(blk_rq_rl(rq));
 
-	blk_mq_rq_update_state(rq, MQ_RQ_IDLE);
-	if (rq->tag != -1)
-		blk_mq_put_tag(hctx, hctx->tags, ctx, rq->tag);
-	if (sched_tag != -1)
-		blk_mq_put_tag(hctx, hctx->sched_tags, ctx, sched_tag);
-	blk_mq_sched_restart(hctx);
-	blk_queue_exit(q);
+	WRITE_ONCE(rq->state, MQ_RQ_IDLE);
+	if (refcount_dec_and_test(&rq->ref))
+		__blk_mq_free_request(rq);
 }
 EXPORT_SYMBOL_GPL(blk_mq_free_request);
 
@@ -541,14 +552,27 @@ static void __blk_mq_complete_request_remote(void *data)
 	rq->q->softirq_done_fn(rq);
 }
 
-static void __blk_mq_complete_request(struct request *rq)
+
+/**
+ * blk_mq_complete_request - end I/O on a request
+ * @rq:		the request being processed
+ *
+ * Description:
+ *	Ends all I/O on a request. It does not handle partial completions.
+ *	The actual completion happens out-of-order, through a IPI handler.
+ **/
+void blk_mq_complete_request(struct request *rq)
 {
+	struct request_queue *q = rq->q;
 	struct blk_mq_ctx *ctx = rq->mq_ctx;
 	bool shared = false;
 	int cpu;
 
+	if (unlikely(blk_should_fake_timeout(q)))
+		return;
+
 	WARN_ON_ONCE(blk_mq_rq_state(rq) != MQ_RQ_IN_FLIGHT);
-	blk_mq_rq_update_state(rq, MQ_RQ_COMPLETE);
+	WRITE_ONCE(rq->state, MQ_RQ_COMPLETE);
 
 	if (rq->internal_tag != -1)
 		blk_mq_sched_completed_request(rq);
@@ -572,90 +596,6 @@ static void __blk_mq_complete_request(struct request *rq)
 	}
 	put_cpu();
 }
-
-static void hctx_unlock(struct blk_mq_hw_ctx *hctx, int srcu_idx)
-	__releases(hctx->srcu)
-{
-	if (!(hctx->flags & BLK_MQ_F_BLOCKING))
-		rcu_read_unlock();
-	else
-		srcu_read_unlock(hctx->srcu, srcu_idx);
-}
-
-static void hctx_lock(struct blk_mq_hw_ctx *hctx, int *srcu_idx)
-	__acquires(hctx->srcu)
-{
-	if (!(hctx->flags & BLK_MQ_F_BLOCKING)) {
-		/* shut up gcc false positive */
-		*srcu_idx = 0;
-		rcu_read_lock();
-	} else
-		*srcu_idx = srcu_read_lock(hctx->srcu);
-}
-
-static void blk_mq_rq_update_aborted_gstate(struct request *rq, u64 gstate)
-{
-	unsigned long flags;
-
-	/*
-	 * blk_mq_rq_aborted_gstate() is used from the completion path and
-	 * can thus be called from irq context.  u64_stats_fetch in the
-	 * middle of update on the same CPU leads to lockup.  Disable irq
-	 * while updating.
-	 */
-	local_irq_save(flags);
-	u64_stats_update_begin(&rq->aborted_gstate_sync);
-	rq->aborted_gstate = gstate;
-	u64_stats_update_end(&rq->aborted_gstate_sync);
-	local_irq_restore(flags);
-}
-
-static u64 blk_mq_rq_aborted_gstate(struct request *rq)
-{
-	unsigned int start;
-	u64 aborted_gstate;
-
-	do {
-		start = u64_stats_fetch_begin(&rq->aborted_gstate_sync);
-		aborted_gstate = rq->aborted_gstate;
-	} while (u64_stats_fetch_retry(&rq->aborted_gstate_sync, start));
-
-	return aborted_gstate;
-}
-
-/**
- * blk_mq_complete_request - end I/O on a request
- * @rq:		the request being processed
- *
- * Description:
- *	Ends all I/O on a request. It does not handle partial completions.
- *	The actual completion happens out-of-order, through a IPI handler.
- **/
-void blk_mq_complete_request(struct request *rq)
-{
-	struct request_queue *q = rq->q;
-	struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, rq->mq_ctx->cpu);
-	int srcu_idx;
-
-	if (unlikely(blk_should_fake_timeout(q)))
-		return;
-
-	/*
-	 * If @rq->aborted_gstate equals the current instance, timeout is
-	 * claiming @rq and we lost.  This is synchronized through
-	 * hctx_lock().  See blk_mq_timeout_work() for details.
-	 *
-	 * Completion path never blocks and we can directly use RCU here
-	 * instead of hctx_lock() which can be either RCU or SRCU.
-	 * However, that would complicate paths which want to synchronize
-	 * against us.  Let stay in sync with the issue path so that
-	 * hctx_lock() covers both issue and completion paths.
-	 */
-	hctx_lock(hctx, &srcu_idx);
-	if (blk_mq_rq_aborted_gstate(rq) != rq->gstate)
-		__blk_mq_complete_request(rq);
-	hctx_unlock(hctx, srcu_idx);
-}
 EXPORT_SYMBOL(blk_mq_complete_request);
 
 int blk_mq_request_started(struct request *rq)
@@ -683,25 +623,8 @@ void blk_mq_start_request(struct request *rq)
 
 	WARN_ON_ONCE(blk_mq_rq_state(rq) != MQ_RQ_IDLE);
 
-	/*
-	 * Mark @rq in-flight which also advances the generation number,
-	 * and register for timeout.  Protect with a seqcount to allow the
-	 * timeout path to read both @rq->gstate and @rq->deadline
-	 * coherently.
-	 *
-	 * This is the only place where a request is marked in-flight.  If
-	 * the timeout path reads an in-flight @rq->gstate, the
-	 * @rq->deadline it reads together under @rq->gstate_seq is
-	 * guaranteed to be the matching one.
-	 */
-	preempt_disable();
-	write_seqcount_begin(&rq->gstate_seq);
-
 	blk_add_timer(rq);
-	blk_mq_rq_update_state(rq, MQ_RQ_IN_FLIGHT);
-
-	write_seqcount_end(&rq->gstate_seq);
-	preempt_enable();
+	WRITE_ONCE(rq->state, MQ_RQ_IN_FLIGHT);
 
 	if (q->dma_drain_size && blk_rq_bytes(rq)) {
 		/*
@@ -714,11 +637,6 @@ void blk_mq_start_request(struct request *rq)
 }
 EXPORT_SYMBOL(blk_mq_start_request);
 
-/*
- * When we reach here because queue is busy, it's safe to change the state
- * to IDLE without checking @rq->aborted_gstate because we should still be
- * holding the RCU read lock and thus protected against timeout.
- */
 static void __blk_mq_requeue_request(struct request *rq)
 {
 	struct request_queue *q = rq->q;
@@ -728,8 +646,8 @@ static void __blk_mq_requeue_request(struct request *rq)
 	trace_block_rq_requeue(q, rq);
 	wbt_requeue(q->rq_wb, rq);
 
-	if (blk_mq_rq_state(rq) != MQ_RQ_IDLE) {
-		blk_mq_rq_update_state(rq, MQ_RQ_IDLE);
+	if (blk_mq_request_started(rq)) {
+		WRITE_ONCE(rq->state, MQ_RQ_IDLE);
 		if (q->dma_drain_size && blk_rq_bytes(rq))
 			rq->nr_phys_segments--;
 	}
@@ -827,30 +745,16 @@ struct request *blk_mq_tag_to_rq(struct blk_mq_tags *tags, unsigned int tag)
 }
 EXPORT_SYMBOL(blk_mq_tag_to_rq);
 
-struct blk_mq_timeout_data {
-	unsigned long next;
-	unsigned int next_set;
-	unsigned int nr_expired;
-};
-
 static void blk_mq_rq_timed_out(struct request *req, bool reserved)
 {
 	const struct blk_mq_ops *ops = req->q->mq_ops;
 	enum blk_eh_timer_return ret = BLK_EH_RESET_TIMER;
 
-	req->rq_flags |= RQF_MQ_TIMEOUT_EXPIRED;
-
 	if (ops->timeout)
 		ret = ops->timeout(req, reserved);
 
 	switch (ret) {
 	case BLK_EH_RESET_TIMER:
-		/*
-		 * As nothing prevents from completion happening while
-		 * ->aborted_gstate is set, this may lead to ignored
-		 * completions and further spurious timeouts.
-		 */
-		blk_mq_rq_update_aborted_gstate(req, 0);
 		blk_add_timer(req);
 		break;
 	case BLK_EH_DONE:
@@ -861,64 +765,65 @@ static void blk_mq_rq_timed_out(struct request *req, bool reserved)
 	}
 }
 
-static void blk_mq_check_expired(struct blk_mq_hw_ctx *hctx,
-		struct request *rq, void *priv, bool reserved)
+static bool blk_mq_req_expired(struct request *rq, unsigned long *next)
 {
-	struct blk_mq_timeout_data *data = priv;
-	unsigned long gstate, deadline;
-	int start;
-
-	might_sleep();
+	unsigned long deadline;
 
-	if (rq->rq_flags & RQF_MQ_TIMEOUT_EXPIRED)
-		return;
+	if (blk_mq_rq_state(rq) != MQ_RQ_IN_FLIGHT)
+		return false;
 
-	/* read coherent snapshots of @rq->state_gen and @rq->deadline */
-	while (true) {
-		start = read_seqcount_begin(&rq->gstate_seq);
-		gstate = READ_ONCE(rq->gstate);
-		deadline = blk_rq_deadline(rq);
-		if (!read_seqcount_retry(&rq->gstate_seq, start))
-			break;
-		cond_resched();
-	}
+	deadline = blk_rq_deadline(rq);
+	if (time_after_eq(jiffies, deadline))
+		return true;
 
-	/* if in-flight && overdue, mark for abortion */
-	if ((gstate & MQ_RQ_STATE_MASK) == MQ_RQ_IN_FLIGHT &&
-	    time_after_eq(jiffies, deadline)) {
-		blk_mq_rq_update_aborted_gstate(rq, gstate);
-		data->nr_expired++;
-		hctx->nr_expired++;
-	} else if (!data->next_set || time_after(data->next, deadline)) {
-		data->next = deadline;
-		data->next_set = 1;
-	}
+	if (*next == 0)
+		*next = deadline;
+	else if (time_after(*next, deadline))
+		*next = deadline;
+	return false;
 }
 
-static void blk_mq_terminate_expired(struct blk_mq_hw_ctx *hctx,
+static void blk_mq_check_expired(struct blk_mq_hw_ctx *hctx,
 		struct request *rq, void *priv, bool reserved)
 {
+	unsigned long *next = priv;
+
+	/*
+	 * Just do a quick check if it is expired before locking the request in
+	 * so we're not unnecessarilly synchronizing across CPUs.
+	 */
+	if (!blk_mq_req_expired(rq, next))
+		return;
+
+	/*
+	 * We have reason to believe the request may be expired. Take a
+	 * reference on the request to lock this request lifetime into its
+	 * currently allocated context to prevent it from being reallocated in
+	 * the event the completion by-passes this timeout handler.
+	 * 
+	 * If the reference was already released, then the driver beat the
+	 * timeout handler to posting a natural completion.
+	 */
+	if (!refcount_inc_not_zero(&rq->ref))
+		return;
+
 	/*
-	 * We marked @rq->aborted_gstate and waited for RCU.  If there were
-	 * completions that we lost to, they would have finished and
-	 * updated @rq->gstate by now; otherwise, the completion path is
-	 * now guaranteed to see @rq->aborted_gstate and yield.  If
-	 * @rq->aborted_gstate still matches @rq->gstate, @rq is ours.
+	 * The request is now locked and cannot be reallocated underneath the
+	 * timeout handler's processing. Re-verify this exact request is truly
+	 * expired; if it is not expired, then the request was completed and
+	 * reallocated as a new request.
 	 */
-	if (!(rq->rq_flags & RQF_MQ_TIMEOUT_EXPIRED) &&
-	    READ_ONCE(rq->gstate) == rq->aborted_gstate)
+	if (blk_mq_req_expired(rq, next))
 		blk_mq_rq_timed_out(rq, reserved);
+	if (refcount_dec_and_test(&rq->ref))
+		__blk_mq_free_request(rq);
 }
 
 static void blk_mq_timeout_work(struct work_struct *work)
 {
 	struct request_queue *q =
 		container_of(work, struct request_queue, timeout_work);
-	struct blk_mq_timeout_data data = {
-		.next		= 0,
-		.next_set	= 0,
-		.nr_expired	= 0,
-	};
+	unsigned long next = 0;
 	struct blk_mq_hw_ctx *hctx;
 	int i;
 
@@ -938,39 +843,10 @@ static void blk_mq_timeout_work(struct work_struct *work)
 	if (!percpu_ref_tryget(&q->q_usage_counter))
 		return;
 
-	/* scan for the expired ones and set their ->aborted_gstate */
-	blk_mq_queue_tag_busy_iter(q, blk_mq_check_expired, &data);
-
-	if (data.nr_expired) {
-		bool has_rcu = false;
-
-		/*
-		 * Wait till everyone sees ->aborted_gstate.  The
-		 * sequential waits for SRCUs aren't ideal.  If this ever
-		 * becomes a problem, we can add per-hw_ctx rcu_head and
-		 * wait in parallel.
-		 */
-		queue_for_each_hw_ctx(q, hctx, i) {
-			if (!hctx->nr_expired)
-				continue;
-
-			if (!(hctx->flags & BLK_MQ_F_BLOCKING))
-				has_rcu = true;
-			else
-				synchronize_srcu(hctx->srcu);
-
-			hctx->nr_expired = 0;
-		}
-		if (has_rcu)
-			synchronize_rcu();
-
-		/* terminate the ones we won */
-		blk_mq_queue_tag_busy_iter(q, blk_mq_terminate_expired, NULL);
-	}
+	blk_mq_queue_tag_busy_iter(q, blk_mq_check_expired, &next);
 
-	if (data.next_set) {
-		data.next = blk_rq_timeout(round_jiffies_up(data.next));
-		mod_timer(&q->timeout, data.next);
+	if (next != 0) {
+		mod_timer(&q->timeout, next);
 	} else {
 		/*
 		 * Request timeouts are handled as a forward rolling timer. If
@@ -1315,8 +1191,6 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list,
 
 static void __blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx)
 {
-	int srcu_idx;
-
 	/*
 	 * We should be running this queue from one of the CPUs that
 	 * are mapped to it.
@@ -1350,9 +1224,7 @@ static void __blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx)
 
 	might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING);
 
-	hctx_lock(hctx, &srcu_idx);
 	blk_mq_sched_dispatch_requests(hctx);
-	hctx_unlock(hctx, srcu_idx);
 }
 
 static inline int blk_mq_first_mapped_cpu(struct blk_mq_hw_ctx *hctx)
@@ -1439,7 +1311,6 @@ EXPORT_SYMBOL(blk_mq_delay_run_hw_queue);
 
 bool blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async)
 {
-	int srcu_idx;
 	bool need_run;
 
 	/*
@@ -1450,10 +1321,8 @@ bool blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async)
 	 * And queue will be rerun in blk_mq_unquiesce_queue() if it is
 	 * quiesced.
 	 */
-	hctx_lock(hctx, &srcu_idx);
 	need_run = !blk_queue_quiesced(hctx->queue) &&
 		blk_mq_hctx_has_pending(hctx);
-	hctx_unlock(hctx, srcu_idx);
 
 	if (need_run) {
 		__blk_mq_delay_run_hw_queue(hctx, async, 0);
@@ -1809,34 +1678,23 @@ static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 		struct request *rq, blk_qc_t *cookie)
 {
 	blk_status_t ret;
-	int srcu_idx;
 
 	might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING);
 
-	hctx_lock(hctx, &srcu_idx);
-
 	ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false);
 	if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE)
 		blk_mq_sched_insert_request(rq, false, true, false);
 	else if (ret != BLK_STS_OK)
 		blk_mq_end_request(rq, ret);
-
-	hctx_unlock(hctx, srcu_idx);
 }
 
 blk_status_t blk_mq_request_issue_directly(struct request *rq)
 {
-	blk_status_t ret;
-	int srcu_idx;
 	blk_qc_t unused_cookie;
 	struct blk_mq_ctx *ctx = rq->mq_ctx;
 	struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(rq->q, ctx->cpu);
 
-	hctx_lock(hctx, &srcu_idx);
-	ret = __blk_mq_try_issue_directly(hctx, rq, &unused_cookie, true);
-	hctx_unlock(hctx, srcu_idx);
-
-	return ret;
+	return __blk_mq_try_issue_directly(hctx, rq, &unused_cookie, true);
 }
 
 static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
@@ -2046,15 +1904,7 @@ static int blk_mq_init_request(struct blk_mq_tag_set *set, struct request *rq,
 			return ret;
 	}
 
-	seqcount_init(&rq->gstate_seq);
-	u64_stats_init(&rq->aborted_gstate_sync);
-	/*
-	 * start gstate with gen 1 instead of 0, otherwise it will be equal
-	 * to aborted_gstate, and be identified timed out by
-	 * blk_mq_terminate_expired.
-	 */
-	WRITE_ONCE(rq->gstate, MQ_RQ_GEN_INC);
-
+	WRITE_ONCE(rq->state, MQ_RQ_IDLE);
 	return 0;
 }
 
diff --git a/block/blk-mq.h b/block/blk-mq.h
index e1bb420dc5d6..89231e439b2f 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -30,20 +30,6 @@ struct blk_mq_ctx {
 	struct kobject		kobj;
 } ____cacheline_aligned_in_smp;
 
-/*
- * Bits for request->gstate.  The lower two bits carry MQ_RQ_* state value
- * and the upper bits the generation number.
- */
-enum mq_rq_state {
-	MQ_RQ_IDLE		= 0,
-	MQ_RQ_IN_FLIGHT		= 1,
-	MQ_RQ_COMPLETE		= 2,
-
-	MQ_RQ_STATE_BITS	= 2,
-	MQ_RQ_STATE_MASK	= (1 << MQ_RQ_STATE_BITS) - 1,
-	MQ_RQ_GEN_INC		= 1 << MQ_RQ_STATE_BITS,
-};
-
 void blk_mq_freeze_queue(struct request_queue *q);
 void blk_mq_free_queue(struct request_queue *q);
 int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr);
@@ -107,33 +93,9 @@ void blk_mq_release(struct request_queue *q);
  * blk_mq_rq_state() - read the current MQ_RQ_* state of a request
  * @rq: target request.
  */
-static inline int blk_mq_rq_state(struct request *rq)
+static inline enum mq_rq_state blk_mq_rq_state(struct request *rq)
 {
-	return READ_ONCE(rq->gstate) & MQ_RQ_STATE_MASK;
-}
-
-/**
- * blk_mq_rq_update_state() - set the current MQ_RQ_* state of a request
- * @rq: target request.
- * @state: new state to set.
- *
- * Set @rq's state to @state.  The caller is responsible for ensuring that
- * there are no other updaters.  A request can transition into IN_FLIGHT
- * only from IDLE and doing so increments the generation number.
- */
-static inline void blk_mq_rq_update_state(struct request *rq,
-					  enum mq_rq_state state)
-{
-	u64 old_val = READ_ONCE(rq->gstate);
-	u64 new_val = (old_val & ~MQ_RQ_STATE_MASK) | state;
-
-	if (state == MQ_RQ_IN_FLIGHT) {
-		WARN_ON_ONCE((old_val & MQ_RQ_STATE_MASK) != MQ_RQ_IDLE);
-		new_val += MQ_RQ_GEN_INC;
-	}
-
-	/* avoid exposing interim values */
-	WRITE_ONCE(rq->gstate, new_val);
+	return READ_ONCE(rq->state);
 }
 
 static inline struct blk_mq_ctx *__blk_mq_get_ctx(struct request_queue *q,
diff --git a/block/blk-timeout.c b/block/blk-timeout.c
index aae8f233341b..4b8a48d48ba1 100644
--- a/block/blk-timeout.c
+++ b/block/blk-timeout.c
@@ -211,7 +211,6 @@ void blk_add_timer(struct request *req)
 		req->timeout = q->rq_timeout;
 
 	blk_rq_set_deadline(req, jiffies + req->timeout);
-	req->rq_flags &= ~RQF_MQ_TIMEOUT_EXPIRED;
 
 	/*
 	 * Only the non-mq case needs to add the request to a protected list.
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 3815d9dcfbe0..9f06b29adaa4 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -125,15 +125,22 @@ typedef __u32 __bitwise req_flags_t;
 #define RQF_SPECIAL_PAYLOAD	((__force req_flags_t)(1 << 18))
 /* The per-zone write lock is held for this request */
 #define RQF_ZONE_WRITE_LOCKED	((__force req_flags_t)(1 << 19))
-/* timeout is expired */
-#define RQF_MQ_TIMEOUT_EXPIRED	((__force req_flags_t)(1 << 20))
 /* already slept for hybrid poll */
-#define RQF_MQ_POLL_SLEPT	((__force req_flags_t)(1 << 21))
+#define RQF_MQ_POLL_SLEPT	((__force req_flags_t)(1 << 20))
 
 /* flags that prevent us from merging requests: */
 #define RQF_NOMERGE_FLAGS \
 	(RQF_STARTED | RQF_SOFTBARRIER | RQF_FLUSH_SEQ | RQF_SPECIAL_PAYLOAD)
 
+/*
+ * Request state for blk-mq.
+ */
+enum mq_rq_state {
+	MQ_RQ_IDLE		= 0,
+	MQ_RQ_IN_FLIGHT		= 1,
+	MQ_RQ_COMPLETE		= 2,
+};
+
 /*
  * Try to put the fields that are referenced together in the same cacheline.
  *
@@ -236,26 +243,8 @@ struct request {
 
 	unsigned int extra_len;	/* length of alignment and padding */
 
-	/*
-	 * On blk-mq, the lower bits of ->gstate (generation number and
-	 * state) carry the MQ_RQ_* state value and the upper bits the
-	 * generation number which is monotonically incremented and used to
-	 * distinguish the reuse instances.
-	 *
-	 * ->gstate_seq allows updates to ->gstate and other fields
-	 * (currently ->deadline) during request start to be read
-	 * atomically from the timeout path, so that it can operate on a
-	 * coherent set of information.
-	 */
-	seqcount_t gstate_seq;
-	u64 gstate;
-
-	/*
-	 * ->aborted_gstate is used by the timeout to claim a specific
-	 * recycle instance of this request.  See blk_mq_timeout_work().
-	 */
-	struct u64_stats_sync aborted_gstate_sync;
-	u64 aborted_gstate;
+	enum mq_rq_state state;
+	refcount_t ref;
 
 	/* access through blk_rq_set_deadline, blk_rq_deadline */
 	unsigned long __deadline;
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 14/14] blk-mq: simplify blk_mq_rq_timed_out
  2018-05-23 12:19 BLK_EH_HANDLED Christoph Hellwig
                   ` (12 preceding siblings ...)
  2018-05-23 12:19 ` [PATCH 13/14] blk-mq: Remove generation seqeunce Christoph Hellwig
@ 2018-05-23 12:19 ` Christoph Hellwig
  2018-05-23 13:20 ` BLK_EH_HANDLED Hannes Reinecke
  14 siblings, 0 replies; 39+ messages in thread
From: Christoph Hellwig @ 2018-05-23 12:19 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, Rafael David Tinoco, linux-block, linux-scsi

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-mq.c | 22 ++++++++--------------
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 8c7b1803b7e9..592bab689f8e 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -747,22 +747,16 @@ EXPORT_SYMBOL(blk_mq_tag_to_rq);
 
 static void blk_mq_rq_timed_out(struct request *req, bool reserved)
 {
-	const struct blk_mq_ops *ops = req->q->mq_ops;
-	enum blk_eh_timer_return ret = BLK_EH_RESET_TIMER;
+	if (req->q->mq_ops->timeout) {
+		enum blk_eh_timer_return ret;
 
-	if (ops->timeout)
-		ret = ops->timeout(req, reserved);
-
-	switch (ret) {
-	case BLK_EH_RESET_TIMER:
-		blk_add_timer(req);
-		break;
-	case BLK_EH_DONE:
-		break;
-	default:
-		printk(KERN_ERR "block: bad eh return: %d\n", ret);
-		break;
+		ret = req->q->mq_ops->timeout(req, reserved);
+		if (ret == BLK_EH_DONE)
+			return;
+		WARN_ON_ONCE(ret != BLK_EH_RESET_TIMER);
 	}
+
+	blk_add_timer(req);
 }
 
 static bool blk_mq_req_expired(struct request *rq, unsigned long *next)
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH 02/14] block: rename BLK_EH_NOT_HANDLED to BLK_EH_DONE
  2018-05-23 12:19 ` [PATCH 02/14] block: rename BLK_EH_NOT_HANDLED to BLK_EH_DONE Christoph Hellwig
@ 2018-05-23 12:41   ` Johannes Thumshirn
  2018-05-23 13:22   ` Hannes Reinecke
  1 sibling, 0 replies; 39+ messages in thread
From: Johannes Thumshirn @ 2018-05-23 12:41 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Keith Busch, Bart Van Assche, Ming Lei, Josef Bacik,
	Tejun Heo, Lee Duncan, Chris Leech, Rafael David Tinoco,
	linux-block, linux-scsi

Looks good,
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N�rnberg
GF: Felix Imend�rffer, Jane Smithard, Graham Norton
HRB 21284 (AG N�rnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 03/14] nvme: return BLK_EH_DONE from ->timeout
  2018-05-23 12:19 ` [PATCH 03/14] nvme: return BLK_EH_DONE from ->timeout Christoph Hellwig
@ 2018-05-23 12:42   ` Johannes Thumshirn
  2018-05-23 13:27   ` Hannes Reinecke
  2018-05-24  4:45   ` Ming Lei
  2 siblings, 0 replies; 39+ messages in thread
From: Johannes Thumshirn @ 2018-05-23 12:42 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Keith Busch, Bart Van Assche, Ming Lei, Josef Bacik,
	Tejun Heo, Lee Duncan, Chris Leech, Rafael David Tinoco,
	linux-block, linux-scsi

Looks good,
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N�rnberg
GF: Felix Imend�rffer, Jane Smithard, Graham Norton
HRB 21284 (AG N�rnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 05/14] mtip32xx: complete requests from ->timeout
  2018-05-23 12:19 ` [PATCH 05/14] mtip32xx: " Christoph Hellwig
@ 2018-05-23 12:46   ` Johannes Thumshirn
  2018-05-23 13:29   ` Hannes Reinecke
  1 sibling, 0 replies; 39+ messages in thread
From: Johannes Thumshirn @ 2018-05-23 12:46 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Keith Busch, Bart Van Assche, Ming Lei, Josef Bacik,
	Tejun Heo, Lee Duncan, Chris Leech, Rafael David Tinoco,
	linux-block, linux-scsi

Looks good,
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N�rnberg
GF: Felix Imend�rffer, Jane Smithard, Graham Norton
HRB 21284 (AG N�rnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 06/14] null_blk: complete requests from ->timeout
  2018-05-23 12:19 ` [PATCH 06/14] null_blk: " Christoph Hellwig
@ 2018-05-23 12:46   ` Johannes Thumshirn
  2018-05-23 13:29   ` Hannes Reinecke
  1 sibling, 0 replies; 39+ messages in thread
From: Johannes Thumshirn @ 2018-05-23 12:46 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Keith Busch, Bart Van Assche, Ming Lei, Josef Bacik,
	Tejun Heo, Lee Duncan, Chris Leech, Rafael David Tinoco,
	linux-block, linux-scsi

Looks good,
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N�rnberg
GF: Felix Imend�rffer, Jane Smithard, Graham Norton
HRB 21284 (AG N�rnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 07/14] scsi_transport_fc: complete requests from ->timeout
  2018-05-23 12:19 ` [PATCH 07/14] scsi_transport_fc: " Christoph Hellwig
@ 2018-05-23 12:48   ` Johannes Thumshirn
  2018-05-23 13:30   ` Hannes Reinecke
  1 sibling, 0 replies; 39+ messages in thread
From: Johannes Thumshirn @ 2018-05-23 12:48 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Keith Busch, Bart Van Assche, Ming Lei, Josef Bacik,
	Tejun Heo, Lee Duncan, Chris Leech, Rafael David Tinoco,
	linux-block, linux-scsi

Looks good,
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N�rnberg
GF: Felix Imend�rffer, Jane Smithard, Graham Norton
HRB 21284 (AG N�rnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: BLK_EH_HANDLED
  2018-05-23 12:19 BLK_EH_HANDLED Christoph Hellwig
                   ` (13 preceding siblings ...)
  2018-05-23 12:19 ` [PATCH 14/14] blk-mq: simplify blk_mq_rq_timed_out Christoph Hellwig
@ 2018-05-23 13:20 ` Hannes Reinecke
  14 siblings, 0 replies; 39+ messages in thread
From: Hannes Reinecke @ 2018-05-23 13:20 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, Rafael David Tinoco, linux-block, linux-scsi

On 05/23/2018 02:19 PM, Christoph Hellwig wrote:
> Hi all,
> 
> this series removes the BLK_EH_HANDLED return value, and instead places
> responsibility of timed out commands entirely on the drivers.  Except
> for some odd layering vilations in libiscsi this actually is
> surprisingly simple.  The last two pathes contain a respin of Keith'
> series to refcount struct request in the eye of timeouts.  Once the
> BLK_EH_HANDLED special case is gone this actually seems the EH fixing
> approach that seems the most sensible to me.
> 
YES!

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 01/14] libata: remove ata_scsi_timed_out
  2018-05-23 12:19 ` [PATCH 01/14] libata: remove ata_scsi_timed_out Christoph Hellwig
@ 2018-05-23 13:21   ` Hannes Reinecke
  0 siblings, 0 replies; 39+ messages in thread
From: Hannes Reinecke @ 2018-05-23 13:21 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, Rafael David Tinoco, linux-block, linux-scsi

On 05/23/2018 02:19 PM, Christoph Hellwig wrote:
> As far as I can tell this function can't even be called any more, given
> that ATA implements its own eh_strategy_handler with ata_scsi_error, which
> never calls ->eh_timed_out.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/ata/libata-eh.c | 51 -----------------------------------------
>   include/linux/libata.h  |  2 --
>   2 files changed, 53 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 02/14] block: rename BLK_EH_NOT_HANDLED to BLK_EH_DONE
  2018-05-23 12:19 ` [PATCH 02/14] block: rename BLK_EH_NOT_HANDLED to BLK_EH_DONE Christoph Hellwig
  2018-05-23 12:41   ` Johannes Thumshirn
@ 2018-05-23 13:22   ` Hannes Reinecke
  1 sibling, 0 replies; 39+ messages in thread
From: Hannes Reinecke @ 2018-05-23 13:22 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, Rafael David Tinoco, linux-block, linux-scsi

On 05/23/2018 02:19 PM, Christoph Hellwig wrote:
> The BLK_EH_NOT_HANDLED implies nothing happen, but very often that
> is not what is happening - instead the driver already completed the
> command.  Fix the symbolic name to reflect that a little better.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   Documentation/scsi/scsi_eh.txt            | 4 ++--
>   block/blk-mq.c                            | 2 +-
>   block/blk-timeout.c                       | 2 +-
>   drivers/block/nbd.c                       | 2 +-
>   drivers/message/fusion/mptsas.c           | 2 +-
>   drivers/s390/block/dasd.c                 | 6 +++---
>   drivers/scsi/gdth.c                       | 2 +-
>   drivers/scsi/libiscsi.c                   | 2 +-
>   drivers/scsi/megaraid/megaraid_sas_base.c | 2 +-
>   drivers/scsi/mvumi.c                      | 2 +-
>   drivers/scsi/qla4xxx/ql4_os.c             | 2 +-
>   drivers/scsi/scsi_error.c                 | 4 ++--
>   drivers/scsi/scsi_transport_fc.c          | 4 ++--
>   drivers/scsi/scsi_transport_srp.c         | 4 ++--
>   drivers/scsi/ufs/ufshcd.c                 | 6 +++---
>   include/linux/blkdev.h                    | 2 +-
>   include/scsi/scsi_host.h                  | 2 +-
>   17 files changed, 25 insertions(+), 25 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 03/14] nvme: return BLK_EH_DONE from ->timeout
  2018-05-23 12:19 ` [PATCH 03/14] nvme: return BLK_EH_DONE from ->timeout Christoph Hellwig
  2018-05-23 12:42   ` Johannes Thumshirn
@ 2018-05-23 13:27   ` Hannes Reinecke
  2018-05-28 15:24     ` Christoph Hellwig
  2018-05-24  4:45   ` Ming Lei
  2 siblings, 1 reply; 39+ messages in thread
From: Hannes Reinecke @ 2018-05-23 13:27 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, Rafael David Tinoco, linux-block, linux-scsi

On 05/23/2018 02:19 PM, Christoph Hellwig wrote:
> NVMe always completes the request before returning from ->timeout, either
> by polling for it, or by disabling the controller.  Return BLK_EH_DONE so
> that the block layer doesn't even try to complete it again.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/nvme/host/pci.c    | 14 +++++---------
>   drivers/nvme/host/rdma.c , |  2 +-
>   drivers/nvme/target/loop.c |  2 +-
>   3 files changed, 7 insertions(+), 11 deletions(-)
> 
Is there a way of _testing_ this patch?
It looks pretty dodgy, just replacing BLK_EH_HANDLED with 
BLK_EH_NOT_HANDLED.
And, as nothing else has changed, would imply that we can do it on 
older/stable versions, too.
Which means that the original code was buggy to start with.
Hence a test here would be really beneficial.

But then, assuming you did some tests here:

Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 04/14] nbd: complete requests from ->timeout
  2018-05-23 12:19 ` [PATCH 04/14] nbd: complete requests " Christoph Hellwig
@ 2018-05-23 13:28   ` Hannes Reinecke
  2018-05-28 15:25     ` Christoph Hellwig
  0 siblings, 1 reply; 39+ messages in thread
From: Hannes Reinecke @ 2018-05-23 13:28 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, Rafael David Tinoco, linux-block, linux-scsi

On 05/23/2018 02:19 PM, Christoph Hellwig wrote:
> By completing the request entirely in the driver we can remove the
> BLK_EH_HANDLED return value and thus the split responsibility between the
> driver and the block layer that has been causing trouble.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/block/nbd.c | 7 ++++---
>   1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index 09d9959287fe..30a851b8a99c 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -300,7 +300,7 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
>   
>   	if (!refcount_inc_not_zero(&nbd->config_refs)) {
>   		cmd->status = BLK_STS_TIMEOUT;
> -		return BLK_EH_HANDLED;
> +		goto done;
>   	}
>   	config = nbd->config;
>   
> @@ -338,8 +338,9 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
>   	cmd->status = BLK_STS_IOERR;
>   	sock_shutdown(nbd);
>   	nbd_config_put(nbd);
> -
> -	return BLK_EH_HANDLED;
> +done:
> +	blk_mq_complete_request(req);
> +	return BLK_EH_DONE;
>   }
>   
>   /*
> 
Again, some testcase would be nice ...

Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 05/14] mtip32xx: complete requests from ->timeout
  2018-05-23 12:19 ` [PATCH 05/14] mtip32xx: " Christoph Hellwig
  2018-05-23 12:46   ` Johannes Thumshirn
@ 2018-05-23 13:29   ` Hannes Reinecke
  1 sibling, 0 replies; 39+ messages in thread
From: Hannes Reinecke @ 2018-05-23 13:29 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, Rafael David Tinoco, linux-block, linux-scsi

On 05/23/2018 02:19 PM, Christoph Hellwig wrote:
> By completing the request entirely in the driver we can remove the
> BLK_EH_HANDLED return value and thus the split responsibility between the
> driver and the block layer that has been causing trouble.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/block/mtip32xx/mtip32xx.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/block/mtip32xx/mtip32xx.c b/drivers/block/mtip32xx/mtip32xx.c
> index 6df5b0b1517a..beace13effe4 100644
> --- a/drivers/block/mtip32xx/mtip32xx.c
> +++ b/drivers/block/mtip32xx/mtip32xx.c
> @@ -3720,7 +3720,8 @@ static enum blk_eh_timer_return mtip_cmd_timeout(struct request *req,
>   		struct mtip_cmd *cmd = blk_mq_rq_to_pdu(req);
>   
>   		cmd->status = BLK_STS_TIMEOUT;
> -		return BLK_EH_HANDLED;
> +		blk_mq_complete_request(req);
> +		return BLK_EH_DONE;
>   	}
>   
>   	if (test_bit(req->tag, dd->port->cmds_to_issue))
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 06/14] null_blk: complete requests from ->timeout
  2018-05-23 12:19 ` [PATCH 06/14] null_blk: " Christoph Hellwig
  2018-05-23 12:46   ` Johannes Thumshirn
@ 2018-05-23 13:29   ` Hannes Reinecke
  1 sibling, 0 replies; 39+ messages in thread
From: Hannes Reinecke @ 2018-05-23 13:29 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, Rafael David Tinoco, linux-block, linux-scsi

On 05/23/2018 02:19 PM, Christoph Hellwig wrote:
> By completing the request entirely in the driver we can remove the
> BLK_EH_HANDLED return value and thus the split responsibility between the
> driver and the block layer that has been causing trouble.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/block/null_blk.c | 6 ++++--
>   1 file changed, 4 insertions(+), 2 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 07/14] scsi_transport_fc: complete requests from ->timeout
  2018-05-23 12:19 ` [PATCH 07/14] scsi_transport_fc: " Christoph Hellwig
  2018-05-23 12:48   ` Johannes Thumshirn
@ 2018-05-23 13:30   ` Hannes Reinecke
  1 sibling, 0 replies; 39+ messages in thread
From: Hannes Reinecke @ 2018-05-23 13:30 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, Rafael David Tinoco, linux-block, linux-scsi

On 05/23/2018 02:19 PM, Christoph Hellwig wrote:
> By completing the request entirely in the driver we can remove the
> BLK_EH_HANDLED return value and thus the split responsibility between the
> driver and the block layer that has been causing trouble.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/scsi/scsi_transport_fc.c | 7 +++----
>   1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
> index 90075a0ddcfe..7a9a65588a1b 100644
> --- a/drivers/scsi/scsi_transport_fc.c
> +++ b/drivers/scsi/scsi_transport_fc.c
> @@ -3591,10 +3591,9 @@ fc_bsg_job_timeout(struct request *req)
>   	}
>   
>   	/* the blk_end_sync_io() doesn't check the error */
> -	if (!inflight)
> -		return BLK_EH_DONE;
> -	else
> -		return BLK_EH_HANDLED;
> +	if (inflight)
> +		blk_mq_complete_request(req);
> +	return BLK_EH_DONE;
>   }
>   
>   /**
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 08/14] mmc: complete requests from ->timeout
  2018-05-23 12:19 ` [PATCH 08/14] mmc: " Christoph Hellwig
@ 2018-05-23 13:31   ` Hannes Reinecke
  0 siblings, 0 replies; 39+ messages in thread
From: Hannes Reinecke @ 2018-05-23 13:31 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, Rafael David Tinoco, linux-block, linux-scsi

On 05/23/2018 02:19 PM, Christoph Hellwig wrote:
> By completing the request entirely in the driver we can remove the
> BLK_EH_HANDLED return value and thus the split responsibility between the
> driver and the block layer that has been causing trouble.
> 
> [While this keeps existing behavior it seems to mismatch the comment,
>   maintainers please chime in!]
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/mmc/core/queue.c | 5 +++--
>   1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
> index 56e9a803db21..648eb6743ed5 100644
> --- a/drivers/mmc/core/queue.c
> +++ b/drivers/mmc/core/queue.c
> @@ -111,8 +111,9 @@ static enum blk_eh_timer_return mmc_cqe_timed_out(struct request *req)
>   				__mmc_cqe_recovery_notifier(mq);
>   			return BLK_EH_RESET_TIMER;
>   		}
> -		/* No timeout */
> -		return BLK_EH_HANDLED;
> +		/* No timeout (XXX: huh? comment doesn't make much sense) */
> +		blk_mq_complete_request(req);
> +		return BLK_EH_DONE;
>   	default:
>   		/* Timeout is handled by mmc core */
>   		return BLK_EH_RESET_TIMER;
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 09/14] libiscsi: don't try to bypass SCSI EH
  2018-05-23 12:19 ` [PATCH 09/14] libiscsi: don't try to bypass SCSI EH Christoph Hellwig
@ 2018-05-23 13:35   ` Hannes Reinecke
  0 siblings, 0 replies; 39+ messages in thread
From: Hannes Reinecke @ 2018-05-23 13:35 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, Rafael David Tinoco, linux-block, linux-scsi

On 05/23/2018 02:19 PM, Christoph Hellwig wrote:
> libiscsi is the only SCSI code that return BLK_EH_HANDLED, thus trying to
> bypass the normal SCSI EH code.  We are going to remove this return value
> at the block layer, and at least from a quick look it doesn't look too
> harmful to try to send an abort for these cases, especially as the first
> one should not actually be possible.  If this doesn't work out iscsi
> will probably need its own eh_strategy_handler instead to just do the
> right thing.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/scsi/libiscsi.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
> index eee43ba83a60..71bdc0b52cf9 100644
> --- a/drivers/scsi/libiscsi.c
> +++ b/drivers/scsi/libiscsi.c
> @@ -1982,7 +1982,7 @@ enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd *sc)
>   		 * Raced with completion. Blk layer has taken ownership
>   		 * so let timeout code complete it now.
>   		 */
> -		rc = BLK_EH_HANDLED;
> +		rc = BLK_EH_DONE;
>   		goto done;
>   	}
>   
> @@ -1997,7 +1997,7 @@ enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd *sc)
>   		if (unlikely(system_state != SYSTEM_RUNNING)) {
>   			sc->result = DID_NO_CONNECT << 16;
>   			ISCSI_DBG_EH(session, "sc on shutdown, handled\n");
> -			rc = BLK_EH_HANDLED;
> +			rc = BLK_EH_DONE;
>   			goto done;
>   		}
>   		/*
> 
That should be okay, as these two returns are the pathological case 
where the command has already completed before the timeout handler got 
invoked.
IE the very same issue which triggered this patchset in the first place.

Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 10/14] block: remove BLK_EH_HANDLED
  2018-05-23 12:19 ` [PATCH 10/14] block: remove BLK_EH_HANDLED Christoph Hellwig
@ 2018-05-23 13:36   ` Hannes Reinecke
  0 siblings, 0 replies; 39+ messages in thread
From: Hannes Reinecke @ 2018-05-23 13:36 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, Rafael David Tinoco, linux-block, linux-scsi

On 05/23/2018 02:19 PM, Christoph Hellwig wrote:
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   Documentation/scsi/scsi_eh.txt | 11 -----------
>   block/blk-mq.c                 |  3 ---
>   block/blk-timeout.c            |  3 ---
>   include/linux/blkdev.h         |  1 -
>   4 files changed, 18 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 11/14] block: document the blk_eh_timer_return values
  2018-05-23 12:19 ` [PATCH 11/14] block: document the blk_eh_timer_return values Christoph Hellwig
@ 2018-05-23 13:36   ` Hannes Reinecke
  0 siblings, 0 replies; 39+ messages in thread
From: Hannes Reinecke @ 2018-05-23 13:36 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, Rafael David Tinoco, linux-block, linux-scsi

On 05/23/2018 02:19 PM, Christoph Hellwig wrote:
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   include/linux/blkdev.h | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 3d9d4da4dedd..3815d9dcfbe0 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -337,8 +337,8 @@ typedef int (init_rq_fn)(struct request_queue *, struct request *, gfp_t);
>   typedef void (exit_rq_fn)(struct request_queue *, struct request *);
>   
>   enum blk_eh_timer_return {
> -	BLK_EH_DONE,
> -	BLK_EH_RESET_TIMER,
> +	BLK_EH_DONE,		/* drivers has completed the command */
> +	BLK_EH_RESET_TIMER,	/* reset timer and try again */
>   };
>   
>   typedef enum blk_eh_timer_return (rq_timed_out_fn)(struct request *);
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 12/14] blk-mq: Fix timeout and state order
  2018-05-23 12:19 ` [PATCH 12/14] blk-mq: Fix timeout and state order Christoph Hellwig
@ 2018-05-23 13:37   ` Hannes Reinecke
  0 siblings, 0 replies; 39+ messages in thread
From: Hannes Reinecke @ 2018-05-23 13:37 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, Rafael David Tinoco, linux-block, linux-scsi

On 05/23/2018 02:19 PM, Christoph Hellwig wrote:
> From: Keith Busch <keith.busch@intel.com>
> 
> The block layer had been setting the state to in-flight prior to updating
> the timer. This is the wrong order since the timeout handler could observe
> the in-flight state with the older timeout, believing the request had
> expired when in fact it is just getting started.
> 
> Signed-off-by: Keith Busch <keith.busch@intel.com>
> ---
>   block/blk-mq.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 22ab2a148a9b..614cb03732ed 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -697,8 +697,8 @@ void blk_mq_start_request(struct request *rq)
>   	preempt_disable();
>   	write_seqcount_begin(&rq->gstate_seq);
>   
> -	blk_mq_rq_update_state(rq, MQ_RQ_IN_FLIGHT);
>   	blk_add_timer(rq);
> +	blk_mq_rq_update_state(rq, MQ_RQ_IN_FLIGHT);
>   
>   	write_seqcount_end(&rq->gstate_seq);
>   	preempt_enable();
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 13/14] blk-mq: Remove generation seqeunce
  2018-05-23 12:19 ` [PATCH 13/14] blk-mq: Remove generation seqeunce Christoph Hellwig
@ 2018-05-23 14:24   ` Keith Busch
  0 siblings, 0 replies; 39+ messages in thread
From: Keith Busch @ 2018-05-23 14:24 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo,
	Lee Duncan, Chris Leech, Rafael David Tinoco, linux-block,
	linux-scsi

On Wed, May 23, 2018 at 02:19:40PM +0200, Christoph Hellwig wrote:
> -static void hctx_unlock(struct blk_mq_hw_ctx *hctx, int srcu_idx)
> -	__releases(hctx->srcu)
> -{
> -	if (!(hctx->flags & BLK_MQ_F_BLOCKING))
> -		rcu_read_unlock();
> -	else
> -		srcu_read_unlock(hctx->srcu, srcu_idx);
> -}
> -
> -static void hctx_lock(struct blk_mq_hw_ctx *hctx, int *srcu_idx)
> -	__acquires(hctx->srcu)
> -{
> -	if (!(hctx->flags & BLK_MQ_F_BLOCKING)) {
> -		/* shut up gcc false positive */
> -		*srcu_idx = 0;
> -		rcu_read_lock();
> -	} else
> -		*srcu_idx = srcu_read_lock(hctx->srcu);
> -}

I may have too ambitious on removing hctx_lock. That's actually nothing
to do with the completion issues, but it is necessary for quiesce to
work reliably. Otherwise, I still think there is some promise to the
reset of the approach.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 03/14] nvme: return BLK_EH_DONE from ->timeout
  2018-05-23 12:19 ` [PATCH 03/14] nvme: return BLK_EH_DONE from ->timeout Christoph Hellwig
  2018-05-23 12:42   ` Johannes Thumshirn
  2018-05-23 13:27   ` Hannes Reinecke
@ 2018-05-24  4:45   ` Ming Lei
  2018-05-28 11:44     ` Christoph Hellwig
  2 siblings, 1 reply; 39+ messages in thread
From: Ming Lei @ 2018-05-24  4:45 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Keith Busch, Bart Van Assche, Josef Bacik, Tejun Heo,
	Lee Duncan, Chris Leech, Rafael David Tinoco, linux-block,
	linux-scsi

On Wed, May 23, 2018 at 02:19:30PM +0200, Christoph Hellwig wrote:
> NVMe always completes the request before returning from ->timeout, either
> by polling for it, or by disabling the controller.  Return BLK_EH_DONE so
> that the block layer doesn't even try to complete it again.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  drivers/nvme/host/pci.c    | 14 +++++---------
>  drivers/nvme/host/rdma.c   |  2 +-
>  drivers/nvme/target/loop.c |  2 +-
>  3 files changed, 7 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 917e1714f7d9..31525324b79f 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -1205,7 +1205,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>  		nvme_warn_reset(dev, csts);
>  		nvme_dev_disable(dev, false);
>  		nvme_reset_ctrl(&dev->ctrl);
> -		return BLK_EH_HANDLED;
> +		return BLK_EH_DONE;
>  	}
>  
>  	/*
> @@ -1215,14 +1215,14 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>  		dev_warn(dev->ctrl.device,
>  			 "I/O %d QID %d timeout, completion polled\n",
>  			 req->tag, nvmeq->qid);
> -		return BLK_EH_HANDLED;
> +		return BLK_EH_DONE;
>  	}
>  
>  	/*
>  	 * Shutdown immediately if controller times out while starting. The
>  	 * reset work will see the pci device disabled when it gets the forced
>  	 * cancellation error. All outstanding requests are completed on
> -	 * shutdown, so we return BLK_EH_HANDLED.
> +	 * shutdown, so we return BLK_EH_DONE.
>  	 */
>  	switch (dev->ctrl.state) {
>  	case NVME_CTRL_CONNECTING:
> @@ -1232,7 +1232,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>  			 req->tag, nvmeq->qid);
>  		nvme_dev_disable(dev, false);
>  		nvme_req(req)->flags |= NVME_REQ_CANCELLED;
> -		return BLK_EH_HANDLED;
> +		return BLK_EH_DONE;
>  	default:
>  		break;
>  	}
> @@ -1249,12 +1249,8 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>  		nvme_dev_disable(dev, false);
>  		nvme_reset_ctrl(&dev->ctrl);
>  
> -		/*
> -		 * Mark the request as handled, since the inline shutdown
> -		 * forces all outstanding requests to complete.
> -		 */
>  		nvme_req(req)->flags |= NVME_REQ_CANCELLED;
> -		return BLK_EH_HANDLED;
> +		return BLK_EH_DONE;
>  	}
>  
>  	if (atomic_dec_return(&dev->ctrl.abort_limit) < 0) {
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index 1eb4438a8763..ac7462cd7f0f 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -1598,7 +1598,7 @@ nvme_rdma_timeout(struct request *rq, bool reserved)
>  	/* fail with DNR on cmd timeout */
>  	nvme_req(rq)->status = NVME_SC_ABORT_REQ | NVME_SC_DNR;
>  
> -	return BLK_EH_HANDLED;
> +	return BLK_EH_DONE;
>  }
>  
>  static blk_status_t nvme_rdma_queue_rq(struct blk_mq_hw_ctx *hctx,
> diff --git a/drivers/nvme/target/loop.c b/drivers/nvme/target/loop.c
> index 27a8561c0cb9..22e3627bf16b 100644
> --- a/drivers/nvme/target/loop.c
> +++ b/drivers/nvme/target/loop.c
> @@ -146,7 +146,7 @@ nvme_loop_timeout(struct request *rq, bool reserved)
>  	/* fail with DNR on admin cmd timeout */
>  	nvme_req(rq)->status = NVME_SC_ABORT_REQ | NVME_SC_DNR;
>  
> -	return BLK_EH_HANDLED;
> +	return BLK_EH_DONE;
>  }
>  
>  static blk_status_t nvme_loop_queue_rq(struct blk_mq_hw_ctx *hctx,
> -- 
> 2.17.0
> 

This change should have been done after '[PATCH 13/14] blk-mq: Remove
generation seqeunce', otherwise the timed-out request won't be completed
by nvme_cancel_request() at all because we always marked this request as
'COMPLETE' before calling .timeout().

Thanks,
Ming

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 03/14] nvme: return BLK_EH_DONE from ->timeout
  2018-05-24  4:45   ` Ming Lei
@ 2018-05-28 11:44     ` Christoph Hellwig
  2018-05-29  0:50       ` Ming Lei
  0 siblings, 1 reply; 39+ messages in thread
From: Christoph Hellwig @ 2018-05-28 11:44 UTC (permalink / raw)
  To: Ming Lei
  Cc: Christoph Hellwig, Jens Axboe, Keith Busch, Bart Van Assche,
	Josef Bacik, Tejun Heo, Lee Duncan, Chris Leech,
	Rafael David Tinoco, linux-block, linux-scsi

On Thu, May 24, 2018 at 12:45:15PM +0800, Ming Lei wrote:
> This change should have been done after '[PATCH 13/14] blk-mq: Remove
> generation seqeunce', otherwise the timed-out request won't be completed
> by nvme_cancel_request() at all because we always marked this request as
> 'COMPLETE' before calling .timeout().

Yes.  Or we should start out with reverting the whole series introducing
the gstate.  I think it has shown to create much more problems than it
solved, and starting from a clean state without it will allow us to
iterate much saner.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 03/14] nvme: return BLK_EH_DONE from ->timeout
  2018-05-23 13:27   ` Hannes Reinecke
@ 2018-05-28 15:24     ` Christoph Hellwig
  0 siblings, 0 replies; 39+ messages in thread
From: Christoph Hellwig @ 2018-05-28 15:24 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Christoph Hellwig, Jens Axboe, Keith Busch, Bart Van Assche,
	Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan, Chris Leech,
	Rafael David Tinoco, linux-block, linux-scsi

On Wed, May 23, 2018 at 03:27:35PM +0200, Hannes Reinecke wrote:
> Is there a way of _testing_ this patch?

Use the error injection framework Johannes wrote?  Use blktests
(but make sure you have a fixed version of test 011).

> It looks pretty dodgy, just replacing BLK_EH_HANDLED with 
> BLK_EH_NOT_HANDLED.
> And, as nothing else has changed, would imply that we can do it on 
> older/stable versions, too.
> Which means that the original code was buggy to start with.
> Hence a test here would be really beneficial.

NVMe always completed the requests itself, so not too much changed.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 04/14] nbd: complete requests from ->timeout
  2018-05-23 13:28   ` Hannes Reinecke
@ 2018-05-28 15:25     ` Christoph Hellwig
  0 siblings, 0 replies; 39+ messages in thread
From: Christoph Hellwig @ 2018-05-28 15:25 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Christoph Hellwig, Jens Axboe, Keith Busch, Bart Van Assche,
	Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan, Chris Leech,
	Rafael David Tinoco, linux-block, linux-scsi

On Wed, May 23, 2018 at 03:28:38PM +0200, Hannes Reinecke wrote:
> Again, some testcase would be nice ...

For this one I don't have an actual test case.  I'll have to defer
to the nbd maintainers for that.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 03/14] nvme: return BLK_EH_DONE from ->timeout
  2018-05-28 11:44     ` Christoph Hellwig
@ 2018-05-29  0:50       ` Ming Lei
  0 siblings, 0 replies; 39+ messages in thread
From: Ming Lei @ 2018-05-29  0:50 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Keith Busch, Bart Van Assche, Josef Bacik, Tejun Heo,
	Lee Duncan, Chris Leech, Rafael David Tinoco, linux-block,
	linux-scsi

On Mon, May 28, 2018 at 01:44:25PM +0200, Christoph Hellwig wrote:
> On Thu, May 24, 2018 at 12:45:15PM +0800, Ming Lei wrote:
> > This change should have been done after '[PATCH 13/14] blk-mq: Remove
> > generation seqeunce', otherwise the timed-out request won't be completed
> > by nvme_cancel_request() at all because we always marked this request as
> > 'COMPLETE' before calling .timeout().
> 
> Yes.  Or we should start out with reverting the whole series introducing
> the gstate.  I think it has shown to create much more problems than it
> solved, and starting from a clean state without it will allow us to
> iterate much saner.

It may not help for this case by reverting the series introducing the gstate.

This behaviour has been used from the beginning, such as, the code
becomes the following after the revert:

                if (!blk_mark_rq_complete(rq))
                        blk_mq_rq_timed_out(rq, reserved);

Given the patch 13("blk-mq: Remove generation seqeunce") is the 1st one
to change the behaviour, we should put this patch and other related ones
after patch 13 for avoiding to break 'git bisect'.

Thanks,
Ming

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2018-05-29  0:50 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-23 12:19 BLK_EH_HANDLED Christoph Hellwig
2018-05-23 12:19 ` [PATCH 01/14] libata: remove ata_scsi_timed_out Christoph Hellwig
2018-05-23 13:21   ` Hannes Reinecke
2018-05-23 12:19 ` [PATCH 02/14] block: rename BLK_EH_NOT_HANDLED to BLK_EH_DONE Christoph Hellwig
2018-05-23 12:41   ` Johannes Thumshirn
2018-05-23 13:22   ` Hannes Reinecke
2018-05-23 12:19 ` [PATCH 03/14] nvme: return BLK_EH_DONE from ->timeout Christoph Hellwig
2018-05-23 12:42   ` Johannes Thumshirn
2018-05-23 13:27   ` Hannes Reinecke
2018-05-28 15:24     ` Christoph Hellwig
2018-05-24  4:45   ` Ming Lei
2018-05-28 11:44     ` Christoph Hellwig
2018-05-29  0:50       ` Ming Lei
2018-05-23 12:19 ` [PATCH 04/14] nbd: complete requests " Christoph Hellwig
2018-05-23 13:28   ` Hannes Reinecke
2018-05-28 15:25     ` Christoph Hellwig
2018-05-23 12:19 ` [PATCH 05/14] mtip32xx: " Christoph Hellwig
2018-05-23 12:46   ` Johannes Thumshirn
2018-05-23 13:29   ` Hannes Reinecke
2018-05-23 12:19 ` [PATCH 06/14] null_blk: " Christoph Hellwig
2018-05-23 12:46   ` Johannes Thumshirn
2018-05-23 13:29   ` Hannes Reinecke
2018-05-23 12:19 ` [PATCH 07/14] scsi_transport_fc: " Christoph Hellwig
2018-05-23 12:48   ` Johannes Thumshirn
2018-05-23 13:30   ` Hannes Reinecke
2018-05-23 12:19 ` [PATCH 08/14] mmc: " Christoph Hellwig
2018-05-23 13:31   ` Hannes Reinecke
2018-05-23 12:19 ` [PATCH 09/14] libiscsi: don't try to bypass SCSI EH Christoph Hellwig
2018-05-23 13:35   ` Hannes Reinecke
2018-05-23 12:19 ` [PATCH 10/14] block: remove BLK_EH_HANDLED Christoph Hellwig
2018-05-23 13:36   ` Hannes Reinecke
2018-05-23 12:19 ` [PATCH 11/14] block: document the blk_eh_timer_return values Christoph Hellwig
2018-05-23 13:36   ` Hannes Reinecke
2018-05-23 12:19 ` [PATCH 12/14] blk-mq: Fix timeout and state order Christoph Hellwig
2018-05-23 13:37   ` Hannes Reinecke
2018-05-23 12:19 ` [PATCH 13/14] blk-mq: Remove generation seqeunce Christoph Hellwig
2018-05-23 14:24   ` Keith Busch
2018-05-23 12:19 ` [PATCH 14/14] blk-mq: simplify blk_mq_rq_timed_out Christoph Hellwig
2018-05-23 13:20 ` BLK_EH_HANDLED Hannes Reinecke

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.