All of lore.kernel.org
 help / color / mirror / Atom feed
* drop abort sequence, kill BLK_EH_HANDLED v2
@ 2018-05-29 13:52 Christoph Hellwig
  2018-05-29 13:52 ` [PATCH 01/14] libata: remove ata_scsi_timed_out Christoph Hellwig
                   ` (14 more replies)
  0 siblings, 15 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-29 13:52 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, linux-block, linux-scsi

Hi all,

this series removes the BLK_EH_HANDLED return value, and instead places
responsibility of timed out commands entirely on the drivers.  Except
for some odd layering vilations in libiscsi this actually is
surprisingly simple.   Some of the grunt work here is from Keith and
Bart, just repackaged up

Changes since v1:
 - reordered the series
 - added back submission-time hctx lock
 - use cmpxchg to protect against concurrent blk_abort_request calls.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 01/14] libata: remove ata_scsi_timed_out
  2018-05-29 13:52 drop abort sequence, kill BLK_EH_HANDLED v2 Christoph Hellwig
@ 2018-05-29 13:52 ` Christoph Hellwig
  2018-05-29 13:52 ` [PATCH 02/14] blk-mq: Fix timeout and state order Christoph Hellwig
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-29 13:52 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, linux-block, linux-scsi

As far as I can tell this function can't even be called any more, given
that ATA implements its own eh_strategy_handler with ata_scsi_error, which
never calls ->eh_timed_out.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
---
 drivers/ata/libata-eh.c | 51 -----------------------------------------
 include/linux/libata.h  |  2 --
 2 files changed, 53 deletions(-)

diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index c016829a38fd..e87785dec151 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -500,57 +500,6 @@ void ata_eh_release(struct ata_port *ap)
 	mutex_unlock(&ap->host->eh_mutex);
 }
 
-/**
- *	ata_scsi_timed_out - SCSI layer time out callback
- *	@cmd: timed out SCSI command
- *
- *	Handles SCSI layer timeout.  We race with normal completion of
- *	the qc for @cmd.  If the qc is already gone, we lose and let
- *	the scsi command finish (EH_HANDLED).  Otherwise, the qc has
- *	timed out and EH should be invoked.  Prevent ata_qc_complete()
- *	from finishing it by setting EH_SCHEDULED and return
- *	EH_NOT_HANDLED.
- *
- *	TODO: kill this function once old EH is gone.
- *
- *	LOCKING:
- *	Called from timer context
- *
- *	RETURNS:
- *	EH_HANDLED or EH_NOT_HANDLED
- */
-enum blk_eh_timer_return ata_scsi_timed_out(struct scsi_cmnd *cmd)
-{
-	struct Scsi_Host *host = cmd->device->host;
-	struct ata_port *ap = ata_shost_to_port(host);
-	unsigned long flags;
-	struct ata_queued_cmd *qc;
-	enum blk_eh_timer_return ret;
-
-	DPRINTK("ENTER\n");
-
-	if (ap->ops->error_handler) {
-		ret = BLK_EH_NOT_HANDLED;
-		goto out;
-	}
-
-	ret = BLK_EH_HANDLED;
-	spin_lock_irqsave(ap->lock, flags);
-	qc = ata_qc_from_tag(ap, ap->link.active_tag);
-	if (qc) {
-		WARN_ON(qc->scsicmd != cmd);
-		qc->flags |= ATA_QCFLAG_EH_SCHEDULED;
-		qc->err_mask |= AC_ERR_TIMEOUT;
-		ret = BLK_EH_NOT_HANDLED;
-	}
-	spin_unlock_irqrestore(ap->lock, flags);
-
- out:
-	DPRINTK("EXIT, ret=%d\n", ret);
-	return ret;
-}
-EXPORT_SYMBOL(ata_scsi_timed_out);
-
 static void ata_eh_unload(struct ata_port *ap)
 {
 	struct ata_link *link;
diff --git a/include/linux/libata.h b/include/linux/libata.h
index 1795fecdea17..1c113134c98f 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -1133,7 +1133,6 @@ extern int ata_sas_port_start(struct ata_port *ap);
 extern void ata_sas_port_stop(struct ata_port *ap);
 extern int ata_sas_slave_configure(struct scsi_device *, struct ata_port *);
 extern int ata_sas_queuecmd(struct scsi_cmnd *cmd, struct ata_port *ap);
-extern enum blk_eh_timer_return ata_scsi_timed_out(struct scsi_cmnd *cmd);
 extern int sata_scr_valid(struct ata_link *link);
 extern int sata_scr_read(struct ata_link *link, int reg, u32 *val);
 extern int sata_scr_write(struct ata_link *link, int reg, u32 val);
@@ -1359,7 +1358,6 @@ extern struct device_attribute *ata_common_sdev_attrs[];
 	.proc_name		= drv_name,			\
 	.slave_configure	= ata_scsi_slave_config,	\
 	.slave_destroy		= ata_scsi_slave_destroy,	\
-	.eh_timed_out		= ata_scsi_timed_out,		\
 	.bios_param		= ata_std_bios_param,		\
 	.unlock_native_capacity	= ata_scsi_unlock_native_capacity, \
 	.sdev_attrs		= ata_common_sdev_attrs
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 02/14] blk-mq: Fix timeout and state order
  2018-05-29 13:52 drop abort sequence, kill BLK_EH_HANDLED v2 Christoph Hellwig
  2018-05-29 13:52 ` [PATCH 01/14] libata: remove ata_scsi_timed_out Christoph Hellwig
@ 2018-05-29 13:52 ` Christoph Hellwig
  2018-05-29 13:52 ` [PATCH 03/14] blk-mq: Remove generation seqeunce Christoph Hellwig
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-29 13:52 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, linux-block, linux-scsi

From: Keith Busch <keith.busch@intel.com>

The block layer had been setting the state to in-flight prior to updating
the timer. This is the wrong order since the timeout handler could observe
the in-flight state with the older timeout, believing the request had
expired when in fact it is just getting started.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
---
 block/blk-mq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index df928200b17e..3581a1e5c8a7 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -697,8 +697,8 @@ void blk_mq_start_request(struct request *rq)
 	preempt_disable();
 	write_seqcount_begin(&rq->gstate_seq);
 
-	blk_mq_rq_update_state(rq, MQ_RQ_IN_FLIGHT);
 	blk_add_timer(rq);
+	blk_mq_rq_update_state(rq, MQ_RQ_IN_FLIGHT);
 
 	write_seqcount_end(&rq->gstate_seq);
 	preempt_enable();
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 03/14] blk-mq: Remove generation seqeunce
  2018-05-29 13:52 drop abort sequence, kill BLK_EH_HANDLED v2 Christoph Hellwig
  2018-05-29 13:52 ` [PATCH 01/14] libata: remove ata_scsi_timed_out Christoph Hellwig
  2018-05-29 13:52 ` [PATCH 02/14] blk-mq: Fix timeout and state order Christoph Hellwig
@ 2018-05-29 13:52 ` Christoph Hellwig
  2018-06-04  6:56   ` Bart Van Assche
  2018-05-29 13:52 ` [PATCH 04/14] block: rename BLK_EH_NOT_HANDLED to BLK_EH_DONE Christoph Hellwig
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-29 13:52 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, linux-block, linux-scsi

From: Keith Busch <keith.busch@intel.com>

This patch simplifies the timeout handling by relying on the request
reference counting to ensure the iterator is operating on an inflight
and truly timed out request. Since the reference counting prevents the
tag from being reallocated, the block layer no longer needs to prevent
drivers from completing their requests while the timeout handler is
operating on it: a driver completing a request is allowed to proceed to
the next state without additional syncronization with the block layer.

This also removes any need for generation sequence numbers since the
request lifetime is prevented from being reallocated as a new sequence
while timeout handling is operating on it.

To enables this a refcount is added to struct request so that request
users can be sure they're operating on the same request without it
changing while they're processing it.  The request's tag won't be
released for reuse until both the timeout handler and the completion
are done with it.

Signed-off-by: Keith Busch <keith.busch@intel.com>
[hch: slight cleanups, added back submission side hctx lock, use cmpxchg
 for completions]
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-core.c       |   6 -
 block/blk-mq-debugfs.c |   1 -
 block/blk-mq.c         | 258 ++++++++++++-----------------------------
 block/blk-mq.h         |  42 +------
 block/blk-timeout.c    |   1 -
 include/linux/blkdev.h |  35 ++----
 6 files changed, 89 insertions(+), 254 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 43370faee935..cee03cad99f2 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -198,12 +198,6 @@ void blk_rq_init(struct request_queue *q, struct request *rq)
 	rq->internal_tag = -1;
 	rq->start_time_ns = ktime_get_ns();
 	rq->part = NULL;
-	seqcount_init(&rq->gstate_seq);
-	u64_stats_init(&rq->aborted_gstate_sync);
-	/*
-	 * See comment of blk_mq_init_request
-	 */
-	WRITE_ONCE(rq->gstate, MQ_RQ_GEN_INC);
 }
 EXPORT_SYMBOL(blk_rq_init);
 
diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index 3080e18cb859..ffa622366922 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -344,7 +344,6 @@ static const char *const rqf_name[] = {
 	RQF_NAME(STATS),
 	RQF_NAME(SPECIAL_PAYLOAD),
 	RQF_NAME(ZONE_WRITE_LOCKED),
-	RQF_NAME(MQ_TIMEOUT_EXPIRED),
 	RQF_NAME(MQ_POLL_SLEPT),
 };
 #undef RQF_NAME
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 3581a1e5c8a7..270d55cd70b5 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -332,6 +332,7 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
 #endif
 
 	data->ctx->rq_dispatched[op_is_sync(op)]++;
+	refcount_set(&rq->ref, 1);
 	return rq;
 }
 
@@ -465,13 +466,27 @@ struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
 }
 EXPORT_SYMBOL_GPL(blk_mq_alloc_request_hctx);
 
+static void __blk_mq_free_request(struct request *rq)
+{
+	struct request_queue *q = rq->q;
+	struct blk_mq_ctx *ctx = rq->mq_ctx;
+	struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, ctx->cpu);
+	const int sched_tag = rq->internal_tag;
+
+	if (rq->tag != -1)
+		blk_mq_put_tag(hctx, hctx->tags, ctx, rq->tag);
+	if (sched_tag != -1)
+		blk_mq_put_tag(hctx, hctx->sched_tags, ctx, sched_tag);
+	blk_mq_sched_restart(hctx);
+	blk_queue_exit(q);
+}
+
 void blk_mq_free_request(struct request *rq)
 {
 	struct request_queue *q = rq->q;
 	struct elevator_queue *e = q->elevator;
 	struct blk_mq_ctx *ctx = rq->mq_ctx;
 	struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, ctx->cpu);
-	const int sched_tag = rq->internal_tag;
 
 	if (rq->rq_flags & RQF_ELVPRIV) {
 		if (e && e->type->ops.mq.finish_request)
@@ -494,13 +509,9 @@ void blk_mq_free_request(struct request *rq)
 	if (blk_rq_rl(rq))
 		blk_put_rl(blk_rq_rl(rq));
 
-	blk_mq_rq_update_state(rq, MQ_RQ_IDLE);
-	if (rq->tag != -1)
-		blk_mq_put_tag(hctx, hctx->tags, ctx, rq->tag);
-	if (sched_tag != -1)
-		blk_mq_put_tag(hctx, hctx->sched_tags, ctx, sched_tag);
-	blk_mq_sched_restart(hctx);
-	blk_queue_exit(q);
+	WRITE_ONCE(rq->state, MQ_RQ_IDLE);
+	if (refcount_dec_and_test(&rq->ref))
+		__blk_mq_free_request(rq);
 }
 EXPORT_SYMBOL_GPL(blk_mq_free_request);
 
@@ -547,8 +558,9 @@ static void __blk_mq_complete_request(struct request *rq)
 	bool shared = false;
 	int cpu;
 
-	WARN_ON_ONCE(blk_mq_rq_state(rq) != MQ_RQ_IN_FLIGHT);
-	blk_mq_rq_update_state(rq, MQ_RQ_COMPLETE);
+	if (cmpxchg(&rq->state, MQ_RQ_IN_FLIGHT, MQ_RQ_COMPLETE) !=
+			MQ_RQ_IN_FLIGHT)
+		return;
 
 	if (rq->internal_tag != -1)
 		blk_mq_sched_completed_request(rq);
@@ -593,36 +605,6 @@ static void hctx_lock(struct blk_mq_hw_ctx *hctx, int *srcu_idx)
 		*srcu_idx = srcu_read_lock(hctx->srcu);
 }
 
-static void blk_mq_rq_update_aborted_gstate(struct request *rq, u64 gstate)
-{
-	unsigned long flags;
-
-	/*
-	 * blk_mq_rq_aborted_gstate() is used from the completion path and
-	 * can thus be called from irq context.  u64_stats_fetch in the
-	 * middle of update on the same CPU leads to lockup.  Disable irq
-	 * while updating.
-	 */
-	local_irq_save(flags);
-	u64_stats_update_begin(&rq->aborted_gstate_sync);
-	rq->aborted_gstate = gstate;
-	u64_stats_update_end(&rq->aborted_gstate_sync);
-	local_irq_restore(flags);
-}
-
-static u64 blk_mq_rq_aborted_gstate(struct request *rq)
-{
-	unsigned int start;
-	u64 aborted_gstate;
-
-	do {
-		start = u64_stats_fetch_begin(&rq->aborted_gstate_sync);
-		aborted_gstate = rq->aborted_gstate;
-	} while (u64_stats_fetch_retry(&rq->aborted_gstate_sync, start));
-
-	return aborted_gstate;
-}
-
 /**
  * blk_mq_complete_request - end I/O on a request
  * @rq:		the request being processed
@@ -633,28 +615,9 @@ static u64 blk_mq_rq_aborted_gstate(struct request *rq)
  **/
 void blk_mq_complete_request(struct request *rq)
 {
-	struct request_queue *q = rq->q;
-	struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, rq->mq_ctx->cpu);
-	int srcu_idx;
-
-	if (unlikely(blk_should_fake_timeout(q)))
+	if (unlikely(blk_should_fake_timeout(rq->q)))
 		return;
-
-	/*
-	 * If @rq->aborted_gstate equals the current instance, timeout is
-	 * claiming @rq and we lost.  This is synchronized through
-	 * hctx_lock().  See blk_mq_timeout_work() for details.
-	 *
-	 * Completion path never blocks and we can directly use RCU here
-	 * instead of hctx_lock() which can be either RCU or SRCU.
-	 * However, that would complicate paths which want to synchronize
-	 * against us.  Let stay in sync with the issue path so that
-	 * hctx_lock() covers both issue and completion paths.
-	 */
-	hctx_lock(hctx, &srcu_idx);
-	if (blk_mq_rq_aborted_gstate(rq) != rq->gstate)
-		__blk_mq_complete_request(rq);
-	hctx_unlock(hctx, srcu_idx);
+	__blk_mq_complete_request(rq);
 }
 EXPORT_SYMBOL(blk_mq_complete_request);
 
@@ -683,25 +646,8 @@ void blk_mq_start_request(struct request *rq)
 
 	WARN_ON_ONCE(blk_mq_rq_state(rq) != MQ_RQ_IDLE);
 
-	/*
-	 * Mark @rq in-flight which also advances the generation number,
-	 * and register for timeout.  Protect with a seqcount to allow the
-	 * timeout path to read both @rq->gstate and @rq->deadline
-	 * coherently.
-	 *
-	 * This is the only place where a request is marked in-flight.  If
-	 * the timeout path reads an in-flight @rq->gstate, the
-	 * @rq->deadline it reads together under @rq->gstate_seq is
-	 * guaranteed to be the matching one.
-	 */
-	preempt_disable();
-	write_seqcount_begin(&rq->gstate_seq);
-
 	blk_add_timer(rq);
-	blk_mq_rq_update_state(rq, MQ_RQ_IN_FLIGHT);
-
-	write_seqcount_end(&rq->gstate_seq);
-	preempt_enable();
+	WRITE_ONCE(rq->state, MQ_RQ_IN_FLIGHT);
 
 	if (q->dma_drain_size && blk_rq_bytes(rq)) {
 		/*
@@ -714,11 +660,6 @@ void blk_mq_start_request(struct request *rq)
 }
 EXPORT_SYMBOL(blk_mq_start_request);
 
-/*
- * When we reach here because queue is busy, it's safe to change the state
- * to IDLE without checking @rq->aborted_gstate because we should still be
- * holding the RCU read lock and thus protected against timeout.
- */
 static void __blk_mq_requeue_request(struct request *rq)
 {
 	struct request_queue *q = rq->q;
@@ -728,8 +669,8 @@ static void __blk_mq_requeue_request(struct request *rq)
 	trace_block_rq_requeue(q, rq);
 	wbt_requeue(q->rq_wb, rq);
 
-	if (blk_mq_rq_state(rq) != MQ_RQ_IDLE) {
-		blk_mq_rq_update_state(rq, MQ_RQ_IDLE);
+	if (blk_mq_request_started(rq)) {
+		WRITE_ONCE(rq->state, MQ_RQ_IDLE);
 		if (q->dma_drain_size && blk_rq_bytes(rq))
 			rq->nr_phys_segments--;
 	}
@@ -827,33 +768,20 @@ struct request *blk_mq_tag_to_rq(struct blk_mq_tags *tags, unsigned int tag)
 }
 EXPORT_SYMBOL(blk_mq_tag_to_rq);
 
-struct blk_mq_timeout_data {
-	unsigned long next;
-	unsigned int next_set;
-	unsigned int nr_expired;
-};
-
 static void blk_mq_rq_timed_out(struct request *req, bool reserved)
 {
 	const struct blk_mq_ops *ops = req->q->mq_ops;
 	enum blk_eh_timer_return ret = BLK_EH_RESET_TIMER;
 
-	req->rq_flags |= RQF_MQ_TIMEOUT_EXPIRED;
-
 	if (ops->timeout)
 		ret = ops->timeout(req, reserved);
 
 	switch (ret) {
 	case BLK_EH_HANDLED:
-		__blk_mq_complete_request(req);
+		if (blk_mq_rq_state(req) == MQ_RQ_IN_FLIGHT)
+			__blk_mq_complete_request(req);
 		break;
 	case BLK_EH_RESET_TIMER:
-		/*
-		 * As nothing prevents from completion happening while
-		 * ->aborted_gstate is set, this may lead to ignored
-		 * completions and further spurious timeouts.
-		 */
-		blk_mq_rq_update_aborted_gstate(req, 0);
 		blk_add_timer(req);
 		break;
 	case BLK_EH_NOT_HANDLED:
@@ -864,64 +792,65 @@ static void blk_mq_rq_timed_out(struct request *req, bool reserved)
 	}
 }
 
-static void blk_mq_check_expired(struct blk_mq_hw_ctx *hctx,
-		struct request *rq, void *priv, bool reserved)
+static bool blk_mq_req_expired(struct request *rq, unsigned long *next)
 {
-	struct blk_mq_timeout_data *data = priv;
-	unsigned long gstate, deadline;
-	int start;
+	unsigned long deadline;
 
-	might_sleep();
-
-	if (rq->rq_flags & RQF_MQ_TIMEOUT_EXPIRED)
-		return;
+	if (blk_mq_rq_state(rq) != MQ_RQ_IN_FLIGHT)
+		return false;
 
-	/* read coherent snapshots of @rq->state_gen and @rq->deadline */
-	while (true) {
-		start = read_seqcount_begin(&rq->gstate_seq);
-		gstate = READ_ONCE(rq->gstate);
-		deadline = blk_rq_deadline(rq);
-		if (!read_seqcount_retry(&rq->gstate_seq, start))
-			break;
-		cond_resched();
-	}
+	deadline = blk_rq_deadline(rq);
+	if (time_after_eq(jiffies, deadline))
+		return true;
 
-	/* if in-flight && overdue, mark for abortion */
-	if ((gstate & MQ_RQ_STATE_MASK) == MQ_RQ_IN_FLIGHT &&
-	    time_after_eq(jiffies, deadline)) {
-		blk_mq_rq_update_aborted_gstate(rq, gstate);
-		data->nr_expired++;
-		hctx->nr_expired++;
-	} else if (!data->next_set || time_after(data->next, deadline)) {
-		data->next = deadline;
-		data->next_set = 1;
-	}
+	if (*next == 0)
+		*next = deadline;
+	else if (time_after(*next, deadline))
+		*next = deadline;
+	return false;
 }
 
-static void blk_mq_terminate_expired(struct blk_mq_hw_ctx *hctx,
+static void blk_mq_check_expired(struct blk_mq_hw_ctx *hctx,
 		struct request *rq, void *priv, bool reserved)
 {
+	unsigned long *next = priv;
+
 	/*
-	 * We marked @rq->aborted_gstate and waited for RCU.  If there were
-	 * completions that we lost to, they would have finished and
-	 * updated @rq->gstate by now; otherwise, the completion path is
-	 * now guaranteed to see @rq->aborted_gstate and yield.  If
-	 * @rq->aborted_gstate still matches @rq->gstate, @rq is ours.
+	 * Just do a quick check if it is expired before locking the request in
+	 * so we're not unnecessarilly synchronizing across CPUs.
 	 */
-	if (!(rq->rq_flags & RQF_MQ_TIMEOUT_EXPIRED) &&
-	    READ_ONCE(rq->gstate) == rq->aborted_gstate)
+	if (!blk_mq_req_expired(rq, next))
+		return;
+
+	/*
+	 * We have reason to believe the request may be expired. Take a
+	 * reference on the request to lock this request lifetime into its
+	 * currently allocated context to prevent it from being reallocated in
+	 * the event the completion by-passes this timeout handler.
+	 * 
+	 * If the reference was already released, then the driver beat the
+	 * timeout handler to posting a natural completion.
+	 */
+	if (!refcount_inc_not_zero(&rq->ref))
+		return;
+
+	/*
+	 * The request is now locked and cannot be reallocated underneath the
+	 * timeout handler's processing. Re-verify this exact request is truly
+	 * expired; if it is not expired, then the request was completed and
+	 * reallocated as a new request.
+	 */
+	if (blk_mq_req_expired(rq, next))
 		blk_mq_rq_timed_out(rq, reserved);
+	if (refcount_dec_and_test(&rq->ref))
+		__blk_mq_free_request(rq);
 }
 
 static void blk_mq_timeout_work(struct work_struct *work)
 {
 	struct request_queue *q =
 		container_of(work, struct request_queue, timeout_work);
-	struct blk_mq_timeout_data data = {
-		.next		= 0,
-		.next_set	= 0,
-		.nr_expired	= 0,
-	};
+	unsigned long next = 0;
 	struct blk_mq_hw_ctx *hctx;
 	int i;
 
@@ -941,39 +870,10 @@ static void blk_mq_timeout_work(struct work_struct *work)
 	if (!percpu_ref_tryget(&q->q_usage_counter))
 		return;
 
-	/* scan for the expired ones and set their ->aborted_gstate */
-	blk_mq_queue_tag_busy_iter(q, blk_mq_check_expired, &data);
-
-	if (data.nr_expired) {
-		bool has_rcu = false;
-
-		/*
-		 * Wait till everyone sees ->aborted_gstate.  The
-		 * sequential waits for SRCUs aren't ideal.  If this ever
-		 * becomes a problem, we can add per-hw_ctx rcu_head and
-		 * wait in parallel.
-		 */
-		queue_for_each_hw_ctx(q, hctx, i) {
-			if (!hctx->nr_expired)
-				continue;
-
-			if (!(hctx->flags & BLK_MQ_F_BLOCKING))
-				has_rcu = true;
-			else
-				synchronize_srcu(hctx->srcu);
-
-			hctx->nr_expired = 0;
-		}
-		if (has_rcu)
-			synchronize_rcu();
+	blk_mq_queue_tag_busy_iter(q, blk_mq_check_expired, &next);
 
-		/* terminate the ones we won */
-		blk_mq_queue_tag_busy_iter(q, blk_mq_terminate_expired, NULL);
-	}
-
-	if (data.next_set) {
-		data.next = blk_rq_timeout(round_jiffies_up(data.next));
-		mod_timer(&q->timeout, data.next);
+	if (next != 0) {
+		mod_timer(&q->timeout, next);
 	} else {
 		/*
 		 * Request timeouts are handled as a forward rolling timer. If
@@ -2049,15 +1949,7 @@ static int blk_mq_init_request(struct blk_mq_tag_set *set, struct request *rq,
 			return ret;
 	}
 
-	seqcount_init(&rq->gstate_seq);
-	u64_stats_init(&rq->aborted_gstate_sync);
-	/*
-	 * start gstate with gen 1 instead of 0, otherwise it will be equal
-	 * to aborted_gstate, and be identified timed out by
-	 * blk_mq_terminate_expired.
-	 */
-	WRITE_ONCE(rq->gstate, MQ_RQ_GEN_INC);
-
+	WRITE_ONCE(rq->state, MQ_RQ_IDLE);
 	return 0;
 }
 
diff --git a/block/blk-mq.h b/block/blk-mq.h
index e1bb420dc5d6..89231e439b2f 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -30,20 +30,6 @@ struct blk_mq_ctx {
 	struct kobject		kobj;
 } ____cacheline_aligned_in_smp;
 
-/*
- * Bits for request->gstate.  The lower two bits carry MQ_RQ_* state value
- * and the upper bits the generation number.
- */
-enum mq_rq_state {
-	MQ_RQ_IDLE		= 0,
-	MQ_RQ_IN_FLIGHT		= 1,
-	MQ_RQ_COMPLETE		= 2,
-
-	MQ_RQ_STATE_BITS	= 2,
-	MQ_RQ_STATE_MASK	= (1 << MQ_RQ_STATE_BITS) - 1,
-	MQ_RQ_GEN_INC		= 1 << MQ_RQ_STATE_BITS,
-};
-
 void blk_mq_freeze_queue(struct request_queue *q);
 void blk_mq_free_queue(struct request_queue *q);
 int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr);
@@ -107,33 +93,9 @@ void blk_mq_release(struct request_queue *q);
  * blk_mq_rq_state() - read the current MQ_RQ_* state of a request
  * @rq: target request.
  */
-static inline int blk_mq_rq_state(struct request *rq)
+static inline enum mq_rq_state blk_mq_rq_state(struct request *rq)
 {
-	return READ_ONCE(rq->gstate) & MQ_RQ_STATE_MASK;
-}
-
-/**
- * blk_mq_rq_update_state() - set the current MQ_RQ_* state of a request
- * @rq: target request.
- * @state: new state to set.
- *
- * Set @rq's state to @state.  The caller is responsible for ensuring that
- * there are no other updaters.  A request can transition into IN_FLIGHT
- * only from IDLE and doing so increments the generation number.
- */
-static inline void blk_mq_rq_update_state(struct request *rq,
-					  enum mq_rq_state state)
-{
-	u64 old_val = READ_ONCE(rq->gstate);
-	u64 new_val = (old_val & ~MQ_RQ_STATE_MASK) | state;
-
-	if (state == MQ_RQ_IN_FLIGHT) {
-		WARN_ON_ONCE((old_val & MQ_RQ_STATE_MASK) != MQ_RQ_IDLE);
-		new_val += MQ_RQ_GEN_INC;
-	}
-
-	/* avoid exposing interim values */
-	WRITE_ONCE(rq->gstate, new_val);
+	return READ_ONCE(rq->state);
 }
 
 static inline struct blk_mq_ctx *__blk_mq_get_ctx(struct request_queue *q,
diff --git a/block/blk-timeout.c b/block/blk-timeout.c
index 652d4d4d3e97..f95d6e6cbc96 100644
--- a/block/blk-timeout.c
+++ b/block/blk-timeout.c
@@ -214,7 +214,6 @@ void blk_add_timer(struct request *req)
 		req->timeout = q->rq_timeout;
 
 	blk_rq_set_deadline(req, jiffies + req->timeout);
-	req->rq_flags &= ~RQF_MQ_TIMEOUT_EXPIRED;
 
 	/*
 	 * Only the non-mq case needs to add the request to a protected list.
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index f3999719f828..a1c05e85a443 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -125,15 +125,22 @@ typedef __u32 __bitwise req_flags_t;
 #define RQF_SPECIAL_PAYLOAD	((__force req_flags_t)(1 << 18))
 /* The per-zone write lock is held for this request */
 #define RQF_ZONE_WRITE_LOCKED	((__force req_flags_t)(1 << 19))
-/* timeout is expired */
-#define RQF_MQ_TIMEOUT_EXPIRED	((__force req_flags_t)(1 << 20))
 /* already slept for hybrid poll */
-#define RQF_MQ_POLL_SLEPT	((__force req_flags_t)(1 << 21))
+#define RQF_MQ_POLL_SLEPT	((__force req_flags_t)(1 << 20))
 
 /* flags that prevent us from merging requests: */
 #define RQF_NOMERGE_FLAGS \
 	(RQF_STARTED | RQF_SOFTBARRIER | RQF_FLUSH_SEQ | RQF_SPECIAL_PAYLOAD)
 
+/*
+ * Request state for blk-mq.
+ */
+enum mq_rq_state {
+	MQ_RQ_IDLE		= 0,
+	MQ_RQ_IN_FLIGHT		= 1,
+	MQ_RQ_COMPLETE		= 2,
+};
+
 /*
  * Try to put the fields that are referenced together in the same cacheline.
  *
@@ -236,26 +243,8 @@ struct request {
 
 	unsigned int extra_len;	/* length of alignment and padding */
 
-	/*
-	 * On blk-mq, the lower bits of ->gstate (generation number and
-	 * state) carry the MQ_RQ_* state value and the upper bits the
-	 * generation number which is monotonically incremented and used to
-	 * distinguish the reuse instances.
-	 *
-	 * ->gstate_seq allows updates to ->gstate and other fields
-	 * (currently ->deadline) during request start to be read
-	 * atomically from the timeout path, so that it can operate on a
-	 * coherent set of information.
-	 */
-	seqcount_t gstate_seq;
-	u64 gstate;
-
-	/*
-	 * ->aborted_gstate is used by the timeout to claim a specific
-	 * recycle instance of this request.  See blk_mq_timeout_work().
-	 */
-	struct u64_stats_sync aborted_gstate_sync;
-	u64 aborted_gstate;
+	enum mq_rq_state state;
+	refcount_t ref;
 
 	/* access through blk_rq_set_deadline, blk_rq_deadline */
 	unsigned long __deadline;
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 04/14] block: rename BLK_EH_NOT_HANDLED to BLK_EH_DONE
  2018-05-29 13:52 drop abort sequence, kill BLK_EH_HANDLED v2 Christoph Hellwig
                   ` (2 preceding siblings ...)
  2018-05-29 13:52 ` [PATCH 03/14] blk-mq: Remove generation seqeunce Christoph Hellwig
@ 2018-05-29 13:52 ` Christoph Hellwig
  2018-05-29 13:52 ` [PATCH 05/14] nvme: return BLK_EH_DONE from ->timeout Christoph Hellwig
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-29 13:52 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, linux-block, linux-scsi

The BLK_EH_NOT_HANDLED implies nothing happen, but very often that
is not what is happening - instead the driver already completed the
command.  Fix the symbolic name to reflect that a little better.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
---
 Documentation/scsi/scsi_eh.txt            | 4 ++--
 block/blk-mq.c                            | 2 +-
 block/blk-timeout.c                       | 2 +-
 drivers/block/nbd.c                       | 2 +-
 drivers/message/fusion/mptsas.c           | 2 +-
 drivers/s390/block/dasd.c                 | 6 +++---
 drivers/scsi/gdth.c                       | 2 +-
 drivers/scsi/libiscsi.c                   | 2 +-
 drivers/scsi/megaraid/megaraid_sas_base.c | 2 +-
 drivers/scsi/mvumi.c                      | 2 +-
 drivers/scsi/qla4xxx/ql4_os.c             | 2 +-
 drivers/scsi/scsi_error.c                 | 4 ++--
 drivers/scsi/scsi_transport_fc.c          | 4 ++--
 drivers/scsi/scsi_transport_srp.c         | 4 ++--
 drivers/scsi/ufs/ufshcd.c                 | 6 +++---
 include/linux/blkdev.h                    | 2 +-
 include/scsi/scsi_host.h                  | 2 +-
 17 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/Documentation/scsi/scsi_eh.txt b/Documentation/scsi/scsi_eh.txt
index 11e447bdb3a5..3ae8419e72cf 100644
--- a/Documentation/scsi/scsi_eh.txt
+++ b/Documentation/scsi/scsi_eh.txt
@@ -97,9 +97,9 @@ function
 	This indicates that more time is required to finish the
 	command.  Timer is restarted.  This action is counted as a
 	retry and only allowed scmd->allowed + 1(!) times.  Once the
-	limit is reached, action for BLK_EH_NOT_HANDLED is taken instead.
+	limit is reached, action for BLK_EH_DONE is taken instead.
 
-    - BLK_EH_NOT_HANDLED
+    - BLK_EH_DONE
         eh_timed_out() callback did not handle the command.
 	Step #2 is taken.
 
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 270d55cd70b5..a7a601f08bf5 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -784,7 +784,7 @@ static void blk_mq_rq_timed_out(struct request *req, bool reserved)
 	case BLK_EH_RESET_TIMER:
 		blk_add_timer(req);
 		break;
-	case BLK_EH_NOT_HANDLED:
+	case BLK_EH_DONE:
 		break;
 	default:
 		printk(KERN_ERR "block: bad eh return: %d\n", ret);
diff --git a/block/blk-timeout.c b/block/blk-timeout.c
index f95d6e6cbc96..11879e98c249 100644
--- a/block/blk-timeout.c
+++ b/block/blk-timeout.c
@@ -93,7 +93,7 @@ static void blk_rq_timed_out(struct request *req)
 		blk_add_timer(req);
 		blk_clear_rq_complete(req);
 		break;
-	case BLK_EH_NOT_HANDLED:
+	case BLK_EH_DONE:
 		/*
 		 * LLD handles this for now but in the future
 		 * we can send a request msg to abort the command
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 800e1ec71f3d..88ae833aabe9 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -332,7 +332,7 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
 			}
 			blk_mq_requeue_request(req, true);
 			nbd_config_put(nbd);
-			return BLK_EH_NOT_HANDLED;
+			return BLK_EH_DONE;
 		}
 	} else {
 		dev_err_ratelimited(nbd_to_dev(nbd),
diff --git a/drivers/message/fusion/mptsas.c b/drivers/message/fusion/mptsas.c
index 86503f60468f..19a5aa70ecda 100644
--- a/drivers/message/fusion/mptsas.c
+++ b/drivers/message/fusion/mptsas.c
@@ -1929,7 +1929,7 @@ static enum blk_eh_timer_return mptsas_eh_timed_out(struct scsi_cmnd *sc)
 	MPT_SCSI_HOST *hd;
 	MPT_ADAPTER   *ioc;
 	VirtDevice    *vdevice;
-	enum blk_eh_timer_return rc = BLK_EH_NOT_HANDLED;
+	enum blk_eh_timer_return rc = BLK_EH_DONE;
 
 	hd = shost_priv(sc->device->host);
 	if (hd == NULL) {
diff --git a/drivers/s390/block/dasd.c b/drivers/s390/block/dasd.c
index 04143c08bd6e..b0e89ca48a3c 100644
--- a/drivers/s390/block/dasd.c
+++ b/drivers/s390/block/dasd.c
@@ -3053,7 +3053,7 @@ static blk_status_t do_dasd_request(struct blk_mq_hw_ctx *hctx,
  *
  * Return values:
  * BLK_EH_RESET_TIMER if the request should be left running
- * BLK_EH_NOT_HANDLED if the request is handled or terminated
+ * BLK_EH_DONE if the request is handled or terminated
  *		      by the driver.
  */
 enum blk_eh_timer_return dasd_times_out(struct request *req, bool reserved)
@@ -3065,7 +3065,7 @@ enum blk_eh_timer_return dasd_times_out(struct request *req, bool reserved)
 	int rc = 0;
 
 	if (!cqr)
-		return BLK_EH_NOT_HANDLED;
+		return BLK_EH_DONE;
 
 	spin_lock_irqsave(&cqr->dq->lock, flags);
 	device = cqr->startdev ? cqr->startdev : block->base;
@@ -3124,7 +3124,7 @@ enum blk_eh_timer_return dasd_times_out(struct request *req, bool reserved)
 	spin_unlock(&block->queue_lock);
 	spin_unlock_irqrestore(&cqr->dq->lock, flags);
 
-	return rc ? BLK_EH_RESET_TIMER : BLK_EH_NOT_HANDLED;
+	return rc ? BLK_EH_RESET_TIMER : BLK_EH_DONE;
 }
 
 static int dasd_init_hctx(struct blk_mq_hw_ctx *hctx, void *data,
diff --git a/drivers/scsi/gdth.c b/drivers/scsi/gdth.c
index c35f05c4c6bb..85604795d8ee 100644
--- a/drivers/scsi/gdth.c
+++ b/drivers/scsi/gdth.c
@@ -3882,7 +3882,7 @@ static enum blk_eh_timer_return gdth_timed_out(struct scsi_cmnd *scp)
 	struct gdth_cmndinfo *cmndinfo = gdth_cmnd_priv(scp);
 	u8 b, t;
 	unsigned long flags;
-	enum blk_eh_timer_return retval = BLK_EH_NOT_HANDLED;
+	enum blk_eh_timer_return retval = BLK_EH_DONE;
 
 	TRACE(("%s() cmd 0x%x\n", scp->cmnd[0], __func__));
 	b = scp->device->channel;
diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 15a2fef51e38..eee43ba83a60 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -1963,7 +1963,7 @@ static int iscsi_has_ping_timed_out(struct iscsi_conn *conn)
 
 enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd *sc)
 {
-	enum blk_eh_timer_return rc = BLK_EH_NOT_HANDLED;
+	enum blk_eh_timer_return rc = BLK_EH_DONE;
 	struct iscsi_task *task = NULL, *running_task;
 	struct iscsi_cls_session *cls_session;
 	struct iscsi_session *session;
diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index b89c6e6c0589..ce656c466ca9 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -2772,7 +2772,7 @@ blk_eh_timer_return megasas_reset_timer(struct scsi_cmnd *scmd)
 
 	if (time_after(jiffies, scmd->jiffies_at_alloc +
 				(scmd_timeout * 2) * HZ)) {
-		return BLK_EH_NOT_HANDLED;
+		return BLK_EH_DONE;
 	}
 
 	instance = (struct megasas_instance *)scmd->device->host->hostdata;
diff --git a/drivers/scsi/mvumi.c b/drivers/scsi/mvumi.c
index fe97401ad192..afd27165cd93 100644
--- a/drivers/scsi/mvumi.c
+++ b/drivers/scsi/mvumi.c
@@ -2155,7 +2155,7 @@ static enum blk_eh_timer_return mvumi_timed_out(struct scsi_cmnd *scmd)
 	mvumi_return_cmd(mhba, cmd);
 	spin_unlock_irqrestore(mhba->shost->host_lock, flags);
 
-	return BLK_EH_NOT_HANDLED;
+	return BLK_EH_DONE;
 }
 
 static int
diff --git a/drivers/scsi/qla4xxx/ql4_os.c b/drivers/scsi/qla4xxx/ql4_os.c
index 94c14ce94da2..0e13349dce57 100644
--- a/drivers/scsi/qla4xxx/ql4_os.c
+++ b/drivers/scsi/qla4xxx/ql4_os.c
@@ -1848,7 +1848,7 @@ static enum blk_eh_timer_return qla4xxx_eh_cmd_timed_out(struct scsi_cmnd *sc)
 	struct iscsi_cls_session *session;
 	struct iscsi_session *sess;
 	unsigned long flags;
-	enum blk_eh_timer_return ret = BLK_EH_NOT_HANDLED;
+	enum blk_eh_timer_return ret = BLK_EH_DONE;
 
 	session = starget_to_session(scsi_target(sc->device));
 	sess = session->dd_data;
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index b36e73090018..9c02ba2e7ef3 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -282,7 +282,7 @@ void scsi_eh_scmd_add(struct scsi_cmnd *scmd)
 enum blk_eh_timer_return scsi_times_out(struct request *req)
 {
 	struct scsi_cmnd *scmd = blk_mq_rq_to_pdu(req);
-	enum blk_eh_timer_return rtn = BLK_EH_NOT_HANDLED;
+	enum blk_eh_timer_return rtn = BLK_EH_DONE;
 	struct Scsi_Host *host = scmd->device->host;
 
 	trace_scsi_dispatch_cmd_timeout(scmd);
@@ -294,7 +294,7 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
 	if (host->hostt->eh_timed_out)
 		rtn = host->hostt->eh_timed_out(scmd);
 
-	if (rtn == BLK_EH_NOT_HANDLED) {
+	if (rtn == BLK_EH_DONE) {
 		if (scsi_abort_command(scmd) != SUCCESS) {
 			set_host_byte(scmd, DID_TIME_OUT);
 			scsi_eh_scmd_add(scmd);
diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index be3be0f9cb2d..90075a0ddcfe 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -2087,7 +2087,7 @@ fc_eh_timed_out(struct scsi_cmnd *scmd)
 	if (rport->port_state == FC_PORTSTATE_BLOCKED)
 		return BLK_EH_RESET_TIMER;
 
-	return BLK_EH_NOT_HANDLED;
+	return BLK_EH_DONE;
 }
 EXPORT_SYMBOL(fc_eh_timed_out);
 
@@ -3592,7 +3592,7 @@ fc_bsg_job_timeout(struct request *req)
 
 	/* the blk_end_sync_io() doesn't check the error */
 	if (!inflight)
-		return BLK_EH_NOT_HANDLED;
+		return BLK_EH_DONE;
 	else
 		return BLK_EH_HANDLED;
 }
diff --git a/drivers/scsi/scsi_transport_srp.c b/drivers/scsi/scsi_transport_srp.c
index 36f6190931bc..a9c1c991da35 100644
--- a/drivers/scsi/scsi_transport_srp.c
+++ b/drivers/scsi/scsi_transport_srp.c
@@ -587,7 +587,7 @@ EXPORT_SYMBOL(srp_reconnect_rport);
  *
  * If a timeout occurs while an rport is in the blocked state, ask the SCSI
  * EH to continue waiting (BLK_EH_RESET_TIMER). Otherwise let the SCSI core
- * handle the timeout (BLK_EH_NOT_HANDLED).
+ * handle the timeout (BLK_EH_DONE).
  *
  * Note: This function is called from soft-IRQ context and with the request
  * queue lock held.
@@ -602,7 +602,7 @@ enum blk_eh_timer_return srp_timed_out(struct scsi_cmnd *scmd)
 	pr_debug("timeout for sdev %s\n", dev_name(&sdev->sdev_gendev));
 	return rport->fast_io_fail_tmo < 0 && rport->dev_loss_tmo < 0 &&
 		i->f->reset_timer_if_blocked && scsi_device_blocked(sdev) ?
-		BLK_EH_RESET_TIMER : BLK_EH_NOT_HANDLED;
+		BLK_EH_RESET_TIMER : BLK_EH_DONE;
 }
 EXPORT_SYMBOL(srp_timed_out);
 
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 00e79057f870..d0a1674915a1 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -6497,12 +6497,12 @@ static enum blk_eh_timer_return ufshcd_eh_timed_out(struct scsi_cmnd *scmd)
 	bool found = false;
 
 	if (!scmd || !scmd->device || !scmd->device->host)
-		return BLK_EH_NOT_HANDLED;
+		return BLK_EH_DONE;
 
 	host = scmd->device->host;
 	hba = shost_priv(host);
 	if (!hba)
-		return BLK_EH_NOT_HANDLED;
+		return BLK_EH_DONE;
 
 	spin_lock_irqsave(host->host_lock, flags);
 
@@ -6520,7 +6520,7 @@ static enum blk_eh_timer_return ufshcd_eh_timed_out(struct scsi_cmnd *scmd)
 	 * SCSI command was not actually dispatched to UFS driver, otherwise
 	 * let SCSI layer handle the error as usual.
 	 */
-	return found ? BLK_EH_NOT_HANDLED : BLK_EH_RESET_TIMER;
+	return found ? BLK_EH_DONE : BLK_EH_RESET_TIMER;
 }
 
 static const struct attribute_group *ufshcd_driver_groups[] = {
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index a1c05e85a443..f82e05df905b 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -326,7 +326,7 @@ typedef int (init_rq_fn)(struct request_queue *, struct request *, gfp_t);
 typedef void (exit_rq_fn)(struct request_queue *, struct request *);
 
 enum blk_eh_timer_return {
-	BLK_EH_NOT_HANDLED,
+	BLK_EH_DONE,
 	BLK_EH_HANDLED,
 	BLK_EH_RESET_TIMER,
 };
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index 12f454cb6f61..53b485fe9b67 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -307,7 +307,7 @@ struct scsi_host_template {
 	 * EH_HANDLED:		I fixed the error, please complete the command
 	 * EH_RESET_TIMER:	I need more time, reset the timer and
 	 *			begin counting again
-	 * EH_NOT_HANDLED	Begin normal error recovery
+	 * EH_DONE:		Begin normal error recovery
 	 *
 	 * Status: OPTIONAL
 	 */
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 05/14] nvme: return BLK_EH_DONE from ->timeout
  2018-05-29 13:52 drop abort sequence, kill BLK_EH_HANDLED v2 Christoph Hellwig
                   ` (3 preceding siblings ...)
  2018-05-29 13:52 ` [PATCH 04/14] block: rename BLK_EH_NOT_HANDLED to BLK_EH_DONE Christoph Hellwig
@ 2018-05-29 13:52 ` Christoph Hellwig
  2018-05-29 13:52 ` [PATCH 06/14] nbd: complete requests " Christoph Hellwig
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-29 13:52 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, linux-block, linux-scsi

NVMe always completes the request before returning from ->timeout, either
by polling for it, or by disabling the controller.  Return BLK_EH_DONE so
that the block layer doesn't even try to complete it again.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
---
 drivers/nvme/host/pci.c    | 14 +++++---------
 drivers/nvme/host/rdma.c   |  2 +-
 drivers/nvme/target/loop.c |  2 +-
 3 files changed, 7 insertions(+), 11 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 917e1714f7d9..31525324b79f 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1205,7 +1205,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
 		nvme_warn_reset(dev, csts);
 		nvme_dev_disable(dev, false);
 		nvme_reset_ctrl(&dev->ctrl);
-		return BLK_EH_HANDLED;
+		return BLK_EH_DONE;
 	}
 
 	/*
@@ -1215,14 +1215,14 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
 		dev_warn(dev->ctrl.device,
 			 "I/O %d QID %d timeout, completion polled\n",
 			 req->tag, nvmeq->qid);
-		return BLK_EH_HANDLED;
+		return BLK_EH_DONE;
 	}
 
 	/*
 	 * Shutdown immediately if controller times out while starting. The
 	 * reset work will see the pci device disabled when it gets the forced
 	 * cancellation error. All outstanding requests are completed on
-	 * shutdown, so we return BLK_EH_HANDLED.
+	 * shutdown, so we return BLK_EH_DONE.
 	 */
 	switch (dev->ctrl.state) {
 	case NVME_CTRL_CONNECTING:
@@ -1232,7 +1232,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
 			 req->tag, nvmeq->qid);
 		nvme_dev_disable(dev, false);
 		nvme_req(req)->flags |= NVME_REQ_CANCELLED;
-		return BLK_EH_HANDLED;
+		return BLK_EH_DONE;
 	default:
 		break;
 	}
@@ -1249,12 +1249,8 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
 		nvme_dev_disable(dev, false);
 		nvme_reset_ctrl(&dev->ctrl);
 
-		/*
-		 * Mark the request as handled, since the inline shutdown
-		 * forces all outstanding requests to complete.
-		 */
 		nvme_req(req)->flags |= NVME_REQ_CANCELLED;
-		return BLK_EH_HANDLED;
+		return BLK_EH_DONE;
 	}
 
 	if (atomic_dec_return(&dev->ctrl.abort_limit) < 0) {
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 1eb4438a8763..ac7462cd7f0f 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -1598,7 +1598,7 @@ nvme_rdma_timeout(struct request *rq, bool reserved)
 	/* fail with DNR on cmd timeout */
 	nvme_req(rq)->status = NVME_SC_ABORT_REQ | NVME_SC_DNR;
 
-	return BLK_EH_HANDLED;
+	return BLK_EH_DONE;
 }
 
 static blk_status_t nvme_rdma_queue_rq(struct blk_mq_hw_ctx *hctx,
diff --git a/drivers/nvme/target/loop.c b/drivers/nvme/target/loop.c
index 27a8561c0cb9..22e3627bf16b 100644
--- a/drivers/nvme/target/loop.c
+++ b/drivers/nvme/target/loop.c
@@ -146,7 +146,7 @@ nvme_loop_timeout(struct request *rq, bool reserved)
 	/* fail with DNR on admin cmd timeout */
 	nvme_req(rq)->status = NVME_SC_ABORT_REQ | NVME_SC_DNR;
 
-	return BLK_EH_HANDLED;
+	return BLK_EH_DONE;
 }
 
 static blk_status_t nvme_loop_queue_rq(struct blk_mq_hw_ctx *hctx,
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 06/14] nbd: complete requests from ->timeout
  2018-05-29 13:52 drop abort sequence, kill BLK_EH_HANDLED v2 Christoph Hellwig
                   ` (4 preceding siblings ...)
  2018-05-29 13:52 ` [PATCH 05/14] nvme: return BLK_EH_DONE from ->timeout Christoph Hellwig
@ 2018-05-29 13:52 ` Christoph Hellwig
  2018-05-29 13:52 ` [PATCH 07/14] mtip32xx: " Christoph Hellwig
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-29 13:52 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, linux-block, linux-scsi

By completing the request entirely in the driver we can remove the
BLK_EH_HANDLED return value and thus the split responsibility between the
driver and the block layer that has been causing trouble.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
---
 drivers/block/nbd.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 88ae833aabe9..8860b24098bc 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -304,7 +304,7 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
 
 	if (!refcount_inc_not_zero(&nbd->config_refs)) {
 		cmd->status = BLK_STS_TIMEOUT;
-		return BLK_EH_HANDLED;
+		goto done;
 	}
 	config = nbd->config;
 
@@ -342,8 +342,9 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
 	cmd->status = BLK_STS_IOERR;
 	sock_shutdown(nbd);
 	nbd_config_put(nbd);
-
-	return BLK_EH_HANDLED;
+done:
+	blk_mq_complete_request(req);
+	return BLK_EH_DONE;
 }
 
 /*
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 07/14] mtip32xx: complete requests from ->timeout
  2018-05-29 13:52 drop abort sequence, kill BLK_EH_HANDLED v2 Christoph Hellwig
                   ` (5 preceding siblings ...)
  2018-05-29 13:52 ` [PATCH 06/14] nbd: complete requests " Christoph Hellwig
@ 2018-05-29 13:52 ` Christoph Hellwig
  2018-05-29 14:17   ` Johannes Thumshirn
  2018-05-29 13:52 ` [PATCH 08/14] null_blk: " Christoph Hellwig
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-29 13:52 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, linux-block, linux-scsi

By completing the request entirely in the driver we can remove the
BLK_EH_HANDLED return value and thus the split responsibility between the
driver and the block layer that has been causing trouble.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
---
 drivers/block/mtip32xx/mtip32xx.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/block/mtip32xx/mtip32xx.c b/drivers/block/mtip32xx/mtip32xx.c
index e873daca9d13..95657b814543 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -3719,7 +3719,8 @@ static enum blk_eh_timer_return mtip_cmd_timeout(struct request *req,
 		struct mtip_cmd *cmd = blk_mq_rq_to_pdu(req);
 
 		cmd->status = BLK_STS_TIMEOUT;
-		return BLK_EH_HANDLED;
+		blk_mq_complete_request(req);
+		return BLK_EH_DONE;
 	}
 
 	if (test_bit(req->tag, dd->port->cmds_to_issue))
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 08/14] null_blk: complete requests from ->timeout
  2018-05-29 13:52 drop abort sequence, kill BLK_EH_HANDLED v2 Christoph Hellwig
                   ` (6 preceding siblings ...)
  2018-05-29 13:52 ` [PATCH 07/14] mtip32xx: " Christoph Hellwig
@ 2018-05-29 13:52 ` Christoph Hellwig
  2018-05-29 13:52 ` [PATCH 09/14] scsi_transport_fc: " Christoph Hellwig
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-29 13:52 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, linux-block, linux-scsi

By completing the request entirely in the driver we can remove the
BLK_EH_HANDLED return value and thus the split responsibility between the
driver and the block layer that has been causing trouble.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
---
 drivers/block/null_blk.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/block/null_blk.c b/drivers/block/null_blk.c
index b4d368e3ddcd..2bdadd7f1454 100644
--- a/drivers/block/null_blk.c
+++ b/drivers/block/null_blk.c
@@ -1365,7 +1365,8 @@ static blk_qc_t null_queue_bio(struct request_queue *q, struct bio *bio)
 static enum blk_eh_timer_return null_rq_timed_out_fn(struct request *rq)
 {
 	pr_info("null: rq %p timed out\n", rq);
-	return BLK_EH_HANDLED;
+	blk_mq_complete_request(rq);
+	return BLK_EH_DONE;
 }
 
 static int null_rq_prep_fn(struct request_queue *q, struct request *req)
@@ -1427,7 +1428,8 @@ static void null_request_fn(struct request_queue *q)
 static enum blk_eh_timer_return null_timeout_rq(struct request *rq, bool res)
 {
 	pr_info("null: rq %p timed out\n", rq);
-	return BLK_EH_HANDLED;
+	blk_mq_complete_request(rq);
+	return BLK_EH_DONE;
 }
 
 static blk_status_t null_queue_rq(struct blk_mq_hw_ctx *hctx,
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 09/14] scsi_transport_fc: complete requests from ->timeout
  2018-05-29 13:52 drop abort sequence, kill BLK_EH_HANDLED v2 Christoph Hellwig
                   ` (7 preceding siblings ...)
  2018-05-29 13:52 ` [PATCH 08/14] null_blk: " Christoph Hellwig
@ 2018-05-29 13:52 ` Christoph Hellwig
  2018-05-29 13:52 ` [PATCH 10/14] mmc: " Christoph Hellwig
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-29 13:52 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, linux-block, linux-scsi

By completing the request entirely in the driver we can remove the
BLK_EH_HANDLED return value and thus the split responsibility between the
driver and the block layer that has been causing trouble.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
---
 drivers/scsi/scsi_transport_fc.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index 90075a0ddcfe..7a9a65588a1b 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -3591,10 +3591,9 @@ fc_bsg_job_timeout(struct request *req)
 	}
 
 	/* the blk_end_sync_io() doesn't check the error */
-	if (!inflight)
-		return BLK_EH_DONE;
-	else
-		return BLK_EH_HANDLED;
+	if (inflight)
+		blk_mq_complete_request(req);
+	return BLK_EH_DONE;
 }
 
 /**
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 10/14] mmc: complete requests from ->timeout
  2018-05-29 13:52 drop abort sequence, kill BLK_EH_HANDLED v2 Christoph Hellwig
                   ` (8 preceding siblings ...)
  2018-05-29 13:52 ` [PATCH 09/14] scsi_transport_fc: " Christoph Hellwig
@ 2018-05-29 13:52 ` Christoph Hellwig
  2018-05-29 14:55   ` Shawn Lin
  2018-05-29 13:52 ` [PATCH 11/14] libiscsi: don't try to bypass SCSI EH Christoph Hellwig
                   ` (4 subsequent siblings)
  14 siblings, 1 reply; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-29 13:52 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, linux-block, linux-scsi

By completing the request entirely in the driver we can remove the
BLK_EH_HANDLED return value and thus the split responsibility between the
driver and the block layer that has been causing trouble.

[While this keeps existing behavior it seems to mismatch the comment,
 maintainers please chime in!]

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
---
 drivers/mmc/core/queue.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index 56e9a803db21..648eb6743ed5 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -111,8 +111,9 @@ static enum blk_eh_timer_return mmc_cqe_timed_out(struct request *req)
 				__mmc_cqe_recovery_notifier(mq);
 			return BLK_EH_RESET_TIMER;
 		}
-		/* No timeout */
-		return BLK_EH_HANDLED;
+		/* No timeout (XXX: huh? comment doesn't make much sense) */
+		blk_mq_complete_request(req);
+		return BLK_EH_DONE;
 	default:
 		/* Timeout is handled by mmc core */
 		return BLK_EH_RESET_TIMER;
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 11/14] libiscsi: don't try to bypass SCSI EH
  2018-05-29 13:52 drop abort sequence, kill BLK_EH_HANDLED v2 Christoph Hellwig
                   ` (9 preceding siblings ...)
  2018-05-29 13:52 ` [PATCH 10/14] mmc: " Christoph Hellwig
@ 2018-05-29 13:52 ` Christoph Hellwig
  2018-05-30  1:04   ` Lee Duncan
  2018-05-29 13:52 ` [PATCH 12/14] block: remove BLK_EH_HANDLED Christoph Hellwig
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-29 13:52 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, linux-block, linux-scsi

libiscsi is the only SCSI code that return BLK_EH_HANDLED, thus trying to
bypass the normal SCSI EH code.  We are going to remove this return value
at the block layer, and at least from a quick look it doesn't look too
harmful to try to send an abort for these cases, especially as the first
one should not actually be possible.  If this doesn't work out iscsi
will probably need its own eh_strategy_handler instead to just do the
right thing.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
---
 drivers/scsi/libiscsi.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index eee43ba83a60..71bdc0b52cf9 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -1982,7 +1982,7 @@ enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd *sc)
 		 * Raced with completion. Blk layer has taken ownership
 		 * so let timeout code complete it now.
 		 */
-		rc = BLK_EH_HANDLED;
+		rc = BLK_EH_DONE;
 		goto done;
 	}
 
@@ -1997,7 +1997,7 @@ enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd *sc)
 		if (unlikely(system_state != SYSTEM_RUNNING)) {
 			sc->result = DID_NO_CONNECT << 16;
 			ISCSI_DBG_EH(session, "sc on shutdown, handled\n");
-			rc = BLK_EH_HANDLED;
+			rc = BLK_EH_DONE;
 			goto done;
 		}
 		/*
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 12/14] block: remove BLK_EH_HANDLED
  2018-05-29 13:52 drop abort sequence, kill BLK_EH_HANDLED v2 Christoph Hellwig
                   ` (10 preceding siblings ...)
  2018-05-29 13:52 ` [PATCH 11/14] libiscsi: don't try to bypass SCSI EH Christoph Hellwig
@ 2018-05-29 13:52 ` Christoph Hellwig
  2018-05-29 14:17   ` Johannes Thumshirn
  2018-05-29 13:52 ` [PATCH 13/14] block: document the blk_eh_timer_return values Christoph Hellwig
                   ` (2 subsequent siblings)
  14 siblings, 1 reply; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-29 13:52 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, linux-block, linux-scsi

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
---
 Documentation/scsi/scsi_eh.txt | 11 -----------
 block/blk-mq.c                 |  4 ----
 block/blk-timeout.c            |  3 ---
 include/linux/blkdev.h         |  1 -
 4 files changed, 19 deletions(-)

diff --git a/Documentation/scsi/scsi_eh.txt b/Documentation/scsi/scsi_eh.txt
index 3ae8419e72cf..1b7436932a2b 100644
--- a/Documentation/scsi/scsi_eh.txt
+++ b/Documentation/scsi/scsi_eh.txt
@@ -82,17 +82,6 @@ function
  1. invokes optional hostt->eh_timed_out() callback.  Return value can
     be one of
 
-    - BLK_EH_HANDLED
-	This indicates that eh_timed_out() dealt with the timeout.
-	The command is passed back to the block layer and completed
-	via __blk_complete_requests().
-
-	*NOTE* After returning BLK_EH_HANDLED the SCSI layer is
-	assumed to be finished with the command, and no other
-	functions from the SCSI layer will be called. So this
-	should typically only be returned if the eh_timed_out()
-	handler raced with normal completion.
-
     - BLK_EH_RESET_TIMER
 	This indicates that more time is required to finish the
 	command.  Timer is restarted.  This action is counted as a
diff --git a/block/blk-mq.c b/block/blk-mq.c
index a7a601f08bf5..d898de8841a0 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -777,10 +777,6 @@ static void blk_mq_rq_timed_out(struct request *req, bool reserved)
 		ret = ops->timeout(req, reserved);
 
 	switch (ret) {
-	case BLK_EH_HANDLED:
-		if (blk_mq_rq_state(req) == MQ_RQ_IN_FLIGHT)
-			__blk_mq_complete_request(req);
-		break;
 	case BLK_EH_RESET_TIMER:
 		blk_add_timer(req);
 		break;
diff --git a/block/blk-timeout.c b/block/blk-timeout.c
index 11879e98c249..4b8a48d48ba1 100644
--- a/block/blk-timeout.c
+++ b/block/blk-timeout.c
@@ -86,9 +86,6 @@ static void blk_rq_timed_out(struct request *req)
 	if (q->rq_timed_out_fn)
 		ret = q->rq_timed_out_fn(req);
 	switch (ret) {
-	case BLK_EH_HANDLED:
-		__blk_complete_request(req);
-		break;
 	case BLK_EH_RESET_TIMER:
 		blk_add_timer(req);
 		blk_clear_rq_complete(req);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index f82e05df905b..d838a89639f2 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -327,7 +327,6 @@ typedef void (exit_rq_fn)(struct request_queue *, struct request *);
 
 enum blk_eh_timer_return {
 	BLK_EH_DONE,
-	BLK_EH_HANDLED,
 	BLK_EH_RESET_TIMER,
 };
 
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 13/14] block: document the blk_eh_timer_return values
  2018-05-29 13:52 drop abort sequence, kill BLK_EH_HANDLED v2 Christoph Hellwig
                   ` (11 preceding siblings ...)
  2018-05-29 13:52 ` [PATCH 12/14] block: remove BLK_EH_HANDLED Christoph Hellwig
@ 2018-05-29 13:52 ` Christoph Hellwig
  2018-05-29 14:17   ` Johannes Thumshirn
  2018-05-29 13:52 ` [PATCH 14/14] blk-mq: simplify blk_mq_rq_timed_out Christoph Hellwig
  2018-05-29 15:12 ` drop abort sequence, kill BLK_EH_HANDLED v2 Jens Axboe
  14 siblings, 1 reply; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-29 13:52 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, linux-block, linux-scsi

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
---
 include/linux/blkdev.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index d838a89639f2..9f06b29adaa4 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -326,8 +326,8 @@ typedef int (init_rq_fn)(struct request_queue *, struct request *, gfp_t);
 typedef void (exit_rq_fn)(struct request_queue *, struct request *);
 
 enum blk_eh_timer_return {
-	BLK_EH_DONE,
-	BLK_EH_RESET_TIMER,
+	BLK_EH_DONE,		/* drivers has completed the command */
+	BLK_EH_RESET_TIMER,	/* reset timer and try again */
 };
 
 typedef enum blk_eh_timer_return (rq_timed_out_fn)(struct request *);
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 14/14] blk-mq: simplify blk_mq_rq_timed_out
  2018-05-29 13:52 drop abort sequence, kill BLK_EH_HANDLED v2 Christoph Hellwig
                   ` (12 preceding siblings ...)
  2018-05-29 13:52 ` [PATCH 13/14] block: document the blk_eh_timer_return values Christoph Hellwig
@ 2018-05-29 13:52 ` Christoph Hellwig
  2018-05-29 14:19   ` Johannes Thumshirn
  2018-05-29 15:12 ` drop abort sequence, kill BLK_EH_HANDLED v2 Jens Axboe
  14 siblings, 1 reply; 26+ messages in thread
From: Christoph Hellwig @ 2018-05-29 13:52 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, linux-block, linux-scsi

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-mq.c | 22 ++++++++--------------
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index d898de8841a0..ba6f30a67027 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -770,22 +770,16 @@ EXPORT_SYMBOL(blk_mq_tag_to_rq);
 
 static void blk_mq_rq_timed_out(struct request *req, bool reserved)
 {
-	const struct blk_mq_ops *ops = req->q->mq_ops;
-	enum blk_eh_timer_return ret = BLK_EH_RESET_TIMER;
+	if (req->q->mq_ops->timeout) {
+		enum blk_eh_timer_return ret;
 
-	if (ops->timeout)
-		ret = ops->timeout(req, reserved);
-
-	switch (ret) {
-	case BLK_EH_RESET_TIMER:
-		blk_add_timer(req);
-		break;
-	case BLK_EH_DONE:
-		break;
-	default:
-		printk(KERN_ERR "block: bad eh return: %d\n", ret);
-		break;
+		ret = req->q->mq_ops->timeout(req, reserved);
+		if (ret == BLK_EH_DONE)
+			return;
+		WARN_ON_ONCE(ret != BLK_EH_RESET_TIMER);
 	}
+
+	blk_add_timer(req);
 }
 
 static bool blk_mq_req_expired(struct request *rq, unsigned long *next)
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 07/14] mtip32xx: complete requests from ->timeout
  2018-05-29 13:52 ` [PATCH 07/14] mtip32xx: " Christoph Hellwig
@ 2018-05-29 14:17   ` Johannes Thumshirn
  0 siblings, 0 replies; 26+ messages in thread
From: Johannes Thumshirn @ 2018-05-29 14:17 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Keith Busch, Bart Van Assche, Ming Lei, Josef Bacik,
	Tejun Heo, Lee Duncan, Chris Leech, linux-block, linux-scsi

Looks good,
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N�rnberg
GF: Felix Imend�rffer, Jane Smithard, Graham Norton
HRB 21284 (AG N�rnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 12/14] block: remove BLK_EH_HANDLED
  2018-05-29 13:52 ` [PATCH 12/14] block: remove BLK_EH_HANDLED Christoph Hellwig
@ 2018-05-29 14:17   ` Johannes Thumshirn
  0 siblings, 0 replies; 26+ messages in thread
From: Johannes Thumshirn @ 2018-05-29 14:17 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Keith Busch, Bart Van Assche, Ming Lei, Josef Bacik,
	Tejun Heo, Lee Duncan, Chris Leech, linux-block, linux-scsi

Looks good,
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N�rnberg
GF: Felix Imend�rffer, Jane Smithard, Graham Norton
HRB 21284 (AG N�rnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 13/14] block: document the blk_eh_timer_return values
  2018-05-29 13:52 ` [PATCH 13/14] block: document the blk_eh_timer_return values Christoph Hellwig
@ 2018-05-29 14:17   ` Johannes Thumshirn
  0 siblings, 0 replies; 26+ messages in thread
From: Johannes Thumshirn @ 2018-05-29 14:17 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Keith Busch, Bart Van Assche, Ming Lei, Josef Bacik,
	Tejun Heo, Lee Duncan, Chris Leech, linux-block, linux-scsi

Looks good,
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N�rnberg
GF: Felix Imend�rffer, Jane Smithard, Graham Norton
HRB 21284 (AG N�rnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 14/14] blk-mq: simplify blk_mq_rq_timed_out
  2018-05-29 13:52 ` [PATCH 14/14] blk-mq: simplify blk_mq_rq_timed_out Christoph Hellwig
@ 2018-05-29 14:19   ` Johannes Thumshirn
  0 siblings, 0 replies; 26+ messages in thread
From: Johannes Thumshirn @ 2018-05-29 14:19 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Keith Busch, Bart Van Assche, Ming Lei, Josef Bacik,
	Tejun Heo, Lee Duncan, Chris Leech, linux-block, linux-scsi

Looks good,
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N�rnberg
GF: Felix Imend�rffer, Jane Smithard, Graham Norton
HRB 21284 (AG N�rnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 10/14] mmc: complete requests from ->timeout
  2018-05-29 13:52 ` [PATCH 10/14] mmc: " Christoph Hellwig
@ 2018-05-29 14:55   ` Shawn Lin
  2018-05-30  8:09     ` Adrian Hunter
  0 siblings, 1 reply; 26+ messages in thread
From: Shawn Lin @ 2018-05-29 14:55 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Keith Busch, shawn.lin, Bart Van Assche, Ming Lei,
	Josef Bacik, Tejun Heo, Lee Duncan, Chris Leech, linux-block,
	linux-scsi, Ulf Hansson, Adrian Hunter, linux-mmc

[+ Ulf, Adrian, linux-mmc, as linux-mmc and folks are missing in the
recipient list.]

On 2018/5/29 21:52, Christoph Hellwig wrote:
> By completing the request entirely in the driver we can remove the
> BLK_EH_HANDLED return value and thus the split responsibility between the
> driver and the block layer that has been causing trouble.
> 
> [While this keeps existing behavior it seems to mismatch the comment,
>   maintainers please chime in!]

The comment means the timeout doesn't happen for the current in-flight 
request which command queue is doing, and the request is finished 
previously(correct me if I'm wrong). So mmc defers the complete stuff to
blk_timeout_work to help. Maybe the comment should gone as well.

Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com>

> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Hannes Reinecke <hare@suse.com>
> ---
>   drivers/mmc/core/queue.c | 5 +++--
>   1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
> index 56e9a803db21..648eb6743ed5 100644
> --- a/drivers/mmc/core/queue.c
> +++ b/drivers/mmc/core/queue.c
> @@ -111,8 +111,9 @@ static enum blk_eh_timer_return mmc_cqe_timed_out(struct request *req)
>   				__mmc_cqe_recovery_notifier(mq);
>   			return BLK_EH_RESET_TIMER;
>   		}
> -		/* No timeout */
> -		return BLK_EH_HANDLED;
> +		/* No timeout (XXX: huh? comment doesn't make much sense) */
> +		blk_mq_complete_request(req);
> +		return BLK_EH_DONE;
>   	default:
>   		/* Timeout is handled by mmc core */
>   		return BLK_EH_RESET_TIMER;
> 


-- 
Best Regards
Shawn Lin

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: drop abort sequence, kill BLK_EH_HANDLED v2
  2018-05-29 13:52 drop abort sequence, kill BLK_EH_HANDLED v2 Christoph Hellwig
                   ` (13 preceding siblings ...)
  2018-05-29 13:52 ` [PATCH 14/14] blk-mq: simplify blk_mq_rq_timed_out Christoph Hellwig
@ 2018-05-29 15:12 ` Jens Axboe
  14 siblings, 0 replies; 26+ messages in thread
From: Jens Axboe @ 2018-05-29 15:12 UTC (permalink / raw)
  To: Christoph Hellwig, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Lee Duncan,
	Chris Leech, linux-block, linux-scsi

On 5/29/18 7:52 AM, Christoph Hellwig wrote:
> Hi all,
> 
> this series removes the BLK_EH_HANDLED return value, and instead places
> responsibility of timed out commands entirely on the drivers.  Except
> for some odd layering vilations in libiscsi this actually is
> surprisingly simple.   Some of the grunt work here is from Keith and
> Bart, just repackaged up

Survives my basic smoke testing, including timeout triggering with
null_blk and SCSI. I have applied this, it's much cleaner and simpler.
And we shrink the request a lot, which is great. Note I applied a
followup patch that kills two holes in struct request, and I fixed
a trailing whitespace in patch #3.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 11/14] libiscsi: don't try to bypass SCSI EH
  2018-05-29 13:52 ` [PATCH 11/14] libiscsi: don't try to bypass SCSI EH Christoph Hellwig
@ 2018-05-30  1:04   ` Lee Duncan
  0 siblings, 0 replies; 26+ messages in thread
From: Lee Duncan @ 2018-05-30  1:04 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe, Keith Busch
  Cc: Bart Van Assche, Ming Lei, Josef Bacik, Tejun Heo, Chris Leech,
	linux-block, linux-scsi

On 05/29/2018 06:52 AM, Christoph Hellwig wrote:
> libiscsi is the only SCSI code that return BLK_EH_HANDLED, thus trying to
> bypass the normal SCSI EH code.  We are going to remove this return value
> at the block layer, and at least from a quick look it doesn't look too
> harmful to try to send an abort for these cases, especially as the first
> one should not actually be possible.  If this doesn't work out iscsi
> will probably need its own eh_strategy_handler instead to just do the
> right thing.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Hannes Reinecke <hare@suse.com>
> ---
>  drivers/scsi/libiscsi.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
> index eee43ba83a60..71bdc0b52cf9 100644
> --- a/drivers/scsi/libiscsi.c
> +++ b/drivers/scsi/libiscsi.c
> @@ -1982,7 +1982,7 @@ enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd *sc)
>  		 * Raced with completion. Blk layer has taken ownership
>  		 * so let timeout code complete it now.
>  		 */
> -		rc = BLK_EH_HANDLED;
> +		rc = BLK_EH_DONE;
>  		goto done;
>  	}
>  
> @@ -1997,7 +1997,7 @@ enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd *sc)
>  		if (unlikely(system_state != SYSTEM_RUNNING)) {
>  			sc->result = DID_NO_CONNECT << 16;
>  			ISCSI_DBG_EH(session, "sc on shutdown, handled\n");
> -			rc = BLK_EH_HANDLED;
> +			rc = BLK_EH_DONE;
>  			goto done;
>  		}
>  		/*
> 
looks good to me.

Signed-off-by: Lee Duncan <lduncan@suse.com>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 10/14] mmc: complete requests from ->timeout
  2018-05-29 14:55   ` Shawn Lin
@ 2018-05-30  8:09     ` Adrian Hunter
  0 siblings, 0 replies; 26+ messages in thread
From: Adrian Hunter @ 2018-05-30  8:09 UTC (permalink / raw)
  To: Shawn Lin, Christoph Hellwig
  Cc: Jens Axboe, Keith Busch, Bart Van Assche, Ming Lei, Josef Bacik,
	Tejun Heo, Lee Duncan, Chris Leech, linux-block, linux-scsi,
	Ulf Hansson, linux-mmc

On 29/05/18 17:55, Shawn Lin wrote:
> [+ Ulf, Adrian, linux-mmc, as linux-mmc and folks are missing in the
> recipient list.]
> 
> On 2018/5/29 21:52, Christoph Hellwig wrote:
>> By completing the request entirely in the driver we can remove the
>> BLK_EH_HANDLED return value and thus the split responsibility between the
>> driver and the block layer that has been causing trouble.
>>
>> [While this keeps existing behavior it seems to mismatch the comment,
>>   maintainers please chime in!]
> 
> The comment means the timeout doesn't happen for the current in-flight
> request which command queue is doing, and the request is finished
> previously(correct me if I'm wrong). So mmc defers the complete stuff to
> blk_timeout_work to help. Maybe the comment should gone as well.
> 
> Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com>
> 
>>
>> Signed-off-by: Christoph Hellwig <hch@lst.de>
>> Reviewed-by: Hannes Reinecke <hare@suse.com>
>> ---
>>   drivers/mmc/core/queue.c | 5 +++--
>>   1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
>> index 56e9a803db21..648eb6743ed5 100644
>> --- a/drivers/mmc/core/queue.c
>> +++ b/drivers/mmc/core/queue.c
>> @@ -111,8 +111,9 @@ static enum blk_eh_timer_return
>> mmc_cqe_timed_out(struct request *req)
>>                   __mmc_cqe_recovery_notifier(mq);
>>               return BLK_EH_RESET_TIMER;
>>           }
>> -        /* No timeout */
>> -        return BLK_EH_HANDLED;
>> +        /* No timeout (XXX: huh? comment doesn't make much sense) */
>> +        blk_mq_complete_request(req);

It seems the block layer timeout handler no longer castrates the completion
path, in which case blk_mq_complete_request() is not needed here.

The "No timeout" comment can be removed.

>> +        return BLK_EH_DONE;
>>       default:
>>           /* Timeout is handled by mmc core */
>>           return BLK_EH_RESET_TIMER;
>>
> 
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 03/14] blk-mq: Remove generation seqeunce
  2018-05-29 13:52 ` [PATCH 03/14] blk-mq: Remove generation seqeunce Christoph Hellwig
@ 2018-06-04  6:56   ` Bart Van Assche
  2018-06-04 13:42     ` hch
  0 siblings, 1 reply; 26+ messages in thread
From: Bart Van Assche @ 2018-06-04  6:56 UTC (permalink / raw)
  To: hch, keith.busch, axboe
  Cc: linux-scsi, tj, lduncan, linux-block, josef, cleech, ming.lei

T24gVHVlLCAyMDE4LTA1LTI5IGF0IDE1OjUyICswMjAwLCBDaHJpc3RvcGggSGVsbHdpZyB3cm90
ZToNCj4gRnJvbTogS2VpdGggQnVzY2ggPGtlaXRoLmJ1c2NoQGludGVsLmNvbT4NCj4gDQo+IFRo
aXMgcGF0Y2ggc2ltcGxpZmllcyB0aGUgdGltZW91dCBoYW5kbGluZyBieSByZWx5aW5nIG9uIHRo
ZSByZXF1ZXN0DQo+IHJlZmVyZW5jZSBjb3VudGluZyB0byBlbnN1cmUgdGhlIGl0ZXJhdG9yIGlz
IG9wZXJhdGluZyBvbiBhbiBpbmZsaWdodA0KPiBhbmQgdHJ1bHkgdGltZWQgb3V0IHJlcXVlc3Qu
IFNpbmNlIHRoZSByZWZlcmVuY2UgY291bnRpbmcgcHJldmVudHMgdGhlDQo+IHRhZyBmcm9tIGJl
aW5nIHJlYWxsb2NhdGVkLCB0aGUgYmxvY2sgbGF5ZXIgbm8gbG9uZ2VyIG5lZWRzIHRvIHByZXZl
bnQNCj4gZHJpdmVycyBmcm9tIGNvbXBsZXRpbmcgdGhlaXIgcmVxdWVzdHMgd2hpbGUgdGhlIHRp
bWVvdXQgaGFuZGxlciBpcw0KPiBvcGVyYXRpbmcgb24gaXQ6IGEgZHJpdmVyIGNvbXBsZXRpbmcg
YSByZXF1ZXN0IGlzIGFsbG93ZWQgdG8gcHJvY2VlZCB0bw0KPiB0aGUgbmV4dCBzdGF0ZSB3aXRo
b3V0IGFkZGl0aW9uYWwgc3luY3Jvbml6YXRpb24gd2l0aCB0aGUgYmxvY2sgbGF5ZXIuDQo+IA0K
PiBUaGlzIGFsc28gcmVtb3ZlcyBhbnkgbmVlZCBmb3IgZ2VuZXJhdGlvbiBzZXF1ZW5jZSBudW1i
ZXJzIHNpbmNlIHRoZQ0KPiByZXF1ZXN0IGxpZmV0aW1lIGlzIHByZXZlbnRlZCBmcm9tIGJlaW5n
IHJlYWxsb2NhdGVkIGFzIGEgbmV3IHNlcXVlbmNlDQo+IHdoaWxlIHRpbWVvdXQgaGFuZGxpbmcg
aXMgb3BlcmF0aW5nIG9uIGl0Lg0KPiANCj4gVG8gZW5hYmxlcyB0aGlzIGEgcmVmY291bnQgaXMg
YWRkZWQgdG8gc3RydWN0IHJlcXVlc3Qgc28gdGhhdCByZXF1ZXN0DQo+IHVzZXJzIGNhbiBiZSBz
dXJlIHRoZXkncmUgb3BlcmF0aW5nIG9uIHRoZSBzYW1lIHJlcXVlc3Qgd2l0aG91dCBpdA0KPiBj
aGFuZ2luZyB3aGlsZSB0aGV5J3JlIHByb2Nlc3NpbmcgaXQuICBUaGUgcmVxdWVzdCdzIHRhZyB3
b24ndCBiZQ0KPiByZWxlYXNlZCBmb3IgcmV1c2UgdW50aWwgYm90aCB0aGUgdGltZW91dCBoYW5k
bGVyIGFuZCB0aGUgY29tcGxldGlvbg0KPiBhcmUgZG9uZSB3aXRoIGl0Lg0KPiANCj4gU2lnbmVk
LW9mZi1ieTogS2VpdGggQnVzY2ggPGtlaXRoLmJ1c2NoQGludGVsLmNvbT4NCj4gW2hjaDogc2xp
Z2h0IGNsZWFudXBzLCBhZGRlZCBiYWNrIHN1Ym1pc3Npb24gc2lkZSBoY3R4IGxvY2ssIHVzZSBj
bXB4Y2hnDQo+ICBmb3IgY29tcGxldGlvbnNdDQo+IFNpZ25lZC1vZmYtYnk6IENocmlzdG9waCBI
ZWxsd2lnIDxoY2hAbHN0LmRlPg0KDQpIZWxsbyBDaHJpc3RvcGgsDQoNCkNhbiB5b3UgZXhwbGFp
biB3aHkgeW91IHByZWZlciB0aGlzIGFwcHJvYWNoPyBJIHRoaW5rIHRoaXMgcGF0Y2ggaW50cm9k
dWNlcw0KbW9yZSBuZXcgYXRvbWljIG9wZXJhdGlvbnMgaW4gdGhlIGhvdCBwYXRoIHRoYW4gdGhl
IGFwcHJvYWNoIEkgaGFkIHByb3Bvc2VkDQooImJsay1tcTogUmV3b3JrIGJsay1tcSB0aW1lb3V0
IGhhbmRsaW5nIGFnYWluIikuIEluIG90aGVyIHdvcmRzLCBJIHRoaW5rIHRoZQ0KcGF0Y2ggdGhh
dCBJIGhhZCBwcm9wb3NlZCB3aWxsIHlpZWxkIGJldHRlciBwZXJmb3JtYW5jZS4NCg0KVGhhbmtz
LA0KDQpCYXJ0Lg0KDQoNCg==

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 03/14] blk-mq: Remove generation seqeunce
  2018-06-04  6:56   ` Bart Van Assche
@ 2018-06-04 13:42     ` hch
  2018-06-04 14:13       ` Bart Van Assche
  0 siblings, 1 reply; 26+ messages in thread
From: hch @ 2018-06-04 13:42 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: hch, keith.busch, axboe, linux-scsi, tj, lduncan, linux-block,
	josef, cleech, ming.lei

On Mon, Jun 04, 2018 at 06:56:22AM +0000, Bart Van Assche wrote:
> Can you explain why you prefer this approach? I think this patch introduces
> more new atomic operations in the hot path than the approach I had proposed
> ("blk-mq: Rework blk-mq timeout handling again"). In other words, I think the
> patch that I had proposed will yield better performance.

Primarily because the generation counter is fundamentally the wrong
thing to do.  If we know the tag has some pending action on it we
should not reuse it insted of working around races that stem from the
fact that we reuse it.

We can look at the performance numbers and tune as needed, but for now
we need to get the high level approach right first.  I think once
we've done a proper audit of the SCSI EH code and blk_abort_request
users we can probably drop the cmpxchg and only have a single
atomic_dec_and_test left, which in terms of atomic operations isn't
the world.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 03/14] blk-mq: Remove generation seqeunce
  2018-06-04 13:42     ` hch
@ 2018-06-04 14:13       ` Bart Van Assche
  0 siblings, 0 replies; 26+ messages in thread
From: Bart Van Assche @ 2018-06-04 14:13 UTC (permalink / raw)
  To: hch
  Cc: linux-block, axboe, ming.lei, linux-scsi, josef, lduncan, tj,
	keith.busch, cleech

T24gTW9uLCAyMDE4LTA2LTA0IGF0IDE1OjQyICswMjAwLCBoY2hAbHN0LmRlIHdyb3RlOg0KPiBP
biBNb24sIEp1biAwNCwgMjAxOCBhdCAwNjo1NjoyMkFNICswMDAwLCBCYXJ0IFZhbiBBc3NjaGUg
d3JvdGU6DQo+ID4gQ2FuIHlvdSBleHBsYWluIHdoeSB5b3UgcHJlZmVyIHRoaXMgYXBwcm9hY2g/
IEkgdGhpbmsgdGhpcyBwYXRjaCBpbnRyb2R1Y2VzDQo+ID4gbW9yZSBuZXcgYXRvbWljIG9wZXJh
dGlvbnMgaW4gdGhlIGhvdCBwYXRoIHRoYW4gdGhlIGFwcHJvYWNoIEkgaGFkIHByb3Bvc2VkDQo+
ID4gKCJibGstbXE6IFJld29yayBibGstbXEgdGltZW91dCBoYW5kbGluZyBhZ2FpbiIpLiBJbiBv
dGhlciB3b3JkcywgSSB0aGluayB0aGUNCj4gPiBwYXRjaCB0aGF0IEkgaGFkIHByb3Bvc2VkIHdp
bGwgeWllbGQgYmV0dGVyIHBlcmZvcm1hbmNlLg0KPiANCj4gUHJpbWFyaWx5IGJlY2F1c2UgdGhl
IGdlbmVyYXRpb24gY291bnRlciBpcyBmdW5kYW1lbnRhbGx5IHRoZSB3cm9uZw0KPiB0aGluZyB0
byBkby4gIElmIHdlIGtub3cgdGhlIHRhZyBoYXMgc29tZSBwZW5kaW5nIGFjdGlvbiBvbiBpdCB3
ZQ0KPiBzaG91bGQgbm90IHJldXNlIGl0IGluc3RlZCBvZiB3b3JraW5nIGFyb3VuZCByYWNlcyB0
aGF0IHN0ZW0gZnJvbSB0aGUNCj4gZmFjdCB0aGF0IHdlIHJldXNlIGl0Lg0KDQpXZSBtYXkgZWFj
aCBoYXZlIGEgZGlmZmVyZW50IG9waW5pb24gYWJvdXQgdGhpcy4gQW55d2F5LCBsZXQncyBub3Qg
Z2V0DQpkaXN0cmFjdGVkIGJ5IGEgZGlzY3Vzc2lvbiBhYm91dCB0aGVvcmV0aWNhbCBhcmd1bWVu
dHMuDQoNCj4gV2UgY2FuIGxvb2sgYXQgdGhlIHBlcmZvcm1hbmNlIG51bWJlcnMgYW5kIHR1bmUg
YXMgbmVlZGVkLCBidXQgZm9yIG5vdw0KPiB3ZSBuZWVkIHRvIGdldCB0aGUgaGlnaCBsZXZlbCBh
cHByb2FjaCByaWdodCBmaXJzdC4gIEkgdGhpbmsgb25jZQ0KPiB3ZSd2ZSBkb25lIGEgcHJvcGVy
IGF1ZGl0IG9mIHRoZSBTQ1NJIEVIIGNvZGUgYW5kIGJsa19hYm9ydF9yZXF1ZXN0DQo+IHVzZXJz
IHdlIGNhbiBwcm9iYWJseSBkcm9wIHRoZSBjbXB4Y2hnIGFuZCBvbmx5IGhhdmUgYSBzaW5nbGUN
Cj4gYXRvbWljX2RlY19hbmRfdGVzdCBsZWZ0LCB3aGljaCBpbiB0ZXJtcyBvZiBhdG9taWMgb3Bl
cmF0aW9ucyBpc24ndA0KPiB0aGUgd29ybGQuDQoNCkkgdGhpbmsgdGhhdCB0aGUgcmFjZSBiZXR3
ZWVuIHJlcXVlc3QgY29tcGxldGlvbiBhbmQgcmVxdWVzdCB0aW1lb3V0DQpoYW5kbGluZyBpcyBm
dW5kYW1lbnRhbCBzbyBJJ20gbm90IHN1cmUgdGhhdCBpdCBpcyBwb3NzaWJsZSB0byBlbGltaW5h
dGUNCnRoZSBjbXB4Y2hnKCkgY2FsbHMuIEFkZGl0aW9uYWxseSwgZXZlbiBpZiB0aGUgY21weGNo
ZygpIGNhbGxzIHdvdWxkIGJlDQplbGltaW5hdGVkLCBzdHJ1Y3QgcmVxdWVzdCB3aWxsIHN0aWxs
IGhhdmUgdHdvIHZhcmlhYmxlcyB0aGF0IHJlcHJlc2VudA0KdGhlIHJlcXVlc3Qgc3RhdGUgKHJx
LT5zdGF0ZSBhbmQgcnEtPnJlZikgYW5kIHRoYXQgYXJlIHVwZGF0ZWQgc2VwYXJhdGVseS4NCldp
dGggbXkgYXBwcm9hY2ggYWxsIHJlcXVlc3Qgc3RhdGUgaW5mb3JtYXRpb24gaXMgc3RvcmVkIGlu
IGEgc2luZ2xlDQp2YXJpYWJsZSB3aGljaCBtYWtlcyBpdCBlYXN5IHRvIHVwZGF0ZSB0aGUgaW5k
aXZpZHVhbCBmaWVsZHMgb2YgdGhhdA0KdmFyaWFibGUgc2ltdWx0YW5lb3VzbHkuDQoNCkJhcnQu

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2018-06-04 14:13 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-29 13:52 drop abort sequence, kill BLK_EH_HANDLED v2 Christoph Hellwig
2018-05-29 13:52 ` [PATCH 01/14] libata: remove ata_scsi_timed_out Christoph Hellwig
2018-05-29 13:52 ` [PATCH 02/14] blk-mq: Fix timeout and state order Christoph Hellwig
2018-05-29 13:52 ` [PATCH 03/14] blk-mq: Remove generation seqeunce Christoph Hellwig
2018-06-04  6:56   ` Bart Van Assche
2018-06-04 13:42     ` hch
2018-06-04 14:13       ` Bart Van Assche
2018-05-29 13:52 ` [PATCH 04/14] block: rename BLK_EH_NOT_HANDLED to BLK_EH_DONE Christoph Hellwig
2018-05-29 13:52 ` [PATCH 05/14] nvme: return BLK_EH_DONE from ->timeout Christoph Hellwig
2018-05-29 13:52 ` [PATCH 06/14] nbd: complete requests " Christoph Hellwig
2018-05-29 13:52 ` [PATCH 07/14] mtip32xx: " Christoph Hellwig
2018-05-29 14:17   ` Johannes Thumshirn
2018-05-29 13:52 ` [PATCH 08/14] null_blk: " Christoph Hellwig
2018-05-29 13:52 ` [PATCH 09/14] scsi_transport_fc: " Christoph Hellwig
2018-05-29 13:52 ` [PATCH 10/14] mmc: " Christoph Hellwig
2018-05-29 14:55   ` Shawn Lin
2018-05-30  8:09     ` Adrian Hunter
2018-05-29 13:52 ` [PATCH 11/14] libiscsi: don't try to bypass SCSI EH Christoph Hellwig
2018-05-30  1:04   ` Lee Duncan
2018-05-29 13:52 ` [PATCH 12/14] block: remove BLK_EH_HANDLED Christoph Hellwig
2018-05-29 14:17   ` Johannes Thumshirn
2018-05-29 13:52 ` [PATCH 13/14] block: document the blk_eh_timer_return values Christoph Hellwig
2018-05-29 14:17   ` Johannes Thumshirn
2018-05-29 13:52 ` [PATCH 14/14] blk-mq: simplify blk_mq_rq_timed_out Christoph Hellwig
2018-05-29 14:19   ` Johannes Thumshirn
2018-05-29 15:12 ` drop abort sequence, kill BLK_EH_HANDLED v2 Jens Axboe

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.