All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/5]stop normal completion path entering a timeout req
@ 2018-06-20 13:22 Jianchao Wang
  2018-06-20 13:22 ` [PATCH 1/5] blk-mq: prevent normal completion from entering a timeout request Jianchao Wang
                   ` (5 more replies)
  0 siblings, 6 replies; 24+ messages in thread
From: Jianchao Wang @ 2018-06-20 13:22 UTC (permalink / raw)
  To: axboe, hch, martin.petersen, keith.busch, josef, ulf.hansson
  Cc: linux-block, linux-scsi, linux-kernel

Dear all

scsi timeout and error handler are based on an assumption that normal
completion mustn't do anything on an timeout request. After 12f5b931
(blk-mq: Remove generation seqeunce), we lost this. __blk_mq_complete
request could ensure a request won't be completed twice, but it can
still complete a timeout request.
scsi (even other drivers) have been working on this assumption for many
years, it is dangerous to discard it suddenly. This patch set is to regain this.

1st patch
 - introduce blk_mq_mark_rq_complete in blk_mq_check_expired and blk_mq_complete_request
   to avoid the race between them.
 - blk_mq_complete_request cannot complete a timeout request anymore, so have to export
   __blk_mq_complete_request for LLDD timeout path
2nd ~ 5th patches
  replace the blk_mq_complete_request to __blk_mq_complete_request.

Jianchao Wang(5)
blk-mq: prevent normal completion from entering a timeout request
nbd: use __blk_mq_complete_request in timeout path
null_blk: use __blk_mq_complete_request in timeout path
mmc: use __blk_mq_complete_request in timeout path
nvme: use __blk_mq_complete_request in timeout path

 block/blk-mq.c             | 22 +++++++++++++++-------
 drivers/block/nbd.c        |  2 +-
 drivers/block/null_blk.c   |  2 +-
 drivers/mmc/core/queue.c   |  2 +-
 drivers/nvme/host/pci.c    |  8 ++++++++
 drivers/nvme/host/rdma.c   |  1 +
 drivers/nvme/target/loop.c |  1 +
 include/linux/blk-mq.h     |  1 +
 include/linux/blkdev.h     |  6 ++++++

Thanks
Jianchao

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 1/5] blk-mq: prevent normal completion from entering a timeout request
  2018-06-20 13:22 [PATCH 0/5]stop normal completion path entering a timeout req Jianchao Wang
@ 2018-06-20 13:22 ` Jianchao Wang
  2018-06-20 13:22 ` [PATCH 2/5] nbd: use __blk_mq_complete_request in timeout path Jianchao Wang
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 24+ messages in thread
From: Jianchao Wang @ 2018-06-20 13:22 UTC (permalink / raw)
  To: axboe, hch, martin.petersen, keith.busch, josef, ulf.hansson
  Cc: linux-block, linux-scsi, linux-kernel

scsi time out and error handler are based on an assumption that
normal completion mustn't do anything on an timeout request.
After 12f5b931 (blk-mq: Remove generation seqeunce), we lost this.
To regain it, introduce blk_mq_mark_rq_complete which change state
from IN-FLIGHT to COMPLETE atomically. It will be added to
blk_mq_complete_request and blk_mq_check_expired to avoid the race
between timeout and normal io completion path.

Because we have hand over the task to complete a timeout request to
LLDD timeout path by invoking blk_mq_complete_request, export
__blk_mq_complete_request for LLDD to complete the timeout request.

Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
---
 block/blk-mq.c         | 22 +++++++++++++++-------
 include/linux/blk-mq.h |  1 +
 include/linux/blkdev.h |  6 ++++++
 3 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 70c65bb..a73acbd 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -473,6 +473,7 @@ static void __blk_mq_free_request(struct request *rq)
 	struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, ctx->cpu);
 	const int sched_tag = rq->internal_tag;
 
+	WRITE_ONCE(rq->state, MQ_RQ_IDLE);
 	if (rq->tag != -1)
 		blk_mq_put_tag(hctx, hctx->tags, ctx, rq->tag);
 	if (sched_tag != -1)
@@ -509,7 +510,6 @@ void blk_mq_free_request(struct request *rq)
 	if (blk_rq_rl(rq))
 		blk_put_rl(blk_rq_rl(rq));
 
-	WRITE_ONCE(rq->state, MQ_RQ_IDLE);
 	if (refcount_dec_and_test(&rq->ref))
 		__blk_mq_free_request(rq);
 }
@@ -552,15 +552,17 @@ static void __blk_mq_complete_request_remote(void *data)
 	rq->q->softirq_done_fn(rq);
 }
 
-static void __blk_mq_complete_request(struct request *rq)
+/*
+ * The LLDD timeout path must invoke this interface to complete
+ * the request.
+ */
+void __blk_mq_complete_request(struct request *rq)
 {
 	struct blk_mq_ctx *ctx = rq->mq_ctx;
 	bool shared = false;
 	int cpu;
 
-	if (cmpxchg(&rq->state, MQ_RQ_IN_FLIGHT, MQ_RQ_COMPLETE) !=
-			MQ_RQ_IN_FLIGHT)
-		return;
+	WARN_ON(blk_mq_rq_state(rq) != MQ_RQ_COMPLETE);
 
 	if (rq->internal_tag != -1)
 		blk_mq_sched_completed_request(rq);
@@ -584,6 +586,7 @@ static void __blk_mq_complete_request(struct request *rq)
 	}
 	put_cpu();
 }
+EXPORT_SYMBOL(__blk_mq_complete_request);
 
 static void hctx_unlock(struct blk_mq_hw_ctx *hctx, int srcu_idx)
 	__releases(hctx->srcu)
@@ -617,7 +620,9 @@ void blk_mq_complete_request(struct request *rq)
 {
 	if (unlikely(blk_should_fake_timeout(rq->q)))
 		return;
-	__blk_mq_complete_request(rq);
+
+	if (blk_mq_mark_rq_complete(rq))
+		__blk_mq_complete_request(rq);
 }
 EXPORT_SYMBOL(blk_mq_complete_request);
 
@@ -783,6 +788,7 @@ static void blk_mq_rq_timed_out(struct request *req, bool reserved)
 
 	req->rq_flags &= ~RQF_TIMED_OUT;
 	blk_add_timer(req);
+	WRITE_ONCE(req->state, MQ_RQ_IN_FLIGHT);
 }
 
 static bool blk_mq_req_expired(struct request *rq, unsigned long *next)
@@ -835,8 +841,10 @@ static void blk_mq_check_expired(struct blk_mq_hw_ctx *hctx,
 	 * expired; if it is not expired, then the request was completed and
 	 * reallocated as a new request.
 	 */
-	if (blk_mq_req_expired(rq, next))
+	if (blk_mq_req_expired(rq, next) &&
+		blk_mq_mark_rq_complete(rq)) {
 		blk_mq_rq_timed_out(rq, reserved);
+	}
 	if (refcount_dec_and_test(&rq->ref))
 		__blk_mq_free_request(rq);
 }
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index e3147eb..0a509ae 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -259,6 +259,7 @@ void blk_mq_add_to_requeue_list(struct request *rq, bool at_head,
 void blk_mq_kick_requeue_list(struct request_queue *q);
 void blk_mq_delay_kick_requeue_list(struct request_queue *q, unsigned long msecs);
 void blk_mq_complete_request(struct request *rq);
+void __blk_mq_complete_request(struct request *rq);
 bool blk_mq_bio_list_merge(struct request_queue *q, struct list_head *list,
 			   struct bio *bio);
 bool blk_mq_queue_stopped(struct request_queue *q);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 9154570..37511c4 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -272,6 +272,12 @@ struct request {
 #endif
 };
 
+static inline bool blk_mq_mark_rq_complete(struct request *rq)
+{
+	return (cmpxchg(&rq->state, MQ_RQ_IN_FLIGHT, MQ_RQ_COMPLETE) ==
+			MQ_RQ_IN_FLIGHT);
+}
+
 static inline bool blk_op_is_scsi(unsigned int op)
 {
 	return op == REQ_OP_SCSI_IN || op == REQ_OP_SCSI_OUT;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 2/5] nbd: use __blk_mq_complete_request in timeout path
  2018-06-20 13:22 [PATCH 0/5]stop normal completion path entering a timeout req Jianchao Wang
  2018-06-20 13:22 ` [PATCH 1/5] blk-mq: prevent normal completion from entering a timeout request Jianchao Wang
@ 2018-06-20 13:22 ` Jianchao Wang
  2018-06-20 14:13   ` Josef Bacik
  2018-06-20 13:22 ` [PATCH 3/5] null_blk: " Jianchao Wang
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 24+ messages in thread
From: Jianchao Wang @ 2018-06-20 13:22 UTC (permalink / raw)
  To: axboe, hch, martin.petersen, keith.busch, josef, ulf.hansson
  Cc: linux-block, linux-scsi, linux-kernel

To regain the capability to prevent normal completion path from
entering a timeout request, blk_mq_mark_rq_complete is introduced
in blk_mq_complete_request. Have to use __blk_mq_complete_request
in timeout path to complete a timeout request.

Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
---
 drivers/block/nbd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 3b7083b..d7e533a 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -354,7 +354,7 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
 	sock_shutdown(nbd);
 	nbd_config_put(nbd);
 done:
-	blk_mq_complete_request(req);
+	__blk_mq_complete_request(req);
 	return BLK_EH_DONE;
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 3/5] null_blk: use __blk_mq_complete_request in timeout path
  2018-06-20 13:22 [PATCH 0/5]stop normal completion path entering a timeout req Jianchao Wang
  2018-06-20 13:22 ` [PATCH 1/5] blk-mq: prevent normal completion from entering a timeout request Jianchao Wang
  2018-06-20 13:22 ` [PATCH 2/5] nbd: use __blk_mq_complete_request in timeout path Jianchao Wang
@ 2018-06-20 13:22 ` Jianchao Wang
  2018-06-20 13:22 ` [PATCH 4/5] mmc: " Jianchao Wang
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 24+ messages in thread
From: Jianchao Wang @ 2018-06-20 13:22 UTC (permalink / raw)
  To: axboe, hch, martin.petersen, keith.busch, josef, ulf.hansson
  Cc: linux-block, linux-scsi, linux-kernel

To regain the capability to prevent normal completion path from
entering a timeout request, blk_mq_mark_rq_complete is introduced
in blk_mq_complete_request. Have to use __blk_mq_complete_request
in timeout path to complete a timeout request.

Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
---
 drivers/block/null_blk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/block/null_blk.c b/drivers/block/null_blk.c
index 7948049..d756539 100644
--- a/drivers/block/null_blk.c
+++ b/drivers/block/null_blk.c
@@ -1428,7 +1428,7 @@ static void null_request_fn(struct request_queue *q)
 static enum blk_eh_timer_return null_timeout_rq(struct request *rq, bool res)
 {
 	pr_info("null: rq %p timed out\n", rq);
-	blk_mq_complete_request(rq);
+	__blk_mq_complete_request(rq);
 	return BLK_EH_DONE;
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 4/5] mmc: use __blk_mq_complete_request in timeout path
  2018-06-20 13:22 [PATCH 0/5]stop normal completion path entering a timeout req Jianchao Wang
                   ` (2 preceding siblings ...)
  2018-06-20 13:22 ` [PATCH 3/5] null_blk: " Jianchao Wang
@ 2018-06-20 13:22 ` Jianchao Wang
  2018-06-20 13:22 ` [PATCH 5/5] nvme: " Jianchao Wang
  2018-06-20 18:16 ` [PATCH 0/5]stop normal completion path entering a timeout req Keith Busch
  5 siblings, 0 replies; 24+ messages in thread
From: Jianchao Wang @ 2018-06-20 13:22 UTC (permalink / raw)
  To: axboe, hch, martin.petersen, keith.busch, josef, ulf.hansson
  Cc: linux-block, linux-scsi, linux-kernel

To regain the capability to prevent normal completion path from
entering a timeout request, blk_mq_mark_rq_complete is introduced
in blk_mq_complete_request. Have to use __blk_mq_complete_request
in timeout path to complete a timeout request.

Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
---
 drivers/mmc/core/queue.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index 648eb67..43dcd7a 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -112,7 +112,7 @@ static enum blk_eh_timer_return mmc_cqe_timed_out(struct request *req)
 			return BLK_EH_RESET_TIMER;
 		}
 		/* No timeout (XXX: huh? comment doesn't make much sense) */
-		blk_mq_complete_request(req);
+		__blk_mq_complete_request(req);
 		return BLK_EH_DONE;
 	default:
 		/* Timeout is handled by mmc core */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 5/5] nvme: use __blk_mq_complete_request in timeout path
  2018-06-20 13:22 [PATCH 0/5]stop normal completion path entering a timeout req Jianchao Wang
                   ` (3 preceding siblings ...)
  2018-06-20 13:22 ` [PATCH 4/5] mmc: " Jianchao Wang
@ 2018-06-20 13:22 ` Jianchao Wang
  2018-06-20 14:39   ` Christoph Hellwig
  2018-06-20 18:16 ` [PATCH 0/5]stop normal completion path entering a timeout req Keith Busch
  5 siblings, 1 reply; 24+ messages in thread
From: Jianchao Wang @ 2018-06-20 13:22 UTC (permalink / raw)
  To: axboe, hch, martin.petersen, keith.busch, josef, ulf.hansson
  Cc: linux-block, linux-scsi, linux-kernel

To regain the capability to prevent normal completion path from
entering a timeout request, blk_mq_mark_rq_complete is introduced
in blk_mq_complete_request. Have to use __blk_mq_complete_request
in timeout path to complete a timeout request.

nvme_cancel_request uses blk_mq_complete_request, it won't do
anything on the timeout request, so add __blk_mq_complete_request
before return BLK_EH_DONE.

Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
---
 drivers/nvme/host/pci.c    | 8 ++++++++
 drivers/nvme/host/rdma.c   | 1 +
 drivers/nvme/target/loop.c | 1 +
 3 files changed, 10 insertions(+)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 73a97fc..2a161f6 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1203,6 +1203,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
 		nvme_warn_reset(dev, csts);
 		nvme_dev_disable(dev, false);
 		nvme_reset_ctrl(&dev->ctrl);
+		__blk_mq_complete_request(req);
 		return BLK_EH_DONE;
 	}
 
@@ -1213,6 +1214,11 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
 		dev_warn(dev->ctrl.device,
 			 "I/O %d QID %d timeout, completion polled\n",
 			 req->tag, nvmeq->qid);
+		/*
+		 * nvme_end_request will invoke blk_mq_complete_request,
+		 * it will do nothing for this timed out request.
+		 */
+		__blk_mq_complete_request(req);
 		return BLK_EH_DONE;
 	}
 
@@ -1230,6 +1236,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
 			 req->tag, nvmeq->qid);
 		nvme_dev_disable(dev, false);
 		nvme_req(req)->flags |= NVME_REQ_CANCELLED;
+		__blk_mq_complete_request(req);
 		return BLK_EH_DONE;
 	default:
 		break;
@@ -1248,6 +1255,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
 		nvme_reset_ctrl(&dev->ctrl);
 
 		nvme_req(req)->flags |= NVME_REQ_CANCELLED;
+		__blk_mq_complete_request(req);
 		return BLK_EH_DONE;
 	}
 
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index c9424da..287eecd 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -1617,6 +1617,7 @@ nvme_rdma_timeout(struct request *rq, bool reserved)
 	/* fail with DNR on cmd timeout */
 	nvme_req(rq)->status = NVME_SC_ABORT_REQ | NVME_SC_DNR;
 
+	__blk_mq_complete_request(rq);
 	return BLK_EH_DONE;
 }
 
diff --git a/drivers/nvme/target/loop.c b/drivers/nvme/target/loop.c
index d8d91f0..627fbb0 100644
--- a/drivers/nvme/target/loop.c
+++ b/drivers/nvme/target/loop.c
@@ -148,6 +148,7 @@ nvme_loop_timeout(struct request *rq, bool reserved)
 	/* fail with DNR on admin cmd timeout */
 	nvme_req(rq)->status = NVME_SC_ABORT_REQ | NVME_SC_DNR;
 
+	__blk_mq_complete_request(rq);
 	return BLK_EH_DONE;
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/5] nbd: use __blk_mq_complete_request in timeout path
  2018-06-20 13:22 ` [PATCH 2/5] nbd: use __blk_mq_complete_request in timeout path Jianchao Wang
@ 2018-06-20 14:13   ` Josef Bacik
  0 siblings, 0 replies; 24+ messages in thread
From: Josef Bacik @ 2018-06-20 14:13 UTC (permalink / raw)
  To: Jianchao Wang
  Cc: Jens Axboe, Christoph Hellwig, Martin K. Petersen, Keith Busch,
	ulf.hansson, linux-block, linux-scsi, Linux Kernel

You can add

Acked-by: Josef Bacik <josef@toxicpanda.com>

for the NBD patch.  Thanks,

Josef

On Wed, Jun 20, 2018 at 9:22 AM, Jianchao Wang
<jianchao.w.wang@oracle.com> wrote:
> To regain the capability to prevent normal completion path from
> entering a timeout request, blk_mq_mark_rq_complete is introduced
> in blk_mq_complete_request. Have to use __blk_mq_complete_request
> in timeout path to complete a timeout request.
>
> Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
> ---
>  drivers/block/nbd.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index 3b7083b..d7e533a 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -354,7 +354,7 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
>         sock_shutdown(nbd);
>         nbd_config_put(nbd);
>  done:
> -       blk_mq_complete_request(req);
> +       __blk_mq_complete_request(req);
>         return BLK_EH_DONE;
>  }
>
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 5/5] nvme: use __blk_mq_complete_request in timeout path
  2018-06-20 13:22 ` [PATCH 5/5] nvme: " Jianchao Wang
@ 2018-06-20 14:39   ` Christoph Hellwig
  2018-06-21  2:09     ` jianchao.wang
  0 siblings, 1 reply; 24+ messages in thread
From: Christoph Hellwig @ 2018-06-20 14:39 UTC (permalink / raw)
  To: Jianchao Wang
  Cc: axboe, hch, martin.petersen, keith.busch, josef, ulf.hansson,
	linux-block, linux-scsi, linux-kernel

> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 73a97fc..2a161f6 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -1203,6 +1203,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>  		nvme_warn_reset(dev, csts);
>  		nvme_dev_disable(dev, false);
>  		nvme_reset_ctrl(&dev->ctrl);
> +		__blk_mq_complete_request(req);
>  		return BLK_EH_DONE;
>  	}
>  
> @@ -1213,6 +1214,11 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>  		dev_warn(dev->ctrl.device,
>  			 "I/O %d QID %d timeout, completion polled\n",
>  			 req->tag, nvmeq->qid);
> +		/*
> +		 * nvme_end_request will invoke blk_mq_complete_request,
> +		 * it will do nothing for this timed out request.
> +		 */
> +		__blk_mq_complete_request(req);

And this clearly is bogus.  We want to iterate over the tagetset
and cancel all requests, not do that manually here.

That was the whole point of the original change.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/5]stop normal completion path entering a timeout req
  2018-06-20 13:22 [PATCH 0/5]stop normal completion path entering a timeout req Jianchao Wang
                   ` (4 preceding siblings ...)
  2018-06-20 13:22 ` [PATCH 5/5] nvme: " Jianchao Wang
@ 2018-06-20 18:16 ` Keith Busch
  2018-06-21  1:43   ` jianchao.wang
  5 siblings, 1 reply; 24+ messages in thread
From: Keith Busch @ 2018-06-20 18:16 UTC (permalink / raw)
  To: Jianchao Wang
  Cc: axboe, hch, martin.petersen, josef, ulf.hansson, linux-block,
	linux-scsi, linux-kernel

On Wed, Jun 20, 2018 at 09:22:39PM +0800, Jianchao Wang wrote:
> Dear all
> 
> scsi timeout and error handler are based on an assumption that normal
> completion mustn't do anything on an timeout request. After 12f5b931
> (blk-mq: Remove generation seqeunce), we lost this. __blk_mq_complete
> request could ensure a request won't be completed twice, but it can
> still complete a timeout request.
> scsi (even other drivers) have been working on this assumption for many
> years, it is dangerous to discard it suddenly. This patch set is to regain this.

I certainly don't want to harm any drivers. Could you possibly explain
what about removing silent execptions from the completion handler and
letting drivers control the destiny of requests they own is "dangerous"?

A initial look at your proposal looks pretty harmful to me. A driver may
return BLK_EH_RESET_TIMER, then call blk_mq_complete_req from another
thread, and your patch will simply lose that request and escalate error
recovery. That seems exactly what you shouldn't want to happen.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/5]stop normal completion path entering a timeout req
  2018-06-20 18:16 ` [PATCH 0/5]stop normal completion path entering a timeout req Keith Busch
@ 2018-06-21  1:43   ` jianchao.wang
  2018-06-21  8:19     ` Christoph Hellwig
  0 siblings, 1 reply; 24+ messages in thread
From: jianchao.wang @ 2018-06-21  1:43 UTC (permalink / raw)
  To: Keith Busch
  Cc: axboe, hch, martin.petersen, josef, ulf.hansson, linux-block,
	linux-scsi, linux-kernel

Hi Keith

Thanks for your kindly response.

On 06/21/2018 02:16 AM, Keith Busch wrote:
> On Wed, Jun 20, 2018 at 09:22:39PM +0800, Jianchao Wang wrote:
>> Dear all
>>
>> scsi timeout and error handler are based on an assumption that normal
>> completion mustn't do anything on an timeout request. After 12f5b931
>> (blk-mq: Remove generation seqeunce), we lost this. __blk_mq_complete
>> request could ensure a request won't be completed twice, but it can
>> still complete a timeout request.
>> scsi (even other drivers) have been working on this assumption for many
>> years, it is dangerous to discard it suddenly. This patch set is to regain this.
> 
> I certainly don't want to harm any drivers. Could you possibly explain
> what about removing silent execptions from the completion handler and
> letting drivers control the destiny of requests they own is "dangerous"?

Letting LLDD control the destiny of requests they own is great idea !
But some of the LLDD (such as scsi) depends on an assumption (or setup)
normal completion mustn't do anything on an timeout request and this is provided
by block layer before 12f5b931 (blk-mq: Remove generation seqeunce) for many years.
timer and IO completion will both attempt to 'grab' the request, we have to make
sure that only one of them succeeds.

We could also refer to the following segment of the Documentation/scsi/scsi_eh.txt
" Note that this does not mean lower layers are quiescent.  If a LLDD
completed a scmd with error status, the LLDD and lower layers are
assumed to forget about the scmd at that point.  However, if a scmd
has timed out, unless hostt->eh_timed_out() made lower layers forget
about the scmd, which currently no LLDD does, the command is still
active as long as lower layers are concerned and completion could
occur at any time.  Of course, all such completions are ignored as the
timer has already expired.
"

So we have to preserve the ability of block layer that it could prevent
IO completion path from entering a timeout request.

With scsi-debug module, I tried to simulate a scenario where timeout and IO
completion path could occur concurrently, the system ran into crash easily.

> 
> A initial look at your proposal looks pretty harmful to me. A driver may
> return BLK_EH_RESET_TIMER, then call blk_mq_complete_req from another
> thread, and your patch will simply lose that request and escalate error
> recovery. That seems exactly what you shouldn't want to happen.
> 
Yes, this is indeed a hole.
The escalated error recovery should could handle this.
And it will be a better scenario than the one caused by trace between io completion
and timeout path.

Thanks
Jianchao

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 5/5] nvme: use __blk_mq_complete_request in timeout path
  2018-06-20 14:39   ` Christoph Hellwig
@ 2018-06-21  2:09     ` jianchao.wang
  2018-06-24 18:07       ` Sagi Grimberg
  0 siblings, 1 reply; 24+ messages in thread
From: jianchao.wang @ 2018-06-21  2:09 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: axboe, martin.petersen, keith.busch, josef, ulf.hansson,
	linux-block, linux-scsi, linux-kernel

Hi Christoph

Thanks for your kindly response.

On 06/20/2018 10:39 PM, Christoph Hellwig wrote:
>> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
>> index 73a97fc..2a161f6 100644
>> --- a/drivers/nvme/host/pci.c
>> +++ b/drivers/nvme/host/pci.c
>> @@ -1203,6 +1203,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>>  		nvme_warn_reset(dev, csts);
>>  		nvme_dev_disable(dev, false);
>>  		nvme_reset_ctrl(&dev->ctrl);
>> +		__blk_mq_complete_request(req);
>>  		return BLK_EH_DONE;
>>  	}
>>  
>> @@ -1213,6 +1214,11 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>>  		dev_warn(dev->ctrl.device,
>>  			 "I/O %d QID %d timeout, completion polled\n",
>>  			 req->tag, nvmeq->qid);
>> +		/*
>> +		 * nvme_end_request will invoke blk_mq_complete_request,
>> +		 * it will do nothing for this timed out request.
>> +		 */
>> +		__blk_mq_complete_request(req);
> 
> And this clearly is bogus.  We want to iterate over the tagetset
> and cancel all requests, not do that manually here.
> 
> That was the whole point of the original change.
> 

For nvme-pci, we indeed have an issue that when nvme_reset_work->nvme_dev_disable returns, timeout path maybe still
running and the nvme_dev_disable invoked by timeout path will race with the nvme_reset_work.
However, the hole is still there right now w/o my changes, but just narrower. 

Thanks
Jianchao

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/5]stop normal completion path entering a timeout req
  2018-06-21  1:43   ` jianchao.wang
@ 2018-06-21  8:19     ` Christoph Hellwig
  2018-06-21  8:22       ` jianchao.wang
                         ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Christoph Hellwig @ 2018-06-21  8:19 UTC (permalink / raw)
  To: jianchao.wang
  Cc: Keith Busch, axboe, hch, martin.petersen, josef, ulf.hansson,
	linux-block, linux-scsi, linux-kernel

On Thu, Jun 21, 2018 at 09:43:26AM +0800, jianchao.wang wrote:
> So we have to preserve the ability of block layer that it could prevent
> IO completion path from entering a timeout request.
> 
> With scsi-debug module, I tried to simulate a scenario where timeout and IO
> completion path could occur concurrently, the system ran into crash easily.

Trace, please.  With the latest kernel.  I'm not saying that there
is nothing to fix, but the mode of never completing once timeout
requests as currently done is SCSI is clearly broken.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/5]stop normal completion path entering a timeout req
  2018-06-21  8:19     ` Christoph Hellwig
@ 2018-06-21  8:22       ` jianchao.wang
  2018-06-22 15:10         ` Christoph Hellwig
  2018-06-21 13:13       ` jianchao.wang
  2018-06-21 18:21         ` Bart Van Assche
  2 siblings, 1 reply; 24+ messages in thread
From: jianchao.wang @ 2018-06-21  8:22 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Keith Busch, axboe, martin.petersen, josef, ulf.hansson,
	linux-block, linux-scsi, linux-kernel

Hi Christoph

Thanks for your kindly response.

On 06/21/2018 04:19 PM, Christoph Hellwig wrote:
> On Thu, Jun 21, 2018 at 09:43:26AM +0800, jianchao.wang wrote:
>> So we have to preserve the ability of block layer that it could prevent
>> IO completion path from entering a timeout request.
>>
>> With scsi-debug module, I tried to simulate a scenario where timeout and IO
>> completion path could occur concurrently, the system ran into crash easily.
> 
> Trace, please.  With the latest kernel.  I'm not saying that there
> is nothing to fix, but the mode of never completing once timeout
> requests as currently done is SCSI is clearly broken.
> 

I didn't find the existing method to simulate this.
So I modified the scsi-debug as following patch as install it as following:
modprobe scsi-debug delay=-1 ndelay=-1
Both 4.17-rc1 and 4.18-rc1 with this patch set could survive from the test.

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 24d7496..f278e6c 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -4323,6 +4323,8 @@ static void setup_inject(struct sdebug_queue *sqp,
        sqcp->inj_host_busy = !!(SDEBUG_OPT_HOST_BUSY & sdebug_opts);
 }
 
+static atomic_t g_abort_counter;
+
 /* Complete the processing of the thread that queued a SCSI command to this
  * driver. It either completes the command by calling cmnd_done() or
  * schedules a hr timer or work queue then returns 0. Returns
@@ -4459,6 +4461,11 @@ static int schedule_resp(struct scsi_cmnd *cmnd, struct sdebug_dev_info *devip,
                        sd_dp->issuing_cpu = raw_smp_processor_id();
                sd_dp->defer_t = SDEB_DEFER_WQ;
                schedule_work(&sd_dp->ew.work);
+               atomic_inc(&g_abort_counter);
+               if (atomic_read(&g_abort_counter)%2000 == 0) {
+                       blk_abort_request(cmnd->request);
+                       trace_printk("abort request tag %d\n", cmnd->request->tag);
+               }
        }
        if (unlikely((SDEBUG_OPT_Q_NOISE & sdebug_opts) &&
                     (scsi_result == device_qfull_result)))
@@ -5844,6 +5851,7 @@ static int sdebug_driver_probe(struct device *dev)
        struct Scsi_Host *hpnt;
        int hprot;
 
+       atomic_set(&g_abort_counter, 0);
        sdbg_host = to_sdebug_host(dev);
 
        sdebug_driver_template.can_queue = sdebug_max_queue;

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/5]stop normal completion path entering a timeout req
  2018-06-21  8:19     ` Christoph Hellwig
  2018-06-21  8:22       ` jianchao.wang
@ 2018-06-21 13:13       ` jianchao.wang
  2018-06-21 15:01         ` Keith Busch
  2018-06-21 18:21         ` Bart Van Assche
  2 siblings, 1 reply; 24+ messages in thread
From: jianchao.wang @ 2018-06-21 13:13 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Keith Busch, axboe, martin.petersen, josef, ulf.hansson,
	linux-block, linux-scsi, linux-kernel

Hi Christoph

On 06/21/2018 04:19 PM, Christoph Hellwig wrote:
> On Thu, Jun 21, 2018 at 09:43:26AM +0800, jianchao.wang wrote:
>> So we have to preserve the ability of block layer that it could prevent
>> IO completion path from entering a timeout request.
>>
>> With scsi-debug module, I tried to simulate a scenario where timeout and IO
>> completion path could occur concurrently, the system ran into crash easily.
> 
> Trace, please.  With the latest kernel.  I'm not saying that there
> is nothing to fix, but the mode of never completing once timeout
> requests as currently done is SCSI is clearly broken.
> 

Sorry, I don't quite get your point.
Do you mean we should do the modification in the scsi layer ?
Actually, I used to look at the code to try to stop io completion on timeout request
in scsi-mid layer, but it looks like difficult, especially, scsi eh would try
to borrow the command to issue admin command such as get sense, test unit,etc.

Thanks
Jianchao

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/5]stop normal completion path entering a timeout req
  2018-06-21 13:13       ` jianchao.wang
@ 2018-06-21 15:01         ` Keith Busch
  0 siblings, 0 replies; 24+ messages in thread
From: Keith Busch @ 2018-06-21 15:01 UTC (permalink / raw)
  To: jianchao.wang
  Cc: Christoph Hellwig, axboe, martin.petersen, josef, ulf.hansson,
	linux-block, linux-scsi, linux-kernel

On Thu, Jun 21, 2018 at 09:13:30PM +0800, jianchao.wang wrote:
> Hi Christoph
> 
> On 06/21/2018 04:19 PM, Christoph Hellwig wrote:
> > On Thu, Jun 21, 2018 at 09:43:26AM +0800, jianchao.wang wrote:
> >> So we have to preserve the ability of block layer that it could prevent
> >> IO completion path from entering a timeout request.
> >>
> >> With scsi-debug module, I tried to simulate a scenario where timeout and IO
> >> completion path could occur concurrently, the system ran into crash easily.
> > 
> > Trace, please.  With the latest kernel.  I'm not saying that there
> > is nothing to fix, but the mode of never completing once timeout
> > requests as currently done is SCSI is clearly broken.
> > 
> 
> Sorry, I don't quite get your point.
> Do you mean we should do the modification in the scsi layer ?

I think it would just be helpful to start with the stack trace of your
observed crash + the reproduction steps so the problem (and hopefully
a solution) can be better understood.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/5]stop normal completion path entering a timeout req
  2018-06-21  8:19     ` Christoph Hellwig
@ 2018-06-21 18:21         ` Bart Van Assche
  2018-06-21 13:13       ` jianchao.wang
  2018-06-21 18:21         ` Bart Van Assche
  2 siblings, 0 replies; 24+ messages in thread
From: Bart Van Assche @ 2018-06-21 18:21 UTC (permalink / raw)
  To: hch, jianchao.w.wang
  Cc: linux-kernel, linux-block, martin.petersen, axboe, linux-scsi,
	josef, ulf.hansson, keith.busch

T24gVGh1LCAyMDE4LTA2LTIxIGF0IDEwOjE5ICswMjAwLCBDaHJpc3RvcGggSGVsbHdpZyB3cm90
ZToNCj4gT24gVGh1LCBKdW4gMjEsIDIwMTggYXQgMDk6NDM6MjZBTSArMDgwMCwgamlhbmNoYW8u
d2FuZyB3cm90ZToNCj4gPiBTbyB3ZSBoYXZlIHRvIHByZXNlcnZlIHRoZSBhYmlsaXR5IG9mIGJs
b2NrIGxheWVyIHRoYXQgaXQgY291bGQgcHJldmVudA0KPiA+IElPIGNvbXBsZXRpb24gcGF0aCBm
cm9tIGVudGVyaW5nIGEgdGltZW91dCByZXF1ZXN0Lg0KPiA+IA0KPiA+IFdpdGggc2NzaS1kZWJ1
ZyBtb2R1bGUsIEkgdHJpZWQgdG8gc2ltdWxhdGUgYSBzY2VuYXJpbyB3aGVyZSB0aW1lb3V0IGFu
ZCBJTw0KPiA+IGNvbXBsZXRpb24gcGF0aCBjb3VsZCBvY2N1ciBjb25jdXJyZW50bHksIHRoZSBz
eXN0ZW0gcmFuIGludG8gY3Jhc2ggZWFzaWx5Lg0KPiANCj4gVHJhY2UsIHBsZWFzZS4gIFdpdGgg
dGhlIGxhdGVzdCBrZXJuZWwuICBJJ20gbm90IHNheWluZyB0aGF0IHRoZXJlDQo+IGlzIG5vdGhp
bmcgdG8gZml4LCBidXQgdGhlIG1vZGUgb2YgbmV2ZXIgY29tcGxldGluZyBvbmNlIHRpbWVvdXQN
Cj4gcmVxdWVzdHMgYXMgY3VycmVudGx5IGRvbmUgaXMgU0NTSSBpcyBjbGVhcmx5IGJyb2tlbi4N
Cg0KVGhhdCdzIG5vdCBob3cgdGhlIFNDU0kgY29yZSB3b3Jrcy4NCg0KQmFydC4NCg0KDQoNCg==

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/5]stop normal completion path entering a timeout req
@ 2018-06-21 18:21         ` Bart Van Assche
  0 siblings, 0 replies; 24+ messages in thread
From: Bart Van Assche @ 2018-06-21 18:21 UTC (permalink / raw)
  To: hch, jianchao.w.wang
  Cc: linux-kernel, linux-block, martin.petersen, axboe, linux-scsi,
	josef, ulf.hansson, keith.busch

On Thu, 2018-06-21 at 10:19 +0200, Christoph Hellwig wrote:
> On Thu, Jun 21, 2018 at 09:43:26AM +0800, jianchao.wang wrote:
> > So we have to preserve the ability of block layer that it could prevent
> > IO completion path from entering a timeout request.
> > 
> > With scsi-debug module, I tried to simulate a scenario where timeout and IO
> > completion path could occur concurrently, the system ran into crash easily.
> 
> Trace, please.  With the latest kernel.  I'm not saying that there
> is nothing to fix, but the mode of never completing once timeout
> requests as currently done is SCSI is clearly broken.

That's not how the SCSI core works.

Bart.




^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/5]stop normal completion path entering a timeout req
  2018-06-21 18:21         ` Bart Van Assche
  (?)
@ 2018-06-21 21:15         ` Keith Busch
  2018-06-21 21:30           ` Bart Van Assche
  -1 siblings, 1 reply; 24+ messages in thread
From: Keith Busch @ 2018-06-21 21:15 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: hch, jianchao.w.wang, linux-kernel, linux-block, martin.petersen,
	axboe, linux-scsi, josef, ulf.hansson

On Thu, Jun 21, 2018 at 06:21:09PM +0000, Bart Van Assche wrote:
> That's not how the SCSI core works.

How does it work? Jianchao is advocating for holes in software
to ignore reacting to hardware events and force unnecessary error
escalation. Putting holes in kernel software can't possibly be our
only recourse.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/5]stop normal completion path entering a timeout req
  2018-06-21 21:15         ` Keith Busch
@ 2018-06-21 21:30           ` Bart Van Assche
  0 siblings, 0 replies; 24+ messages in thread
From: Bart Van Assche @ 2018-06-21 21:30 UTC (permalink / raw)
  To: Keith Busch
  Cc: hch, jianchao.w.wang, linux-kernel, linux-block, martin.petersen,
	axboe, linux-scsi, josef, ulf.hansson

On 06/21/18 14:12, Keith Busch wrote:
> On Thu, Jun 21, 2018 at 06:21:09PM +0000, Bart Van Assche wrote:
>> That's not how the SCSI core works.
> 
> How does it work? Jianchao is advocating for holes in software
> to ignore reacting to hardware events and force unnecessary error
> escalation. Putting holes in kernel software can't possibly be our
> only recourse.

Why were the patch "blk-mq: Remove generation seqeunce" and related 
patches rushed upstream without having given anyone a chance to review 
these patches? The approach of these patches is inferior to at least one 
alternative that was available. The code that is currently upstream
spreads the request state over multiple (three) variables while an 
alternative approach was presented that stores the request state in a 
single variable. I think the number of fix-up patches that is needed for 
the approach that is currently upstream shows that the upstream approach 
is inferior. I'm referring to "[PATCH v14] blk-mq: Rework blk-mq timeout 
handling again" (https://marc.info/?l=linux-block&m=152703042412228).

Bart.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/5]stop normal completion path entering a timeout req
  2018-06-21  8:22       ` jianchao.wang
@ 2018-06-22 15:10         ` Christoph Hellwig
  2018-06-25  1:29           ` jianchao.wang
  0 siblings, 1 reply; 24+ messages in thread
From: Christoph Hellwig @ 2018-06-22 15:10 UTC (permalink / raw)
  To: jianchao.wang
  Cc: Christoph Hellwig, Keith Busch, axboe, martin.petersen, josef,
	ulf.hansson, linux-block, linux-scsi, linux-kernel

On Thu, Jun 21, 2018 at 04:22:22PM +0800, jianchao.wang wrote:
> > Trace, please.  With the latest kernel.  I'm not saying that there
> > is nothing to fix, but the mode of never completing once timeout
> > requests as currently done is SCSI is clearly broken.
> > 
> 
> I didn't find the existing method to simulate this.
> So I modified the scsi-debug as following patch as install it as following:
> modprobe scsi-debug delay=-1 ndelay=-1
> Both 4.17-rc1 and 4.18-rc1 with this patch set could survive from the test.

What tree is this against?  I can't apply it to either current Linus'
tree or 4.17 for that matter.

Also I'm not sure this blk_abort_request call is representative
of the real world.  Drivers do drain their queues before calling
it in general, e.g. take a look at ata_eh_set_pending for the
probably most common user.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 5/5] nvme: use __blk_mq_complete_request in timeout path
  2018-06-21  2:09     ` jianchao.wang
@ 2018-06-24 18:07       ` Sagi Grimberg
  2018-06-25  1:40         ` jianchao.wang
  0 siblings, 1 reply; 24+ messages in thread
From: Sagi Grimberg @ 2018-06-24 18:07 UTC (permalink / raw)
  To: jianchao.wang, Christoph Hellwig
  Cc: axboe, martin.petersen, keith.busch, josef, ulf.hansson,
	linux-block, linux-scsi, linux-kernel


> Hi Christoph
> 
> Thanks for your kindly response.
> 
> On 06/20/2018 10:39 PM, Christoph Hellwig wrote:
>>> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
>>> index 73a97fc..2a161f6 100644
>>> --- a/drivers/nvme/host/pci.c
>>> +++ b/drivers/nvme/host/pci.c
>>> @@ -1203,6 +1203,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>>>   		nvme_warn_reset(dev, csts);
>>>   		nvme_dev_disable(dev, false);
>>>   		nvme_reset_ctrl(&dev->ctrl);
>>> +		__blk_mq_complete_request(req);
>>>   		return BLK_EH_DONE;
>>>   	}
>>>   
>>> @@ -1213,6 +1214,11 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>>>   		dev_warn(dev->ctrl.device,
>>>   			 "I/O %d QID %d timeout, completion polled\n",
>>>   			 req->tag, nvmeq->qid);
>>> +		/*
>>> +		 * nvme_end_request will invoke blk_mq_complete_request,
>>> +		 * it will do nothing for this timed out request.
>>> +		 */
>>> +		__blk_mq_complete_request(req);
>>
>> And this clearly is bogus.  We want to iterate over the tagetset
>> and cancel all requests, not do that manually here.
>>
>> That was the whole point of the original change.
>>
> 
> For nvme-pci, we indeed have an issue that when nvme_reset_work->nvme_dev_disable returns, timeout path maybe still
> running and the nvme_dev_disable invoked by timeout path will race with the nvme_reset_work.
> However, the hole is still there right now w/o my changes, but just narrower.

Given the amount of fixes (and fixes of fixes) we had in the timeout 
handler, maybe it'd be a good idea to step back and take a another look?

Won't it be better to avoid disabling the device and return
BLK_EH_RESET_TIMER if we are not aborting in the timeout handler?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/5]stop normal completion path entering a timeout req
  2018-06-22 15:10         ` Christoph Hellwig
@ 2018-06-25  1:29           ` jianchao.wang
  0 siblings, 0 replies; 24+ messages in thread
From: jianchao.wang @ 2018-06-25  1:29 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Keith Busch, axboe, martin.petersen, josef, ulf.hansson,
	linux-block, linux-scsi, linux-kernel

Hi Christoph

Sorry for delayed response.

On 06/22/2018 11:10 PM, Christoph Hellwig wrote:
> On Thu, Jun 21, 2018 at 04:22:22PM +0800, jianchao.wang wrote:
>>> Trace, please.  With the latest kernel.  I'm not saying that there
>>> is nothing to fix, but the mode of never completing once timeout
>>> requests as currently done is SCSI is clearly broken.
>>>
>>
>> I didn't find the existing method to simulate this.
>> So I modified the scsi-debug as following patch as install it as following:
>> modprobe scsi-debug delay=-1 ndelay=-1
>> Both 4.17-rc1 and 4.18-rc1 with this patch set could survive from the test.
> 
> What tree is this against?  I can't apply it to either current Linus'
> tree or 4.17 for that matter.

I made the patch against 4.18.rc1.

> Also I'm not sure this blk_abort_request call is representative
> of the real world.  Drivers do drain their queues before calling
> it in general, e.g. take a look at ata_eh_set_pending for the
> probably most common user.
> 

This blk_abort_request here is to force request timed out and simulate the
scenario where timeout path and io completion path could occur concurrently.
It is hard for me to trigger this scenario in real world, so I made  this patch
which may looks bad. What I want is to trigger the io completion and timeout path
concurrently.


Thanks
Jianchao

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 5/5] nvme: use __blk_mq_complete_request in timeout path
  2018-06-24 18:07       ` Sagi Grimberg
@ 2018-06-25  1:40         ` jianchao.wang
  2018-06-25 18:51           ` Sagi Grimberg
  0 siblings, 1 reply; 24+ messages in thread
From: jianchao.wang @ 2018-06-25  1:40 UTC (permalink / raw)
  To: Sagi Grimberg, Christoph Hellwig
  Cc: axboe, martin.petersen, keith.busch, josef, ulf.hansson,
	linux-block, linux-scsi, linux-kernel



On 06/25/2018 02:07 AM, Sagi Grimberg wrote:
> 
>> Hi Christoph
>>
>> Thanks for your kindly response.
>>
>> On 06/20/2018 10:39 PM, Christoph Hellwig wrote:
>>>> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
>>>> index 73a97fc..2a161f6 100644
>>>> --- a/drivers/nvme/host/pci.c
>>>> +++ b/drivers/nvme/host/pci.c
>>>> @@ -1203,6 +1203,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>>>>           nvme_warn_reset(dev, csts);
>>>>           nvme_dev_disable(dev, false);
>>>>           nvme_reset_ctrl(&dev->ctrl);
>>>> +        __blk_mq_complete_request(req);
>>>>           return BLK_EH_DONE;
>>>>       }
>>>>   @@ -1213,6 +1214,11 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>>>>           dev_warn(dev->ctrl.device,
>>>>                "I/O %d QID %d timeout, completion polled\n",
>>>>                req->tag, nvmeq->qid);
>>>> +        /*
>>>> +         * nvme_end_request will invoke blk_mq_complete_request,
>>>> +         * it will do nothing for this timed out request.
>>>> +         */
>>>> +        __blk_mq_complete_request(req);
>>>
>>> And this clearly is bogus.  We want to iterate over the tagetset
>>> and cancel all requests, not do that manually here.
>>>
>>> That was the whole point of the original change.
>>>
>>
>> For nvme-pci, we indeed have an issue that when nvme_reset_work->nvme_dev_disable returns, timeout path maybe still
>> running and the nvme_dev_disable invoked by timeout path will race with the nvme_reset_work.
>> However, the hole is still there right now w/o my changes, but just narrower.
> 
> Given the amount of fixes (and fixes of fixes) we had in the timeout handler, maybe it'd be a good idea to step back and take a another look?
> 
> Won't it be better to avoid disabling the device and return
> BLK_EH_RESET_TIMER if we are not aborting in the timeout handler?
> 

Yes, that would be an ideal status for nvme-pci.
But we have to depend on the timeout handler to handle the timed out request from nvme_reset_work.

Thanks
Jianchao

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 5/5] nvme: use __blk_mq_complete_request in timeout path
  2018-06-25  1:40         ` jianchao.wang
@ 2018-06-25 18:51           ` Sagi Grimberg
  0 siblings, 0 replies; 24+ messages in thread
From: Sagi Grimberg @ 2018-06-25 18:51 UTC (permalink / raw)
  To: jianchao.wang, Christoph Hellwig
  Cc: axboe, martin.petersen, keith.busch, josef, ulf.hansson,
	linux-block, linux-scsi, linux-kernel


>> Given the amount of fixes (and fixes of fixes) we had in the timeout handler, maybe it'd be a good idea to step back and take a another look?
>>
>> Won't it be better to avoid disabling the device and return
>> BLK_EH_RESET_TIMER if we are not aborting in the timeout handler?
>>
> 
> Yes, that would be an ideal status for nvme-pci.
> But we have to depend on the timeout handler to handle the timed out request from nvme_reset_work.

Yes, so if the timeout handler is invoked from the reset, it should
simply set a proper failed status and return BLK_EH_DONE.

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2018-06-25 18:51 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-20 13:22 [PATCH 0/5]stop normal completion path entering a timeout req Jianchao Wang
2018-06-20 13:22 ` [PATCH 1/5] blk-mq: prevent normal completion from entering a timeout request Jianchao Wang
2018-06-20 13:22 ` [PATCH 2/5] nbd: use __blk_mq_complete_request in timeout path Jianchao Wang
2018-06-20 14:13   ` Josef Bacik
2018-06-20 13:22 ` [PATCH 3/5] null_blk: " Jianchao Wang
2018-06-20 13:22 ` [PATCH 4/5] mmc: " Jianchao Wang
2018-06-20 13:22 ` [PATCH 5/5] nvme: " Jianchao Wang
2018-06-20 14:39   ` Christoph Hellwig
2018-06-21  2:09     ` jianchao.wang
2018-06-24 18:07       ` Sagi Grimberg
2018-06-25  1:40         ` jianchao.wang
2018-06-25 18:51           ` Sagi Grimberg
2018-06-20 18:16 ` [PATCH 0/5]stop normal completion path entering a timeout req Keith Busch
2018-06-21  1:43   ` jianchao.wang
2018-06-21  8:19     ` Christoph Hellwig
2018-06-21  8:22       ` jianchao.wang
2018-06-22 15:10         ` Christoph Hellwig
2018-06-25  1:29           ` jianchao.wang
2018-06-21 13:13       ` jianchao.wang
2018-06-21 15:01         ` Keith Busch
2018-06-21 18:21       ` Bart Van Assche
2018-06-21 18:21         ` Bart Van Assche
2018-06-21 21:15         ` Keith Busch
2018-06-21 21:30           ` Bart Van Assche

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.