linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/7] blk-mq: Queue running fixes
@ 2017-12-01  0:08 Bart Van Assche
  2017-12-01  0:08 ` [PATCH 1/7] blk-mq: Fix spelling in a source code comment Bart Van Assche
                   ` (6 more replies)
  0 siblings, 7 replies; 21+ messages in thread
From: Bart Van Assche @ 2017-12-01  0:08 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Christoph Hellwig, Bart Van Assche

Hello Jens,

This patch series fixes the issues I came across while reviewing and testing
the v4.15-rc1 block layer. Please consider these patches for v4.15-rc<n> /
v4.16 as appropriate.

Thanks,

Bart.

Bart Van Assche (7):
  blk-mq: Fix spelling in a source code comment
  block: Document more locking requirements
  blk-mq: Make blk_mq_mark_tag_wait() easier to read
  blk-mq: Avoid that request processing stalls when sharing tags
  blk-mq: Avoid that blk_mq_mark_tag_wait() triggers a queue stall
  blk-mq: Rerun hardware queues after having called .put_budget()
  blk-mq: Fix another queue stall

 block/blk-mq-sched.c | 43 ++++++++++++------------
 block/blk-mq-sched.h |  2 +-
 block/blk-mq.c       | 95 +++++++++++++++++++++++++++++++---------------------
 block/blk-mq.h       |  2 ++
 block/blk-timeout.c  |  3 ++
 5 files changed, 85 insertions(+), 60 deletions(-)

-- 
2.15.0

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 1/7] blk-mq: Fix spelling in a source code comment
  2017-12-01  0:08 [PATCH 0/7] blk-mq: Queue running fixes Bart Van Assche
@ 2017-12-01  0:08 ` Bart Van Assche
  2017-12-01  0:08 ` [PATCH 2/7] block: Document more locking requirements Bart Van Assche
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 21+ messages in thread
From: Bart Van Assche @ 2017-12-01  0:08 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Bart Van Assche, Omar Sandoval,
	Hannes Reinecke, Johannes Thumshirn

Change "nedeing" into "needing" and "caes" into "cases".

Fixes: commit f906a6a0f426 ("blk-mq: improve tag waiting setup for non-shared tags")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
---
 block/blk-mq.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index aa165abf08f0..e1ca7661daa5 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1014,8 +1014,8 @@ static int blk_mq_dispatch_wake(wait_queue_entry_t *wait, unsigned mode,
 
 /*
  * Mark us waiting for a tag. For shared tags, this involves hooking us into
- * the tag wakeups. For non-shared tags, we can simply mark us nedeing a
- * restart. For both caes, take care to check the condition again after
+ * the tag wakeups. For non-shared tags, we can simply mark us needing a
+ * restart. For both cases, take care to check the condition again after
  * marking us as waiting.
  */
 static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 2/7] block: Document more locking requirements
  2017-12-01  0:08 [PATCH 0/7] blk-mq: Queue running fixes Bart Van Assche
  2017-12-01  0:08 ` [PATCH 1/7] blk-mq: Fix spelling in a source code comment Bart Van Assche
@ 2017-12-01  0:08 ` Bart Van Assche
  2017-12-01  3:03   ` Jens Axboe
  2017-12-01  0:08 ` [PATCH 3/7] blk-mq: Make blk_mq_mark_tag_wait() easier to read Bart Van Assche
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 21+ messages in thread
From: Bart Van Assche @ 2017-12-01  0:08 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Bart Van Assche, Hannes Reinecke,
	Johannes Thumshirn

This patch does not change any functionality.

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
---
 block/blk-mq.c      | 13 +++++++++++--
 block/blk-timeout.c |  3 +++
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index e1ca7661daa5..7f290a91a612 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -74,6 +74,8 @@ static bool blk_mq_hctx_has_pending(struct blk_mq_hw_ctx *hctx)
 static void blk_mq_hctx_mark_pending(struct blk_mq_hw_ctx *hctx,
 				     struct blk_mq_ctx *ctx)
 {
+	lockdep_assert_held(&ctx->lock);
+
 	if (!sbitmap_test_bit(&hctx->ctx_map, ctx->index_hw))
 		sbitmap_set_bit(&hctx->ctx_map, ctx->index_hw);
 }
@@ -81,6 +83,8 @@ static void blk_mq_hctx_mark_pending(struct blk_mq_hw_ctx *hctx,
 static void blk_mq_hctx_clear_pending(struct blk_mq_hw_ctx *hctx,
 				      struct blk_mq_ctx *ctx)
 {
+	lockdep_assert_held(&ctx->lock);
+
 	sbitmap_clear_bit(&hctx->ctx_map, ctx->index_hw);
 }
 
@@ -1003,9 +1007,14 @@ bool blk_mq_get_driver_tag(struct request *rq, struct blk_mq_hw_ctx **hctx,
 static int blk_mq_dispatch_wake(wait_queue_entry_t *wait, unsigned mode,
 				int flags, void *key)
 {
-	struct blk_mq_hw_ctx *hctx;
+	struct blk_mq_hw_ctx *hctx =
+		container_of(wait, struct blk_mq_hw_ctx, dispatch_wait);
+
+#ifdef CONFIG_LOCKDEP
+	struct sbq_wait_state *ws = bt_wait_ptr(&hctx->tags->bitmap_tags, hctx);
 
-	hctx = container_of(wait, struct blk_mq_hw_ctx, dispatch_wait);
+	lockdep_assert_held(&ws->wait.lock);
+#endif
 
 	list_del_init(&wait->entry);
 	blk_mq_run_hw_queue(hctx, true);
diff --git a/block/blk-timeout.c b/block/blk-timeout.c
index 764ecf9aeb30..77bf0c6e7c7e 100644
--- a/block/blk-timeout.c
+++ b/block/blk-timeout.c
@@ -77,6 +77,9 @@ ssize_t part_timeout_store(struct device *dev, struct device_attribute *attr,
  */
 void blk_delete_timer(struct request *req)
 {
+	lockdep_assert_held(req->q->queue_lock);
+	WARN_ON_ONCE(req->q->mq_ops);
+
 	list_del_init(&req->timeout_list);
 }
 
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 3/7] blk-mq: Make blk_mq_mark_tag_wait() easier to read
  2017-12-01  0:08 [PATCH 0/7] blk-mq: Queue running fixes Bart Van Assche
  2017-12-01  0:08 ` [PATCH 1/7] blk-mq: Fix spelling in a source code comment Bart Van Assche
  2017-12-01  0:08 ` [PATCH 2/7] block: Document more locking requirements Bart Van Assche
@ 2017-12-01  0:08 ` Bart Van Assche
  2017-12-01  0:08 ` [PATCH 4/7] blk-mq: Avoid that request processing stalls when sharing tags Bart Van Assche
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 21+ messages in thread
From: Bart Van Assche @ 2017-12-01  0:08 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Bart Van Assche, Omar Sandoval,
	Hannes Reinecke, Johannes Thumshirn

Reduce the number of return statements from three to one. Reduce the
number of spin_unlock(&this_hctx->lock) calls from two to one. Fix
a misleading comment: since blk-mq-tag.c always uses wake_up_all()
other waiters are woken up whether or not the current hctx is removed
from the wait list. This patch does not change any functionality.

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
---
 block/blk-mq.c | 41 +++++++++++++++--------------------------
 1 file changed, 15 insertions(+), 26 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 7f290a91a612..26fec4dfa40f 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1039,6 +1039,11 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
 	if (!shared_tags) {
 		if (!test_bit(BLK_MQ_S_SCHED_RESTART, &this_hctx->state))
 			set_bit(BLK_MQ_S_SCHED_RESTART, &this_hctx->state);
+		ret = blk_mq_get_driver_tag(rq, hctx, false);
+		/*
+		 * Don't clear RESTART here, someone else could have set it.
+		 * At most this will cost an extra queue run.
+		 */
 	} else {
 		wait = &this_hctx->dispatch_wait;
 		if (!list_empty_careful(&wait->entry))
@@ -1052,37 +1057,21 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
 
 		ws = bt_wait_ptr(&this_hctx->tags->bitmap_tags, this_hctx);
 		add_wait_queue(&ws->wait, wait);
-	}
-
-	/*
-	 * It's possible that a tag was freed in the window between the
-	 * allocation failure and adding the hardware queue to the wait
-	 * queue.
-	 */
-	ret = blk_mq_get_driver_tag(rq, hctx, false);
-
-	if (!shared_tags) {
 		/*
-		 * Don't clear RESTART here, someone else could have set it.
-		 * At most this will cost an extra queue run.
+		 * It's possible that a tag was freed in the window between the
+		 * allocation failure and adding the hardware queue to the wait
+		 * queue.
 		 */
-		return ret;
-	} else {
-		if (!ret) {
-			spin_unlock(&this_hctx->lock);
-			return false;
+		ret = blk_mq_get_driver_tag(rq, hctx, false);
+		/* If we got a tag remove ourselves from the wait queue. */
+		if (ret) {
+			spin_lock_irq(&ws->wait.lock);
+			list_del_init(&wait->entry);
+			spin_unlock_irq(&ws->wait.lock);
 		}
-
-		/*
-		 * We got a tag, remove ourselves from the wait queue to ensure
-		 * someone else gets the wakeup.
-		 */
-		spin_lock_irq(&ws->wait.lock);
-		list_del_init(&wait->entry);
-		spin_unlock_irq(&ws->wait.lock);
 		spin_unlock(&this_hctx->lock);
-		return true;
 	}
+	return ret;
 }
 
 bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list,
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 4/7] blk-mq: Avoid that request processing stalls when sharing tags
  2017-12-01  0:08 [PATCH 0/7] blk-mq: Queue running fixes Bart Van Assche
                   ` (2 preceding siblings ...)
  2017-12-01  0:08 ` [PATCH 3/7] blk-mq: Make blk_mq_mark_tag_wait() easier to read Bart Van Assche
@ 2017-12-01  0:08 ` Bart Van Assche
  2017-12-01  2:58   ` Ming Lei
  2017-12-01  0:08 ` [PATCH 5/7] blk-mq: Avoid that blk_mq_mark_tag_wait() triggers a queue stall Bart Van Assche
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 21+ messages in thread
From: Bart Van Assche @ 2017-12-01  0:08 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Bart Van Assche, Ming Lei,
	Omar Sandoval, Hannes Reinecke, Johannes Thumshirn

blk_mq_sched_mark_restart_hctx() must be called before
blk_mq_dispatch_rq_list() is called. Make sure that
BLK_MQ_S_SCHED_RESTART is set before any blk_mq_dispatch_rq_list()
call occurs.

Fixes: commit b347689ffbca ("blk-mq-sched: improve dispatching from sw queue")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
---
 block/blk-mq-sched.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index d8e3533d3218..c4e0cb5f6f1f 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -208,8 +208,8 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
 	 * on the dispatch list or we were able to dispatch from the
 	 * dispatch list.
 	 */
+	blk_mq_sched_mark_restart_hctx(hctx);
 	if (!list_empty(&rq_list)) {
-		blk_mq_sched_mark_restart_hctx(hctx);
 		if (blk_mq_dispatch_rq_list(q, &rq_list, false)) {
 			if (has_sched_dispatch)
 				blk_mq_do_dispatch_sched(hctx);
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 5/7] blk-mq: Avoid that blk_mq_mark_tag_wait() triggers a queue stall
  2017-12-01  0:08 [PATCH 0/7] blk-mq: Queue running fixes Bart Van Assche
                   ` (3 preceding siblings ...)
  2017-12-01  0:08 ` [PATCH 4/7] blk-mq: Avoid that request processing stalls when sharing tags Bart Van Assche
@ 2017-12-01  0:08 ` Bart Van Assche
  2017-12-01  3:51   ` Ming Lei
  2017-12-01  0:08 ` [PATCH 6/7] blk-mq: Rerun hardware queues after having called .put_budget() Bart Van Assche
  2017-12-01  0:08 ` [PATCH 7/7] blk-mq: Fix another queue stall Bart Van Assche
  6 siblings, 1 reply; 21+ messages in thread
From: Bart Van Assche @ 2017-12-01  0:08 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Bart Van Assche, Ming Lei,
	Omar Sandoval, Hannes Reinecke, Johannes Thumshirn

The blk_mq_sched_restart() call from inside blk_mq_free_request()
only runs those queues for which BLK_MQ_S_SCHED_RESTART has been
set. Hence set that flag from inside blk_mq_mark_tag_wait() whether
or not a queue is shared.

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
---
 block/blk-mq-sched.c | 2 +-
 block/blk-mq.c       | 4 ++--
 block/blk-mq.h       | 2 ++
 3 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index c4e0cb5f6f1f..398545d94521 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -54,7 +54,7 @@ void blk_mq_sched_assign_ioc(struct request *rq, struct bio *bio)
  * Mark a hardware queue as needing a restart. For shared queues, maintain
  * a count of how many hardware queues are marked for restart.
  */
-static void blk_mq_sched_mark_restart_hctx(struct blk_mq_hw_ctx *hctx)
+void blk_mq_sched_mark_restart_hctx(struct blk_mq_hw_ctx *hctx)
 {
 	if (test_bit(BLK_MQ_S_SCHED_RESTART, &hctx->state))
 		return;
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 26fec4dfa40f..3e0ce940377f 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1036,9 +1036,9 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
 	wait_queue_entry_t *wait;
 	bool ret;
 
+	blk_mq_sched_mark_restart_hctx(this_hctx);
+
 	if (!shared_tags) {
-		if (!test_bit(BLK_MQ_S_SCHED_RESTART, &this_hctx->state))
-			set_bit(BLK_MQ_S_SCHED_RESTART, &this_hctx->state);
 		ret = blk_mq_get_driver_tag(rq, hctx, false);
 		/*
 		 * Don't clear RESTART here, someone else could have set it.
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 6c7c3ff5bf62..eb3c93aeb8b3 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -139,6 +139,8 @@ static inline bool blk_mq_hw_queue_mapped(struct blk_mq_hw_ctx *hctx)
 void blk_mq_in_flight(struct request_queue *q, struct hd_struct *part,
 			unsigned int inflight[2]);
 
+void blk_mq_sched_mark_restart_hctx(struct blk_mq_hw_ctx *hctx);
+
 static inline void blk_mq_put_dispatch_budget(struct blk_mq_hw_ctx *hctx)
 {
 	struct request_queue *q = hctx->queue;
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 6/7] blk-mq: Rerun hardware queues after having called .put_budget()
  2017-12-01  0:08 [PATCH 0/7] blk-mq: Queue running fixes Bart Van Assche
                   ` (4 preceding siblings ...)
  2017-12-01  0:08 ` [PATCH 5/7] blk-mq: Avoid that blk_mq_mark_tag_wait() triggers a queue stall Bart Van Assche
@ 2017-12-01  0:08 ` Bart Van Assche
  2017-12-01  3:23   ` Ming Lei
  2017-12-01  0:08 ` [PATCH 7/7] blk-mq: Fix another queue stall Bart Van Assche
  6 siblings, 1 reply; 21+ messages in thread
From: Bart Van Assche @ 2017-12-01  0:08 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Bart Van Assche, Ming Lei,
	Omar Sandoval, Hannes Reinecke, Johannes Thumshirn, stable

During request dispatch, after a scheduler or per-CPU queue has
been examined, .put_budget() is called if the examined queue is
empty. Since a new request may be queued concurrently with the
.put_budget() call, a request queue needs to be rerun after each
.put_budget() call.

Fixes: commit 1f460b63d4b3 ("blk-mq: don't restart queue when .get_budget returns BLK_STS_RESOURCE")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: <stable@vger.kernel.org>
---
 block/blk-mq-sched.c | 39 ++++++++++++++++++++-------------------
 block/blk-mq-sched.h |  2 +-
 block/blk-mq.c       | 17 ++++++++++++-----
 3 files changed, 33 insertions(+), 25 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 398545d94521..3a935081a2d3 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -82,12 +82,8 @@ static bool blk_mq_sched_restart_hctx(struct blk_mq_hw_ctx *hctx)
 	return blk_mq_run_hw_queue(hctx, true);
 }
 
-/*
- * Only SCSI implements .get_budget and .put_budget, and SCSI restarts
- * its queue by itself in its completion handler, so we don't need to
- * restart queue if .get_budget() returns BLK_STS_NO_RESOURCE.
- */
-static void blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx)
+/* returns true if hctx needs to be run again */
+static bool blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx)
 {
 	struct request_queue *q = hctx->queue;
 	struct elevator_queue *e = q->elevator;
@@ -106,7 +102,7 @@ static void blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx)
 		rq = e->type->ops.mq.dispatch_request(hctx);
 		if (!rq) {
 			blk_mq_put_dispatch_budget(hctx);
-			break;
+			return true;
 		}
 
 		/*
@@ -116,6 +112,8 @@ static void blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx)
 		 */
 		list_add(&rq->queuelist, &rq_list);
 	} while (blk_mq_dispatch_rq_list(q, &rq_list, true));
+
+	return false;
 }
 
 static struct blk_mq_ctx *blk_mq_next_ctx(struct blk_mq_hw_ctx *hctx,
@@ -129,16 +127,13 @@ static struct blk_mq_ctx *blk_mq_next_ctx(struct blk_mq_hw_ctx *hctx,
 	return hctx->ctxs[idx];
 }
 
-/*
- * Only SCSI implements .get_budget and .put_budget, and SCSI restarts
- * its queue by itself in its completion handler, so we don't need to
- * restart queue if .get_budget() returns BLK_STS_NO_RESOURCE.
- */
-static void blk_mq_do_dispatch_ctx(struct blk_mq_hw_ctx *hctx)
+/* returns true if hctx needs to be run again */
+static bool blk_mq_do_dispatch_ctx(struct blk_mq_hw_ctx *hctx)
 {
 	struct request_queue *q = hctx->queue;
 	LIST_HEAD(rq_list);
 	struct blk_mq_ctx *ctx = READ_ONCE(hctx->dispatch_from);
+	bool ret = false;
 
 	do {
 		struct request *rq;
@@ -152,6 +147,7 @@ static void blk_mq_do_dispatch_ctx(struct blk_mq_hw_ctx *hctx)
 		rq = blk_mq_dequeue_from_ctx(hctx, ctx);
 		if (!rq) {
 			blk_mq_put_dispatch_budget(hctx);
+			ret = true;
 			break;
 		}
 
@@ -168,19 +164,22 @@ static void blk_mq_do_dispatch_ctx(struct blk_mq_hw_ctx *hctx)
 	} while (blk_mq_dispatch_rq_list(q, &rq_list, true));
 
 	WRITE_ONCE(hctx->dispatch_from, ctx);
+
+	return ret;
 }
 
 /* return true if hw queue need to be run again */
-void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
+bool blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
 {
 	struct request_queue *q = hctx->queue;
 	struct elevator_queue *e = q->elevator;
 	const bool has_sched_dispatch = e && e->type->ops.mq.dispatch_request;
 	LIST_HEAD(rq_list);
+	bool run_queue = false;
 
 	/* RCU or SRCU read lock is needed before checking quiesced flag */
 	if (unlikely(blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)))
-		return;
+		return false;
 
 	hctx->run++;
 
@@ -212,12 +211,12 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
 	if (!list_empty(&rq_list)) {
 		if (blk_mq_dispatch_rq_list(q, &rq_list, false)) {
 			if (has_sched_dispatch)
-				blk_mq_do_dispatch_sched(hctx);
+				run_queue = blk_mq_do_dispatch_sched(hctx);
 			else
-				blk_mq_do_dispatch_ctx(hctx);
+				run_queue = blk_mq_do_dispatch_ctx(hctx);
 		}
 	} else if (has_sched_dispatch) {
-		blk_mq_do_dispatch_sched(hctx);
+		run_queue = blk_mq_do_dispatch_sched(hctx);
 	} else if (q->mq_ops->get_budget) {
 		/*
 		 * If we need to get budget before queuing request, we
@@ -227,11 +226,13 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
 		 * TODO: get more budgets, and dequeue more requests in
 		 * one time.
 		 */
-		blk_mq_do_dispatch_ctx(hctx);
+		run_queue = blk_mq_do_dispatch_ctx(hctx);
 	} else {
 		blk_mq_flush_busy_ctxs(hctx, &rq_list);
 		blk_mq_dispatch_rq_list(q, &rq_list, false);
 	}
+
+	return run_queue;
 }
 
 bool blk_mq_sched_try_merge(struct request_queue *q, struct bio *bio,
diff --git a/block/blk-mq-sched.h b/block/blk-mq-sched.h
index ba1d1418a96d..1ccfb8027cfc 100644
--- a/block/blk-mq-sched.h
+++ b/block/blk-mq-sched.h
@@ -23,7 +23,7 @@ void blk_mq_sched_insert_requests(struct request_queue *q,
 				  struct blk_mq_ctx *ctx,
 				  struct list_head *list, bool run_queue_async);
 
-void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx);
+bool blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx);
 
 int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e);
 void blk_mq_exit_sched(struct request_queue *q, struct elevator_queue *e);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 3e0ce940377f..b4225f606737 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1079,7 +1079,7 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list,
 {
 	struct blk_mq_hw_ctx *hctx;
 	struct request *rq, *nxt;
-	bool no_tag = false;
+	bool restart = false, no_tag = false;
 	int errors, queued;
 
 	if (list_empty(list))
@@ -1105,8 +1105,10 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list,
 			 * we'll re-run it below.
 			 */
 			if (!blk_mq_mark_tag_wait(&hctx, rq)) {
-				if (got_budget)
+				if (got_budget) {
 					blk_mq_put_dispatch_budget(hctx);
+					restart = true;
+				}
 				/*
 				 * For non-shared tags, the RESTART check
 				 * will suffice.
@@ -1193,7 +1195,8 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list,
 		 *   returning BLK_STS_RESOURCE. Two exceptions are scsi-mq
 		 *   and dm-rq.
 		 */
-		if (!blk_mq_sched_needs_restart(hctx) ||
+		if (restart ||
+		    !blk_mq_sched_needs_restart(hctx) ||
 		    (no_tag && list_empty_careful(&hctx->dispatch_wait.entry)))
 			blk_mq_run_hw_queue(hctx, true);
 	}
@@ -1204,6 +1207,7 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list,
 static void __blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx)
 {
 	int srcu_idx;
+	bool run_queue;
 
 	/*
 	 * We should be running this queue from one of the CPUs that
@@ -1220,15 +1224,18 @@ static void __blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx)
 
 	if (!(hctx->flags & BLK_MQ_F_BLOCKING)) {
 		rcu_read_lock();
-		blk_mq_sched_dispatch_requests(hctx);
+		run_queue = blk_mq_sched_dispatch_requests(hctx);
 		rcu_read_unlock();
 	} else {
 		might_sleep();
 
 		srcu_idx = srcu_read_lock(hctx->queue_rq_srcu);
-		blk_mq_sched_dispatch_requests(hctx);
+		run_queue = blk_mq_sched_dispatch_requests(hctx);
 		srcu_read_unlock(hctx->queue_rq_srcu, srcu_idx);
 	}
+
+	if (run_queue)
+		blk_mq_sched_restart(hctx);
 }
 
 /*
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 7/7] blk-mq: Fix another queue stall
  2017-12-01  0:08 [PATCH 0/7] blk-mq: Queue running fixes Bart Van Assche
                   ` (5 preceding siblings ...)
  2017-12-01  0:08 ` [PATCH 6/7] blk-mq: Rerun hardware queues after having called .put_budget() Bart Van Assche
@ 2017-12-01  0:08 ` Bart Van Assche
  2017-12-01  3:00   ` Jens Axboe
  6 siblings, 1 reply; 21+ messages in thread
From: Bart Van Assche @ 2017-12-01  0:08 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Bart Van Assche, Ming Lei,
	Omar Sandoval, Hannes Reinecke, Johannes Thumshirn

The following code at the end of blk_mq_dispatch_rq_list() detects
whether or not wake_up(&hctx->dispatch_wait) has been called
concurrently with pushing back requests onto the dispatch list:

    list_empty_careful(&hctx->dispatch_wait.entry)

Since blk_mq_dispatch_wake() is protected by another lock than the
dispatch list and since blk_mq_run_hw_queue() does not acquire any
lock if it notices that no requests are pending,
blk_mq_dispatch_wake() is not ordered against the code that pushes
back requests onto the dispatch list. Avoid that the dispatch_wait
empty check fails due to load/store reordering by serializing it
against the dispatch_wait queue wakeup. This patch fixes a queue
stall I ran into while testing a SCSI initiator driver with the
maximum target depth set to one.

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
---
 block/blk-mq.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index b4225f606737..a11767a4d95c 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1074,6 +1074,20 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
 	return ret;
 }
 
+static bool blk_mq_dispatch_list_empty(struct blk_mq_hw_ctx *hctx)
+{
+	struct sbq_wait_state *ws = bt_wait_ptr(&hctx->tags->bitmap_tags, hctx);
+	struct wait_queue_head *wq_head = &ws->wait;
+	unsigned long flags;
+	bool result;
+
+	spin_lock_irqsave(&wq_head->lock, flags);
+	result = list_empty(&hctx->dispatch_wait.entry);
+	spin_unlock_irqrestore(&wq_head->lock, flags);
+
+	return result;
+}
+
 bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list,
 			     bool got_budget)
 {
@@ -1197,7 +1211,7 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list,
 		 */
 		if (restart ||
 		    !blk_mq_sched_needs_restart(hctx) ||
-		    (no_tag && list_empty_careful(&hctx->dispatch_wait.entry)))
+		    (no_tag && blk_mq_dispatch_list_empty(hctx)))
 			blk_mq_run_hw_queue(hctx, true);
 	}
 
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH 4/7] blk-mq: Avoid that request processing stalls when sharing tags
  2017-12-01  0:08 ` [PATCH 4/7] blk-mq: Avoid that request processing stalls when sharing tags Bart Van Assche
@ 2017-12-01  2:58   ` Ming Lei
  2017-12-01 19:52     ` Bart Van Assche
  0 siblings, 1 reply; 21+ messages in thread
From: Ming Lei @ 2017-12-01  2:58 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, linux-block, Christoph Hellwig, Omar Sandoval,
	Hannes Reinecke, Johannes Thumshirn

On Thu, Nov 30, 2017 at 04:08:45PM -0800, Bart Van Assche wrote:
> blk_mq_sched_mark_restart_hctx() must be called before

Could you please describe the theory on commit log? Like, why is it
a must? and what is the issue to be fixed? 

> blk_mq_dispatch_rq_list() is called. Make sure that
> BLK_MQ_S_SCHED_RESTART is set before any blk_mq_dispatch_rq_list()
> call occurs.
> 
> Fixes: commit b347689ffbca ("blk-mq-sched: improve dispatching from sw queue")

We always mark RESTART state bit just before dispatching from ->dispatch_list,
this way has been there before b347689ffbca, which doesn't change this
RESTART mechanism, so please explain a bit why it is a fix on commit
b347689ffbca.

> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
> Cc: Ming Lei <ming.lei@redhat.com>
> Cc: Omar Sandoval <osandov@fb.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Hannes Reinecke <hare@suse.de>
> Cc: Johannes Thumshirn <jthumshirn@suse.de>
> ---
>  block/blk-mq-sched.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
> index d8e3533d3218..c4e0cb5f6f1f 100644
> --- a/block/blk-mq-sched.c
> +++ b/block/blk-mq-sched.c
> @@ -208,8 +208,8 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
>  	 * on the dispatch list or we were able to dispatch from the
>  	 * dispatch list.
>  	 */
> +	blk_mq_sched_mark_restart_hctx(hctx);

This looks over-kill.

It means RESTART has to be done from request's completion after each new dispatch.

>  	if (!list_empty(&rq_list)) {
> -		blk_mq_sched_mark_restart_hctx(hctx);
>  		if (blk_mq_dispatch_rq_list(q, &rq_list, false)) {
>  			if (has_sched_dispatch)
>  				blk_mq_do_dispatch_sched(hctx);
> -- 
> 2.15.0
> 

-- 
Ming

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 7/7] blk-mq: Fix another queue stall
  2017-12-01  0:08 ` [PATCH 7/7] blk-mq: Fix another queue stall Bart Van Assche
@ 2017-12-01  3:00   ` Jens Axboe
  0 siblings, 0 replies; 21+ messages in thread
From: Jens Axboe @ 2017-12-01  3:00 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-block, Christoph Hellwig, Ming Lei, Omar Sandoval,
	Hannes Reinecke, Johannes Thumshirn

On 11/30/2017 05:08 PM, Bart Van Assche wrote:
> The following code at the end of blk_mq_dispatch_rq_list() detects
> whether or not wake_up(&hctx->dispatch_wait) has been called
> concurrently with pushing back requests onto the dispatch list:
> 
>     list_empty_careful(&hctx->dispatch_wait.entry)
> 
> Since blk_mq_dispatch_wake() is protected by another lock than the
> dispatch list and since blk_mq_run_hw_queue() does not acquire any
> lock if it notices that no requests are pending,
> blk_mq_dispatch_wake() is not ordered against the code that pushes
> back requests onto the dispatch list. Avoid that the dispatch_wait
> empty check fails due to load/store reordering by serializing it
> against the dispatch_wait queue wakeup. This patch fixes a queue
> stall I ran into while testing a SCSI initiator driver with the
> maximum target depth set to one.
> 
> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
> Cc: Ming Lei <ming.lei@redhat.com>
> Cc: Omar Sandoval <osandov@fb.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Hannes Reinecke <hare@suse.de>
> Cc: Johannes Thumshirn <jthumshirn@suse.de>
> ---
>  block/blk-mq.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index b4225f606737..a11767a4d95c 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1074,6 +1074,20 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
>  	return ret;
>  }
>  
> +static bool blk_mq_dispatch_list_empty(struct blk_mq_hw_ctx *hctx)
> +{
> +	struct sbq_wait_state *ws = bt_wait_ptr(&hctx->tags->bitmap_tags, hctx);
> +	struct wait_queue_head *wq_head = &ws->wait;
> +	unsigned long flags;
> +	bool result;
> +
> +	spin_lock_irqsave(&wq_head->lock, flags);
> +	result = list_empty(&hctx->dispatch_wait.entry);
> +	spin_unlock_irqrestore(&wq_head->lock, flags);
> +
> +	return result;
> +}

This can't fix anything, since you're still depending on the state
outside the lock. You probably just changed the window slightly.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/7] block: Document more locking requirements
  2017-12-01  0:08 ` [PATCH 2/7] block: Document more locking requirements Bart Van Assche
@ 2017-12-01  3:03   ` Jens Axboe
  2017-12-01 17:05     ` Bart Van Assche
  0 siblings, 1 reply; 21+ messages in thread
From: Jens Axboe @ 2017-12-01  3:03 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-block, Christoph Hellwig, Hannes Reinecke, Johannes Thumshirn

On 11/30/2017 05:08 PM, Bart Van Assche wrote:
> This patch does not change any functionality.

Unless these actually found real bugs, I think it's pointless.
Add a comment. And this:
 
> @@ -1003,9 +1007,14 @@ bool blk_mq_get_driver_tag(struct request *rq, struct blk_mq_hw_ctx **hctx,
>  static int blk_mq_dispatch_wake(wait_queue_entry_t *wait, unsigned mode,
>  				int flags, void *key)
>  {
> -	struct blk_mq_hw_ctx *hctx;
> +	struct blk_mq_hw_ctx *hctx =
> +		container_of(wait, struct blk_mq_hw_ctx, dispatch_wait);
> +
> +#ifdef CONFIG_LOCKDEP
> +	struct sbq_wait_state *ws = bt_wait_ptr(&hctx->tags->bitmap_tags, hctx);
>  
> -	hctx = container_of(wait, struct blk_mq_hw_ctx, dispatch_wait);
> +	lockdep_assert_held(&ws->wait.lock);
> +#endif

we're definitely not doing.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 6/7] blk-mq: Rerun hardware queues after having called .put_budget()
  2017-12-01  0:08 ` [PATCH 6/7] blk-mq: Rerun hardware queues after having called .put_budget() Bart Van Assche
@ 2017-12-01  3:23   ` Ming Lei
  0 siblings, 0 replies; 21+ messages in thread
From: Ming Lei @ 2017-12-01  3:23 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, linux-block, Christoph Hellwig, Omar Sandoval,
	Hannes Reinecke, Johannes Thumshirn, stable

On Thu, Nov 30, 2017 at 04:08:47PM -0800, Bart Van Assche wrote:
> During request dispatch, after a scheduler or per-CPU queue has
> been examined, .put_budget() is called if the examined queue is
> empty. Since a new request may be queued concurrently with the
> .put_budget() call, a request queue needs to be rerun after each
> .put_budget() call.

If a request is queued concurrently from another path, it can be run
from that path, so don't need to rerun in __blk_mq_run_hw_queue().

> 
> Fixes: commit 1f460b63d4b3 ("blk-mq: don't restart queue when .get_budget returns BLK_STS_RESOURCE")
> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
> Cc: Ming Lei <ming.lei@redhat.com>
> Cc: Omar Sandoval <osandov@fb.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Hannes Reinecke <hare@suse.de>
> Cc: Johannes Thumshirn <jthumshirn@suse.de>
> Cc: <stable@vger.kernel.org>
> ---
>  block/blk-mq-sched.c | 39 ++++++++++++++++++++-------------------
>  block/blk-mq-sched.h |  2 +-
>  block/blk-mq.c       | 17 ++++++++++++-----
>  3 files changed, 33 insertions(+), 25 deletions(-)
> 
> diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
> index 398545d94521..3a935081a2d3 100644
> --- a/block/blk-mq-sched.c
> +++ b/block/blk-mq-sched.c
> @@ -82,12 +82,8 @@ static bool blk_mq_sched_restart_hctx(struct blk_mq_hw_ctx *hctx)
>  	return blk_mq_run_hw_queue(hctx, true);
>  }
>  
> -/*
> - * Only SCSI implements .get_budget and .put_budget, and SCSI restarts
> - * its queue by itself in its completion handler, so we don't need to
> - * restart queue if .get_budget() returns BLK_STS_NO_RESOURCE.
> - */
> -static void blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx)
> +/* returns true if hctx needs to be run again */
> +static bool blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx)
>  {
>  	struct request_queue *q = hctx->queue;
>  	struct elevator_queue *e = q->elevator;
> @@ -106,7 +102,7 @@ static void blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx)
>  		rq = e->type->ops.mq.dispatch_request(hctx);
>  		if (!rq) {
>  			blk_mq_put_dispatch_budget(hctx);
> -			break;
> +			return true;
>  		}
>  
>  		/*
> @@ -116,6 +112,8 @@ static void blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx)
>  		 */
>  		list_add(&rq->queuelist, &rq_list);
>  	} while (blk_mq_dispatch_rq_list(q, &rq_list, true));
> +
> +	return false;
>  }
>  
>  static struct blk_mq_ctx *blk_mq_next_ctx(struct blk_mq_hw_ctx *hctx,
> @@ -129,16 +127,13 @@ static struct blk_mq_ctx *blk_mq_next_ctx(struct blk_mq_hw_ctx *hctx,
>  	return hctx->ctxs[idx];
>  }
>  
> -/*
> - * Only SCSI implements .get_budget and .put_budget, and SCSI restarts
> - * its queue by itself in its completion handler, so we don't need to
> - * restart queue if .get_budget() returns BLK_STS_NO_RESOURCE.
> - */
> -static void blk_mq_do_dispatch_ctx(struct blk_mq_hw_ctx *hctx)
> +/* returns true if hctx needs to be run again */
> +static bool blk_mq_do_dispatch_ctx(struct blk_mq_hw_ctx *hctx)
>  {
>  	struct request_queue *q = hctx->queue;
>  	LIST_HEAD(rq_list);
>  	struct blk_mq_ctx *ctx = READ_ONCE(hctx->dispatch_from);
> +	bool ret = false;
>  
>  	do {
>  		struct request *rq;
> @@ -152,6 +147,7 @@ static void blk_mq_do_dispatch_ctx(struct blk_mq_hw_ctx *hctx)
>  		rq = blk_mq_dequeue_from_ctx(hctx, ctx);
>  		if (!rq) {
>  			blk_mq_put_dispatch_budget(hctx);
> +			ret = true;
>  			break;
>  		}
>  
> @@ -168,19 +164,22 @@ static void blk_mq_do_dispatch_ctx(struct blk_mq_hw_ctx *hctx)
>  	} while (blk_mq_dispatch_rq_list(q, &rq_list, true));
>  
>  	WRITE_ONCE(hctx->dispatch_from, ctx);
> +
> +	return ret;
>  }
>  
>  /* return true if hw queue need to be run again */
> -void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
> +bool blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
>  {
>  	struct request_queue *q = hctx->queue;
>  	struct elevator_queue *e = q->elevator;
>  	const bool has_sched_dispatch = e && e->type->ops.mq.dispatch_request;
>  	LIST_HEAD(rq_list);
> +	bool run_queue = false;
>  
>  	/* RCU or SRCU read lock is needed before checking quiesced flag */
>  	if (unlikely(blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)))
> -		return;
> +		return false;
>  
>  	hctx->run++;
>  
> @@ -212,12 +211,12 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
>  	if (!list_empty(&rq_list)) {
>  		if (blk_mq_dispatch_rq_list(q, &rq_list, false)) {
>  			if (has_sched_dispatch)
> -				blk_mq_do_dispatch_sched(hctx);
> +				run_queue = blk_mq_do_dispatch_sched(hctx);
>  			else
> -				blk_mq_do_dispatch_ctx(hctx);
> +				run_queue = blk_mq_do_dispatch_ctx(hctx);
>  		}
>  	} else if (has_sched_dispatch) {
> -		blk_mq_do_dispatch_sched(hctx);
> +		run_queue = blk_mq_do_dispatch_sched(hctx);
>  	} else if (q->mq_ops->get_budget) {
>  		/*
>  		 * If we need to get budget before queuing request, we
> @@ -227,11 +226,13 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
>  		 * TODO: get more budgets, and dequeue more requests in
>  		 * one time.
>  		 */
> -		blk_mq_do_dispatch_ctx(hctx);
> +		run_queue = blk_mq_do_dispatch_ctx(hctx);
>  	} else {
>  		blk_mq_flush_busy_ctxs(hctx, &rq_list);
>  		blk_mq_dispatch_rq_list(q, &rq_list, false);
>  	}
> +
> +	return run_queue;
>  }
>  
>  bool blk_mq_sched_try_merge(struct request_queue *q, struct bio *bio,
> diff --git a/block/blk-mq-sched.h b/block/blk-mq-sched.h
> index ba1d1418a96d..1ccfb8027cfc 100644
> --- a/block/blk-mq-sched.h
> +++ b/block/blk-mq-sched.h
> @@ -23,7 +23,7 @@ void blk_mq_sched_insert_requests(struct request_queue *q,
>  				  struct blk_mq_ctx *ctx,
>  				  struct list_head *list, bool run_queue_async);
>  
> -void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx);
> +bool blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx);
>  
>  int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e);
>  void blk_mq_exit_sched(struct request_queue *q, struct elevator_queue *e);
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 3e0ce940377f..b4225f606737 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1079,7 +1079,7 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list,
>  {
>  	struct blk_mq_hw_ctx *hctx;
>  	struct request *rq, *nxt;
> -	bool no_tag = false;
> +	bool restart = false, no_tag = false;
>  	int errors, queued;
>  
>  	if (list_empty(list))
> @@ -1105,8 +1105,10 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list,
>  			 * we'll re-run it below.
>  			 */
>  			if (!blk_mq_mark_tag_wait(&hctx, rq)) {
> -				if (got_budget)
> +				if (got_budget) {
>  					blk_mq_put_dispatch_budget(hctx);
> +					restart = true;
> +				}
>  				/*
>  				 * For non-shared tags, the RESTART check
>  				 * will suffice.
> @@ -1193,7 +1195,8 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list,
>  		 *   returning BLK_STS_RESOURCE. Two exceptions are scsi-mq
>  		 *   and dm-rq.
>  		 */
> -		if (!blk_mq_sched_needs_restart(hctx) ||
> +		if (restart ||
> +		    !blk_mq_sched_needs_restart(hctx) ||
>  		    (no_tag && list_empty_careful(&hctx->dispatch_wait.entry)))
>  			blk_mq_run_hw_queue(hctx, true);
>  	}
> @@ -1204,6 +1207,7 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list,
>  static void __blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx)
>  {
>  	int srcu_idx;
> +	bool run_queue;
>  
>  	/*
>  	 * We should be running this queue from one of the CPUs that
> @@ -1220,15 +1224,18 @@ static void __blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx)
>  
>  	if (!(hctx->flags & BLK_MQ_F_BLOCKING)) {
>  		rcu_read_lock();
> -		blk_mq_sched_dispatch_requests(hctx);
> +		run_queue = blk_mq_sched_dispatch_requests(hctx);
>  		rcu_read_unlock();
>  	} else {
>  		might_sleep();
>  
>  		srcu_idx = srcu_read_lock(hctx->queue_rq_srcu);
> -		blk_mq_sched_dispatch_requests(hctx);
> +		run_queue = blk_mq_sched_dispatch_requests(hctx);
>  		srcu_read_unlock(hctx->queue_rq_srcu, srcu_idx);
>  	}
> +
> +	if (run_queue)
> +		blk_mq_sched_restart(hctx);
>  }
>  
>  /*
> -- 
> 2.15.0
> 

-- 
Ming

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 5/7] blk-mq: Avoid that blk_mq_mark_tag_wait() triggers a queue stall
  2017-12-01  0:08 ` [PATCH 5/7] blk-mq: Avoid that blk_mq_mark_tag_wait() triggers a queue stall Bart Van Assche
@ 2017-12-01  3:51   ` Ming Lei
  0 siblings, 0 replies; 21+ messages in thread
From: Ming Lei @ 2017-12-01  3:51 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, linux-block, Christoph Hellwig, Omar Sandoval,
	Hannes Reinecke, Johannes Thumshirn

On Thu, Nov 30, 2017 at 04:08:46PM -0800, Bart Van Assche wrote:
> The blk_mq_sched_restart() call from inside blk_mq_free_request()
> only runs those queues for which BLK_MQ_S_SCHED_RESTART has been
> set. Hence set that flag from inside blk_mq_mark_tag_wait() whether
> or not a queue is shared.
> 
> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
> Cc: Ming Lei <ming.lei@redhat.com>
> Cc: Omar Sandoval <osandov@fb.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Hannes Reinecke <hare@suse.de>
> Cc: Johannes Thumshirn <jthumshirn@suse.de>
> ---
>  block/blk-mq-sched.c | 2 +-
>  block/blk-mq.c       | 4 ++--
>  block/blk-mq.h       | 2 ++
>  3 files changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
> index c4e0cb5f6f1f..398545d94521 100644
> --- a/block/blk-mq-sched.c
> +++ b/block/blk-mq-sched.c
> @@ -54,7 +54,7 @@ void blk_mq_sched_assign_ioc(struct request *rq, struct bio *bio)
>   * Mark a hardware queue as needing a restart. For shared queues, maintain
>   * a count of how many hardware queues are marked for restart.
>   */
> -static void blk_mq_sched_mark_restart_hctx(struct blk_mq_hw_ctx *hctx)
> +void blk_mq_sched_mark_restart_hctx(struct blk_mq_hw_ctx *hctx)
>  {
>  	if (test_bit(BLK_MQ_S_SCHED_RESTART, &hctx->state))
>  		return;
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 26fec4dfa40f..3e0ce940377f 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1036,9 +1036,9 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
>  	wait_queue_entry_t *wait;
>  	bool ret;
>  
> +	blk_mq_sched_mark_restart_hctx(this_hctx);
> +
>  	if (!shared_tags) {
> -		if (!test_bit(BLK_MQ_S_SCHED_RESTART, &this_hctx->state))
> -			set_bit(BLK_MQ_S_SCHED_RESTART, &this_hctx->state);

On the contrary, the above two lines should be removed simply, because
this bit has to be set after the req is added to hctx->dispatch_list,
otherwise the RESTART for this rq may be missed. Seems it is a issue
introduced by f906a6a0f4268(blk-mq: improve tag waiting setup for
non-shared tags).

And the bit can be set in blk_mq_sched_dispatch_requests() if the
following get_driver_tag fails.

>  		ret = blk_mq_get_driver_tag(rq, hctx, false);
>  		/*
>  		 * Don't clear RESTART here, someone else could have set it.
> diff --git a/block/blk-mq.h b/block/blk-mq.h
> index 6c7c3ff5bf62..eb3c93aeb8b3 100644
> --- a/block/blk-mq.h
> +++ b/block/blk-mq.h
> @@ -139,6 +139,8 @@ static inline bool blk_mq_hw_queue_mapped(struct blk_mq_hw_ctx *hctx)
>  void blk_mq_in_flight(struct request_queue *q, struct hd_struct *part,
>  			unsigned int inflight[2]);
>  
> +void blk_mq_sched_mark_restart_hctx(struct blk_mq_hw_ctx *hctx);
> +
>  static inline void blk_mq_put_dispatch_budget(struct blk_mq_hw_ctx *hctx)
>  {
>  	struct request_queue *q = hctx->queue;
> -- 
> 2.15.0
> 

-- 
Ming

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/7] block: Document more locking requirements
  2017-12-01  3:03   ` Jens Axboe
@ 2017-12-01 17:05     ` Bart Van Assche
  0 siblings, 0 replies; 21+ messages in thread
From: Bart Van Assche @ 2017-12-01 17:05 UTC (permalink / raw)
  To: axboe; +Cc: hch, linux-block, hare, jthumshirn

T24gVGh1LCAyMDE3LTExLTMwIGF0IDIwOjAzIC0wNzAwLCBKZW5zIEF4Ym9lIHdyb3RlOg0KPiBP
biAxMS8zMC8yMDE3IDA1OjA4IFBNLCBCYXJ0IFZhbiBBc3NjaGUgd3JvdGU6DQo+ID4gVGhpcyBw
YXRjaCBkb2VzIG5vdCBjaGFuZ2UgYW55IGZ1bmN0aW9uYWxpdHkuDQo+IA0KPiBVbmxlc3MgdGhl
c2UgYWN0dWFsbHkgZm91bmQgcmVhbCBidWdzLCBJIHRoaW5rIGl0J3MgcG9pbnRsZXNzLg0KPiBB
ZGQgYSBjb21tZW50Lg0KIA0KSGVsbG8gSmVucywNCg0KQXMgeW91IGtub3cgbG9ja2RlcF9hc3Nl
cnRfaGVsZCgpIHN0YXRlbWVudHMgYXJlIHZlcmlmaWVkIGF0IHJ1bnRpbWUgd2l0aA0KTE9DS0RF
UCBlbmFibGVkIGJ1dCBjb21tZW50cyBub3QuIEhlbmNlIG15IHByZWZlcmVuY2UgZm9yIGxvY2tk
ZXBfYXNzZXJ0X2hlbGQoKQ0Kb3ZlciBzb3VyY2UgY29kZSBjb21tZW50cy4NCg0KPiBBbmQgdGhp
czoNCj4NCj4gPiBAQCAtMTAwMyw5ICsxMDA3LDE0IEBAIGJvb2wgYmxrX21xX2dldF9kcml2ZXJf
dGFnKHN0cnVjdCByZXF1ZXN0ICpycSwgc3RydWN0IGJsa19tcV9od19jdHggKipoY3R4LA0KPiA+
ICBzdGF0aWMgaW50IGJsa19tcV9kaXNwYXRjaF93YWtlKHdhaXRfcXVldWVfZW50cnlfdCAqd2Fp
dCwgdW5zaWduZWQgbW9kZSwNCj4gPiAgCQkJCWludCBmbGFncywgdm9pZCAqa2V5KQ0KPiA+ICB7
DQo+ID4gLQlzdHJ1Y3QgYmxrX21xX2h3X2N0eCAqaGN0eDsNCj4gPiArCXN0cnVjdCBibGtfbXFf
aHdfY3R4ICpoY3R4ID0NCj4gPiArCQljb250YWluZXJfb2Yod2FpdCwgc3RydWN0IGJsa19tcV9o
d19jdHgsIGRpc3BhdGNoX3dhaXQpOw0KPiA+ICsNCj4gPiArI2lmZGVmIENPTkZJR19MT0NLREVQ
DQo+ID4gKwlzdHJ1Y3Qgc2JxX3dhaXRfc3RhdGUgKndzID0gYnRfd2FpdF9wdHIoJmhjdHgtPnRh
Z3MtPmJpdG1hcF90YWdzLCBoY3R4KTsNCj4gPiAgDQo+ID4gLQloY3R4ID0gY29udGFpbmVyX29m
KHdhaXQsIHN0cnVjdCBibGtfbXFfaHdfY3R4LCBkaXNwYXRjaF93YWl0KTsNCj4gPiArCWxvY2tk
ZXBfYXNzZXJ0X2hlbGQoJndzLT53YWl0LmxvY2spOw0KPiA+ICsjZW5kaWYNCj4gDQo+IHdlJ3Jl
IGRlZmluaXRlbHkgbm90IGRvaW5nLg0KDQpDYW4geW91IGV4cGxhaW4gbWUgd2h5IHlvdSB0aGlu
ayB0aGUgYWJvdmUgaXMgd3Jvbmc/IE15IHVuZGVyc3RhbmRpbmcgaXMNCnRoYXQgdGhlIGNhbGwg
Y2hhaW4gZm9yIHRoZSBhYm92ZSBmdW5jdGlvbiBpcyBhcyBmb2xsb3dzOg0KDQpibGtfbXFfdGFn
X3dha2V1cF9hbGwoKQ0KLT4gc2JpdG1hcF9xdWV1ZV93YWtlX2FsbCgpDQogIC0+IHdha2VfdXAo
KQ0KICAgIC0+IF9fd2FrZV91cCgpDQogICAgICAtPiBfX3dha2VfdXBfY29tbW9uX2xvY2soKQ0K
ICAgICAgICAtPiBzcGluX2xvY2tfaXJxc2F2ZSgmd3FfaGVhZC0+bG9jaywgZmxhZ3MpOw0KICAg
ICAgICAtPiBfX3dha2VfdXBfY29tbW9uKCkNCiAgICAgICAgICAtPiBsaXN0X2Zvcl9lYWNoX2Vu
dHJ5X3NhZmVfZnJvbShjdXJyLCBuZXh0LCAmd3FfaGVhZC0+aGVhZCwgZW50cnkpDQogICAgICAg
ICAgLT4gICByZXQgPSBjdXJyLT5mdW5jKGN1cnIsIG1vZGUsIHdha2VfZmxhZ3MsIGtleSkNCiAg
ICAgICAgLT4gc3Bpbl91bmxvY2tfaXJxcmVzdG9yZSgmd3FfaGVhZC0+bG9jaywgZmxhZ3MpOw0K
DQpJbiBvdGhlciB3b3JkcywgYmxrX21xX2Rpc3BhdGNoX3dha2UoKSBpcyBjYWxsZWQgd2l0aCB0
aGUgd2FpdCBxdWV1ZSBoZWFkDQpsb2NrIGhlbGQuDQoNCkJhcnQu

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 4/7] blk-mq: Avoid that request processing stalls when sharing tags
  2017-12-01  2:58   ` Ming Lei
@ 2017-12-01 19:52     ` Bart Van Assche
  2017-12-02  0:36       ` Ming Lei
  0 siblings, 1 reply; 21+ messages in thread
From: Bart Van Assche @ 2017-12-01 19:52 UTC (permalink / raw)
  To: ming.lei; +Cc: hch, hare, linux-block, osandov, jthumshirn, axboe

T24gRnJpLCAyMDE3LTEyLTAxIGF0IDEwOjU4ICswODAwLCBNaW5nIExlaSB3cm90ZToNCj4gT24g
VGh1LCBOb3YgMzAsIDIwMTcgYXQgMDQ6MDg6NDVQTSAtMDgwMCwgQmFydCBWYW4gQXNzY2hlIHdy
b3RlOg0KPiA+IGJsa19tcV9zY2hlZF9tYXJrX3Jlc3RhcnRfaGN0eCgpIG11c3QgYmUgY2FsbGVk
IGJlZm9yZQ0KPiANCj4gQ291bGQgeW91IHBsZWFzZSBkZXNjcmliZSB0aGUgdGhlb3J5IG9uIGNv
bW1pdCBsb2c/IExpa2UsIHdoeSBpcyBpdA0KPiBhIG11c3Q/IGFuZCB3aGF0IGlzIHRoZSBpc3N1
ZSB0byBiZSBmaXhlZD8gDQoNClRoZSBCTEtfTVFfU19TQ0hFRF9SRVNUQVJUIHRlc3QgYXQgdGhl
IGVuZCBvZiBibGtfbXFfZGlzcGF0Y2hfcnFfbGlzdCgpIGNhbg0Kb25seSB3b3JrIGlmIEJMS19N
UV9TX1NDSEVEX1JFU1RBUlQgaXMgc2V0IGJlZm9yZSBibGtfbXFfZGlzcGF0Y2hfcnFfbGlzdCgp
DQppcyBjYWxsZWQuIEJUVywgd2l0aG91dCB0aGlzIHBhdGNoIGV2ZXJ5IGl0ZXJhdGlvbiBvZiBt
eSB0ZXN0IHRyaWdnZXJzIGENCnF1ZXVlIHN0YWxsLiBXaXRoIHRoaXMgcGF0Y2ggYSBxdWV1ZSBz
dGFsbCBvbmx5IG9jY3VycyBzcG9yYWRpY2FsbHkgc28gSQ0KdGhpbmsgd2UgcmVhbGx5IG5lZWQg
c29tZXRoaW5nIGxpa2UgdGhpcyBwYXRjaC4NCg0KPiA+IGJsa19tcV9kaXNwYXRjaF9ycV9saXN0
KCkgaXMgY2FsbGVkLiBNYWtlIHN1cmUgdGhhdA0KPiA+IEJMS19NUV9TX1NDSEVEX1JFU1RBUlQg
aXMgc2V0IGJlZm9yZSBhbnkgYmxrX21xX2Rpc3BhdGNoX3JxX2xpc3QoKQ0KPiA+IGNhbGwgb2Nj
dXJzLg0KPiA+IA0KPiA+IEZpeGVzOiBjb21taXQgYjM0NzY4OWZmYmNhICgiYmxrLW1xLXNjaGVk
OiBpbXByb3ZlIGRpc3BhdGNoaW5nIGZyb20gc3cgcXVldWUiKQ0KPiANCj4gV2UgYWx3YXlzIG1h
cmsgUkVTVEFSVCBzdGF0ZSBiaXQganVzdCBiZWZvcmUgZGlzcGF0Y2hpbmcgZnJvbSAtPmRpc3Bh
dGNoX2xpc3QsDQo+IHRoaXMgd2F5IGhhcyBiZWVuIHRoZXJlIGJlZm9yZSBiMzQ3Njg5ZmZiY2Es
IHdoaWNoIGRvZXNuJ3QgY2hhbmdlIHRoaXMNCj4gUkVTVEFSVCBtZWNoYW5pc20sIHNvIHBsZWFz
ZSBleHBsYWluIGEgYml0IHdoeSBpdCBpcyBhIGZpeCBvbiBjb21taXQNCj4gYjM0NzY4OWZmYmNh
Lg0KDQpJJ20gbm90IGNvbXBsZXRlbHkgc3VyZSB3aGljaCBwYXRjaCBpbnRyb2R1Y2VkIHRoZSBs
b2NrdXAgZml4ZWQgYnkgdGhpcyBwYXRjaA0KYnV0IEkgd2lsbCBoYXZlIGFub3RoZXIgbG9vayB3
aGV0aGVyIHRoaXMgd2FzIHJlYWxseSBpbnRyb2R1Y2VkIGJ5IGNvbW1pdA0KYjM0NzY4OWZmYmNh
Lg0KDQpCYXJ0Lg==

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 4/7] blk-mq: Avoid that request processing stalls when sharing tags
  2017-12-01 19:52     ` Bart Van Assche
@ 2017-12-02  0:36       ` Ming Lei
  2017-12-02  0:48         ` Bart Van Assche
  0 siblings, 1 reply; 21+ messages in thread
From: Ming Lei @ 2017-12-02  0:36 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: hch, hare, linux-block, osandov, jthumshirn, axboe

On Fri, Dec 01, 2017 at 07:52:14PM +0000, Bart Van Assche wrote:
> On Fri, 2017-12-01 at 10:58 +0800, Ming Lei wrote:
> > On Thu, Nov 30, 2017 at 04:08:45PM -0800, Bart Van Assche wrote:
> > > blk_mq_sched_mark_restart_hctx() must be called before
> > 
> > Could you please describe the theory on commit log? Like, why is it
> > a must? and what is the issue to be fixed? 
> 
> The BLK_MQ_S_SCHED_RESTART test at the end of blk_mq_dispatch_rq_list() can
> only work if BLK_MQ_S_SCHED_RESTART is set before blk_mq_dispatch_rq_list()
> is called.

The theory about using BLK_MQ_S_SCHED_RESTART in current way is that we
mark it after requests are added to hctx->dispatch, then blk_mq_sched_restart() 
can see this request to be revisited.

So in theory, we don't need to set it before each dispatch.

Once .get_budget()/.put_budget() is introduced, things may be a bit
different because we may need to revisit requests in scheduler/SW queue.
But we depend on SCSI's RESTART(scsi_end_request()) to do that. So we
still don't need this patch.

> BTW, without this patch every iteration of my test triggers a
> queue stall. With this patch a queue stall only occurs sporadically so I
> think we really need something like this patch.

We need to root cause your queue stall first, otherwise any change can
be thought as workaround. Could you investigate the issue a bit and get
the exact reason?

> 
> > > blk_mq_dispatch_rq_list() is called. Make sure that
> > > BLK_MQ_S_SCHED_RESTART is set before any blk_mq_dispatch_rq_list()
> > > call occurs.
> > > 
> > > Fixes: commit b347689ffbca ("blk-mq-sched: improve dispatching from sw queue")
> > 
> > We always mark RESTART state bit just before dispatching from ->dispatch_list,
> > this way has been there before b347689ffbca, which doesn't change this
> > RESTART mechanism, so please explain a bit why it is a fix on commit
> > b347689ffbca.
> 
> I'm not completely sure which patch introduced the lockup fixed by this patch
> but I will have another look whether this was really introduced by commit
> b347689ffbca.

Please make sure 'Fixes' tag correct.

-- 
Ming

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 4/7] blk-mq: Avoid that request processing stalls when sharing tags
  2017-12-02  0:36       ` Ming Lei
@ 2017-12-02  0:48         ` Bart Van Assche
  2017-12-02  1:00           ` Ming Lei
  0 siblings, 1 reply; 21+ messages in thread
From: Bart Van Assche @ 2017-12-02  0:48 UTC (permalink / raw)
  To: ming.lei; +Cc: hch, jthumshirn, hare, linux-block, osandov, axboe

T24gU2F0LCAyMDE3LTEyLTAyIGF0IDA4OjM2ICswODAwLCBNaW5nIExlaSB3cm90ZToNCj4gT24g
RnJpLCBEZWMgMDEsIDIwMTcgYXQgMDc6NTI6MTRQTSArMDAwMCwgQmFydCBWYW4gQXNzY2hlIHdy
b3RlOg0KPiA+IE9uIEZyaSwgMjAxNy0xMi0wMSBhdCAxMDo1OCArMDgwMCwgTWluZyBMZWkgd3Jv
dGU6DQo+ID4gPiBPbiBUaHUsIE5vdiAzMCwgMjAxNyBhdCAwNDowODo0NVBNIC0wODAwLCBCYXJ0
IFZhbiBBc3NjaGUgd3JvdGU6DQo+ID4gPiA+IGJsa19tcV9kaXNwYXRjaF9ycV9saXN0KCkgaXMg
Y2FsbGVkLiBNYWtlIHN1cmUgdGhhdA0KPiA+ID4gPiBCTEtfTVFfU19TQ0hFRF9SRVNUQVJUIGlz
IHNldCBiZWZvcmUgYW55IGJsa19tcV9kaXNwYXRjaF9ycV9saXN0KCkNCj4gPiA+ID4gY2FsbCBv
Y2N1cnMuDQo+ID4gPiA+IA0KPiA+ID4gPiBGaXhlczogY29tbWl0IGIzNDc2ODlmZmJjYSAoImJs
ay1tcS1zY2hlZDogaW1wcm92ZSBkaXNwYXRjaGluZyBmcm9tIHN3IHF1ZXVlIikNCj4gPiA+IA0K
PiA+ID4gV2UgYWx3YXlzIG1hcmsgUkVTVEFSVCBzdGF0ZSBiaXQganVzdCBiZWZvcmUgZGlzcGF0
Y2hpbmcgZnJvbSAtPmRpc3BhdGNoX2xpc3QsDQo+ID4gPiB0aGlzIHdheSBoYXMgYmVlbiB0aGVy
ZSBiZWZvcmUgYjM0NzY4OWZmYmNhLCB3aGljaCBkb2Vzbid0IGNoYW5nZSB0aGlzDQo+ID4gPiBS
RVNUQVJUIG1lY2hhbmlzbSwgc28gcGxlYXNlIGV4cGxhaW4gYSBiaXQgd2h5IGl0IGlzIGEgZml4
IG9uIGNvbW1pdA0KPiA+ID4gYjM0NzY4OWZmYmNhLg0KPiA+IA0KPiA+IEknbSBub3QgY29tcGxl
dGVseSBzdXJlIHdoaWNoIHBhdGNoIGludHJvZHVjZWQgdGhlIGxvY2t1cCBmaXhlZCBieSB0aGlz
IHBhdGNoDQo+ID4gYnV0IEkgd2lsbCBoYXZlIGFub3RoZXIgbG9vayB3aGV0aGVyIHRoaXMgd2Fz
IHJlYWxseSBpbnRyb2R1Y2VkIGJ5IGNvbW1pdA0KPiA+IGIzNDc2ODlmZmJjYS4NCj4gDQo+IFBs
ZWFzZSBtYWtlIHN1cmUgJ0ZpeGVzJyB0YWcgY29ycmVjdC4NCg0KRnVydGhlciB0ZXN0cyBoYXZl
IHNob3duIHRoYXQgdGhlIGxvY2t1cCBJIHJlZmVycmVkIHRvIGRvZXMgbm90IG9jY3VyIGJlZm9y
ZSBjb21taXQNCmIzNDc2ODlmZmJjYSBidXQgdGhhdCBpdCBvY2N1cnMgd2l0aCBiMzQ3Njg5ZmZi
Y2EuIEkgdGhpbmsgdGhhdCBzaG93cyBjbGVhcmx5IHRoYXQNCmNvbW1pdCBiMzQ3Njg5ZmZiY2Eg
aW50cm9kdWNlZCB0aGlzIGxvY2t1cC4NCg0KQmFydC4=

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 4/7] blk-mq: Avoid that request processing stalls when sharing tags
  2017-12-02  0:48         ` Bart Van Assche
@ 2017-12-02  1:00           ` Ming Lei
  2017-12-02  1:05             ` Bart Van Assche
  0 siblings, 1 reply; 21+ messages in thread
From: Ming Lei @ 2017-12-02  1:00 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: hch, jthumshirn, hare, linux-block, osandov, axboe

On Sat, Dec 02, 2017 at 12:48:51AM +0000, Bart Van Assche wrote:
> On Sat, 2017-12-02 at 08:36 +0800, Ming Lei wrote:
> > On Fri, Dec 01, 2017 at 07:52:14PM +0000, Bart Van Assche wrote:
> > > On Fri, 2017-12-01 at 10:58 +0800, Ming Lei wrote:
> > > > On Thu, Nov 30, 2017 at 04:08:45PM -0800, Bart Van Assche wrote:
> > > > > blk_mq_dispatch_rq_list() is called. Make sure that
> > > > > BLK_MQ_S_SCHED_RESTART is set before any blk_mq_dispatch_rq_list()
> > > > > call occurs.
> > > > > 
> > > > > Fixes: commit b347689ffbca ("blk-mq-sched: improve dispatching from sw queue")
> > > > 
> > > > We always mark RESTART state bit just before dispatching from ->dispatch_list,
> > > > this way has been there before b347689ffbca, which doesn't change this
> > > > RESTART mechanism, so please explain a bit why it is a fix on commit
> > > > b347689ffbca.
> > > 
> > > I'm not completely sure which patch introduced the lockup fixed by this patch
> > > but I will have another look whether this was really introduced by commit
> > > b347689ffbca.
> > 
> > Please make sure 'Fixes' tag correct.
> 
> Further tests have shown that the lockup I referred to does not occur before commit
> b347689ffbca but that it occurs with b347689ffbca.

Then you need to root cause it, or

Provide debugfs log and reproduction steps, please.

> I think that shows clearly that commit b347689ffbca introduced this lockup.

That may not be so clearly, maybe it is just triggered easily after this
commit.

-- 
Ming

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 4/7] blk-mq: Avoid that request processing stalls when sharing tags
  2017-12-02  1:00           ` Ming Lei
@ 2017-12-02  1:05             ` Bart Van Assche
  2017-12-02  1:14               ` Ming Lei
  2017-12-02  1:20               ` Ming Lei
  0 siblings, 2 replies; 21+ messages in thread
From: Bart Van Assche @ 2017-12-02  1:05 UTC (permalink / raw)
  To: ming.lei; +Cc: hch, osandov, jthumshirn, hare, linux-block, axboe

T24gU2F0LCAyMDE3LTEyLTAyIGF0IDA5OjAwICswODAwLCBNaW5nIExlaSB3cm90ZToNCj4gT24g
U2F0LCBEZWMgMDIsIDIwMTcgYXQgMTI6NDg6NTFBTSArMDAwMCwgQmFydCBWYW4gQXNzY2hlIHdy
b3RlOg0KPiA+IEZ1cnRoZXIgdGVzdHMgaGF2ZSBzaG93biB0aGF0IHRoZSBsb2NrdXAgSSByZWZl
cnJlZCB0byBkb2VzIG5vdCBvY2N1ciBiZWZvcmUgY29tbWl0DQo+ID4gYjM0NzY4OWZmYmNhIGJ1
dCB0aGF0IGl0IG9jY3VycyB3aXRoIGIzNDc2ODlmZmJjYS4NCj4gDQo+IFRoZW4geW91IG5lZWQg
dG8gcm9vdCBjYXVzZSBpdCwgb3INCg0KSXQncyBub3QgbXkgam9iIHRvIHJvb3QgY2F1c2UgYnVn
cyBpbnRyb2R1Y2VkIGJ5IHlvdXIgcGF0Y2hlcy4NCg0KPiBQcm92aWRlIGRlYnVnZnMgbG9nIGFu
ZCByZXByb2R1Y3Rpb24gc3RlcHMsIHBsZWFzZS4NCg0KSG93IEkgcmVwcm9kdWNlIHRoaXMgYnVn
IGlzIHNvbWV0aGluZyBJIGhhdmUgYWxyZWFkeSBtZW50aW9uZWQgbWFueSB0aW1lcyBpbg0KcHJl
dmlvdXMgZS1tYWlscy4NCg0KQmFydC4NCg==

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 4/7] blk-mq: Avoid that request processing stalls when sharing tags
  2017-12-02  1:05             ` Bart Van Assche
@ 2017-12-02  1:14               ` Ming Lei
  2017-12-02  1:20               ` Ming Lei
  1 sibling, 0 replies; 21+ messages in thread
From: Ming Lei @ 2017-12-02  1:14 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: hch, osandov, jthumshirn, hare, linux-block, axboe

On Sat, Dec 02, 2017 at 01:05:05AM +0000, Bart Van Assche wrote:
> On Sat, 2017-12-02 at 09:00 +0800, Ming Lei wrote:
> > On Sat, Dec 02, 2017 at 12:48:51AM +0000, Bart Van Assche wrote:
> > > Further tests have shown that the lockup I referred to does not occur before commit
> > > b347689ffbca but that it occurs with b347689ffbca.
> > 
> > Then you need to root cause it, or
> 
> It's not my job to root cause bugs introduced by your patches.
> 
> > Provide debugfs log and reproduction steps, please.
> 
> How I reproduce this bug is something I have already mentioned many times in
> previous e-mails.

Please provide it in this commit log or this thread explicitly, otherwise who
knows what is that.

If that is something only you have the environment for the reproduction,
please provide the debugfs log, otherwise don't expect community can help
you much.

-- 
Ming

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 4/7] blk-mq: Avoid that request processing stalls when sharing tags
  2017-12-02  1:05             ` Bart Van Assche
  2017-12-02  1:14               ` Ming Lei
@ 2017-12-02  1:20               ` Ming Lei
  1 sibling, 0 replies; 21+ messages in thread
From: Ming Lei @ 2017-12-02  1:20 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: hch, osandov, jthumshirn, hare, linux-block, axboe

On Sat, Dec 02, 2017 at 01:05:05AM +0000, Bart Van Assche wrote:
> On Sat, 2017-12-02 at 09:00 +0800, Ming Lei wrote:
> > On Sat, Dec 02, 2017 at 12:48:51AM +0000, Bart Van Assche wrote:
> > > Further tests have shown that the lockup I referred to does not occur before commit
> > > b347689ffbca but that it occurs with b347689ffbca.
> > 
> > Then you need to root cause it, or
> 
> It's not my job to root cause bugs introduced by your patches.

Then show us the debugfs and dmesg log, otherwise how can we make
progress since you are the only reproducer?

Also it is you who posts the patch, you have to provide exact root
cause.

-- 
Ming

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2017-12-02  1:20 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-01  0:08 [PATCH 0/7] blk-mq: Queue running fixes Bart Van Assche
2017-12-01  0:08 ` [PATCH 1/7] blk-mq: Fix spelling in a source code comment Bart Van Assche
2017-12-01  0:08 ` [PATCH 2/7] block: Document more locking requirements Bart Van Assche
2017-12-01  3:03   ` Jens Axboe
2017-12-01 17:05     ` Bart Van Assche
2017-12-01  0:08 ` [PATCH 3/7] blk-mq: Make blk_mq_mark_tag_wait() easier to read Bart Van Assche
2017-12-01  0:08 ` [PATCH 4/7] blk-mq: Avoid that request processing stalls when sharing tags Bart Van Assche
2017-12-01  2:58   ` Ming Lei
2017-12-01 19:52     ` Bart Van Assche
2017-12-02  0:36       ` Ming Lei
2017-12-02  0:48         ` Bart Van Assche
2017-12-02  1:00           ` Ming Lei
2017-12-02  1:05             ` Bart Van Assche
2017-12-02  1:14               ` Ming Lei
2017-12-02  1:20               ` Ming Lei
2017-12-01  0:08 ` [PATCH 5/7] blk-mq: Avoid that blk_mq_mark_tag_wait() triggers a queue stall Bart Van Assche
2017-12-01  3:51   ` Ming Lei
2017-12-01  0:08 ` [PATCH 6/7] blk-mq: Rerun hardware queues after having called .put_budget() Bart Van Assche
2017-12-01  3:23   ` Ming Lei
2017-12-01  0:08 ` [PATCH 7/7] blk-mq: Fix another queue stall Bart Van Assche
2017-12-01  3:00   ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).