* [for-4.16 PATCH v6 0/3] blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback
@ 2018-01-17 16:25 Mike Snitzer
  2018-01-17 16:25 ` [for-4.16 PATCH v6 1/3] blk-mq: factor out a few helpers from __blk_mq_try_issue_directly Mike Snitzer
                   ` (3 more replies)
  0 siblings, 4 replies; 23+ messages in thread
From: Mike Snitzer @ 2018-01-17 16:25 UTC (permalink / raw)
  To: axboe; +Cc: Ming Lei, hch, dm-devel, linux-block

Hi Jens,

Think this finally takes care of it! ;)

Thanks,
Mike

Mike Snitzer (2):
  blk-mq: factor out a few helpers from __blk_mq_try_issue_directly
  blk-mq-sched: remove unused 'can_block' arg from blk_mq_sched_insert_request

Ming Lei (1):
  blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback

 block/blk-core.c     |   3 +-
 block/blk-exec.c     |   2 +-
 block/blk-mq-sched.c |   2 +-
 block/blk-mq-sched.h |   2 +-
 block/blk-mq.c       | 106 ++++++++++++++++++++++++++++++++++++---------------
 block/blk-mq.h       |   3 ++
 drivers/md/dm-rq.c   |  19 +++++++--
 7 files changed, 98 insertions(+), 39 deletions(-)

-- 
2.15.0

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [for-4.16 PATCH v6 1/3] blk-mq: factor out a few helpers from __blk_mq_try_issue_directly
  2018-01-17 16:25 [for-4.16 PATCH v6 0/3] blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback Mike Snitzer
@ 2018-01-17 16:25 ` Mike Snitzer
  2018-01-17 16:25 ` [for-4.16 PATCH v6 2/3] blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback Mike Snitzer
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 23+ messages in thread
From: Mike Snitzer @ 2018-01-17 16:25 UTC (permalink / raw)
  To: axboe; +Cc: Ming Lei, hch, dm-devel, linux-block

No functional change.  Just makes the code flow more logically.

In a following commit, __blk_mq_try_issue_directly() will be used to
return the dispatch result (blk_status_t) to DM.  DM needs this
information to improve IO merging.
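
Condensed (the diff below is authoritative; the hctx locking and
might_sleep_if() details are elided here), the resulting caller ends up
looking roughly like:

  static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
                                        struct request *rq, blk_qc_t *cookie)
  {
          blk_status_t ret = __blk_mq_try_issue_directly(hctx, rq, cookie);

          if (ret == BLK_STS_RESOURCE)
                  /* device/driver was busy: fall back to a normal insert */
                  __blk_mq_fallback_to_insert(hctx, rq, true);
          else if (ret != BLK_STS_OK)
                  /* hard error: complete the request with that status */
                  blk_mq_end_request(rq, ret);
  }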

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 block/blk-mq.c | 79 ++++++++++++++++++++++++++++++++++++++--------------------
 1 file changed, 52 insertions(+), 27 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index c8f62e6be6b6..c117c2baf2c9 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1694,9 +1694,9 @@ static blk_qc_t request_to_qc_t(struct blk_mq_hw_ctx *hctx, struct request *rq)
 	return blk_tag_to_qc_t(rq->internal_tag, hctx->queue_num, true);
 }
 
-static void __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
-					struct request *rq,
-					blk_qc_t *cookie)
+static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
+					    struct request *rq,
+					    blk_qc_t *cookie)
 {
 	struct request_queue *q = rq->q;
 	struct blk_mq_queue_data bd = {
@@ -1705,6 +1705,43 @@ static void __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 	};
 	blk_qc_t new_cookie;
 	blk_status_t ret;
+
+	new_cookie = request_to_qc_t(hctx, rq);
+
+	/*
+	 * For OK queue, we are done. For error, caller may kill it.
+	 * Any other error (busy), just add it to our list as we
+	 * previously would have done.
+	 */
+	ret = q->mq_ops->queue_rq(hctx, &bd);
+	switch (ret) {
+	case BLK_STS_OK:
+		*cookie = new_cookie;
+		break;
+	case BLK_STS_RESOURCE:
+		__blk_mq_requeue_request(rq);
+		break;
+	default:
+		*cookie = BLK_QC_T_NONE;
+		break;
+	}
+
+	return ret;
+}
+
+static void __blk_mq_fallback_to_insert(struct blk_mq_hw_ctx *hctx,
+					struct request *rq,
+					bool run_queue)
+{
+	blk_mq_sched_insert_request(rq, false, run_queue, false,
+					hctx->flags & BLK_MQ_F_BLOCKING);
+}
+
+static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+						struct request *rq,
+						blk_qc_t *cookie)
+{
+	struct request_queue *q = rq->q;
 	bool run_queue = true;
 
 	/* RCU or SRCU read lock is needed before checking quiesced flag */
@@ -1724,41 +1761,29 @@ static void __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 		goto insert;
 	}
 
-	new_cookie = request_to_qc_t(hctx, rq);
-
-	/*
-	 * For OK queue, we are done. For error, kill it. Any other
-	 * error (busy), just add it to our list as we previously
-	 * would have done
-	 */
-	ret = q->mq_ops->queue_rq(hctx, &bd);
-	switch (ret) {
-	case BLK_STS_OK:
-		*cookie = new_cookie;
-		return;
-	case BLK_STS_RESOURCE:
-		__blk_mq_requeue_request(rq);
-		goto insert;
-	default:
-		*cookie = BLK_QC_T_NONE;
-		blk_mq_end_request(rq, ret);
-		return;
-	}
-
+	return __blk_mq_issue_directly(hctx, rq, cookie);
 insert:
-	blk_mq_sched_insert_request(rq, false, run_queue, false,
-					hctx->flags & BLK_MQ_F_BLOCKING);
+	__blk_mq_fallback_to_insert(hctx, rq, run_queue);
+
+	return BLK_STS_OK;
 }
 
 static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 		struct request *rq, blk_qc_t *cookie)
 {
+	blk_status_t ret;
 	int srcu_idx;
 
 	might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING);
 
 	hctx_lock(hctx, &srcu_idx);
-	__blk_mq_try_issue_directly(hctx, rq, cookie);
+
+	ret = __blk_mq_try_issue_directly(hctx, rq, cookie);
+	if (ret == BLK_STS_RESOURCE)
+		__blk_mq_fallback_to_insert(hctx, rq, true);
+	else if (ret != BLK_STS_OK)
+		blk_mq_end_request(rq, ret);
+
 	hctx_unlock(hctx, srcu_idx);
 }
 
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [for-4.16 PATCH v6 2/3] blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback
  2018-01-17 16:25 [for-4.16 PATCH v6 0/3] blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback Mike Snitzer
  2018-01-17 16:25 ` [for-4.16 PATCH v6 1/3] blk-mq: factor out a few helpers from __blk_mq_try_issue_directly Mike Snitzer
@ 2018-01-17 16:25 ` Mike Snitzer
  2018-01-18  3:25   ` Ming Lei
  2018-01-17 16:25 ` [for-4.16 PATCH v6 3/3] blk-mq-sched: remove unused 'can_block' arg from blk_mq_sched_insert_request Mike Snitzer
  2018-01-17 16:50 ` [for-4.16 PATCH v6 0/3] blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback Jens Axboe
  3 siblings, 1 reply; 23+ messages in thread
From: Mike Snitzer @ 2018-01-17 16:25 UTC (permalink / raw)
  To: axboe; +Cc: Ming Lei, hch, dm-devel, linux-block

From: Ming Lei <ming.lei@redhat.com>

blk_insert_cloned_request() is called in the fast path of a dm-rq driver
(e.g. blk-mq request-based DM mpath).  blk_insert_cloned_request() uses
blk_mq_request_bypass_insert() to directly append the request to the
blk-mq hctx->dispatch_list of the underlying queue.

1) This approach isn't efficient because the hctx spinlock is always
taken.

2) With blk_insert_cloned_request(), we completely bypass the underlying
queue's elevator and depend on the upper-level dm-rq driver's elevator
to schedule IO.  But dm-rq currently can't get the underlying queue's
dispatch feedback at all.  Without knowing whether a request was issued
or not (e.g. due to the underlying queue being busy), the dm-rq elevator
cannot provide effective IO merging (as a side-effect of dm-rq currently
blindly destaging a request from its elevator only to requeue it after a
delay, which kills any opportunity for merging).  This obviously causes
very bad sequential IO performance.

Fix this by updating blk_insert_cloned_request() to use
blk_mq_request_direct_issue().  blk_mq_request_direct_issue() allows a
request to be issued directly to the underlying queue and returns the
dispatch feedback (blk_status_t).  If blk_mq_request_direct_issue()
returns BLK_STS_RESOURCE, the dm-rq driver will now use DM_MAPIO_REQUEUE
to _not_ destage the request, thereby preserving the opportunity to
merge IO.
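
In condensed form (the full dm-rq hunk below is what actually applies;
the goto-based retry plumbing is trimmed), the dm-rq side amounts to:

  ret = dm_dispatch_clone_request(clone, rq);
  if (ret == BLK_STS_RESOURCE) {
          /* Underlying queue couldn't take the request: undo the clone
           * and requeue without destaging, so later IO can still merge. */
          blk_rq_unprep_clone(clone);
          tio->ti->type->release_clone_rq(clone);
          tio->clone = NULL;
          r = rq->q->mq_ops ? DM_MAPIO_REQUEUE : DM_MAPIO_DELAY_REQUEUE;
  }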

With this, request-based DM's blk-mq sequential IO performance is vastly
improved (as much as 3X in mpath/virtio-scsi testing).

Signed-off-by: Ming Lei <ming.lei@redhat.com>
[blk-mq.c changes heavily influenced by Ming Lei's initial solution, but
they were refactored to make them less fragile and easier to read/review]
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 block/blk-core.c   |  3 +--
 block/blk-mq.c     | 37 +++++++++++++++++++++++++++++--------
 block/blk-mq.h     |  3 +++
 drivers/md/dm-rq.c | 19 ++++++++++++++++---
 4 files changed, 49 insertions(+), 13 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 7ba607527487..55f338020254 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2500,8 +2500,7 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *
 		 * bypass a potential scheduler on the bottom device for
 		 * insert.
 		 */
-		blk_mq_request_bypass_insert(rq, true);
-		return BLK_STS_OK;
+		return blk_mq_request_direct_issue(rq);
 	}
 
 	spin_lock_irqsave(q->queue_lock, flags);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index c117c2baf2c9..f5f0d8456713 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1731,15 +1731,19 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
 
 static void __blk_mq_fallback_to_insert(struct blk_mq_hw_ctx *hctx,
 					struct request *rq,
-					bool run_queue)
+					bool run_queue, bool bypass_insert)
 {
-	blk_mq_sched_insert_request(rq, false, run_queue, false,
-					hctx->flags & BLK_MQ_F_BLOCKING);
+	if (!bypass_insert)
+		blk_mq_sched_insert_request(rq, false, run_queue, false,
+					    hctx->flags & BLK_MQ_F_BLOCKING);
+	else
+		blk_mq_request_bypass_insert(rq, run_queue);
 }
 
 static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 						struct request *rq,
-						blk_qc_t *cookie)
+						blk_qc_t *cookie,
+						bool bypass_insert)
 {
 	struct request_queue *q = rq->q;
 	bool run_queue = true;
@@ -1750,7 +1754,7 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 		goto insert;
 	}
 
-	if (q->elevator)
+	if (q->elevator && !bypass_insert)
 		goto insert;
 
 	if (!blk_mq_get_driver_tag(rq, NULL, false))
@@ -1763,7 +1767,9 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 
 	return __blk_mq_issue_directly(hctx, rq, cookie);
 insert:
-	__blk_mq_fallback_to_insert(hctx, rq, run_queue);
+	__blk_mq_fallback_to_insert(hctx, rq, run_queue, bypass_insert);
+	if (bypass_insert)
+		return BLK_STS_RESOURCE;
 
 	return BLK_STS_OK;
 }
@@ -1778,15 +1784,30 @@ static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 
 	hctx_lock(hctx, &srcu_idx);
 
-	ret = __blk_mq_try_issue_directly(hctx, rq, cookie);
+	ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false);
 	if (ret == BLK_STS_RESOURCE)
-		__blk_mq_fallback_to_insert(hctx, rq, true);
+		__blk_mq_fallback_to_insert(hctx, rq, true, false);
 	else if (ret != BLK_STS_OK)
 		blk_mq_end_request(rq, ret);
 
 	hctx_unlock(hctx, srcu_idx);
 }
 
+blk_status_t blk_mq_request_direct_issue(struct request *rq)
+{
+	blk_status_t ret;
+	int srcu_idx;
+	blk_qc_t unused_cookie;
+	struct blk_mq_ctx *ctx = rq->mq_ctx;
+	struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(rq->q, ctx->cpu);
+
+	hctx_lock(hctx, &srcu_idx);
+	ret = __blk_mq_try_issue_directly(hctx, rq, &unused_cookie, true);
+	hctx_unlock(hctx, srcu_idx);
+
+	return ret;
+}
+
 static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 {
 	const int is_sync = op_is_sync(bio->bi_opf);
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 8591a54d989b..e3ebc93646ca 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -74,6 +74,9 @@ void blk_mq_request_bypass_insert(struct request *rq, bool run_queue);
 void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx,
 				struct list_head *list);
 
+/* Used by blk_insert_cloned_request() to issue request directly */
+blk_status_t blk_mq_request_direct_issue(struct request *rq);
+
 /*
  * CPU -> queue mappings
  */
diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index c28357f5cb0e..b7d175e94a02 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -395,7 +395,7 @@ static void end_clone_request(struct request *clone, blk_status_t error)
 	dm_complete_request(tio->orig, error);
 }
 
-static void dm_dispatch_clone_request(struct request *clone, struct request *rq)
+static blk_status_t dm_dispatch_clone_request(struct request *clone, struct request *rq)
 {
 	blk_status_t r;
 
@@ -404,9 +404,10 @@ static void dm_dispatch_clone_request(struct request *clone, struct request *rq)
 
 	clone->start_time = jiffies;
 	r = blk_insert_cloned_request(clone->q, clone);
-	if (r)
+	if (r != BLK_STS_OK && r != BLK_STS_RESOURCE)
 		/* must complete clone in terms of original request */
 		dm_complete_request(rq, r);
+	return r;
 }
 
 static int dm_rq_bio_constructor(struct bio *bio, struct bio *bio_orig,
@@ -476,8 +477,10 @@ static int map_request(struct dm_rq_target_io *tio)
 	struct mapped_device *md = tio->md;
 	struct request *rq = tio->orig;
 	struct request *clone = NULL;
+	blk_status_t ret;
 
 	r = ti->type->clone_and_map_rq(ti, rq, &tio->info, &clone);
+check_again:
 	switch (r) {
 	case DM_MAPIO_SUBMITTED:
 		/* The target has taken the I/O to submit by itself later */
@@ -492,7 +495,17 @@ static int map_request(struct dm_rq_target_io *tio)
 		/* The target has remapped the I/O so dispatch it */
 		trace_block_rq_remap(clone->q, clone, disk_devt(dm_disk(md)),
 				     blk_rq_pos(rq));
-		dm_dispatch_clone_request(clone, rq);
+		ret = dm_dispatch_clone_request(clone, rq);
+		if (ret == BLK_STS_RESOURCE) {
+			blk_rq_unprep_clone(clone);
+			tio->ti->type->release_clone_rq(clone);
+			tio->clone = NULL;
+			if (!rq->q->mq_ops)
+				r = DM_MAPIO_DELAY_REQUEUE;
+			else
+				r = DM_MAPIO_REQUEUE;
+			goto check_again;
+		}
 		break;
 	case DM_MAPIO_REQUEUE:
 		/* The target wants to requeue the I/O */
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [for-4.16 PATCH v6 3/3] blk-mq-sched: remove unused 'can_block' arg from blk_mq_sched_insert_request
  2018-01-17 16:25 [for-4.16 PATCH v6 0/3] blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback Mike Snitzer
  2018-01-17 16:25 ` [for-4.16 PATCH v6 1/3] blk-mq: factor out a few helpers from __blk_mq_try_issue_directly Mike Snitzer
  2018-01-17 16:25 ` [for-4.16 PATCH v6 2/3] blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback Mike Snitzer
@ 2018-01-17 16:25 ` Mike Snitzer
  2018-01-17 16:50 ` [for-4.16 PATCH v6 0/3] blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback Jens Axboe
  3 siblings, 0 replies; 23+ messages in thread
From: Mike Snitzer @ 2018-01-17 16:25 UTC (permalink / raw)
  To: axboe; +Cc: Ming Lei, hch, dm-devel, linux-block

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 block/blk-exec.c     |  2 +-
 block/blk-mq-sched.c |  2 +-
 block/blk-mq-sched.h |  2 +-
 block/blk-mq.c       | 16 +++++++---------
 4 files changed, 10 insertions(+), 12 deletions(-)

diff --git a/block/blk-exec.c b/block/blk-exec.c
index 5c0f3dc446dc..f7b292f12449 100644
--- a/block/blk-exec.c
+++ b/block/blk-exec.c
@@ -61,7 +61,7 @@ void blk_execute_rq_nowait(struct request_queue *q, struct gendisk *bd_disk,
 	 * be reused after dying flag is set
 	 */
 	if (q->mq_ops) {
-		blk_mq_sched_insert_request(rq, at_head, true, false, false);
+		blk_mq_sched_insert_request(rq, at_head, true, false);
 		return;
 	}
 
diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 2ff7cf0cbf73..55c0a745b427 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -427,7 +427,7 @@ void blk_mq_sched_restart(struct blk_mq_hw_ctx *const hctx)
 }
 
 void blk_mq_sched_insert_request(struct request *rq, bool at_head,
-				 bool run_queue, bool async, bool can_block)
+				 bool run_queue, bool async)
 {
 	struct request_queue *q = rq->q;
 	struct elevator_queue *e = q->elevator;
diff --git a/block/blk-mq-sched.h b/block/blk-mq-sched.h
index ba1d1418a96d..1e9c9018ace1 100644
--- a/block/blk-mq-sched.h
+++ b/block/blk-mq-sched.h
@@ -18,7 +18,7 @@ bool blk_mq_sched_try_insert_merge(struct request_queue *q, struct request *rq);
 void blk_mq_sched_restart(struct blk_mq_hw_ctx *hctx);
 
 void blk_mq_sched_insert_request(struct request *rq, bool at_head,
-				 bool run_queue, bool async, bool can_block);
+				 bool run_queue, bool async);
 void blk_mq_sched_insert_requests(struct request_queue *q,
 				  struct blk_mq_ctx *ctx,
 				  struct list_head *list, bool run_queue_async);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index f5f0d8456713..02ae1d7b94af 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -745,13 +745,13 @@ static void blk_mq_requeue_work(struct work_struct *work)
 
 		rq->rq_flags &= ~RQF_SOFTBARRIER;
 		list_del_init(&rq->queuelist);
-		blk_mq_sched_insert_request(rq, true, false, false, true);
+		blk_mq_sched_insert_request(rq, true, false, false);
 	}
 
 	while (!list_empty(&rq_list)) {
 		rq = list_entry(rq_list.next, struct request, queuelist);
 		list_del_init(&rq->queuelist);
-		blk_mq_sched_insert_request(rq, false, false, false, true);
+		blk_mq_sched_insert_request(rq, false, false, false);
 	}
 
 	blk_mq_run_hw_queues(q, false);
@@ -1729,13 +1729,11 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
 	return ret;
 }
 
-static void __blk_mq_fallback_to_insert(struct blk_mq_hw_ctx *hctx,
-					struct request *rq,
+static void __blk_mq_fallback_to_insert(struct request *rq,
 					bool run_queue, bool bypass_insert)
 {
 	if (!bypass_insert)
-		blk_mq_sched_insert_request(rq, false, run_queue, false,
-					    hctx->flags & BLK_MQ_F_BLOCKING);
+		blk_mq_sched_insert_request(rq, false, run_queue, false);
 	else
 		blk_mq_request_bypass_insert(rq, run_queue);
 }
@@ -1767,7 +1765,7 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 
 	return __blk_mq_issue_directly(hctx, rq, cookie);
 insert:
-	__blk_mq_fallback_to_insert(hctx, rq, run_queue, bypass_insert);
+	__blk_mq_fallback_to_insert(rq, run_queue, bypass_insert);
 	if (bypass_insert)
 		return BLK_STS_RESOURCE;
 
@@ -1786,7 +1784,7 @@ static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 
 	ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false);
 	if (ret == BLK_STS_RESOURCE)
-		__blk_mq_fallback_to_insert(hctx, rq, true, false);
+		__blk_mq_fallback_to_insert(rq, true, false);
 	else if (ret != BLK_STS_OK)
 		blk_mq_end_request(rq, ret);
 
@@ -1916,7 +1914,7 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 	} else if (q->elevator) {
 		blk_mq_put_ctx(data.ctx);
 		blk_mq_bio_to_request(rq, bio);
-		blk_mq_sched_insert_request(rq, false, true, true, true);
+		blk_mq_sched_insert_request(rq, false, true, true);
 	} else {
 		blk_mq_put_ctx(data.ctx);
 		blk_mq_bio_to_request(rq, bio);
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [for-4.16 PATCH v6 0/3] blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback
  2018-01-17 16:25 [for-4.16 PATCH v6 0/3] blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback Mike Snitzer
                   ` (2 preceding siblings ...)
  2018-01-17 16:25 ` [for-4.16 PATCH v6 3/3] blk-mq-sched: remove unused 'can_block' arg from blk_mq_sched_insert_request Mike Snitzer
@ 2018-01-17 16:50 ` Jens Axboe
  2018-01-17 16:58   ` Mike Snitzer
  3 siblings, 1 reply; 23+ messages in thread
From: Jens Axboe @ 2018-01-17 16:50 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Ming Lei, hch, dm-devel, linux-block

On 1/17/18 9:25 AM, Mike Snitzer wrote:
> Hi Jens,
> 
> Think this finally takes care of it! ;)
> 
> Thanks,
> Mike
> 
> Mike Snitzer (2):
>   blk-mq: factor out a few helpers from __blk_mq_try_issue_directly
>   blk-mq-sched: remove unused 'can_block' arg from blk_mq_sched_insert_request
> 
> Ming Lei (1):
>   blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback

Applied - added actual commit message to patch 3.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [for-4.16 PATCH v6 0/3] blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback
  2018-01-17 16:50 ` [for-4.16 PATCH v6 0/3] blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback Jens Axboe
@ 2018-01-17 16:58   ` Mike Snitzer
  2018-01-17 23:31       ` Bart Van Assche
  0 siblings, 1 reply; 23+ messages in thread
From: Mike Snitzer @ 2018-01-17 16:58 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Ming Lei, hch, dm-devel, linux-block

On Wed, Jan 17 2018 at 11:50am -0500,
Jens Axboe <axboe@kernel.dk> wrote:

> On 1/17/18 9:25 AM, Mike Snitzer wrote:
> > Hi Jens,
> > 
> > Think this finally takes care of it! ;)
> > 
> > Thanks,
> > Mike
> > 
> > Mike Snitzer (2):
> >   blk-mq: factor out a few helpers from __blk_mq_try_issue_directly
> >   blk-mq-sched: remove unused 'can_block' arg from blk_mq_sched_insert_request
> > 
> > Ming Lei (1):
> >   blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback
> 
> Applied - added actual commit message to patch 3.

Great, thanks.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [for-4.16 PATCH v6 0/3] blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback
@ 2018-01-17 23:31       ` Bart Van Assche
  0 siblings, 0 replies; 23+ messages in thread
From: Bart Van Assche @ 2018-01-17 23:31 UTC (permalink / raw)
  To: snitzer, axboe; +Cc: hch, dm-devel, linux-block, tom.leiming, loberman

On Wed, 2018-01-17 at 11:58 -0500, Mike Snitzer wrote:
> On Wed, Jan 17 2018 at 11:50am -0500,
> Jens Axboe <axboe@kernel.dk> wrote:
> 
> > On 1/17/18 9:25 AM, Mike Snitzer wrote:
> > > Hi Jens,
> > > 
> > > Think this finally takes care of it! ;)
> > > 
> > > Thanks,
> > > Mike
> > > 
> > > Mike Snitzer (2):
> > >   blk-mq: factor out a few helpers from __blk_mq_try_issue_directly
> > >   blk-mq-sched: remove unused 'can_block' arg from blk_mq_sched_insert_request
> > > 
> > > Ming Lei (1):
> > >   blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback
> > 
> > Applied - added actual commit message to patch 3.
> 
> Great, thanks.

Hello Mike,

Laurence hit the following while retesting the SRP initiator code:

[ 2223.797129] list_add corruption. prev->next should be next (00000000e0ddd5dd), but was 000000003defe5cd. (prev=000000003defe5cd).
[ 2223.862168] WARNING: CPU: 14 PID: 577 at lib/list_debug.c:28 __list_add_valid+0x6a/0x70
[ 2224.481151] CPU: 14 PID: 577 Comm: kworker/14:1H Tainted: G          I      4.15.0-rc8.bart3+ #1
[ 2224.531193] Hardware name: HP ProLiant DL380 G7, BIOS P67 08/16/2015
[ 2224.567150] Workqueue: kblockd blk_mq_run_work_fn
[ 2224.593182] RIP: 0010:__list_add_valid+0x6a/0x70
[ 2224.967002] Call Trace:
[ 2224.980941]  blk_mq_request_bypass_insert+0x57/0xa0
[ 2225.009044]  __blk_mq_try_issue_directly+0x56/0x1e0
[ 2225.037007]  blk_mq_request_direct_issue+0x5d/0xc0
[ 2225.090608]  map_request+0x142/0x260 [dm_mod]
[ 2225.114756]  dm_mq_queue_rq+0xa4/0x120 [dm_mod]
[ 2225.140812]  blk_mq_dispatch_rq_list+0x90/0x5b0
[ 2225.211769]  blk_mq_sched_dispatch_requests+0x107/0x1a0
[ 2225.240825]  __blk_mq_run_hw_queue+0x5f/0xf0
[ 2225.264852]  process_one_work+0x141/0x340
[ 2225.287872]  worker_thread+0x47/0x3e0
[ 2225.308354]  kthread+0xf5/0x130
[ 2225.396405]  ret_from_fork+0x32/0x40

That call trace did not show up before this patch series was added to Jens'
tree. This is a regression. Could this have been introduced by this patch
series?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [for-4.16 PATCH v6 0/3] blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback
  2018-01-17 23:31       ` Bart Van Assche
  (?)
@ 2018-01-17 23:43       ` Laurence Oberman
  2018-01-17 23:53           ` Bart Van Assche
  -1 siblings, 1 reply; 23+ messages in thread
From: Laurence Oberman @ 2018-01-17 23:43 UTC (permalink / raw)
  To: Bart Van Assche, snitzer, axboe; +Cc: hch, dm-devel, linux-block, tom.leiming

On Wed, 2018-01-17 at 23:31 +0000, Bart Van Assche wrote:
> On Wed, 2018-01-17 at 11:58 -0500, Mike Snitzer wrote:
> > On Wed, Jan 17 2018 at 11:50am -0500,
> > Jens Axboe <axboe@kernel.dk> wrote:
> > 
> > > On 1/17/18 9:25 AM, Mike Snitzer wrote:
> > > > Hi Jens,
> > > > 
> > > > Think this finally takes care of it! ;)
> > > > 
> > > > Thanks,
> > > > Mike
> > > > 
> > > > Mike Snitzer (2):
> > > >   blk-mq: factor out a few helpers from
> > > > __blk_mq_try_issue_directly
> > > >   blk-mq-sched: remove unused 'can_block' arg from
> > > > blk_mq_sched_insert_request
> > > > 
> > > > Ming Lei (1):
> > > >   blk-mq: improve DM's blk-mq IO merging via
> > > > blk_insert_cloned_request feedback
> > > 
> > > Applied - added actual commit message to patch 3.
> > 
> > Great, thanks.
> 
> Hello Mike,
> 
> Laurence hit the following while retesting the SRP initiator code:
> 
> [ 2223.797129] list_add corruption. prev->next should be next
> (00000000e0ddd5dd), but was 000000003defe5cd.
> (prev=000000003defe5cd).
> [ 2223.862168] WARNING: CPU: 14 PID: 577 at lib/list_debug.c:28
> __list_add_valid+0x6a/0x70
> [ 2224.481151] CPU: 14 PID: 577 Comm: kworker/14:1H Tainted:
> G          I      4.15.0-rc8.bart3+ #1
> [ 2224.531193] Hardware name: HP ProLiant DL380 G7, BIOS P67
> 08/16/2015
> [ 2224.567150] Workqueue: kblockd blk_mq_run_work_fn
> [ 2224.593182] RIP: 0010:__list_add_valid+0x6a/0x70
> [ 2224.967002] Call Trace:
> [ 2224.980941]  blk_mq_request_bypass_insert+0x57/0xa0
> [ 2225.009044]  __blk_mq_try_issue_directly+0x56/0x1e0
> [ 2225.037007]  blk_mq_request_direct_issue+0x5d/0xc0
> [ 2225.090608]  map_request+0x142/0x260 [dm_mod]
> [ 2225.114756]  dm_mq_queue_rq+0xa4/0x120 [dm_mod]
> [ 2225.140812]  blk_mq_dispatch_rq_list+0x90/0x5b0
> [ 2225.211769]  blk_mq_sched_dispatch_requests+0x107/0x1a0
> [ 2225.240825]  __blk_mq_run_hw_queue+0x5f/0xf0
> [ 2225.264852]  process_one_work+0x141/0x340
> [ 2225.287872]  worker_thread+0x47/0x3e0
> [ 2225.308354]  kthread+0xf5/0x130
> [ 2225.396405]  ret_from_fork+0x32/0x40
> 
> That call trace did not show up before this patch series was added to
> Jens'
> tree. This is a regression. Could this have been introduced by this
> patch
> series?
> 
> Thanks,
> 
> Bart.

Hi Bart
One thing to note.

I tested Mike's combined tree on the weekend fully dm4.16-block4.16 and
did not see this.
This was with Mike's combined tree and SRPT running 4.13-rc2.

I also tested your tree Monday with the revert of the scatter/gather
patches with both SRP and SRPT running your tree and it was fine.

So it's a combination of what you provided me before and that has been
added to your tree.

Mike's combined tree seemed to be fine, I can revisit that if needed. I
still have that kernel in place.

I was not running the latest SRPT when I tested Mike's tree.


Regards
Laurence

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [for-4.16 PATCH v6 0/3] blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback
@ 2018-01-17 23:53           ` Bart Van Assche
  0 siblings, 0 replies; 23+ messages in thread
From: Bart Van Assche @ 2018-01-17 23:53 UTC (permalink / raw)
  To: axboe, loberman, snitzer; +Cc: hch, dm-devel, linux-block, tom.leiming

On Wed, 2018-01-17 at 18:43 -0500, Laurence Oberman wrote:
> On Wed, 2018-01-17 at 23:31 +0000, Bart Van Assche wrote:
> > On Wed, 2018-01-17 at 11:58 -0500, Mike Snitzer wrote:
> > > On Wed, Jan 17 2018 at 11:50am -0500,
> > > Jens Axboe <axboe@kernel.dk> wrote:
> > > 
> > > > On 1/17/18 9:25 AM, Mike Snitzer wrote:
> > > > > Hi Jens,
> > > > > 
> > > > > Think this finally takes care of it! ;)
> > > > > 
> > > > > Thanks,
> > > > > Mike
> > > > > 
> > > > > Mike Snitzer (2):
> > > > >   blk-mq: factor out a few helpers from
> > > > > __blk_mq_try_issue_directly
> > > > >   blk-mq-sched: remove unused 'can_block' arg from
> > > > > blk_mq_sched_insert_request
> > > > > 
> > > > > Ming Lei (1):
> > > > >   blk-mq: improve DM's blk-mq IO merging via
> > > > > blk_insert_cloned_request feedback
> > > > 
> > > > Applied - added actual commit message to patch 3.
> > > 
> > > Great, thanks.
> > 
> > Hello Mike,
> > 
> > Laurence hit the following while retesting the SRP initiator code:
> > 
> > [ 2223.797129] list_add corruption. prev->next should be next
> > (00000000e0ddd5dd), but was 000000003defe5cd.
> > (prev=000000003defe5cd).
> > [ 2223.862168] WARNING: CPU: 14 PID: 577 at lib/list_debug.c:28
> > __list_add_valid+0x6a/0x70
> > [ 2224.481151] CPU: 14 PID: 577 Comm: kworker/14:1H Tainted:
> > G          I      4.15.0-rc8.bart3+ #1
> > [ 2224.531193] Hardware name: HP ProLiant DL380 G7, BIOS P67
> > 08/16/2015
> > [ 2224.567150] Workqueue: kblockd blk_mq_run_work_fn
> > [ 2224.593182] RIP: 0010:__list_add_valid+0x6a/0x70
> > [ 2224.967002] Call Trace:
> > [ 2224.980941]  blk_mq_request_bypass_insert+0x57/0xa0
> > [ 2225.009044]  __blk_mq_try_issue_directly+0x56/0x1e0
> > [ 2225.037007]  blk_mq_request_direct_issue+0x5d/0xc0
> > [ 2225.090608]  map_request+0x142/0x260 [dm_mod]
> > [ 2225.114756]  dm_mq_queue_rq+0xa4/0x120 [dm_mod]
> > [ 2225.140812]  blk_mq_dispatch_rq_list+0x90/0x5b0
> > [ 2225.211769]  blk_mq_sched_dispatch_requests+0x107/0x1a0
> > [ 2225.240825]  __blk_mq_run_hw_queue+0x5f/0xf0
> > [ 2225.264852]  process_one_work+0x141/0x340
> > [ 2225.287872]  worker_thread+0x47/0x3e0
> > [ 2225.308354]  kthread+0xf5/0x130
> > [ 2225.396405]  ret_from_fork+0x32/0x40
> > 
> > That call trace did not show up before this patch series was added to
> > Jens'
> > tree. This is a regression. Could this have been introduced by this
> > patch
> > series?
> > 
> > Thanks,
> > 
> > Bart.
> 
> Hi Bart
> One thing to note.
> 
> I tested Mike's combined tree on the weekend fully dm4.16-block4.16 and
> did not see this.
> This was with Mike combined tree and SRPT running 4.13-rc2.
> 
> I also tested your tree Monday with the revert of the scatter/gather
> patches with both SRP and SRPT running your tree and it was fine.
> 
> So its a combination of what you provided me before and that has been
> added to your tree.
> 
> Mike combined tree seemed to be fine, I can revisit that if needed. I
> still have that kernel in place.
> 
> I was not running latest SRPT when I tested Mike's tree

Hello Laurence,

The tree I sent you this morning did not only include Mike's latest dm code
but also Jens' latest for-next branch. So what you wrote above does not
contradict what I wrote in my e-mail, namely that I suspect that a regression
was introduced by the patches in the series "blk-mq: improve DM's blk-mq IO
merging via blk_insert_cloned_request feedback". These changes namely went in
through the block tree and not through the dm tree. Additionally, these
changes were not present in the block-scsi-for-next branch I sent you on
Monday.

Bart.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [for-4.16 PATCH v6 0/3] blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback
  2018-01-17 23:53           ` Bart Van Assche
  (?)
@ 2018-01-17 23:57           ` Laurence Oberman
  2018-01-18  0:12               ` Laurence Oberman
  -1 siblings, 1 reply; 23+ messages in thread
From: Laurence Oberman @ 2018-01-17 23:57 UTC (permalink / raw)
  To: Bart Van Assche, axboe, snitzer; +Cc: hch, dm-devel, linux-block, tom.leiming

On Wed, 2018-01-17 at 23:53 +0000, Bart Van Assche wrote:
> On Wed, 2018-01-17 at 18:43 -0500, Laurence Oberman wrote:
> > On Wed, 2018-01-17 at 23:31 +0000, Bart Van Assche wrote:
> > > On Wed, 2018-01-17 at 11:58 -0500, Mike Snitzer wrote:
> > > > On Wed, Jan 17 2018 at 11:50am -0500,
> > > > Jens Axboe <axboe@kernel.dk> wrote:
> > > > 
> > > > > On 1/17/18 9:25 AM, Mike Snitzer wrote:
> > > > > > Hi Jens,
> > > > > > 
> > > > > > Think this finally takes care of it! ;)
> > > > > > 
> > > > > > Thanks,
> > > > > > Mike
> > > > > > 
> > > > > > Mike Snitzer (2):
> > > > > >   blk-mq: factor out a few helpers from
> > > > > > __blk_mq_try_issue_directly
> > > > > >   blk-mq-sched: remove unused 'can_block' arg from
> > > > > > blk_mq_sched_insert_request
> > > > > > 
> > > > > > Ming Lei (1):
> > > > > >   blk-mq: improve DM's blk-mq IO merging via
> > > > > > blk_insert_cloned_request feedback
> > > > > 
> > > > > Applied - added actual commit message to patch 3.
> > > > 
> > > > Great, thanks.
> > > 
> > > Hello Mike,
> > > 
> > > Laurence hit the following while retesting the SRP initiator
> > > code:
> > > 
> > > [ 2223.797129] list_add corruption. prev->next should be next
> > > (00000000e0ddd5dd), but was 000000003defe5cd.
> > > (prev=000000003defe5cd).
> > > [ 2223.862168] WARNING: CPU: 14 PID: 577 at lib/list_debug.c:28
> > > __list_add_valid+0x6a/0x70
> > > [ 2224.481151] CPU: 14 PID: 577 Comm: kworker/14:1H Tainted:
> > > G          I      4.15.0-rc8.bart3+ #1
> > > [ 2224.531193] Hardware name: HP ProLiant DL380 G7, BIOS P67
> > > 08/16/2015
> > > [ 2224.567150] Workqueue: kblockd blk_mq_run_work_fn
> > > [ 2224.593182] RIP: 0010:__list_add_valid+0x6a/0x70
> > > [ 2224.967002] Call Trace:
> > > [ 2224.980941]  blk_mq_request_bypass_insert+0x57/0xa0
> > > [ 2225.009044]  __blk_mq_try_issue_directly+0x56/0x1e0
> > > [ 2225.037007]  blk_mq_request_direct_issue+0x5d/0xc0
> > > [ 2225.090608]  map_request+0x142/0x260 [dm_mod]
> > > [ 2225.114756]  dm_mq_queue_rq+0xa4/0x120 [dm_mod]
> > > [ 2225.140812]  blk_mq_dispatch_rq_list+0x90/0x5b0
> > > [ 2225.211769]  blk_mq_sched_dispatch_requests+0x107/0x1a0
> > > [ 2225.240825]  __blk_mq_run_hw_queue+0x5f/0xf0
> > > [ 2225.264852]  process_one_work+0x141/0x340
> > > [ 2225.287872]  worker_thread+0x47/0x3e0
> > > [ 2225.308354]  kthread+0xf5/0x130
> > > [ 2225.396405]  ret_from_fork+0x32/0x40
> > > 
> > > That call trace did not show up before this patch series was
> > > added to
> > > Jens'
> > > tree. This is a regression. Could this have been introduced by
> > > this
> > > patch
> > > series?
> > > 
> > > Thanks,
> > > 
> > > Bart.
> > 
> > Hi Bart
> > One thing to note.
> > 
> > I tested Mike's combined tree on the weekend fully dm4.16-block4.16 
> > and
> > did not see this.
> > This was with Mike combined tree and SRPT running 4.13-rc2.
> > 
> > I also tested your tree Monday with the revert of the
> > scatter/gather
> > patches with both SRP and SRPT running your tree and it was fine.
> > 
> > So its a combination of what you provided me before and that has
> > been
> > added to your tree.
> > 
> > Mike combined tree seemed to be fine, I can revisit that if needed.
> > I
> > still have that kernel in place.
> > 
> > I was not running latest SRPT when I tested Mike's tree
> 
> Hello Laurence,
> 
> The tree I sent you this morning did not only include Mike's latest
> dm code
> but also Jens' latest for-next branch. So what you wrote above does
> not
> contradict what I wrote in my e-mail, namely that I suspect that a
> regression
> was introduced by the patches in the series "blk-mq: improve DM's
> blk-mq IO
> merging via blk_insert_cloned_request feedback". These changes namely
> went in
> through the block tree and not through the dm tree. Additionally,
> these
> changes were not present in the block-scsi-for-next branch I sent you
> on
> Monday.
> 
> Bart.

Hi Bart
Thank you

I probably don't have the latest stuff Mike added Monday in his tree.
I have not gone back to test that, been busy with yours all week.

I will get with Mike and pull his latest and test it with SRPT on 4.13-
rc2 like before.

That should be the best way to isolate this.

Regards
Laurence

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [for-4.16 PATCH v6 0/3] blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback
  2018-01-17 23:57           ` Laurence Oberman
@ 2018-01-18  0:12               ` Laurence Oberman
  0 siblings, 0 replies; 23+ messages in thread
From: Laurence Oberman @ 2018-01-18  0:12 UTC (permalink / raw)
  To: Bart Van Assche, axboe, snitzer; +Cc: hch, dm-devel, linux-block, tom.leiming

Grabbing Mike's latest combined tree and testing it now.

Thanks
Laurence

On Wed, Jan 17, 2018 at 6:57 PM, Laurence Oberman <loberman@redhat.com>
wrote:

> On Wed, 2018-01-17 at 23:53 +0000, Bart Van Assche wrote:
> > On Wed, 2018-01-17 at 18:43 -0500, Laurence Oberman wrote:
> > > On Wed, 2018-01-17 at 23:31 +0000, Bart Van Assche wrote:
> > > > On Wed, 2018-01-17 at 11:58 -0500, Mike Snitzer wrote:
> > > > > On Wed, Jan 17 2018 at 11:50am -0500,
> > > > > Jens Axboe <axboe@kernel.dk> wrote:
> > > > >
> > > > > > On 1/17/18 9:25 AM, Mike Snitzer wrote:
> > > > > > > Hi Jens,
> > > > > > >
> > > > > > > Think this finally takes care of it! ;)
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Mike
> > > > > > >
> > > > > > > Mike Snitzer (2):
> > > > > > >   blk-mq: factor out a few helpers from
> > > > > > > __blk_mq_try_issue_directly
> > > > > > >   blk-mq-sched: remove unused 'can_block' arg from
> > > > > > > blk_mq_sched_insert_request
> > > > > > >
> > > > > > > Ming Lei (1):
> > > > > > >   blk-mq: improve DM's blk-mq IO merging via
> > > > > > > blk_insert_cloned_request feedback
> > > > > >
> > > > > > Applied - added actual commit message to patch 3.
> > > > >
> > > > > Great, thanks.
> > > >
> > > > Hello Mike,
> > > >
> > > > Laurence hit the following while retesting the SRP initiator
> > > > code:
> > > >
> > > > [ 2223.797129] list_add corruption. prev->next should be next
> > > > (00000000e0ddd5dd), but was 000000003defe5cd.
> > > > (prev=000000003defe5cd).
> > > > [ 2223.862168] WARNING: CPU: 14 PID: 577 at lib/list_debug.c:28
> > > > __list_add_valid+0x6a/0x70
> > > > [ 2224.481151] CPU: 14 PID: 577 Comm: kworker/14:1H Tainted:
> > > > G          I      4.15.0-rc8.bart3+ #1
> > > > [ 2224.531193] Hardware name: HP ProLiant DL380 G7, BIOS P67
> > > > 08/16/2015
> > > > [ 2224.567150] Workqueue: kblockd blk_mq_run_work_fn
> > > > [ 2224.593182] RIP: 0010:__list_add_valid+0x6a/0x70
> > > > [ 2224.967002] Call Trace:
> > > > [ 2224.980941]  blk_mq_request_bypass_insert+0x57/0xa0
> > > > [ 2225.009044]  __blk_mq_try_issue_directly+0x56/0x1e0
> > > > [ 2225.037007]  blk_mq_request_direct_issue+0x5d/0xc0
> > > > [ 2225.090608]  map_request+0x142/0x260 [dm_mod]
> > > > [ 2225.114756]  dm_mq_queue_rq+0xa4/0x120 [dm_mod]
> > > > [ 2225.140812]  blk_mq_dispatch_rq_list+0x90/0x5b0
> > > > [ 2225.211769]  blk_mq_sched_dispatch_requests+0x107/0x1a0
> > > > [ 2225.240825]  __blk_mq_run_hw_queue+0x5f/0xf0
> > > > [ 2225.264852]  process_one_work+0x141/0x340
> > > > [ 2225.287872]  worker_thread+0x47/0x3e0
> > > > [ 2225.308354]  kthread+0xf5/0x130
> > > > [ 2225.396405]  ret_from_fork+0x32/0x40
> > > >
> > > > That call trace did not show up before this patch series was
> > > > added to
> > > > Jens'
> > > > tree. This is a regression. Could this have been introduced by
> > > > this
> > > > patch
> > > > series?
> > > >
> > > > Thanks,
> > > >
> > > > Bart.
> > >
> > > Hi Bart
> > > One thing to note.
> > >
> > > I tested Mike's combined tree on the weekend fully dm4.16-block4.16
> > > and
> > > did not see this.
> > > This was with Mike combined tree and SRPT running 4.13-rc2.
> > >
> > > I also tested your tree Monday with the revert of the
> > > scatter/gather
> > > patches with both SRP and SRPT running your tree and it was fine.
> > >
> > > So its a combination of what you provided me before and that has
> > > been
> > > added to your tree.
> > >
> > > Mike combined tree seemed to be fine, I can revisit that if needed.
> > > I
> > > still have that kernel in place.
> > >
> > > I was not running latest SRPT when I tested Mike's tree
> >
> > Hello Laurence,
> >
> > The tree I sent you this morning did not only include Mike's latest
> > dm code
> > but also Jens' latest for-next branch. So what you wrote above does
> > not
> > contradict what I wrote in my e-mail, namely that I suspect that a
> > regression
> > was introduced by the patches in the series "blk-mq: improve DM's
> > blk-mq IO
> > merging via blk_insert_cloned_request feedback". These changes namely
> > went in
> > through the block tree and not through the dm tree. Additionally,
> > these
> > changes were not present in the block-scsi-for-next branch I sent you
> > on
> > Monday.
> >
> > Bart.
>
> Hi Bart
> Thank you
>
> I probably don't have latest stuff Mike added Monday in his tree.
> I have not gone back to test that, been busy with yours all week.
>
> I will get with Mike and pull his latest and test it with SRPT on 4.13-
> rc2 like before.
>
> That should be the best way to isolate this.
>
> Regards
> Laurence
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [for-4.16 PATCH v6 0/3] blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback
  2018-01-17 23:53           ` Bart Van Assche
  (?)
  (?)
@ 2018-01-18  0:54           ` Mike Snitzer
  2018-01-18  1:17             ` Laurence Oberman
  2018-01-18  3:33             ` [PATCH] blk mq: don't blk_mq_request_bypass_insert _and_ return BLK_STS_RESOURCE Mike Snitzer
  -1 siblings, 2 replies; 23+ messages in thread
From: Mike Snitzer @ 2018-01-18  0:54 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: axboe, loberman, hch, dm-devel, linux-block, tom.leiming

On Wed, Jan 17 2018 at  6:53pm -0500,
Bart Van Assche <Bart.VanAssche@wdc.com> wrote:

> On Wed, 2018-01-17 at 18:43 -0500, Laurence Oberman wrote:
> > On Wed, 2018-01-17 at 23:31 +0000, Bart Van Assche wrote:
> > > On Wed, 2018-01-17 at 11:58 -0500, Mike Snitzer wrote:
> > > > On Wed, Jan 17 2018 at 11:50am -0500,
> > > > Jens Axboe <axboe@kernel.dk> wrote:
> > > > 
> > > > > On 1/17/18 9:25 AM, Mike Snitzer wrote:
> > > > > > Hi Jens,
> > > > > > 
> > > > > > Think this finally takes care of it! ;)
> > > > > > 
> > > > > > Thanks,
> > > > > > Mike
> > > > > > 
> > > > > > Mike Snitzer (2):
> > > > > >   blk-mq: factor out a few helpers from
> > > > > > __blk_mq_try_issue_directly
> > > > > >   blk-mq-sched: remove unused 'can_block' arg from
> > > > > > blk_mq_sched_insert_request
> > > > > > 
> > > > > > Ming Lei (1):
> > > > > >   blk-mq: improve DM's blk-mq IO merging via
> > > > > > blk_insert_cloned_request feedback
> > > > > 
> > > > > Applied - added actual commit message to patch 3.
> > > > 
> > > > Great, thanks.
> > > 
> > > Hello Mike,
> > > 
> > > Laurence hit the following while retesting the SRP initiator code:
> > > 
> > > [ 2223.797129] list_add corruption. prev->next should be next
> > > (00000000e0ddd5dd), but was 000000003defe5cd.
> > > (prev=000000003defe5cd).
> > > [ 2223.862168] WARNING: CPU: 14 PID: 577 at lib/list_debug.c:28
> > > __list_add_valid+0x6a/0x70
> > > [ 2224.481151] CPU: 14 PID: 577 Comm: kworker/14:1H Tainted:
> > > G          I      4.15.0-rc8.bart3+ #1
> > > [ 2224.531193] Hardware name: HP ProLiant DL380 G7, BIOS P67
> > > 08/16/2015
> > > [ 2224.567150] Workqueue: kblockd blk_mq_run_work_fn
> > > [ 2224.593182] RIP: 0010:__list_add_valid+0x6a/0x70
> > > [ 2224.967002] Call Trace:
> > > [ 2224.980941]  blk_mq_request_bypass_insert+0x57/0xa0
> > > [ 2225.009044]  __blk_mq_try_issue_directly+0x56/0x1e0
> > > [ 2225.037007]  blk_mq_request_direct_issue+0x5d/0xc0
> > > [ 2225.090608]  map_request+0x142/0x260 [dm_mod]
> > > [ 2225.114756]  dm_mq_queue_rq+0xa4/0x120 [dm_mod]
> > > [ 2225.140812]  blk_mq_dispatch_rq_list+0x90/0x5b0
> > > [ 2225.211769]  blk_mq_sched_dispatch_requests+0x107/0x1a0
> > > [ 2225.240825]  __blk_mq_run_hw_queue+0x5f/0xf0
> > > [ 2225.264852]  process_one_work+0x141/0x340
> > > [ 2225.287872]  worker_thread+0x47/0x3e0
> > > [ 2225.308354]  kthread+0xf5/0x130
> > > [ 2225.396405]  ret_from_fork+0x32/0x40
> > > 
> > > That call trace did not show up before this patch series was added to
> > > Jens'
> > > tree. This is a regression. Could this have been introduced by this
> > > patch
> > > series?
> > > 
> > > Thanks,
> > > 
> > > Bart.
> > 
> > Hi Bart
> > One thing to note.
> > 
> > I tested Mike's combined tree on the weekend fully dm4.16-block4.16 and
> > did not see this.
> > This was with Mike combined tree and SRPT running 4.13-rc2.
> > 
> > I also tested your tree Monday with the revert of the scatter/gather
> > patches with both SRP and SRPT running your tree and it was fine.
> > 
> > So its a combination of what you provided me before and that has been
> > added to your tree.
> > 
> > Mike combined tree seemed to be fine, I can revisit that if needed. I
> > still have that kernel in place.
> > 
> > I was not running latest SRPT when I tested Mike's tree
> 
> Hello Laurence,
> 
> The tree I sent you this morning did not only include Mike's latest dm code
> but also Jens' latest for-next branch. So what you wrote above does not
> contradict what I wrote in my e-mail, namely that I suspect that a regression
> was introduced by the patches in the series "blk-mq: improve DM's blk-mq IO
> merging via blk_insert_cloned_request feedback". These changes namely went in
> through the block tree and not through the dm tree. Additionally, these
> changes were not present in the block-scsi-for-next branch I sent you on
> Monday.

Functionality shouldn't be any different than the variant (Ming's v4)
that Laurence tested on Sunday/Monday (once we got past the genirq issue
on HPSA).

But sure, I suppose there is something I missed when refactoring Ming's
change to get it acceptable for upstream.  I went over the mechanical
nature of what I did many times (comparing Ming's v4 to my v5).

Anyway, we'll see how Laurence fares with this tree (but with the revert
of 84676c1 added, so his HPSA server will boot):
https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/log/?h=block-4.16_dm-4.16
(which is the same as linux-dm.git's 'for-next' at the moment)

The call to blk_mq_request_bypass_insert will only occur via
__blk_mq_fallback_to_insert, which, as the name implies, is not the
fast path.  It will occur if the underlying blk-mq device cannot get the
resources it needs in order to issue the request.  Specifically: if/when
in __blk_mq_try_issue_directly() the hctx is stopped, or the queue is
quiesced, or it cannot get the driver tag or dispatch_budget (in the
case of scsi-mq).
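
For readers following the thread, here is a rough sketch of the decision
flow just described (simplified from the helpers added by this series; not
the exact upstream code, and locking/requeue handling is omitted):

static blk_status_t try_issue_directly_sketch(struct blk_mq_hw_ctx *hctx,
					      struct request *rq,
					      blk_qc_t *cookie,
					      bool bypass_insert)
{
	struct request_queue *q = rq->q;

	/* stopped or quiesced queue: cannot issue now, fall back to insert */
	if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))
		goto insert;

	/* plain submission with an elevator means insert; direct issue
	 * from dm-rq (bypass_insert) skips this check */
	if (q->elevator && !bypass_insert)
		goto insert;

	/* no driver tag or no dispatch budget (scsi-mq): fall back */
	if (!blk_mq_get_driver_tag(rq, NULL, false) ||
	    !blk_mq_get_dispatch_budget(hctx)) {
		blk_mq_put_driver_tag(rq);
		goto insert;
	}

	/* fast path: hand the request straight to the driver's ->queue_rq() */
	return __blk_mq_issue_directly(hctx, rq, cookie);

insert:
	/* slow path: insert instead of issuing; what status to report back
	 * to the caller in the bypass_insert case is the subject of the
	 * rest of this thread (BLK_STS_OK here just keeps the sketch valid) */
	__blk_mq_fallback_to_insert(hctx, rq, true, bypass_insert);
	return BLK_STS_OK;
}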

The same fallback, via a call to blk_mq_request_bypass_insert, occurred
with Ming's v4 though.

Anyway, we'll see what Laurence finds when testing my above kernel.

Mike

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [for-4.16 PATCH v6 0/3] blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback
  2018-01-18  0:54           ` Mike Snitzer
@ 2018-01-18  1:17             ` Laurence Oberman
  2018-01-18  3:33             ` [PATCH] blk mq: don't blk_mq_request_bypass_insert _and_ return BLK_STS_RESOURCE Mike Snitzer
  1 sibling, 0 replies; 23+ messages in thread
From: Laurence Oberman @ 2018-01-18  1:17 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Bart Van Assche, axboe, dm-devel, hch, tom.leiming


Mike and Bart,

Replying via the browser because Evolution is having issues, so I cannot
copy the block list.
The Gmail browser won't do clean text; only Evolution does that for me.

The server is running 4.13-rc1 for the SRPT so it stayed up; the client sees
the list corruption like I saw on Bart's tree.
Someone will have to copy the block list if needed until I get Evolution
back.

[  370.765953] list_add corruption. prev->next should be next
(0000000062acf0b0), but was 000000007a44ce4f. (prev=000000007a44ce4f).
[  370.831948] WARNING: CPU: 15 PID: 13175 at lib/list_debug.c:28
__list_add_valid+0x6a/0x70
[  370.877893] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE
nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4
nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun
bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables
iptable_filter rpcrdma ib_isert iscsi_target_mod target_core_mod ib_iser
libiscsi scsi_transport_iscsi ib_srp scsi_transport_srp ib_ipoib rdma_ucm
ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx5_ib ib_core
intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul
crc32_pclmul ipmi_ssif ghash_clmulni_intel pcbc aesni_intel crypto_simd
ipmi_si glue_helper cryptd joydev ipmi_devintf ipmi_msghandler iTCO_wdt
iTCO_vendor_support dm_service_time pcspkr acpi_power_meter gpio_ich hpilo
hpwdt sg pcc_cpufreq i7core_edac
[  371.273754]  shpchp lpc_ich nfsd auth_rpcgss nfs_acl lockd grace
dm_multipath sunrpc ip_tables xfs libcrc32c sd_mod radeon mlx5_core
i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops
ttm mlxfw drm ptp i2c_core crc32c_intel hpsa pps_core serio_raw bnx2
devlink scsi_transport_sas dm_mirror dm_region_hash dm_log dm_mod
[  371.444895] CPU: 15 PID: 13175 Comm: kworker/u66:14 Tainted: G
I      4.15.0-rc4.dm_and_block+ #1
[  371.499292] Hardware name: HP ProLiant DL380 G7, BIOS P67 08/16/2015
[  371.535070] Workqueue: writeback wb_workfn (flush-253:13)
[  371.565729] RIP: 0010:__list_add_valid+0x6a/0x70
[  371.592197] RSP: 0018:ffffae5b8c293690 EFLAGS: 00010286
[  371.621114] RAX: 0000000000000000 RBX: ffff929ba920b800 RCX:
0000000000000000
[  371.661131] RDX: 0000000000000001 RSI: ffff929c333cdf78 RDI:
ffff929c333cdf78
[  371.702111] RBP: ffff929ba8b34c00 R08: 0000000000000000 R09:
00000000000006d7
[  371.743248] R10: 0000000000000000 R11: ffffae5b8c2933f8 R12:
ffff929ba8b34c40
[  371.783772] R13: ffff929ba8b34c40 R14: ffff929ba920b808 R15:
0000000000000001
[  371.824291] FS:  0000000000000000(0000) GS:ffff929c333c0000(0000)
knlGS:0000000000000000
[  371.870706] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  371.903803] CR2: 00007fb07e09b000 CR3: 0000000f56609006 CR4:
00000000000206e0
[  371.943875] Call Trace:
[  371.957310]  blk_mq_request_bypass_insert+0x57/0xa0
[  371.984333]  __blk_mq_try_issue_directly+0x56/0x1e0
[  372.011390]  blk_mq_request_direct_issue+0x5d/0xc0
[  372.038404]  ? blk_insert_cloned_request+0x96/0x1c0
[  372.065923]  map_request+0x142/0x260 [dm_mod]
[  372.090043]  dm_mq_queue_rq+0xa4/0x120 [dm_mod]
[  372.115507]  blk_mq_dispatch_rq_list+0x8e/0x530
[  372.141076]  ? deadline_remove_request+0x79/0xc0
[  372.167035]  blk_mq_do_dispatch_sched+0x8b/0x110
[  372.192938]  blk_mq_sched_dispatch_requests+0x118/0x1a0
[  372.222326]  __blk_mq_run_hw_queue+0x5f/0xf0
[  372.246083]  __blk_mq_delay_run_hw_queue+0x9c/0xa0
[  372.272943]  blk_mq_run_hw_queue+0x54/0xf0
[  372.296181]  blk_mq_flush_plug_list+0x17f/0x260
[  372.322152]  blk_flush_plug_list+0xe4/0x260
[  372.345712]  blk_mq_make_request+0x483/0x560
[  372.370156]  generic_make_request+0x110/0x2e0
[  372.394759]  submit_bio+0x6e/0x140
[  372.414258]  xfs_submit_ioend+0x9c/0x110 [xfs]
[  372.439759]  xfs_vm_writepages+0xc6/0xd0 [xfs]
[  372.464677]  do_writepages+0x17/0x70
[  372.484686]  __writeback_single_inode+0x3d/0x330
[  372.510219]  writeback_sb_inodes+0x24f/0x4b0
[  372.534789]  __writeback_inodes_wb+0x87/0xb0
[  372.558826]  wb_writeback+0x276/0x310
[  372.579434]  wb_workfn+0x1b0/0x460
[  372.598886]  process_one_work+0x141/0x340
[  372.621522]  worker_thread+0x47/0x3e0
[  372.641939]  kthread+0xf5/0x130
[  372.659890]  ? rescuer_thread+0x380/0x380
[  372.682932]  ? kthread_associate_blkcg+0x90/0x90
[  372.708925]  ret_from_fork+0x1f/0x30
[  372.729339] Code: fe 31 c0 48 c7 c7 58 a8 c9 ba e8 02 4c cf ff 0f ff 31
c0 c3 48 89 d1 48 c7 c7 08 a8 c9 ba 48 89 f2 48 89 c6 31 c0 e8 e6 4b cf ff
<0f> ff 31 c0 c3 90 48 8b 07 48 b9 00 01 00 00 00 00 ad de 48 8b
[  372.835562] ---[ end trace d6ddd485d92c6ddd ]---

[  372.861081] scsi host1: ib_srp: Failed to map data (-5)    Note

[  372.863116] WARNING: CPU: 15 PID: 13175 at block/blk-mq.c:667
blk_mq_start_request+0x161/0x170
[  372.863117] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE
nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4
nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun
bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables
iptable_filter rpcrdma ib_isert iscsi_target_mod target_core_mod ib_iser
libiscsi scsi_transport_iscsi ib_srp scsi_transport_srp ib_ipoib rdma_ucm
ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx5_ib ib_core
intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul
crc32_pclmul ipmi_ssif ghash_clmulni_intel pcbc aesni_intel crypto_simd
ipmi_si glue_helper cryptd joydev ipmi_devintf ipmi_msghandler iTCO_wdt
iTCO_vendor_support dm_service_time pcspkr acpi_power_meter gpio_ich hpilo
hpwdt sg pcc_cpufreq i7core_edac
[  372.863147]  shpchp lpc_ich nfsd auth_rpcgss nfs_acl lockd grace
dm_multipath sunrpc ip_tables xfs libcrc32c sd_mod radeon mlx5_core
i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops
ttm mlxfw drm ptp i2c_core crc32c_intel hpsa pps_core serio_raw bnx2
devlink scsi_transport_sas dm_mirror dm_region_hash dm_log dm_mod
[  372.863165] CPU: 15 PID: 13175 Comm: kworker/u66:14 Tainted: G        W
I      4.15.0-rc4.dm_and_block+ #1
[  372.863166] Hardware name: HP ProLiant DL380 G7, BIOS P67 08/16/2015
[  372.863170] Workqueue: writeback wb_workfn (flush-253:13)
[  372.863173] RIP: 0010:blk_mq_start_request+0x161/0x170
[  372.863174] RSP: 0018:ffffae5b8c293480 EFLAGS: 00010202
[  372.863177] RAX: 0000000000000009 RBX: ffff929ba8b34c00 RCX:
0001ffffffffffff
[  372.863179] RDX: 00000056af18c440 RSI: 00310d51549749a3 RDI:
ffffffffbae29760
[  372.863180] RBP: 0000000000000000 R08: 0000000000000000 R09:
ffff929ba8b34d98
[  372.863182] R10: 0000000000001000 R11: 0000000000000000 R12:
ffff929ba8cdcba8
[  372.863183] R13: 0000000000004000 R14: ffff929ba920b800 R15:
ffff929ba8b34d60
[  372.863185] FS:  0000000000000000(0000) GS:ffff929c333c0000(0000)
knlGS:0000000000000000
[  372.863187] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  372.863188] CR2: 00007fb07e09b000 CR3: 0000000f56609006 CR4:
00000000000206e0
[  372.863189] Call Trace:
[  372.863200]  scsi_queue_rq+0x2f4/0x560
[  372.863202]  ? scsi_mq_get_budget+0x31/0x110
[  372.863205]  __blk_mq_try_issue_directly+0x195/0x1e0
[  372.863206]  blk_mq_request_direct_issue+0x5d/0xc0
[  372.863209]  ? blk_insert_cloned_request+0x96/0x1c0
[  372.863221]  map_request+0x142/0x260 [dm_mod]
[  372.863226]  dm_mq_queue_rq+0xa4/0x120 [dm_mod]
[  372.863229]  blk_mq_dispatch_rq_list+0x8e/0x530
[  372.863233]  ? deadline_remove_request+0x79/0xc0
[  372.863236]  blk_mq_do_dispatch_sched+0x8b/0x110
[  372.863239]  blk_mq_sched_dispatch_requests+0x118/0x1a0
[  372.863241]  __blk_mq_run_hw_queue+0x5f/0xf0
[  372.863243]  __blk_mq_delay_run_hw_queue+0x9c/0xa0
[  372.863245]  blk_mq_run_hw_queue+0x54/0xf0
[  372.863248]  blk_mq_flush_plug_list+0x17f/0x260
[  372.863250]  blk_flush_plug_list+0xe4/0x260
[  372.863252]  blk_mq_make_request+0x483/0x560
[  372.863254]  generic_make_request+0x110/0x2e0
[  372.863257]  submit_bio+0x6e/0x140
[  372.863333]  xfs_add_to_ioend+0x13f/0x240 [xfs]
[  372.863364]  ? xfs_map_buffer.isra.14+0x33/0x60 [xfs]
[  372.863397]  xfs_do_writepage+0x214/0x600 [xfs]
[  372.863403]  ? find_get_pages_range_tag+0x15f/0x290
[  372.863407]  ? invalid_page_referenced_vma+0x90/0x90
[  372.863410]  write_cache_pages+0x222/0x470
[  372.863439]  ? xfs_aops_discard_page+0x130/0x130 [xfs]
[  372.863469]  xfs_vm_writepages+0xb2/0xd0 [xfs]
[  372.863473]  do_writepages+0x17/0x70
[  372.863475]  __writeback_single_inode+0x3d/0x330
[  372.863477]  writeback_sb_inodes+0x24f/0x4b0
[  372.863479]  __writeback_inodes_wb+0x87/0xb0
[  372.863481]  wb_writeback+0x276/0x310
[  372.863483]  wb_workfn+0x1b0/0x460
[  372.863489]  process_one_work+0x141/0x340
[  372.863491]  worker_thread+0x47/0x3e0
[  372.863494]  kthread+0xf5/0x130
[  372.863497]  ? rescuer_thread+0x380/0x380
[  372.863499]  ? kthread_associate_blkcg+0x90/0x90
[  372.863504]  ret_from_fork+0x1f/0x30
[  372.863506] Code: ed 09 48 21 c8 44 89 ea 81 e2 ff 0f 00 00 48 c1 e2 31
48 09 ea 48 09 c2 48 89 93 b0 00 00 00 e9 ea fe ff ff 0f ff e9 14 ff ff ff
<0f> ff e9 eb fe ff ff 0f 1f 84 00 00 00 00 00 66 66 66 66 90 48
[  372.863523] ---[ end trace d6ddd485d92c6dde ]---
[  372.863536] WARNING: CPU: 15 PID: 13175 at block/blk-mq.h:128
blk_mq_start_request+0x15a/0x170
[  372.863537] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE
nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4
nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun
bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables
iptable_filter rpcrdma ib_isert iscsi_target_mod target_core_mod ib_iser
libiscsi scsi_transport_iscsi ib_srp scsi_transport_srp ib_ipoib rdma_ucm
ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx5_ib ib_core
intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul
crc32_pclmul ipmi_ssif ghash_clmulni_intel pcbc aesni_intel crypto_simd
ipmi_si glue_helper cryptd joydev ipmi_devintf ipmi_msghandler iTCO_wdt
iTCO_vendor_support dm_service_time pcspkr acpi_power_meter gpio_ich hpilo
hpwdt sg pcc_cpufreq i7core_edac
[  372.863557]  shpchp lpc_ich nfsd auth_rpcgss nfs_acl lockd grace
dm_multipath sunrpc ip_tables xfs libcrc32c sd_mod radeon mlx5_core
i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops
ttm mlxfw drm ptp i2c_core crc32c_intel hpsa pps_core serio_raw bnx2
devlink scsi_transport_sas dm_mirror dm_region_hash dm_log dm_mod
[  372.863568] CPU: 15 PID: 13175 Comm: kworker/u66:14 Tainted: G        W
I      4.15.0-rc4.dm_and_block+ #1
[  372.863569] Hardware name: HP ProLiant DL380 G7, BIOS P67 08/16/2015
[  372.863571] Workqueue: writeback wb_workfn (flush-253:13)
[  372.863573] RIP: 0010:blk_mq_start_request+0x15a/0x170
[  372.863574] RSP: 0018:ffffae5b8c293480 EFLAGS: 00010202
[  372.863576] RAX: 0000000000000009 RBX: ffff929ba8b34c00 RCX:
0001ffffffffffff
[  372.863577] RDX: 0000000000000001 RSI: 00310d51549749a3 RDI:
ffffffffbae29760
[  372.863578] RBP: 0000000000000000 R08: 0000000000000000 R09:
ffff929ba8b34d98
[  372.863579] R10: 0000000000001000 R11: 0000000000000000 R12:
ffff929ba8cdcba8
[  372.863580] R13: 0000000000004000 R14: ffff929ba920b800 R15:
ffff929ba8b34d60
[  372.863582] FS:  0000000000000000(0000) GS:ffff929c333c0000(0000)
knlGS:0000000000000000
[  372.863584] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  372.863585] CR2: 00007fb07e09b000 CR3: 0000000f56609006 CR4:
00000000000206e0
[  372.863587] Call Trace:
[  372.863589]  scsi_queue_rq+0x2f4/0x560
[  372.863591]  ? scsi_mq_get_budget+0x31/0x110
[  372.863593]  __blk_mq_try_issue_directly+0x195/0x1e0
[  372.863595]  blk_mq_request_direct_issue+0x5d/0xc0
[  372.863598]  ? blk_insert_cloned_request+0x96/0x1c0
[  372.863604]  map_request+0x142/0x260 [dm_mod]
[  372.863609]  dm_mq_queue_rq+0xa4/0x120 [dm_mod]
[  372.863611]  blk_mq_dispatch_rq_list+0x8e/0x530
[  372.863614]  ? deadline_remove_request+0x79/0xc0
[  372.863617]  blk_mq_do_dispatch_sched+0x8b/0x110
[  372.863619]  blk_mq_sched_dispatch_requests+0x118/0x1a0
[  372.863622]  __blk_mq_run_hw_queue+0x5f/0xf0
[  372.863625]  __blk_mq_delay_run_hw_queue+0x9c/0xa0
[  372.863627]  blk_mq_run_hw_queue+0x54/0xf0
[  372.863628]  blk_mq_flush_plug_list+0x17f/0x260
[  372.863630]  blk_flush_plug_list+0xe4/0x260
[  372.863632]  blk_mq_make_request+0x483/0x560
[  372.863634]  generic_make_request+0x110/0x2e0
[  372.863635]  submit_bio+0x6e/0x140
[  372.863664]  xfs_add_to_ioend+0x13f/0x240 [xfs]
[  372.863694]  ? xfs_map_buffer.isra.14+0x33/0x60 [xfs]
[  372.863724]  xfs_do_writepage+0x214/0x600 [xfs]
[  372.863727]  ? find_get_pages_range_tag+0x15f/0x290
[  372.863730]  ? invalid_page_referenced_vma+0x90/0x90
[  372.863732]  write_cache_pages+0x222/0x470
[  372.863762]  ? xfs_aops_discard_page+0x130/0x130 [xfs]
[  372.863791]  xfs_vm_writepages+0xb2/0xd0 [xfs]
[  372.863794]  do_writepages+0x17/0x70
[  372.863796]  __writeback_single_inode+0x3d/0x330
[  372.863799]  writeback_sb_inodes+0x24f/0x4b0
[  372.863801]  __writeback_inodes_wb+0x87/0xb0
[  372.863804]  wb_writeback+0x276/0x310
[  372.863806]  wb_workfn+0x1b0/0x460
[  372.863809]  process_one_work+0x141/0x340
[  372.863812]  worker_thread+0x47/0x3e0
[  372.863814]  kthread+0xf5/0x130
[  372.863817]  ? rescuer_thread+0x380/0x380
[  372.863820]  ? kthread_associate_blkcg+0x90/0x90
[  372.863822]  ret_from_fork+0x1f/0x30
[  372.863823] Code: 18 00 00 02 00 41 c1 ed 09 48 21 c8 44 89 ea 81 e2 ff
0f 00 00 48 c1 e2 31 48 09 ea 48 09 c2 48 89 93 b0 00 00 00 e9 ea fe ff ff
<0f> ff e9 14 ff ff ff 0f ff e9 eb fe ff ff 0f 1f 84 00 00 00 00
[  372.863841] ---[ end trace d6ddd485d92c6ddf ]---
[  372.915387] ------------[ cut here ]------------
[  372.915391] list_add corruption. prev->next should be next
(00000000bebab7ca), but was 00000000b1199db9. (prev=00000000b1199db9).
[  372.915416] WARNING: CPU: 15 PID: 13175 at lib/list_debug.c:28
__list_add_valid+0x6a/0x70
[  372.915417] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE
nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4
nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun
bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables
iptable_filter rpcrdma ib_isert iscsi_target_mod target_core_mod ib_iser
libiscsi scsi_transport_iscsi ib_srp scsi_transport_srp ib_ipoib rdma_ucm
ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx5_ib ib_core
intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul
crc32_pclmul ipmi_ssif ghash_clmulni_intel pcbc aesni_intel crypto_simd
ipmi_si glue_helper cryptd joydev ipmi_devintf ipmi_msghandler iTCO_wdt
iTCO_vendor_support dm_service_time pcspkr acpi_power_meter gpio_ich hpilo
hpwdt sg pcc_cpufreq i7core_edac
[  372.915454]  shpchp lpc_ich nfsd auth_rpcgss nfs_acl lockd grace
dm_multipath sunrpc ip_tables xfs libcrc32c sd_mod radeon mlx5_core
i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops
ttm mlxfw drm ptp i2c_core crc32c_intel hpsa pps_core serio_raw bnx2
devlink scsi_transport_sas dm_mirror dm_region_hash dm_log dm_mod
[  372.915473] CPU: 15 PID: 13175 Comm: kworker/u66:14 Tainted: G        W
I      4.15.0-rc4.dm_and_block+ #1
[  372.915473] Hardware name: HP ProLiant DL380 G7, BIOS P67 08/16/2015
[  372.915478] Workqueue: writeback wb_workfn (flush-253:13)
[  372.915481] RIP: 0010:__list_add_valid+0x6a/0x70
[  372.915482] RSP: 0018:ffffae5b8c2934b0 EFLAGS: 00010282
[  372.915484] RAX: 0000000000000000 RBX: ffff929c2bb83000 RCX:
0000000000000000
[  372.915485] RDX: 0000000000000001 RSI: 0000000000000002 RDI:
0000000000000282
[  372.915487] RBP: ffff929ba87ed100 R08: 0000000000000075 R09:
ffff929c6402cfdd
[  372.915487] R10: 0000000000000780 R11: 0000000000000000 R12:
ffff929ba87ed140
[  372.915489] R13: ffff929ba87ed140 R14: ffff929c2bb83008 R15:
0000000000000001
[  372.915491] FS:  0000000000000000(0000) GS:ffff929c333c0000(0000)
knlGS:0000000000000000
[  372.915492] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  372.915493] CR2: 00007fb07e09b000 CR3: 0000000f56609006 CR4:
00000000000206e0
[  372.915494] Call Trace:
[  372.915500]  blk_mq_request_bypass_insert+0x57/0xa0
[  372.915503]  __blk_mq_try_issue_directly+0x56/0x1e0
[  372.915504]  blk_mq_request_direct_issue+0x5d/0xc0
[  372.915507]  ? blk_insert_cloned_request+0x96/0x1c0
[  372.915517]  map_request+0x142/0x260 [dm_mod]
[  372.915522]  dm_mq_queue_rq+0xa4/0x120 [dm_mod]
[  372.915525]  blk_mq_dispatch_rq_list+0x8e/0x530
[  372.915528]  ? deadline_remove_request+0x79/0xc0
[  372.915531]  blk_mq_do_dispatch_sched+0x8b/0x110
[  372.915534]  blk_mq_sched_dispatch_requests+0x118/0x1a0
[  372.915538]  __blk_mq_run_hw_queue+0x5f/0xf0
[  372.915540]  __blk_mq_delay_run_hw_queue+0x9c/0xa0
[  372.915542]  blk_mq_run_hw_queue+0x54/0xf0
[  372.915544]  blk_mq_flush_plug_list+0x17f/0x260
[  372.915547]  blk_flush_plug_list+0xe4/0x260
[  372.915550]  blk_mq_make_request+0x483/0x560
[  372.915553]  generic_make_request+0x110/0x2e0
[  372.915555]  submit_bio+0x6e/0x140
[  372.915608]  xfs_add_to_ioend+0x13f/0x240 [xfs]
[  372.915639]  ? xfs_map_buffer.isra.14+0x33/0x60 [xfs]
[  372.915667]  xfs_do_writepage+0x214/0x600 [xfs]
[  372.915673]  ? find_get_pages_range_tag+0x15f/0x290
[  372.915677]  ? invalid_page_referenced_vma+0x90/0x90
[  372.915680]  write_cache_pages+0x222/0x470
[  372.915709]  ? xfs_aops_discard_page+0x130/0x130 [xfs]
[  372.915739]  xfs_vm_writepages+0xb2/0xd0 [xfs]
[  372.915743]  do_writepages+0x17/0x70
[  372.915745]  __writeback_single_inode+0x3d/0x330
[  372.915746]  writeback_sb_inodes+0x24f/0x4b0
[  372.915749]  __writeback_inodes_wb+0x87/0xb0
[  372.915751]  wb_writeback+0x276/0x310
[  372.915753]  wb_workfn+0x1b0/0x460
[  372.915758]  process_one_work+0x141/0x340
[  372.915761]  worker_thread+0x47/0x3e0
[  372.915764]  kthread+0xf5/0x130
[  372.915766]  ? rescuer_thread+0x380/0x380
[  372.915769]  ? kthread_associate_blkcg+0x90/0x90
[  372.915774]  ret_from_fork+0x1f/0x30
[  372.915775] Code: fe 31 c0 48 c7 c7 58 a8 c9 ba e8 02 4c cf ff 0f ff 31
c0 c3 48 89 d1 48 c7 c7 08 a8 c9 ba 48 89 f2 48 89 c6 31 c0 e8 e6 4b cf ff
<0f> ff 31 c0 c3 90 48 8b 07 48 b9 00 01 00 00 00 00 ad de 48 8b
[  372.915792] ---[ end trace d6ddd485d92c6de0 ]---
[  372.916492] ------------[ cut here ]------------
[  372.916493] list_add corruption. prev->next should be next
(00000000bebab7ca), but was 00000000b1199db9. (prev=00000000b1199db9).
[  372.916501] WARNING: CPU: 15 PID: 13175 at lib/list_debug.c:28
__list_add_valid+0x6a/0x70






On Wed, Jan 17, 2018 at 7:54 PM, Mike Snitzer <snitzer@redhat.com> wrote:

> On Wed, Jan 17 2018 at  6:53pm -0500,
> Bart Van Assche <Bart.VanAssche@wdc.com> wrote:
>
> > On Wed, 2018-01-17 at 18:43 -0500, Laurence Oberman wrote:
> > > On Wed, 2018-01-17 at 23:31 +0000, Bart Van Assche wrote:
> > > > On Wed, 2018-01-17 at 11:58 -0500, Mike Snitzer wrote:
> > > > > On Wed, Jan 17 2018 at 11:50am -0500,
> > > > > Jens Axboe <axboe@kernel.dk> wrote:
> > > > >
> > > > > > On 1/17/18 9:25 AM, Mike Snitzer wrote:
> > > > > > > Hi Jens,
> > > > > > >
> > > > > > > Think this finally takes care of it! ;)
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Mike
> > > > > > >
> > > > > > > Mike Snitzer (2):
> > > > > > >   blk-mq: factor out a few helpers from
> > > > > > > __blk_mq_try_issue_directly
> > > > > > >   blk-mq-sched: remove unused 'can_block' arg from
> > > > > > > blk_mq_sched_insert_request
> > > > > > >
> > > > > > > Ming Lei (1):
> > > > > > >   blk-mq: improve DM's blk-mq IO merging via
> > > > > > > blk_insert_cloned_request feedback
> > > > > >
> > > > > > Applied - added actual commit message to patch 3.
> > > > >
> > > > > Great, thanks.
> > > >
> > > > Hello Mike,
> > > >
> > > > Laurence hit the following while retesting the SRP initiator code:
> > > >
> > > > [ 2223.797129] list_add corruption. prev->next should be next
> > > > (00000000e0ddd5dd), but was 000000003defe5cd.
> > > > (prev=000000003defe5cd).
> > > > [ 2223.862168] WARNING: CPU: 14 PID: 577 at lib/list_debug.c:28
> > > > __list_add_valid+0x6a/0x70
> > > > [ 2224.481151] CPU: 14 PID: 577 Comm: kworker/14:1H Tainted:
> > > > G          I      4.15.0-rc8.bart3+ #1
> > > > [ 2224.531193] Hardware name: HP ProLiant DL380 G7, BIOS P67
> > > > 08/16/2015
> > > > [ 2224.567150] Workqueue: kblockd blk_mq_run_work_fn
> > > > [ 2224.593182] RIP: 0010:__list_add_valid+0x6a/0x70
> > > > [ 2224.967002] Call Trace:
> > > > [ 2224.980941]  blk_mq_request_bypass_insert+0x57/0xa0
> > > > [ 2225.009044]  __blk_mq_try_issue_directly+0x56/0x1e0
> > > > [ 2225.037007]  blk_mq_request_direct_issue+0x5d/0xc0
> > > > [ 2225.090608]  map_request+0x142/0x260 [dm_mod]
> > > > [ 2225.114756]  dm_mq_queue_rq+0xa4/0x120 [dm_mod]
> > > > [ 2225.140812]  blk_mq_dispatch_rq_list+0x90/0x5b0
> > > > [ 2225.211769]  blk_mq_sched_dispatch_requests+0x107/0x1a0
> > > > [ 2225.240825]  __blk_mq_run_hw_queue+0x5f/0xf0
> > > > [ 2225.264852]  process_one_work+0x141/0x340
> > > > [ 2225.287872]  worker_thread+0x47/0x3e0
> > > > [ 2225.308354]  kthread+0xf5/0x130
> > > > [ 2225.396405]  ret_from_fork+0x32/0x40
> > > >
> > > > That call trace did not show up before this patch series was added to
> > > > Jens'
> > > > tree. This is a regression. Could this have been introduced by this
> > > > patch
> > > > series?
> > > >
> > > > Thanks,
> > > >
> > > > Bart.
> > >
> > > Hi Bart
> > > One thing to note.
> > >
> > > I tested Mike's combined tree on the weekend fully dm4.16-block4.16 and
> > > did not see this.
> > > This was with Mike combined tree and SRPT running 4.13-rc2.
> > >
> > > I also tested your tree Monday with the revert of the scatter/gather
> > > patches with both SRP and SRPT running your tree and it was fine.
> > >
> > > So its a combination of what you provided me before and that has been
> > > added to your tree.
> > >
> > > Mike combined tree seemed to be fine, I can revisit that if needed. I
> > > still have that kernel in place.
> > >
> > > I was not running latest SRPT when I tested Mike's tree
> >
> > Hello Laurence,
> >
> > The tree I sent you this morning did not only include Mike's latest dm
> code
> > but also Jens' latest for-next branch. So what you wrote above does not
> > contradict what I wrote in my e-mail, namely that I suspect that a
> regression
> > was introduced by the patches in the series "blk-mq: improve DM's blk-mq
> IO
> > merging via blk_insert_cloned_request feedback". These changes namely
> went in
> > through the block tree and not through the dm tree. Additionally, these
> > changes were not present in the block-scsi-for-next branch I sent you on
> > Monday.
>
> Functionality shouldn't be any different than the variant (Ming's v4)
> that Laurence tested on Sunday/Monday (once we got past the genirq issue
> on HPSA).
>
> But sure, I suppose there is something I missed when refactoring Ming's
> change to get it acceptable for upstream.  I went over the mechanical
> nature of what I did many times (comaping Ming's v4 to my v5).
>
> Anyway, we'll see how Laurence fairs with this tree (but with the revert
> of 84676c1 added, so his HPSA server will boot):
> https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/
> linux.git/log/?h=block-4.16_dm-4.16
> (which is the same as linux-dm.git's 'for-next' at the moment)
>
> The call to blk_mq_request_bypass_insert will only occur via
> __blk_mq_fallback_to_insert.  Which as the name implies this is not the
> fast path.  This will occur if the underlying blk-mq device cannot get
> resources it needs in order to issue the request.  Specifically: if/when
> in __blk_mq_try_issue_directly() the hctx is stopped, or queue is
> quiesced, or it cannot get the driver tag or dispatch_budget (in the
> case of scsi-mq).
>
> The same fallback, via call to blk_mq_request_bypass_insert, occured
> with Ming's v4 though.
>
> Anyway, we'll see what Laurence finds when testing my above kernel.
>
> Mike
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [for-4.16 PATCH v6 2/3] blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback
  2018-01-17 16:25 ` [for-4.16 PATCH v6 2/3] blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback Mike Snitzer
@ 2018-01-18  3:25   ` Ming Lei
  2018-01-18  3:37     ` Mike Snitzer
  0 siblings, 1 reply; 23+ messages in thread
From: Ming Lei @ 2018-01-18  3:25 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: axboe, Ming Lei, hch, dm-devel, linux-block

Hi Mike,

On Wed, Jan 17, 2018 at 11:25:57AM -0500, Mike Snitzer wrote:
> From: Ming Lei <ming.lei@redhat.com>
> 
> blk_insert_cloned_request() is called in the fast path of a dm-rq driver
> (e.g. blk-mq request-based DM mpath).  blk_insert_cloned_request() uses
> blk_mq_request_bypass_insert() to directly append the request to the
> blk-mq hctx->dispatch_list of the underlying queue.
> 
> 1) This way isn't efficient enough because the hctx spinlock is always
> used.
> 
> 2) With blk_insert_cloned_request(), we completely bypass underlying
> queue's elevator and depend on the upper-level dm-rq driver's elevator
> to schedule IO.  But dm-rq currently can't get the underlying queue's
> dispatch feedback at all.  Without knowing whether a request was issued
> or not (e.g. due to underlying queue being busy) the dm-rq elevator will
> not be able to provide effective IO merging (as a side-effect of dm-rq
> currently blindly destaging a request from its elevator only to requeue
> it after a delay, which kills any opportunity for merging).  This
> obviously causes very bad sequential IO performance.
> 
> Fix this by updating blk_insert_cloned_request() to use
> blk_mq_request_direct_issue().  blk_mq_request_direct_issue() allows a
> request to be issued directly to the underlying queue and returns the
> dispatch feedback (blk_status_t).  If blk_mq_request_direct_issue()
> returns BLK_SYS_RESOURCE the dm-rq driver will now use DM_MAPIO_REQUEUE
> to _not_ destage the request.  Whereby preserving the opportunity to
> merge IO.
> 
> With this, request-based DM's blk-mq sequential IO performance is vastly
> improved (as much as 3X in mpath/virtio-scsi testing).
> 
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> [blk-mq.c changes heavily influenced by Ming Lei's initial solution, but
> they were refactored to make them less fragile and easier to read/review]
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  block/blk-core.c   |  3 +--
>  block/blk-mq.c     | 37 +++++++++++++++++++++++++++++--------
>  block/blk-mq.h     |  3 +++
>  drivers/md/dm-rq.c | 19 ++++++++++++++++---
>  4 files changed, 49 insertions(+), 13 deletions(-)
> 
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 7ba607527487..55f338020254 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -2500,8 +2500,7 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *
>  		 * bypass a potential scheduler on the bottom device for
>  		 * insert.
>  		 */
> -		blk_mq_request_bypass_insert(rq, true);
> -		return BLK_STS_OK;
> +		return blk_mq_request_direct_issue(rq);
>  	}
>  
>  	spin_lock_irqsave(q->queue_lock, flags);
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index c117c2baf2c9..f5f0d8456713 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1731,15 +1731,19 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
>  
>  static void __blk_mq_fallback_to_insert(struct blk_mq_hw_ctx *hctx,
>  					struct request *rq,
> -					bool run_queue)
> +					bool run_queue, bool bypass_insert)
>  {
> -	blk_mq_sched_insert_request(rq, false, run_queue, false,
> -					hctx->flags & BLK_MQ_F_BLOCKING);
> +	if (!bypass_insert)
> +		blk_mq_sched_insert_request(rq, false, run_queue, false,
> +					    hctx->flags & BLK_MQ_F_BLOCKING);
> +	else
> +		blk_mq_request_bypass_insert(rq, run_queue);
>  }

If 'bypass_insert' is true, we don't need to insert the request into
hctx->dispatch_list for dm-rq; doing so causes the issue (use-after-free)
reported by Bart and Laurence.

Also, this way is the exact opposite of the idea of the improvement: we do
not want to dispatch the request if the underlying queue is busy.
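
For context, a minimal sketch of the dm-rq side that consumes this feedback
(illustrative only; the function name and success value are simplified and
this is not the exact dm-rq code):

static int map_request_sketch(struct request *clone)
{
	blk_status_t ret;

	/* direct-issue the clone on the underlying queue and get feedback */
	ret = blk_insert_cloned_request(clone->q, clone);
	if (ret == BLK_STS_RESOURCE)
		/* underlying queue is busy: do not destage; keep the
		 * original request in dm-rq's elevator so it can still
		 * be merged with subsequent IO */
		return DM_MAPIO_REQUEUE;

	/* clone was dispatched (or failed); proceed as before */
	return DM_MAPIO_REMAPPED;
}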

Thanks,
Ming

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH] blk mq: don't blk_mq_request_bypass_insert _and_ return BLK_STS_RESOURCE
  2018-01-18  0:54           ` Mike Snitzer
  2018-01-18  1:17             ` Laurence Oberman
@ 2018-01-18  3:33             ` Mike Snitzer
  2018-01-18  3:39               ` Ming Lei
  1 sibling, 1 reply; 23+ messages in thread
From: Mike Snitzer @ 2018-01-18  3:33 UTC (permalink / raw)
  To: Bart Van Assche, Laurence Oberman
  Cc: axboe, hch, dm-devel, linux-block, tom.leiming

On Wed, Jan 17 2018 at  7:54P -0500,
Mike Snitzer <snitzer@redhat.com> wrote:
 
> But sure, I suppose there is something I missed when refactoring Ming's
> change to get it acceptable for upstream.  I went over the mechanical
> nature of what I did many times (comparing Ming's v4 to my v5).

And yes there is one subtlety that I missed.

> The call to blk_mq_request_bypass_insert will only occur via
> __blk_mq_fallback_to_insert.  Which as the name implies this is not the
> fast path.  This will occur if the underlying blk-mq device cannot get
> resources it needs in order to issue the request.  Specifically: if/when
> in __blk_mq_try_issue_directly() the hctx is stopped, or queue is
> quiesced, or it cannot get the driver tag or dispatch_budget (in the
> case of scsi-mq).
> 
> The same fallback, via call to blk_mq_request_bypass_insert, occured
> with Ming's v4 though.

Turns out Ming's v4 doesn't fall back to insert for the "or it cannot get
the driver tag or dispatch_budget" case.

This patch should fix it (Laurence, please report back on whether this fixes
your list_add corruption; pretty sure it will):

From: Mike Snitzer <snitzer@redhat.com>
Date: Wed, 17 Jan 2018 22:02:07 -0500
Subject: [PATCH] blk mq: don't blk_mq_request_bypass_insert _and_ return BLK_STS_RESOURCE

It isn't ever valid to call blk_mq_request_bypass_insert() and return
BLK_STS_RESOURCE.

Unfortunately after commit 396eaf21ee ("blk-mq: improve DM's blk-mq IO
merging via blk_insert_cloned_request feedback") we do just that if
blk_mq_request_direct_issue() cannot get the resources (driver_tag or
dispatch_budget) needed to directly issue a request.  This will lead to
"list_add corruption" because blk-mq submits the IO but then reports
that it didn't (BLK_STS_RESOURCE in this case).

Fix this by simply returning BLK_STS_RESOURCE for this case.
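
To spell out the failure sequence being described (reconstructed from the
call traces earlier in the thread; illustrative only):

/*
 * dm_mq_queue_rq()
 *   map_request()
 *     blk_insert_cloned_request(clone->q, clone)
 *       blk_mq_request_direct_issue(clone)
 *         __blk_mq_try_issue_directly(..., bypass_insert = true)
 *           -> cannot get a driver tag or dispatch budget
 *           -> blk_mq_request_bypass_insert(clone)
 *              (clone is now linked on hctx->dispatch)
 *           -> returns BLK_STS_RESOURCE
 *              (caller is told the clone was NOT issued)
 *     dm-rq sees BLK_STS_RESOURCE and requeues, treating the clone as
 *     never submitted even though it still sits on hctx->dispatch
 *       => list_add corruption / use-after-free
 */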

Fixes: 396eaf21ee ("blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback")
Reported-by: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 block/blk-mq.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index c418858a60ef..8bee37239255 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1799,20 +1799,18 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 	if (q->elevator && !bypass_insert)
 		goto insert;
 
-	if (!blk_mq_get_driver_tag(rq, NULL, false))
-		goto insert;
-
-	if (!blk_mq_get_dispatch_budget(hctx)) {
+	if (!blk_mq_get_driver_tag(rq, NULL, false) ||
+	    !blk_mq_get_dispatch_budget(hctx)) {
+		/* blk_mq_put_driver_tag() is idempotent */
 		blk_mq_put_driver_tag(rq);
+		if (bypass_insert)
+			return BLK_STS_RESOURCE;
 		goto insert;
 	}
 
 	return __blk_mq_issue_directly(hctx, rq, cookie);
 insert:
 	__blk_mq_fallback_to_insert(rq, run_queue, bypass_insert);
-	if (bypass_insert)
-		return BLK_STS_RESOURCE;
-
 	return BLK_STS_OK;
 }
 
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [for-4.16 PATCH v6 2/3] blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback
  2018-01-18  3:25   ` Ming Lei
@ 2018-01-18  3:37     ` Mike Snitzer
  2018-01-18  3:44       ` Ming Lei
  0 siblings, 1 reply; 23+ messages in thread
From: Mike Snitzer @ 2018-01-18  3:37 UTC (permalink / raw)
  To: Ming Lei; +Cc: axboe, Ming Lei, hch, dm-devel, linux-block

On Wed, Jan 17 2018 at 10:25pm -0500,
Ming Lei <ming.lei@redhat.com> wrote:

> Hi Mike,
> 
> On Wed, Jan 17, 2018 at 11:25:57AM -0500, Mike Snitzer wrote:
> > From: Ming Lei <ming.lei@redhat.com>
> > 
> > blk_insert_cloned_request() is called in the fast path of a dm-rq driver
> > (e.g. blk-mq request-based DM mpath).  blk_insert_cloned_request() uses
> > blk_mq_request_bypass_insert() to directly append the request to the
> > blk-mq hctx->dispatch_list of the underlying queue.
> > 
> > 1) This way isn't efficient enough because the hctx spinlock is always
> > used.
> > 
> > 2) With blk_insert_cloned_request(), we completely bypass underlying
> > queue's elevator and depend on the upper-level dm-rq driver's elevator
> > to schedule IO.  But dm-rq currently can't get the underlying queue's
> > dispatch feedback at all.  Without knowing whether a request was issued
> > or not (e.g. due to underlying queue being busy) the dm-rq elevator will
> > not be able to provide effective IO merging (as a side-effect of dm-rq
> > currently blindly destaging a request from its elevator only to requeue
> > it after a delay, which kills any opportunity for merging).  This
> > obviously causes very bad sequential IO performance.
> > 
> > Fix this by updating blk_insert_cloned_request() to use
> > blk_mq_request_direct_issue().  blk_mq_request_direct_issue() allows a
> > request to be issued directly to the underlying queue and returns the
> > dispatch feedback (blk_status_t).  If blk_mq_request_direct_issue()
> > returns BLK_SYS_RESOURCE the dm-rq driver will now use DM_MAPIO_REQUEUE
> > to _not_ destage the request.  Whereby preserving the opportunity to
> > merge IO.
> > 
> > With this, request-based DM's blk-mq sequential IO performance is vastly
> > improved (as much as 3X in mpath/virtio-scsi testing).
> > 
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > [blk-mq.c changes heavily influenced by Ming Lei's initial solution, but
> > they were refactored to make them less fragile and easier to read/review]
> > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
...
> > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > index c117c2baf2c9..f5f0d8456713 100644
> > --- a/block/blk-mq.c
> > +++ b/block/blk-mq.c
> > @@ -1731,15 +1731,19 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
> >  
> >  static void __blk_mq_fallback_to_insert(struct blk_mq_hw_ctx *hctx,
> >  					struct request *rq,
> > -					bool run_queue)
> > +					bool run_queue, bool bypass_insert)
> >  {
> > -	blk_mq_sched_insert_request(rq, false, run_queue, false,
> > -					hctx->flags & BLK_MQ_F_BLOCKING);
> > +	if (!bypass_insert)
> > +		blk_mq_sched_insert_request(rq, false, run_queue, false,
> > +					    hctx->flags & BLK_MQ_F_BLOCKING);
> > +	else
> > +		blk_mq_request_bypass_insert(rq, run_queue);
> >  }
> 
> If 'bypass_insert' is true, we don't need to insert the request into
> hctx->dispatch_list for dm-rq, then it causes the issue(use after free)
> reported by Bart and Laurence.
> 
> Also this way is the exact opposite of the idea of the improvement,
> we do not want to dispatch request if underlying queue is busy.

Yeap, please see the patch I just posted to fix it.

But your v4 does fall back to using blk_mq_request_bypass_insert() as
well, just in a much narrower case -- specifically:
       if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))

Thanks,
Mike

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] blk mq: don't blk_mq_request_bypass_insert _and_ return BLK_STS_RESOURCE
  2018-01-18  3:33             ` [PATCH] blk mq: don't blk_mq_request_bypass_insert _and_ return BLK_STS_RESOURCE Mike Snitzer
@ 2018-01-18  3:39               ` Ming Lei
  2018-01-18  3:49                 ` Mike Snitzer
  0 siblings, 1 reply; 23+ messages in thread
From: Ming Lei @ 2018-01-18  3:39 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Bart Van Assche, Laurence Oberman, axboe, hch, dm-devel,
	linux-block, tom.leiming

On Wed, Jan 17, 2018 at 10:33:35PM -0500, Mike Snitzer wrote:
> On Wed, Jan 17 2018 at  7:54P -0500,
> Mike Snitzer <snitzer@redhat.com> wrote:
>  
> > But sure, I suppose there is something I missed when refactoring Ming's
> > change to get it acceptable for upstream.  I went over the mechanical
> > nature of what I did many times (comparing Ming's v4 to my v5).
> 
> And yes there is one subtlety that I missed.
> 
> > The call to blk_mq_request_bypass_insert will only occur via
> > __blk_mq_fallback_to_insert.  Which as the name implies this is not the
> > fast path.  This will occur if the underlying blk-mq device cannot get
> > resources it needs in order to issue the request.  Specifically: if/when
> > in __blk_mq_try_issue_directly() the hctx is stopped, or queue is
> > quiesced, or it cannot get the driver tag or dispatch_budget (in the
> > case of scsi-mq).
> > 
> > The same fallback, via call to blk_mq_request_bypass_insert, occured
> > with Ming's v4 though.
> 
> Turns out Ming's v4 doesn't fallback to insert for the "or it cannot get
> the driver tag or dispatch_budget" case.
> 
> This patch should fix it (Laurence, please report back on if this fixes
> your list_add corruption, pretty sure it will):
> 
> From: Mike Snitzer <snitzer@redhat.com>
> Date: Wed, 17 Jan 2018 22:02:07 -0500
> Subject: [PATCH] blk mq: don't blk_mq_request_bypass_insert _and_ return BLK_STS_RESOURCE
> 
> It isn't ever valid to call blk_mq_request_bypass_insert() and return
> BLK_STS_RESOURCE.
> 
> Unfortunately after commit 396eaf21ee ("blk-mq: improve DM's blk-mq IO
> merging via blk_insert_cloned_request feedback") we do just that if
> blk_mq_request_direct_issue() cannot get the resources (driver_tag or
> dispatch_budget) needed to directly issue a request.  This will lead to
> "list_add corruption" because blk-mq submits the IO but then reports
> that it didn't (BLK_STS_RESOURCE in this case).
> 
> Fix this by simply returning BLK_STS_RESOURCE for this case.
> 
> Fixes: 396eaf21ee ("blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback")
> Reported-by: Laurence Oberman <loberman@redhat.com>
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  block/blk-mq.c | 12 +++++-------
>  1 file changed, 5 insertions(+), 7 deletions(-)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index c418858a60ef..8bee37239255 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1799,20 +1799,18 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
>  	if (q->elevator && !bypass_insert)
>  		goto insert;
>  
> -	if (!blk_mq_get_driver_tag(rq, NULL, false))
> -		goto insert;
> -
> -	if (!blk_mq_get_dispatch_budget(hctx)) {
> +	if (!blk_mq_get_driver_tag(rq, NULL, false) ||
> +	    !blk_mq_get_dispatch_budget(hctx)) {
> +		/* blk_mq_put_driver_tag() is idempotent */
>  		blk_mq_put_driver_tag(rq);
> +		if (bypass_insert)
> +			return BLK_STS_RESOURCE;
>  		goto insert;
>  	}
>  
>  	return __blk_mq_issue_directly(hctx, rq, cookie);
>  insert:
>  	__blk_mq_fallback_to_insert(rq, run_queue, bypass_insert);
> -	if (bypass_insert)
> -		return BLK_STS_RESOURCE;
> -
>  	return BLK_STS_OK;
>  }

Hi Mike,

I'd suggest using the following one, which is simple and clean:


diff --git a/block/blk-mq.c b/block/blk-mq.c
index 4d4af8d712da..816ff5d6bc88 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1856,15 +1856,6 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
 	return ret;
 }
 
-static void __blk_mq_fallback_to_insert(struct request *rq,
-					bool run_queue, bool bypass_insert)
-{
-	if (!bypass_insert)
-		blk_mq_sched_insert_request(rq, false, run_queue, false);
-	else
-		blk_mq_request_bypass_insert(rq, run_queue);
-}
-
 static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 						struct request *rq,
 						blk_qc_t *cookie,
@@ -1892,10 +1883,10 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 
 	return __blk_mq_issue_directly(hctx, rq, cookie);
 insert:
-	__blk_mq_fallback_to_insert(rq, run_queue, bypass_insert);
 	if (bypass_insert)
 		return BLK_STS_RESOURCE;
 
+	blk_mq_sched_insert_request(rq, false, run_queue, false);
 	return BLK_STS_OK;
 }
 
@@ -1911,7 +1902,7 @@ static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 
 	ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false);
 	if (ret == BLK_STS_RESOURCE)
-		__blk_mq_fallback_to_insert(rq, true, false);
+		blk_mq_sched_insert_request(rq, false, true, false);
 	else if (ret != BLK_STS_OK)
 		blk_mq_end_request(rq, ret);
 


-- 
Ming

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [for-4.16 PATCH v6 2/3] blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback
  2018-01-18  3:37     ` Mike Snitzer
@ 2018-01-18  3:44       ` Ming Lei
  0 siblings, 0 replies; 23+ messages in thread
From: Ming Lei @ 2018-01-18  3:44 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: axboe, Ming Lei, hch, dm-devel, linux-block

On Wed, Jan 17, 2018 at 10:37:23PM -0500, Mike Snitzer wrote:
> On Wed, Jan 17 2018 at 10:25pm -0500,
> Ming Lei <ming.lei@redhat.com> wrote:
> 
> > Hi Mike,
> > 
> > On Wed, Jan 17, 2018 at 11:25:57AM -0500, Mike Snitzer wrote:
> > > From: Ming Lei <ming.lei@redhat.com>
> > > 
> > > blk_insert_cloned_request() is called in the fast path of a dm-rq driver
> > > (e.g. blk-mq request-based DM mpath).  blk_insert_cloned_request() uses
> > > blk_mq_request_bypass_insert() to directly append the request to the
> > > blk-mq hctx->dispatch_list of the underlying queue.
> > > 
> > > 1) This way isn't efficient enough because the hctx spinlock is always
> > > used.
> > > 
> > > 2) With blk_insert_cloned_request(), we completely bypass underlying
> > > queue's elevator and depend on the upper-level dm-rq driver's elevator
> > > to schedule IO.  But dm-rq currently can't get the underlying queue's
> > > dispatch feedback at all.  Without knowing whether a request was issued
> > > or not (e.g. due to underlying queue being busy) the dm-rq elevator will
> > > not be able to provide effective IO merging (as a side-effect of dm-rq
> > > currently blindly destaging a request from its elevator only to requeue
> > > it after a delay, which kills any opportunity for merging).  This
> > > obviously causes very bad sequential IO performance.
> > > 
> > > Fix this by updating blk_insert_cloned_request() to use
> > > blk_mq_request_direct_issue().  blk_mq_request_direct_issue() allows a
> > > request to be issued directly to the underlying queue and returns the
> > > dispatch feedback (blk_status_t).  If blk_mq_request_direct_issue()
> > > returns BLK_SYS_RESOURCE the dm-rq driver will now use DM_MAPIO_REQUEUE
> > > to _not_ destage the request.  Whereby preserving the opportunity to
> > > merge IO.
> > > 
> > > With this, request-based DM's blk-mq sequential IO performance is vastly
> > > improved (as much as 3X in mpath/virtio-scsi testing).
> > > 
> > > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > > [blk-mq.c changes heavily influenced by Ming Lei's initial solution, but
> > > they were refactored to make them less fragile and easier to read/review]
> > > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ...
> > > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > > index c117c2baf2c9..f5f0d8456713 100644
> > > --- a/block/blk-mq.c
> > > +++ b/block/blk-mq.c
> > > @@ -1731,15 +1731,19 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
> > >  
> > >  static void __blk_mq_fallback_to_insert(struct blk_mq_hw_ctx *hctx,
> > >  					struct request *rq,
> > > -					bool run_queue)
> > > +					bool run_queue, bool bypass_insert)
> > >  {
> > > -	blk_mq_sched_insert_request(rq, false, run_queue, false,
> > > -					hctx->flags & BLK_MQ_F_BLOCKING);
> > > +	if (!bypass_insert)
> > > +		blk_mq_sched_insert_request(rq, false, run_queue, false,
> > > +					    hctx->flags & BLK_MQ_F_BLOCKING);
> > > +	else
> > > +		blk_mq_request_bypass_insert(rq, run_queue);
> > >  }
> > 
> > If 'bypass_insert' is true, we don't need to insert the request into
> > hctx->dispatch_list for dm-rq, then it causes the issue(use after free)
> > reported by Bart and Laurence.
> > 
> > Also this way is the exact opposite of the idea of the improvement,
> > we do not want to dispatch request if underlying queue is busy.
> 
> Yeap, please see the patch I just posted to fix it.

Looks like your patch is a bit complicated; __blk_mq_fallback_to_insert()
can be removed instead.

> 
> But your v4 does fallback to using blk_mq_request_bypass_insert() as
> well, just in a much narrower case -- specifically:
>        if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))

Yeah, I just found it; you can add 'bypass_insert = false' under that
condition.
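
i.e., something along these lines in __blk_mq_try_issue_directly() (a
sketch of the suggestion, untested):

	if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) {
		/* for direct issue, fall back to a normal scheduler insert
		 * (and return BLK_STS_OK) instead of reporting
		 * BLK_STS_RESOURCE for a stopped/quiesced queue */
		bypass_insert = false;
		goto insert;
	}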

-- 
Ming

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: blk mq: don't blk_mq_request_bypass_insert _and_ return BLK_STS_RESOURCE
  2018-01-18  3:39               ` Ming Lei
@ 2018-01-18  3:49                 ` Mike Snitzer
  2018-01-18  3:52                   ` Ming Lei
  0 siblings, 1 reply; 23+ messages in thread
From: Mike Snitzer @ 2018-01-18  3:49 UTC (permalink / raw)
  To: Ming Lei
  Cc: Bart Van Assche, Laurence Oberman, axboe, hch, dm-devel,
	linux-block, tom.leiming

On Wed, Jan 17 2018 at 10:39pm -0500,
Ming Lei <ming.lei@redhat.com> wrote:

> On Wed, Jan 17, 2018 at 10:33:35PM -0500, Mike Snitzer wrote:
> > On Wed, Jan 17 2018 at  7:54P -0500,
> > Mike Snitzer <snitzer@redhat.com> wrote:
> >  
> > > But sure, I suppose there is something I missed when refactoring Ming's
> > > change to get it acceptable for upstream.  I went over the mechanical
> > > nature of what I did many times (comparing Ming's v4 to my v5).
> > 
> > And yes there is one subtlety that I missed.
> > 
> > > The call to blk_mq_request_bypass_insert will only occur via
> > > __blk_mq_fallback_to_insert.  Which as the name implies this is not the
> > > fast path.  This will occur if the underlying blk-mq device cannot get
> > > resources it needs in order to issue the request.  Specifically: if/when
> > > in __blk_mq_try_issue_directly() the hctx is stopped, or queue is
> > > quiesced, or it cannot get the driver tag or dispatch_budget (in the
> > > case of scsi-mq).
> > > 
> > > The same fallback, via call to blk_mq_request_bypass_insert, occured
> > > with Ming's v4 though.
> > 
> > Turns out Ming's v4 doesn't fallback to insert for the "or it cannot get
> > the driver tag or dispatch_budget" case.
> > 
> > This patch should fix it (Laurence, please report back on if this fixes
> > your list_add corruption, pretty sure it will):
> > 
> > From: Mike Snitzer <snitzer@redhat.com>
> > Date: Wed, 17 Jan 2018 22:02:07 -0500
> > Subject: [PATCH] blk mq: don't blk_mq_request_bypass_insert _and_ return BLK_STS_RESOURCE
> > 
> > It isn't ever valid to call blk_mq_request_bypass_insert() and return
> > BLK_STS_RESOURCE.
> > 
> > Unfortunately after commit 396eaf21ee ("blk-mq: improve DM's blk-mq IO
> > merging via blk_insert_cloned_request feedback") we do just that if
> > blk_mq_request_direct_issue() cannot get the resources (driver_tag or
> > dispatch_budget) needed to directly issue a request.  This will lead to
> > "list_add corruption" because blk-mq submits the IO but then reports
> > that it didn't (BLK_STS_RESOURCE in this case).
> > 
> > Fix this by simply returning BLK_STS_RESOURCE for this case.
> > 
> > Fixes: 396eaf21ee ("blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback")
> > Reported-by: Laurence Oberman <loberman@redhat.com>
> > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > ---
> >  block/blk-mq.c | 12 +++++-------
> >  1 file changed, 5 insertions(+), 7 deletions(-)
> > 
> > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > index c418858a60ef..8bee37239255 100644
> > --- a/block/blk-mq.c
> > +++ b/block/blk-mq.c
> > @@ -1799,20 +1799,18 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
> >  	if (q->elevator && !bypass_insert)
> >  		goto insert;
> >  
> > -	if (!blk_mq_get_driver_tag(rq, NULL, false))
> > -		goto insert;
> > -
> > -	if (!blk_mq_get_dispatch_budget(hctx)) {
> > +	if (!blk_mq_get_driver_tag(rq, NULL, false) ||
> > +	    !blk_mq_get_dispatch_budget(hctx)) {
> > +		/* blk_mq_put_driver_tag() is idempotent */
> >  		blk_mq_put_driver_tag(rq);
> > +		if (bypass_insert)
> > +			return BLK_STS_RESOURCE;
> >  		goto insert;
> >  	}
> >  
> >  	return __blk_mq_issue_directly(hctx, rq, cookie);
> >  insert:
> >  	__blk_mq_fallback_to_insert(rq, run_queue, bypass_insert);
> > -	if (bypass_insert)
> > -		return BLK_STS_RESOURCE;
> > -
> >  	return BLK_STS_OK;
> >  }
> 
> Hi Mike,
> 
> I'd suggest to use the following one, which is simple and clean:
> 
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 4d4af8d712da..816ff5d6bc88 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1856,15 +1856,6 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
>  	return ret;
>  }
>  
> -static void __blk_mq_fallback_to_insert(struct request *rq,
> -					bool run_queue, bool bypass_insert)
> -{
> -	if (!bypass_insert)
> -		blk_mq_sched_insert_request(rq, false, run_queue, false);
> -	else
> -		blk_mq_request_bypass_insert(rq, run_queue);
> -}
> -
>  static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
>  						struct request *rq,
>  						blk_qc_t *cookie,
> @@ -1892,10 +1883,10 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
>  
>  	return __blk_mq_issue_directly(hctx, rq, cookie);
>  insert:
> -	__blk_mq_fallback_to_insert(rq, run_queue, bypass_insert);
>  	if (bypass_insert)
>  		return BLK_STS_RESOURCE;
>  
> +	blk_mq_sched_insert_request(rq, false, run_queue, false);
>  	return BLK_STS_OK;
>  }
>  
> @@ -1911,7 +1902,7 @@ static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
>  
>  	ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false);
>  	if (ret == BLK_STS_RESOURCE)
> -		__blk_mq_fallback_to_insert(rq, true, false);
> +		blk_mq_sched_insert_request(rq, false, true, false);
>  	else if (ret != BLK_STS_OK)
>  		blk_mq_end_request(rq, ret);
>  

That'd be another way to skin the cat... BUT it is different from your
original approach (due to the case I detailed in my earlier mail).

But I like the simplicity of always returning BLK_STS_RESOURCE.

Reviewed-by: Mike Snitzer <snitzer@redhat.com>

Laurence, please test Ming's patch instead.  Ming, if Laurence finds
this fix works (which it should) please put a formal header on the patch
and submit it for Jens to pick up.  Sorry about screwing this up.

(feel free to re-use portions of my above patch header to help explain
the problem in your header)

Thanks,
Mike

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: blk mq: don't blk_mq_request_bypass_insert _and_ return BLK_STS_RESOURCE
  2018-01-18  3:49                 ` Mike Snitzer
@ 2018-01-18  3:52                   ` Ming Lei
  0 siblings, 0 replies; 23+ messages in thread
From: Ming Lei @ 2018-01-18  3:52 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Bart Van Assche, Laurence Oberman, axboe, hch, dm-devel,
	linux-block, tom.leiming

On Wed, Jan 17, 2018 at 10:49:58PM -0500, Mike Snitzer wrote:
> On Wed, Jan 17 2018 at 10:39pm -0500,
> Ming Lei <ming.lei@redhat.com> wrote:
> 
> > On Wed, Jan 17, 2018 at 10:33:35PM -0500, Mike Snitzer wrote:
> > > On Wed, Jan 17 2018 at  7:54P -0500,
> > > Mike Snitzer <snitzer@redhat.com> wrote:
> > >  
> > > > But sure, I suppose there is something I missed when refactoring Ming's
> > > > change to get it acceptable for upstream.  I went over the mechanical
> > > > nature of what I did many times (comparing Ming's v4 to my v5).
> > > 
> > > And yes there is one subtlety that I missed.
> > > 
> > > > The call to blk_mq_request_bypass_insert will only occur via
> > > > __blk_mq_fallback_to_insert.  Which as the name implies this is not the
> > > > fast path.  This will occur if the underlying blk-mq device cannot get
> > > > resources it needs in order to issue the request.  Specifically: if/when
> > > > in __blk_mq_try_issue_directly() the hctx is stopped, or queue is
> > > > quiesced, or it cannot get the driver tag or dispatch_budget (in the
> > > > case of scsi-mq).
> > > > 
> > > > The same fallback, via call to blk_mq_request_bypass_insert, occured
> > > > with Ming's v4 though.
> > > 
> > > Turns out Ming's v4 doesn't fallback to insert for the "or it cannot get
> > > the driver tag or dispatch_budget" case.
> > > 
> > > This patch should fix it (Laurence, please report back on if this fixes
> > > your list_add corruption, pretty sure it will):
> > > 
> > > From: Mike Snitzer <snitzer@redhat.com>
> > > Date: Wed, 17 Jan 2018 22:02:07 -0500
> > > Subject: [PATCH] blk mq: don't blk_mq_request_bypass_insert _and_ return BLK_STS_RESOURCE
> > > 
> > > It isn't ever valid to call blk_mq_request_bypass_insert() and return
> > > BLK_STS_RESOURCE.
> > > 
> > > Unfortunately after commit 396eaf21ee ("blk-mq: improve DM's blk-mq IO
> > > merging via blk_insert_cloned_request feedback") we do just that if
> > > blk_mq_request_direct_issue() cannot get the resources (driver_tag or
> > > dispatch_budget) needed to directly issue a request.  This will lead to
> > > "list_add corruption" because blk-mq submits the IO but then reports
> > > that it didn't (BLK_STS_RESOURCE in this case).
> > > 
> > > Fix this by simply returning BLK_STS_RESOURCE for this case.
> > > 
> > > Fixes: 396eaf21ee ("blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback")
> > > Reported-by: Laurence Oberman <loberman@redhat.com>
> > > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > > ---
> > >  block/blk-mq.c | 12 +++++-------
> > >  1 file changed, 5 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > > index c418858a60ef..8bee37239255 100644
> > > --- a/block/blk-mq.c
> > > +++ b/block/blk-mq.c
> > > @@ -1799,20 +1799,18 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
> > >  	if (q->elevator && !bypass_insert)
> > >  		goto insert;
> > >  
> > > -	if (!blk_mq_get_driver_tag(rq, NULL, false))
> > > -		goto insert;
> > > -
> > > -	if (!blk_mq_get_dispatch_budget(hctx)) {
> > > +	if (!blk_mq_get_driver_tag(rq, NULL, false) ||
> > > +	    !blk_mq_get_dispatch_budget(hctx)) {
> > > +		/* blk_mq_put_driver_tag() is idempotent */
> > >  		blk_mq_put_driver_tag(rq);
> > > +		if (bypass_insert)
> > > +			return BLK_STS_RESOURCE;
> > >  		goto insert;
> > >  	}
> > >  
> > >  	return __blk_mq_issue_directly(hctx, rq, cookie);
> > >  insert:
> > >  	__blk_mq_fallback_to_insert(rq, run_queue, bypass_insert);
> > > -	if (bypass_insert)
> > > -		return BLK_STS_RESOURCE;
> > > -
> > >  	return BLK_STS_OK;
> > >  }
> > 
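
A side note on the "blk_mq_put_driver_tag() is idempotent" comment in
the hunk above: the combined check can call it whether the driver-tag
or the dispatch-budget acquisition failed, because the helper is a
no-op for a request that holds no driver tag.  A sketch of that
property only (not the literal blk-mq.h implementation):

static inline void put_driver_tag_if_held(struct request *rq)
{
	if (rq->tag == -1)
		return;		/* no driver tag held: nothing to put */

	/*
	 * Otherwise the tag would be handed back to the hardware queue's
	 * tag set and rq->tag reset to -1 -- elided here; only the early
	 * return above matters for the idempotency being relied on.
	 */
}
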
> > Hi Mike,
> > 
> > I'd suggest to use the following one, which is simple and clean:
> > 
> > 
> > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > index 4d4af8d712da..816ff5d6bc88 100644
> > --- a/block/blk-mq.c
> > +++ b/block/blk-mq.c
> > @@ -1856,15 +1856,6 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
> >  	return ret;
> >  }
> >  
> > -static void __blk_mq_fallback_to_insert(struct request *rq,
> > -					bool run_queue, bool bypass_insert)
> > -{
> > -	if (!bypass_insert)
> > -		blk_mq_sched_insert_request(rq, false, run_queue, false);
> > -	else
> > -		blk_mq_request_bypass_insert(rq, run_queue);
> > -}
> > -
> >  static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
> >  						struct request *rq,
> >  						blk_qc_t *cookie,
> > @@ -1892,10 +1883,10 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
> >  
> >  	return __blk_mq_issue_directly(hctx, rq, cookie);
> >  insert:
> > -	__blk_mq_fallback_to_insert(rq, run_queue, bypass_insert);
> >  	if (bypass_insert)
> >  		return BLK_STS_RESOURCE;
> >  
> > +	blk_mq_sched_insert_request(rq, false, run_queue, false);
> >  	return BLK_STS_OK;
> >  }
> >  
> > @@ -1911,7 +1902,7 @@ static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
> >  
> >  	ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false);
> >  	if (ret == BLK_STS_RESOURCE)
> > -		__blk_mq_fallback_to_insert(rq, true, false);
> > +		blk_mq_sched_insert_request(rq, false, true, false);
> >  	else if (ret != BLK_STS_OK)
> >  		blk_mq_end_request(rq, ret);
> >  
> 
> That'd be another way to skin the cat... BUT it is different from your
> original approach (due to the case I detailed in an earlier mail).

Yeah, as I mentioned, it is a bug in this patch.

> 
> But I like the simplicity of always returning BLK_STS_RESOURCE.
> 
> Reviewed-by: Mike Snitzer <snitzer@redhat.com>
> 
> Laurence, please test Ming's patch instead.  Ming, if Laurence finds
> this fix works (which it should), please put a formal header on the patch
> and submit it for Jens to pick up.  Sorry about screwing this up.
> 
> (feel free to re-use portions of my above patch header to help explain
> the problem in your header)

Hi Mike & Laurence,

Please hold on a moment: this one has an issue, and I will post a formal
patch.

Thanks,
Ming

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2018-01-18  3:52 UTC | newest]

Thread overview: 23+ messages
2018-01-17 16:25 [for-4.16 PATCH v6 0/3] blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback Mike Snitzer
2018-01-17 16:25 ` [for-4.16 PATCH v6 1/3] blk-mq: factor out a few helpers from __blk_mq_try_issue_directly Mike Snitzer
2018-01-17 16:25 ` [for-4.16 PATCH v6 2/3] blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback Mike Snitzer
2018-01-18  3:25   ` Ming Lei
2018-01-18  3:37     ` Mike Snitzer
2018-01-18  3:44       ` Ming Lei
2018-01-17 16:25 ` [for-4.16 PATCH v6 3/3] blk-mq-sched: remove unused 'can_block' arg from blk_mq_sched_insert_request Mike Snitzer
2018-01-17 16:50 ` [for-4.16 PATCH v6 0/3] blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback Jens Axboe
2018-01-17 16:58   ` Mike Snitzer
2018-01-17 23:31     ` Bart Van Assche
2018-01-17 23:31       ` Bart Van Assche
2018-01-17 23:43       ` Laurence Oberman
2018-01-17 23:53         ` Bart Van Assche
2018-01-17 23:53           ` Bart Van Assche
2018-01-17 23:57           ` Laurence Oberman
2018-01-18  0:12             ` Laurence Oberman
2018-01-18  0:12               ` Laurence Oberman
2018-01-18  0:54           ` Mike Snitzer
2018-01-18  1:17             ` Laurence Oberman
2018-01-18  3:33             ` [PATCH] blk mq: don't blk_mq_request_bypass_insert _and_ return BLK_STS_RESOURCE Mike Snitzer
2018-01-18  3:39               ` Ming Lei
2018-01-18  3:49                 ` Mike Snitzer
2018-01-18  3:52                   ` Ming Lei
