All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH V4 00/12] blk-mq/scsi: tracking device queue depth via sbitmap
@ 2020-11-16  9:07 Ming Lei
  2020-11-16  9:07 ` [PATCH V4 01/12] sbitmap: remove sbitmap_clear_bit_unlock Ming Lei
                   ` (11 more replies)
  0 siblings, 12 replies; 43+ messages in thread
From: Ming Lei @ 2020-11-16  9:07 UTC (permalink / raw)
  To: Jens Axboe, linux-block, Martin K . Petersen, linux-scsi
  Cc: Ming Lei, Omar Sandoval, Kashyap Desai, Sumanesh Samanta,
	Ewan D . Milne, Hannes Reinecke

Hi,

scsi uses one global atomic variable to track queue depth for each
LUN/request queue. This way can't scale well when there is lots of CPU
cores and the disk is very fast. Broadcom guys has complained that their
high end HBA can't reach top performance because .device_busy is
operated in IO path.

Replace the atomic variable sdev->device_busy with sbitmap for
tracking scsi device queue depth.

Test on scsi_debug shows this way improve IOPS > 20%. Meantime
the IOPS difference is just ~1% compared with bypassing .device_busy
on scsi_debug via patches[1]

The 1st 6 patches moves percpu allocation hint into sbitmap, since
the improvement by doing percpu allocation hint on sbitmap is observable.
Meantime export helpers for SCSI.

Patch 7 and 8 prepares for the conversion by returning budget token
from .get_budget callback, meantime passes the budget token to driver
via 'struct blk_mq_queue_data' in .queue_rq().

The last four patches changes SCSI for switching to track device queue
depth via sbitmap.

The patchset have been tested by Broadcom, and obvious performance boost
can be observed.

https://github.com/ming1/linux/tree/v5.10-block-replace-sdev-device_busy-with-sbitmap

V4:
	- limit max sdev->queue_depth as max(1024, shost->can_queue)
	- simplify code for moving per-cpu allocation hint into sbitmap

V3:
	- rebase on both for-5.10/block and 5.10/scsi-queue.

V2:
	- fix one build failure


Ming Lei (12):
  sbitmap: remove sbitmap_clear_bit_unlock
  sbitmap: maintain allocation round_robin in sbitmap
  sbitmap: add helpers for updating allocation hint
  sbitmap: move allocation hint into sbitmap
  sbitmap: export sbitmap_weight
  sbitmap: add helper of sbitmap_calculate_shift
  blk-mq: add callbacks for storing & retrieving budget token
  blk-mq: return budget token from .get_budget callback
  scsi: put hot fields of scsi_host_template into one cacheline
  scsi: add scsi_device_busy() to read sdev->device_busy
  scsi: make sure sdev->queue_depth is <= max(shost->can_queue, 1024)
  scsi: replace sdev->device_busy with sbitmap

 block/blk-mq-sched.c                 |  17 ++-
 block/blk-mq.c                       |  38 +++--
 block/blk-mq.h                       |  25 +++-
 block/kyber-iosched.c                |   3 +-
 drivers/message/fusion/mptsas.c      |   2 +-
 drivers/scsi/mpt3sas/mpt3sas_scsih.c |   2 +-
 drivers/scsi/scsi.c                  |  13 ++
 drivers/scsi/scsi_lib.c              |  69 ++++++---
 drivers/scsi/scsi_priv.h             |   3 +
 drivers/scsi/scsi_scan.c             |  23 ++-
 drivers/scsi/scsi_sysfs.c            |   4 +-
 drivers/scsi/sg.c                    |   2 +-
 include/linux/blk-mq.h               |  13 +-
 include/linux/sbitmap.h              |  84 +++++++----
 include/scsi/scsi_cmnd.h             |   2 +
 include/scsi/scsi_device.h           |   8 +-
 include/scsi/scsi_host.h             |  72 ++++-----
 lib/sbitmap.c                        | 210 +++++++++++++++------------
 18 files changed, 385 insertions(+), 205 deletions(-)

Cc: Omar Sandoval <osandov@fb.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Cc: Ewan D. Milne <emilne@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
-- 
2.25.4


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH V4 01/12] sbitmap: remove sbitmap_clear_bit_unlock
  2020-11-16  9:07 [PATCH V4 00/12] blk-mq/scsi: tracking device queue depth via sbitmap Ming Lei
@ 2020-11-16  9:07 ` Ming Lei
  2020-11-16  9:07 ` [PATCH V4 02/12] sbitmap: maintain allocation round_robin in sbitmap Ming Lei
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 43+ messages in thread
From: Ming Lei @ 2020-11-16  9:07 UTC (permalink / raw)
  To: Jens Axboe, linux-block, Martin K . Petersen, linux-scsi
  Cc: Ming Lei, Omar Sandoval, Kashyap Desai, Sumanesh Samanta,
	Ewan D . Milne, Hannes Reinecke, Johannes Thumshirn

No one uses this helper any more, so kill it.

Cc: Omar Sandoval <osandov@fb.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Cc: Ewan D. Milne <emilne@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 include/linux/sbitmap.h | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/include/linux/sbitmap.h b/include/linux/sbitmap.h
index e40d019c3d9d..51edc05489cb 100644
--- a/include/linux/sbitmap.h
+++ b/include/linux/sbitmap.h
@@ -320,12 +320,6 @@ static inline void sbitmap_deferred_clear_bit(struct sbitmap *sb, unsigned int b
 	set_bit(SB_NR_TO_BIT(sb, bitnr), addr);
 }
 
-static inline void sbitmap_clear_bit_unlock(struct sbitmap *sb,
-					    unsigned int bitnr)
-{
-	clear_bit_unlock(SB_NR_TO_BIT(sb, bitnr), __sbitmap_word(sb, bitnr));
-}
-
 static inline int sbitmap_test_bit(struct sbitmap *sb, unsigned int bitnr)
 {
 	return test_bit(SB_NR_TO_BIT(sb, bitnr), __sbitmap_word(sb, bitnr));
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH V4 02/12] sbitmap: maintain allocation round_robin in sbitmap
  2020-11-16  9:07 [PATCH V4 00/12] blk-mq/scsi: tracking device queue depth via sbitmap Ming Lei
  2020-11-16  9:07 ` [PATCH V4 01/12] sbitmap: remove sbitmap_clear_bit_unlock Ming Lei
@ 2020-11-16  9:07 ` Ming Lei
  2020-11-16  9:07 ` [PATCH V4 03/12] sbitmap: add helpers for updating allocation hint Ming Lei
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 43+ messages in thread
From: Ming Lei @ 2020-11-16  9:07 UTC (permalink / raw)
  To: Jens Axboe, linux-block, Martin K . Petersen, linux-scsi
  Cc: Ming Lei, Omar Sandoval, Kashyap Desai, Sumanesh Samanta,
	Ewan D . Milne, Hannes Reinecke

Now allocation round_robin info is maintained by sbitmap_queue.

Actually, bit allocation belongs to sbitmap. Also the following
patch will move alloc_hint to sbitmap for users with high depth.

So move round_robin to sbitmap.

Cc: Omar Sandoval <osandov@fb.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Cc: Ewan D. Milne <emilne@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq.c          |  2 +-
 block/kyber-iosched.c   |  3 ++-
 include/linux/sbitmap.h | 20 ++++++++++----------
 lib/sbitmap.c           | 28 ++++++++++++++--------------
 4 files changed, 27 insertions(+), 26 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 55bcee5dc032..33b94ca9d0e9 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2688,7 +2688,7 @@ blk_mq_alloc_hctx(struct request_queue *q, struct blk_mq_tag_set *set,
 		goto free_cpumask;
 
 	if (sbitmap_init_node(&hctx->ctx_map, nr_cpu_ids, ilog2(8),
-				gfp, node))
+				gfp, node, false))
 		goto free_ctxs;
 	hctx->nr_ctx = 0;
 
diff --git a/block/kyber-iosched.c b/block/kyber-iosched.c
index dc89199bc8c6..cc8bcfe1d587 100644
--- a/block/kyber-iosched.c
+++ b/block/kyber-iosched.c
@@ -479,7 +479,8 @@ static int kyber_init_hctx(struct blk_mq_hw_ctx *hctx, unsigned int hctx_idx)
 
 	for (i = 0; i < KYBER_NUM_DOMAINS; i++) {
 		if (sbitmap_init_node(&khd->kcq_map[i], hctx->nr_ctx,
-				      ilog2(8), GFP_KERNEL, hctx->numa_node)) {
+				      ilog2(8), GFP_KERNEL, hctx->numa_node,
+				      false)) {
 			while (--i >= 0)
 				sbitmap_free(&khd->kcq_map[i]);
 			goto err_kcqs;
diff --git a/include/linux/sbitmap.h b/include/linux/sbitmap.h
index 51edc05489cb..68097b052ec3 100644
--- a/include/linux/sbitmap.h
+++ b/include/linux/sbitmap.h
@@ -61,6 +61,11 @@ struct sbitmap {
 	 */
 	unsigned int map_nr;
 
+	/**
+	 * @round_robin: Allocate bits in strict round-robin order.
+	 */
+	bool round_robin;
+
 	/**
 	 * @map: Allocated bitmap.
 	 */
@@ -129,11 +134,6 @@ struct sbitmap_queue {
 	 */
 	atomic_t ws_active;
 
-	/**
-	 * @round_robin: Allocate bits in strict round-robin order.
-	 */
-	bool round_robin;
-
 	/**
 	 * @min_shallow_depth: The minimum shallow depth which may be passed to
 	 * sbitmap_queue_get_shallow() or __sbitmap_queue_get_shallow().
@@ -149,11 +149,14 @@ struct sbitmap_queue {
  *         given, a good default is chosen.
  * @flags: Allocation flags.
  * @node: Memory node to allocate on.
+ * @round_robin: If true, be stricter about allocation order; always allocate
+ *               starting from the last allocated bit. This is less efficient
+ *               than the default behavior (false).
  *
  * Return: Zero on success or negative errno on failure.
  */
 int sbitmap_init_node(struct sbitmap *sb, unsigned int depth, int shift,
-		      gfp_t flags, int node);
+		      gfp_t flags, int node, bool round_robin);
 
 /**
  * sbitmap_free() - Free memory used by a &struct sbitmap.
@@ -179,15 +182,12 @@ void sbitmap_resize(struct sbitmap *sb, unsigned int depth);
  * sbitmap_get() - Try to allocate a free bit from a &struct sbitmap.
  * @sb: Bitmap to allocate from.
  * @alloc_hint: Hint for where to start searching for a free bit.
- * @round_robin: If true, be stricter about allocation order; always allocate
- *               starting from the last allocated bit. This is less efficient
- *               than the default behavior (false).
  *
  * This operation provides acquire barrier semantics if it succeeds.
  *
  * Return: Non-negative allocated bit number if successful, -1 otherwise.
  */
-int sbitmap_get(struct sbitmap *sb, unsigned int alloc_hint, bool round_robin);
+int sbitmap_get(struct sbitmap *sb, unsigned int alloc_hint);
 
 /**
  * sbitmap_get_shallow() - Try to allocate a free bit from a &struct sbitmap,
diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index 267aa7709416..8d920d66d42a 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -42,7 +42,7 @@ static inline bool sbitmap_deferred_clear(struct sbitmap *sb, int index)
 }
 
 int sbitmap_init_node(struct sbitmap *sb, unsigned int depth, int shift,
-		      gfp_t flags, int node)
+		      gfp_t flags, int node, bool round_robin)
 {
 	unsigned int bits_per_word;
 	unsigned int i;
@@ -67,6 +67,7 @@ int sbitmap_init_node(struct sbitmap *sb, unsigned int depth, int shift,
 	sb->shift = shift;
 	sb->depth = depth;
 	sb->map_nr = DIV_ROUND_UP(sb->depth, bits_per_word);
+	sb->round_robin = round_robin;
 
 	if (depth == 0) {
 		sb->map = NULL;
@@ -137,14 +138,14 @@ static int __sbitmap_get_word(unsigned long *word, unsigned long depth,
 }
 
 static int sbitmap_find_bit_in_index(struct sbitmap *sb, int index,
-				     unsigned int alloc_hint, bool round_robin)
+				     unsigned int alloc_hint)
 {
 	int nr;
 
 	do {
 		nr = __sbitmap_get_word(&sb->map[index].word,
 					sb->map[index].depth, alloc_hint,
-					!round_robin);
+					!sb->round_robin);
 		if (nr != -1)
 			break;
 		if (!sbitmap_deferred_clear(sb, index))
@@ -154,7 +155,7 @@ static int sbitmap_find_bit_in_index(struct sbitmap *sb, int index,
 	return nr;
 }
 
-int sbitmap_get(struct sbitmap *sb, unsigned int alloc_hint, bool round_robin)
+int sbitmap_get(struct sbitmap *sb, unsigned int alloc_hint)
 {
 	unsigned int i, index;
 	int nr = -1;
@@ -166,14 +167,13 @@ int sbitmap_get(struct sbitmap *sb, unsigned int alloc_hint, bool round_robin)
 	 * alloc_hint to find the right word index. No point in looping
 	 * twice in find_next_zero_bit() for that case.
 	 */
-	if (round_robin)
+	if (sb->round_robin)
 		alloc_hint = SB_NR_TO_BIT(sb, alloc_hint);
 	else
 		alloc_hint = 0;
 
 	for (i = 0; i < sb->map_nr; i++) {
-		nr = sbitmap_find_bit_in_index(sb, index, alloc_hint,
-						round_robin);
+		nr = sbitmap_find_bit_in_index(sb, index, alloc_hint);
 		if (nr != -1) {
 			nr += index << sb->shift;
 			break;
@@ -358,7 +358,8 @@ int sbitmap_queue_init_node(struct sbitmap_queue *sbq, unsigned int depth,
 	int ret;
 	int i;
 
-	ret = sbitmap_init_node(&sbq->sb, depth, shift, flags, node);
+	ret = sbitmap_init_node(&sbq->sb, depth, shift, flags, node,
+				round_robin);
 	if (ret)
 		return ret;
 
@@ -390,7 +391,6 @@ int sbitmap_queue_init_node(struct sbitmap_queue *sbq, unsigned int depth,
 		atomic_set(&sbq->ws[i].wait_cnt, sbq->wake_batch);
 	}
 
-	sbq->round_robin = round_robin;
 	return 0;
 }
 EXPORT_SYMBOL_GPL(sbitmap_queue_init_node);
@@ -432,12 +432,12 @@ int __sbitmap_queue_get(struct sbitmap_queue *sbq)
 		hint = depth ? prandom_u32() % depth : 0;
 		this_cpu_write(*sbq->alloc_hint, hint);
 	}
-	nr = sbitmap_get(&sbq->sb, hint, sbq->round_robin);
+	nr = sbitmap_get(&sbq->sb, hint);
 
 	if (nr == -1) {
 		/* If the map is full, a hint won't do us much good. */
 		this_cpu_write(*sbq->alloc_hint, 0);
-	} else if (nr == hint || unlikely(sbq->round_robin)) {
+	} else if (nr == hint || unlikely(sbq->sb.round_robin)) {
 		/* Only update the hint if we used it. */
 		hint = nr + 1;
 		if (hint >= depth - 1)
@@ -468,7 +468,7 @@ int __sbitmap_queue_get_shallow(struct sbitmap_queue *sbq,
 	if (nr == -1) {
 		/* If the map is full, a hint won't do us much good. */
 		this_cpu_write(*sbq->alloc_hint, 0);
-	} else if (nr == hint || unlikely(sbq->round_robin)) {
+	} else if (nr == hint || unlikely(sbq->sb.round_robin)) {
 		/* Only update the hint if we used it. */
 		hint = nr + 1;
 		if (hint >= depth - 1)
@@ -584,7 +584,7 @@ void sbitmap_queue_clear(struct sbitmap_queue *sbq, unsigned int nr,
 	smp_mb__after_atomic();
 	sbitmap_queue_wake_up(sbq);
 
-	if (likely(!sbq->round_robin && nr < sbq->sb.depth))
+	if (likely(!sbq->sb.round_robin && nr < sbq->sb.depth))
 		*per_cpu_ptr(sbq->alloc_hint, cpu) = nr;
 }
 EXPORT_SYMBOL_GPL(sbitmap_queue_clear);
@@ -641,7 +641,7 @@ void sbitmap_queue_show(struct sbitmap_queue *sbq, struct seq_file *m)
 	}
 	seq_puts(m, "}\n");
 
-	seq_printf(m, "round_robin=%d\n", sbq->round_robin);
+	seq_printf(m, "round_robin=%d\n", sbq->sb.round_robin);
 	seq_printf(m, "min_shallow_depth=%u\n", sbq->min_shallow_depth);
 }
 EXPORT_SYMBOL_GPL(sbitmap_queue_show);
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH V4 03/12] sbitmap: add helpers for updating allocation hint
  2020-11-16  9:07 [PATCH V4 00/12] blk-mq/scsi: tracking device queue depth via sbitmap Ming Lei
  2020-11-16  9:07 ` [PATCH V4 01/12] sbitmap: remove sbitmap_clear_bit_unlock Ming Lei
  2020-11-16  9:07 ` [PATCH V4 02/12] sbitmap: maintain allocation round_robin in sbitmap Ming Lei
@ 2020-11-16  9:07 ` Ming Lei
  2020-11-16  9:07 ` [PATCH V4 04/12] sbitmap: move allocation hint into sbitmap Ming Lei
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 43+ messages in thread
From: Ming Lei @ 2020-11-16  9:07 UTC (permalink / raw)
  To: Jens Axboe, linux-block, Martin K . Petersen, linux-scsi
  Cc: Ming Lei, Omar Sandoval, Kashyap Desai, Sumanesh Samanta,
	Ewan D . Milne, Hannes Reinecke

Add helpers for updating allocation hint, so that we can
avoid to duplicate code.

Prepare for moving allocation hint into sbitmap.

Cc: Omar Sandoval <osandov@fb.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Cc: Ewan D. Milne <emilne@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 lib/sbitmap.c | 93 ++++++++++++++++++++++++++++++---------------------
 1 file changed, 54 insertions(+), 39 deletions(-)

diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index 8d920d66d42a..4e4423414f4d 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -9,6 +9,55 @@
 #include <linux/sbitmap.h>
 #include <linux/seq_file.h>
 
+static int init_alloc_hint(struct sbitmap_queue *sbq, gfp_t flags)
+{
+	unsigned depth = sbq->sb.depth;
+
+	sbq->alloc_hint = alloc_percpu_gfp(unsigned int, flags);
+	if (!sbq->alloc_hint)
+		return -ENOMEM;
+
+	if (depth && !sbq->sb.round_robin) {
+		int i;
+
+		for_each_possible_cpu(i)
+			*per_cpu_ptr(sbq->alloc_hint, i) = prandom_u32() % depth;
+	}
+
+	return 0;
+}
+
+static inline unsigned update_alloc_hint_before_get(struct sbitmap_queue *sbq,
+						    unsigned int depth)
+{
+	unsigned hint;
+
+	hint = this_cpu_read(*sbq->alloc_hint);
+	if (unlikely(hint >= depth)) {
+		hint = depth ? prandom_u32() % depth : 0;
+		this_cpu_write(*sbq->alloc_hint, hint);
+	}
+
+	return hint;
+}
+
+static inline void update_alloc_hint_after_get(struct sbitmap_queue *sbq,
+					       unsigned int depth,
+					       unsigned int hint,
+					       unsigned int nr)
+{
+	if (nr == -1) {
+		/* If the map is full, a hint won't do us much good. */
+		this_cpu_write(*sbq->alloc_hint, 0);
+	} else if (nr == hint || unlikely(sbq->sb.round_robin)) {
+		/* Only update the hint if we used it. */
+		hint = nr + 1;
+		if (hint >= depth - 1)
+			hint = 0;
+		this_cpu_write(*sbq->alloc_hint, hint);
+	}
+}
+
 /*
  * See if we have deferred clears that we can batch move
  */
@@ -363,17 +412,11 @@ int sbitmap_queue_init_node(struct sbitmap_queue *sbq, unsigned int depth,
 	if (ret)
 		return ret;
 
-	sbq->alloc_hint = alloc_percpu_gfp(unsigned int, flags);
-	if (!sbq->alloc_hint) {
+	if (init_alloc_hint(sbq, flags) != 0) {
 		sbitmap_free(&sbq->sb);
 		return -ENOMEM;
 	}
 
-	if (depth && !round_robin) {
-		for_each_possible_cpu(i)
-			*per_cpu_ptr(sbq->alloc_hint, i) = prandom_u32() % depth;
-	}
-
 	sbq->min_shallow_depth = UINT_MAX;
 	sbq->wake_batch = sbq_calc_wake_batch(sbq, depth);
 	atomic_set(&sbq->wake_index, 0);
@@ -426,24 +469,10 @@ int __sbitmap_queue_get(struct sbitmap_queue *sbq)
 	unsigned int hint, depth;
 	int nr;
 
-	hint = this_cpu_read(*sbq->alloc_hint);
 	depth = READ_ONCE(sbq->sb.depth);
-	if (unlikely(hint >= depth)) {
-		hint = depth ? prandom_u32() % depth : 0;
-		this_cpu_write(*sbq->alloc_hint, hint);
-	}
+	hint = update_alloc_hint_before_get(sbq, depth);
 	nr = sbitmap_get(&sbq->sb, hint);
-
-	if (nr == -1) {
-		/* If the map is full, a hint won't do us much good. */
-		this_cpu_write(*sbq->alloc_hint, 0);
-	} else if (nr == hint || unlikely(sbq->sb.round_robin)) {
-		/* Only update the hint if we used it. */
-		hint = nr + 1;
-		if (hint >= depth - 1)
-			hint = 0;
-		this_cpu_write(*sbq->alloc_hint, hint);
-	}
+	update_alloc_hint_after_get(sbq, depth, hint, nr);
 
 	return nr;
 }
@@ -457,24 +486,10 @@ int __sbitmap_queue_get_shallow(struct sbitmap_queue *sbq,
 
 	WARN_ON_ONCE(shallow_depth < sbq->min_shallow_depth);
 
-	hint = this_cpu_read(*sbq->alloc_hint);
 	depth = READ_ONCE(sbq->sb.depth);
-	if (unlikely(hint >= depth)) {
-		hint = depth ? prandom_u32() % depth : 0;
-		this_cpu_write(*sbq->alloc_hint, hint);
-	}
+	hint = update_alloc_hint_before_get(sbq, depth);
 	nr = sbitmap_get_shallow(&sbq->sb, hint, shallow_depth);
-
-	if (nr == -1) {
-		/* If the map is full, a hint won't do us much good. */
-		this_cpu_write(*sbq->alloc_hint, 0);
-	} else if (nr == hint || unlikely(sbq->sb.round_robin)) {
-		/* Only update the hint if we used it. */
-		hint = nr + 1;
-		if (hint >= depth - 1)
-			hint = 0;
-		this_cpu_write(*sbq->alloc_hint, hint);
-	}
+	update_alloc_hint_after_get(sbq, depth, hint, nr);
 
 	return nr;
 }
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH V4 04/12] sbitmap: move allocation hint into sbitmap
  2020-11-16  9:07 [PATCH V4 00/12] blk-mq/scsi: tracking device queue depth via sbitmap Ming Lei
                   ` (2 preceding siblings ...)
  2020-11-16  9:07 ` [PATCH V4 03/12] sbitmap: add helpers for updating allocation hint Ming Lei
@ 2020-11-16  9:07 ` Ming Lei
  2020-11-16  9:33   ` Hannes Reinecke
  2020-11-16  9:07 ` [PATCH V4 05/12] sbitmap: export sbitmap_weight Ming Lei
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 43+ messages in thread
From: Ming Lei @ 2020-11-16  9:07 UTC (permalink / raw)
  To: Jens Axboe, linux-block, Martin K . Petersen, linux-scsi
  Cc: Ming Lei, Omar Sandoval, Kashyap Desai, Sumanesh Samanta,
	Ewan D . Milne, Hannes Reinecke

Allocation hint should have belonged to sbitmap, also when sbitmap's
depth is high and no need to use mulitple wakeup queues, user can
benefit from percpu allocation hint too.

So move allocation hint into sbitmap, then scsi device queue can benefit
from allocation hint when converting to plain sbitmap in the following
patches.

Cc: Omar Sandoval <osandov@fb.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Cc: Ewan D. Milne <emilne@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq.c          |   2 +-
 block/kyber-iosched.c   |   2 +-
 include/linux/sbitmap.h |  41 +++++++++------
 lib/sbitmap.c           | 112 +++++++++++++++++++++++-----------------
 4 files changed, 94 insertions(+), 63 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 33b94ca9d0e9..ae023cf90c30 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2688,7 +2688,7 @@ blk_mq_alloc_hctx(struct request_queue *q, struct blk_mq_tag_set *set,
 		goto free_cpumask;
 
 	if (sbitmap_init_node(&hctx->ctx_map, nr_cpu_ids, ilog2(8),
-				gfp, node, false))
+				gfp, node, false, false))
 		goto free_ctxs;
 	hctx->nr_ctx = 0;
 
diff --git a/block/kyber-iosched.c b/block/kyber-iosched.c
index cc8bcfe1d587..3949d68ac4c1 100644
--- a/block/kyber-iosched.c
+++ b/block/kyber-iosched.c
@@ -480,7 +480,7 @@ static int kyber_init_hctx(struct blk_mq_hw_ctx *hctx, unsigned int hctx_idx)
 	for (i = 0; i < KYBER_NUM_DOMAINS; i++) {
 		if (sbitmap_init_node(&khd->kcq_map[i], hctx->nr_ctx,
 				      ilog2(8), GFP_KERNEL, hctx->numa_node,
-				      false)) {
+				      false, false)) {
 			while (--i >= 0)
 				sbitmap_free(&khd->kcq_map[i]);
 			goto err_kcqs;
diff --git a/include/linux/sbitmap.h b/include/linux/sbitmap.h
index 68097b052ec3..103b41c03311 100644
--- a/include/linux/sbitmap.h
+++ b/include/linux/sbitmap.h
@@ -70,6 +70,14 @@ struct sbitmap {
 	 * @map: Allocated bitmap.
 	 */
 	struct sbitmap_word *map;
+
+	/*
+	 * @alloc_hint: Cache of last successfully allocated or freed bit.
+	 *
+	 * This is per-cpu, which allows multiple users to stick to different
+	 * cachelines until the map is exhausted.
+	 */
+	unsigned int __percpu *alloc_hint;
 };
 
 #define SBQ_WAIT_QUEUES 8
@@ -105,14 +113,6 @@ struct sbitmap_queue {
 	 */
 	struct sbitmap sb;
 
-	/*
-	 * @alloc_hint: Cache of last successfully allocated or freed bit.
-	 *
-	 * This is per-cpu, which allows multiple users to stick to different
-	 * cachelines until the map is exhausted.
-	 */
-	unsigned int __percpu *alloc_hint;
-
 	/**
 	 * @wake_batch: Number of bits which must be freed before we wake up any
 	 * waiters.
@@ -152,11 +152,13 @@ struct sbitmap_queue {
  * @round_robin: If true, be stricter about allocation order; always allocate
  *               starting from the last allocated bit. This is less efficient
  *               than the default behavior (false).
+ * @alloc_hint: If true, apply percpu hint for where to start searching for
+ * 		a free bit.
  *
  * Return: Zero on success or negative errno on failure.
  */
 int sbitmap_init_node(struct sbitmap *sb, unsigned int depth, int shift,
-		      gfp_t flags, int node, bool round_robin);
+		      gfp_t flags, int node, bool round_robin, bool alloc_hint);
 
 /**
  * sbitmap_free() - Free memory used by a &struct sbitmap.
@@ -164,6 +166,7 @@ int sbitmap_init_node(struct sbitmap *sb, unsigned int depth, int shift,
  */
 static inline void sbitmap_free(struct sbitmap *sb)
 {
+	free_percpu(sb->alloc_hint);
 	kfree(sb->map);
 	sb->map = NULL;
 }
@@ -181,19 +184,17 @@ void sbitmap_resize(struct sbitmap *sb, unsigned int depth);
 /**
  * sbitmap_get() - Try to allocate a free bit from a &struct sbitmap.
  * @sb: Bitmap to allocate from.
- * @alloc_hint: Hint for where to start searching for a free bit.
  *
  * This operation provides acquire barrier semantics if it succeeds.
  *
  * Return: Non-negative allocated bit number if successful, -1 otherwise.
  */
-int sbitmap_get(struct sbitmap *sb, unsigned int alloc_hint);
+int sbitmap_get(struct sbitmap *sb);
 
 /**
  * sbitmap_get_shallow() - Try to allocate a free bit from a &struct sbitmap,
  * limiting the depth used from each word.
  * @sb: Bitmap to allocate from.
- * @alloc_hint: Hint for where to start searching for a free bit.
  * @shallow_depth: The maximum number of bits to allocate from a single word.
  *
  * This rather specific operation allows for having multiple users with
@@ -205,8 +206,7 @@ int sbitmap_get(struct sbitmap *sb, unsigned int alloc_hint);
  *
  * Return: Non-negative allocated bit number if successful, -1 otherwise.
  */
-int sbitmap_get_shallow(struct sbitmap *sb, unsigned int alloc_hint,
-			unsigned long shallow_depth);
+int sbitmap_get_shallow(struct sbitmap *sb, unsigned long shallow_depth);
 
 /**
  * sbitmap_any_bit_set() - Check for a set bit in a &struct sbitmap.
@@ -320,6 +320,18 @@ static inline void sbitmap_deferred_clear_bit(struct sbitmap *sb, unsigned int b
 	set_bit(SB_NR_TO_BIT(sb, bitnr), addr);
 }
 
+/*
+ * Pair of sbitmap_get, and this one applies both cleared bit and
+ * allocation hint.
+ */
+static inline void sbitmap_put(struct sbitmap *sb, unsigned int bitnr)
+{
+	sbitmap_deferred_clear_bit(sb, bitnr);
+
+	if (likely(sb->alloc_hint && !sb->round_robin && bitnr < sb->depth))
+                *this_cpu_ptr(sb->alloc_hint) = bitnr;
+}
+
 static inline int sbitmap_test_bit(struct sbitmap *sb, unsigned int bitnr)
 {
 	return test_bit(SB_NR_TO_BIT(sb, bitnr), __sbitmap_word(sb, bitnr));
@@ -368,7 +380,6 @@ int sbitmap_queue_init_node(struct sbitmap_queue *sbq, unsigned int depth,
 static inline void sbitmap_queue_free(struct sbitmap_queue *sbq)
 {
 	kfree(sbq->ws);
-	free_percpu(sbq->alloc_hint);
 	sbitmap_free(&sbq->sb);
 }
 
diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index 4e4423414f4d..dcd6a89b4d2f 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -9,52 +9,51 @@
 #include <linux/sbitmap.h>
 #include <linux/seq_file.h>
 
-static int init_alloc_hint(struct sbitmap_queue *sbq, gfp_t flags)
+static int init_alloc_hint(struct sbitmap *sb, gfp_t flags)
 {
-	unsigned depth = sbq->sb.depth;
+	unsigned depth = sb->depth;
 
-	sbq->alloc_hint = alloc_percpu_gfp(unsigned int, flags);
-	if (!sbq->alloc_hint)
+	sb->alloc_hint = alloc_percpu_gfp(unsigned int, flags);
+	if (!sb->alloc_hint)
 		return -ENOMEM;
 
-	if (depth && !sbq->sb.round_robin) {
+	if (depth && !sb->round_robin) {
 		int i;
 
 		for_each_possible_cpu(i)
-			*per_cpu_ptr(sbq->alloc_hint, i) = prandom_u32() % depth;
+			*per_cpu_ptr(sb->alloc_hint, i) = prandom_u32() % depth;
 	}
-
 	return 0;
 }
 
-static inline unsigned update_alloc_hint_before_get(struct sbitmap_queue *sbq,
+static inline unsigned update_alloc_hint_before_get(struct sbitmap *sb,
 						    unsigned int depth)
 {
 	unsigned hint;
 
-	hint = this_cpu_read(*sbq->alloc_hint);
+	hint = this_cpu_read(*sb->alloc_hint);
 	if (unlikely(hint >= depth)) {
 		hint = depth ? prandom_u32() % depth : 0;
-		this_cpu_write(*sbq->alloc_hint, hint);
+		this_cpu_write(*sb->alloc_hint, hint);
 	}
 
 	return hint;
 }
 
-static inline void update_alloc_hint_after_get(struct sbitmap_queue *sbq,
+static inline void update_alloc_hint_after_get(struct sbitmap *sb,
 					       unsigned int depth,
 					       unsigned int hint,
 					       unsigned int nr)
 {
 	if (nr == -1) {
 		/* If the map is full, a hint won't do us much good. */
-		this_cpu_write(*sbq->alloc_hint, 0);
-	} else if (nr == hint || unlikely(sbq->sb.round_robin)) {
+		this_cpu_write(*sb->alloc_hint, 0);
+	} else if (nr == hint || unlikely(sb->round_robin)) {
 		/* Only update the hint if we used it. */
 		hint = nr + 1;
 		if (hint >= depth - 1)
 			hint = 0;
-		this_cpu_write(*sbq->alloc_hint, hint);
+		this_cpu_write(*sb->alloc_hint, hint);
 	}
 }
 
@@ -91,7 +90,8 @@ static inline bool sbitmap_deferred_clear(struct sbitmap *sb, int index)
 }
 
 int sbitmap_init_node(struct sbitmap *sb, unsigned int depth, int shift,
-		      gfp_t flags, int node, bool round_robin)
+		      gfp_t flags, int node, bool round_robin,
+		      bool alloc_hint)
 {
 	unsigned int bits_per_word;
 	unsigned int i;
@@ -123,9 +123,18 @@ int sbitmap_init_node(struct sbitmap *sb, unsigned int depth, int shift,
 		return 0;
 	}
 
+	if (alloc_hint) {
+		if (init_alloc_hint(sb, flags))
+			return -ENOMEM;
+	} else {
+		sb->alloc_hint = NULL;
+	}
+
 	sb->map = kcalloc_node(sb->map_nr, sizeof(*sb->map), flags, node);
-	if (!sb->map)
+	if (!sb->map) {
+		free_percpu(sb->alloc_hint);
 		return -ENOMEM;
+	}
 
 	for (i = 0; i < sb->map_nr; i++) {
 		sb->map[i].depth = min(depth, bits_per_word);
@@ -204,7 +213,7 @@ static int sbitmap_find_bit_in_index(struct sbitmap *sb, int index,
 	return nr;
 }
 
-int sbitmap_get(struct sbitmap *sb, unsigned int alloc_hint)
+static int __sbitmap_get(struct sbitmap *sb, unsigned int alloc_hint)
 {
 	unsigned int i, index;
 	int nr = -1;
@@ -236,10 +245,27 @@ int sbitmap_get(struct sbitmap *sb, unsigned int alloc_hint)
 
 	return nr;
 }
+
+int sbitmap_get(struct sbitmap *sb)
+{
+	int nr;
+	unsigned int hint, depth;
+
+	if (WARN_ON_ONCE(unlikely(!sb->alloc_hint)))
+		return -1;
+
+	depth = READ_ONCE(sb->depth);
+	hint = update_alloc_hint_before_get(sb, depth);
+	nr = __sbitmap_get(sb, hint);
+	update_alloc_hint_after_get(sb, depth, hint, nr);
+
+	return nr;
+}
 EXPORT_SYMBOL_GPL(sbitmap_get);
 
-int sbitmap_get_shallow(struct sbitmap *sb, unsigned int alloc_hint,
-			unsigned long shallow_depth)
+static int __sbitmap_get_shallow(struct sbitmap *sb,
+				 unsigned int alloc_hint,
+				 unsigned long shallow_depth)
 {
 	unsigned int i, index;
 	int nr = -1;
@@ -271,6 +297,22 @@ int sbitmap_get_shallow(struct sbitmap *sb, unsigned int alloc_hint,
 
 	return nr;
 }
+
+int sbitmap_get_shallow(struct sbitmap *sb, unsigned long shallow_depth)
+{
+	int nr;
+	unsigned int hint, depth;
+
+	if (WARN_ON_ONCE(unlikely(!sb->alloc_hint)))
+		return -1;
+
+	depth = READ_ONCE(sb->depth);
+	hint = update_alloc_hint_before_get(sb, depth);
+	nr = __sbitmap_get_shallow(sb, hint, shallow_depth);
+	update_alloc_hint_after_get(sb, depth, hint, nr);
+
+	return nr;
+}
 EXPORT_SYMBOL_GPL(sbitmap_get_shallow);
 
 bool sbitmap_any_bit_set(const struct sbitmap *sb)
@@ -408,15 +450,10 @@ int sbitmap_queue_init_node(struct sbitmap_queue *sbq, unsigned int depth,
 	int i;
 
 	ret = sbitmap_init_node(&sbq->sb, depth, shift, flags, node,
-				round_robin);
+				round_robin, true);
 	if (ret)
 		return ret;
 
-	if (init_alloc_hint(sbq, flags) != 0) {
-		sbitmap_free(&sbq->sb);
-		return -ENOMEM;
-	}
-
 	sbq->min_shallow_depth = UINT_MAX;
 	sbq->wake_batch = sbq_calc_wake_batch(sbq, depth);
 	atomic_set(&sbq->wake_index, 0);
@@ -424,7 +461,6 @@ int sbitmap_queue_init_node(struct sbitmap_queue *sbq, unsigned int depth,
 
 	sbq->ws = kzalloc_node(SBQ_WAIT_QUEUES * sizeof(*sbq->ws), flags, node);
 	if (!sbq->ws) {
-		free_percpu(sbq->alloc_hint);
 		sbitmap_free(&sbq->sb);
 		return -ENOMEM;
 	}
@@ -466,32 +502,16 @@ EXPORT_SYMBOL_GPL(sbitmap_queue_resize);
 
 int __sbitmap_queue_get(struct sbitmap_queue *sbq)
 {
-	unsigned int hint, depth;
-	int nr;
-
-	depth = READ_ONCE(sbq->sb.depth);
-	hint = update_alloc_hint_before_get(sbq, depth);
-	nr = sbitmap_get(&sbq->sb, hint);
-	update_alloc_hint_after_get(sbq, depth, hint, nr);
-
-	return nr;
+	return sbitmap_get(&sbq->sb);
 }
 EXPORT_SYMBOL_GPL(__sbitmap_queue_get);
 
 int __sbitmap_queue_get_shallow(struct sbitmap_queue *sbq,
 				unsigned int shallow_depth)
 {
-	unsigned int hint, depth;
-	int nr;
-
 	WARN_ON_ONCE(shallow_depth < sbq->min_shallow_depth);
 
-	depth = READ_ONCE(sbq->sb.depth);
-	hint = update_alloc_hint_before_get(sbq, depth);
-	nr = sbitmap_get_shallow(&sbq->sb, hint, shallow_depth);
-	update_alloc_hint_after_get(sbq, depth, hint, nr);
-
-	return nr;
+	return sbitmap_get_shallow(&sbq->sb, shallow_depth);
 }
 EXPORT_SYMBOL_GPL(__sbitmap_queue_get_shallow);
 
@@ -600,7 +620,7 @@ void sbitmap_queue_clear(struct sbitmap_queue *sbq, unsigned int nr,
 	sbitmap_queue_wake_up(sbq);
 
 	if (likely(!sbq->sb.round_robin && nr < sbq->sb.depth))
-		*per_cpu_ptr(sbq->alloc_hint, cpu) = nr;
+		*per_cpu_ptr(sbq->sb.alloc_hint, cpu) = nr;
 }
 EXPORT_SYMBOL_GPL(sbitmap_queue_clear);
 
@@ -638,7 +658,7 @@ void sbitmap_queue_show(struct sbitmap_queue *sbq, struct seq_file *m)
 		if (!first)
 			seq_puts(m, ", ");
 		first = false;
-		seq_printf(m, "%u", *per_cpu_ptr(sbq->alloc_hint, i));
+		seq_printf(m, "%u", *per_cpu_ptr(sbq->sb.alloc_hint, i));
 	}
 	seq_puts(m, "}\n");
 
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH V4 05/12] sbitmap: export sbitmap_weight
  2020-11-16  9:07 [PATCH V4 00/12] blk-mq/scsi: tracking device queue depth via sbitmap Ming Lei
                   ` (3 preceding siblings ...)
  2020-11-16  9:07 ` [PATCH V4 04/12] sbitmap: move allocation hint into sbitmap Ming Lei
@ 2020-11-16  9:07 ` Ming Lei
  2020-11-16  9:38   ` Hannes Reinecke
  2020-11-17  9:18   ` John Garry
  2020-11-16  9:07 ` [PATCH V4 06/12] sbitmap: add helper of sbitmap_calculate_shift Ming Lei
                   ` (6 subsequent siblings)
  11 siblings, 2 replies; 43+ messages in thread
From: Ming Lei @ 2020-11-16  9:07 UTC (permalink / raw)
  To: Jens Axboe, linux-block, Martin K . Petersen, linux-scsi
  Cc: Ming Lei, Omar Sandoval, Kashyap Desai, Sumanesh Samanta,
	Ewan D . Milne, Hannes Reinecke

SCSI's .device_busy will be converted to sbitmap, and sbitmap_weight
is needed, so export the helper.

Cc: Omar Sandoval <osandov@fb.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Cc: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 include/linux/sbitmap.h |  9 +++++++++
 lib/sbitmap.c           | 11 ++++++-----
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/include/linux/sbitmap.h b/include/linux/sbitmap.h
index 103b41c03311..34343ce3ef6c 100644
--- a/include/linux/sbitmap.h
+++ b/include/linux/sbitmap.h
@@ -346,6 +346,15 @@ static inline int sbitmap_test_bit(struct sbitmap *sb, unsigned int bitnr)
  */
 void sbitmap_show(struct sbitmap *sb, struct seq_file *m);
 
+
+/**
+ * sbitmap_weight() - Return how many real bits set in a &struct sbitmap.
+ * @sb: Bitmap to check.
+ *
+ * Return: How many real bits set
+ */
+unsigned int sbitmap_weight(const struct sbitmap *sb);
+
 /**
  * sbitmap_bitmap_show() - Write a hex dump of a &struct sbitmap to a &struct
  * seq_file.
diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index dcd6a89b4d2f..fb1d3c2f70a2 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -342,20 +342,21 @@ static unsigned int __sbitmap_weight(const struct sbitmap *sb, bool set)
 	return weight;
 }
 
-static unsigned int sbitmap_weight(const struct sbitmap *sb)
+static unsigned int sbitmap_cleared(const struct sbitmap *sb)
 {
-	return __sbitmap_weight(sb, true);
+	return __sbitmap_weight(sb, false);
 }
 
-static unsigned int sbitmap_cleared(const struct sbitmap *sb)
+unsigned int sbitmap_weight(const struct sbitmap *sb)
 {
-	return __sbitmap_weight(sb, false);
+	return __sbitmap_weight(sb, true) - sbitmap_cleared(sb);
 }
+EXPORT_SYMBOL_GPL(sbitmap_weight);
 
 void sbitmap_show(struct sbitmap *sb, struct seq_file *m)
 {
 	seq_printf(m, "depth=%u\n", sb->depth);
-	seq_printf(m, "busy=%u\n", sbitmap_weight(sb) - sbitmap_cleared(sb));
+	seq_printf(m, "busy=%u\n", sbitmap_weight(sb));
 	seq_printf(m, "cleared=%u\n", sbitmap_cleared(sb));
 	seq_printf(m, "bits_per_word=%u\n", 1U << sb->shift);
 	seq_printf(m, "map_nr=%u\n", sb->map_nr);
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH V4 06/12] sbitmap: add helper of sbitmap_calculate_shift
  2020-11-16  9:07 [PATCH V4 00/12] blk-mq/scsi: tracking device queue depth via sbitmap Ming Lei
                   ` (4 preceding siblings ...)
  2020-11-16  9:07 ` [PATCH V4 05/12] sbitmap: export sbitmap_weight Ming Lei
@ 2020-11-16  9:07 ` Ming Lei
  2020-11-16  9:07 ` [PATCH V4 07/12] blk-mq: add callbacks for storing & retrieving budget token Ming Lei
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 43+ messages in thread
From: Ming Lei @ 2020-11-16  9:07 UTC (permalink / raw)
  To: Jens Axboe, linux-block, Martin K . Petersen, linux-scsi
  Cc: Ming Lei, Omar Sandoval, Kashyap Desai, Sumanesh Samanta,
	Ewan D . Milne, Hannes Reinecke

Move code for calculating default shift into one public helper,
which can be used for SCSI to calculate shift.

Cc: Omar Sandoval <osandov@fb.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Cc: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 include/linux/sbitmap.h | 18 ++++++++++++++++++
 lib/sbitmap.c           | 16 +++-------------
 2 files changed, 21 insertions(+), 13 deletions(-)

diff --git a/include/linux/sbitmap.h b/include/linux/sbitmap.h
index 34343ce3ef6c..2a40364d6d00 100644
--- a/include/linux/sbitmap.h
+++ b/include/linux/sbitmap.h
@@ -337,6 +337,24 @@ static inline int sbitmap_test_bit(struct sbitmap *sb, unsigned int bitnr)
 	return test_bit(SB_NR_TO_BIT(sb, bitnr), __sbitmap_word(sb, bitnr));
 }
 
+static inline int sbitmap_calculate_shift(unsigned int depth)
+{
+	int	shift = ilog2(BITS_PER_LONG);
+
+	/*
+	 * If the bitmap is small, shrink the number of bits per word so
+	 * we spread over a few cachelines, at least. If less than 4
+	 * bits, just forget about it, it's not going to work optimally
+	 * anyway.
+	 */
+	if (depth >= 4) {
+		while ((4U << shift) > depth)
+			shift--;
+	}
+
+	return shift;
+}
+
 /**
  * sbitmap_show() - Dump &struct sbitmap information to a &struct seq_file.
  * @sb: Bitmap to show.
diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index fb1d3c2f70a2..c5a58cf7731b 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -96,19 +96,9 @@ int sbitmap_init_node(struct sbitmap *sb, unsigned int depth, int shift,
 	unsigned int bits_per_word;
 	unsigned int i;
 
-	if (shift < 0) {
-		shift = ilog2(BITS_PER_LONG);
-		/*
-		 * If the bitmap is small, shrink the number of bits per word so
-		 * we spread over a few cachelines, at least. If less than 4
-		 * bits, just forget about it, it's not going to work optimally
-		 * anyway.
-		 */
-		if (depth >= 4) {
-			while ((4U << shift) > depth)
-				shift--;
-		}
-	}
+	if (shift < 0)
+		shift = sbitmap_calculate_shift(depth);
+
 	bits_per_word = 1U << shift;
 	if (bits_per_word > BITS_PER_LONG)
 		return -EINVAL;
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH V4 07/12] blk-mq: add callbacks for storing & retrieving budget token
  2020-11-16  9:07 [PATCH V4 00/12] blk-mq/scsi: tracking device queue depth via sbitmap Ming Lei
                   ` (5 preceding siblings ...)
  2020-11-16  9:07 ` [PATCH V4 06/12] sbitmap: add helper of sbitmap_calculate_shift Ming Lei
@ 2020-11-16  9:07 ` Ming Lei
  2020-11-16  9:07 ` [PATCH V4 08/12] blk-mq: return budget token from .get_budget callback Ming Lei
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 43+ messages in thread
From: Ming Lei @ 2020-11-16  9:07 UTC (permalink / raw)
  To: Jens Axboe, linux-block, Martin K . Petersen, linux-scsi
  Cc: Ming Lei, Omar Sandoval, Kashyap Desai, Sumanesh Samanta,
	Ewan D . Milne, Hannes Reinecke

SCSI is the only driver which requires dispatch budget, and it isn't
fair to add one field into 'struct request' for storing budget token
which will be used in the following patches for improving scsi's device
busy scalability.

Cc: Omar Sandoval <osandov@fb.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Cc: Ewan D. Milne <emilne@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/scsi/scsi_lib.c  | 18 ++++++++++++++++++
 include/linux/blk-mq.h   |  9 +++++++++
 include/scsi/scsi_cmnd.h |  2 ++
 3 files changed, 29 insertions(+)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 60c7a7d74852..022ed2991463 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1635,6 +1635,20 @@ static bool scsi_mq_get_budget(struct request_queue *q)
 	return false;
 }
 
+static void scsi_mq_set_rq_budget_token(struct request *req, int token)
+{
+	struct scsi_cmnd *cmd = blk_mq_rq_to_pdu(req);
+
+	cmd->budget_token = token;
+}
+
+static int scsi_mq_get_rq_budget_token(struct request *req)
+{
+	struct scsi_cmnd *cmd = blk_mq_rq_to_pdu(req);
+
+	return cmd->budget_token;
+}
+
 static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
 			 const struct blk_mq_queue_data *bd)
 {
@@ -1845,6 +1859,8 @@ static const struct blk_mq_ops scsi_mq_ops_no_commit = {
 	.cleanup_rq	= scsi_cleanup_rq,
 	.busy		= scsi_mq_lld_busy,
 	.map_queues	= scsi_map_queues,
+	.set_rq_budget_token = scsi_mq_set_rq_budget_token,
+	.get_rq_budget_token = scsi_mq_get_rq_budget_token,
 };
 
 
@@ -1873,6 +1889,8 @@ static const struct blk_mq_ops scsi_mq_ops = {
 	.cleanup_rq	= scsi_cleanup_rq,
 	.busy		= scsi_mq_lld_busy,
 	.map_queues	= scsi_map_queues,
+	.set_rq_budget_token = scsi_mq_set_rq_budget_token,
+	.get_rq_budget_token = scsi_mq_get_rq_budget_token,
 };
 
 struct request_queue *scsi_mq_alloc_queue(struct scsi_device *sdev)
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 794b2a33a2c3..d2c66b453d07 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -316,6 +316,15 @@ struct blk_mq_ops {
 	 */
 	void (*put_budget)(struct request_queue *);
 
+	/*
+	 * @set_rq_budget_toekn: store rq's budget token
+	 */
+	void (*set_rq_budget_token)(struct request *, int);
+	/*
+	 * @get_rq_budget_toekn: retrieve rq's budget token
+	 */
+	int (*get_rq_budget_token)(struct request *);
+
 	/**
 	 * @timeout: Called on request timeout.
 	 */
diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h
index 69ade4fb71aa..4884f300c896 100644
--- a/include/scsi/scsi_cmnd.h
+++ b/include/scsi/scsi_cmnd.h
@@ -75,6 +75,8 @@ struct scsi_cmnd {
 
 	int eh_eflags;		/* Used by error handlr */
 
+	int budget_token;
+
 	/*
 	 * This is set to jiffies as it was when the command was first
 	 * allocated.  It is used to time how long the command has
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH V4 08/12] blk-mq: return budget token from .get_budget callback
  2020-11-16  9:07 [PATCH V4 00/12] blk-mq/scsi: tracking device queue depth via sbitmap Ming Lei
                   ` (6 preceding siblings ...)
  2020-11-16  9:07 ` [PATCH V4 07/12] blk-mq: add callbacks for storing & retrieving budget token Ming Lei
@ 2020-11-16  9:07 ` Ming Lei
  2020-11-16  9:45   ` Hannes Reinecke
  2020-11-16  9:07 ` [PATCH V4 09/12] scsi: put hot fields of scsi_host_template into one cacheline Ming Lei
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 43+ messages in thread
From: Ming Lei @ 2020-11-16  9:07 UTC (permalink / raw)
  To: Jens Axboe, linux-block, Martin K . Petersen, linux-scsi
  Cc: Ming Lei, Omar Sandoval, Kashyap Desai, Sumanesh Samanta,
	Ewan D . Milne, Hannes Reinecke

SCSI uses one global atomic variable to track queue depth for each
LUN/request queue.

This way doesn't scale well when there is lots of CPU cores and the
disk is very fast. It has been observed that IOPS is affected a lot
by tracking queue depth via sdev->device_busy in IO path.

Return budget token from .get_budget callback, and the budget token
can be passed to driver, so that we can replace the atomic variable
with sbitmap_queue, then the scale issue can be fixed.

Cc: Omar Sandoval <osandov@fb.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Cc: Ewan D. Milne <emilne@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq-sched.c    | 17 +++++++++++++----
 block/blk-mq.c          | 36 +++++++++++++++++++++++++-----------
 block/blk-mq.h          | 25 +++++++++++++++++++++----
 drivers/scsi/scsi_lib.c | 16 +++++++++++-----
 include/linux/blk-mq.h  |  4 ++--
 5 files changed, 72 insertions(+), 26 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index d1eafe2c045c..bf046530ed4c 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -131,6 +131,7 @@ static int __blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx)
 
 	do {
 		struct request *rq;
+		int budget_token;
 
 		if (e->type->ops.has_work && !e->type->ops.has_work(hctx))
 			break;
@@ -140,12 +141,13 @@ static int __blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx)
 			break;
 		}
 
-		if (!blk_mq_get_dispatch_budget(q))
+		budget_token = blk_mq_get_dispatch_budget(q);
+		if (budget_token < 0)
 			break;
 
 		rq = e->type->ops.dispatch_request(hctx);
 		if (!rq) {
-			blk_mq_put_dispatch_budget(q);
+			blk_mq_put_dispatch_budget(q, budget_token);
 			/*
 			 * We're releasing without dispatching. Holding the
 			 * budget could have blocked any "hctx"s with the
@@ -157,6 +159,8 @@ static int __blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx)
 			break;
 		}
 
+		blk_mq_set_rq_budget_token(rq, budget_token);
+
 		/*
 		 * Now this rq owns the budget which has to be released
 		 * if this rq won't be queued to driver via .queue_rq()
@@ -230,6 +234,8 @@ static int blk_mq_do_dispatch_ctx(struct blk_mq_hw_ctx *hctx)
 	struct request *rq;
 
 	do {
+		int budget_token;
+
 		if (!list_empty_careful(&hctx->dispatch)) {
 			ret = -EAGAIN;
 			break;
@@ -238,12 +244,13 @@ static int blk_mq_do_dispatch_ctx(struct blk_mq_hw_ctx *hctx)
 		if (!sbitmap_any_bit_set(&hctx->ctx_map))
 			break;
 
-		if (!blk_mq_get_dispatch_budget(q))
+		budget_token = blk_mq_get_dispatch_budget(q);
+		if (budget_token < 0)
 			break;
 
 		rq = blk_mq_dequeue_from_ctx(hctx, ctx);
 		if (!rq) {
-			blk_mq_put_dispatch_budget(q);
+			blk_mq_put_dispatch_budget(q, budget_token);
 			/*
 			 * We're releasing without dispatching. Holding the
 			 * budget could have blocked any "hctx"s with the
@@ -255,6 +262,8 @@ static int blk_mq_do_dispatch_ctx(struct blk_mq_hw_ctx *hctx)
 			break;
 		}
 
+		blk_mq_set_rq_budget_token(rq, budget_token);
+
 		/*
 		 * Now this rq owns the budget which has to be released
 		 * if this rq won't be queued to driver via .queue_rq()
diff --git a/block/blk-mq.c b/block/blk-mq.c
index ae023cf90c30..9e075c333498 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1296,10 +1296,15 @@ static enum prep_dispatch blk_mq_prep_dispatch_rq(struct request *rq,
 						  bool need_budget)
 {
 	struct blk_mq_hw_ctx *hctx = rq->mq_hctx;
+	int budget_token = -1;
 
-	if (need_budget && !blk_mq_get_dispatch_budget(rq->q)) {
-		blk_mq_put_driver_tag(rq);
-		return PREP_DISPATCH_NO_BUDGET;
+	if (need_budget) {
+		budget_token = blk_mq_get_dispatch_budget(rq->q);
+		if (budget_token < 0) {
+			blk_mq_put_driver_tag(rq);
+			return PREP_DISPATCH_NO_BUDGET;
+		}
+		blk_mq_set_rq_budget_token(rq, budget_token);
 	}
 
 	if (!blk_mq_get_driver_tag(rq)) {
@@ -1316,7 +1321,7 @@ static enum prep_dispatch blk_mq_prep_dispatch_rq(struct request *rq,
 			 * together during handling partial dispatch
 			 */
 			if (need_budget)
-				blk_mq_put_dispatch_budget(rq->q);
+				blk_mq_put_dispatch_budget(rq->q, budget_token);
 			return PREP_DISPATCH_NO_TAG;
 		}
 	}
@@ -1326,12 +1331,16 @@ static enum prep_dispatch blk_mq_prep_dispatch_rq(struct request *rq,
 
 /* release all allocated budgets before calling to blk_mq_dispatch_rq_list */
 static void blk_mq_release_budgets(struct request_queue *q,
-		unsigned int nr_budgets)
+		struct list_head *list)
 {
-	int i;
+	struct request *rq;
 
-	for (i = 0; i < nr_budgets; i++)
-		blk_mq_put_dispatch_budget(q);
+	list_for_each_entry(rq, list, queuelist) {
+		int budget_token = blk_mq_get_rq_budget_token(rq);
+
+		if (budget_token >= 0)
+			blk_mq_put_dispatch_budget(q, budget_token);
+	}
 }
 
 /*
@@ -1429,7 +1438,8 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *list,
 			(hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED);
 		bool no_budget_avail = prep == PREP_DISPATCH_NO_BUDGET;
 
-		blk_mq_release_budgets(q, nr_budgets);
+		if (nr_budgets)
+			blk_mq_release_budgets(q, list);
 
 		spin_lock(&hctx->lock);
 		list_splice_tail_init(list, &hctx->dispatch);
@@ -1999,6 +2009,7 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 {
 	struct request_queue *q = rq->q;
 	bool run_queue = true;
+	int budget_token;
 
 	/*
 	 * RCU or SRCU read lock is needed before checking quiesced flag.
@@ -2016,11 +2027,14 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 	if (q->elevator && !bypass_insert)
 		goto insert;
 
-	if (!blk_mq_get_dispatch_budget(q))
+	budget_token = blk_mq_get_dispatch_budget(q);
+	if (budget_token < 0)
 		goto insert;
 
+	blk_mq_set_rq_budget_token(rq, budget_token);
+
 	if (!blk_mq_get_driver_tag(rq)) {
-		blk_mq_put_dispatch_budget(q);
+		blk_mq_put_dispatch_budget(q, budget_token);
 		goto insert;
 	}
 
diff --git a/block/blk-mq.h b/block/blk-mq.h
index a52703c98b77..bb59a3e54c3b 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -186,17 +186,34 @@ unsigned int blk_mq_in_flight(struct request_queue *q, struct hd_struct *part);
 void blk_mq_in_flight_rw(struct request_queue *q, struct hd_struct *part,
 			 unsigned int inflight[2]);
 
-static inline void blk_mq_put_dispatch_budget(struct request_queue *q)
+static inline void blk_mq_put_dispatch_budget(struct request_queue *q,
+					      int budget_token)
 {
 	if (q->mq_ops->put_budget)
-		q->mq_ops->put_budget(q);
+		q->mq_ops->put_budget(q, budget_token);
 }
 
-static inline bool blk_mq_get_dispatch_budget(struct request_queue *q)
+static inline int blk_mq_get_dispatch_budget(struct request_queue *q)
 {
 	if (q->mq_ops->get_budget)
 		return q->mq_ops->get_budget(q);
-	return true;
+	return 0;
+}
+
+static inline void blk_mq_set_rq_budget_token(struct request *rq, int token)
+{
+	if (token < 0)
+		return;
+
+	if (rq->q->mq_ops->set_rq_budget_token)
+		rq->q->mq_ops->set_rq_budget_token(rq, token);
+}
+
+static inline int blk_mq_get_rq_budget_token(struct request *rq)
+{
+	if (rq->q->mq_ops->get_rq_budget_token)
+		return rq->q->mq_ops->get_rq_budget_token(rq);
+	return -1;
 }
 
 static inline void __blk_mq_inc_active_requests(struct blk_mq_hw_ctx *hctx)
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 022ed2991463..dafcfa3ae77c 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -328,6 +328,7 @@ void scsi_device_unbusy(struct scsi_device *sdev, struct scsi_cmnd *cmd)
 		atomic_dec(&starget->target_busy);
 
 	atomic_dec(&sdev->device_busy);
+	cmd->budget_token = -1;
 }
 
 static void scsi_kick_queue(struct request_queue *q)
@@ -1138,6 +1139,7 @@ void scsi_init_command(struct scsi_device *dev, struct scsi_cmnd *cmd)
 	unsigned long jiffies_at_alloc;
 	int retries, to_clear;
 	bool in_flight;
+	int budget_token = cmd->budget_token;
 
 	if (!blk_rq_is_scsi(rq) && !(flags & SCMD_INITIALIZED)) {
 		flags |= SCMD_INITIALIZED;
@@ -1166,6 +1168,7 @@ void scsi_init_command(struct scsi_device *dev, struct scsi_cmnd *cmd)
 	cmd->retries = retries;
 	if (in_flight)
 		__set_bit(SCMD_STATE_INFLIGHT, &cmd->state);
+	cmd->budget_token = budget_token;
 
 }
 
@@ -1598,19 +1601,19 @@ static void scsi_mq_done(struct scsi_cmnd *cmd)
 	blk_mq_complete_request(cmd->request);
 }
 
-static void scsi_mq_put_budget(struct request_queue *q)
+static void scsi_mq_put_budget(struct request_queue *q, int budget_token)
 {
 	struct scsi_device *sdev = q->queuedata;
 
 	atomic_dec(&sdev->device_busy);
 }
 
-static bool scsi_mq_get_budget(struct request_queue *q)
+static int scsi_mq_get_budget(struct request_queue *q)
 {
 	struct scsi_device *sdev = q->queuedata;
 
 	if (scsi_dev_queue_ready(q, sdev))
-		return true;
+		return 0;
 
 	atomic_inc(&sdev->restarts);
 
@@ -1632,7 +1635,7 @@ static bool scsi_mq_get_budget(struct request_queue *q)
 	if (unlikely(atomic_read(&sdev->device_busy) == 0 &&
 				!scsi_device_blocked(sdev)))
 		blk_mq_delay_run_hw_queues(sdev->request_queue, SCSI_QUEUE_DELAY);
-	return false;
+	return -1;
 }
 
 static void scsi_mq_set_rq_budget_token(struct request *req, int token)
@@ -1660,6 +1663,8 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
 	blk_status_t ret;
 	int reason;
 
+	WARN_ON_ONCE(cmd->budget_token < 0);
+
 	/*
 	 * If the device is not in running state we will reject some or all
 	 * commands.
@@ -1711,7 +1716,8 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
 	if (scsi_target(sdev)->can_queue > 0)
 		atomic_dec(&scsi_target(sdev)->target_busy);
 out_put_budget:
-	scsi_mq_put_budget(q);
+	scsi_mq_put_budget(q, cmd->budget_token);
+	cmd->budget_token = -1;
 	switch (ret) {
 	case BLK_STS_OK:
 		break;
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index d2c66b453d07..b2e1b8daf12f 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -309,12 +309,12 @@ struct blk_mq_ops {
 	 * reserved budget. Also we have to handle failure case
 	 * of .get_budget for avoiding I/O deadlock.
 	 */
-	bool (*get_budget)(struct request_queue *);
+	int (*get_budget)(struct request_queue *);
 
 	/**
 	 * @put_budget: Release the reserved budget.
 	 */
-	void (*put_budget)(struct request_queue *);
+	void (*put_budget)(struct request_queue *, int);
 
 	/*
 	 * @set_rq_budget_toekn: store rq's budget token
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH V4 09/12] scsi: put hot fields of scsi_host_template into one cacheline
  2020-11-16  9:07 [PATCH V4 00/12] blk-mq/scsi: tracking device queue depth via sbitmap Ming Lei
                   ` (7 preceding siblings ...)
  2020-11-16  9:07 ` [PATCH V4 08/12] blk-mq: return budget token from .get_budget callback Ming Lei
@ 2020-11-16  9:07 ` Ming Lei
  2020-11-16  9:07 ` [PATCH V4 10/12] scsi: add scsi_device_busy() to read sdev->device_busy Ming Lei
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 43+ messages in thread
From: Ming Lei @ 2020-11-16  9:07 UTC (permalink / raw)
  To: Jens Axboe, linux-block, Martin K . Petersen, linux-scsi
  Cc: Ming Lei, Omar Sandoval, Kashyap Desai, Sumanesh Samanta,
	Ewan D . Milne, Hannes Reinecke, Christoph Hellwig

The following three fields of scsi_host_template are referenced in
scsi IO submission path, so put them together into one cacheline:

- cmd_size
- queuecommand
- commit_rqs

Cc: Omar Sandoval <osandov@fb.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Cc: Ewan D. Milne <emilne@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 include/scsi/scsi_host.h | 72 ++++++++++++++++++++++------------------
 1 file changed, 39 insertions(+), 33 deletions(-)

diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index 701f178b20ae..2d6e3a1f5f0b 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -30,40 +30,15 @@ struct scsi_transport_template;
 #define MODE_TARGET 0x02
 
 struct scsi_host_template {
-	struct module *module;
-	const char *name;
-
 	/*
-	 * The info function will return whatever useful information the
-	 * developer sees fit.  If not provided, then the name field will
-	 * be used instead.
-	 *
-	 * Status: OPTIONAL
+	 * Put fields referenced in IO submission path together in
+	 * same cacheline
 	 */
-	const char *(* info)(struct Scsi_Host *);
 
 	/*
-	 * Ioctl interface
-	 *
-	 * Status: OPTIONAL
-	 */
-	int (*ioctl)(struct scsi_device *dev, unsigned int cmd,
-		     void __user *arg);
-
-
-#ifdef CONFIG_COMPAT
-	/* 
-	 * Compat handler. Handle 32bit ABI.
-	 * When unknown ioctl is passed return -ENOIOCTLCMD.
-	 *
-	 * Status: OPTIONAL
+	 * Additional per-command data allocated for the driver.
 	 */
-	int (*compat_ioctl)(struct scsi_device *dev, unsigned int cmd,
-			    void __user *arg);
-#endif
-
-	int (*init_cmd_priv)(struct Scsi_Host *shost, struct scsi_cmnd *cmd);
-	int (*exit_cmd_priv)(struct Scsi_Host *shost, struct scsi_cmnd *cmd);
+	unsigned int cmd_size;
 
 	/*
 	 * The queuecommand function is used to queue up a scsi
@@ -111,6 +86,41 @@ struct scsi_host_template {
 	 */
 	void (*commit_rqs)(struct Scsi_Host *, u16);
 
+	struct module *module;
+	const char *name;
+
+	/*
+	 * The info function will return whatever useful information the
+	 * developer sees fit.  If not provided, then the name field will
+	 * be used instead.
+	 *
+	 * Status: OPTIONAL
+	 */
+	const char *(*info)(struct Scsi_Host *);
+
+	/*
+	 * Ioctl interface
+	 *
+	 * Status: OPTIONAL
+	 */
+	int (*ioctl)(struct scsi_device *dev, unsigned int cmd,
+		     void __user *arg);
+
+
+#ifdef CONFIG_COMPAT
+	/*
+	 * Compat handler. Handle 32bit ABI.
+	 * When unknown ioctl is passed return -ENOIOCTLCMD.
+	 *
+	 * Status: OPTIONAL
+	 */
+	int (*compat_ioctl)(struct scsi_device *dev, unsigned int cmd,
+			    void __user *arg);
+#endif
+
+	int (*init_cmd_priv)(struct Scsi_Host *shost, struct scsi_cmnd *cmd);
+	int (*exit_cmd_priv)(struct Scsi_Host *shost, struct scsi_cmnd *cmd);
+
 	/*
 	 * This is an error handling strategy routine.  You don't need to
 	 * define one of these if you don't want to - there is a default
@@ -478,10 +488,6 @@ struct scsi_host_template {
 	 */
 	u64 vendor_id;
 
-	/*
-	 * Additional per-command data allocated for the driver.
-	 */
-	unsigned int cmd_size;
 	struct scsi_host_cmd_pool *cmd_pool;
 
 	/* Delay for runtime autosuspend */
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH V4 10/12] scsi: add scsi_device_busy() to read sdev->device_busy
  2020-11-16  9:07 [PATCH V4 00/12] blk-mq/scsi: tracking device queue depth via sbitmap Ming Lei
                   ` (8 preceding siblings ...)
  2020-11-16  9:07 ` [PATCH V4 09/12] scsi: put hot fields of scsi_host_template into one cacheline Ming Lei
@ 2020-11-16  9:07 ` Ming Lei
  2020-11-16  9:07 ` [PATCH V4 11/12] scsi: make sure sdev->queue_depth is <= max(shost->can_queue, 1024) Ming Lei
  2020-11-16  9:07 ` [PATCH V4 12/12] scsi: replace sdev->device_busy with sbitmap Ming Lei
  11 siblings, 0 replies; 43+ messages in thread
From: Ming Lei @ 2020-11-16  9:07 UTC (permalink / raw)
  To: Jens Axboe, linux-block, Martin K . Petersen, linux-scsi
  Cc: Ming Lei, Omar Sandoval, Kashyap Desai, Sumanesh Samanta,
	Ewan D . Milne, Hannes Reinecke

Add scsi_device_busy() for drivers, so that we can prepare for tracking
device queue depth via sbitmap_queue.

Cc: Omar Sandoval <osandov@fb.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Cc: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/message/fusion/mptsas.c      | 2 +-
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 2 +-
 drivers/scsi/scsi_lib.c              | 4 ++--
 drivers/scsi/scsi_sysfs.c            | 2 +-
 drivers/scsi/sg.c                    | 2 +-
 include/scsi/scsi_device.h           | 5 +++++
 6 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/drivers/message/fusion/mptsas.c b/drivers/message/fusion/mptsas.c
index 18b91ea1a353..25dd9b2e12ab 100644
--- a/drivers/message/fusion/mptsas.c
+++ b/drivers/message/fusion/mptsas.c
@@ -3756,7 +3756,7 @@ mptsas_send_link_status_event(struct fw_event_work *fw_event)
 						printk(MYIOC_s_DEBUG_FMT
 						"SDEV OUTSTANDING CMDS"
 						"%d\n", ioc->name,
-						atomic_read(&sdev->device_busy)));
+						scsi_device_busy(sdev)));
 				}
 
 			}
diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 5f845d7094fc..ea1bed3e7a4a 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -3255,7 +3255,7 @@ scsih_dev_reset(struct scsi_cmnd *scmd)
 		MPI2_SCSITASKMGMT_TASKTYPE_LOGICAL_UNIT_RESET, 0, 0,
 		tr_timeout, tr_method);
 	/* Check for busy commands after reset */
-	if (r == SUCCESS && atomic_read(&scmd->device->device_busy))
+	if (r == SUCCESS && scsi_device_busy(scmd->device))
 		r = FAILED;
  out:
 	sdev_printk(KERN_INFO, scmd->device, "device reset: %s scmd(0x%p)\n",
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index dafcfa3ae77c..4709418c47e0 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -384,7 +384,7 @@ static void scsi_single_lun_run(struct scsi_device *current_sdev)
 
 static inline bool scsi_device_is_busy(struct scsi_device *sdev)
 {
-	if (atomic_read(&sdev->device_busy) >= sdev->queue_depth)
+	if (scsi_device_busy(sdev) >= sdev->queue_depth)
 		return true;
 	if (atomic_read(&sdev->device_blocked) > 0)
 		return true;
@@ -1632,7 +1632,7 @@ static int scsi_mq_get_budget(struct request_queue *q)
 	 * the .restarts flag, and the request queue will be run for handling
 	 * this request, see scsi_end_request().
 	 */
-	if (unlikely(atomic_read(&sdev->device_busy) == 0 &&
+	if (unlikely(scsi_device_busy(sdev) == 0 &&
 				!scsi_device_blocked(sdev)))
 		blk_mq_delay_run_hw_queues(sdev->request_queue, SCSI_QUEUE_DELAY);
 	return -1;
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index d6e344fa33ad..ef5c79037bde 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -670,7 +670,7 @@ sdev_show_device_busy(struct device *dev, struct device_attribute *attr,
 		char *buf)
 {
 	struct scsi_device *sdev = to_scsi_device(dev);
-	return snprintf(buf, 20, "%d\n", atomic_read(&sdev->device_busy));
+	return snprintf(buf, 20, "%d\n", scsi_device_busy(sdev));
 }
 static DEVICE_ATTR(device_busy, S_IRUGO, sdev_show_device_busy, NULL);
 
diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index bfa8d77322d7..4fa2fbb2c833 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -2504,7 +2504,7 @@ static int sg_proc_seq_show_dev(struct seq_file *s, void *v)
 			      scsidp->id, scsidp->lun, (int) scsidp->type,
 			      1,
 			      (int) scsidp->queue_depth,
-			      (int) atomic_read(&scsidp->device_busy),
+			      (int) scsi_device_busy(scsidp),
 			      (int) scsi_device_online(scsidp));
 	}
 	read_unlock_irqrestore(&sg_index_lock, iflags);
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index 1a5c9a3df6d6..dd0b9f690a26 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -590,6 +590,11 @@ static inline int scsi_device_supports_vpd(struct scsi_device *sdev)
 	return 0;
 }
 
+static inline int scsi_device_busy(struct scsi_device *sdev)
+{
+	return atomic_read(&sdev->device_busy);
+}
+
 #define MODULE_ALIAS_SCSI_DEVICE(type) \
 	MODULE_ALIAS("scsi:t-" __stringify(type) "*")
 #define SCSI_DEVICE_MODALIAS_FMT "scsi:t-0x%02x"
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH V4 11/12] scsi: make sure sdev->queue_depth is <= max(shost->can_queue, 1024)
  2020-11-16  9:07 [PATCH V4 00/12] blk-mq/scsi: tracking device queue depth via sbitmap Ming Lei
                   ` (9 preceding siblings ...)
  2020-11-16  9:07 ` [PATCH V4 10/12] scsi: add scsi_device_busy() to read sdev->device_busy Ming Lei
@ 2020-11-16  9:07 ` Ming Lei
  2020-11-16  9:44   ` Hannes Reinecke
  2020-11-16  9:07 ` [PATCH V4 12/12] scsi: replace sdev->device_busy with sbitmap Ming Lei
  11 siblings, 1 reply; 43+ messages in thread
From: Ming Lei @ 2020-11-16  9:07 UTC (permalink / raw)
  To: Jens Axboe, linux-block, Martin K . Petersen, linux-scsi
  Cc: Ming Lei, Omar Sandoval, Kashyap Desai, Sumanesh Samanta, Ewan D . Milne

Limit scsi device's queue depth is less than max(host->can_queue, 1024)
in scsi_change_queue_depth(), and 1024 is big enough for saturating
current fast SCSI LUN(SSD, or raid volume on multiple SSDs).

We need this patch for replacing sdev->device_busy with sbitmap which
has to be pre-allocated with reasonable max depth.

Cc: Omar Sandoval <osandov@fb.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Cc: Ewan D. Milne <emilne@redhat.com>
Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/scsi/scsi.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index 24619c3bebd5..a28d48c850cf 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -214,6 +214,15 @@ void scsi_finish_command(struct scsi_cmnd *cmd)
 	scsi_io_completion(cmd, good_bytes);
 }
 
+
+/*
+ * 1024 is big enough for saturating the fast scsi LUN now
+ */
+static int scsi_device_max_queue_depth(struct scsi_device *sdev)
+{
+	return max_t(int, sdev->host->can_queue, 1024);
+}
+
 /**
  * scsi_change_queue_depth - change a device's queue depth
  * @sdev: SCSI Device in question
@@ -223,6 +232,8 @@ void scsi_finish_command(struct scsi_cmnd *cmd)
  */
 int scsi_change_queue_depth(struct scsi_device *sdev, int depth)
 {
+	depth = min_t(int, depth, scsi_device_max_queue_depth(sdev));
+
 	if (depth > 0) {
 		sdev->queue_depth = depth;
 		wmb();
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH V4 12/12] scsi: replace sdev->device_busy with sbitmap
  2020-11-16  9:07 [PATCH V4 00/12] blk-mq/scsi: tracking device queue depth via sbitmap Ming Lei
                   ` (10 preceding siblings ...)
  2020-11-16  9:07 ` [PATCH V4 11/12] scsi: make sure sdev->queue_depth is <= max(shost->can_queue, 1024) Ming Lei
@ 2020-11-16  9:07 ` Ming Lei
  2020-11-16 11:49     ` kernel test robot
  11 siblings, 1 reply; 43+ messages in thread
From: Ming Lei @ 2020-11-16  9:07 UTC (permalink / raw)
  To: Jens Axboe, linux-block, Martin K . Petersen, linux-scsi
  Cc: Ming Lei, Omar Sandoval, Kashyap Desai, Sumanesh Samanta,
	Ewan D . Milne, Hannes Reinecke

scsi requires one global atomic variable to track queue depth for each LUN/
request queue, meantime blk-mq tracks queue depth for each hctx. This SCSI's
requirement can't be implemented in blk-mq easily, cause it is a bigger &
harder problem to spread the device or request queue's depth among all hw
queues.

The current approach by using atomic variable can't scale well when there
is lots of CPU cores and the disk is very fast and IO are submitted to this
device concurrently. It has been observed that IOPS is affected a lot by
tracking queue depth via sdev->device_busy in IO path.

So replace the atomic variable sdev->device_busy with sbitmap for
tracking scsi device queue depth.

It is observed that IOPS is improved ~30% by this patchset in the
following test:

1) test machine(32 logical CPU cores)
	Thread(s) per core:  2
	Core(s) per socket:  8
	Socket(s):           2
	NUMA node(s):        2
	Model name:          Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz

2) setup scsi_debug:
modprobe scsi_debug virtual_gb=128 max_luns=1 submit_queues=32 delay=0 max_queue=256

3) fio script:
fio --rw=randread --size=128G --direct=1 --ioengine=libaio --iodepth=2048 \
	--numjobs=32 --bs=4k --group_reporting=1 --group_reporting=1 --runtime=60 \
	--loops=10000 --name=job1 --filename=/dev/sdN

[1] https://lore.kernel.org/linux-block/20200119071432.18558-6-ming.lei@redhat.com/

Cc: Omar Sandoval <osandov@fb.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Cc: Ewan D. Milne <emilne@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/scsi/scsi.c        |  4 +++-
 drivers/scsi/scsi_lib.c    | 33 +++++++++++++++------------------
 drivers/scsi/scsi_priv.h   |  3 +++
 drivers/scsi/scsi_scan.c   | 23 +++++++++++++++++++++--
 drivers/scsi/scsi_sysfs.c  |  2 ++
 include/scsi/scsi_device.h |  5 +++--
 6 files changed, 47 insertions(+), 23 deletions(-)

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index a28d48c850cf..e9e2f0e15ac8 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -218,7 +218,7 @@ void scsi_finish_command(struct scsi_cmnd *cmd)
 /*
  * 1024 is big enough for saturating the fast scsi LUN now
  */
-static int scsi_device_max_queue_depth(struct scsi_device *sdev)
+int scsi_device_max_queue_depth(struct scsi_device *sdev)
 {
 	return max_t(int, sdev->host->can_queue, 1024);
 }
@@ -242,6 +242,8 @@ int scsi_change_queue_depth(struct scsi_device *sdev, int depth)
 	if (sdev->request_queue)
 		blk_set_queue_depth(sdev->request_queue, depth);
 
+	sbitmap_resize(&sdev->budget_map, sdev->queue_depth);
+
 	return sdev->queue_depth;
 }
 EXPORT_SYMBOL(scsi_change_queue_depth);
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 4709418c47e0..f5b675d02929 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -327,7 +327,7 @@ void scsi_device_unbusy(struct scsi_device *sdev, struct scsi_cmnd *cmd)
 	if (starget->can_queue > 0)
 		atomic_dec(&starget->target_busy);
 
-	atomic_dec(&sdev->device_busy);
+	sbitmap_put(&sdev->budget_map, cmd->budget_token);
 	cmd->budget_token = -1;
 }
 
@@ -1250,19 +1250,17 @@ scsi_device_state_check(struct scsi_device *sdev, struct request *req)
 }
 
 /*
- * scsi_dev_queue_ready: if we can send requests to sdev, return 1 else
- * return 0.
- *
- * Called with the queue_lock held.
+ * scsi_dev_queue_ready: if we can send requests to sdev, assign one token
+ * and return the token else return -1.
  */
 static inline int scsi_dev_queue_ready(struct request_queue *q,
 				  struct scsi_device *sdev)
 {
-	unsigned int busy;
+	int token;
 
-	busy = atomic_inc_return(&sdev->device_busy) - 1;
+	token = sbitmap_get(&sdev->budget_map);
 	if (atomic_read(&sdev->device_blocked)) {
-		if (busy)
+		if (token >= 0)
 			goto out_dec;
 
 		/*
@@ -1274,13 +1272,11 @@ static inline int scsi_dev_queue_ready(struct request_queue *q,
 				   "unblocking device at zero depth\n"));
 	}
 
-	if (busy >= sdev->queue_depth)
-		goto out_dec;
-
-	return 1;
+	return token;
 out_dec:
-	atomic_dec(&sdev->device_busy);
-	return 0;
+	if (token >= 0)
+		sbitmap_put(&sdev->budget_map, token);
+	return -1;
 }
 
 /*
@@ -1605,15 +1601,16 @@ static void scsi_mq_put_budget(struct request_queue *q, int budget_token)
 {
 	struct scsi_device *sdev = q->queuedata;
 
-	atomic_dec(&sdev->device_busy);
+	sbitmap_put(&sdev->budget_map, budget_token);
 }
 
 static int scsi_mq_get_budget(struct request_queue *q)
 {
 	struct scsi_device *sdev = q->queuedata;
+	int token = scsi_dev_queue_ready(q, sdev);
 
-	if (scsi_dev_queue_ready(q, sdev))
-		return 0;
+	if (token >= 0)
+		return token;
 
 	atomic_inc(&sdev->restarts);
 
@@ -1723,7 +1720,7 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
 		break;
 	case BLK_STS_RESOURCE:
 	case BLK_STS_ZONE_RESOURCE:
-		if (atomic_read(&sdev->device_busy) ||
+		if (sbitmap_any_bit_set(&sdev->budget_map) ||
 		    scsi_device_blocked(sdev))
 			ret = BLK_STS_DEV_RESOURCE;
 		break;
diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
index 180636d54982..30b35002d2f8 100644
--- a/drivers/scsi/scsi_priv.h
+++ b/drivers/scsi/scsi_priv.h
@@ -5,6 +5,7 @@
 #include <linux/device.h>
 #include <linux/async.h>
 #include <scsi/scsi_device.h>
+#include <linux/sbitmap.h>
 
 struct request_queue;
 struct request;
@@ -182,6 +183,8 @@ static inline void scsi_dh_add_device(struct scsi_device *sdev) { }
 static inline void scsi_dh_release_device(struct scsi_device *sdev) { }
 #endif
 
+extern int scsi_device_max_queue_depth(struct scsi_device *sdev);
+
 /* 
  * internal scsi timeout functions: for use by mid-layer and transport
  * classes.
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 9af50e6f94c4..9f1b7f3c650a 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -215,6 +215,7 @@ static void scsi_unlock_floptical(struct scsi_device *sdev,
 static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget,
 					   u64 lun, void *hostdata)
 {
+	unsigned int depth;
 	struct scsi_device *sdev;
 	int display_failure_msg = 1, ret;
 	struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
@@ -276,8 +277,25 @@ static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget,
 	WARN_ON_ONCE(!blk_get_queue(sdev->request_queue));
 	sdev->request_queue->queuedata = sdev;
 
-	scsi_change_queue_depth(sdev, sdev->host->cmd_per_lun ?
-					sdev->host->cmd_per_lun : 1);
+	depth = sdev->host->cmd_per_lun ?: 1;
+
+	/*
+	 * Use .can_queue as budget map's depth because we have to
+	 * support adjusting queue depth from sysfs. Meantime use
+	 * default device queue depth to figure out sbitmap shift
+	 * since we use this queue depth most of times.
+	 */
+	if (sbitmap_init_node(&sdev->budget_map,
+				scsi_device_max_queue_depth(sdev),
+				sbitmap_calculate_shift(depth),
+				GFP_KERNEL, sdev->request_queue->node,
+				false, true)) {
+		put_device(&starget->dev);
+		kfree(sdev);
+		goto out;
+	}
+
+	scsi_change_queue_depth(sdev, depth);
 
 	scsi_sysfs_device_initialize(sdev);
 
@@ -979,6 +997,7 @@ static int scsi_add_lun(struct scsi_device *sdev, unsigned char *inq_result,
 		scsi_attach_vpd(sdev);
 
 	sdev->max_queue_depth = sdev->queue_depth;
+	WARN_ON_ONCE(sdev->max_queue_depth > sdev->budget_map.depth);
 	sdev->sdev_bflags = *bflags;
 
 	/*
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index ef5c79037bde..79c36cfdb84b 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -477,6 +477,8 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work)
 	/* NULL queue means the device can't be used */
 	sdev->request_queue = NULL;
 
+	sbitmap_free(&sdev->budget_map);
+
 	mutex_lock(&sdev->inquiry_mutex);
 	vpd_pg0 = rcu_replace_pointer(sdev->vpd_pg0, vpd_pg0,
 				       lockdep_is_held(&sdev->inquiry_mutex));
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index dd0b9f690a26..05c7c320ef32 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -8,6 +8,7 @@
 #include <linux/blkdev.h>
 #include <scsi/scsi.h>
 #include <linux/atomic.h>
+#include <linux/sbitmap.h>
 
 struct device;
 struct request_queue;
@@ -106,7 +107,7 @@ struct scsi_device {
 	struct list_head    siblings;   /* list of all devices on this host */
 	struct list_head    same_target_siblings; /* just the devices sharing same target id */
 
-	atomic_t device_busy;		/* commands actually active on LLDD */
+	struct sbitmap budget_map;
 	atomic_t device_blocked;	/* Device returned QUEUE_FULL. */
 
 	atomic_t restarts;
@@ -592,7 +593,7 @@ static inline int scsi_device_supports_vpd(struct scsi_device *sdev)
 
 static inline int scsi_device_busy(struct scsi_device *sdev)
 {
-	return atomic_read(&sdev->device_busy);
+	return sbitmap_weight(&sdev->budget_map);
 }
 
 #define MODULE_ALIAS_SCSI_DEVICE(type) \
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH V4 04/12] sbitmap: move allocation hint into sbitmap
  2020-11-16  9:07 ` [PATCH V4 04/12] sbitmap: move allocation hint into sbitmap Ming Lei
@ 2020-11-16  9:33   ` Hannes Reinecke
  0 siblings, 0 replies; 43+ messages in thread
From: Hannes Reinecke @ 2020-11-16  9:33 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe, linux-block, Martin K . Petersen, linux-scsi
  Cc: Omar Sandoval, Kashyap Desai, Sumanesh Samanta, Ewan D . Milne

On 11/16/20 10:07 AM, Ming Lei wrote:
> Allocation hint should have belonged to sbitmap, also when sbitmap's
> depth is high and no need to use mulitple wakeup queues, user can
> benefit from percpu allocation hint too.
> 
> So move allocation hint into sbitmap, then scsi device queue can benefit
> from allocation hint when converting to plain sbitmap in the following
> patches.
> 
> Cc: Omar Sandoval <osandov@fb.com>
> Cc: Kashyap Desai <kashyap.desai@broadcom.com>
> Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
> Cc: Ewan D. Milne <emilne@redhat.com>
> Cc: Hannes Reinecke <hare@suse.de>
> Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>   block/blk-mq.c          |   2 +-
>   block/kyber-iosched.c   |   2 +-
>   include/linux/sbitmap.h |  41 +++++++++------
>   lib/sbitmap.c           | 112 +++++++++++++++++++++++-----------------
>   4 files changed, 94 insertions(+), 63 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH V4 05/12] sbitmap: export sbitmap_weight
  2020-11-16  9:07 ` [PATCH V4 05/12] sbitmap: export sbitmap_weight Ming Lei
@ 2020-11-16  9:38   ` Hannes Reinecke
  2020-11-17  2:10     ` Ming Lei
  2020-11-17  9:18   ` John Garry
  1 sibling, 1 reply; 43+ messages in thread
From: Hannes Reinecke @ 2020-11-16  9:38 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe, linux-block, Martin K . Petersen, linux-scsi
  Cc: Omar Sandoval, Kashyap Desai, Sumanesh Samanta, Ewan D . Milne

On 11/16/20 10:07 AM, Ming Lei wrote:
> SCSI's .device_busy will be converted to sbitmap, and sbitmap_weight
> is needed, so export the helper.
> 
> Cc: Omar Sandoval <osandov@fb.com>
> Cc: Kashyap Desai <kashyap.desai@broadcom.com>
> Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
> Cc: Ewan D. Milne <emilne@redhat.com>
> Reviewed-by: Hannes Reinecke <hare@suse.de>
> Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>   include/linux/sbitmap.h |  9 +++++++++
>   lib/sbitmap.c           | 11 ++++++-----
>   2 files changed, 15 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/sbitmap.h b/include/linux/sbitmap.h
> index 103b41c03311..34343ce3ef6c 100644
> --- a/include/linux/sbitmap.h
> +++ b/include/linux/sbitmap.h
> @@ -346,6 +346,15 @@ static inline int sbitmap_test_bit(struct sbitmap *sb, unsigned int bitnr)
>    */
>   void sbitmap_show(struct sbitmap *sb, struct seq_file *m);
>   
> +
> +/**
> + * sbitmap_weight() - Return how many real bits set in a &struct sbitmap.
> + * @sb: Bitmap to check.
> + *
> + * Return: How many real bits set
> + */
> +unsigned int sbitmap_weight(const struct sbitmap *sb);
> +
>   /**
>    * sbitmap_bitmap_show() - Write a hex dump of a &struct sbitmap to a &struct
>    * seq_file.
> diff --git a/lib/sbitmap.c b/lib/sbitmap.c
> index dcd6a89b4d2f..fb1d3c2f70a2 100644
> --- a/lib/sbitmap.c
> +++ b/lib/sbitmap.c
> @@ -342,20 +342,21 @@ static unsigned int __sbitmap_weight(const struct sbitmap *sb, bool set)
>   	return weight;
>   }
>   
> -static unsigned int sbitmap_weight(const struct sbitmap *sb)
> +static unsigned int sbitmap_cleared(const struct sbitmap *sb)
>   {
> -	return __sbitmap_weight(sb, true);
> +	return __sbitmap_weight(sb, false);
>   }
>   
> -static unsigned int sbitmap_cleared(const struct sbitmap *sb)
> +unsigned int sbitmap_weight(const struct sbitmap *sb)
>   {
> -	return __sbitmap_weight(sb, false);
> +	return __sbitmap_weight(sb, true) - sbitmap_cleared(sb);
>   }
> +EXPORT_SYMBOL_GPL(sbitmap_weight);
>   
That is extremely confusing. Why do you change the meaning of 
'sbitmap_weight' from __sbitmap_weight(sb, true) to
__sbitmap_weight(sb, true) - __sbitmap_weight(sb, false)?
Does this mean that the original definition was wrong?
Or does this mean that this patch implies a different meaning of 
'sbitmap_weight'?
In either case, it's not 'just' an export, so it does warrant an 
explanation.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH V4 11/12] scsi: make sure sdev->queue_depth is <= max(shost->can_queue, 1024)
  2020-11-16  9:07 ` [PATCH V4 11/12] scsi: make sure sdev->queue_depth is <= max(shost->can_queue, 1024) Ming Lei
@ 2020-11-16  9:44   ` Hannes Reinecke
  2020-11-17  2:18     ` Ming Lei
  0 siblings, 1 reply; 43+ messages in thread
From: Hannes Reinecke @ 2020-11-16  9:44 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe, linux-block, Martin K . Petersen, linux-scsi
  Cc: Omar Sandoval, Kashyap Desai, Sumanesh Samanta, Ewan D . Milne

On 11/16/20 10:07 AM, Ming Lei wrote:
> Limit scsi device's queue depth is less than max(host->can_queue, 1024)
> in scsi_change_queue_depth(), and 1024 is big enough for saturating
> current fast SCSI LUN(SSD, or raid volume on multiple SSDs).
> 
> We need this patch for replacing sdev->device_busy with sbitmap which
> has to be pre-allocated with reasonable max depth.
> 
> Cc: Omar Sandoval <osandov@fb.com>
> Cc: Kashyap Desai <kashyap.desai@broadcom.com>
> Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
> Cc: Ewan D. Milne <emilne@redhat.com>
> Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>   drivers/scsi/scsi.c | 11 +++++++++++
>   1 file changed, 11 insertions(+)
> 
> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
> index 24619c3bebd5..a28d48c850cf 100644
> --- a/drivers/scsi/scsi.c
> +++ b/drivers/scsi/scsi.c
> @@ -214,6 +214,15 @@ void scsi_finish_command(struct scsi_cmnd *cmd)
>   	scsi_io_completion(cmd, good_bytes);
>   }
>   
> +
> +/*
> + * 1024 is big enough for saturating the fast scsi LUN now
> + */
> +static int scsi_device_max_queue_depth(struct scsi_device *sdev)
> +{
> +	return max_t(int, sdev->host->can_queue, 1024);
> +}
> +

Shouldn't this rather be initialized with scsi_host->can_queue?
These 'should be enough' settings inevitable turn out to be not enough 
in the long run ...

>   /**
>    * scsi_change_queue_depth - change a device's queue depth
>    * @sdev: SCSI Device in question
> @@ -223,6 +232,8 @@ void scsi_finish_command(struct scsi_cmnd *cmd)
>    */
>   int scsi_change_queue_depth(struct scsi_device *sdev, int depth)
>   {
> +	depth = min_t(int, depth, scsi_device_max_queue_depth(sdev));
> +
>   	if (depth > 0) {
>   		sdev->queue_depth = depth;
>   		wmb();
> 

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH V4 08/12] blk-mq: return budget token from .get_budget callback
  2020-11-16  9:07 ` [PATCH V4 08/12] blk-mq: return budget token from .get_budget callback Ming Lei
@ 2020-11-16  9:45   ` Hannes Reinecke
  0 siblings, 0 replies; 43+ messages in thread
From: Hannes Reinecke @ 2020-11-16  9:45 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe, linux-block, Martin K . Petersen, linux-scsi
  Cc: Omar Sandoval, Kashyap Desai, Sumanesh Samanta, Ewan D . Milne

On 11/16/20 10:07 AM, Ming Lei wrote:
> SCSI uses one global atomic variable to track queue depth for each
> LUN/request queue.
> 
> This way doesn't scale well when there is lots of CPU cores and the
> disk is very fast. It has been observed that IOPS is affected a lot
> by tracking queue depth via sdev->device_busy in IO path.
> 
> Return budget token from .get_budget callback, and the budget token
> can be passed to driver, so that we can replace the atomic variable
> with sbitmap_queue, then the scale issue can be fixed.
> 
> Cc: Omar Sandoval <osandov@fb.com>
> Cc: Kashyap Desai <kashyap.desai@broadcom.com>
> Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
> Cc: Ewan D. Milne <emilne@redhat.com>
> Cc: Hannes Reinecke <hare@suse.de>
> Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>   block/blk-mq-sched.c    | 17 +++++++++++++----
>   block/blk-mq.c          | 36 +++++++++++++++++++++++++-----------
>   block/blk-mq.h          | 25 +++++++++++++++++++++----
>   drivers/scsi/scsi_lib.c | 16 +++++++++++-----
>   include/linux/blk-mq.h  |  4 ++--
>   5 files changed, 72 insertions(+), 26 deletions(-)
> Reviewed-by: Hannes Reinecke <hare@suse.de

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH V4 12/12] scsi: replace sdev->device_busy with sbitmap
  2020-11-16  9:07 ` [PATCH V4 12/12] scsi: replace sdev->device_busy with sbitmap Ming Lei
@ 2020-11-16 11:49     ` kernel test robot
  0 siblings, 0 replies; 43+ messages in thread
From: kernel test robot @ 2020-11-16 11:49 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe, linux-block, Martin K . Petersen, linux-scsi
  Cc: kbuild-all, clang-built-linux, Ming Lei, Omar Sandoval,
	Kashyap Desai, Sumanesh Samanta, Ewan D . Milne, Hannes Reinecke

[-- Attachment #1: Type: text/plain, Size: 4845 bytes --]

Hi Ming,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on block/for-next]
[also build test ERROR on mkp-scsi/for-next scsi/for-next v5.10-rc4 next-20201116]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
config: powerpc64-randconfig-r026-20201116 (attached as .config)
compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project c044709b8fbea2a9a375e4173a6bd735f6866c0c)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install powerpc64 cross compiling tool for clang build
        # apt-get install binutils-powerpc64-linux-gnu
        # https://github.com/0day-ci/linux/commit/cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
        git checkout cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

>> drivers/scsi/megaraid/megaraid_sas_fusion.c:365:41: error: no member named 'device_busy' in 'struct scsi_device'
           sdev_busy = atomic_read(&scmd->device->device_busy);
                                    ~~~~~~~~~~~~  ^
   1 error generated.

vim +365 drivers/scsi/megaraid/megaraid_sas_fusion.c

4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  353  
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  354  static inline void
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  355  megasas_get_msix_index(struct megasas_instance *instance,
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  356  		       struct scsi_cmnd *scmd,
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  357  		       struct megasas_cmd_fusion *cmd,
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  358  		       u8 data_arms)
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  359  {
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  360  	int sdev_busy;
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  361  
103fbf8e4020845 Kashyap Desai 2020-08-19  362  	/* TBD - if sml remove device_busy in future, driver
103fbf8e4020845 Kashyap Desai 2020-08-19  363  	 * should track counter in internal structure.
103fbf8e4020845 Kashyap Desai 2020-08-19  364  	 */
103fbf8e4020845 Kashyap Desai 2020-08-19 @365  	sdev_busy = atomic_read(&scmd->device->device_busy);
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  366  
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  367  	if (instance->perf_mode == MR_BALANCED_PERF_MODE &&
103fbf8e4020845 Kashyap Desai 2020-08-19  368  	    sdev_busy > (data_arms * MR_DEVICE_HIGH_IOPS_DEPTH)) {
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  369  		cmd->request_desc->SCSIIO.MSIxIndex =
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  370  			mega_mod64((atomic64_add_return(1, &instance->high_iops_outstanding) /
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  371  					MR_HIGH_IOPS_BATCH_COUNT), instance->low_latency_index_start);
103fbf8e4020845 Kashyap Desai 2020-08-19  372  	} else if (instance->msix_load_balance) {
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  373  		cmd->request_desc->SCSIIO.MSIxIndex =
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  374  			(mega_mod64(atomic64_add_return(1, &instance->total_io_count),
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  375  				instance->msix_vectors));
103fbf8e4020845 Kashyap Desai 2020-08-19  376  	} else if (instance->host->nr_hw_queues > 1) {
103fbf8e4020845 Kashyap Desai 2020-08-19  377  		u32 tag = blk_mq_unique_tag(scmd->request);
103fbf8e4020845 Kashyap Desai 2020-08-19  378  
103fbf8e4020845 Kashyap Desai 2020-08-19  379  		cmd->request_desc->SCSIIO.MSIxIndex = blk_mq_unique_tag_to_hwq(tag) +
103fbf8e4020845 Kashyap Desai 2020-08-19  380  			instance->low_latency_index_start;
103fbf8e4020845 Kashyap Desai 2020-08-19  381  	} else {
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  382  		cmd->request_desc->SCSIIO.MSIxIndex =
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  383  			instance->reply_map[raw_smp_processor_id()];
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  384  	}
103fbf8e4020845 Kashyap Desai 2020-08-19  385  }
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  386  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 33994 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH V4 12/12] scsi: replace sdev->device_busy with sbitmap
@ 2020-11-16 11:49     ` kernel test robot
  0 siblings, 0 replies; 43+ messages in thread
From: kernel test robot @ 2020-11-16 11:49 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 4923 bytes --]

Hi Ming,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on block/for-next]
[also build test ERROR on mkp-scsi/for-next scsi/for-next v5.10-rc4 next-20201116]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
config: powerpc64-randconfig-r026-20201116 (attached as .config)
compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project c044709b8fbea2a9a375e4173a6bd735f6866c0c)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install powerpc64 cross compiling tool for clang build
        # apt-get install binutils-powerpc64-linux-gnu
        # https://github.com/0day-ci/linux/commit/cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
        git checkout cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

>> drivers/scsi/megaraid/megaraid_sas_fusion.c:365:41: error: no member named 'device_busy' in 'struct scsi_device'
           sdev_busy = atomic_read(&scmd->device->device_busy);
                                    ~~~~~~~~~~~~  ^
   1 error generated.

vim +365 drivers/scsi/megaraid/megaraid_sas_fusion.c

4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  353  
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  354  static inline void
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  355  megasas_get_msix_index(struct megasas_instance *instance,
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  356  		       struct scsi_cmnd *scmd,
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  357  		       struct megasas_cmd_fusion *cmd,
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  358  		       u8 data_arms)
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  359  {
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  360  	int sdev_busy;
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  361  
103fbf8e4020845 Kashyap Desai 2020-08-19  362  	/* TBD - if sml remove device_busy in future, driver
103fbf8e4020845 Kashyap Desai 2020-08-19  363  	 * should track counter in internal structure.
103fbf8e4020845 Kashyap Desai 2020-08-19  364  	 */
103fbf8e4020845 Kashyap Desai 2020-08-19 @365  	sdev_busy = atomic_read(&scmd->device->device_busy);
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  366  
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  367  	if (instance->perf_mode == MR_BALANCED_PERF_MODE &&
103fbf8e4020845 Kashyap Desai 2020-08-19  368  	    sdev_busy > (data_arms * MR_DEVICE_HIGH_IOPS_DEPTH)) {
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  369  		cmd->request_desc->SCSIIO.MSIxIndex =
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  370  			mega_mod64((atomic64_add_return(1, &instance->high_iops_outstanding) /
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  371  					MR_HIGH_IOPS_BATCH_COUNT), instance->low_latency_index_start);
103fbf8e4020845 Kashyap Desai 2020-08-19  372  	} else if (instance->msix_load_balance) {
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  373  		cmd->request_desc->SCSIIO.MSIxIndex =
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  374  			(mega_mod64(atomic64_add_return(1, &instance->total_io_count),
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  375  				instance->msix_vectors));
103fbf8e4020845 Kashyap Desai 2020-08-19  376  	} else if (instance->host->nr_hw_queues > 1) {
103fbf8e4020845 Kashyap Desai 2020-08-19  377  		u32 tag = blk_mq_unique_tag(scmd->request);
103fbf8e4020845 Kashyap Desai 2020-08-19  378  
103fbf8e4020845 Kashyap Desai 2020-08-19  379  		cmd->request_desc->SCSIIO.MSIxIndex = blk_mq_unique_tag_to_hwq(tag) +
103fbf8e4020845 Kashyap Desai 2020-08-19  380  			instance->low_latency_index_start;
103fbf8e4020845 Kashyap Desai 2020-08-19  381  	} else {
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  382  		cmd->request_desc->SCSIIO.MSIxIndex =
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  383  			instance->reply_map[raw_smp_processor_id()];
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  384  	}
103fbf8e4020845 Kashyap Desai 2020-08-19  385  }
4d1634b8d12ecb8 Anand Lodnoor 2020-01-14  386  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 33994 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH V4 05/12] sbitmap: export sbitmap_weight
  2020-11-16  9:38   ` Hannes Reinecke
@ 2020-11-17  2:10     ` Ming Lei
  2020-11-17  6:57       ` Hannes Reinecke
  0 siblings, 1 reply; 43+ messages in thread
From: Ming Lei @ 2020-11-17  2:10 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Jens Axboe, linux-block, Martin K . Petersen, linux-scsi,
	Omar Sandoval, Kashyap Desai, Sumanesh Samanta, Ewan D . Milne

On Mon, Nov 16, 2020 at 10:38:58AM +0100, Hannes Reinecke wrote:
> On 11/16/20 10:07 AM, Ming Lei wrote:
> > SCSI's .device_busy will be converted to sbitmap, and sbitmap_weight
> > is needed, so export the helper.
> > 
> > Cc: Omar Sandoval <osandov@fb.com>
> > Cc: Kashyap Desai <kashyap.desai@broadcom.com>
> > Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
> > Cc: Ewan D. Milne <emilne@redhat.com>
> > Reviewed-by: Hannes Reinecke <hare@suse.de>
> > Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> >   include/linux/sbitmap.h |  9 +++++++++
> >   lib/sbitmap.c           | 11 ++++++-----
> >   2 files changed, 15 insertions(+), 5 deletions(-)
> > 
> > diff --git a/include/linux/sbitmap.h b/include/linux/sbitmap.h
> > index 103b41c03311..34343ce3ef6c 100644
> > --- a/include/linux/sbitmap.h
> > +++ b/include/linux/sbitmap.h
> > @@ -346,6 +346,15 @@ static inline int sbitmap_test_bit(struct sbitmap *sb, unsigned int bitnr)
> >    */
> >   void sbitmap_show(struct sbitmap *sb, struct seq_file *m);
> > +
> > +/**
> > + * sbitmap_weight() - Return how many real bits set in a &struct sbitmap.
> > + * @sb: Bitmap to check.
> > + *
> > + * Return: How many real bits set
> > + */
> > +unsigned int sbitmap_weight(const struct sbitmap *sb);
> > +
> >   /**
> >    * sbitmap_bitmap_show() - Write a hex dump of a &struct sbitmap to a &struct
> >    * seq_file.
> > diff --git a/lib/sbitmap.c b/lib/sbitmap.c
> > index dcd6a89b4d2f..fb1d3c2f70a2 100644
> > --- a/lib/sbitmap.c
> > +++ b/lib/sbitmap.c
> > @@ -342,20 +342,21 @@ static unsigned int __sbitmap_weight(const struct sbitmap *sb, bool set)
> >   	return weight;
> >   }
> > -static unsigned int sbitmap_weight(const struct sbitmap *sb)
> > +static unsigned int sbitmap_cleared(const struct sbitmap *sb)
> >   {
> > -	return __sbitmap_weight(sb, true);
> > +	return __sbitmap_weight(sb, false);
> >   }
> > -static unsigned int sbitmap_cleared(const struct sbitmap *sb)
> > +unsigned int sbitmap_weight(const struct sbitmap *sb)
> >   {
> > -	return __sbitmap_weight(sb, false);
> > +	return __sbitmap_weight(sb, true) - sbitmap_cleared(sb);
> >   }
> > +EXPORT_SYMBOL_GPL(sbitmap_weight);
> That is extremely confusing. Why do you change the meaning of
> 'sbitmap_weight' from __sbitmap_weight(sb, true) to
> __sbitmap_weight(sb, true) - __sbitmap_weight(sb, false)?

Because the only user of sbitmap_weight() just uses the following way:

	sbitmap_weight(sb) - sbitmap_cleared(sb)

Frankly, I think sbitmap_weight(sb) should return real busy bits.

> Does this mean that the original definition was wrong?
> Or does this mean that this patch implies a different meaning of
> 'sbitmap_weight'?

Yeah, this patch changes meaning of sbitmap_weight(), now it is
exported, and we should make it more accurate/readable from user view.


Thanks, 
Ming


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH V4 11/12] scsi: make sure sdev->queue_depth is <= max(shost->can_queue, 1024)
  2020-11-16  9:44   ` Hannes Reinecke
@ 2020-11-17  2:18     ` Ming Lei
  2020-11-17 16:42       ` Hannes Reinecke
  0 siblings, 1 reply; 43+ messages in thread
From: Ming Lei @ 2020-11-17  2:18 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Jens Axboe, linux-block, Martin K . Petersen, linux-scsi,
	Omar Sandoval, Kashyap Desai, Sumanesh Samanta, Ewan D . Milne

On Mon, Nov 16, 2020 at 10:44:54AM +0100, Hannes Reinecke wrote:
> On 11/16/20 10:07 AM, Ming Lei wrote:
> > Limit scsi device's queue depth is less than max(host->can_queue, 1024)
> > in scsi_change_queue_depth(), and 1024 is big enough for saturating
> > current fast SCSI LUN(SSD, or raid volume on multiple SSDs).
> > 
> > We need this patch for replacing sdev->device_busy with sbitmap which
> > has to be pre-allocated with reasonable max depth.
> > 
> > Cc: Omar Sandoval <osandov@fb.com>
> > Cc: Kashyap Desai <kashyap.desai@broadcom.com>
> > Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
> > Cc: Ewan D. Milne <emilne@redhat.com>
> > Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> >   drivers/scsi/scsi.c | 11 +++++++++++
> >   1 file changed, 11 insertions(+)
> > 
> > diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
> > index 24619c3bebd5..a28d48c850cf 100644
> > --- a/drivers/scsi/scsi.c
> > +++ b/drivers/scsi/scsi.c
> > @@ -214,6 +214,15 @@ void scsi_finish_command(struct scsi_cmnd *cmd)
> >   	scsi_io_completion(cmd, good_bytes);
> >   }
> > +
> > +/*
> > + * 1024 is big enough for saturating the fast scsi LUN now
> > + */
> > +static int scsi_device_max_queue_depth(struct scsi_device *sdev)
> > +{
> > +	return max_t(int, sdev->host->can_queue, 1024);
> > +}
> > +
> 
> Shouldn't this rather be initialized with scsi_host->can_queue?

Multiple queues may be used for one single LUN, so in theory we should
return max queue depth as host->can_queue * host->nr_hw_queues, but
this number can be too big for the sbitmap's pre-allocation.

That is why this patch introduces one reasonable limit on this value
of max(sdev->host->can_queue, 1024). Suppose single SSD can be saturated
by ~128 requests, we still can saturate one LUN with 8 SSDs behind if
the hw queue depth is set as too low.

> These 'should be enough' settings inevitable turn out to be not enough in
> the long run ...

I have provided the theory behind this idea, not just simple 'should be
enough'.


Thanks,
Ming


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH V4 05/12] sbitmap: export sbitmap_weight
  2020-11-17  2:10     ` Ming Lei
@ 2020-11-17  6:57       ` Hannes Reinecke
  2020-11-17  7:53         ` Ming Lei
  0 siblings, 1 reply; 43+ messages in thread
From: Hannes Reinecke @ 2020-11-17  6:57 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-block, Martin K . Petersen, linux-scsi,
	Omar Sandoval, Kashyap Desai, Sumanesh Samanta, Ewan D . Milne

On 11/17/20 3:10 AM, Ming Lei wrote:
> On Mon, Nov 16, 2020 at 10:38:58AM +0100, Hannes Reinecke wrote:
>> On 11/16/20 10:07 AM, Ming Lei wrote:
>>> SCSI's .device_busy will be converted to sbitmap, and sbitmap_weight
>>> is needed, so export the helper.
>>>
>>> Cc: Omar Sandoval <osandov@fb.com>
>>> Cc: Kashyap Desai <kashyap.desai@broadcom.com>
>>> Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
>>> Cc: Ewan D. Milne <emilne@redhat.com>
>>> Reviewed-by: Hannes Reinecke <hare@suse.de>
>>> Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
>>> Signed-off-by: Ming Lei <ming.lei@redhat.com>
>>> ---
>>>    include/linux/sbitmap.h |  9 +++++++++
>>>    lib/sbitmap.c           | 11 ++++++-----
>>>    2 files changed, 15 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/include/linux/sbitmap.h b/include/linux/sbitmap.h
>>> index 103b41c03311..34343ce3ef6c 100644
>>> --- a/include/linux/sbitmap.h
>>> +++ b/include/linux/sbitmap.h
>>> @@ -346,6 +346,15 @@ static inline int sbitmap_test_bit(struct sbitmap *sb, unsigned int bitnr)
>>>     */
>>>    void sbitmap_show(struct sbitmap *sb, struct seq_file *m);
>>> +
>>> +/**
>>> + * sbitmap_weight() - Return how many real bits set in a &struct sbitmap.
>>> + * @sb: Bitmap to check.
>>> + *
>>> + * Return: How many real bits set
>>> + */
>>> +unsigned int sbitmap_weight(const struct sbitmap *sb);
>>> +
>>>    /**
>>>     * sbitmap_bitmap_show() - Write a hex dump of a &struct sbitmap to a &struct
>>>     * seq_file.
>>> diff --git a/lib/sbitmap.c b/lib/sbitmap.c
>>> index dcd6a89b4d2f..fb1d3c2f70a2 100644
>>> --- a/lib/sbitmap.c
>>> +++ b/lib/sbitmap.c
>>> @@ -342,20 +342,21 @@ static unsigned int __sbitmap_weight(const struct sbitmap *sb, bool set)
>>>    	return weight;
>>>    }
>>> -static unsigned int sbitmap_weight(const struct sbitmap *sb)
>>> +static unsigned int sbitmap_cleared(const struct sbitmap *sb)
>>>    {
>>> -	return __sbitmap_weight(sb, true);
>>> +	return __sbitmap_weight(sb, false);
>>>    }
>>> -static unsigned int sbitmap_cleared(const struct sbitmap *sb)
>>> +unsigned int sbitmap_weight(const struct sbitmap *sb)
>>>    {
>>> -	return __sbitmap_weight(sb, false);
>>> +	return __sbitmap_weight(sb, true) - sbitmap_cleared(sb);
>>>    }
>>> +EXPORT_SYMBOL_GPL(sbitmap_weight);
>> That is extremely confusing. Why do you change the meaning of
>> 'sbitmap_weight' from __sbitmap_weight(sb, true) to
>> __sbitmap_weight(sb, true) - __sbitmap_weight(sb, false)?
> 
> Because the only user of sbitmap_weight() just uses the following way:
> 
> 	sbitmap_weight(sb) - sbitmap_cleared(sb)
> 
> Frankly, I think sbitmap_weight(sb) should return real busy bits.
> 
No argument about that. Just wanted to be clear that this is by intention.

>> Does this mean that the original definition was wrong?
>> Or does this mean that this patch implies a different meaning of
>> 'sbitmap_weight'?
> 
> Yeah, this patch changes meaning of sbitmap_weight(), now it is
> exported, and we should make it more accurate/readable from user view.
> 
So can you please state this in the patch description?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH V4 05/12] sbitmap: export sbitmap_weight
  2020-11-17  6:57       ` Hannes Reinecke
@ 2020-11-17  7:53         ` Ming Lei
  0 siblings, 0 replies; 43+ messages in thread
From: Ming Lei @ 2020-11-17  7:53 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Jens Axboe, linux-block, Martin K . Petersen, linux-scsi,
	Omar Sandoval, Kashyap Desai, Sumanesh Samanta, Ewan D . Milne

On Tue, Nov 17, 2020 at 07:57:40AM +0100, Hannes Reinecke wrote:
> On 11/17/20 3:10 AM, Ming Lei wrote:
> > On Mon, Nov 16, 2020 at 10:38:58AM +0100, Hannes Reinecke wrote:
> > > On 11/16/20 10:07 AM, Ming Lei wrote:
> > > > SCSI's .device_busy will be converted to sbitmap, and sbitmap_weight
> > > > is needed, so export the helper.
> > > > 
> > > > Cc: Omar Sandoval <osandov@fb.com>
> > > > Cc: Kashyap Desai <kashyap.desai@broadcom.com>
> > > > Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
> > > > Cc: Ewan D. Milne <emilne@redhat.com>
> > > > Reviewed-by: Hannes Reinecke <hare@suse.de>
> > > > Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
> > > > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > > > ---
> > > >    include/linux/sbitmap.h |  9 +++++++++
> > > >    lib/sbitmap.c           | 11 ++++++-----
> > > >    2 files changed, 15 insertions(+), 5 deletions(-)
> > > > 
> > > > diff --git a/include/linux/sbitmap.h b/include/linux/sbitmap.h
> > > > index 103b41c03311..34343ce3ef6c 100644
> > > > --- a/include/linux/sbitmap.h
> > > > +++ b/include/linux/sbitmap.h
> > > > @@ -346,6 +346,15 @@ static inline int sbitmap_test_bit(struct sbitmap *sb, unsigned int bitnr)
> > > >     */
> > > >    void sbitmap_show(struct sbitmap *sb, struct seq_file *m);
> > > > +
> > > > +/**
> > > > + * sbitmap_weight() - Return how many real bits set in a &struct sbitmap.
> > > > + * @sb: Bitmap to check.
> > > > + *
> > > > + * Return: How many real bits set
> > > > + */
> > > > +unsigned int sbitmap_weight(const struct sbitmap *sb);
> > > > +
> > > >    /**
> > > >     * sbitmap_bitmap_show() - Write a hex dump of a &struct sbitmap to a &struct
> > > >     * seq_file.
> > > > diff --git a/lib/sbitmap.c b/lib/sbitmap.c
> > > > index dcd6a89b4d2f..fb1d3c2f70a2 100644
> > > > --- a/lib/sbitmap.c
> > > > +++ b/lib/sbitmap.c
> > > > @@ -342,20 +342,21 @@ static unsigned int __sbitmap_weight(const struct sbitmap *sb, bool set)
> > > >    	return weight;
> > > >    }
> > > > -static unsigned int sbitmap_weight(const struct sbitmap *sb)
> > > > +static unsigned int sbitmap_cleared(const struct sbitmap *sb)
> > > >    {
> > > > -	return __sbitmap_weight(sb, true);
> > > > +	return __sbitmap_weight(sb, false);
> > > >    }
> > > > -static unsigned int sbitmap_cleared(const struct sbitmap *sb)
> > > > +unsigned int sbitmap_weight(const struct sbitmap *sb)
> > > >    {
> > > > -	return __sbitmap_weight(sb, false);
> > > > +	return __sbitmap_weight(sb, true) - sbitmap_cleared(sb);
> > > >    }
> > > > +EXPORT_SYMBOL_GPL(sbitmap_weight);
> > > That is extremely confusing. Why do you change the meaning of
> > > 'sbitmap_weight' from __sbitmap_weight(sb, true) to
> > > __sbitmap_weight(sb, true) - __sbitmap_weight(sb, false)?
> > 
> > Because the only user of sbitmap_weight() just uses the following way:
> > 
> > 	sbitmap_weight(sb) - sbitmap_cleared(sb)
> > 
> > Frankly, I think sbitmap_weight(sb) should return real busy bits.
> > 
> No argument about that. Just wanted to be clear that this is by intention.
> 
> > > Does this mean that the original definition was wrong?
> > > Or does this mean that this patch implies a different meaning of
> > > 'sbitmap_weight'?
> > 
> > Yeah, this patch changes meaning of sbitmap_weight(), now it is
> > exported, and we should make it more accurate/readable from user view.
> > 
> So can you please state this in the patch description?

Sure.

Thanks, 
Ming


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH V4 05/12] sbitmap: export sbitmap_weight
  2020-11-16  9:07 ` [PATCH V4 05/12] sbitmap: export sbitmap_weight Ming Lei
  2020-11-16  9:38   ` Hannes Reinecke
@ 2020-11-17  9:18   ` John Garry
  1 sibling, 0 replies; 43+ messages in thread
From: John Garry @ 2020-11-17  9:18 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe, linux-block, Martin K . Petersen, linux-scsi
  Cc: Omar Sandoval, Kashyap Desai, Sumanesh Samanta, Ewan D . Milne,
	Hannes Reinecke

On 16/11/2020 09:07, Ming Lei wrote:
> +
> +/**
> + * sbitmap_weight() - Return how many real bits set in a &struct sbitmap.
> + * @sb: Bitmap to check.
> + *
> + * Return: How many real bits set
> + */
> +unsigned int sbitmap_weight(const struct sbitmap *sb);
> +

Hi Ming,

Just a nit on the comment - I think that "real bits set" is vague. How 
about "set and not cleared bits", "busy bits", "net set bits" or 
"aggregate bits set"?

Thanks,
John


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH V4 11/12] scsi: make sure sdev->queue_depth is <= max(shost->can_queue, 1024)
  2020-11-17  2:18     ` Ming Lei
@ 2020-11-17 16:42       ` Hannes Reinecke
  0 siblings, 0 replies; 43+ messages in thread
From: Hannes Reinecke @ 2020-11-17 16:42 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-block, Martin K . Petersen, linux-scsi,
	Omar Sandoval, Kashyap Desai, Sumanesh Samanta, Ewan D . Milne

On 11/17/20 3:18 AM, Ming Lei wrote:
> On Mon, Nov 16, 2020 at 10:44:54AM +0100, Hannes Reinecke wrote:
>> On 11/16/20 10:07 AM, Ming Lei wrote:
>>> Limit scsi device's queue depth is less than max(host->can_queue, 1024)
>>> in scsi_change_queue_depth(), and 1024 is big enough for saturating
>>> current fast SCSI LUN(SSD, or raid volume on multiple SSDs).
>>>
>>> We need this patch for replacing sdev->device_busy with sbitmap which
>>> has to be pre-allocated with reasonable max depth.
>>>
>>> Cc: Omar Sandoval <osandov@fb.com>
>>> Cc: Kashyap Desai <kashyap.desai@broadcom.com>
>>> Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
>>> Cc: Ewan D. Milne <emilne@redhat.com>
>>> Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
>>> Signed-off-by: Ming Lei <ming.lei@redhat.com>
>>> ---
>>>    drivers/scsi/scsi.c | 11 +++++++++++
>>>    1 file changed, 11 insertions(+)
>>>
>>> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
>>> index 24619c3bebd5..a28d48c850cf 100644
>>> --- a/drivers/scsi/scsi.c
>>> +++ b/drivers/scsi/scsi.c
>>> @@ -214,6 +214,15 @@ void scsi_finish_command(struct scsi_cmnd *cmd)
>>>    	scsi_io_completion(cmd, good_bytes);
>>>    }
>>> +
>>> +/*
>>> + * 1024 is big enough for saturating the fast scsi LUN now
>>> + */
>>> +static int scsi_device_max_queue_depth(struct scsi_device *sdev)
>>> +{
>>> +	return max_t(int, sdev->host->can_queue, 1024);
>>> +}
>>> +
>>
>> Shouldn't this rather be initialized with scsi_host->can_queue?
> 
> Multiple queues may be used for one single LUN, so in theory we should
> return max queue depth as host->can_queue * host->nr_hw_queues, but
> this number can be too big for the sbitmap's pre-allocation.
> 
Ah, so that's the problem here.

> That is why this patch introduces one reasonable limit on this value
> of max(sdev->host->can_queue, 1024). Suppose single SSD can be saturated
> by ~128 requests, we still can saturate one LUN with 8 SSDs behind if
> the hw queue depth is set as too low.
> 
>> These 'should be enough' settings inevitable turn out to be not enough in
>> the long run ...
> 
> I have provided the theory behind this idea, not just simple 'should be
> enough'.
> 
No, it's okay now. I wasn't aware that we had a limitation on the 
sbitmap pre-allocation.

You can add my

Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH V4 12/12] scsi: replace sdev->device_busy with sbitmap
  2020-11-16 11:49     ` kernel test robot
@ 2020-11-18  2:35       ` Ming Lei
  -1 siblings, 0 replies; 43+ messages in thread
From: Ming Lei @ 2020-11-18  2:35 UTC (permalink / raw)
  To: kernel test robot, Kashyap Desai, Sumanesh Samanta
  Cc: Jens Axboe, linux-block, Martin K . Petersen, linux-scsi,
	kbuild-all, clang-built-linux, Omar Sandoval, Ewan D . Milne,
	Hannes Reinecke

Hello Kashyap & Sumanesh,

On Mon, Nov 16, 2020 at 07:49:31PM +0800, kernel test robot wrote:
> Hi Ming,
> 
> Thank you for the patch! Yet something to improve:
> 
> [auto build test ERROR on block/for-next]
> [also build test ERROR on mkp-scsi/for-next scsi/for-next v5.10-rc4 next-20201116]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch]
> 
> url:    https://github.com/0day-ci/linux/commits/Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
> config: powerpc64-randconfig-r026-20201116 (attached as .config)
> compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project c044709b8fbea2a9a375e4173a6bd735f6866c0c)
> reproduce (this is a W=1 build):
>         wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>         chmod +x ~/bin/make.cross
>         # install powerpc64 cross compiling tool for clang build
>         # apt-get install binutils-powerpc64-linux-gnu
>         # https://github.com/0day-ci/linux/commit/cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
>         git remote add linux-review https://github.com/0day-ci/linux
>         git fetch --no-tags linux-review Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
>         git checkout cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
>         # save the attached .config to linux build tree
>         COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc64 
> 
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
> 
> All errors (new ones prefixed by >>):
> 
> >> drivers/scsi/megaraid/megaraid_sas_fusion.c:365:41: error: no member named 'device_busy' in 'struct scsi_device'
>            sdev_busy = atomic_read(&scmd->device->device_busy);

This new reference to sdev->device_busy is added by recent shared host
tag patch, and according to the comment, you may have planed to convert into
one megaraid internal counter.

        /* TBD - if sml remove device_busy in future, driver
         * should track counter in internal structure.
         */

So can you post one patch? And I am happy to fold it into this series.

Thanks,
Ming


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH V4 12/12] scsi: replace sdev->device_busy with sbitmap
@ 2020-11-18  2:35       ` Ming Lei
  0 siblings, 0 replies; 43+ messages in thread
From: Ming Lei @ 2020-11-18  2:35 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 2469 bytes --]

Hello Kashyap & Sumanesh,

On Mon, Nov 16, 2020 at 07:49:31PM +0800, kernel test robot wrote:
> Hi Ming,
> 
> Thank you for the patch! Yet something to improve:
> 
> [auto build test ERROR on block/for-next]
> [also build test ERROR on mkp-scsi/for-next scsi/for-next v5.10-rc4 next-20201116]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch]
> 
> url:    https://github.com/0day-ci/linux/commits/Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
> config: powerpc64-randconfig-r026-20201116 (attached as .config)
> compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project c044709b8fbea2a9a375e4173a6bd735f6866c0c)
> reproduce (this is a W=1 build):
>         wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>         chmod +x ~/bin/make.cross
>         # install powerpc64 cross compiling tool for clang build
>         # apt-get install binutils-powerpc64-linux-gnu
>         # https://github.com/0day-ci/linux/commit/cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
>         git remote add linux-review https://github.com/0day-ci/linux
>         git fetch --no-tags linux-review Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
>         git checkout cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
>         # save the attached .config to linux build tree
>         COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc64 
> 
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
> 
> All errors (new ones prefixed by >>):
> 
> >> drivers/scsi/megaraid/megaraid_sas_fusion.c:365:41: error: no member named 'device_busy' in 'struct scsi_device'
>            sdev_busy = atomic_read(&scmd->device->device_busy);

This new reference to sdev->device_busy is added by recent shared host
tag patch, and according to the comment, you may have planed to convert into
one megaraid internal counter.

        /* TBD - if sml remove device_busy in future, driver
         * should track counter in internal structure.
         */

So can you post one patch? And I am happy to fold it into this series.

Thanks,
Ming

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH V4 12/12] scsi: replace sdev->device_busy with sbitmap
  2020-11-18  2:35       ` Ming Lei
@ 2020-11-18  7:15         ` Hannes Reinecke
  -1 siblings, 0 replies; 43+ messages in thread
From: Hannes Reinecke @ 2020-11-18  7:15 UTC (permalink / raw)
  To: Ming Lei, kernel test robot, Kashyap Desai, Sumanesh Samanta
  Cc: Jens Axboe, linux-block, Martin K . Petersen, linux-scsi,
	kbuild-all, clang-built-linux, Omar Sandoval, Ewan D . Milne

[-- Attachment #1: Type: text/plain, Size: 2951 bytes --]

Hey Ming,

On 11/18/20 3:35 AM, Ming Lei wrote:
> Hello Kashyap & Sumanesh,
> 
> On Mon, Nov 16, 2020 at 07:49:31PM +0800, kernel test robot wrote:
>> Hi Ming,
>>
>> Thank you for the patch! Yet something to improve:
>>
>> [auto build test ERROR on block/for-next]
>> [also build test ERROR on mkp-scsi/for-next scsi/for-next v5.10-rc4 next-20201116]
>> [If your patch is applied to the wrong git tree, kindly drop us a note.
>> And when submitting patch, we suggest to use '--base' as documented in
>> https://git-scm.com/docs/git-format-patch]
>>
>> url:    https://github.com/0day-ci/linux/commits/Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
>> base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
>> config: powerpc64-randconfig-r026-20201116 (attached as .config)
>> compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project c044709b8fbea2a9a375e4173a6bd735f6866c0c)
>> reproduce (this is a W=1 build):
>>          wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>>          chmod +x ~/bin/make.cross
>>          # install powerpc64 cross compiling tool for clang build
>>          # apt-get install binutils-powerpc64-linux-gnu
>>          # https://github.com/0day-ci/linux/commit/cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
>>          git remote add linux-review https://github.com/0day-ci/linux
>>          git fetch --no-tags linux-review Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
>>          git checkout cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
>>          # save the attached .config to linux build tree
>>          COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc64
>>
>> If you fix the issue, kindly add following tag as appropriate
>> Reported-by: kernel test robot <lkp@intel.com>
>>
>> All errors (new ones prefixed by >>):
>>
>>>> drivers/scsi/megaraid/megaraid_sas_fusion.c:365:41: error: no member named 'device_busy' in 'struct scsi_device'
>>             sdev_busy = atomic_read(&scmd->device->device_busy);
> 
> This new reference to sdev->device_busy is added by recent shared host
> tag patch, and according to the comment, you may have planed to convert into
> one megaraid internal counter.
> 
>          /* TBD - if sml remove device_busy in future, driver
>           * should track counter in internal structure.
>           */
> 
> So can you post one patch? And I am happy to fold it into this series.
> 
Seeing that we already have the accessor 'scsi_device_busy()' it's 
probably easier to just use that and not fiddle with driver internals.
See the attached patch.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

[-- Attachment #2: 0001-megaraid_sas-use-scsi_device_busy-instead-of-direct-.patch --]
[-- Type: text/x-patch, Size: 1341 bytes --]

From d8fa5e61187dbe851b8da9c65a5df5ec5809f8ea Mon Sep 17 00:00:00 2001
From: Hannes Reinecke <hare@suse.de>
Date: Wed, 18 Nov 2020 08:08:41 +0100
Subject: [PATCH] megaraid_sas: use scsi_device_busy() instead of direct access
 to atomic counter

It's always a bad style to access structure internals, especially if
there is an accessor for it. So convert to use scsi_device_busy()
intead of accessing the atomic counter directly.

Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/scsi/megaraid/megaraid_sas_fusion.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
index fd607287608e..272ff123bc6b 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
@@ -362,7 +362,7 @@ megasas_get_msix_index(struct megasas_instance *instance,
 	/* TBD - if sml remove device_busy in future, driver
 	 * should track counter in internal structure.
 	 */
-	sdev_busy = atomic_read(&scmd->device->device_busy);
+	sdev_busy = scsi_device_busy(scmd->device);
 
 	if (instance->perf_mode == MR_BALANCED_PERF_MODE &&
 	    sdev_busy > (data_arms * MR_DEVICE_HIGH_IOPS_DEPTH)) {
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH V4 12/12] scsi: replace sdev->device_busy with sbitmap
@ 2020-11-18  7:15         ` Hannes Reinecke
  0 siblings, 0 replies; 43+ messages in thread
From: Hannes Reinecke @ 2020-11-18  7:15 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 3015 bytes --]

Hey Ming,

On 11/18/20 3:35 AM, Ming Lei wrote:
> Hello Kashyap & Sumanesh,
> 
> On Mon, Nov 16, 2020 at 07:49:31PM +0800, kernel test robot wrote:
>> Hi Ming,
>>
>> Thank you for the patch! Yet something to improve:
>>
>> [auto build test ERROR on block/for-next]
>> [also build test ERROR on mkp-scsi/for-next scsi/for-next v5.10-rc4 next-20201116]
>> [If your patch is applied to the wrong git tree, kindly drop us a note.
>> And when submitting patch, we suggest to use '--base' as documented in
>> https://git-scm.com/docs/git-format-patch]
>>
>> url:    https://github.com/0day-ci/linux/commits/Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
>> base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
>> config: powerpc64-randconfig-r026-20201116 (attached as .config)
>> compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project c044709b8fbea2a9a375e4173a6bd735f6866c0c)
>> reproduce (this is a W=1 build):
>>          wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>>          chmod +x ~/bin/make.cross
>>          # install powerpc64 cross compiling tool for clang build
>>          # apt-get install binutils-powerpc64-linux-gnu
>>          # https://github.com/0day-ci/linux/commit/cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
>>          git remote add linux-review https://github.com/0day-ci/linux
>>          git fetch --no-tags linux-review Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
>>          git checkout cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
>>          # save the attached .config to linux build tree
>>          COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc64
>>
>> If you fix the issue, kindly add following tag as appropriate
>> Reported-by: kernel test robot <lkp@intel.com>
>>
>> All errors (new ones prefixed by >>):
>>
>>>> drivers/scsi/megaraid/megaraid_sas_fusion.c:365:41: error: no member named 'device_busy' in 'struct scsi_device'
>>             sdev_busy = atomic_read(&scmd->device->device_busy);
> 
> This new reference to sdev->device_busy is added by recent shared host
> tag patch, and according to the comment, you may have planed to convert into
> one megaraid internal counter.
> 
>          /* TBD - if sml remove device_busy in future, driver
>           * should track counter in internal structure.
>           */
> 
> So can you post one patch? And I am happy to fold it into this series.
> 
Seeing that we already have the accessor 'scsi_device_busy()' it's 
probably easier to just use that and not fiddle with driver internals.
See the attached patch.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare(a)suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-megaraid_sas-use-scsi_device_busy-instead-of-direct-.patch --]
[-- Type: text/x-patch, Size: 1341 bytes --]

From d8fa5e61187dbe851b8da9c65a5df5ec5809f8ea Mon Sep 17 00:00:00 2001
From: Hannes Reinecke <hare@suse.de>
Date: Wed, 18 Nov 2020 08:08:41 +0100
Subject: [PATCH] megaraid_sas: use scsi_device_busy() instead of direct access
 to atomic counter

It's always a bad style to access structure internals, especially if
there is an accessor for it. So convert to use scsi_device_busy()
intead of accessing the atomic counter directly.

Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/scsi/megaraid/megaraid_sas_fusion.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
index fd607287608e..272ff123bc6b 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
@@ -362,7 +362,7 @@ megasas_get_msix_index(struct megasas_instance *instance,
 	/* TBD - if sml remove device_busy in future, driver
 	 * should track counter in internal structure.
 	 */
-	sdev_busy = atomic_read(&scmd->device->device_busy);
+	sdev_busy = scsi_device_busy(scmd->device);
 
 	if (instance->perf_mode == MR_BALANCED_PERF_MODE &&
 	    sdev_busy > (data_arms * MR_DEVICE_HIGH_IOPS_DEPTH)) {
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH V4 12/12] scsi: replace sdev->device_busy with sbitmap
  2020-11-18  7:15         ` Hannes Reinecke
@ 2020-11-18  7:44           ` Ming Lei
  -1 siblings, 0 replies; 43+ messages in thread
From: Ming Lei @ 2020-11-18  7:44 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: kernel test robot, Kashyap Desai, Sumanesh Samanta, Jens Axboe,
	linux-block, Martin K . Petersen, linux-scsi, kbuild-all,
	clang-built-linux, Omar Sandoval, Ewan D . Milne

On Wed, Nov 18, 2020 at 08:15:47AM +0100, Hannes Reinecke wrote:
> Hey Ming,
> 
> On 11/18/20 3:35 AM, Ming Lei wrote:
> > Hello Kashyap & Sumanesh,
> > 
> > On Mon, Nov 16, 2020 at 07:49:31PM +0800, kernel test robot wrote:
> > > Hi Ming,
> > > 
> > > Thank you for the patch! Yet something to improve:
> > > 
> > > [auto build test ERROR on block/for-next]
> > > [also build test ERROR on mkp-scsi/for-next scsi/for-next v5.10-rc4 next-20201116]
> > > [If your patch is applied to the wrong git tree, kindly drop us a note.
> > > And when submitting patch, we suggest to use '--base' as documented in
> > > https://git-scm.com/docs/git-format-patch]
> > > 
> > > url:    https://github.com/0day-ci/linux/commits/Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
> > > base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
> > > config: powerpc64-randconfig-r026-20201116 (attached as .config)
> > > compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project c044709b8fbea2a9a375e4173a6bd735f6866c0c)
> > > reproduce (this is a W=1 build):
> > >          wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> > >          chmod +x ~/bin/make.cross
> > >          # install powerpc64 cross compiling tool for clang build
> > >          # apt-get install binutils-powerpc64-linux-gnu
> > >          # https://github.com/0day-ci/linux/commit/cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
> > >          git remote add linux-review https://github.com/0day-ci/linux
> > >          git fetch --no-tags linux-review Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
> > >          git checkout cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
> > >          # save the attached .config to linux build tree
> > >          COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc64
> > > 
> > > If you fix the issue, kindly add following tag as appropriate
> > > Reported-by: kernel test robot <lkp@intel.com>
> > > 
> > > All errors (new ones prefixed by >>):
> > > 
> > > > > drivers/scsi/megaraid/megaraid_sas_fusion.c:365:41: error: no member named 'device_busy' in 'struct scsi_device'
> > >             sdev_busy = atomic_read(&scmd->device->device_busy);
> > 
> > This new reference to sdev->device_busy is added by recent shared host
> > tag patch, and according to the comment, you may have planed to convert into
> > one megaraid internal counter.
> > 
> >          /* TBD - if sml remove device_busy in future, driver
> >           * should track counter in internal structure.
> >           */
> > 
> > So can you post one patch? And I am happy to fold it into this series.
> > 
> Seeing that we already have the accessor 'scsi_device_busy()' it's probably
> easier to just use that and not fiddle with driver internals.
> See the attached patch.
> 
> Cheers,
> 
> Hannes
> -- 
> Dr. Hannes Reinecke                Kernel Storage Architect
> hare@suse.de                              +49 911 74053 688
> SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
> HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

> From d8fa5e61187dbe851b8da9c65a5df5ec5809f8ea Mon Sep 17 00:00:00 2001
> From: Hannes Reinecke <hare@suse.de>
> Date: Wed, 18 Nov 2020 08:08:41 +0100
> Subject: [PATCH] megaraid_sas: use scsi_device_busy() instead of direct access
>  to atomic counter
> 
> It's always a bad style to access structure internals, especially if
> there is an accessor for it. So convert to use scsi_device_busy()
> intead of accessing the atomic counter directly.
> 
> Cc: Kashyap Desai <kashyap.desai@broadcom.com>
> Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
>  drivers/scsi/megaraid/megaraid_sas_fusion.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> index fd607287608e..272ff123bc6b 100644
> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> @@ -362,7 +362,7 @@ megasas_get_msix_index(struct megasas_instance *instance,
>  	/* TBD - if sml remove device_busy in future, driver
>  	 * should track counter in internal structure.
>  	 */
> -	sdev_busy = atomic_read(&scmd->device->device_busy);
> +	sdev_busy = scsi_device_busy(scmd->device);

megasas_get_msix_index() is called in .queuecommand() path,
scsi_device_busy() might take more cycles since it has to iterate over
each sbitmap words, especially when the sbitmap depth is high.

I'd suggest Kashyap/Sumanesh to check if there is better way to
deal with it. If not, scsi_device_busy() should be fine.


Thanks,
Ming


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH V4 12/12] scsi: replace sdev->device_busy with sbitmap
@ 2020-11-18  7:44           ` Ming Lei
  0 siblings, 0 replies; 43+ messages in thread
From: Ming Lei @ 2020-11-18  7:44 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 4883 bytes --]

On Wed, Nov 18, 2020 at 08:15:47AM +0100, Hannes Reinecke wrote:
> Hey Ming,
> 
> On 11/18/20 3:35 AM, Ming Lei wrote:
> > Hello Kashyap & Sumanesh,
> > 
> > On Mon, Nov 16, 2020 at 07:49:31PM +0800, kernel test robot wrote:
> > > Hi Ming,
> > > 
> > > Thank you for the patch! Yet something to improve:
> > > 
> > > [auto build test ERROR on block/for-next]
> > > [also build test ERROR on mkp-scsi/for-next scsi/for-next v5.10-rc4 next-20201116]
> > > [If your patch is applied to the wrong git tree, kindly drop us a note.
> > > And when submitting patch, we suggest to use '--base' as documented in
> > > https://git-scm.com/docs/git-format-patch]
> > > 
> > > url:    https://github.com/0day-ci/linux/commits/Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
> > > base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
> > > config: powerpc64-randconfig-r026-20201116 (attached as .config)
> > > compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project c044709b8fbea2a9a375e4173a6bd735f6866c0c)
> > > reproduce (this is a W=1 build):
> > >          wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> > >          chmod +x ~/bin/make.cross
> > >          # install powerpc64 cross compiling tool for clang build
> > >          # apt-get install binutils-powerpc64-linux-gnu
> > >          # https://github.com/0day-ci/linux/commit/cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
> > >          git remote add linux-review https://github.com/0day-ci/linux
> > >          git fetch --no-tags linux-review Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
> > >          git checkout cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
> > >          # save the attached .config to linux build tree
> > >          COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc64
> > > 
> > > If you fix the issue, kindly add following tag as appropriate
> > > Reported-by: kernel test robot <lkp@intel.com>
> > > 
> > > All errors (new ones prefixed by >>):
> > > 
> > > > > drivers/scsi/megaraid/megaraid_sas_fusion.c:365:41: error: no member named 'device_busy' in 'struct scsi_device'
> > >             sdev_busy = atomic_read(&scmd->device->device_busy);
> > 
> > This new reference to sdev->device_busy is added by recent shared host
> > tag patch, and according to the comment, you may have planed to convert into
> > one megaraid internal counter.
> > 
> >          /* TBD - if sml remove device_busy in future, driver
> >           * should track counter in internal structure.
> >           */
> > 
> > So can you post one patch? And I am happy to fold it into this series.
> > 
> Seeing that we already have the accessor 'scsi_device_busy()' it's probably
> easier to just use that and not fiddle with driver internals.
> See the attached patch.
> 
> Cheers,
> 
> Hannes
> -- 
> Dr. Hannes Reinecke                Kernel Storage Architect
> hare(a)suse.de                              +49 911 74053 688
> SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
> HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

> From d8fa5e61187dbe851b8da9c65a5df5ec5809f8ea Mon Sep 17 00:00:00 2001
> From: Hannes Reinecke <hare@suse.de>
> Date: Wed, 18 Nov 2020 08:08:41 +0100
> Subject: [PATCH] megaraid_sas: use scsi_device_busy() instead of direct access
>  to atomic counter
> 
> It's always a bad style to access structure internals, especially if
> there is an accessor for it. So convert to use scsi_device_busy()
> intead of accessing the atomic counter directly.
> 
> Cc: Kashyap Desai <kashyap.desai@broadcom.com>
> Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
>  drivers/scsi/megaraid/megaraid_sas_fusion.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> index fd607287608e..272ff123bc6b 100644
> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> @@ -362,7 +362,7 @@ megasas_get_msix_index(struct megasas_instance *instance,
>  	/* TBD - if sml remove device_busy in future, driver
>  	 * should track counter in internal structure.
>  	 */
> -	sdev_busy = atomic_read(&scmd->device->device_busy);
> +	sdev_busy = scsi_device_busy(scmd->device);

megasas_get_msix_index() is called in .queuecommand() path,
scsi_device_busy() might take more cycles since it has to iterate over
each sbitmap words, especially when the sbitmap depth is high.

I'd suggest Kashyap/Sumanesh to check if there is better way to
deal with it. If not, scsi_device_busy() should be fine.


Thanks,
Ming

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH V4 12/12] scsi: replace sdev->device_busy with sbitmap
  2020-11-18  7:44           ` Ming Lei
@ 2020-11-18  9:15             ` Hannes Reinecke
  -1 siblings, 0 replies; 43+ messages in thread
From: Hannes Reinecke @ 2020-11-18  9:15 UTC (permalink / raw)
  To: Ming Lei
  Cc: kernel test robot, Kashyap Desai, Sumanesh Samanta, Jens Axboe,
	linux-block, Martin K . Petersen, linux-scsi, kbuild-all,
	clang-built-linux, Omar Sandoval, Ewan D . Milne

On 11/18/20 8:44 AM, Ming Lei wrote:
> On Wed, Nov 18, 2020 at 08:15:47AM +0100, Hannes Reinecke wrote:
>> Hey Ming,
>>
>> On 11/18/20 3:35 AM, Ming Lei wrote:
>>> Hello Kashyap & Sumanesh,
>>>
>>> On Mon, Nov 16, 2020 at 07:49:31PM +0800, kernel test robot wrote:
>>>> Hi Ming,
>>>>
>>>> Thank you for the patch! Yet something to improve:
>>>>
>>>> [auto build test ERROR on block/for-next]
>>>> [also build test ERROR on mkp-scsi/for-next scsi/for-next v5.10-rc4 next-20201116]
>>>> [If your patch is applied to the wrong git tree, kindly drop us a note.
>>>> And when submitting patch, we suggest to use '--base' as documented in
>>>> https://git-scm.com/docs/git-format-patch]
>>>>
>>>> url:    https://github.com/0day-ci/linux/commits/Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
>>>> base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
>>>> config: powerpc64-randconfig-r026-20201116 (attached as .config)
>>>> compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project c044709b8fbea2a9a375e4173a6bd735f6866c0c)
>>>> reproduce (this is a W=1 build):
>>>>           wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>>>>           chmod +x ~/bin/make.cross
>>>>           # install powerpc64 cross compiling tool for clang build
>>>>           # apt-get install binutils-powerpc64-linux-gnu
>>>>           # https://github.com/0day-ci/linux/commit/cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
>>>>           git remote add linux-review https://github.com/0day-ci/linux
>>>>           git fetch --no-tags linux-review Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
>>>>           git checkout cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
>>>>           # save the attached .config to linux build tree
>>>>           COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc64
>>>>
>>>> If you fix the issue, kindly add following tag as appropriate
>>>> Reported-by: kernel test robot <lkp@intel.com>
>>>>
>>>> All errors (new ones prefixed by >>):
>>>>
>>>>>> drivers/scsi/megaraid/megaraid_sas_fusion.c:365:41: error: no member named 'device_busy' in 'struct scsi_device'
>>>>              sdev_busy = atomic_read(&scmd->device->device_busy);
>>>
>>> This new reference to sdev->device_busy is added by recent shared host
>>> tag patch, and according to the comment, you may have planed to convert into
>>> one megaraid internal counter.
>>>
>>>           /* TBD - if sml remove device_busy in future, driver
>>>            * should track counter in internal structure.
>>>            */
>>>
>>> So can you post one patch? And I am happy to fold it into this series.
>>>
>> Seeing that we already have the accessor 'scsi_device_busy()' it's probably
>> easier to just use that and not fiddle with driver internals.
>> See the attached patch.
>>
>> Cheers,
>>
>> Hannes
>> -- 
>> Dr. Hannes Reinecke                Kernel Storage Architect
>> hare@suse.de                              +49 911 74053 688
>> SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
>> HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
> 
>>  From d8fa5e61187dbe851b8da9c65a5df5ec5809f8ea Mon Sep 17 00:00:00 2001
>> From: Hannes Reinecke <hare@suse.de>
>> Date: Wed, 18 Nov 2020 08:08:41 +0100
>> Subject: [PATCH] megaraid_sas: use scsi_device_busy() instead of direct access
>>   to atomic counter
>>
>> It's always a bad style to access structure internals, especially if
>> there is an accessor for it. So convert to use scsi_device_busy()
>> intead of accessing the atomic counter directly.
>>
>> Cc: Kashyap Desai <kashyap.desai@broadcom.com>
>> Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> ---
>>   drivers/scsi/megaraid/megaraid_sas_fusion.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
>> index fd607287608e..272ff123bc6b 100644
>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
>> @@ -362,7 +362,7 @@ megasas_get_msix_index(struct megasas_instance *instance,
>>   	/* TBD - if sml remove device_busy in future, driver
>>   	 * should track counter in internal structure.
>>   	 */
>> -	sdev_busy = atomic_read(&scmd->device->device_busy);
>> +	sdev_busy = scsi_device_busy(scmd->device);
> 
> megasas_get_msix_index() is called in .queuecommand() path,
> scsi_device_busy() might take more cycles since it has to iterate over
> each sbitmap words, especially when the sbitmap depth is high.
> 
> I'd suggest Kashyap/Sumanesh to check if there is better way to
> deal with it. If not, scsi_device_busy() should be fine.
> 
I guess this whole codepath will become obsolete if and when support for 
polling queues / io_uring will be implemented for megaraid_sas.
This whole section deals with spreading the load over several hardware 
queues once the dedicated one is at risk of being congested.
But this is only required if someone want to reach high IOPS; so if we 
have poll/io_uring support there won't be a need for this anymore.
Or that's the theory, at least :-)

But the patch should be good enough for now to get your patchset in.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH V4 12/12] scsi: replace sdev->device_busy with sbitmap
@ 2020-11-18  9:15             ` Hannes Reinecke
  0 siblings, 0 replies; 43+ messages in thread
From: Hannes Reinecke @ 2020-11-18  9:15 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 5734 bytes --]

On 11/18/20 8:44 AM, Ming Lei wrote:
> On Wed, Nov 18, 2020 at 08:15:47AM +0100, Hannes Reinecke wrote:
>> Hey Ming,
>>
>> On 11/18/20 3:35 AM, Ming Lei wrote:
>>> Hello Kashyap & Sumanesh,
>>>
>>> On Mon, Nov 16, 2020 at 07:49:31PM +0800, kernel test robot wrote:
>>>> Hi Ming,
>>>>
>>>> Thank you for the patch! Yet something to improve:
>>>>
>>>> [auto build test ERROR on block/for-next]
>>>> [also build test ERROR on mkp-scsi/for-next scsi/for-next v5.10-rc4 next-20201116]
>>>> [If your patch is applied to the wrong git tree, kindly drop us a note.
>>>> And when submitting patch, we suggest to use '--base' as documented in
>>>> https://git-scm.com/docs/git-format-patch]
>>>>
>>>> url:    https://github.com/0day-ci/linux/commits/Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
>>>> base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
>>>> config: powerpc64-randconfig-r026-20201116 (attached as .config)
>>>> compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project c044709b8fbea2a9a375e4173a6bd735f6866c0c)
>>>> reproduce (this is a W=1 build):
>>>>           wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>>>>           chmod +x ~/bin/make.cross
>>>>           # install powerpc64 cross compiling tool for clang build
>>>>           # apt-get install binutils-powerpc64-linux-gnu
>>>>           # https://github.com/0day-ci/linux/commit/cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
>>>>           git remote add linux-review https://github.com/0day-ci/linux
>>>>           git fetch --no-tags linux-review Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
>>>>           git checkout cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
>>>>           # save the attached .config to linux build tree
>>>>           COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc64
>>>>
>>>> If you fix the issue, kindly add following tag as appropriate
>>>> Reported-by: kernel test robot <lkp@intel.com>
>>>>
>>>> All errors (new ones prefixed by >>):
>>>>
>>>>>> drivers/scsi/megaraid/megaraid_sas_fusion.c:365:41: error: no member named 'device_busy' in 'struct scsi_device'
>>>>              sdev_busy = atomic_read(&scmd->device->device_busy);
>>>
>>> This new reference to sdev->device_busy is added by recent shared host
>>> tag patch, and according to the comment, you may have planed to convert into
>>> one megaraid internal counter.
>>>
>>>           /* TBD - if sml remove device_busy in future, driver
>>>            * should track counter in internal structure.
>>>            */
>>>
>>> So can you post one patch? And I am happy to fold it into this series.
>>>
>> Seeing that we already have the accessor 'scsi_device_busy()' it's probably
>> easier to just use that and not fiddle with driver internals.
>> See the attached patch.
>>
>> Cheers,
>>
>> Hannes
>> -- 
>> Dr. Hannes Reinecke                Kernel Storage Architect
>> hare(a)suse.de                              +49 911 74053 688
>> SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
>> HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
> 
>>  From d8fa5e61187dbe851b8da9c65a5df5ec5809f8ea Mon Sep 17 00:00:00 2001
>> From: Hannes Reinecke <hare@suse.de>
>> Date: Wed, 18 Nov 2020 08:08:41 +0100
>> Subject: [PATCH] megaraid_sas: use scsi_device_busy() instead of direct access
>>   to atomic counter
>>
>> It's always a bad style to access structure internals, especially if
>> there is an accessor for it. So convert to use scsi_device_busy()
>> intead of accessing the atomic counter directly.
>>
>> Cc: Kashyap Desai <kashyap.desai@broadcom.com>
>> Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> ---
>>   drivers/scsi/megaraid/megaraid_sas_fusion.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
>> index fd607287608e..272ff123bc6b 100644
>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
>> @@ -362,7 +362,7 @@ megasas_get_msix_index(struct megasas_instance *instance,
>>   	/* TBD - if sml remove device_busy in future, driver
>>   	 * should track counter in internal structure.
>>   	 */
>> -	sdev_busy = atomic_read(&scmd->device->device_busy);
>> +	sdev_busy = scsi_device_busy(scmd->device);
> 
> megasas_get_msix_index() is called in .queuecommand() path,
> scsi_device_busy() might take more cycles since it has to iterate over
> each sbitmap words, especially when the sbitmap depth is high.
> 
> I'd suggest Kashyap/Sumanesh to check if there is better way to
> deal with it. If not, scsi_device_busy() should be fine.
> 
I guess this whole codepath will become obsolete if and when support for 
polling queues / io_uring will be implemented for megaraid_sas.
This whole section deals with spreading the load over several hardware 
queues once the dedicated one is at risk of being congested.
But this is only required if someone want to reach high IOPS; so if we 
have poll/io_uring support there won't be a need for this anymore.
Or that's the theory, at least :-)

But the patch should be good enough for now to get your patchset in.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare(a)suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 43+ messages in thread

* RE: [PATCH V4 12/12] scsi: replace sdev->device_busy with sbitmap
  2020-11-18  2:35       ` Ming Lei
@ 2020-11-19  6:20         ` Kashyap Desai
  -1 siblings, 0 replies; 43+ messages in thread
From: Kashyap Desai @ 2020-11-19  6:20 UTC (permalink / raw)
  To: Ming Lei, kernel test robot, Sumanesh Samanta
  Cc: Jens Axboe, linux-block, Martin K . Petersen, linux-scsi,
	kbuild-all, clang-built-linux, Omar Sandoval, Ewan D . Milne,
	Hannes Reinecke

[-- Attachment #1: Type: text/plain, Size: 4799 bytes --]

> >
> > If you fix the issue, kindly add following tag as appropriate
> > Reported-by: kernel test robot <lkp@intel.com>
> >
> > All errors (new ones prefixed by >>):
> >
> > >> drivers/scsi/megaraid/megaraid_sas_fusion.c:365:41: error: no
member
> named 'device_busy' in 'struct scsi_device'
> >            sdev_busy = atomic_read(&scmd->device->device_busy);
>
> This new reference to sdev->device_busy is added by recent shared host
tag
> patch, and according to the comment, you may have planed to convert into
> one megaraid internal counter.
>
>         /* TBD - if sml remove device_busy in future, driver
>          * should track counter in internal structure.
>          */
>
> So can you post one patch? And I am happy to fold it into this series.

Ming - Please find the patch for megaraid_sas driver -
I have used helper inline function just for inter-operability with older
kernel to support in our out of box driver.
This way it will be easy for us to replace helper function as per kernel
version check.

Subject: [PATCH] megaraid_sas: replace sdev_busy with local counter

---
 drivers/scsi/megaraid/megaraid_sas.h        |  2 ++
 drivers/scsi/megaraid/megaraid_sas_fusion.c | 34 ++++++++++++++++++---
 2 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas.h
b/drivers/scsi/megaraid/megaraid_sas.h
index 0f808d63580e..0c6a56b24c6e 100644
--- a/drivers/scsi/megaraid/megaraid_sas.h
+++ b/drivers/scsi/megaraid/megaraid_sas.h
@@ -2019,10 +2019,12 @@ union megasas_frame {
  * struct MR_PRIV_DEVICE - sdev private hostdata
  * @is_tm_capable: firmware managed tm_capable flag
  * @tm_busy: TM request is in progress
+ * @sdev_priv_busy: pending command per sdev
  */
 struct MR_PRIV_DEVICE {
        bool is_tm_capable;
        bool tm_busy;
+       atomic_t sdev_priv_busy;
        atomic_t r1_ldio_hint;
        u8 interface_type;
        u8 task_abort_tmo;
diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c
b/drivers/scsi/megaraid/megaraid_sas_fusion.c
index fd607287608e..e813ea0ad8b7 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
@@ -220,6 +220,32 @@ megasas_clear_intr_fusion(struct megasas_instance
*instance)
        return 1;
 }

+static inline void
+megasas_sdev_busy_inc(struct scsi_cmnd *scmd)
+{
+       struct MR_PRIV_DEVICE *mr_device_priv_data;
+
+       mr_device_priv_data = scmd->device->hostdata;
+       atomic_inc(&mr_device_priv_data->sdev_priv_busy);
+}
+static inline void
+megasas_sdev_busy_dec(struct scsi_cmnd *scmd)
+{
+       struct MR_PRIV_DEVICE *mr_device_priv_data;
+
+       mr_device_priv_data = scmd->device->hostdata;
+       atomic_dec(&mr_device_priv_data->sdev_priv_busy);
+}
+static inline int
+megasas_sdev_busy_read(struct scsi_cmnd *scmd)
+{
+       struct MR_PRIV_DEVICE *mr_device_priv_data;
+
+       mr_device_priv_data = scmd->device->hostdata;
+       return atomic_read(&mr_device_priv_data->sdev_priv_busy);
+}
+
+
 /**
  * megasas_get_cmd_fusion -    Get a command from the free pool
  * @instance:          Adapter soft state
@@ -359,10 +385,7 @@ megasas_get_msix_index(struct megasas_instance
*instance,
 {
        int sdev_busy;

-       /* TBD - if sml remove device_busy in future, driver
-        * should track counter in internal structure.
-        */
-       sdev_busy = atomic_read(&scmd->device->device_busy);
+       sdev_busy = megasas_sdev_busy_read(scmd);

        if (instance->perf_mode == MR_BALANCED_PERF_MODE &&
            sdev_busy > (data_arms * MR_DEVICE_HIGH_IOPS_DEPTH)) {
@@ -3390,6 +3413,7 @@ megasas_build_and_issue_cmd_fusion(struct
megasas_instance *instance,
         * Issue the command to the FW
         */

+       megasas_sdev_busy_inc(scmd);
        megasas_fire_cmd_fusion(instance, req_desc);

        if (r1_cmd)
@@ -3450,6 +3474,7 @@ megasas_complete_r1_command(struct megasas_instance
*instance,
                scmd_local->SCp.ptr = NULL;
                megasas_return_cmd_fusion(instance, cmd);
                scsi_dma_unmap(scmd_local);
+               megasas_sdev_busy_dec(scmd_local);
                scmd_local->scsi_done(scmd_local);
        }
 }
@@ -3550,6 +3575,7 @@ complete_cmd_fusion(struct megasas_instance
*instance, u32 MSIxIndex,
                                scmd_local->SCp.ptr = NULL;
                                megasas_return_cmd_fusion(instance,
cmd_fusion);
                                scsi_dma_unmap(scmd_local);
+                               megasas_sdev_busy_dec(scmd_local);
                                scmd_local->scsi_done(scmd_local);
                        } else  /* Optimal VD - R1 FP command completion.
*/
                                megasas_complete_r1_command(instance,
cmd_fusion);
--
2.18.1

>
> Thanks,
> Ming

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4169 bytes --]

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH V4 12/12] scsi: replace sdev->device_busy with sbitmap
@ 2020-11-19  6:20         ` Kashyap Desai
  0 siblings, 0 replies; 43+ messages in thread
From: Kashyap Desai @ 2020-11-19  6:20 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 4940 bytes --]

> >
> > If you fix the issue, kindly add following tag as appropriate
> > Reported-by: kernel test robot <lkp@intel.com>
> >
> > All errors (new ones prefixed by >>):
> >
> > >> drivers/scsi/megaraid/megaraid_sas_fusion.c:365:41: error: no
member
> named 'device_busy' in 'struct scsi_device'
> >            sdev_busy = atomic_read(&scmd->device->device_busy);
>
> This new reference to sdev->device_busy is added by recent shared host
tag
> patch, and according to the comment, you may have planed to convert into
> one megaraid internal counter.
>
>         /* TBD - if sml remove device_busy in future, driver
>          * should track counter in internal structure.
>          */
>
> So can you post one patch? And I am happy to fold it into this series.

Ming - Please find the patch for megaraid_sas driver -
I have used helper inline function just for inter-operability with older
kernel to support in our out of box driver.
This way it will be easy for us to replace helper function as per kernel
version check.

Subject: [PATCH] megaraid_sas: replace sdev_busy with local counter

---
 drivers/scsi/megaraid/megaraid_sas.h        |  2 ++
 drivers/scsi/megaraid/megaraid_sas_fusion.c | 34 ++++++++++++++++++---
 2 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas.h
b/drivers/scsi/megaraid/megaraid_sas.h
index 0f808d63580e..0c6a56b24c6e 100644
--- a/drivers/scsi/megaraid/megaraid_sas.h
+++ b/drivers/scsi/megaraid/megaraid_sas.h
@@ -2019,10 +2019,12 @@ union megasas_frame {
  * struct MR_PRIV_DEVICE - sdev private hostdata
  * @is_tm_capable: firmware managed tm_capable flag
  * @tm_busy: TM request is in progress
+ * @sdev_priv_busy: pending command per sdev
  */
 struct MR_PRIV_DEVICE {
        bool is_tm_capable;
        bool tm_busy;
+       atomic_t sdev_priv_busy;
        atomic_t r1_ldio_hint;
        u8 interface_type;
        u8 task_abort_tmo;
diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c
b/drivers/scsi/megaraid/megaraid_sas_fusion.c
index fd607287608e..e813ea0ad8b7 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
@@ -220,6 +220,32 @@ megasas_clear_intr_fusion(struct megasas_instance
*instance)
        return 1;
 }

+static inline void
+megasas_sdev_busy_inc(struct scsi_cmnd *scmd)
+{
+       struct MR_PRIV_DEVICE *mr_device_priv_data;
+
+       mr_device_priv_data = scmd->device->hostdata;
+       atomic_inc(&mr_device_priv_data->sdev_priv_busy);
+}
+static inline void
+megasas_sdev_busy_dec(struct scsi_cmnd *scmd)
+{
+       struct MR_PRIV_DEVICE *mr_device_priv_data;
+
+       mr_device_priv_data = scmd->device->hostdata;
+       atomic_dec(&mr_device_priv_data->sdev_priv_busy);
+}
+static inline int
+megasas_sdev_busy_read(struct scsi_cmnd *scmd)
+{
+       struct MR_PRIV_DEVICE *mr_device_priv_data;
+
+       mr_device_priv_data = scmd->device->hostdata;
+       return atomic_read(&mr_device_priv_data->sdev_priv_busy);
+}
+
+
 /**
  * megasas_get_cmd_fusion -    Get a command from the free pool
  * @instance:          Adapter soft state
@@ -359,10 +385,7 @@ megasas_get_msix_index(struct megasas_instance
*instance,
 {
        int sdev_busy;

-       /* TBD - if sml remove device_busy in future, driver
-        * should track counter in internal structure.
-        */
-       sdev_busy = atomic_read(&scmd->device->device_busy);
+       sdev_busy = megasas_sdev_busy_read(scmd);

        if (instance->perf_mode == MR_BALANCED_PERF_MODE &&
            sdev_busy > (data_arms * MR_DEVICE_HIGH_IOPS_DEPTH)) {
@@ -3390,6 +3413,7 @@ megasas_build_and_issue_cmd_fusion(struct
megasas_instance *instance,
         * Issue the command to the FW
         */

+       megasas_sdev_busy_inc(scmd);
        megasas_fire_cmd_fusion(instance, req_desc);

        if (r1_cmd)
@@ -3450,6 +3474,7 @@ megasas_complete_r1_command(struct megasas_instance
*instance,
                scmd_local->SCp.ptr = NULL;
                megasas_return_cmd_fusion(instance, cmd);
                scsi_dma_unmap(scmd_local);
+               megasas_sdev_busy_dec(scmd_local);
                scmd_local->scsi_done(scmd_local);
        }
 }
@@ -3550,6 +3575,7 @@ complete_cmd_fusion(struct megasas_instance
*instance, u32 MSIxIndex,
                                scmd_local->SCp.ptr = NULL;
                                megasas_return_cmd_fusion(instance,
cmd_fusion);
                                scsi_dma_unmap(scmd_local);
+                               megasas_sdev_busy_dec(scmd_local);
                                scmd_local->scsi_done(scmd_local);
                        } else  /* Optimal VD - R1 FP command completion.
*/
                                megasas_complete_r1_command(instance,
cmd_fusion);
--
2.18.1

>
> Thanks,
> Ming

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 4169 bytes --]

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* RE: [PATCH V4 12/12] scsi: replace sdev->device_busy with sbitmap
  2020-11-18  7:44           ` Ming Lei
@ 2020-11-19  6:29             ` Kashyap Desai
  -1 siblings, 0 replies; 43+ messages in thread
From: Kashyap Desai @ 2020-11-19  6:29 UTC (permalink / raw)
  To: Ming Lei, Hannes Reinecke
  Cc: kernel test robot, Sumanesh Samanta, Jens Axboe, linux-block,
	Martin K . Petersen, linux-scsi, kbuild-all, clang-built-linux,
	Omar Sandoval, Ewan D . Milne

[-- Attachment #1: Type: text/plain, Size: 1905 bytes --]

> > From: Hannes Reinecke <hare@suse.de>
> > Date: Wed, 18 Nov 2020 08:08:41 +0100
> > Subject: [PATCH] megaraid_sas: use scsi_device_busy() instead of
> > direct access  to atomic counter
> >
> > It's always a bad style to access structure internals, especially if
> > there is an accessor for it. So convert to use scsi_device_busy()
> > intead of accessing the atomic counter directly.
> >
> > Cc: Kashyap Desai <kashyap.desai@broadcom.com>
> > Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
> > Signed-off-by: Hannes Reinecke <hare@suse.de>
> > ---
> >  drivers/scsi/megaraid/megaraid_sas_fusion.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > index fd607287608e..272ff123bc6b 100644
> > --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > @@ -362,7 +362,7 @@ megasas_get_msix_index(struct megasas_instance
> *instance,
> >  	/* TBD - if sml remove device_busy in future, driver
> >  	 * should track counter in internal structure.
> >  	 */
> > -	sdev_busy = atomic_read(&scmd->device->device_busy);
> > +	sdev_busy = scsi_device_busy(scmd->device);
>
> megasas_get_msix_index() is called in .queuecommand() path,
> scsi_device_busy() might take more cycles since it has to iterate over
each
> sbitmap words, especially when the sbitmap depth is high.
>
> I'd suggest Kashyap/Sumanesh to check if there is better way to deal
with it. If
> not, scsi_device_busy() should be fine.

Scsi_device_busy() add significant amount of overhead which will be
visible in terms of reduced IOPS and high CPU cycle. I tested it earlier
and noticed regression in performance.
I posted megaraid_sas driver patch which will use internal per sdev
outstanding similar to legacy sdev_busy counter.

Kashyap
>
>
> Thanks,
> Ming

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4169 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH V4 12/12] scsi: replace sdev->device_busy with sbitmap
@ 2020-11-19  6:29             ` Kashyap Desai
  0 siblings, 0 replies; 43+ messages in thread
From: Kashyap Desai @ 2020-11-19  6:29 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 1954 bytes --]

> > From: Hannes Reinecke <hare@suse.de>
> > Date: Wed, 18 Nov 2020 08:08:41 +0100
> > Subject: [PATCH] megaraid_sas: use scsi_device_busy() instead of
> > direct access  to atomic counter
> >
> > It's always a bad style to access structure internals, especially if
> > there is an accessor for it. So convert to use scsi_device_busy()
> > intead of accessing the atomic counter directly.
> >
> > Cc: Kashyap Desai <kashyap.desai@broadcom.com>
> > Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
> > Signed-off-by: Hannes Reinecke <hare@suse.de>
> > ---
> >  drivers/scsi/megaraid/megaraid_sas_fusion.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > index fd607287608e..272ff123bc6b 100644
> > --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > @@ -362,7 +362,7 @@ megasas_get_msix_index(struct megasas_instance
> *instance,
> >  	/* TBD - if sml remove device_busy in future, driver
> >  	 * should track counter in internal structure.
> >  	 */
> > -	sdev_busy = atomic_read(&scmd->device->device_busy);
> > +	sdev_busy = scsi_device_busy(scmd->device);
>
> megasas_get_msix_index() is called in .queuecommand() path,
> scsi_device_busy() might take more cycles since it has to iterate over
each
> sbitmap words, especially when the sbitmap depth is high.
>
> I'd suggest Kashyap/Sumanesh to check if there is better way to deal
with it. If
> not, scsi_device_busy() should be fine.

Scsi_device_busy() add significant amount of overhead which will be
visible in terms of reduced IOPS and high CPU cycle. I tested it earlier
and noticed regression in performance.
I posted megaraid_sas driver patch which will use internal per sdev
outstanding similar to legacy sdev_busy counter.

Kashyap
>
>
> Thanks,
> Ming

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 4169 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH V4 12/12] scsi: replace sdev->device_busy with sbitmap
  2020-11-18  9:15             ` Hannes Reinecke
@ 2020-11-19  6:30               ` Ming Lei
  -1 siblings, 0 replies; 43+ messages in thread
From: Ming Lei @ 2020-11-19  6:30 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: kernel test robot, Kashyap Desai, Sumanesh Samanta, Jens Axboe,
	linux-block, Martin K . Petersen, linux-scsi, kbuild-all,
	clang-built-linux, Omar Sandoval, Ewan D . Milne

On Wed, Nov 18, 2020 at 10:15:19AM +0100, Hannes Reinecke wrote:
> On 11/18/20 8:44 AM, Ming Lei wrote:
> > On Wed, Nov 18, 2020 at 08:15:47AM +0100, Hannes Reinecke wrote:
> > > Hey Ming,
> > > 
> > > On 11/18/20 3:35 AM, Ming Lei wrote:
> > > > Hello Kashyap & Sumanesh,
> > > > 
> > > > On Mon, Nov 16, 2020 at 07:49:31PM +0800, kernel test robot wrote:
> > > > > Hi Ming,
> > > > > 
> > > > > Thank you for the patch! Yet something to improve:
> > > > > 
> > > > > [auto build test ERROR on block/for-next]
> > > > > [also build test ERROR on mkp-scsi/for-next scsi/for-next v5.10-rc4 next-20201116]
> > > > > [If your patch is applied to the wrong git tree, kindly drop us a note.
> > > > > And when submitting patch, we suggest to use '--base' as documented in
> > > > > https://git-scm.com/docs/git-format-patch]
> > > > > 
> > > > > url:    https://github.com/0day-ci/linux/commits/Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
> > > > > base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
> > > > > config: powerpc64-randconfig-r026-20201116 (attached as .config)
> > > > > compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project c044709b8fbea2a9a375e4173a6bd735f6866c0c)
> > > > > reproduce (this is a W=1 build):
> > > > >           wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> > > > >           chmod +x ~/bin/make.cross
> > > > >           # install powerpc64 cross compiling tool for clang build
> > > > >           # apt-get install binutils-powerpc64-linux-gnu
> > > > >           # https://github.com/0day-ci/linux/commit/cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
> > > > >           git remote add linux-review https://github.com/0day-ci/linux
> > > > >           git fetch --no-tags linux-review Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
> > > > >           git checkout cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
> > > > >           # save the attached .config to linux build tree
> > > > >           COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc64
> > > > > 
> > > > > If you fix the issue, kindly add following tag as appropriate
> > > > > Reported-by: kernel test robot <lkp@intel.com>
> > > > > 
> > > > > All errors (new ones prefixed by >>):
> > > > > 
> > > > > > > drivers/scsi/megaraid/megaraid_sas_fusion.c:365:41: error: no member named 'device_busy' in 'struct scsi_device'
> > > > >              sdev_busy = atomic_read(&scmd->device->device_busy);
> > > > 
> > > > This new reference to sdev->device_busy is added by recent shared host
> > > > tag patch, and according to the comment, you may have planed to convert into
> > > > one megaraid internal counter.
> > > > 
> > > >           /* TBD - if sml remove device_busy in future, driver
> > > >            * should track counter in internal structure.
> > > >            */
> > > > 
> > > > So can you post one patch? And I am happy to fold it into this series.
> > > > 
> > > Seeing that we already have the accessor 'scsi_device_busy()' it's probably
> > > easier to just use that and not fiddle with driver internals.
> > > See the attached patch.
> > > 
> > > Cheers,
> > > 
> > > Hannes
> > > -- 
> > > Dr. Hannes Reinecke                Kernel Storage Architect
> > > hare@suse.de                              +49 911 74053 688
> > > SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
> > > HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
> > 
> > >  From d8fa5e61187dbe851b8da9c65a5df5ec5809f8ea Mon Sep 17 00:00:00 2001
> > > From: Hannes Reinecke <hare@suse.de>
> > > Date: Wed, 18 Nov 2020 08:08:41 +0100
> > > Subject: [PATCH] megaraid_sas: use scsi_device_busy() instead of direct access
> > >   to atomic counter
> > > 
> > > It's always a bad style to access structure internals, especially if
> > > there is an accessor for it. So convert to use scsi_device_busy()
> > > intead of accessing the atomic counter directly.
> > > 
> > > Cc: Kashyap Desai <kashyap.desai@broadcom.com>
> > > Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
> > > Signed-off-by: Hannes Reinecke <hare@suse.de>
> > > ---
> > >   drivers/scsi/megaraid/megaraid_sas_fusion.c | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > > index fd607287608e..272ff123bc6b 100644
> > > --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > > +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > > @@ -362,7 +362,7 @@ megasas_get_msix_index(struct megasas_instance *instance,
> > >   	/* TBD - if sml remove device_busy in future, driver
> > >   	 * should track counter in internal structure.
> > >   	 */
> > > -	sdev_busy = atomic_read(&scmd->device->device_busy);
> > > +	sdev_busy = scsi_device_busy(scmd->device);
> > 
> > megasas_get_msix_index() is called in .queuecommand() path,
> > scsi_device_busy() might take more cycles since it has to iterate over
> > each sbitmap words, especially when the sbitmap depth is high.
> > 
> > I'd suggest Kashyap/Sumanesh to check if there is better way to
> > deal with it. If not, scsi_device_busy() should be fine.
> > 
> I guess this whole codepath will become obsolete if and when support for
> polling queues / io_uring will be implemented for megaraid_sas.

Not sure if it is related with iopoll which requires host tags. I understand
the code path for selecting msi index should be replaced with the following
simply if host tags is enabled:

	if (instance->host->nr_hw_queues > 1) {
                u32 tag = blk_mq_unique_tag(scmd->request);

                cmd->request_desc->SCSIIO.MSIxIndex = blk_mq_unique_tag_to_hwq(tag) +
                        instance->low_latency_index_start;
	} else {
		if (instance->perf_mode == MR_BALANCED_PERF_MODE)
			...
		else if (instance->msix_load_balance)
			...
		else
			cmd->request_desc->SCSIIO.MSIxIndex =
				instance->reply_map[raw_smp_processor_id()];
	}

Otherwise there might be risk to trigger soft lockup in case of host tags.

sdev->device_busy is only required for MR_BALANCED_PERF_MODE, so your
patch can be adjusted to read the counter only for MR_BALANCED_PERF_MODE.

> This whole section deals with spreading the load over several hardware
> queues once the dedicated one is at risk of being congested.
> But this is only required if someone want to reach high IOPS; so if we have
> poll/io_uring support there won't be a need for this anymore.

I understand poll is for low latency usage with extra cost of CPU utilization, and
iopoll can't replace irq based IO. But I may misunderstood your point.




Thanks, 
Ming


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH V4 12/12] scsi: replace sdev->device_busy with sbitmap
@ 2020-11-19  6:30               ` Ming Lei
  0 siblings, 0 replies; 43+ messages in thread
From: Ming Lei @ 2020-11-19  6:30 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 6893 bytes --]

On Wed, Nov 18, 2020 at 10:15:19AM +0100, Hannes Reinecke wrote:
> On 11/18/20 8:44 AM, Ming Lei wrote:
> > On Wed, Nov 18, 2020 at 08:15:47AM +0100, Hannes Reinecke wrote:
> > > Hey Ming,
> > > 
> > > On 11/18/20 3:35 AM, Ming Lei wrote:
> > > > Hello Kashyap & Sumanesh,
> > > > 
> > > > On Mon, Nov 16, 2020 at 07:49:31PM +0800, kernel test robot wrote:
> > > > > Hi Ming,
> > > > > 
> > > > > Thank you for the patch! Yet something to improve:
> > > > > 
> > > > > [auto build test ERROR on block/for-next]
> > > > > [also build test ERROR on mkp-scsi/for-next scsi/for-next v5.10-rc4 next-20201116]
> > > > > [If your patch is applied to the wrong git tree, kindly drop us a note.
> > > > > And when submitting patch, we suggest to use '--base' as documented in
> > > > > https://git-scm.com/docs/git-format-patch]
> > > > > 
> > > > > url:    https://github.com/0day-ci/linux/commits/Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
> > > > > base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
> > > > > config: powerpc64-randconfig-r026-20201116 (attached as .config)
> > > > > compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project c044709b8fbea2a9a375e4173a6bd735f6866c0c)
> > > > > reproduce (this is a W=1 build):
> > > > >           wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> > > > >           chmod +x ~/bin/make.cross
> > > > >           # install powerpc64 cross compiling tool for clang build
> > > > >           # apt-get install binutils-powerpc64-linux-gnu
> > > > >           # https://github.com/0day-ci/linux/commit/cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
> > > > >           git remote add linux-review https://github.com/0day-ci/linux
> > > > >           git fetch --no-tags linux-review Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
> > > > >           git checkout cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
> > > > >           # save the attached .config to linux build tree
> > > > >           COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc64
> > > > > 
> > > > > If you fix the issue, kindly add following tag as appropriate
> > > > > Reported-by: kernel test robot <lkp@intel.com>
> > > > > 
> > > > > All errors (new ones prefixed by >>):
> > > > > 
> > > > > > > drivers/scsi/megaraid/megaraid_sas_fusion.c:365:41: error: no member named 'device_busy' in 'struct scsi_device'
> > > > >              sdev_busy = atomic_read(&scmd->device->device_busy);
> > > > 
> > > > This new reference to sdev->device_busy is added by recent shared host
> > > > tag patch, and according to the comment, you may have planed to convert into
> > > > one megaraid internal counter.
> > > > 
> > > >           /* TBD - if sml remove device_busy in future, driver
> > > >            * should track counter in internal structure.
> > > >            */
> > > > 
> > > > So can you post one patch? And I am happy to fold it into this series.
> > > > 
> > > Seeing that we already have the accessor 'scsi_device_busy()' it's probably
> > > easier to just use that and not fiddle with driver internals.
> > > See the attached patch.
> > > 
> > > Cheers,
> > > 
> > > Hannes
> > > -- 
> > > Dr. Hannes Reinecke                Kernel Storage Architect
> > > hare(a)suse.de                              +49 911 74053 688
> > > SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
> > > HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
> > 
> > >  From d8fa5e61187dbe851b8da9c65a5df5ec5809f8ea Mon Sep 17 00:00:00 2001
> > > From: Hannes Reinecke <hare@suse.de>
> > > Date: Wed, 18 Nov 2020 08:08:41 +0100
> > > Subject: [PATCH] megaraid_sas: use scsi_device_busy() instead of direct access
> > >   to atomic counter
> > > 
> > > It's always a bad style to access structure internals, especially if
> > > there is an accessor for it. So convert to use scsi_device_busy()
> > > intead of accessing the atomic counter directly.
> > > 
> > > Cc: Kashyap Desai <kashyap.desai@broadcom.com>
> > > Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
> > > Signed-off-by: Hannes Reinecke <hare@suse.de>
> > > ---
> > >   drivers/scsi/megaraid/megaraid_sas_fusion.c | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > > index fd607287608e..272ff123bc6b 100644
> > > --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > > +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > > @@ -362,7 +362,7 @@ megasas_get_msix_index(struct megasas_instance *instance,
> > >   	/* TBD - if sml remove device_busy in future, driver
> > >   	 * should track counter in internal structure.
> > >   	 */
> > > -	sdev_busy = atomic_read(&scmd->device->device_busy);
> > > +	sdev_busy = scsi_device_busy(scmd->device);
> > 
> > megasas_get_msix_index() is called in .queuecommand() path,
> > scsi_device_busy() might take more cycles since it has to iterate over
> > each sbitmap words, especially when the sbitmap depth is high.
> > 
> > I'd suggest Kashyap/Sumanesh to check if there is better way to
> > deal with it. If not, scsi_device_busy() should be fine.
> > 
> I guess this whole codepath will become obsolete if and when support for
> polling queues / io_uring will be implemented for megaraid_sas.

Not sure if it is related with iopoll which requires host tags. I understand
the code path for selecting msi index should be replaced with the following
simply if host tags is enabled:

	if (instance->host->nr_hw_queues > 1) {
                u32 tag = blk_mq_unique_tag(scmd->request);

                cmd->request_desc->SCSIIO.MSIxIndex = blk_mq_unique_tag_to_hwq(tag) +
                        instance->low_latency_index_start;
	} else {
		if (instance->perf_mode == MR_BALANCED_PERF_MODE)
			...
		else if (instance->msix_load_balance)
			...
		else
			cmd->request_desc->SCSIIO.MSIxIndex =
				instance->reply_map[raw_smp_processor_id()];
	}

Otherwise there might be risk to trigger soft lockup in case of host tags.

sdev->device_busy is only required for MR_BALANCED_PERF_MODE, so your
patch can be adjusted to read the counter only for MR_BALANCED_PERF_MODE.

> This whole section deals with spreading the load over several hardware
> queues once the dedicated one is at risk of being congested.
> But this is only required if someone want to reach high IOPS; so if we have
> poll/io_uring support there won't be a need for this anymore.

I understand poll is for low latency usage with extra cost of CPU utilization, and
iopoll can't replace irq based IO. But I may misunderstood your point.




Thanks, 
Ming

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH V4 12/12] scsi: replace sdev->device_busy with sbitmap
  2020-11-19  6:20         ` Kashyap Desai
@ 2020-11-19  6:34           ` Ming Lei
  -1 siblings, 0 replies; 43+ messages in thread
From: Ming Lei @ 2020-11-19  6:34 UTC (permalink / raw)
  To: Kashyap Desai
  Cc: kernel test robot, Sumanesh Samanta, Jens Axboe, linux-block,
	Martin K . Petersen, linux-scsi, kbuild-all, clang-built-linux,
	Omar Sandoval, Ewan D . Milne, Hannes Reinecke

On Thu, Nov 19, 2020 at 11:50:39AM +0530, Kashyap Desai wrote:
> > >
> > > If you fix the issue, kindly add following tag as appropriate
> > > Reported-by: kernel test robot <lkp@intel.com>
> > >
> > > All errors (new ones prefixed by >>):
> > >
> > > >> drivers/scsi/megaraid/megaraid_sas_fusion.c:365:41: error: no
> member
> > named 'device_busy' in 'struct scsi_device'
> > >            sdev_busy = atomic_read(&scmd->device->device_busy);
> >
> > This new reference to sdev->device_busy is added by recent shared host
> tag
> > patch, and according to the comment, you may have planed to convert into
> > one megaraid internal counter.
> >
> >         /* TBD - if sml remove device_busy in future, driver
> >          * should track counter in internal structure.
> >          */
> >
> > So can you post one patch? And I am happy to fold it into this series.
> 
> Ming - Please find the patch for megaraid_sas driver -
> I have used helper inline function just for inter-operability with older
> kernel to support in our out of box driver.
> This way it will be easy for us to replace helper function as per kernel
> version check.
> 
> Subject: [PATCH] megaraid_sas: replace sdev_busy with local counter
> 
> ---
>  drivers/scsi/megaraid/megaraid_sas.h        |  2 ++
>  drivers/scsi/megaraid/megaraid_sas_fusion.c | 34 ++++++++++++++++++---
>  2 files changed, 32 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/scsi/megaraid/megaraid_sas.h
> b/drivers/scsi/megaraid/megaraid_sas.h
> index 0f808d63580e..0c6a56b24c6e 100644
> --- a/drivers/scsi/megaraid/megaraid_sas.h
> +++ b/drivers/scsi/megaraid/megaraid_sas.h
> @@ -2019,10 +2019,12 @@ union megasas_frame {
>   * struct MR_PRIV_DEVICE - sdev private hostdata
>   * @is_tm_capable: firmware managed tm_capable flag
>   * @tm_busy: TM request is in progress
> + * @sdev_priv_busy: pending command per sdev
>   */
>  struct MR_PRIV_DEVICE {
>         bool is_tm_capable;
>         bool tm_busy;
> +       atomic_t sdev_priv_busy;
>         atomic_t r1_ldio_hint;
>         u8 interface_type;
>         u8 task_abort_tmo;
> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> index fd607287608e..e813ea0ad8b7 100644
> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> @@ -220,6 +220,32 @@ megasas_clear_intr_fusion(struct megasas_instance
> *instance)
>         return 1;
>  }
> 
> +static inline void
> +megasas_sdev_busy_inc(struct scsi_cmnd *scmd)
> +{
> +       struct MR_PRIV_DEVICE *mr_device_priv_data;
> +
> +       mr_device_priv_data = scmd->device->hostdata;
> +       atomic_inc(&mr_device_priv_data->sdev_priv_busy);
> +}
> +static inline void
> +megasas_sdev_busy_dec(struct scsi_cmnd *scmd)
> +{
> +       struct MR_PRIV_DEVICE *mr_device_priv_data;
> +
> +       mr_device_priv_data = scmd->device->hostdata;
> +       atomic_dec(&mr_device_priv_data->sdev_priv_busy);
> +}
> +static inline int
> +megasas_sdev_busy_read(struct scsi_cmnd *scmd)
> +{
> +       struct MR_PRIV_DEVICE *mr_device_priv_data;
> +
> +       mr_device_priv_data = scmd->device->hostdata;
> +       return atomic_read(&mr_device_priv_data->sdev_priv_busy);
> +}
> +
> +
>  /**
>   * megasas_get_cmd_fusion -    Get a command from the free pool
>   * @instance:          Adapter soft state
> @@ -359,10 +385,7 @@ megasas_get_msix_index(struct megasas_instance
> *instance,
>  {
>         int sdev_busy;
> 
> -       /* TBD - if sml remove device_busy in future, driver
> -        * should track counter in internal structure.
> -        */
> -       sdev_busy = atomic_read(&scmd->device->device_busy);
> +       sdev_busy = megasas_sdev_busy_read(scmd);

The above is only used for MR_BALANCED_PERF_MODE, so maybe you can
skip inc/dec/read the counter for other perf mode.



Thanks, 
Ming


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH V4 12/12] scsi: replace sdev->device_busy with sbitmap
@ 2020-11-19  6:34           ` Ming Lei
  0 siblings, 0 replies; 43+ messages in thread
From: Ming Lei @ 2020-11-19  6:34 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 3967 bytes --]

On Thu, Nov 19, 2020 at 11:50:39AM +0530, Kashyap Desai wrote:
> > >
> > > If you fix the issue, kindly add following tag as appropriate
> > > Reported-by: kernel test robot <lkp@intel.com>
> > >
> > > All errors (new ones prefixed by >>):
> > >
> > > >> drivers/scsi/megaraid/megaraid_sas_fusion.c:365:41: error: no
> member
> > named 'device_busy' in 'struct scsi_device'
> > >            sdev_busy = atomic_read(&scmd->device->device_busy);
> >
> > This new reference to sdev->device_busy is added by recent shared host
> tag
> > patch, and according to the comment, you may have planed to convert into
> > one megaraid internal counter.
> >
> >         /* TBD - if sml remove device_busy in future, driver
> >          * should track counter in internal structure.
> >          */
> >
> > So can you post one patch? And I am happy to fold it into this series.
> 
> Ming - Please find the patch for megaraid_sas driver -
> I have used helper inline function just for inter-operability with older
> kernel to support in our out of box driver.
> This way it will be easy for us to replace helper function as per kernel
> version check.
> 
> Subject: [PATCH] megaraid_sas: replace sdev_busy with local counter
> 
> ---
>  drivers/scsi/megaraid/megaraid_sas.h        |  2 ++
>  drivers/scsi/megaraid/megaraid_sas_fusion.c | 34 ++++++++++++++++++---
>  2 files changed, 32 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/scsi/megaraid/megaraid_sas.h
> b/drivers/scsi/megaraid/megaraid_sas.h
> index 0f808d63580e..0c6a56b24c6e 100644
> --- a/drivers/scsi/megaraid/megaraid_sas.h
> +++ b/drivers/scsi/megaraid/megaraid_sas.h
> @@ -2019,10 +2019,12 @@ union megasas_frame {
>   * struct MR_PRIV_DEVICE - sdev private hostdata
>   * @is_tm_capable: firmware managed tm_capable flag
>   * @tm_busy: TM request is in progress
> + * @sdev_priv_busy: pending command per sdev
>   */
>  struct MR_PRIV_DEVICE {
>         bool is_tm_capable;
>         bool tm_busy;
> +       atomic_t sdev_priv_busy;
>         atomic_t r1_ldio_hint;
>         u8 interface_type;
>         u8 task_abort_tmo;
> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> index fd607287608e..e813ea0ad8b7 100644
> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> @@ -220,6 +220,32 @@ megasas_clear_intr_fusion(struct megasas_instance
> *instance)
>         return 1;
>  }
> 
> +static inline void
> +megasas_sdev_busy_inc(struct scsi_cmnd *scmd)
> +{
> +       struct MR_PRIV_DEVICE *mr_device_priv_data;
> +
> +       mr_device_priv_data = scmd->device->hostdata;
> +       atomic_inc(&mr_device_priv_data->sdev_priv_busy);
> +}
> +static inline void
> +megasas_sdev_busy_dec(struct scsi_cmnd *scmd)
> +{
> +       struct MR_PRIV_DEVICE *mr_device_priv_data;
> +
> +       mr_device_priv_data = scmd->device->hostdata;
> +       atomic_dec(&mr_device_priv_data->sdev_priv_busy);
> +}
> +static inline int
> +megasas_sdev_busy_read(struct scsi_cmnd *scmd)
> +{
> +       struct MR_PRIV_DEVICE *mr_device_priv_data;
> +
> +       mr_device_priv_data = scmd->device->hostdata;
> +       return atomic_read(&mr_device_priv_data->sdev_priv_busy);
> +}
> +
> +
>  /**
>   * megasas_get_cmd_fusion -    Get a command from the free pool
>   * @instance:          Adapter soft state
> @@ -359,10 +385,7 @@ megasas_get_msix_index(struct megasas_instance
> *instance,
>  {
>         int sdev_busy;
> 
> -       /* TBD - if sml remove device_busy in future, driver
> -        * should track counter in internal structure.
> -        */
> -       sdev_busy = atomic_read(&scmd->device->device_busy);
> +       sdev_busy = megasas_sdev_busy_read(scmd);

The above is only used for MR_BALANCED_PERF_MODE, so maybe you can
skip inc/dec/read the counter for other perf mode.



Thanks, 
Ming

^ permalink raw reply	[flat|nested] 43+ messages in thread

* RE: [PATCH V4 12/12] scsi: replace sdev->device_busy with sbitmap
  2020-11-19  6:34           ` Ming Lei
@ 2020-11-19  7:09             ` Kashyap Desai
  -1 siblings, 0 replies; 43+ messages in thread
From: Kashyap Desai @ 2020-11-19  7:09 UTC (permalink / raw)
  To: Ming Lei
  Cc: kernel test robot, Sumanesh Samanta, Jens Axboe, linux-block,
	Martin K . Petersen, linux-scsi, kbuild-all, clang-built-linux,
	Omar Sandoval, Ewan D . Milne, Hannes Reinecke

[-- Attachment #1: Type: text/plain, Size: 5510 bytes --]

> >  /**
> >   * megasas_get_cmd_fusion -    Get a command from the free pool
> >   * @instance:          Adapter soft state
> > @@ -359,10 +385,7 @@ megasas_get_msix_index(struct megasas_instance
> > *instance,  {
> >         int sdev_busy;
> >
> > -       /* TBD - if sml remove device_busy in future, driver
> > -        * should track counter in internal structure.
> > -        */
> > -       sdev_busy = atomic_read(&scmd->device->device_busy);
> > +       sdev_busy = megasas_sdev_busy_read(scmd);
>
> The above is only used for MR_BALANCED_PERF_MODE, so maybe you can
> skip inc/dec/read the counter for other perf mode.

Agree. I have created V2.

Subject: [PATCH] megaraid_sas: v2 replace sdev_busy with local counter

use local tracking of per sdev outstanding command since sdev_busy in
SML is improved for peformance reason using sbitmap (earlier it was
atomic variable).

Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
---
 drivers/scsi/megaraid/megaraid_sas.h        |  2 +
 drivers/scsi/megaraid/megaraid_sas_fusion.c | 51 +++++++++++++++++----
 2 files changed, 45 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas.h
b/drivers/scsi/megaraid/megaraid_sas.h
index 0f808d63580e..0c6a56b24c6e 100644
--- a/drivers/scsi/megaraid/megaraid_sas.h
+++ b/drivers/scsi/megaraid/megaraid_sas.h
@@ -2019,10 +2019,12 @@ union megasas_frame {
  * struct MR_PRIV_DEVICE - sdev private hostdata
  * @is_tm_capable: firmware managed tm_capable flag
  * @tm_busy: TM request is in progress
+ * @sdev_priv_busy: pending command per sdev
  */
 struct MR_PRIV_DEVICE {
        bool is_tm_capable;
        bool tm_busy;
+       atomic_t sdev_priv_busy;
        atomic_t r1_ldio_hint;
        u8 interface_type;
        u8 task_abort_tmo;
diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c
b/drivers/scsi/megaraid/megaraid_sas_fusion.c
index fd607287608e..c630404cbb2d 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
@@ -220,6 +220,44 @@ megasas_clear_intr_fusion(struct megasas_instance
*instance)
        return 1;
 }

+static inline void
+megasas_sdev_busy_inc(struct megasas_instance *instance,
+                     struct scsi_cmnd *scmd)
+{
+       struct MR_PRIV_DEVICE *mr_device_priv_data;
+
+       if (instance->perf_mode != MR_BALANCED_PERF_MODE)
+               return;
+
+       mr_device_priv_data = scmd->device->hostdata;
+       atomic_inc(&mr_device_priv_data->sdev_priv_busy);
+}
+static inline void
+megasas_sdev_busy_dec(struct megasas_instance *instance,
+                     struct scsi_cmnd *scmd)
+{
+       struct MR_PRIV_DEVICE *mr_device_priv_data;
+
+       if (instance->perf_mode != MR_BALANCED_PERF_MODE)
+               return;
+
+       mr_device_priv_data = scmd->device->hostdata;
+       atomic_dec(&mr_device_priv_data->sdev_priv_busy);
+}
+static inline int
+megasas_sdev_busy_read(struct megasas_instance *instance,
+                      struct scsi_cmnd *scmd)
+{
+       struct MR_PRIV_DEVICE *mr_device_priv_data;
+
+       if (instance->perf_mode != MR_BALANCED_PERF_MODE)
+               return 0;
+
+       mr_device_priv_data = scmd->device->hostdata;
+       return atomic_read(&mr_device_priv_data->sdev_priv_busy);
+}
+
+
 /**
  * megasas_get_cmd_fusion -    Get a command from the free pool
  * @instance:          Adapter soft state
@@ -357,15 +395,9 @@ megasas_get_msix_index(struct megasas_instance
*instance,
                       struct megasas_cmd_fusion *cmd,
                       u8 data_arms)
 {
-       int sdev_busy;
-
-       /* TBD - if sml remove device_busy in future, driver
-        * should track counter in internal structure.
-        */
-       sdev_busy = atomic_read(&scmd->device->device_busy);
-
        if (instance->perf_mode == MR_BALANCED_PERF_MODE &&
-           sdev_busy > (data_arms * MR_DEVICE_HIGH_IOPS_DEPTH)) {
+           (megasas_sdev_busy_read(instance, scmd) >
+           (data_arms * MR_DEVICE_HIGH_IOPS_DEPTH))) {
                cmd->request_desc->SCSIIO.MSIxIndex =
                        mega_mod64((atomic64_add_return(1,
&instance->high_iops_outstanding) /
                                        MR_HIGH_IOPS_BATCH_COUNT),
instance->low_latency_index_start);
@@ -3390,6 +3422,7 @@ megasas_build_and_issue_cmd_fusion(struct
megasas_instance *instance,
         * Issue the command to the FW
         */

+       megasas_sdev_busy_inc(instance, scmd);
        megasas_fire_cmd_fusion(instance, req_desc);

        if (r1_cmd)
@@ -3450,6 +3483,7 @@ megasas_complete_r1_command(struct megasas_instance
*instance,
                scmd_local->SCp.ptr = NULL;
                megasas_return_cmd_fusion(instance, cmd);
                scsi_dma_unmap(scmd_local);
+               megasas_sdev_busy_dec(instance, scmd_local);
                scmd_local->scsi_done(scmd_local);
        }
 }
@@ -3550,6 +3584,7 @@ complete_cmd_fusion(struct megasas_instance
*instance, u32 MSIxIndex,
                                scmd_local->SCp.ptr = NULL;
                                megasas_return_cmd_fusion(instance,
cmd_fusion);
                                scsi_dma_unmap(scmd_local);
+                               megasas_sdev_busy_dec(instance,
scmd_local);
                                scmd_local->scsi_done(scmd_local);
                        } else  /* Optimal VD - R1 FP command completion.
*/
                                megasas_complete_r1_command(instance,
cmd_fusion);
--
>
>
>
> Thanks,
> Ming

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4169 bytes --]

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH V4 12/12] scsi: replace sdev->device_busy with sbitmap
@ 2020-11-19  7:09             ` Kashyap Desai
  0 siblings, 0 replies; 43+ messages in thread
From: Kashyap Desai @ 2020-11-19  7:09 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 5667 bytes --]

> >  /**
> >   * megasas_get_cmd_fusion -    Get a command from the free pool
> >   * @instance:          Adapter soft state
> > @@ -359,10 +385,7 @@ megasas_get_msix_index(struct megasas_instance
> > *instance,  {
> >         int sdev_busy;
> >
> > -       /* TBD - if sml remove device_busy in future, driver
> > -        * should track counter in internal structure.
> > -        */
> > -       sdev_busy = atomic_read(&scmd->device->device_busy);
> > +       sdev_busy = megasas_sdev_busy_read(scmd);
>
> The above is only used for MR_BALANCED_PERF_MODE, so maybe you can
> skip inc/dec/read the counter for other perf mode.

Agree. I have created V2.

Subject: [PATCH] megaraid_sas: v2 replace sdev_busy with local counter

use local tracking of per sdev outstanding command since sdev_busy in
SML is improved for peformance reason using sbitmap (earlier it was
atomic variable).

Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
---
 drivers/scsi/megaraid/megaraid_sas.h        |  2 +
 drivers/scsi/megaraid/megaraid_sas_fusion.c | 51 +++++++++++++++++----
 2 files changed, 45 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas.h
b/drivers/scsi/megaraid/megaraid_sas.h
index 0f808d63580e..0c6a56b24c6e 100644
--- a/drivers/scsi/megaraid/megaraid_sas.h
+++ b/drivers/scsi/megaraid/megaraid_sas.h
@@ -2019,10 +2019,12 @@ union megasas_frame {
  * struct MR_PRIV_DEVICE - sdev private hostdata
  * @is_tm_capable: firmware managed tm_capable flag
  * @tm_busy: TM request is in progress
+ * @sdev_priv_busy: pending command per sdev
  */
 struct MR_PRIV_DEVICE {
        bool is_tm_capable;
        bool tm_busy;
+       atomic_t sdev_priv_busy;
        atomic_t r1_ldio_hint;
        u8 interface_type;
        u8 task_abort_tmo;
diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c
b/drivers/scsi/megaraid/megaraid_sas_fusion.c
index fd607287608e..c630404cbb2d 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
@@ -220,6 +220,44 @@ megasas_clear_intr_fusion(struct megasas_instance
*instance)
        return 1;
 }

+static inline void
+megasas_sdev_busy_inc(struct megasas_instance *instance,
+                     struct scsi_cmnd *scmd)
+{
+       struct MR_PRIV_DEVICE *mr_device_priv_data;
+
+       if (instance->perf_mode != MR_BALANCED_PERF_MODE)
+               return;
+
+       mr_device_priv_data = scmd->device->hostdata;
+       atomic_inc(&mr_device_priv_data->sdev_priv_busy);
+}
+static inline void
+megasas_sdev_busy_dec(struct megasas_instance *instance,
+                     struct scsi_cmnd *scmd)
+{
+       struct MR_PRIV_DEVICE *mr_device_priv_data;
+
+       if (instance->perf_mode != MR_BALANCED_PERF_MODE)
+               return;
+
+       mr_device_priv_data = scmd->device->hostdata;
+       atomic_dec(&mr_device_priv_data->sdev_priv_busy);
+}
+static inline int
+megasas_sdev_busy_read(struct megasas_instance *instance,
+                      struct scsi_cmnd *scmd)
+{
+       struct MR_PRIV_DEVICE *mr_device_priv_data;
+
+       if (instance->perf_mode != MR_BALANCED_PERF_MODE)
+               return 0;
+
+       mr_device_priv_data = scmd->device->hostdata;
+       return atomic_read(&mr_device_priv_data->sdev_priv_busy);
+}
+
+
 /**
  * megasas_get_cmd_fusion -    Get a command from the free pool
  * @instance:          Adapter soft state
@@ -357,15 +395,9 @@ megasas_get_msix_index(struct megasas_instance
*instance,
                       struct megasas_cmd_fusion *cmd,
                       u8 data_arms)
 {
-       int sdev_busy;
-
-       /* TBD - if sml remove device_busy in future, driver
-        * should track counter in internal structure.
-        */
-       sdev_busy = atomic_read(&scmd->device->device_busy);
-
        if (instance->perf_mode == MR_BALANCED_PERF_MODE &&
-           sdev_busy > (data_arms * MR_DEVICE_HIGH_IOPS_DEPTH)) {
+           (megasas_sdev_busy_read(instance, scmd) >
+           (data_arms * MR_DEVICE_HIGH_IOPS_DEPTH))) {
                cmd->request_desc->SCSIIO.MSIxIndex =
                        mega_mod64((atomic64_add_return(1,
&instance->high_iops_outstanding) /
                                        MR_HIGH_IOPS_BATCH_COUNT),
instance->low_latency_index_start);
@@ -3390,6 +3422,7 @@ megasas_build_and_issue_cmd_fusion(struct
megasas_instance *instance,
         * Issue the command to the FW
         */

+       megasas_sdev_busy_inc(instance, scmd);
        megasas_fire_cmd_fusion(instance, req_desc);

        if (r1_cmd)
@@ -3450,6 +3483,7 @@ megasas_complete_r1_command(struct megasas_instance
*instance,
                scmd_local->SCp.ptr = NULL;
                megasas_return_cmd_fusion(instance, cmd);
                scsi_dma_unmap(scmd_local);
+               megasas_sdev_busy_dec(instance, scmd_local);
                scmd_local->scsi_done(scmd_local);
        }
 }
@@ -3550,6 +3584,7 @@ complete_cmd_fusion(struct megasas_instance
*instance, u32 MSIxIndex,
                                scmd_local->SCp.ptr = NULL;
                                megasas_return_cmd_fusion(instance,
cmd_fusion);
                                scsi_dma_unmap(scmd_local);
+                               megasas_sdev_busy_dec(instance,
scmd_local);
                                scmd_local->scsi_done(scmd_local);
                        } else  /* Optimal VD - R1 FP command completion.
*/
                                megasas_complete_r1_command(instance,
cmd_fusion);
--
>
>
>
> Thanks,
> Ming

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 4169 bytes --]

^ permalink raw reply related	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2020-11-19  7:09 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-16  9:07 [PATCH V4 00/12] blk-mq/scsi: tracking device queue depth via sbitmap Ming Lei
2020-11-16  9:07 ` [PATCH V4 01/12] sbitmap: remove sbitmap_clear_bit_unlock Ming Lei
2020-11-16  9:07 ` [PATCH V4 02/12] sbitmap: maintain allocation round_robin in sbitmap Ming Lei
2020-11-16  9:07 ` [PATCH V4 03/12] sbitmap: add helpers for updating allocation hint Ming Lei
2020-11-16  9:07 ` [PATCH V4 04/12] sbitmap: move allocation hint into sbitmap Ming Lei
2020-11-16  9:33   ` Hannes Reinecke
2020-11-16  9:07 ` [PATCH V4 05/12] sbitmap: export sbitmap_weight Ming Lei
2020-11-16  9:38   ` Hannes Reinecke
2020-11-17  2:10     ` Ming Lei
2020-11-17  6:57       ` Hannes Reinecke
2020-11-17  7:53         ` Ming Lei
2020-11-17  9:18   ` John Garry
2020-11-16  9:07 ` [PATCH V4 06/12] sbitmap: add helper of sbitmap_calculate_shift Ming Lei
2020-11-16  9:07 ` [PATCH V4 07/12] blk-mq: add callbacks for storing & retrieving budget token Ming Lei
2020-11-16  9:07 ` [PATCH V4 08/12] blk-mq: return budget token from .get_budget callback Ming Lei
2020-11-16  9:45   ` Hannes Reinecke
2020-11-16  9:07 ` [PATCH V4 09/12] scsi: put hot fields of scsi_host_template into one cacheline Ming Lei
2020-11-16  9:07 ` [PATCH V4 10/12] scsi: add scsi_device_busy() to read sdev->device_busy Ming Lei
2020-11-16  9:07 ` [PATCH V4 11/12] scsi: make sure sdev->queue_depth is <= max(shost->can_queue, 1024) Ming Lei
2020-11-16  9:44   ` Hannes Reinecke
2020-11-17  2:18     ` Ming Lei
2020-11-17 16:42       ` Hannes Reinecke
2020-11-16  9:07 ` [PATCH V4 12/12] scsi: replace sdev->device_busy with sbitmap Ming Lei
2020-11-16 11:49   ` kernel test robot
2020-11-16 11:49     ` kernel test robot
2020-11-18  2:35     ` Ming Lei
2020-11-18  2:35       ` Ming Lei
2020-11-18  7:15       ` Hannes Reinecke
2020-11-18  7:15         ` Hannes Reinecke
2020-11-18  7:44         ` Ming Lei
2020-11-18  7:44           ` Ming Lei
2020-11-18  9:15           ` Hannes Reinecke
2020-11-18  9:15             ` Hannes Reinecke
2020-11-19  6:30             ` Ming Lei
2020-11-19  6:30               ` Ming Lei
2020-11-19  6:29           ` Kashyap Desai
2020-11-19  6:29             ` Kashyap Desai
2020-11-19  6:20       ` Kashyap Desai
2020-11-19  6:20         ` Kashyap Desai
2020-11-19  6:34         ` Ming Lei
2020-11-19  6:34           ` Ming Lei
2020-11-19  7:09           ` Kashyap Desai
2020-11-19  7:09             ` Kashyap Desai

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.