Linux-SCSI Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v3 0/7] Elevator cleanups and improvements
@ 2019-09-04  8:42 Damien Le Moal
  2019-09-04  8:42 ` [PATCH v3 1/7] block: Cleanup elevator_init_mq() use Damien Le Moal
                   ` (6 more replies)
  0 siblings, 7 replies; 18+ messages in thread
From: Damien Le Moal @ 2019-09-04  8:42 UTC (permalink / raw)
  To: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen

This patch series implements some cleanup of the elevator initialization
code and introduces elevator features identification and device matching
to enhance checks for elevator/device compatibility and fitness.

The first 2 patches of the series are simple cleanups which simplify 
elevator initialization for newly allocated device queues.

Patch 3 introduce elevator features, allowing a clean and extensible
definition of devices and features that an elevator supports and match
these against features required by a block device. With this, the sysfs
elevator list for a device always shows only elevators matching the
features that a particular device requires, with the exception of the
none elevator which has no features but is always available for use
with any device.

The first feature defined is for zoned block device sequential write
constraint support through zone write locking which prevents the use of
any elevator that does not support this feature with zoned devices.

The last 4 patches of this series rework the default elevator selection
and initialization to allow for the elevator/device features matching
to work, doing so addressing cases not currently well supported, namely,
multi-queue zoned block devices.

Changes from v2:
* Fixed patch 4
* Call elevator_init_mq() earlier in device_add_disk() as suggested by
  Christoph (patch 5)
* Fixed title of patch 7

Changes from v1:
* Addressed Johannes comments
* Rebased on newest for-next branch to include Ming's sysfs lock changes

Damien Le Moal (7):
  block: Cleanup elevator_init_mq() use
  block: Change elevator_init_mq() to always succeed
  block: Introduce elevator features
  block: Improve default elevator selection
  block: Delay default elevator initialization
  block: Set ELEVATOR_F_ZBD_SEQ_WRITE for nullblk zoned disks
  sd: Set ELEVATOR_F_ZBD_SEQ_WRITE for ZBC disks

 block/blk-mq.c                |  10 ---
 block/blk-settings.c          |  16 ++++
 block/blk.h                   |   2 +-
 block/elevator.c              | 137 ++++++++++++++++++++++++++--------
 block/genhd.c                 |   8 ++
 block/mq-deadline.c           |   1 +
 drivers/block/null_blk_main.c |   2 +
 drivers/scsi/sd_zbc.c         |   2 +
 include/linux/blkdev.h        |   4 +
 include/linux/elevator.h      |   8 ++
 10 files changed, 148 insertions(+), 42 deletions(-)

-- 
2.21.0


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v3 1/7] block: Cleanup elevator_init_mq() use
  2019-09-04  8:42 [PATCH v3 0/7] Elevator cleanups and improvements Damien Le Moal
@ 2019-09-04  8:42 ` Damien Le Moal
  2019-09-04  9:02   ` Ming Lei
  2019-09-04  8:42 ` [PATCH v3 2/7] block: Change elevator_init_mq() to always succeed Damien Le Moal
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 18+ messages in thread
From: Damien Le Moal @ 2019-09-04  8:42 UTC (permalink / raw)
  To: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen

Instead of checking a queue tag_set BLK_MQ_F_NO_SCHED flag before
calling elevator_init_mq() to make sure that the queue supports IO
scheduling, use the elevator.c function elv_support_iosched() in
elevator_init_mq(). This does not introduce any functional change but
ensure that elevator_init_mq() does the right thing based on the queue
settings.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-mq.c   |  8 +++-----
 block/elevator.c | 23 +++++++++++++----------
 2 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index b622029b19ea..13923630e00a 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2904,11 +2904,9 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
 	blk_mq_add_queue_tag_set(set, q);
 	blk_mq_map_swqueue(q);
 
-	if (!(set->flags & BLK_MQ_F_NO_SCHED)) {
-		ret = elevator_init_mq(q);
-		if (ret)
-			goto err_tag_set;
-	}
+	ret = elevator_init_mq(q);
+	if (ret)
+		goto err_tag_set;
 
 	return q;
 
diff --git a/block/elevator.c b/block/elevator.c
index 86100de88883..4721834815bb 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -619,16 +619,26 @@ int elevator_switch_mq(struct request_queue *q,
 	return ret;
 }
 
+static inline bool elv_support_iosched(struct request_queue *q)
+{
+	if (q->tag_set && (q->tag_set->flags & BLK_MQ_F_NO_SCHED))
+		return false;
+	return true;
+}
+
 /*
- * For blk-mq devices, we default to using mq-deadline, if available, for single
- * queue devices.  If deadline isn't available OR we have multiple queues,
- * default to "none".
+ * For blk-mq devices supporting IO scheduling, we default to using mq-deadline,
+ * if available, for single queue devices. If deadline isn't available OR we
+ * have multiple queues, default to "none".
  */
 int elevator_init_mq(struct request_queue *q)
 {
 	struct elevator_type *e;
 	int err = 0;
 
+	if (!elv_support_iosched(q))
+		return 0;
+
 	if (q->nr_hw_queues != 1)
 		return 0;
 
@@ -706,13 +716,6 @@ static int __elevator_change(struct request_queue *q, const char *name)
 	return elevator_switch(q, e);
 }
 
-static inline bool elv_support_iosched(struct request_queue *q)
-{
-	if (q->tag_set && (q->tag_set->flags & BLK_MQ_F_NO_SCHED))
-		return false;
-	return true;
-}
-
 ssize_t elv_iosched_store(struct request_queue *q, const char *name,
 			  size_t count)
 {
-- 
2.21.0


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v3 2/7] block: Change elevator_init_mq() to always succeed
  2019-09-04  8:42 [PATCH v3 0/7] Elevator cleanups and improvements Damien Le Moal
  2019-09-04  8:42 ` [PATCH v3 1/7] block: Cleanup elevator_init_mq() use Damien Le Moal
@ 2019-09-04  8:42 ` Damien Le Moal
  2019-09-04  9:05   ` Ming Lei
  2019-09-04  8:42 ` [PATCH v3 3/7] block: Introduce elevator features Damien Le Moal
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 18+ messages in thread
From: Damien Le Moal @ 2019-09-04  8:42 UTC (permalink / raw)
  To: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen

If the default elevator chosen is mq-deadline, elevator_init_mq() may
return an error if mq-deadline initialization fails, leading to
blk_mq_init_allocated_queue() returning an error, which in turn will
cause the block device initialization to fail and the device not being
exposed.

Instead of taking such extreme measure, handle mq-deadline
initialization failures in the same manner as when mq-deadline is not
available (no module to load), that is, default to the "none" scheduler.
With this change, elevator_init_mq() return type can be changed to void.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-mq.c   |  8 +-------
 block/blk.h      |  2 +-
 block/elevator.c | 23 ++++++++++++-----------
 3 files changed, 14 insertions(+), 19 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 13923630e00a..ee4caf0c0807 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2842,8 +2842,6 @@ static unsigned int nr_hw_queues(struct blk_mq_tag_set *set)
 struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
 						  struct request_queue *q)
 {
-	int ret = -ENOMEM;
-
 	/* mark the queue as mq asap */
 	q->mq_ops = set->ops;
 
@@ -2904,14 +2902,10 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
 	blk_mq_add_queue_tag_set(set, q);
 	blk_mq_map_swqueue(q);
 
-	ret = elevator_init_mq(q);
-	if (ret)
-		goto err_tag_set;
+	elevator_init_mq(q);
 
 	return q;
 
-err_tag_set:
-	blk_mq_del_queue_tag_set(q);
 err_hctxs:
 	kfree(q->queue_hw_ctx);
 	q->nr_hw_queues = 0;
diff --git a/block/blk.h b/block/blk.h
index e4619fc5c99a..ed347f7a97b1 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -184,7 +184,7 @@ void blk_account_io_done(struct request *req, u64 now);
 
 void blk_insert_flush(struct request *rq);
 
-int elevator_init_mq(struct request_queue *q);
+void elevator_init_mq(struct request_queue *q);
 int elevator_switch_mq(struct request_queue *q,
 			      struct elevator_type *new_e);
 void __elevator_exit(struct request_queue *, struct elevator_queue *);
diff --git a/block/elevator.c b/block/elevator.c
index 4721834815bb..2944c129760c 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -628,34 +628,35 @@ static inline bool elv_support_iosched(struct request_queue *q)
 
 /*
  * For blk-mq devices supporting IO scheduling, we default to using mq-deadline,
- * if available, for single queue devices. If deadline isn't available OR we
- * have multiple queues, default to "none".
+ * if available, for single queue devices. If deadline isn't available OR
+ * deadline initialization fails OR we have multiple queues, default to "none".
  */
-int elevator_init_mq(struct request_queue *q)
+void elevator_init_mq(struct request_queue *q)
 {
 	struct elevator_type *e;
-	int err = 0;
+	int err;
 
 	if (!elv_support_iosched(q))
-		return 0;
+		return;
 
 	if (q->nr_hw_queues != 1)
-		return 0;
+		return;
 
 	WARN_ON_ONCE(test_bit(QUEUE_FLAG_REGISTERED, &q->queue_flags));
 
 	if (unlikely(q->elevator))
-		goto out;
+		return;
 
 	e = elevator_get(q, "mq-deadline", false);
 	if (!e)
-		goto out;
+		return;
 
 	err = blk_mq_init_sched(q, e);
-	if (err)
+	if (err) {
+		pr_warn("\"%s\" elevator initialization failed, "
+			"falling back to \"none\"\n", e->elevator_name);
 		elevator_put(e);
-out:
-	return err;
+	}
 }
 
 
-- 
2.21.0


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v3 3/7] block: Introduce elevator features
  2019-09-04  8:42 [PATCH v3 0/7] Elevator cleanups and improvements Damien Le Moal
  2019-09-04  8:42 ` [PATCH v3 1/7] block: Cleanup elevator_init_mq() use Damien Le Moal
  2019-09-04  8:42 ` [PATCH v3 2/7] block: Change elevator_init_mq() to always succeed Damien Le Moal
@ 2019-09-04  8:42 ` Damien Le Moal
  2019-09-04  8:42 ` [PATCH v3 4/7] block: Improve default elevator selection Damien Le Moal
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: Damien Le Moal @ 2019-09-04  8:42 UTC (permalink / raw)
  To: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen

Introduce the definition of elevator features through the
elevator_features flags in the elevator_type structure. Each flag can
represent a feature supported by an elevator. The first feature defined
by this patch is support for zoned block device sequential write
constraint with the flag ELEVATOR_F_ZBD_SEQ_WRITE, which is implemented
by the mq-deadline elevator using zone write locking.

Other possible features are IO priorities, write hints, latency targets
or single-LUN dual-actuator disks (for which the elevator could maintain
one LBA ordered list per actuator).

The required_elevator_features field is also added to the request_queue
structure to allow a device driver to specify elevator feature flags
that an elevator must support for the correct operation of the device
(e.g. device drivers for zoned block devices can have the
ELEVATOR_F_ZBD_SEQ_WRITE flag as a required feature).
The helper function blk_queue_required_elevator_features() is
defined for setting this new field.

With these two new fields in place, the elevator functions
elevator_match() and elevator_find() are modified to allow a user to set
only an elevator with a set of features that satisfies the device
required features. Elevators not matching the device requirements are
not shown in the device sysfs queue/scheduler file to prevent their use.

The "none" elevator can always be selected as before.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-settings.c     | 16 +++++++++++++
 block/elevator.c         | 49 +++++++++++++++++++++++++++++++---------
 block/mq-deadline.c      |  1 +
 include/linux/blkdev.h   |  4 ++++
 include/linux/elevator.h |  8 +++++++
 5 files changed, 67 insertions(+), 11 deletions(-)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index a058997b9cce..6bd1e3b082d8 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -832,6 +832,22 @@ void blk_queue_write_cache(struct request_queue *q, bool wc, bool fua)
 }
 EXPORT_SYMBOL_GPL(blk_queue_write_cache);
 
+/**
+ * blk_queue_required_elevator_features - Set a queue required elevator features
+ * @q:		the request queue for the target device
+ * @features:	Required elevator features OR'ed together
+ *
+ * Tell the block layer that for the device controlled through @q, only the
+ * only elevators that can be used are those that implement at least the set of
+ * features specified by @features.
+ */
+void blk_queue_required_elevator_features(struct request_queue *q,
+					  unsigned int features)
+{
+	q->required_elevator_features = features;
+}
+EXPORT_SYMBOL_GPL(blk_queue_required_elevator_features);
+
 static int __init blk_settings_init(void)
 {
 	blk_max_low_pfn = max_low_pfn - 1;
diff --git a/block/elevator.c b/block/elevator.c
index 2944c129760c..ac7c8ad580ba 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -83,8 +83,26 @@ bool elv_bio_merge_ok(struct request *rq, struct bio *bio)
 }
 EXPORT_SYMBOL(elv_bio_merge_ok);
 
-static bool elevator_match(const struct elevator_type *e, const char *name)
+static inline bool elv_support_features(unsigned int elv_features,
+					unsigned int required_features)
 {
+	return (required_features & elv_features) == required_features;
+}
+
+/**
+ * elevator_match - Test an elevator name and features
+ * @e: Scheduler to test
+ * @name: Elevator name to test
+ * @required_features: Features that the elevator must provide
+ *
+ * Return true is the elevator @e name matches @name and if @e provides all the
+ * the feratures spcified by @required_features.
+ */
+static bool elevator_match(const struct elevator_type *e, const char *name,
+			   unsigned int required_features)
+{
+	if (!elv_support_features(e->elevator_features, required_features))
+		return false;
 	if (!strcmp(e->elevator_name, name))
 		return true;
 	if (e->elevator_alias && !strcmp(e->elevator_alias, name))
@@ -93,15 +111,21 @@ static bool elevator_match(const struct elevator_type *e, const char *name)
 	return false;
 }
 
-/*
- * Return scheduler with name 'name'
+/**
+ * elevator_find - Find an elevator
+ * @name: Name of the elevator to find
+ * @required_features: Features that the elevator must provide
+ *
+ * Return the first registered scheduler with name @name and supporting the
+ * features @required_features and NULL otherwise.
  */
-static struct elevator_type *elevator_find(const char *name)
+static struct elevator_type *elevator_find(const char *name,
+					   unsigned int required_features)
 {
 	struct elevator_type *e;
 
 	list_for_each_entry(e, &elv_list, list) {
-		if (elevator_match(e, name))
+		if (elevator_match(e, name, required_features))
 			return e;
 	}
 
@@ -120,12 +144,12 @@ static struct elevator_type *elevator_get(struct request_queue *q,
 
 	spin_lock(&elv_list_lock);
 
-	e = elevator_find(name);
+	e = elevator_find(name, q->required_elevator_features);
 	if (!e && try_loading) {
 		spin_unlock(&elv_list_lock);
 		request_module("%s-iosched", name);
 		spin_lock(&elv_list_lock);
-		e = elevator_find(name);
+		e = elevator_find(name, q->required_elevator_features);
 	}
 
 	if (e && !try_module_get(e->elevator_owner))
@@ -525,7 +549,7 @@ int elv_register(struct elevator_type *e)
 
 	/* register, don't allow duplicate names */
 	spin_lock(&elv_list_lock);
-	if (elevator_find(e->elevator_name)) {
+	if (elevator_find(e->elevator_name, 0)) {
 		spin_unlock(&elv_list_lock);
 		kmem_cache_destroy(e->icq_cache);
 		return -EBUSY;
@@ -709,7 +733,8 @@ static int __elevator_change(struct request_queue *q, const char *name)
 	if (!e)
 		return -EINVAL;
 
-	if (q->elevator && elevator_match(q->elevator->type, elevator_name)) {
+	if (q->elevator &&
+	    elevator_match(q->elevator->type, elevator_name, 0)) {
 		elevator_put(e);
 		return 0;
 	}
@@ -749,11 +774,13 @@ ssize_t elv_iosched_show(struct request_queue *q, char *name)
 
 	spin_lock(&elv_list_lock);
 	list_for_each_entry(__e, &elv_list, list) {
-		if (elv && elevator_match(elv, __e->elevator_name)) {
+		if (elv && elevator_match(elv, __e->elevator_name, 0)) {
 			len += sprintf(name+len, "[%s] ", elv->elevator_name);
 			continue;
 		}
-		if (elv_support_iosched(q))
+		if (elv_support_iosched(q) &&
+		    elevator_match(__e, __e->elevator_name,
+				   q->required_elevator_features))
 			len += sprintf(name+len, "%s ", __e->elevator_name);
 	}
 	spin_unlock(&elv_list_lock);
diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index 35e84bc0ec8c..b490f47fd553 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -794,6 +794,7 @@ static struct elevator_type mq_deadline = {
 	.elevator_attrs = deadline_attrs,
 	.elevator_name = "mq-deadline",
 	.elevator_alias = "deadline",
+	.elevator_features = ELEVATOR_F_ZBD_SEQ_WRITE,
 	.elevator_owner = THIS_MODULE,
 };
 MODULE_ALIAS("mq-deadline-iosched");
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index d0ad21e4771b..b196124e3240 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -496,6 +496,8 @@ struct request_queue {
 
 	struct queue_limits	limits;
 
+	unsigned int		required_elevator_features;
+
 #ifdef CONFIG_BLK_DEV_ZONED
 	/*
 	 * Zoned block device information for request dispatch control.
@@ -1097,6 +1099,8 @@ extern void blk_queue_dma_alignment(struct request_queue *, int);
 extern void blk_queue_update_dma_alignment(struct request_queue *, int);
 extern void blk_queue_rq_timeout(struct request_queue *, unsigned int);
 extern void blk_queue_write_cache(struct request_queue *q, bool enabled, bool fua);
+extern void blk_queue_required_elevator_features(struct request_queue *q,
+						 unsigned int features);
 
 /*
  * Number of physical segments as sent to the device.
diff --git a/include/linux/elevator.h b/include/linux/elevator.h
index 1dd014c9c87b..901bda352dcb 100644
--- a/include/linux/elevator.h
+++ b/include/linux/elevator.h
@@ -76,6 +76,7 @@ struct elevator_type
 	struct elv_fs_entry *elevator_attrs;
 	const char *elevator_name;
 	const char *elevator_alias;
+	const unsigned int elevator_features;
 	struct module *elevator_owner;
 #ifdef CONFIG_BLK_DEBUG_FS
 	const struct blk_mq_debugfs_attr *queue_debugfs_attrs;
@@ -165,5 +166,12 @@ extern struct request *elv_rb_find(struct rb_root *, sector_t);
 #define rq_entry_fifo(ptr)	list_entry((ptr), struct request, queuelist)
 #define rq_fifo_clear(rq)	list_del_init(&(rq)->queuelist)
 
+/*
+ * Elevator features.
+ */
+
+/* Supports zoned block devices sequential write constraint */
+#define ELEVATOR_F_ZBD_SEQ_WRITE	(1U << 0)
+
 #endif /* CONFIG_BLOCK */
 #endif
-- 
2.21.0


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v3 4/7] block: Improve default elevator selection
  2019-09-04  8:42 [PATCH v3 0/7] Elevator cleanups and improvements Damien Le Moal
                   ` (2 preceding siblings ...)
  2019-09-04  8:42 ` [PATCH v3 3/7] block: Introduce elevator features Damien Le Moal
@ 2019-09-04  8:42 ` Damien Le Moal
  2019-09-04  8:51   ` Johannes Thumshirn
  2019-09-04  8:42 ` [PATCH v3 5/7] block: Delay default elevator initialization Damien Le Moal
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 18+ messages in thread
From: Damien Le Moal @ 2019-09-04  8:42 UTC (permalink / raw)
  To: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen

For block devices that do not specify required features, preserve the
current default elevator selection (mq-deadline for single queue
devices, none for multi-queue devices). However, for devices specifying
required features (e.g. zoned block devices ELEVATOR_F_ZBD_SEQ_WRITE
feature), select the first available elevator providing the required
features.

In all cases, default to "none" if no elevator is available or if the
initialization of the default elevator fails.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 block/elevator.c | 51 +++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 44 insertions(+), 7 deletions(-)

diff --git a/block/elevator.c b/block/elevator.c
index ac7c8ad580ba..520d6b224b74 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -651,9 +651,46 @@ static inline bool elv_support_iosched(struct request_queue *q)
 }
 
 /*
- * For blk-mq devices supporting IO scheduling, we default to using mq-deadline,
- * if available, for single queue devices. If deadline isn't available OR
- * deadline initialization fails OR we have multiple queues, default to "none".
+ * For single queue devices, default to using mq-deadline. If we have multiple
+ * queues or mq-deadline is not available, default to "none".
+ */
+static struct elevator_type *elevator_get_default(struct request_queue *q)
+{
+	if (q->nr_hw_queues != 1)
+		return NULL;
+
+	return elevator_get(q, "mq-deadline", false);
+}
+
+/*
+ * Get the first elevator providing the features required by the request queue.
+ * Default to "none" if no matching elevator is found.
+ */
+static struct elevator_type *elevator_get_by_features(struct request_queue *q)
+{
+	struct elevator_type *e;
+
+	spin_lock(&elv_list_lock);
+
+	list_for_each_entry(e, &elv_list, list) {
+		if (elv_support_features(e->elevator_features,
+					 q->required_elevator_features))
+			break;
+	}
+
+	if (e && !try_module_get(e->elevator_owner))
+		e = NULL;
+
+	spin_unlock(&elv_list_lock);
+
+	return e;
+}
+
+/*
+ * For a device queue that has no required features, use the default elevator
+ * settings. Otherwise, use the first elevator available matching the required
+ * features. If no suitable elevator is find or if the chosen elevator
+ * initialization fails, fall back to the "none" elevator (no elevator).
  */
 void elevator_init_mq(struct request_queue *q)
 {
@@ -663,15 +700,15 @@ void elevator_init_mq(struct request_queue *q)
 	if (!elv_support_iosched(q))
 		return;
 
-	if (q->nr_hw_queues != 1)
-		return;
-
 	WARN_ON_ONCE(test_bit(QUEUE_FLAG_REGISTERED, &q->queue_flags));
 
 	if (unlikely(q->elevator))
 		return;
 
-	e = elevator_get(q, "mq-deadline", false);
+	if (!q->required_elevator_features)
+		e = elevator_get_default(q);
+	else
+		e = elevator_get_by_features(q);
 	if (!e)
 		return;
 
-- 
2.21.0


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v3 5/7] block: Delay default elevator initialization
  2019-09-04  8:42 [PATCH v3 0/7] Elevator cleanups and improvements Damien Le Moal
                   ` (3 preceding siblings ...)
  2019-09-04  8:42 ` [PATCH v3 4/7] block: Improve default elevator selection Damien Le Moal
@ 2019-09-04  8:42 ` Damien Le Moal
  2019-09-04  8:56   ` Johannes Thumshirn
  2019-09-04  9:29   ` Ming Lei
  2019-09-04  8:42 ` [PATCH v3 6/7] block: Set ELEVATOR_F_ZBD_SEQ_WRITE for nullblk zoned disks Damien Le Moal
  2019-09-04  8:42 ` [PATCH v3 7/7] sd: Set ELEVATOR_F_ZBD_SEQ_WRITE for ZBC disks Damien Le Moal
  6 siblings, 2 replies; 18+ messages in thread
From: Damien Le Moal @ 2019-09-04  8:42 UTC (permalink / raw)
  To: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen

When elevator_init_mq() is called from blk_mq_init_allocated_queue(),
the only information known about the device is the number of hardware
queues as the block device scan by the device driver is not completed
yet. The device type and the device required features are not set yet,
preventing to correctly choose the default elevator most suitable for
the device.

This currently affects all multi-queue zoned block devices which default
to the "none" elevator instead of the required "mq-deadline" elevator.
These drives currently include host-managed SMR disks connected to a
smartpqi HBA and null_blk block devices with zoned mode enabled.
Upcoming NVMe Zoned Namespace devices will also be affected.

Fix this by moving the execution of elevator_init_mq() from
blk_mq_init_allocated_queue() into __device_add_disk() to allow for the
device driver to probe the device characteristics and set attributes
of the device request queue prior to the elevator initialization.

Also to make sure that the elevator initialization is never done while
requests are in-flight (there should be none when the device driver
calls device_add_disk()), freeze and quiesce the device request queue
before executing blk_mq_init_sched().

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 block/blk-mq.c   | 2 --
 block/elevator.c | 7 +++++++
 block/genhd.c    | 8 ++++++++
 3 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index ee4caf0c0807..a37503984206 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2902,8 +2902,6 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
 	blk_mq_add_queue_tag_set(set, q);
 	blk_mq_map_swqueue(q);
 
-	elevator_init_mq(q);
-
 	return q;
 
 err_hctxs:
diff --git a/block/elevator.c b/block/elevator.c
index 520d6b224b74..096a670d22d7 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -712,7 +712,14 @@ void elevator_init_mq(struct request_queue *q)
 	if (!e)
 		return;
 
+	blk_mq_freeze_queue(q);
+	blk_mq_quiesce_queue(q);
+
 	err = blk_mq_init_sched(q, e);
+
+	blk_mq_unquiesce_queue(q);
+	blk_mq_unfreeze_queue(q);
+
 	if (err) {
 		pr_warn("\"%s\" elevator initialization failed, "
 			"falling back to \"none\"\n", e->elevator_name);
diff --git a/block/genhd.c b/block/genhd.c
index 54f1f0d381f4..7380dd7b2257 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -695,6 +695,13 @@ static void __device_add_disk(struct device *parent, struct gendisk *disk,
 	dev_t devt;
 	int retval;
 
+	/*
+	 * The disk queue should now be all set with enough information about
+	 * the device for the elevator code to pick an adequate default
+	 * elevator.
+	 */
+	elevator_init_mq(disk->queue);
+
 	/* minors == 0 indicates to use ext devt from part0 and should
 	 * be accompanied with EXT_DEVT flag.  Make sure all
 	 * parameters make sense.
@@ -734,6 +741,7 @@ static void __device_add_disk(struct device *parent, struct gendisk *disk,
 				    exact_match, exact_lock, disk);
 	}
 	register_disk(parent, disk, groups);
+
 	if (register_queue)
 		blk_register_queue(disk);
 
-- 
2.21.0


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v3 6/7] block: Set ELEVATOR_F_ZBD_SEQ_WRITE for nullblk zoned disks
  2019-09-04  8:42 [PATCH v3 0/7] Elevator cleanups and improvements Damien Le Moal
                   ` (4 preceding siblings ...)
  2019-09-04  8:42 ` [PATCH v3 5/7] block: Delay default elevator initialization Damien Le Moal
@ 2019-09-04  8:42 ` Damien Le Moal
  2019-09-04  8:52   ` Johannes Thumshirn
  2019-09-04  8:42 ` [PATCH v3 7/7] sd: Set ELEVATOR_F_ZBD_SEQ_WRITE for ZBC disks Damien Le Moal
  6 siblings, 1 reply; 18+ messages in thread
From: Damien Le Moal @ 2019-09-04  8:42 UTC (permalink / raw)
  To: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen

Using the helper blk_queue_required_elevator_features(), set the
elevator feature ELEVATOR_F_ZBD_SEQ_WRITE as required for the request
queue of null_blk devices created with zoned mode enabled.

This feature requirement can always be satisfied as the mq-deadline
elevator is always selected for in-kernel compilation when
CONFIG_BLK_DEV_ZONED (zoned block device support) is enabled.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/block/null_blk_main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/block/null_blk_main.c b/drivers/block/null_blk_main.c
index b26a178d064d..b29b273690b0 100644
--- a/drivers/block/null_blk_main.c
+++ b/drivers/block/null_blk_main.c
@@ -1695,6 +1695,8 @@ static int null_add_dev(struct nullb_device *dev)
 		blk_queue_chunk_sectors(nullb->q, dev->zone_size_sects);
 		nullb->q->limits.zoned = BLK_ZONED_HM;
 		blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, nullb->q);
+		blk_queue_required_elevator_features(nullb->q,
+						ELEVATOR_F_ZBD_SEQ_WRITE);
 	}
 
 	nullb->q->queuedata = nullb;
-- 
2.21.0


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v3 7/7] sd: Set ELEVATOR_F_ZBD_SEQ_WRITE for ZBC disks
  2019-09-04  8:42 [PATCH v3 0/7] Elevator cleanups and improvements Damien Le Moal
                   ` (5 preceding siblings ...)
  2019-09-04  8:42 ` [PATCH v3 6/7] block: Set ELEVATOR_F_ZBD_SEQ_WRITE for nullblk zoned disks Damien Le Moal
@ 2019-09-04  8:42 ` Damien Le Moal
  2019-09-04  8:52   ` Johannes Thumshirn
  6 siblings, 1 reply; 18+ messages in thread
From: Damien Le Moal @ 2019-09-04  8:42 UTC (permalink / raw)
  To: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen

Using the helper blk_queue_required_elevator_features(), set the
elevator feature ELEVATOR_F_ZBD_SEQ_WRITE as required for the request
queue of SCSI ZBC disks.

This feature requirement can always be satisfied as the mq-deadline
elevator is always selected for in-kernel compilation when
CONFIG_BLK_DEV_ZONED (zoned block device support) is enabled.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/sd_zbc.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 367614f0e34f..de4019dc0f0b 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -493,6 +493,8 @@ int sd_zbc_read_zones(struct scsi_disk *sdkp, unsigned char *buf)
 	blk_queue_chunk_sectors(sdkp->disk->queue,
 			logical_to_sectors(sdkp->device, zone_blocks));
 	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, sdkp->disk->queue);
+	blk_queue_required_elevator_features(sdkp->disk->queue,
+					     ELEVATOR_F_ZBD_SEQ_WRITE);
 	nr_zones = round_up(sdkp->capacity, zone_blocks) >> ilog2(zone_blocks);
 
 	/* READ16/WRITE16 is mandatory for ZBC disks */
-- 
2.21.0


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 4/7] block: Improve default elevator selection
  2019-09-04  8:42 ` [PATCH v3 4/7] block: Improve default elevator selection Damien Le Moal
@ 2019-09-04  8:51   ` Johannes Thumshirn
  0 siblings, 0 replies; 18+ messages in thread
From: Johannes Thumshirn @ 2019-09-04  8:51 UTC (permalink / raw)
  To: Damien Le Moal, linux-block, Jens Axboe, linux-scsi, Martin K . Petersen

Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
-- 
Johannes Thumshirn                            SUSE Labs Filesystems
jthumshirn@suse.de                                +49 911 74053 689
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5
90409 Nürnberg
Germany
(HRB 247165, AG München)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 6/7] block: Set ELEVATOR_F_ZBD_SEQ_WRITE for nullblk zoned disks
  2019-09-04  8:42 ` [PATCH v3 6/7] block: Set ELEVATOR_F_ZBD_SEQ_WRITE for nullblk zoned disks Damien Le Moal
@ 2019-09-04  8:52   ` Johannes Thumshirn
  0 siblings, 0 replies; 18+ messages in thread
From: Johannes Thumshirn @ 2019-09-04  8:52 UTC (permalink / raw)
  To: Damien Le Moal, linux-block, Jens Axboe, linux-scsi, Martin K . Petersen

Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
-- 
Johannes Thumshirn                            SUSE Labs Filesystems
jthumshirn@suse.de                                +49 911 74053 689
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5
90409 Nürnberg
Germany
(HRB 247165, AG München)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 7/7] sd: Set ELEVATOR_F_ZBD_SEQ_WRITE for ZBC disks
  2019-09-04  8:42 ` [PATCH v3 7/7] sd: Set ELEVATOR_F_ZBD_SEQ_WRITE for ZBC disks Damien Le Moal
@ 2019-09-04  8:52   ` Johannes Thumshirn
  0 siblings, 0 replies; 18+ messages in thread
From: Johannes Thumshirn @ 2019-09-04  8:52 UTC (permalink / raw)
  To: Damien Le Moal, linux-block, Jens Axboe, linux-scsi, Martin K . Petersen

Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

-- 
Johannes Thumshirn                            SUSE Labs Filesystems
jthumshirn@suse.de                                +49 911 74053 689
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5
90409 Nürnberg
Germany
(HRB 247165, AG München)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 5/7] block: Delay default elevator initialization
  2019-09-04  8:42 ` [PATCH v3 5/7] block: Delay default elevator initialization Damien Le Moal
@ 2019-09-04  8:56   ` Johannes Thumshirn
  2019-09-04  9:02     ` Damien Le Moal
  2019-09-04  9:29   ` Ming Lei
  1 sibling, 1 reply; 18+ messages in thread
From: Johannes Thumshirn @ 2019-09-04  8:56 UTC (permalink / raw)
  To: Damien Le Moal, linux-block, Jens Axboe, linux-scsi, Martin K . Petersen

On 04/09/2019 10:42, Damien Le Moal wrote:
> @@ -734,6 +741,7 @@ static void __device_add_disk(struct device *parent, struct gendisk *disk,
>  				    exact_match, exact_lock, disk);
>  	}
>  	register_disk(parent, disk, groups);
> +
>  	if (register_queue)
>  		blk_register_queue(disk);

That hunk looks unrelated, but anyways:
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

-- 
Johannes Thumshirn                            SUSE Labs Filesystems
jthumshirn@suse.de                                +49 911 74053 689
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5
90409 Nürnberg
Germany
(HRB 247165, AG München)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 1/7] block: Cleanup elevator_init_mq() use
  2019-09-04  8:42 ` [PATCH v3 1/7] block: Cleanup elevator_init_mq() use Damien Le Moal
@ 2019-09-04  9:02   ` Ming Lei
  0 siblings, 0 replies; 18+ messages in thread
From: Ming Lei @ 2019-09-04  9:02 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen

On Wed, Sep 04, 2019 at 05:42:41PM +0900, Damien Le Moal wrote:
> Instead of checking a queue tag_set BLK_MQ_F_NO_SCHED flag before
> calling elevator_init_mq() to make sure that the queue supports IO
> scheduling, use the elevator.c function elv_support_iosched() in
> elevator_init_mq(). This does not introduce any functional change but
> ensure that elevator_init_mq() does the right thing based on the queue
> settings.
> 
> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> ---
>  block/blk-mq.c   |  8 +++-----
>  block/elevator.c | 23 +++++++++++++----------
>  2 files changed, 16 insertions(+), 15 deletions(-)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index b622029b19ea..13923630e00a 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2904,11 +2904,9 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
>  	blk_mq_add_queue_tag_set(set, q);
>  	blk_mq_map_swqueue(q);
>  
> -	if (!(set->flags & BLK_MQ_F_NO_SCHED)) {
> -		ret = elevator_init_mq(q);
> -		if (ret)
> -			goto err_tag_set;
> -	}
> +	ret = elevator_init_mq(q);
> +	if (ret)
> +		goto err_tag_set;
>  
>  	return q;
>  
> diff --git a/block/elevator.c b/block/elevator.c
> index 86100de88883..4721834815bb 100644
> --- a/block/elevator.c
> +++ b/block/elevator.c
> @@ -619,16 +619,26 @@ int elevator_switch_mq(struct request_queue *q,
>  	return ret;
>  }
>  
> +static inline bool elv_support_iosched(struct request_queue *q)
> +{
> +	if (q->tag_set && (q->tag_set->flags & BLK_MQ_F_NO_SCHED))
> +		return false;
> +	return true;
> +}
> +
>  /*
> - * For blk-mq devices, we default to using mq-deadline, if available, for single
> - * queue devices.  If deadline isn't available OR we have multiple queues,
> - * default to "none".
> + * For blk-mq devices supporting IO scheduling, we default to using mq-deadline,
> + * if available, for single queue devices. If deadline isn't available OR we
> + * have multiple queues, default to "none".
>   */
>  int elevator_init_mq(struct request_queue *q)
>  {
>  	struct elevator_type *e;
>  	int err = 0;
>  
> +	if (!elv_support_iosched(q))
> +		return 0;
> +
>  	if (q->nr_hw_queues != 1)
>  		return 0;
>  
> @@ -706,13 +716,6 @@ static int __elevator_change(struct request_queue *q, const char *name)
>  	return elevator_switch(q, e);
>  }
>  
> -static inline bool elv_support_iosched(struct request_queue *q)
> -{
> -	if (q->tag_set && (q->tag_set->flags & BLK_MQ_F_NO_SCHED))
> -		return false;
> -	return true;
> -}
> -
>  ssize_t elv_iosched_store(struct request_queue *q, const char *name,
>  			  size_t count)
>  {
> -- 
> 2.21.0
> 

Reviewed-by: Ming Lei <ming.lei@redhat.com>

-- 
Ming

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 5/7] block: Delay default elevator initialization
  2019-09-04  8:56   ` Johannes Thumshirn
@ 2019-09-04  9:02     ` Damien Le Moal
  2019-09-04 12:56       ` Jens Axboe
  0 siblings, 1 reply; 18+ messages in thread
From: Damien Le Moal @ 2019-09-04  9:02 UTC (permalink / raw)
  To: Johannes Thumshirn, linux-block, Jens Axboe, linux-scsi,
	Martin K . Petersen

On 2019/09/04 17:56, Johannes Thumshirn wrote:
> On 04/09/2019 10:42, Damien Le Moal wrote:
>> @@ -734,6 +741,7 @@ static void __device_add_disk(struct device *parent, struct gendisk *disk,
>>  				    exact_match, exact_lock, disk);
>>  	}
>>  	register_disk(parent, disk, groups);
>> +
>>  	if (register_queue)
>>  		blk_register_queue(disk);
> 
> That hunk looks unrelated, but anyways:
> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

Oops. Yes, did not delete the blank line when I moved elevator_init_mq() call.
Jens, should I resend a v4 to fix this ?

-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 2/7] block: Change elevator_init_mq() to always succeed
  2019-09-04  8:42 ` [PATCH v3 2/7] block: Change elevator_init_mq() to always succeed Damien Le Moal
@ 2019-09-04  9:05   ` Ming Lei
  0 siblings, 0 replies; 18+ messages in thread
From: Ming Lei @ 2019-09-04  9:05 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen

On Wed, Sep 04, 2019 at 05:42:42PM +0900, Damien Le Moal wrote:
> If the default elevator chosen is mq-deadline, elevator_init_mq() may
> return an error if mq-deadline initialization fails, leading to
> blk_mq_init_allocated_queue() returning an error, which in turn will
> cause the block device initialization to fail and the device not being
> exposed.
> 
> Instead of taking such extreme measure, handle mq-deadline
> initialization failures in the same manner as when mq-deadline is not
> available (no module to load), that is, default to the "none" scheduler.
> With this change, elevator_init_mq() return type can be changed to void.
> 
> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> ---
>  block/blk-mq.c   |  8 +-------
>  block/blk.h      |  2 +-
>  block/elevator.c | 23 ++++++++++++-----------
>  3 files changed, 14 insertions(+), 19 deletions(-)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 13923630e00a..ee4caf0c0807 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2842,8 +2842,6 @@ static unsigned int nr_hw_queues(struct blk_mq_tag_set *set)
>  struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
>  						  struct request_queue *q)
>  {
> -	int ret = -ENOMEM;
> -
>  	/* mark the queue as mq asap */
>  	q->mq_ops = set->ops;
>  
> @@ -2904,14 +2902,10 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
>  	blk_mq_add_queue_tag_set(set, q);
>  	blk_mq_map_swqueue(q);
>  
> -	ret = elevator_init_mq(q);
> -	if (ret)
> -		goto err_tag_set;
> +	elevator_init_mq(q);
>  
>  	return q;
>  
> -err_tag_set:
> -	blk_mq_del_queue_tag_set(q);
>  err_hctxs:
>  	kfree(q->queue_hw_ctx);
>  	q->nr_hw_queues = 0;
> diff --git a/block/blk.h b/block/blk.h
> index e4619fc5c99a..ed347f7a97b1 100644
> --- a/block/blk.h
> +++ b/block/blk.h
> @@ -184,7 +184,7 @@ void blk_account_io_done(struct request *req, u64 now);
>  
>  void blk_insert_flush(struct request *rq);
>  
> -int elevator_init_mq(struct request_queue *q);
> +void elevator_init_mq(struct request_queue *q);
>  int elevator_switch_mq(struct request_queue *q,
>  			      struct elevator_type *new_e);
>  void __elevator_exit(struct request_queue *, struct elevator_queue *);
> diff --git a/block/elevator.c b/block/elevator.c
> index 4721834815bb..2944c129760c 100644
> --- a/block/elevator.c
> +++ b/block/elevator.c
> @@ -628,34 +628,35 @@ static inline bool elv_support_iosched(struct request_queue *q)
>  
>  /*
>   * For blk-mq devices supporting IO scheduling, we default to using mq-deadline,
> - * if available, for single queue devices. If deadline isn't available OR we
> - * have multiple queues, default to "none".
> + * if available, for single queue devices. If deadline isn't available OR
> + * deadline initialization fails OR we have multiple queues, default to "none".
>   */
> -int elevator_init_mq(struct request_queue *q)
> +void elevator_init_mq(struct request_queue *q)
>  {
>  	struct elevator_type *e;
> -	int err = 0;
> +	int err;
>  
>  	if (!elv_support_iosched(q))
> -		return 0;
> +		return;
>  
>  	if (q->nr_hw_queues != 1)
> -		return 0;
> +		return;
>  
>  	WARN_ON_ONCE(test_bit(QUEUE_FLAG_REGISTERED, &q->queue_flags));
>  
>  	if (unlikely(q->elevator))
> -		goto out;
> +		return;
>  
>  	e = elevator_get(q, "mq-deadline", false);
>  	if (!e)
> -		goto out;
> +		return;
>  
>  	err = blk_mq_init_sched(q, e);
> -	if (err)
> +	if (err) {
> +		pr_warn("\"%s\" elevator initialization failed, "
> +			"falling back to \"none\"\n", e->elevator_name);
>  		elevator_put(e);
> -out:
> -	return err;
> +	}
>  }

Looks fine:

Reviewed-by: Ming Lei <ming.lei@redhat.com>

BTW, blk_mq_init_sched()'s failure patch should have restored
q->nr_request. And that could be done in another standalone patch.

-- 
Ming

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 5/7] block: Delay default elevator initialization
  2019-09-04  8:42 ` [PATCH v3 5/7] block: Delay default elevator initialization Damien Le Moal
  2019-09-04  8:56   ` Johannes Thumshirn
@ 2019-09-04  9:29   ` Ming Lei
  1 sibling, 0 replies; 18+ messages in thread
From: Ming Lei @ 2019-09-04  9:29 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen,
	Mike Snitzer, dm-devel

On Wed, Sep 04, 2019 at 05:42:45PM +0900, Damien Le Moal wrote:
> When elevator_init_mq() is called from blk_mq_init_allocated_queue(),
> the only information known about the device is the number of hardware
> queues as the block device scan by the device driver is not completed
> yet. The device type and the device required features are not set yet,
> preventing to correctly choose the default elevator most suitable for
> the device.
> 
> This currently affects all multi-queue zoned block devices which default
> to the "none" elevator instead of the required "mq-deadline" elevator.
> These drives currently include host-managed SMR disks connected to a
> smartpqi HBA and null_blk block devices with zoned mode enabled.
> Upcoming NVMe Zoned Namespace devices will also be affected.
> 
> Fix this by moving the execution of elevator_init_mq() from
> blk_mq_init_allocated_queue() into __device_add_disk() to allow for the
> device driver to probe the device characteristics and set attributes
> of the device request queue prior to the elevator initialization.
> 
> Also to make sure that the elevator initialization is never done while
> requests are in-flight (there should be none when the device driver
> calls device_add_disk()), freeze and quiesce the device request queue
> before executing blk_mq_init_sched().
> 
> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
> ---
>  block/blk-mq.c   | 2 --
>  block/elevator.c | 7 +++++++
>  block/genhd.c    | 8 ++++++++
>  3 files changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index ee4caf0c0807..a37503984206 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2902,8 +2902,6 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
>  	blk_mq_add_queue_tag_set(set, q);
>  	blk_mq_map_swqueue(q);
>  
> -	elevator_init_mq(q);
> -
>  	return q;
>  
>  err_hctxs:
> diff --git a/block/elevator.c b/block/elevator.c
> index 520d6b224b74..096a670d22d7 100644
> --- a/block/elevator.c
> +++ b/block/elevator.c
> @@ -712,7 +712,14 @@ void elevator_init_mq(struct request_queue *q)
>  	if (!e)
>  		return;
>  
> +	blk_mq_freeze_queue(q);
> +	blk_mq_quiesce_queue(q);
> +
>  	err = blk_mq_init_sched(q, e);
> +
> +	blk_mq_unquiesce_queue(q);
> +	blk_mq_unfreeze_queue(q);
> +
>  	if (err) {
>  		pr_warn("\"%s\" elevator initialization failed, "
>  			"falling back to \"none\"\n", e->elevator_name);
> diff --git a/block/genhd.c b/block/genhd.c
> index 54f1f0d381f4..7380dd7b2257 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -695,6 +695,13 @@ static void __device_add_disk(struct device *parent, struct gendisk *disk,
>  	dev_t devt;
>  	int retval;
>  
> +	/*
> +	 * The disk queue should now be all set with enough information about
> +	 * the device for the elevator code to pick an adequate default
> +	 * elevator.
> +	 */
> +	elevator_init_mq(disk->queue);
> +

For dm-rq, add_disk_no_queue_reg() is called before blk_mq_init_allocated_queue().

That means this patch actually sets elevator early for dm-rq, and I
guess this way may not work as expected since hw/sw queues aren't allocated
yet.


Thanks,
Ming

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 5/7] block: Delay default elevator initialization
  2019-09-04  9:02     ` Damien Le Moal
@ 2019-09-04 12:56       ` Jens Axboe
  2019-09-05  4:30         ` Damien Le Moal
  0 siblings, 1 reply; 18+ messages in thread
From: Jens Axboe @ 2019-09-04 12:56 UTC (permalink / raw)
  To: Damien Le Moal, Johannes Thumshirn, linux-block, linux-scsi,
	Martin K . Petersen

On 9/4/19 3:02 AM, Damien Le Moal wrote:
> On 2019/09/04 17:56, Johannes Thumshirn wrote:
>> On 04/09/2019 10:42, Damien Le Moal wrote:
>>> @@ -734,6 +741,7 @@ static void __device_add_disk(struct device *parent, struct gendisk *disk,
>>>   				    exact_match, exact_lock, disk);
>>>   	}
>>>   	register_disk(parent, disk, groups);
>>> +
>>>   	if (register_queue)
>>>   		blk_register_queue(disk);
>>
>> That hunk looks unrelated, but anyways:
>> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
> 
> Oops. Yes, did not delete the blank line when I moved elevator_init_mq() call.
> Jens, should I resend a v4 to fix this ?

Series looks good to me, I'll just delete this one hunk, not a big deal.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 5/7] block: Delay default elevator initialization
  2019-09-04 12:56       ` Jens Axboe
@ 2019-09-05  4:30         ` Damien Le Moal
  0 siblings, 0 replies; 18+ messages in thread
From: Damien Le Moal @ 2019-09-05  4:30 UTC (permalink / raw)
  To: Jens Axboe, Johannes Thumshirn, linux-block, linux-scsi,
	Martin K . Petersen

On 2019/09/04 21:57, Jens Axboe wrote:
> On 9/4/19 3:02 AM, Damien Le Moal wrote:
>> On 2019/09/04 17:56, Johannes Thumshirn wrote:
>>> On 04/09/2019 10:42, Damien Le Moal wrote:
>>>> @@ -734,6 +741,7 @@ static void __device_add_disk(struct device *parent, struct gendisk *disk,
>>>>   				    exact_match, exact_lock, disk);
>>>>   	}
>>>>   	register_disk(parent, disk, groups);
>>>> +
>>>>   	if (register_queue)
>>>>   		blk_register_queue(disk);
>>>
>>> That hunk looks unrelated, but anyways:
>>> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
>>
>> Oops. Yes, did not delete the blank line when I moved elevator_init_mq() call.
>> Jens, should I resend a v4 to fix this ?
> 
> Series looks good to me, I'll just delete this one hunk, not a big deal.
> 

Jens,

Thanks. But Ming's comment needed to be addressed, which I did in the V4 I just
sent out. I removed the white line chunk too.

Best regards.

-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, back to index

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-04  8:42 [PATCH v3 0/7] Elevator cleanups and improvements Damien Le Moal
2019-09-04  8:42 ` [PATCH v3 1/7] block: Cleanup elevator_init_mq() use Damien Le Moal
2019-09-04  9:02   ` Ming Lei
2019-09-04  8:42 ` [PATCH v3 2/7] block: Change elevator_init_mq() to always succeed Damien Le Moal
2019-09-04  9:05   ` Ming Lei
2019-09-04  8:42 ` [PATCH v3 3/7] block: Introduce elevator features Damien Le Moal
2019-09-04  8:42 ` [PATCH v3 4/7] block: Improve default elevator selection Damien Le Moal
2019-09-04  8:51   ` Johannes Thumshirn
2019-09-04  8:42 ` [PATCH v3 5/7] block: Delay default elevator initialization Damien Le Moal
2019-09-04  8:56   ` Johannes Thumshirn
2019-09-04  9:02     ` Damien Le Moal
2019-09-04 12:56       ` Jens Axboe
2019-09-05  4:30         ` Damien Le Moal
2019-09-04  9:29   ` Ming Lei
2019-09-04  8:42 ` [PATCH v3 6/7] block: Set ELEVATOR_F_ZBD_SEQ_WRITE for nullblk zoned disks Damien Le Moal
2019-09-04  8:52   ` Johannes Thumshirn
2019-09-04  8:42 ` [PATCH v3 7/7] sd: Set ELEVATOR_F_ZBD_SEQ_WRITE for ZBC disks Damien Le Moal
2019-09-04  8:52   ` Johannes Thumshirn

Linux-SCSI Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-scsi/0 linux-scsi/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-scsi linux-scsi/ https://lore.kernel.org/linux-scsi \
		linux-scsi@vger.kernel.org linux-scsi@archiver.kernel.org
	public-inbox-index linux-scsi


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-scsi


AGPL code for this site: git clone https://public-inbox.org/ public-inbox