All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/7] Elevator cleanups and improvements
@ 2019-08-28  2:29 Damien Le Moal
  2019-08-28  2:29 ` [PATCH v2 1/7] block: Cleanup elevator_init_mq() use Damien Le Moal
                   ` (6 more replies)
  0 siblings, 7 replies; 23+ messages in thread
From: Damien Le Moal @ 2019-08-28  2:29 UTC (permalink / raw)
  To: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen

This patch series implements some cleanup of the elevator initialization
code and introduces elevator features identification and device matching
to enhance checks for elevator/device compatibility and fitness.

The first 2 patches of the series are simple cleanups which simplify 
elevator initialization for newly allocated device queues.

Patch 3 introduce elevator features, allowing a clean and extensible
definition of devices and features that an elevator supports and match
these against features required by a block device. With this, the sysfs
elevator list for a device always shows only elevators matching the
features that a particular device requires, with the exception of the
none elevator which has no features but is always available for use
with any device.

The first feature defined is for zoned block device sequential write
constraint support through zone write locking which prevents the use of
any elevator that does not support this feature with zoned devices.

The last 4 patches of this series rework the default elevator selection
and initialization to allow for the elevator/device features matching
to work, doing so addressing cases not currently well supported, namely,
multi-queue zoned block devices.

Damien Le Moal (7):
  block: Cleanup elevator_init_mq() use
  block: Change elevator_init_mq() to always succeed
  block: Introduce elevator features
  block: Improve default elevator selection
  block: Delay default elevator initialization
  block: Set ELEVATOR_F_ZBD_SEQ_WRITE for nullblk zoned disks
  scsi: Set ELEVATOR_F_ZBD_SEQ_WRITE for ZBC disks

 block/blk-mq.c                |  10 ---
 block/blk-settings.c          |  16 ++++
 block/blk.h                   |   2 +-
 block/elevator.c              | 138 ++++++++++++++++++++++++++--------
 block/genhd.c                 |   3 +
 block/mq-deadline.c           |   1 +
 drivers/block/null_blk_main.c |   2 +
 drivers/scsi/sd_zbc.c         |   2 +
 include/linux/blkdev.h        |   4 +
 include/linux/elevator.h      |   8 ++
 10 files changed, 145 insertions(+), 41 deletions(-)

-- 
2.21.0


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v2 1/7] block: Cleanup elevator_init_mq() use
  2019-08-28  2:29 [PATCH v2 0/7] Elevator cleanups and improvements Damien Le Moal
@ 2019-08-28  2:29 ` Damien Le Moal
  2019-09-03  8:55   ` Christoph Hellwig
  2019-08-28  2:29 ` [PATCH v2 2/7] block: Change elevator_init_mq() to always succeed Damien Le Moal
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 23+ messages in thread
From: Damien Le Moal @ 2019-08-28  2:29 UTC (permalink / raw)
  To: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen

Instead of checking a queue tag_set BLK_MQ_F_NO_SCHED flag before
calling elevator_init_mq() to make sure that the queue supports IO
scheduling, use the elevator.c function elv_support_iosched() in
elevator_init_mq(). This does not introduce any functional change but
ensure that elevator_init_mq() does the right thing based on the queue
settings.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
---
 block/blk-mq.c   |  8 +++-----
 block/elevator.c | 23 +++++++++++++----------
 2 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index f6620a30752e..de3aeafe48eb 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2895,11 +2895,9 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
 	blk_mq_add_queue_tag_set(set, q);
 	blk_mq_map_swqueue(q);
 
-	if (!(set->flags & BLK_MQ_F_NO_SCHED)) {
-		ret = elevator_init_mq(q);
-		if (ret)
-			goto err_tag_set;
-	}
+	ret = elevator_init_mq(q);
+	if (ret)
+		goto err_tag_set;
 
 	return q;
 
diff --git a/block/elevator.c b/block/elevator.c
index 4781c4205a5d..ab4d50c0ed43 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -633,16 +633,26 @@ int elevator_switch_mq(struct request_queue *q,
 	return ret;
 }
 
+static inline bool elv_support_iosched(struct request_queue *q)
+{
+	if (q->tag_set && (q->tag_set->flags & BLK_MQ_F_NO_SCHED))
+		return false;
+	return true;
+}
+
 /*
- * For blk-mq devices, we default to using mq-deadline, if available, for single
- * queue devices.  If deadline isn't available OR we have multiple queues,
- * default to "none".
+ * For blk-mq devices supporting IO scheduling, we default to using mq-deadline,
+ * if available, for single queue devices. If deadline isn't available OR we
+ * have multiple queues, default to "none".
  */
 int elevator_init_mq(struct request_queue *q)
 {
 	struct elevator_type *e;
 	int err = 0;
 
+	if (!elv_support_iosched(q))
+		return 0;
+
 	if (q->nr_hw_queues != 1)
 		return 0;
 
@@ -720,13 +730,6 @@ static int __elevator_change(struct request_queue *q, const char *name)
 	return elevator_switch(q, e);
 }
 
-static inline bool elv_support_iosched(struct request_queue *q)
-{
-	if (q->tag_set && (q->tag_set->flags & BLK_MQ_F_NO_SCHED))
-		return false;
-	return true;
-}
-
 ssize_t elv_iosched_store(struct request_queue *q, const char *name,
 			  size_t count)
 {
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 2/7] block: Change elevator_init_mq() to always succeed
  2019-08-28  2:29 [PATCH v2 0/7] Elevator cleanups and improvements Damien Le Moal
  2019-08-28  2:29 ` [PATCH v2 1/7] block: Cleanup elevator_init_mq() use Damien Le Moal
@ 2019-08-28  2:29 ` Damien Le Moal
  2019-09-03  8:55   ` Christoph Hellwig
  2019-08-28  2:29 ` [PATCH v2 3/7] block: Introduce elevator features Damien Le Moal
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 23+ messages in thread
From: Damien Le Moal @ 2019-08-28  2:29 UTC (permalink / raw)
  To: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen

If the default elevator chosen is mq-deadline, elevator_init_mq() may
return an error if mq-deadline initialization fails, leading to
blk_mq_init_allocated_queue() returning an error, which in turn will
cause the block device initialization to fail and the device not being
exposed.

Instead of taking such extreme measure, handle mq-deadline
initialization failures in the same manner as when mq-deadline is not
available (no module to load), that is, default to the "none" scheduler.
With this change, elevator_init_mq() return type can be changed to void.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
---
 block/blk-mq.c   |  8 +-------
 block/blk.h      |  2 +-
 block/elevator.c | 23 ++++++++++++-----------
 3 files changed, 14 insertions(+), 19 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index de3aeafe48eb..0c9b1f403db8 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2833,8 +2833,6 @@ static unsigned int nr_hw_queues(struct blk_mq_tag_set *set)
 struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
 						  struct request_queue *q)
 {
-	int ret = -ENOMEM;
-
 	/* mark the queue as mq asap */
 	q->mq_ops = set->ops;
 
@@ -2895,14 +2893,10 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
 	blk_mq_add_queue_tag_set(set, q);
 	blk_mq_map_swqueue(q);
 
-	ret = elevator_init_mq(q);
-	if (ret)
-		goto err_tag_set;
+	elevator_init_mq(q);
 
 	return q;
 
-err_tag_set:
-	blk_mq_del_queue_tag_set(q);
 err_hctxs:
 	kfree(q->queue_hw_ctx);
 	q->nr_hw_queues = 0;
diff --git a/block/blk.h b/block/blk.h
index e4619fc5c99a..ed347f7a97b1 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -184,7 +184,7 @@ void blk_account_io_done(struct request *req, u64 now);
 
 void blk_insert_flush(struct request *rq);
 
-int elevator_init_mq(struct request_queue *q);
+void elevator_init_mq(struct request_queue *q);
 int elevator_switch_mq(struct request_queue *q,
 			      struct elevator_type *new_e);
 void __elevator_exit(struct request_queue *, struct elevator_queue *);
diff --git a/block/elevator.c b/block/elevator.c
index ab4d50c0ed43..06b70981a054 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -642,34 +642,35 @@ static inline bool elv_support_iosched(struct request_queue *q)
 
 /*
  * For blk-mq devices supporting IO scheduling, we default to using mq-deadline,
- * if available, for single queue devices. If deadline isn't available OR we
- * have multiple queues, default to "none".
+ * if available, for single queue devices. If deadline isn't available OR
+ * deadline initialization fails OR we have multiple queues, default to "none".
  */
-int elevator_init_mq(struct request_queue *q)
+void elevator_init_mq(struct request_queue *q)
 {
 	struct elevator_type *e;
-	int err = 0;
+	int err;
 
 	if (!elv_support_iosched(q))
-		return 0;
+		return;
 
 	if (q->nr_hw_queues != 1)
-		return 0;
+		return;
 
 	WARN_ON_ONCE(test_bit(QUEUE_FLAG_REGISTERED, &q->queue_flags));
 
 	if (unlikely(q->elevator))
-		goto out;
+		return;
 
 	e = elevator_get(q, "mq-deadline", false);
 	if (!e)
-		goto out;
+		return;
 
 	err = blk_mq_init_sched(q, e);
-	if (err)
+	if (err) {
+		pr_warn("\"%s\" elevator initialization failed, "
+			"falling back to \"none\"\n", e->elevator_name);
 		elevator_put(e);
-out:
-	return err;
+	}
 }
 
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 3/7] block: Introduce elevator features
  2019-08-28  2:29 [PATCH v2 0/7] Elevator cleanups and improvements Damien Le Moal
  2019-08-28  2:29 ` [PATCH v2 1/7] block: Cleanup elevator_init_mq() use Damien Le Moal
  2019-08-28  2:29 ` [PATCH v2 2/7] block: Change elevator_init_mq() to always succeed Damien Le Moal
@ 2019-08-28  2:29 ` Damien Le Moal
  2019-08-28  8:16   ` Johannes Thumshirn
  2019-09-03  8:56   ` Christoph Hellwig
  2019-08-28  2:29 ` [PATCH v2 4/7] block: Improve default elevator selection Damien Le Moal
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 23+ messages in thread
From: Damien Le Moal @ 2019-08-28  2:29 UTC (permalink / raw)
  To: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen

Introduce the definition of elevator features through the
elevator_features flags in the elevator_type structure. Each flag can
represent a feature supported by an elevator. The first feature defined
by this patch is support for zoned block device sequential write
constraint with the flag ELEVATOR_F_ZBD_SEQ_WRITE, which is implemented
by the mq-deadline elevator using zone write locking.

Other possible features are IO priorities, write hints, latency targets
or single-LUN dual-actuator disks (for which the elevator could maintain
one LBA ordered list per actuator).

The required_elevator_features field is also added to the request_queue
structure to allow a device driver to specify elevator feature flags
that an elevator must support for the correct operation of the device
(e.g. device drivers for zoned block devices can have the
ELEVATOR_F_ZBD_SEQ_WRITE flag as a required feature).
The helper function blk_queue_required_elevator_features() is
defined for setting this new field.

With these two new fields in place, the elevator functions
elevator_match() and elevator_find() are modified to allow a user to set
only an elevator with a set of features that satisfies the device
required features. Elevators not matching the device requirements are
not shown in the device sysfs queue/scheduler file to prevent their use.

The "none" elevator can always be selected as before.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 block/blk-settings.c     | 16 +++++++++++++
 block/elevator.c         | 49 +++++++++++++++++++++++++++++++---------
 block/mq-deadline.c      |  1 +
 include/linux/blkdev.h   |  4 ++++
 include/linux/elevator.h |  8 +++++++
 5 files changed, 67 insertions(+), 11 deletions(-)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index 2c1831207a8f..ed76dd4cb5d0 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -832,6 +832,22 @@ void blk_queue_write_cache(struct request_queue *q, bool wc, bool fua)
 }
 EXPORT_SYMBOL_GPL(blk_queue_write_cache);
 
+/**
+ * blk_queue_required_elevator_features - Set a queue required elevator features
+ * @q:		the request queue for the target device
+ * @features:	Required elevator features OR'ed together
+ *
+ * Tell the block layer that for the device controlled through @q, only the
+ * only elevators that can be used are those that implement at least the set of
+ * features specified by @features.
+ */
+void blk_queue_required_elevator_features(struct request_queue *q,
+					  unsigned int features)
+{
+	q->required_elevator_features = features;
+}
+EXPORT_SYMBOL_GPL(blk_queue_required_elevator_features);
+
 static int __init blk_settings_init(void)
 {
 	blk_max_low_pfn = max_low_pfn - 1;
diff --git a/block/elevator.c b/block/elevator.c
index 06b70981a054..2235dfe6755b 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -83,8 +83,26 @@ bool elv_bio_merge_ok(struct request *rq, struct bio *bio)
 }
 EXPORT_SYMBOL(elv_bio_merge_ok);
 
-static bool elevator_match(const struct elevator_type *e, const char *name)
+static inline bool elv_support_features(unsigned int elv_features,
+					unsigned int required_features)
 {
+	return (required_features & elv_features) == required_features;
+}
+
+/**
+ * elevator_match - Test an elevator name and features
+ * @e: Scheduler to test
+ * @name: Elevator name to test
+ * @required_features: Features that the elevator must provide
+ *
+ * Return true is the elevator @e name matches @name and if @e provides all the
+ * the feratures spcified by @required_features.
+ */
+static bool elevator_match(const struct elevator_type *e, const char *name,
+			   unsigned int required_features)
+{
+	if (!elv_support_features(e->elevator_features, required_features))
+		return false;
 	if (!strcmp(e->elevator_name, name))
 		return true;
 	if (e->elevator_alias && !strcmp(e->elevator_alias, name))
@@ -93,15 +111,21 @@ static bool elevator_match(const struct elevator_type *e, const char *name)
 	return false;
 }
 
-/*
- * Return scheduler with name 'name'
+/**
+ * elevator_find - Find an elevator
+ * @name: Name of the elevator to find
+ * @required_features: Features that the elevator must provide
+ *
+ * Return the first registered scheduler with name @name and supporting the
+ * features @required_features and NULL otherwise.
  */
-static struct elevator_type *elevator_find(const char *name)
+static struct elevator_type *elevator_find(const char *name,
+					   unsigned int required_features)
 {
 	struct elevator_type *e;
 
 	list_for_each_entry(e, &elv_list, list) {
-		if (elevator_match(e, name))
+		if (elevator_match(e, name, required_features))
 			return e;
 	}
 
@@ -120,12 +144,12 @@ static struct elevator_type *elevator_get(struct request_queue *q,
 
 	spin_lock(&elv_list_lock);
 
-	e = elevator_find(name);
+	e = elevator_find(name, q->required_elevator_features);
 	if (!e && try_loading) {
 		spin_unlock(&elv_list_lock);
 		request_module("%s-iosched", name);
 		spin_lock(&elv_list_lock);
-		e = elevator_find(name);
+		e = elevator_find(name, q->required_elevator_features);
 	}
 
 	if (e && !try_module_get(e->elevator_owner))
@@ -539,7 +563,7 @@ int elv_register(struct elevator_type *e)
 
 	/* register, don't allow duplicate names */
 	spin_lock(&elv_list_lock);
-	if (elevator_find(e->elevator_name)) {
+	if (elevator_find(e->elevator_name, 0)) {
 		spin_unlock(&elv_list_lock);
 		kmem_cache_destroy(e->icq_cache);
 		return -EBUSY;
@@ -723,7 +747,8 @@ static int __elevator_change(struct request_queue *q, const char *name)
 	if (!e)
 		return -EINVAL;
 
-	if (q->elevator && elevator_match(q->elevator->type, elevator_name)) {
+	if (q->elevator &&
+	    elevator_match(q->elevator->type, elevator_name, 0)) {
 		elevator_put(e);
 		return 0;
 	}
@@ -763,11 +788,13 @@ ssize_t elv_iosched_show(struct request_queue *q, char *name)
 
 	spin_lock(&elv_list_lock);
 	list_for_each_entry(__e, &elv_list, list) {
-		if (elv && elevator_match(elv, __e->elevator_name)) {
+		if (elv && elevator_match(elv, __e->elevator_name, 0)) {
 			len += sprintf(name+len, "[%s] ", elv->elevator_name);
 			continue;
 		}
-		if (elv_support_iosched(q))
+		if (elv_support_iosched(q) &&
+		    elevator_match(__e, __e->elevator_name,
+				   q->required_elevator_features))
 			len += sprintf(name+len, "%s ", __e->elevator_name);
 	}
 	spin_unlock(&elv_list_lock);
diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index 2a2a2e82832e..a17466f310f4 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -795,6 +795,7 @@ static struct elevator_type mq_deadline = {
 	.elevator_attrs = deadline_attrs,
 	.elevator_name = "mq-deadline",
 	.elevator_alias = "deadline",
+	.elevator_features = ELEVATOR_F_ZBD_SEQ_WRITE,
 	.elevator_owner = THIS_MODULE,
 };
 MODULE_ALIAS("mq-deadline-iosched");
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 1ac790178787..c0fea94acc76 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -492,6 +492,8 @@ struct request_queue {
 
 	struct queue_limits	limits;
 
+	unsigned int		required_elevator_features;
+
 #ifdef CONFIG_BLK_DEV_ZONED
 	/*
 	 * Zoned block device information for request dispatch control.
@@ -1086,6 +1088,8 @@ extern void blk_queue_dma_alignment(struct request_queue *, int);
 extern void blk_queue_update_dma_alignment(struct request_queue *, int);
 extern void blk_queue_rq_timeout(struct request_queue *, unsigned int);
 extern void blk_queue_write_cache(struct request_queue *q, bool enabled, bool fua);
+extern void blk_queue_required_elevator_features(struct request_queue *q,
+						 unsigned int features);
 
 /*
  * Number of physical segments as sent to the device.
diff --git a/include/linux/elevator.h b/include/linux/elevator.h
index 1dd014c9c87b..901bda352dcb 100644
--- a/include/linux/elevator.h
+++ b/include/linux/elevator.h
@@ -76,6 +76,7 @@ struct elevator_type
 	struct elv_fs_entry *elevator_attrs;
 	const char *elevator_name;
 	const char *elevator_alias;
+	const unsigned int elevator_features;
 	struct module *elevator_owner;
 #ifdef CONFIG_BLK_DEBUG_FS
 	const struct blk_mq_debugfs_attr *queue_debugfs_attrs;
@@ -165,5 +166,12 @@ extern struct request *elv_rb_find(struct rb_root *, sector_t);
 #define rq_entry_fifo(ptr)	list_entry((ptr), struct request, queuelist)
 #define rq_fifo_clear(rq)	list_del_init(&(rq)->queuelist)
 
+/*
+ * Elevator features.
+ */
+
+/* Supports zoned block devices sequential write constraint */
+#define ELEVATOR_F_ZBD_SEQ_WRITE	(1U << 0)
+
 #endif /* CONFIG_BLOCK */
 #endif
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 4/7] block: Improve default elevator selection
  2019-08-28  2:29 [PATCH v2 0/7] Elevator cleanups and improvements Damien Le Moal
                   ` (2 preceding siblings ...)
  2019-08-28  2:29 ` [PATCH v2 3/7] block: Introduce elevator features Damien Le Moal
@ 2019-08-28  2:29 ` Damien Le Moal
  2019-09-03  8:57   ` Christoph Hellwig
  2019-08-28  2:29 ` [PATCH v2 5/7] block: Delay default elevator initialization Damien Le Moal
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 23+ messages in thread
From: Damien Le Moal @ 2019-08-28  2:29 UTC (permalink / raw)
  To: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen

For block devices that do not specify required features, preserve the
current default elevator selection (mq-deadline for single queue
devices, none for multi-queue devices). However, for devices specifying
required features (e.g. zoned block devices ELEVATOR_F_ZBD_SEQ_WRITE
feature), select the first available elevator providing the required
features.

In all cases, default to "none" if no elevator is available or if the
initialization of the default elevator fails.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 block/elevator.c | 48 ++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 44 insertions(+), 4 deletions(-)

diff --git a/block/elevator.c b/block/elevator.c
index 2235dfe6755b..81d0877dbc34 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -665,9 +665,46 @@ static inline bool elv_support_iosched(struct request_queue *q)
 }
 
 /*
- * For blk-mq devices supporting IO scheduling, we default to using mq-deadline,
- * if available, for single queue devices. If deadline isn't available OR
- * deadline initialization fails OR we have multiple queues, default to "none".
+ * For single queue devices, default to using mq-deadline. If we have multiple
+ * queues or mq-deadline is not available, default to "none".
+ */
+static struct elevator_type *elevator_get_default(struct request_queue *q)
+{
+	if (q->nr_hw_queues != 1)
+		return NULL;
+
+	return elevator_get(q, "mq-deadline", false);
+}
+
+/*
+ * Get the first elevator providing the features required by the request queue.
+ * Default to "none" if no matching elevator is found.
+ */
+static struct elevator_type *elevator_get_by_features(struct request_queue *q)
+{
+	struct elevator_type *e;
+
+	spin_lock(&elv_list_lock);
+
+	list_for_each_entry(e, &elv_list, list) {
+		if (elv_support_features(e->elevator_features,
+					 q->required_elevator_features))
+			break;
+	}
+
+	if (e && !try_module_get(e->elevator_owner))
+		e = NULL;
+
+	spin_unlock(&elv_list_lock);
+
+	return e;
+}
+
+/*
+ * For a device queue that has no required features, use the default elevator
+ * settings. Otherwise, use the first elevator available matching the required
+ * features. If no suitable elevator is find or if the chosen elevator
+ * initialization fails, fall back to the "none" elevator (no elevator).
  */
 void elevator_init_mq(struct request_queue *q)
 {
@@ -685,7 +722,10 @@ void elevator_init_mq(struct request_queue *q)
 	if (unlikely(q->elevator))
 		return;
 
-	e = elevator_get(q, "mq-deadline", false);
+	if (!q->required_elevator_features)
+		e = elevator_get_default(q);
+	else
+		e = elevator_get_by_features(q);
 	if (!e)
 		return;
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 5/7] block: Delay default elevator initialization
  2019-08-28  2:29 [PATCH v2 0/7] Elevator cleanups and improvements Damien Le Moal
                   ` (3 preceding siblings ...)
  2019-08-28  2:29 ` [PATCH v2 4/7] block: Improve default elevator selection Damien Le Moal
@ 2019-08-28  2:29 ` Damien Le Moal
  2019-09-03  9:02   ` Christoph Hellwig
  2019-08-28  2:29 ` [PATCH v2 6/7] block: Set ELEVATOR_F_ZBD_SEQ_WRITE for nullblk zoned disks Damien Le Moal
  2019-08-28  2:29 ` [PATCH v2 7/7] scsi: Set ELEVATOR_F_ZBD_SEQ_WRITE for ZBC disks Damien Le Moal
  6 siblings, 1 reply; 23+ messages in thread
From: Damien Le Moal @ 2019-08-28  2:29 UTC (permalink / raw)
  To: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen

When elevator_init_mq() is called from blk_mq_init_allocated_queue(),
the only information known about the device is the number of hardware
queues as the block device scan by the device driver is not completed
yet. The device type and the device required features are not set yet,
preventing to correctly choose the default elevator most suitable for
the device.

This currently affects all multi-queue zoned block devices which default
to the "none" elevator instead of the required "mq-deadline" elevator.
These drives currently include host-managed SMR disks connected to a
smartpqi HBA and null_blk block devices with zoned mode enabled.
Upcoming NVMe Zoned Namespace devices will also be affected.

Fix this by moving the execution of elevator_init_mq() from
blk_mq_init_allocated_queue() into __device_add_disk() to allow for the
device driver to probe the device characteristics and set attributes
of the device request queue prior to the elevator initialization.

Also to make sure that the elevator initialization is never done while
requests are in-flight (there should be none when the device driver
calls device_add_disk()), freeze and quiesce the device request queue
before executing blk_mq_init_sched().

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 block/blk-mq.c   | 2 --
 block/elevator.c | 7 +++++++
 block/genhd.c    | 3 +++
 3 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 0c9b1f403db8..baf0c9cd8237 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2893,8 +2893,6 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
 	blk_mq_add_queue_tag_set(set, q);
 	blk_mq_map_swqueue(q);
 
-	elevator_init_mq(q);
-
 	return q;
 
 err_hctxs:
diff --git a/block/elevator.c b/block/elevator.c
index 81d0877dbc34..433ce722cf0a 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -729,7 +729,14 @@ void elevator_init_mq(struct request_queue *q)
 	if (!e)
 		return;
 
+	blk_mq_freeze_queue(q);
+	blk_mq_quiesce_queue(q);
+
 	err = blk_mq_init_sched(q, e);
+
+	blk_mq_unquiesce_queue(q);
+	blk_mq_unfreeze_queue(q);
+
 	if (err) {
 		pr_warn("\"%s\" elevator initialization failed, "
 			"falling back to \"none\"\n", e->elevator_name);
diff --git a/block/genhd.c b/block/genhd.c
index 54f1f0d381f4..d2114c25dccd 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -734,6 +734,9 @@ static void __device_add_disk(struct device *parent, struct gendisk *disk,
 				    exact_match, exact_lock, disk);
 	}
 	register_disk(parent, disk, groups);
+
+	elevator_init_mq(disk->queue);
+
 	if (register_queue)
 		blk_register_queue(disk);
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 6/7] block: Set ELEVATOR_F_ZBD_SEQ_WRITE for nullblk zoned disks
  2019-08-28  2:29 [PATCH v2 0/7] Elevator cleanups and improvements Damien Le Moal
                   ` (4 preceding siblings ...)
  2019-08-28  2:29 ` [PATCH v2 5/7] block: Delay default elevator initialization Damien Le Moal
@ 2019-08-28  2:29 ` Damien Le Moal
  2019-09-03  9:03   ` Christoph Hellwig
  2019-08-28  2:29 ` [PATCH v2 7/7] scsi: Set ELEVATOR_F_ZBD_SEQ_WRITE for ZBC disks Damien Le Moal
  6 siblings, 1 reply; 23+ messages in thread
From: Damien Le Moal @ 2019-08-28  2:29 UTC (permalink / raw)
  To: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen

Using the helper blk_queue_required_elevator_features(), set the
elevator feature ELEVATOR_F_ZBD_SEQ_WRITE as required for the request
queue of null_blk devices created with zoned mode enabled.

This feature requirement can always be satisfied as the mq-deadline
elevator is always selected for in-kernel compilation when
CONFIG_BLK_DEV_ZONED (zoned block device support) is enabled.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 drivers/block/null_blk_main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/block/null_blk_main.c b/drivers/block/null_blk_main.c
index b26a178d064d..b29b273690b0 100644
--- a/drivers/block/null_blk_main.c
+++ b/drivers/block/null_blk_main.c
@@ -1695,6 +1695,8 @@ static int null_add_dev(struct nullb_device *dev)
 		blk_queue_chunk_sectors(nullb->q, dev->zone_size_sects);
 		nullb->q->limits.zoned = BLK_ZONED_HM;
 		blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, nullb->q);
+		blk_queue_required_elevator_features(nullb->q,
+						ELEVATOR_F_ZBD_SEQ_WRITE);
 	}
 
 	nullb->q->queuedata = nullb;
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 7/7] scsi: Set ELEVATOR_F_ZBD_SEQ_WRITE for ZBC disks
  2019-08-28  2:29 [PATCH v2 0/7] Elevator cleanups and improvements Damien Le Moal
                   ` (5 preceding siblings ...)
  2019-08-28  2:29 ` [PATCH v2 6/7] block: Set ELEVATOR_F_ZBD_SEQ_WRITE for nullblk zoned disks Damien Le Moal
@ 2019-08-28  2:29 ` Damien Le Moal
  2019-09-03  9:03   ` Christoph Hellwig
  6 siblings, 1 reply; 23+ messages in thread
From: Damien Le Moal @ 2019-08-28  2:29 UTC (permalink / raw)
  To: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen

Using the helper blk_queue_required_elevator_features(), set the
elevator feature ELEVATOR_F_ZBD_SEQ_WRITE as required for the request
queue of SCSI ZBC disks.

This feature requirement can always be satisfied as the mq-deadline
elevator is always selected for in-kernel compilation when
CONFIG_BLK_DEV_ZONED (zoned block device support) is enabled.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 drivers/scsi/sd_zbc.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 367614f0e34f..de4019dc0f0b 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -493,6 +493,8 @@ int sd_zbc_read_zones(struct scsi_disk *sdkp, unsigned char *buf)
 	blk_queue_chunk_sectors(sdkp->disk->queue,
 			logical_to_sectors(sdkp->device, zone_blocks));
 	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, sdkp->disk->queue);
+	blk_queue_required_elevator_features(sdkp->disk->queue,
+					     ELEVATOR_F_ZBD_SEQ_WRITE);
 	nr_zones = round_up(sdkp->capacity, zone_blocks) >> ilog2(zone_blocks);
 
 	/* READ16/WRITE16 is mandatory for ZBC disks */
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 3/7] block: Introduce elevator features
  2019-08-28  2:29 ` [PATCH v2 3/7] block: Introduce elevator features Damien Le Moal
@ 2019-08-28  8:16   ` Johannes Thumshirn
  2019-08-28 10:41     ` Damien Le Moal
  2019-09-03  8:56   ` Christoph Hellwig
  1 sibling, 1 reply; 23+ messages in thread
From: Johannes Thumshirn @ 2019-08-28  8:16 UTC (permalink / raw)
  To: Damien Le Moal, linux-block, Jens Axboe, linux-scsi, Martin K . Petersen

What happened to my review comment for v1 of this patch?

-- 
Johannes Thumshirn                            SUSE Labs Filesystems
jthumshirn@suse.de                                +49 911 74053 689
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5
90409 Nürnberg
Germany
(HRB 247165, AG München)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 3/7] block: Introduce elevator features
  2019-08-28  8:16   ` Johannes Thumshirn
@ 2019-08-28 10:41     ` Damien Le Moal
  2019-08-28 10:43       ` Johannes Thumshirn
  0 siblings, 1 reply; 23+ messages in thread
From: Damien Le Moal @ 2019-08-28 10:41 UTC (permalink / raw)
  To: Johannes Thumshirn, linux-block, Jens Axboe, linux-scsi,
	Martin K . Petersen

On 2019/08/28 17:16, Johannes Thumshirn wrote:
> What happened to my review comment for v1 of this patch?
> 

I merged the renamed ELEVATOR_F_ZBD_SEQ_WRITE feature into this patch instead of
following patch and separated the nullblk and sd_zbc changes into other patches.
Well, at least that is what I understood you wanted... Did I misunderstand ?
When tired, my english becomes fuzzy sometimes :)

Please let me know if that is not what you wanted (it does seem so).

Cheers.

-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 3/7] block: Introduce elevator features
  2019-08-28 10:41     ` Damien Le Moal
@ 2019-08-28 10:43       ` Johannes Thumshirn
  2019-08-28 11:08         ` Damien Le Moal
  0 siblings, 1 reply; 23+ messages in thread
From: Johannes Thumshirn @ 2019-08-28 10:43 UTC (permalink / raw)
  To: Damien Le Moal, linux-block, Jens Axboe, linux-scsi, Martin K . Petersen

On 28/08/2019 12:41, Damien Le Moal wrote:
> On 2019/08/28 17:16, Johannes Thumshirn wrote:
>> What happened to my review comment for v1 of this patch?
>>
> 
> I merged the renamed ELEVATOR_F_ZBD_SEQ_WRITE feature into this patch instead of
> following patch and separated the nullblk and sd_zbc changes into other patches.
> Well, at least that is what I understood you wanted... Did I misunderstand ?
> When tired, my english becomes fuzzy sometimes :)
> 
> Please let me know if that is not what you wanted (it does seem so).

I meant to useage of an 'unsigned int' vs. explicit u32/u64 for
'elevator_features'

-- 
Johannes Thumshirn                            SUSE Labs Filesystems
jthumshirn@suse.de                                +49 911 74053 689
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5
90409 Nürnberg
Germany
(HRB 247165, AG München)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 3/7] block: Introduce elevator features
  2019-08-28 10:43       ` Johannes Thumshirn
@ 2019-08-28 11:08         ` Damien Le Moal
  2019-08-28 11:11           ` Johannes Thumshirn
  0 siblings, 1 reply; 23+ messages in thread
From: Damien Le Moal @ 2019-08-28 11:08 UTC (permalink / raw)
  To: Johannes Thumshirn, linux-block, Jens Axboe, linux-scsi,
	Martin K . Petersen

On 2019/08/28 19:43, Johannes Thumshirn wrote:
> On 28/08/2019 12:41, Damien Le Moal wrote:
>> On 2019/08/28 17:16, Johannes Thumshirn wrote:
>>> What happened to my review comment for v1 of this patch?
>>>
>>
>> I merged the renamed ELEVATOR_F_ZBD_SEQ_WRITE feature into this patch instead of
>> following patch and separated the nullblk and sd_zbc changes into other patches.
>> Well, at least that is what I understood you wanted... Did I misunderstand ?
>> When tired, my english becomes fuzzy sometimes :)
>>
>> Please let me know if that is not what you wanted (it does seem so).
> 
> I meant to useage of an 'unsigned int' vs. explicit u32/u64 for
> 'elevator_features'
> 

I changed from unsigned long to unsigned int, which is always 32bits on any
arch, no ? I preferred the use of unsigned int over u32/u64 as these look more
like low level driver stuff... Did I miss something ?

-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 3/7] block: Introduce elevator features
  2019-08-28 11:08         ` Damien Le Moal
@ 2019-08-28 11:11           ` Johannes Thumshirn
  0 siblings, 0 replies; 23+ messages in thread
From: Johannes Thumshirn @ 2019-08-28 11:11 UTC (permalink / raw)
  To: Damien Le Moal, linux-block, Jens Axboe, linux-scsi, Martin K . Petersen

On 28/08/2019 13:08, Damien Le Moal wrote:
> I changed from unsigned long to unsigned int, which is always 32bits on any
> arch, no ? I preferred the use of unsigned int over u32/u64 as these look more
> like low level driver stuff... Did I miss something ?

Ah no OK then, my mistake. My brain autocompleted to 'unsigned long'
when I read it.


Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
-- 
Johannes Thumshirn                            SUSE Labs Filesystems
jthumshirn@suse.de                                +49 911 74053 689
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5
90409 Nürnberg
Germany
(HRB 247165, AG München)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 1/7] block: Cleanup elevator_init_mq() use
  2019-08-28  2:29 ` [PATCH v2 1/7] block: Cleanup elevator_init_mq() use Damien Le Moal
@ 2019-09-03  8:55   ` Christoph Hellwig
  0 siblings, 0 replies; 23+ messages in thread
From: Christoph Hellwig @ 2019-09-03  8:55 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen

On Wed, Aug 28, 2019 at 11:29:41AM +0900, Damien Le Moal wrote:
> Instead of checking a queue tag_set BLK_MQ_F_NO_SCHED flag before
> calling elevator_init_mq() to make sure that the queue supports IO
> scheduling, use the elevator.c function elv_support_iosched() in
> elevator_init_mq(). This does not introduce any functional change but
> ensure that elevator_init_mq() does the right thing based on the queue
> settings.
> 
> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 2/7] block: Change elevator_init_mq() to always succeed
  2019-08-28  2:29 ` [PATCH v2 2/7] block: Change elevator_init_mq() to always succeed Damien Le Moal
@ 2019-09-03  8:55   ` Christoph Hellwig
  0 siblings, 0 replies; 23+ messages in thread
From: Christoph Hellwig @ 2019-09-03  8:55 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen

On Wed, Aug 28, 2019 at 11:29:42AM +0900, Damien Le Moal wrote:
> If the default elevator chosen is mq-deadline, elevator_init_mq() may
> return an error if mq-deadline initialization fails, leading to
> blk_mq_init_allocated_queue() returning an error, which in turn will
> cause the block device initialization to fail and the device not being
> exposed.
> 
> Instead of taking such extreme measure, handle mq-deadline
> initialization failures in the same manner as when mq-deadline is not
> available (no module to load), that is, default to the "none" scheduler.
> With this change, elevator_init_mq() return type can be changed to void.
> 
> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 3/7] block: Introduce elevator features
  2019-08-28  2:29 ` [PATCH v2 3/7] block: Introduce elevator features Damien Le Moal
  2019-08-28  8:16   ` Johannes Thumshirn
@ 2019-09-03  8:56   ` Christoph Hellwig
  1 sibling, 0 replies; 23+ messages in thread
From: Christoph Hellwig @ 2019-09-03  8:56 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 4/7] block: Improve default elevator selection
  2019-08-28  2:29 ` [PATCH v2 4/7] block: Improve default elevator selection Damien Le Moal
@ 2019-09-03  8:57   ` Christoph Hellwig
  2019-09-04  8:42     ` Damien Le Moal
  0 siblings, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2019-09-03  8:57 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen

On Wed, Aug 28, 2019 at 11:29:44AM +0900, Damien Le Moal wrote:
> For block devices that do not specify required features, preserve the
> current default elevator selection (mq-deadline for single queue
> devices, none for multi-queue devices). However, for devices specifying
> required features (e.g. zoned block devices ELEVATOR_F_ZBD_SEQ_WRITE
> feature), select the first available elevator providing the required
> features.
> 
> In all cases, default to "none" if no elevator is available or if the
> initialization of the default elevator fails.
> 
> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 5/7] block: Delay default elevator initialization
  2019-08-28  2:29 ` [PATCH v2 5/7] block: Delay default elevator initialization Damien Le Moal
@ 2019-09-03  9:02   ` Christoph Hellwig
  2019-09-04  2:07     ` Damien Le Moal
  0 siblings, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2019-09-03  9:02 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen

On Wed, Aug 28, 2019 at 11:29:45AM +0900, Damien Le Moal wrote:
> When elevator_init_mq() is called from blk_mq_init_allocated_queue(),
> the only information known about the device is the number of hardware
> queues as the block device scan by the device driver is not completed
> yet. The device type and the device required features are not set yet,
> preventing to correctly choose the default elevator most suitable for
> the device.
> 
> This currently affects all multi-queue zoned block devices which default
> to the "none" elevator instead of the required "mq-deadline" elevator.
> These drives currently include host-managed SMR disks connected to a
> smartpqi HBA and null_blk block devices with zoned mode enabled.
> Upcoming NVMe Zoned Namespace devices will also be affected.
> 
> Fix this by moving the execution of elevator_init_mq() from
> blk_mq_init_allocated_queue() into __device_add_disk() to allow for the
> device driver to probe the device characteristics and set attributes
> of the device request queue prior to the elevator initialization.
> 
> Also to make sure that the elevator initialization is never done while
> requests are in-flight (there should be none when the device driver
> calls device_add_disk()), freeze and quiesce the device request queue
> before executing blk_mq_init_sched().

So the disk can be accessed from userspace or partition probing once we
registered the region.  Based on that I think it would be better if
we set the elevator a little earlier before that happens.  With that
we shouldn't have to freeze the queue.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 6/7] block: Set ELEVATOR_F_ZBD_SEQ_WRITE for nullblk zoned disks
  2019-08-28  2:29 ` [PATCH v2 6/7] block: Set ELEVATOR_F_ZBD_SEQ_WRITE for nullblk zoned disks Damien Le Moal
@ 2019-09-03  9:03   ` Christoph Hellwig
  0 siblings, 0 replies; 23+ messages in thread
From: Christoph Hellwig @ 2019-09-03  9:03 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen

On Wed, Aug 28, 2019 at 11:29:46AM +0900, Damien Le Moal wrote:
> Using the helper blk_queue_required_elevator_features(), set the
> elevator feature ELEVATOR_F_ZBD_SEQ_WRITE as required for the request
> queue of null_blk devices created with zoned mode enabled.
> 
> This feature requirement can always be satisfied as the mq-deadline
> elevator is always selected for in-kernel compilation when
> CONFIG_BLK_DEV_ZONED (zoned block device support) is enabled.
> 
> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 7/7] scsi: Set ELEVATOR_F_ZBD_SEQ_WRITE for ZBC disks
  2019-08-28  2:29 ` [PATCH v2 7/7] scsi: Set ELEVATOR_F_ZBD_SEQ_WRITE for ZBC disks Damien Le Moal
@ 2019-09-03  9:03   ` Christoph Hellwig
  0 siblings, 0 replies; 23+ messages in thread
From: Christoph Hellwig @ 2019-09-03  9:03 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen

On Wed, Aug 28, 2019 at 11:29:47AM +0900, Damien Le Moal wrote:
> Using the helper blk_queue_required_elevator_features(), set the
> elevator feature ELEVATOR_F_ZBD_SEQ_WRITE as required for the request
> queue of SCSI ZBC disks.
> 
> This feature requirement can always be satisfied as the mq-deadline
> elevator is always selected for in-kernel compilation when
> CONFIG_BLK_DEV_ZONED (zoned block device support) is enabled.
> 
> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>

Maybe s/scsi/sd/ in the subject?

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 5/7] block: Delay default elevator initialization
  2019-09-03  9:02   ` Christoph Hellwig
@ 2019-09-04  2:07     ` Damien Le Moal
  2019-09-04  6:47       ` Christoph Hellwig
  0 siblings, 1 reply; 23+ messages in thread
From: Damien Le Moal @ 2019-09-04  2:07 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-block, Jens Axboe, linux-scsi, Martin K . Petersen

On 2019/09/03 18:02, Christoph Hellwig wrote:
> On Wed, Aug 28, 2019 at 11:29:45AM +0900, Damien Le Moal wrote:
>> When elevator_init_mq() is called from blk_mq_init_allocated_queue(),
>> the only information known about the device is the number of hardware
>> queues as the block device scan by the device driver is not completed
>> yet. The device type and the device required features are not set yet,
>> preventing to correctly choose the default elevator most suitable for
>> the device.
>>
>> This currently affects all multi-queue zoned block devices which default
>> to the "none" elevator instead of the required "mq-deadline" elevator.
>> These drives currently include host-managed SMR disks connected to a
>> smartpqi HBA and null_blk block devices with zoned mode enabled.
>> Upcoming NVMe Zoned Namespace devices will also be affected.
>>
>> Fix this by moving the execution of elevator_init_mq() from
>> blk_mq_init_allocated_queue() into __device_add_disk() to allow for the
>> device driver to probe the device characteristics and set attributes
>> of the device request queue prior to the elevator initialization.
>>
>> Also to make sure that the elevator initialization is never done while
>> requests are in-flight (there should be none when the device driver
>> calls device_add_disk()), freeze and quiesce the device request queue
>> before executing blk_mq_init_sched().
> 
> So the disk can be accessed from userspace or partition probing once we
> registered the region.  Based on that I think it would be better if
> we set the elevator a little earlier before that happens.  With that
> we shouldn't have to freeze the queue.
> 

OK. I will move the registration earlier in device_add_disk(), before the region
registration.

However, I would still like to keep the queue freeze to protect against buggy
device drivers that call device_add_disk() with internal commands still going
on. I do not think that there are any such driver, but just want to avoid
problems. The queue freeze is also present for any user initiated elevator
change, so in this respect, this is not any different and should not be a big
problem. Thoughts ?

-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 5/7] block: Delay default elevator initialization
  2019-09-04  2:07     ` Damien Le Moal
@ 2019-09-04  6:47       ` Christoph Hellwig
  0 siblings, 0 replies; 23+ messages in thread
From: Christoph Hellwig @ 2019-09-04  6:47 UTC (permalink / raw)
  To: Damien Le Moal
  Cc: Christoph Hellwig, linux-block, Jens Axboe, linux-scsi,
	Martin K . Petersen

On Wed, Sep 04, 2019 at 02:07:39AM +0000, Damien Le Moal wrote:
> OK. I will move the registration earlier in device_add_disk(), before the region
> registration.
> 
> However, I would still like to keep the queue freeze to protect against buggy
> device drivers that call device_add_disk() with internal commands still going
> on. I do not think that there are any such driver, but just want to avoid
> problems. The queue freeze is also present for any user initiated elevator
> change, so in this respect, this is not any different and should not be a big
> problem. Thoughts ?

I don't really see the point, but there should be no harm it either
since a freeze of a non-busy queue should be fast.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 4/7] block: Improve default elevator selection
  2019-09-03  8:57   ` Christoph Hellwig
@ 2019-09-04  8:42     ` Damien Le Moal
  0 siblings, 0 replies; 23+ messages in thread
From: Damien Le Moal @ 2019-09-04  8:42 UTC (permalink / raw)
  To: hch; +Cc: linux-scsi, axboe, linux-block, martin.petersen

On Tue, 2019-09-03 at 01:57 -0700, Christoph Hellwig wrote:
> On Wed, Aug 28, 2019 at 11:29:44AM +0900, Damien Le Moal wrote:
> > For block devices that do not specify required features, preserve the
> > current default elevator selection (mq-deadline for single queue
> > devices, none for multi-queue devices). However, for devices specifying
> > required features (e.g. zoned block devices ELEVATOR_F_ZBD_SEQ_WRITE
> > feature), select the first available elevator providing the required
> > features.
> > 
> > In all cases, default to "none" if no elevator is available or if the
> > initialization of the default elevator fails.
> > 
> > Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
> 
> Looks good,
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>

Thanks, but I need to fix this patch too.
The "if (q->nr_hw_queues != 1)" at the beginning of elevator_init_mq()
must be removed.

I am sending a v3 with this fix and a modified patch 5 to call
elevator_init_mq() earlier in device_add_disk().

-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2019-09-04  8:42 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-28  2:29 [PATCH v2 0/7] Elevator cleanups and improvements Damien Le Moal
2019-08-28  2:29 ` [PATCH v2 1/7] block: Cleanup elevator_init_mq() use Damien Le Moal
2019-09-03  8:55   ` Christoph Hellwig
2019-08-28  2:29 ` [PATCH v2 2/7] block: Change elevator_init_mq() to always succeed Damien Le Moal
2019-09-03  8:55   ` Christoph Hellwig
2019-08-28  2:29 ` [PATCH v2 3/7] block: Introduce elevator features Damien Le Moal
2019-08-28  8:16   ` Johannes Thumshirn
2019-08-28 10:41     ` Damien Le Moal
2019-08-28 10:43       ` Johannes Thumshirn
2019-08-28 11:08         ` Damien Le Moal
2019-08-28 11:11           ` Johannes Thumshirn
2019-09-03  8:56   ` Christoph Hellwig
2019-08-28  2:29 ` [PATCH v2 4/7] block: Improve default elevator selection Damien Le Moal
2019-09-03  8:57   ` Christoph Hellwig
2019-09-04  8:42     ` Damien Le Moal
2019-08-28  2:29 ` [PATCH v2 5/7] block: Delay default elevator initialization Damien Le Moal
2019-09-03  9:02   ` Christoph Hellwig
2019-09-04  2:07     ` Damien Le Moal
2019-09-04  6:47       ` Christoph Hellwig
2019-08-28  2:29 ` [PATCH v2 6/7] block: Set ELEVATOR_F_ZBD_SEQ_WRITE for nullblk zoned disks Damien Le Moal
2019-09-03  9:03   ` Christoph Hellwig
2019-08-28  2:29 ` [PATCH v2 7/7] scsi: Set ELEVATOR_F_ZBD_SEQ_WRITE for ZBC disks Damien Le Moal
2019-09-03  9:03   ` Christoph Hellwig

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.