All of lore.kernel.org
 help / color / mirror / Atom feed
* [for-4.16 PATCH v3 0/3] block/dm: allow DM to defer blk_register_queue() until ready
@ 2018-01-11  2:12 Mike Snitzer
  2018-01-11  2:12 ` [for-4.16 PATCH v3 1/3] block: only bdi_unregister() in del_gendisk() if !GENHD_FL_HIDDEN Mike Snitzer
                   ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Mike Snitzer @ 2018-01-11  2:12 UTC (permalink / raw)
  To: axboe; +Cc: Ming Lei, hch, Bart.VanAssche, dm-devel, linux-block

Hi Jens,

I eliminated my implementation that set disk->queue = NULL before
calling add_disk().  As we discussed it left way too much potential
for NULL pointer crashes and I agree it was too fragile.

This v3's approach is much simpler.  It adjusts block core so that
blk_register_queue() can be deferred (so add_disk()'s call is avoided
if QUEUE_FLAG_DEFER_REG is set, and a driver must then call it once it
has completed initializing its request_queue).

PATCH 1 is just an unrelated fix.  Christoph agreed with it in reply
to my v2 submission (but he didn't provide a Reviewed-by).  Anyway,
I've revised the header to have it make more sense.

If these changes look reasonable I'd prefer that you pick them all up
for 4.16 (last DM patch included because it'll save me an awkward
dm-4.16 rebase, etc).

Thanks!
Mike

Mike Snitzer (3):
  block: only bdi_unregister() in del_gendisk() if !GENHD_FL_HIDDEN
  block: allow gendisk's request_queue registration to be deferred
  dm: fix awkward and incomplete request_queue initialization

 block/blk-sysfs.c      |  4 ++++
 block/genhd.c          |  7 +++++--
 drivers/md/dm-core.h   |  2 --
 drivers/md/dm-rq.c     | 11 -----------
 drivers/md/dm.c        | 44 ++++++++++++++++++++++++++------------------
 include/linux/blkdev.h |  1 +
 6 files changed, 36 insertions(+), 33 deletions(-)

-- 
2.15.0

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [for-4.16 PATCH v3 1/3] block: only bdi_unregister() in del_gendisk() if !GENHD_FL_HIDDEN
  2018-01-11  2:12 [for-4.16 PATCH v3 0/3] block/dm: allow DM to defer blk_register_queue() until ready Mike Snitzer
@ 2018-01-11  2:12 ` Mike Snitzer
  2018-01-11  2:48   ` Ming Lei
  2018-01-11  7:40   ` Hannes Reinecke
  2018-01-11  2:12 ` [for-4.16 PATCH v3 2/3] block: allow gendisk's request_queue registration to be deferred Mike Snitzer
  2018-01-11  2:12 ` [for-4.16 PATCH v3 3/3] dm: fix awkward and incomplete request_queue initialization Mike Snitzer
  2 siblings, 2 replies; 22+ messages in thread
From: Mike Snitzer @ 2018-01-11  2:12 UTC (permalink / raw)
  To: axboe; +Cc: Ming Lei, hch, Bart.VanAssche, dm-devel, linux-block

device_add_disk() will only call bdi_register_owner() if
!GENHD_FL_HIDDEN, so it follows that del_gendisk() should only call
bdi_unregister() if !GENHD_FL_HIDDEN.

Found with code inspection.  bdi_unregister() won't do any harm if
bdi_register_owner() wasn't used but best to avoid the unnecessary
call to bdi_unregister().

Fixes: 8ddcd65325 ("block: introduce GENHD_FL_HIDDEN")
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 block/genhd.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/genhd.c b/block/genhd.c
index 96a66f671720..00620e01e043 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -725,7 +725,8 @@ void del_gendisk(struct gendisk *disk)
 		 * Unregister bdi before releasing device numbers (as they can
 		 * get reused and we'd get clashes in sysfs).
 		 */
-		bdi_unregister(disk->queue->backing_dev_info);
+		if (!(disk->flags & GENHD_FL_HIDDEN))
+			bdi_unregister(disk->queue->backing_dev_info);
 		blk_unregister_queue(disk);
 	} else {
 		WARN_ON(1);
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [for-4.16 PATCH v3 2/3] block: allow gendisk's request_queue registration to be deferred
  2018-01-11  2:12 [for-4.16 PATCH v3 0/3] block/dm: allow DM to defer blk_register_queue() until ready Mike Snitzer
  2018-01-11  2:12 ` [for-4.16 PATCH v3 1/3] block: only bdi_unregister() in del_gendisk() if !GENHD_FL_HIDDEN Mike Snitzer
@ 2018-01-11  2:12 ` Mike Snitzer
  2018-01-11  2:54   ` Ming Lei
                     ` (2 more replies)
  2018-01-11  2:12 ` [for-4.16 PATCH v3 3/3] dm: fix awkward and incomplete request_queue initialization Mike Snitzer
  2 siblings, 3 replies; 22+ messages in thread
From: Mike Snitzer @ 2018-01-11  2:12 UTC (permalink / raw)
  To: axboe; +Cc: Ming Lei, hch, Bart.VanAssche, dm-devel, linux-block

Since I can remember DM has forced the block layer to allow the
allocation and initialization of the request_queue to be distinct
operations.  Reason for this is block/genhd.c:add_disk() has requires
that the request_queue (and associated bdi) be tied to the gendisk
before add_disk() is called -- because add_disk() also deals with
exposing the request_queue via blk_register_queue().

DM's dynamic creation of arbitrary device types (and associated
request_queue types) requires the DM device's gendisk be available so
that DM table loads can establish a master/slave relationship with
subordinate devices that are referenced by loaded DM tables -- using
bd_link_disk_holder().  But until these DM tables, and their associated
subordinate devices, are known DM cannot know what type of request_queue
it needs -- nor what its queue_limits should be.

This chicken and egg scenario has created all manner of problems for DM
and, at times, the block layer.

Summary of changes:

- Add QUEUE_FLAG_DEFER_REG that a block driver can set to defer the
  required blk_register_queue() until the block driver's request_queue
  is fully initialized.

- Return early from blk_unregister_queue() if QUEUE_FLAG_REGISTERED
  is not set.  It won't be set if a request_queue with
  QUEUE_FLAG_DEFER_REG set never called blk_register_queue() -- as can
  happen if a driver encounters an error and must del_gendisk() before
  calling blk_register_queue().

- Export blk_register_queue().

These changes allow DM to use device_add_disk() to anchor its gendisk as
the "master" for master/slave relationships DM must establish with
subordinate devices referenced in DM tables that get loaded.  Once all
"slave" devices for a DM device are known a request_queue can be
properly initialized and then advertised via sysfs -- important
improvement being that no request_queue resource initialization is
missed -- before these changes DM was known to be missing blk-mq debugfs
and proper block throttle initialization.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 block/blk-sysfs.c      | 4 ++++
 block/genhd.c          | 4 +++-
 include/linux/blkdev.h | 1 +
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 870484eaed1f..2395122875b4 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -921,6 +921,7 @@ int blk_register_queue(struct gendisk *disk)
 	mutex_unlock(&q->sysfs_lock);
 	return ret;
 }
+EXPORT_SYMBOL_GPL(blk_register_queue);
 
 void blk_unregister_queue(struct gendisk *disk)
 {
@@ -929,6 +930,9 @@ void blk_unregister_queue(struct gendisk *disk)
 	if (WARN_ON(!q))
 		return;
 
+	if (!test_bit(QUEUE_FLAG_REGISTERED, &disk->queue->queue_flags))
+		return;
+
 	mutex_lock(&q->sysfs_lock);
 	queue_flag_clear_unlocked(QUEUE_FLAG_REGISTERED, q);
 	mutex_unlock(&q->sysfs_lock);
diff --git a/block/genhd.c b/block/genhd.c
index 00620e01e043..3912a82ecc4f 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -682,7 +682,6 @@ void device_add_disk(struct device *parent, struct gendisk *disk)
 				    exact_match, exact_lock, disk);
 	}
 	register_disk(parent, disk);
-	blk_register_queue(disk);
 
 	/*
 	 * Take an extra ref on queue which will be put on disk_release()
@@ -692,6 +691,9 @@ void device_add_disk(struct device *parent, struct gendisk *disk)
 
 	disk_add_events(disk);
 	blk_integrity_add(disk);
+
+	if (!test_bit(QUEUE_FLAG_DEFER_REG, &disk->queue->queue_flags))
+		blk_register_queue(disk);
 }
 EXPORT_SYMBOL(device_add_disk);
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 71a9371c8182..84d144c7b025 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -681,6 +681,7 @@ struct request_queue {
 #define QUEUE_FLAG_SCSI_PASSTHROUGH 27	/* queue supports SCSI commands */
 #define QUEUE_FLAG_QUIESCED    28	/* queue has been quiesced */
 #define QUEUE_FLAG_PREEMPT_ONLY	29	/* only process REQ_PREEMPT requests */
+#define QUEUE_FLAG_DEFER_REG	30	/* defer registering queue to a disk */
 
 #define QUEUE_FLAG_DEFAULT	((1 << QUEUE_FLAG_IO_STAT) |		\
 				 (1 << QUEUE_FLAG_SAME_COMP)	|	\
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [for-4.16 PATCH v3 3/3] dm: fix awkward and incomplete request_queue initialization
  2018-01-11  2:12 [for-4.16 PATCH v3 0/3] block/dm: allow DM to defer blk_register_queue() until ready Mike Snitzer
  2018-01-11  2:12 ` [for-4.16 PATCH v3 1/3] block: only bdi_unregister() in del_gendisk() if !GENHD_FL_HIDDEN Mike Snitzer
  2018-01-11  2:12 ` [for-4.16 PATCH v3 2/3] block: allow gendisk's request_queue registration to be deferred Mike Snitzer
@ 2018-01-11  2:12 ` Mike Snitzer
  2018-01-11  2:56   ` Ming Lei
  2018-01-11  7:57   ` Hannes Reinecke
  2 siblings, 2 replies; 22+ messages in thread
From: Mike Snitzer @ 2018-01-11  2:12 UTC (permalink / raw)
  To: axboe; +Cc: Ming Lei, hch, Bart.VanAssche, dm-devel, linux-block

DM is now no longer prone to having its request_queue be improperly
initialized.

Summary of changes:

- defer DM's blk_register_queue() from add_disk()-time until
  dm_setup_md_queue() by setting QUEUE_FLAG_DEFER_REG in alloc_dev().

- dm_setup_md_queue() is updated to fully initialize DM's request_queue
  (_after_ all table loads have occurred and the request_queue's type,
  features and limits are known).

- various other small improvements that were noticed along the way.

A very welcome side-effect of these changes is DM no longer needs to:
1) backfill the "mq" sysfs entry (because historically DM didn't
initialize the request_queue to use blk-mq until _after_
register_queue() was called via add_disk()).
2) call elv_register_queue() to get .request_fn request-based DM
device's "queue" exposed in syfs.

In addition, blk-mq debugfs support is now made available because
request-based DM's blk-mq request_queue is now properly initialized
before blk_register_queue() is called.

These changes also stave off the need to introduce new DM-specific
workarounds in block core, e.g. this proposal:
https://patchwork.kernel.org/patch/10067961/

In the end DM devices should be less unicorn in nature (relative to
initialization and availability of block core infrastructure).

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 drivers/md/dm-core.h |  2 --
 drivers/md/dm-rq.c   | 11 -----------
 drivers/md/dm.c      | 44 ++++++++++++++++++++++++++------------------
 3 files changed, 26 insertions(+), 31 deletions(-)

diff --git a/drivers/md/dm-core.h b/drivers/md/dm-core.h
index 6a14f945783c..f955123b4765 100644
--- a/drivers/md/dm-core.h
+++ b/drivers/md/dm-core.h
@@ -130,8 +130,6 @@ struct mapped_device {
 	struct srcu_struct io_barrier;
 };
 
-void dm_init_md_queue(struct mapped_device *md);
-void dm_init_normal_md_queue(struct mapped_device *md);
 int md_in_flight(struct mapped_device *md);
 void disable_write_same(struct mapped_device *md);
 void disable_write_zeroes(struct mapped_device *md);
diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index 9d32f25489c2..3b319776d80c 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -700,7 +700,6 @@ int dm_old_init_request_queue(struct mapped_device *md, struct dm_table *t)
 	/* disable dm_old_request_fn's merge heuristic by default */
 	md->seq_rq_merge_deadline_usecs = 0;
 
-	dm_init_normal_md_queue(md);
 	blk_queue_softirq_done(md->queue, dm_softirq_done);
 
 	/* Initialize the request-based DM worker thread */
@@ -713,8 +712,6 @@ int dm_old_init_request_queue(struct mapped_device *md, struct dm_table *t)
 		return error;
 	}
 
-	elv_register_queue(md->queue);
-
 	return 0;
 }
 
@@ -810,17 +807,9 @@ int dm_mq_init_request_queue(struct mapped_device *md, struct dm_table *t)
 		err = PTR_ERR(q);
 		goto out_tag_set;
 	}
-	dm_init_md_queue(md);
-
-	/* backfill 'mq' sysfs registration normally done in blk_register_queue */
-	err = blk_mq_register_dev(disk_to_dev(md->disk), q);
-	if (err)
-		goto out_cleanup_queue;
 
 	return 0;
 
-out_cleanup_queue:
-	blk_cleanup_queue(q);
 out_tag_set:
 	blk_mq_free_tag_set(md->tag_set);
 out_kfree_tag_set:
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 7475739fee49..f5d61b6adaec 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1626,20 +1626,9 @@ static const struct dax_operations dm_dax_ops;
 
 static void dm_wq_work(struct work_struct *work);
 
-void dm_init_md_queue(struct mapped_device *md)
-{
-	/*
-	 * Initialize data that will only be used by a non-blk-mq DM queue
-	 * - must do so here (in alloc_dev callchain) before queue is used
-	 */
-	md->queue->queuedata = md;
-	md->queue->backing_dev_info->congested_data = md;
-}
-
-void dm_init_normal_md_queue(struct mapped_device *md)
+static void dm_init_normal_md_queue(struct mapped_device *md)
 {
 	md->use_blk_mq = false;
-	dm_init_md_queue(md);
 
 	/*
 	 * Initialize aspects of queue that aren't relevant for blk-mq
@@ -1734,10 +1723,15 @@ static struct mapped_device *alloc_dev(int minor)
 	md->queue = blk_alloc_queue_node(GFP_KERNEL, numa_node_id);
 	if (!md->queue)
 		goto bad;
+	md->queue->queuedata = md;
+	md->queue->backing_dev_info->congested_data = md;
+	/*
+	 * Do not allow add_disk() to blk_register_queue().
+	 * Defer blk_register_queue() until dm_setup_md_queue().
+	 */
+	queue_flag_set_unlocked(QUEUE_FLAG_DEFER_REG, md->queue);
 
-	dm_init_md_queue(md);
-
-	md->disk = alloc_disk_node(1, numa_node_id);
+	md->disk = alloc_disk_node(1, md->numa_node_id);
 	if (!md->disk)
 		goto bad;
 
@@ -1962,13 +1956,18 @@ static struct dm_table *__unbind(struct mapped_device *md)
  */
 int dm_create(int minor, struct mapped_device **result)
 {
+	int r;
 	struct mapped_device *md;
 
 	md = alloc_dev(minor);
 	if (!md)
 		return -ENXIO;
 
-	dm_sysfs_init(md);
+	r = dm_sysfs_init(md);
+	if (r) {
+		free_dev(md);
+		return r;
+	}
 
 	*result = md;
 	return 0;
@@ -2021,10 +2020,12 @@ EXPORT_SYMBOL_GPL(dm_get_queue_limits);
 int dm_setup_md_queue(struct mapped_device *md, struct dm_table *t)
 {
 	int r;
+	struct queue_limits limits;
 	enum dm_queue_mode type = dm_get_md_type(md);
 
 	switch (type) {
 	case DM_TYPE_REQUEST_BASED:
+		dm_init_normal_md_queue(md);
 		r = dm_old_init_request_queue(md, t);
 		if (r) {
 			DMERR("Cannot initialize queue for request-based mapped device");
@@ -2057,6 +2058,14 @@ int dm_setup_md_queue(struct mapped_device *md, struct dm_table *t)
 		break;
 	}
 
+	r = dm_calculate_queue_limits(t, &limits);
+	if (r) {
+		DMERR("Cannot calculate initial queue limits");
+		return r;
+	}
+	dm_table_set_restrictions(t, md->queue, &limits);
+	blk_register_queue(md->disk);
+
 	return 0;
 }
 
@@ -2121,7 +2130,6 @@ EXPORT_SYMBOL_GPL(dm_device_name);
 
 static void __dm_destroy(struct mapped_device *md, bool wait)
 {
-	struct request_queue *q = dm_get_md_queue(md);
 	struct dm_table *map;
 	int srcu_idx;
 
@@ -2132,7 +2140,7 @@ static void __dm_destroy(struct mapped_device *md, bool wait)
 	set_bit(DMF_FREEING, &md->flags);
 	spin_unlock(&_minor_lock);
 
-	blk_set_queue_dying(q);
+	blk_set_queue_dying(md->queue);
 
 	if (dm_request_based(md) && md->kworker_task)
 		kthread_flush_worker(&md->kworker);
-- 
2.15.0

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [for-4.16 PATCH v3 1/3] block: only bdi_unregister() in del_gendisk() if !GENHD_FL_HIDDEN
  2018-01-11  2:12 ` [for-4.16 PATCH v3 1/3] block: only bdi_unregister() in del_gendisk() if !GENHD_FL_HIDDEN Mike Snitzer
@ 2018-01-11  2:48   ` Ming Lei
  2018-01-11  7:40   ` Hannes Reinecke
  1 sibling, 0 replies; 22+ messages in thread
From: Ming Lei @ 2018-01-11  2:48 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: axboe, Ming Lei, hch, Bart.VanAssche, dm-devel, linux-block

On Wed, Jan 10, 2018 at 09:12:54PM -0500, Mike Snitzer wrote:
> device_add_disk() will only call bdi_register_owner() if
> !GENHD_FL_HIDDEN, so it follows that del_gendisk() should only call
> bdi_unregister() if !GENHD_FL_HIDDEN.
> 
> Found with code inspection.  bdi_unregister() won't do any harm if
> bdi_register_owner() wasn't used but best to avoid the unnecessary
> call to bdi_unregister().
> 
> Fixes: 8ddcd65325 ("block: introduce GENHD_FL_HIDDEN")
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  block/genhd.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/block/genhd.c b/block/genhd.c
> index 96a66f671720..00620e01e043 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -725,7 +725,8 @@ void del_gendisk(struct gendisk *disk)
>  		 * Unregister bdi before releasing device numbers (as they can
>  		 * get reused and we'd get clashes in sysfs).
>  		 */
> -		bdi_unregister(disk->queue->backing_dev_info);
> +		if (!(disk->flags & GENHD_FL_HIDDEN))
> +			bdi_unregister(disk->queue->backing_dev_info);
>  		blk_unregister_queue(disk);
>  	} else {
>  		WARN_ON(1);
> -- 
> 2.15.0
> 

Reviewed-by: Ming Lei <ming.lei@redhat.com>

-- 
Ming

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [for-4.16 PATCH v3 2/3] block: allow gendisk's request_queue registration to be deferred
  2018-01-11  2:12 ` [for-4.16 PATCH v3 2/3] block: allow gendisk's request_queue registration to be deferred Mike Snitzer
@ 2018-01-11  2:54   ` Ming Lei
  2018-01-11  7:46   ` Hannes Reinecke
  2018-01-11  7:56   ` Hannes Reinecke
  2 siblings, 0 replies; 22+ messages in thread
From: Ming Lei @ 2018-01-11  2:54 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: axboe, Ming Lei, hch, Bart.VanAssche, dm-devel, linux-block

On Wed, Jan 10, 2018 at 09:12:55PM -0500, Mike Snitzer wrote:
> Since I can remember DM has forced the block layer to allow the
> allocation and initialization of the request_queue to be distinct
> operations.  Reason for this is block/genhd.c:add_disk() has requires
> that the request_queue (and associated bdi) be tied to the gendisk
> before add_disk() is called -- because add_disk() also deals with
> exposing the request_queue via blk_register_queue().
> 
> DM's dynamic creation of arbitrary device types (and associated
> request_queue types) requires the DM device's gendisk be available so
> that DM table loads can establish a master/slave relationship with
> subordinate devices that are referenced by loaded DM tables -- using
> bd_link_disk_holder().  But until these DM tables, and their associated
> subordinate devices, are known DM cannot know what type of request_queue
> it needs -- nor what its queue_limits should be.
> 
> This chicken and egg scenario has created all manner of problems for DM
> and, at times, the block layer.
> 
> Summary of changes:
> 
> - Add QUEUE_FLAG_DEFER_REG that a block driver can set to defer the
>   required blk_register_queue() until the block driver's request_queue
>   is fully initialized.
> 
> - Return early from blk_unregister_queue() if QUEUE_FLAG_REGISTERED
>   is not set.  It won't be set if a request_queue with
>   QUEUE_FLAG_DEFER_REG set never called blk_register_queue() -- as can
>   happen if a driver encounters an error and must del_gendisk() before
>   calling blk_register_queue().
> 
> - Export blk_register_queue().
> 
> These changes allow DM to use device_add_disk() to anchor its gendisk as
> the "master" for master/slave relationships DM must establish with
> subordinate devices referenced in DM tables that get loaded.  Once all
> "slave" devices for a DM device are known a request_queue can be
> properly initialized and then advertised via sysfs -- important
> improvement being that no request_queue resource initialization is
> missed -- before these changes DM was known to be missing blk-mq debugfs
> and proper block throttle initialization.
> 
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  block/blk-sysfs.c      | 4 ++++
>  block/genhd.c          | 4 +++-
>  include/linux/blkdev.h | 1 +
>  3 files changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
> index 870484eaed1f..2395122875b4 100644
> --- a/block/blk-sysfs.c
> +++ b/block/blk-sysfs.c
> @@ -921,6 +921,7 @@ int blk_register_queue(struct gendisk *disk)
>  	mutex_unlock(&q->sysfs_lock);
>  	return ret;
>  }
> +EXPORT_SYMBOL_GPL(blk_register_queue);
>  
>  void blk_unregister_queue(struct gendisk *disk)
>  {
> @@ -929,6 +930,9 @@ void blk_unregister_queue(struct gendisk *disk)
>  	if (WARN_ON(!q))
>  		return;
>  
> +	if (!test_bit(QUEUE_FLAG_REGISTERED, &disk->queue->queue_flags))
> +		return;
> +
>  	mutex_lock(&q->sysfs_lock);
>  	queue_flag_clear_unlocked(QUEUE_FLAG_REGISTERED, q);
>  	mutex_unlock(&q->sysfs_lock);
> diff --git a/block/genhd.c b/block/genhd.c
> index 00620e01e043..3912a82ecc4f 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -682,7 +682,6 @@ void device_add_disk(struct device *parent, struct gendisk *disk)
>  				    exact_match, exact_lock, disk);
>  	}
>  	register_disk(parent, disk);
> -	blk_register_queue(disk);
>  
>  	/*
>  	 * Take an extra ref on queue which will be put on disk_release()
> @@ -692,6 +691,9 @@ void device_add_disk(struct device *parent, struct gendisk *disk)
>  
>  	disk_add_events(disk);
>  	blk_integrity_add(disk);
> +
> +	if (!test_bit(QUEUE_FLAG_DEFER_REG, &disk->queue->queue_flags))
> +		blk_register_queue(disk);
>  }
>  EXPORT_SYMBOL(device_add_disk);
>  
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 71a9371c8182..84d144c7b025 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -681,6 +681,7 @@ struct request_queue {
>  #define QUEUE_FLAG_SCSI_PASSTHROUGH 27	/* queue supports SCSI commands */
>  #define QUEUE_FLAG_QUIESCED    28	/* queue has been quiesced */
>  #define QUEUE_FLAG_PREEMPT_ONLY	29	/* only process REQ_PREEMPT requests */
> +#define QUEUE_FLAG_DEFER_REG	30	/* defer registering queue to a disk */
>  
>  #define QUEUE_FLAG_DEFAULT	((1 << QUEUE_FLAG_IO_STAT) |		\
>  				 (1 << QUEUE_FLAG_SAME_COMP)	|	\
> -- 
> 2.15.0

This way is safe for all other devices, even for DM it is safe too since
block does deal with NULL .make_request_fn well for legacy path.

Maybe QUEUE_FLAG_DEFER_REG can be renamed as QUEUE_FLAG_REG_BY_DRIVER,
but not a big deal, so

	Reviewed-by: Ming Lei <ming.lei@redhat.com>

-- 
Ming

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [for-4.16 PATCH v3 3/3] dm: fix awkward and incomplete request_queue initialization
  2018-01-11  2:12 ` [for-4.16 PATCH v3 3/3] dm: fix awkward and incomplete request_queue initialization Mike Snitzer
@ 2018-01-11  2:56   ` Ming Lei
  2018-01-11  7:57   ` Hannes Reinecke
  1 sibling, 0 replies; 22+ messages in thread
From: Ming Lei @ 2018-01-11  2:56 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: axboe, Ming Lei, hch, Bart.VanAssche, dm-devel, linux-block

On Wed, Jan 10, 2018 at 09:12:56PM -0500, Mike Snitzer wrote:
> DM is now no longer prone to having its request_queue be improperly
> initialized.
> 
> Summary of changes:
> 
> - defer DM's blk_register_queue() from add_disk()-time until
>   dm_setup_md_queue() by setting QUEUE_FLAG_DEFER_REG in alloc_dev().
> 
> - dm_setup_md_queue() is updated to fully initialize DM's request_queue
>   (_after_ all table loads have occurred and the request_queue's type,
>   features and limits are known).
> 
> - various other small improvements that were noticed along the way.
> 
> A very welcome side-effect of these changes is DM no longer needs to:
> 1) backfill the "mq" sysfs entry (because historically DM didn't
> initialize the request_queue to use blk-mq until _after_
> register_queue() was called via add_disk()).
> 2) call elv_register_queue() to get .request_fn request-based DM
> device's "queue" exposed in syfs.
> 
> In addition, blk-mq debugfs support is now made available because
> request-based DM's blk-mq request_queue is now properly initialized
> before blk_register_queue() is called.
> 
> These changes also stave off the need to introduce new DM-specific
> workarounds in block core, e.g. this proposal:
> https://patchwork.kernel.org/patch/10067961/
> 
> In the end DM devices should be less unicorn in nature (relative to
> initialization and availability of block core infrastructure).
> 
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  drivers/md/dm-core.h |  2 --
>  drivers/md/dm-rq.c   | 11 -----------
>  drivers/md/dm.c      | 44 ++++++++++++++++++++++++++------------------
>  3 files changed, 26 insertions(+), 31 deletions(-)
> 
> diff --git a/drivers/md/dm-core.h b/drivers/md/dm-core.h
> index 6a14f945783c..f955123b4765 100644
> --- a/drivers/md/dm-core.h
> +++ b/drivers/md/dm-core.h
> @@ -130,8 +130,6 @@ struct mapped_device {
>  	struct srcu_struct io_barrier;
>  };
>  
> -void dm_init_md_queue(struct mapped_device *md);
> -void dm_init_normal_md_queue(struct mapped_device *md);
>  int md_in_flight(struct mapped_device *md);
>  void disable_write_same(struct mapped_device *md);
>  void disable_write_zeroes(struct mapped_device *md);
> diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
> index 9d32f25489c2..3b319776d80c 100644
> --- a/drivers/md/dm-rq.c
> +++ b/drivers/md/dm-rq.c
> @@ -700,7 +700,6 @@ int dm_old_init_request_queue(struct mapped_device *md, struct dm_table *t)
>  	/* disable dm_old_request_fn's merge heuristic by default */
>  	md->seq_rq_merge_deadline_usecs = 0;
>  
> -	dm_init_normal_md_queue(md);
>  	blk_queue_softirq_done(md->queue, dm_softirq_done);
>  
>  	/* Initialize the request-based DM worker thread */
> @@ -713,8 +712,6 @@ int dm_old_init_request_queue(struct mapped_device *md, struct dm_table *t)
>  		return error;
>  	}
>  
> -	elv_register_queue(md->queue);
> -
>  	return 0;
>  }
>  
> @@ -810,17 +807,9 @@ int dm_mq_init_request_queue(struct mapped_device *md, struct dm_table *t)
>  		err = PTR_ERR(q);
>  		goto out_tag_set;
>  	}
> -	dm_init_md_queue(md);
> -
> -	/* backfill 'mq' sysfs registration normally done in blk_register_queue */
> -	err = blk_mq_register_dev(disk_to_dev(md->disk), q);
> -	if (err)
> -		goto out_cleanup_queue;
>  
>  	return 0;
>  
> -out_cleanup_queue:
> -	blk_cleanup_queue(q);
>  out_tag_set:
>  	blk_mq_free_tag_set(md->tag_set);
>  out_kfree_tag_set:
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index 7475739fee49..f5d61b6adaec 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -1626,20 +1626,9 @@ static const struct dax_operations dm_dax_ops;
>  
>  static void dm_wq_work(struct work_struct *work);
>  
> -void dm_init_md_queue(struct mapped_device *md)
> -{
> -	/*
> -	 * Initialize data that will only be used by a non-blk-mq DM queue
> -	 * - must do so here (in alloc_dev callchain) before queue is used
> -	 */
> -	md->queue->queuedata = md;
> -	md->queue->backing_dev_info->congested_data = md;
> -}
> -
> -void dm_init_normal_md_queue(struct mapped_device *md)
> +static void dm_init_normal_md_queue(struct mapped_device *md)
>  {
>  	md->use_blk_mq = false;
> -	dm_init_md_queue(md);
>  
>  	/*
>  	 * Initialize aspects of queue that aren't relevant for blk-mq
> @@ -1734,10 +1723,15 @@ static struct mapped_device *alloc_dev(int minor)
>  	md->queue = blk_alloc_queue_node(GFP_KERNEL, numa_node_id);
>  	if (!md->queue)
>  		goto bad;
> +	md->queue->queuedata = md;
> +	md->queue->backing_dev_info->congested_data = md;
> +	/*
> +	 * Do not allow add_disk() to blk_register_queue().
> +	 * Defer blk_register_queue() until dm_setup_md_queue().
> +	 */
> +	queue_flag_set_unlocked(QUEUE_FLAG_DEFER_REG, md->queue);
>  
> -	dm_init_md_queue(md);
> -
> -	md->disk = alloc_disk_node(1, numa_node_id);
> +	md->disk = alloc_disk_node(1, md->numa_node_id);
>  	if (!md->disk)
>  		goto bad;
>  
> @@ -1962,13 +1956,18 @@ static struct dm_table *__unbind(struct mapped_device *md)
>   */
>  int dm_create(int minor, struct mapped_device **result)
>  {
> +	int r;
>  	struct mapped_device *md;
>  
>  	md = alloc_dev(minor);
>  	if (!md)
>  		return -ENXIO;
>  
> -	dm_sysfs_init(md);
> +	r = dm_sysfs_init(md);
> +	if (r) {
> +		free_dev(md);
> +		return r;
> +	}
>  
>  	*result = md;
>  	return 0;
> @@ -2021,10 +2020,12 @@ EXPORT_SYMBOL_GPL(dm_get_queue_limits);
>  int dm_setup_md_queue(struct mapped_device *md, struct dm_table *t)
>  {
>  	int r;
> +	struct queue_limits limits;
>  	enum dm_queue_mode type = dm_get_md_type(md);
>  
>  	switch (type) {
>  	case DM_TYPE_REQUEST_BASED:
> +		dm_init_normal_md_queue(md);
>  		r = dm_old_init_request_queue(md, t);
>  		if (r) {
>  			DMERR("Cannot initialize queue for request-based mapped device");
> @@ -2057,6 +2058,14 @@ int dm_setup_md_queue(struct mapped_device *md, struct dm_table *t)
>  		break;
>  	}
>  
> +	r = dm_calculate_queue_limits(t, &limits);
> +	if (r) {
> +		DMERR("Cannot calculate initial queue limits");
> +		return r;
> +	}
> +	dm_table_set_restrictions(t, md->queue, &limits);
> +	blk_register_queue(md->disk);
> +
>  	return 0;
>  }
>  
> @@ -2121,7 +2130,6 @@ EXPORT_SYMBOL_GPL(dm_device_name);
>  
>  static void __dm_destroy(struct mapped_device *md, bool wait)
>  {
> -	struct request_queue *q = dm_get_md_queue(md);
>  	struct dm_table *map;
>  	int srcu_idx;
>  
> @@ -2132,7 +2140,7 @@ static void __dm_destroy(struct mapped_device *md, bool wait)
>  	set_bit(DMF_FREEING, &md->flags);
>  	spin_unlock(&_minor_lock);
>  
> -	blk_set_queue_dying(q);
> +	blk_set_queue_dying(md->queue);
>  
>  	if (dm_request_based(md) && md->kworker_task)
>  		kthread_flush_worker(&md->kworker);
> -- 
> 2.15.0
> 

Pass some of my block/DM sanity test, and blk-mq debugfs can be used
with this patch on DM-MPATH.

	Tested-by: Ming Lei <ming.lei@redhat.com>

-- 
Ming

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [for-4.16 PATCH v3 1/3] block: only bdi_unregister() in del_gendisk() if !GENHD_FL_HIDDEN
  2018-01-11  2:12 ` [for-4.16 PATCH v3 1/3] block: only bdi_unregister() in del_gendisk() if !GENHD_FL_HIDDEN Mike Snitzer
  2018-01-11  2:48   ` Ming Lei
@ 2018-01-11  7:40   ` Hannes Reinecke
  1 sibling, 0 replies; 22+ messages in thread
From: Hannes Reinecke @ 2018-01-11  7:40 UTC (permalink / raw)
  To: Mike Snitzer, axboe; +Cc: Ming Lei, hch, Bart.VanAssche, dm-devel, linux-block

On 01/11/2018 03:12 AM, Mike Snitzer wrote:
> device_add_disk() will only call bdi_register_owner() if
> !GENHD_FL_HIDDEN, so it follows that del_gendisk() should only call
> bdi_unregister() if !GENHD_FL_HIDDEN.
> 
> Found with code inspection.  bdi_unregister() won't do any harm if
> bdi_register_owner() wasn't used but best to avoid the unnecessary
> call to bdi_unregister().
> 
> Fixes: 8ddcd65325 ("block: introduce GENHD_FL_HIDDEN")
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  block/genhd.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/block/genhd.c b/block/genhd.c
> index 96a66f671720..00620e01e043 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -725,7 +725,8 @@ void del_gendisk(struct gendisk *disk)
>  		 * Unregister bdi before releasing device numbers (as they can
>  		 * get reused and we'd get clashes in sysfs).
>  		 */
> -		bdi_unregister(disk->queue->backing_dev_info);
> +		if (!(disk->flags & GENHD_FL_HIDDEN))
> +			bdi_unregister(disk->queue->backing_dev_info);
>  		blk_unregister_queue(disk);
>  	} else {
>  		WARN_ON(1);
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [for-4.16 PATCH v3 2/3] block: allow gendisk's request_queue registration to be deferred
  2018-01-11  2:12 ` [for-4.16 PATCH v3 2/3] block: allow gendisk's request_queue registration to be deferred Mike Snitzer
  2018-01-11  2:54   ` Ming Lei
@ 2018-01-11  7:46   ` Hannes Reinecke
  2018-01-11 17:04     ` Mike Snitzer
  2018-01-11  7:56   ` Hannes Reinecke
  2 siblings, 1 reply; 22+ messages in thread
From: Hannes Reinecke @ 2018-01-11  7:46 UTC (permalink / raw)
  To: Mike Snitzer, axboe; +Cc: Ming Lei, hch, Bart.VanAssche, dm-devel, linux-block

On 01/11/2018 03:12 AM, Mike Snitzer wrote:
> Since I can remember DM has forced the block layer to allow the
> allocation and initialization of the request_queue to be distinct
> operations.  Reason for this is block/genhd.c:add_disk() has requires
> that the request_queue (and associated bdi) be tied to the gendisk
> before add_disk() is called -- because add_disk() also deals with
> exposing the request_queue via blk_register_queue().
> 
> DM's dynamic creation of arbitrary device types (and associated
> request_queue types) requires the DM device's gendisk be available so
> that DM table loads can establish a master/slave relationship with
> subordinate devices that are referenced by loaded DM tables -- using
> bd_link_disk_holder().  But until these DM tables, and their associated
> subordinate devices, are known DM cannot know what type of request_queue
> it needs -- nor what its queue_limits should be.
> 
> This chicken and egg scenario has created all manner of problems for DM
> and, at times, the block layer.
> 
> Summary of changes:
> 
> - Add QUEUE_FLAG_DEFER_REG that a block driver can set to defer the
>   required blk_register_queue() until the block driver's request_queue
>   is fully initialized.
> 
> - Return early from blk_unregister_queue() if QUEUE_FLAG_REGISTERED
>   is not set.  It won't be set if a request_queue with
>   QUEUE_FLAG_DEFER_REG set never called blk_register_queue() -- as can
>   happen if a driver encounters an error and must del_gendisk() before
>   calling blk_register_queue().
> 
> - Export blk_register_queue().
> 
> These changes allow DM to use device_add_disk() to anchor its gendisk as
> the "master" for master/slave relationships DM must establish with
> subordinate devices referenced in DM tables that get loaded.  Once all
> "slave" devices for a DM device are known a request_queue can be
> properly initialized and then advertised via sysfs -- important
> improvement being that no request_queue resource initialization is
> missed -- before these changes DM was known to be missing blk-mq debugfs
> and proper block throttle initialization.
> 
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  block/blk-sysfs.c      | 4 ++++
>  block/genhd.c          | 4 +++-
>  include/linux/blkdev.h | 1 +
>  3 files changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
> index 870484eaed1f..2395122875b4 100644
> --- a/block/blk-sysfs.c
> +++ b/block/blk-sysfs.c
> @@ -921,6 +921,7 @@ int blk_register_queue(struct gendisk *disk)
>  	mutex_unlock(&q->sysfs_lock);
>  	return ret;
>  }
> +EXPORT_SYMBOL_GPL(blk_register_queue);
>  
>  void blk_unregister_queue(struct gendisk *disk)
>  {
> @@ -929,6 +930,9 @@ void blk_unregister_queue(struct gendisk *disk)
>  	if (WARN_ON(!q))
>  		return;
>  
> +	if (!test_bit(QUEUE_FLAG_REGISTERED, &disk->queue->queue_flags))
> +		return;
> +
>  	mutex_lock(&q->sysfs_lock);
>  	queue_flag_clear_unlocked(QUEUE_FLAG_REGISTERED, q);
>  	mutex_unlock(&q->sysfs_lock);
Why can't we use test_and_clear_bit() here?
Shouldn't that relieve the need for the mutex_lock() here, too?

> diff --git a/block/genhd.c b/block/genhd.c
> index 00620e01e043..3912a82ecc4f 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -682,7 +682,6 @@ void device_add_disk(struct device *parent, struct gendisk *disk)
>  				    exact_match, exact_lock, disk);
>  	}
>  	register_disk(parent, disk);
> -	blk_register_queue(disk);
>  
>  	/*
>  	 * Take an extra ref on queue which will be put on disk_release()
> @@ -692,6 +691,9 @@ void device_add_disk(struct device *parent, struct gendisk *disk)
>  
>  	disk_add_events(disk);
>  	blk_integrity_add(disk);
> +
> +	if (!test_bit(QUEUE_FLAG_DEFER_REG, &disk->queue->queue_flags))
> +		blk_register_queue(disk);
>  }
>  EXPORT_SYMBOL(device_add_disk);
>  
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 71a9371c8182..84d144c7b025 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -681,6 +681,7 @@ struct request_queue {
>  #define QUEUE_FLAG_SCSI_PASSTHROUGH 27	/* queue supports SCSI commands */
>  #define QUEUE_FLAG_QUIESCED    28	/* queue has been quiesced */
>  #define QUEUE_FLAG_PREEMPT_ONLY	29	/* only process REQ_PREEMPT requests */
> +#define QUEUE_FLAG_DEFER_REG	30	/* defer registering queue to a disk */
>  
>  #define QUEUE_FLAG_DEFAULT	((1 << QUEUE_FLAG_IO_STAT) |		\
>  				 (1 << QUEUE_FLAG_SAME_COMP)	|	\
> 
Where is this flag used?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [for-4.16 PATCH v3 2/3] block: allow gendisk's request_queue registration to be deferred
  2018-01-11  2:12 ` [for-4.16 PATCH v3 2/3] block: allow gendisk's request_queue registration to be deferred Mike Snitzer
  2018-01-11  2:54   ` Ming Lei
  2018-01-11  7:46   ` Hannes Reinecke
@ 2018-01-11  7:56   ` Hannes Reinecke
  2018-01-11 16:03     ` Mike Snitzer
  2 siblings, 1 reply; 22+ messages in thread
From: Hannes Reinecke @ 2018-01-11  7:56 UTC (permalink / raw)
  To: Mike Snitzer, axboe; +Cc: Ming Lei, hch, Bart.VanAssche, dm-devel, linux-block

On 01/11/2018 03:12 AM, Mike Snitzer wrote:
> Since I can remember DM has forced the block layer to allow the
> allocation and initialization of the request_queue to be distinct
> operations.  Reason for this is block/genhd.c:add_disk() has requires
> that the request_queue (and associated bdi) be tied to the gendisk
> before add_disk() is called -- because add_disk() also deals with
> exposing the request_queue via blk_register_queue().
> 
> DM's dynamic creation of arbitrary device types (and associated
> request_queue types) requires the DM device's gendisk be available so
> that DM table loads can establish a master/slave relationship with
> subordinate devices that are referenced by loaded DM tables -- using
> bd_link_disk_holder().  But until these DM tables, and their associated
> subordinate devices, are known DM cannot know what type of request_queue
> it needs -- nor what its queue_limits should be.
> 
> This chicken and egg scenario has created all manner of problems for DM
> and, at times, the block layer.
> 
> Summary of changes:
> 
> - Add QUEUE_FLAG_DEFER_REG that a block driver can set to defer the
>   required blk_register_queue() until the block driver's request_queue
>   is fully initialized.
> 
> - Return early from blk_unregister_queue() if QUEUE_FLAG_REGISTERED
>   is not set.  It won't be set if a request_queue with
>   QUEUE_FLAG_DEFER_REG set never called blk_register_queue() -- as can
>   happen if a driver encounters an error and must del_gendisk() before
>   calling blk_register_queue().
> 
> - Export blk_register_queue().
> 
> These changes allow DM to use device_add_disk() to anchor its gendisk as
> the "master" for master/slave relationships DM must establish with
> subordinate devices referenced in DM tables that get loaded.  Once all
> "slave" devices for a DM device are known a request_queue can be
> properly initialized and then advertised via sysfs -- important
> improvement being that no request_queue resource initialization is
> missed -- before these changes DM was known to be missing blk-mq debugfs
> and proper block throttle initialization.
> 
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  block/blk-sysfs.c      | 4 ++++
>  block/genhd.c          | 4 +++-
>  include/linux/blkdev.h | 1 +
>  3 files changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
> index 870484eaed1f..2395122875b4 100644
> --- a/block/blk-sysfs.c
> +++ b/block/blk-sysfs.c
> @@ -921,6 +921,7 @@ int blk_register_queue(struct gendisk *disk)
>  	mutex_unlock(&q->sysfs_lock);
>  	return ret;
>  }
> +EXPORT_SYMBOL_GPL(blk_register_queue);
>  
>  void blk_unregister_queue(struct gendisk *disk)
>  {
> @@ -929,6 +930,9 @@ void blk_unregister_queue(struct gendisk *disk)
>  	if (WARN_ON(!q))
>  		return;
>  
> +	if (!test_bit(QUEUE_FLAG_REGISTERED, &disk->queue->queue_flags))
> +		return;
> +
>  	mutex_lock(&q->sysfs_lock);
>  	queue_flag_clear_unlocked(QUEUE_FLAG_REGISTERED, q);
>  	mutex_unlock(&q->sysfs_lock);
> diff --git a/block/genhd.c b/block/genhd.c
> index 00620e01e043..3912a82ecc4f 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -682,7 +682,6 @@ void device_add_disk(struct device *parent, struct gendisk *disk)
>  				    exact_match, exact_lock, disk);
>  	}
>  	register_disk(parent, disk);
> -	blk_register_queue(disk);
>  
>  	/*
>  	 * Take an extra ref on queue which will be put on disk_release()
> @@ -692,6 +691,9 @@ void device_add_disk(struct device *parent, struct gendisk *disk)
>  
>  	disk_add_events(disk);
>  	blk_integrity_add(disk);
> +
> +	if (!test_bit(QUEUE_FLAG_DEFER_REG, &disk->queue->queue_flags))
> +		blk_register_queue(disk);
>  }
>  EXPORT_SYMBOL(device_add_disk);
>  
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 71a9371c8182..84d144c7b025 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -681,6 +681,7 @@ struct request_queue {
>  #define QUEUE_FLAG_SCSI_PASSTHROUGH 27	/* queue supports SCSI commands */
>  #define QUEUE_FLAG_QUIESCED    28	/* queue has been quiesced */
>  #define QUEUE_FLAG_PREEMPT_ONLY	29	/* only process REQ_PREEMPT requests */
> +#define QUEUE_FLAG_DEFER_REG	30	/* defer registering queue to a disk */
>  
>  #define QUEUE_FLAG_DEFAULT	((1 << QUEUE_FLAG_IO_STAT) |		\
>  				 (1 << QUEUE_FLAG_SAME_COMP)	|	\
> 
Thinking of this a bit more, wouldn't it be better to modify add_disk()
(or, rather, device_add_disk()) to handle this case?
You already moved the call to 'blk_register_queue()' to the end of
device_add_disk(), so it would be trivial to make device_add_disk() a
wrapper function like

void device_add_disk(struct device *parent, struct gendisk *disk) {
  blk_add_disk(parent, disk);
  blk_register_queue(disk);
}

and then call blk_add_disk() from the dm-core.
That would save us the magic 'you have to set this flag before
registering' dance in dm.c...

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [for-4.16 PATCH v3 3/3] dm: fix awkward and incomplete request_queue initialization
  2018-01-11  2:12 ` [for-4.16 PATCH v3 3/3] dm: fix awkward and incomplete request_queue initialization Mike Snitzer
  2018-01-11  2:56   ` Ming Lei
@ 2018-01-11  7:57   ` Hannes Reinecke
  1 sibling, 0 replies; 22+ messages in thread
From: Hannes Reinecke @ 2018-01-11  7:57 UTC (permalink / raw)
  To: Mike Snitzer, axboe; +Cc: Ming Lei, hch, Bart.VanAssche, dm-devel, linux-block

On 01/11/2018 03:12 AM, Mike Snitzer wrote:
> DM is now no longer prone to having its request_queue be improperly
> initialized.
> 
> Summary of changes:
> 
> - defer DM's blk_register_queue() from add_disk()-time until
>   dm_setup_md_queue() by setting QUEUE_FLAG_DEFER_REG in alloc_dev().
> 
> - dm_setup_md_queue() is updated to fully initialize DM's request_queue
>   (_after_ all table loads have occurred and the request_queue's type,
>   features and limits are known).
> 
> - various other small improvements that were noticed along the way.
> 
> A very welcome side-effect of these changes is DM no longer needs to:
> 1) backfill the "mq" sysfs entry (because historically DM didn't
> initialize the request_queue to use blk-mq until _after_
> register_queue() was called via add_disk()).
> 2) call elv_register_queue() to get .request_fn request-based DM
> device's "queue" exposed in syfs.
> 
> In addition, blk-mq debugfs support is now made available because
> request-based DM's blk-mq request_queue is now properly initialized
> before blk_register_queue() is called.
> 
> These changes also stave off the need to introduce new DM-specific
> workarounds in block core, e.g. this proposal:
> https://patchwork.kernel.org/patch/10067961/
> 
> In the end DM devices should be less unicorn in nature (relative to
> initialization and availability of block core infrastructure).
> 
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  drivers/md/dm-core.h |  2 --
>  drivers/md/dm-rq.c   | 11 -----------
>  drivers/md/dm.c      | 44 ++++++++++++++++++++++++++------------------
>  3 files changed, 26 insertions(+), 31 deletions(-)
> 
> diff --git a/drivers/md/dm-core.h b/drivers/md/dm-core.h
> index 6a14f945783c..f955123b4765 100644
> --- a/drivers/md/dm-core.h
> +++ b/drivers/md/dm-core.h
> @@ -130,8 +130,6 @@ struct mapped_device {
>  	struct srcu_struct io_barrier;
>  };
>  
> -void dm_init_md_queue(struct mapped_device *md);
> -void dm_init_normal_md_queue(struct mapped_device *md);
>  int md_in_flight(struct mapped_device *md);
>  void disable_write_same(struct mapped_device *md);
>  void disable_write_zeroes(struct mapped_device *md);
> diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
> index 9d32f25489c2..3b319776d80c 100644
> --- a/drivers/md/dm-rq.c
> +++ b/drivers/md/dm-rq.c
> @@ -700,7 +700,6 @@ int dm_old_init_request_queue(struct mapped_device *md, struct dm_table *t)
>  	/* disable dm_old_request_fn's merge heuristic by default */
>  	md->seq_rq_merge_deadline_usecs = 0;
>  
> -	dm_init_normal_md_queue(md);
>  	blk_queue_softirq_done(md->queue, dm_softirq_done);
>  
>  	/* Initialize the request-based DM worker thread */
> @@ -713,8 +712,6 @@ int dm_old_init_request_queue(struct mapped_device *md, struct dm_table *t)
>  		return error;
>  	}
>  
> -	elv_register_queue(md->queue);
> -
>  	return 0;
>  }
>  
> @@ -810,17 +807,9 @@ int dm_mq_init_request_queue(struct mapped_device *md, struct dm_table *t)
>  		err = PTR_ERR(q);
>  		goto out_tag_set;
>  	}
> -	dm_init_md_queue(md);
> -
> -	/* backfill 'mq' sysfs registration normally done in blk_register_queue */
> -	err = blk_mq_register_dev(disk_to_dev(md->disk), q);
> -	if (err)
> -		goto out_cleanup_queue;
>  
>  	return 0;
>  
> -out_cleanup_queue:
> -	blk_cleanup_queue(q);
>  out_tag_set:
>  	blk_mq_free_tag_set(md->tag_set);
>  out_kfree_tag_set:
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index 7475739fee49..f5d61b6adaec 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -1626,20 +1626,9 @@ static const struct dax_operations dm_dax_ops;
>  
>  static void dm_wq_work(struct work_struct *work);
>  
> -void dm_init_md_queue(struct mapped_device *md)
> -{
> -	/*
> -	 * Initialize data that will only be used by a non-blk-mq DM queue
> -	 * - must do so here (in alloc_dev callchain) before queue is used
> -	 */
> -	md->queue->queuedata = md;
> -	md->queue->backing_dev_info->congested_data = md;
> -}
> -
> -void dm_init_normal_md_queue(struct mapped_device *md)
> +static void dm_init_normal_md_queue(struct mapped_device *md)
>  {
>  	md->use_blk_mq = false;
> -	dm_init_md_queue(md);
>  
>  	/*
>  	 * Initialize aspects of queue that aren't relevant for blk-mq
> @@ -1734,10 +1723,15 @@ static struct mapped_device *alloc_dev(int minor)
>  	md->queue = blk_alloc_queue_node(GFP_KERNEL, numa_node_id);
>  	if (!md->queue)
>  		goto bad;
> +	md->queue->queuedata = md;
> +	md->queue->backing_dev_info->congested_data = md;
> +	/*
> +	 * Do not allow add_disk() to blk_register_queue().
> +	 * Defer blk_register_queue() until dm_setup_md_queue().
> +	 */
> +	queue_flag_set_unlocked(QUEUE_FLAG_DEFER_REG, md->queue);
>  
> -	dm_init_md_queue(md);
> -
> -	md->disk = alloc_disk_node(1, numa_node_id);
> +	md->disk = alloc_disk_node(1, md->numa_node_id);
>  	if (!md->disk)
>  		goto bad;
>  
> @@ -1962,13 +1956,18 @@ static struct dm_table *__unbind(struct mapped_device *md)
>   */
>  int dm_create(int minor, struct mapped_device **result)
>  {
> +	int r;
>  	struct mapped_device *md;
>  
>  	md = alloc_dev(minor);
>  	if (!md)
>  		return -ENXIO;
>  
> -	dm_sysfs_init(md);
> +	r = dm_sysfs_init(md);
> +	if (r) {
> +		free_dev(md);
> +		return r;
> +	}
>  
>  	*result = md;
>  	return 0;
> @@ -2021,10 +2020,12 @@ EXPORT_SYMBOL_GPL(dm_get_queue_limits);
>  int dm_setup_md_queue(struct mapped_device *md, struct dm_table *t)
>  {
>  	int r;
> +	struct queue_limits limits;
>  	enum dm_queue_mode type = dm_get_md_type(md);
>  
>  	switch (type) {
>  	case DM_TYPE_REQUEST_BASED:
> +		dm_init_normal_md_queue(md);
>  		r = dm_old_init_request_queue(md, t);
>  		if (r) {
>  			DMERR("Cannot initialize queue for request-based mapped device");
> @@ -2057,6 +2058,14 @@ int dm_setup_md_queue(struct mapped_device *md, struct dm_table *t)
>  		break;
>  	}
>  
> +	r = dm_calculate_queue_limits(t, &limits);
> +	if (r) {
> +		DMERR("Cannot calculate initial queue limits");
> +		return r;
> +	}
> +	dm_table_set_restrictions(t, md->queue, &limits);
> +	blk_register_queue(md->disk);
> +
>  	return 0;
>  }
>  
> @@ -2121,7 +2130,6 @@ EXPORT_SYMBOL_GPL(dm_device_name);
>  
>  static void __dm_destroy(struct mapped_device *md, bool wait)
>  {
> -	struct request_queue *q = dm_get_md_queue(md);
>  	struct dm_table *map;
>  	int srcu_idx;
>  
> @@ -2132,7 +2140,7 @@ static void __dm_destroy(struct mapped_device *md, bool wait)
>  	set_bit(DMF_FREEING, &md->flags);
>  	spin_unlock(&_minor_lock);
>  
> -	blk_set_queue_dying(q);
> +	blk_set_queue_dying(md->queue);
>  
>  	if (dm_request_based(md) && md->kworker_task)
>  		kthread_flush_worker(&md->kworker);
> 
As mentioned in the other mail, maybe one should consider using a
wrapper function for 'add_disk()' to avoid having to set the magic queue
flag.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [for-4.16 PATCH v3 2/3] block: allow gendisk's request_queue registration to be deferred
  2018-01-11  7:56   ` Hannes Reinecke
@ 2018-01-11 16:03     ` Mike Snitzer
  0 siblings, 0 replies; 22+ messages in thread
From: Mike Snitzer @ 2018-01-11 16:03 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: axboe, Ming Lei, hch, Bart.VanAssche, dm-devel, linux-block

On Thu, Jan 11 2018 at  2:56am -0500,
Hannes Reinecke <hare@suse.de> wrote:

> Thinking of this a bit more, wouldn't it be better to modify add_disk()
> (or, rather, device_add_disk()) to handle this case?
> You already moved the call to 'blk_register_queue()' to the end of
> device_add_disk(), so it would be trivial to make device_add_disk() a
> wrapper function like
> 
> void device_add_disk(struct device *parent, struct gendisk *disk) {
>   blk_add_disk(parent, disk);
>   blk_register_queue(disk);
> }
> 
> and then call blk_add_disk() from the dm-core.
> That would save us the magic 'you have to set this flag before
> registering' dance in dm.c...

Really not seeing the QUEUE_FLAG I added as "magic", and I think it
useful to have a bit set in the queue that lets us know this variant of
queue registration was used (when debugging in "crash" or something,
avoids needing to mentally know that DM or some other driver uses it).

But if we did do what you're suggesting (see below), we're left with 2
functions that have similar names (leaving block core consumers
wondering which to use).  So I think it best to rename blk_add_disk to
blk_add_disk_no_queue_reg.

I'll run with that and take a look at your other review point (should we
just use test_and_set_bit in del_gendisk).  And then post v4 after
testing.

Thanks,
Mike

diff --git a/block/genhd.c b/block/genhd.c
index 00620e0..bdc3590 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -629,7 +629,7 @@ static void register_disk(struct device *parent, struct gendisk *disk)
 }
 
 /**
- * device_add_disk - add partitioning information to kernel list
+ * blk_add_disk - add partitioning information to kernel list
  * @parent: parent device for the disk
  * @disk: per-device partitioning information
  *
@@ -638,7 +638,7 @@ static void register_disk(struct device *parent, struct gendisk *disk)
  *
  * FIXME: error handling
  */
-void device_add_disk(struct device *parent, struct gendisk *disk)
+void blk_add_disk(struct device *parent, struct gendisk *disk)
 {
 	dev_t devt;
 	int retval;
@@ -682,7 +682,6 @@ void device_add_disk(struct device *parent, struct gendisk *disk)
 				    exact_match, exact_lock, disk);
 	}
 	register_disk(parent, disk);
-	blk_register_queue(disk);
 
 	/*
 	 * Take an extra ref on queue which will be put on disk_release()
@@ -693,6 +692,21 @@ void device_add_disk(struct device *parent, struct gendisk *disk)
 	disk_add_events(disk);
 	blk_integrity_add(disk);
 }
+EXPORT_SYMBOL(blk_add_disk)
+
+/**
+ * device_add_disk - add disk information to kernel list
+ * @parent: parent device for the disk
+ * @disk: per-device disk information
+ *
+ * Adds partitioning information via blk_add_disk() and
+ * also also registers the associated request_queue to @disk.
+ */
+void device_add_disk(struct device *parent, struct gendisk *disk)
+{
+	blk_add_disk(parent, disk);
+	blk_register_queue(disk);
+}
 EXPORT_SYMBOL(device_add_disk);
 
 void del_gendisk(struct gendisk *disk)
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 5144ebe..d83fd25 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -390,6 +390,7 @@ static inline void free_part_info(struct hd_struct *part)
 extern void part_round_stats(struct request_queue *q, int cpu, struct hd_struct *part);
 
 /* block/genhd.c */
+extern void blk_add_disk(struct device *parent, struct gendisk *disk);
 extern void device_add_disk(struct device *parent, struct gendisk *disk);
 static inline void add_disk(struct gendisk *disk)
 {

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [for-4.16 PATCH v3 2/3] block: allow gendisk's request_queue registration to be deferred
  2018-01-11  7:46   ` Hannes Reinecke
@ 2018-01-11 17:04     ` Mike Snitzer
  2018-01-11 17:18         ` Bart Van Assche
  0 siblings, 1 reply; 22+ messages in thread
From: Mike Snitzer @ 2018-01-11 17:04 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: axboe, Ming Lei, hch, Bart.VanAssche, dm-devel, linux-block

On Thu, Jan 11 2018 at  2:46am -0500,
Hannes Reinecke <hare@suse.de> wrote:

> On 01/11/2018 03:12 AM, Mike Snitzer wrote:
> > 
> > diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
> > index 870484eaed1f..2395122875b4 100644
> > --- a/block/blk-sysfs.c
> > +++ b/block/blk-sysfs.c
> > @@ -929,6 +930,9 @@ void blk_unregister_queue(struct gendisk *disk)
> >  	if (WARN_ON(!q))
> >  		return;
> >  
> > +	if (!test_bit(QUEUE_FLAG_REGISTERED, &disk->queue->queue_flags))
> > +		return;
> > +
> >  	mutex_lock(&q->sysfs_lock);
> >  	queue_flag_clear_unlocked(QUEUE_FLAG_REGISTERED, q);
> >  	mutex_unlock(&q->sysfs_lock);
> Why can't we use test_and_clear_bit() here?
> Shouldn't that relieve the need for the mutex_lock() here, too?

FYI, I looked at this and then got concerned with it enough to chat with
Mikulas (cc'd) about it, here is what he said:

11:54 <mikulas> if (!test_and_clear_bit(QUEUE_FLAG_REGISTERED, &q->queue_flags))
11:55 <mikulas> this is buggy - the operation test_and_clear_bit is atomic - but if it races with some non-atomic read-modify-write operation on q->queue_flags, then the atomic modification would be lost
11:56 <mikulas> you must use always atomic accesses on q->queue_flags or always non-atomic accesses inside a mutex
11:56 <mikulas> queue_flag_clear_unlocked uses non-atomic __clear_bit
11:57 <mikulas> other functions in include/linux/blkdev.h also use non-atomic operations - so you cannot mix them with atomic operations

So my v4, that I'll send out shortly, won't be using test_and_clear_bit()

Thanks,
Mike

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [for-4.16 PATCH v3 2/3] block: allow gendisk's request_queue registration to be deferred
  2018-01-11 17:04     ` Mike Snitzer
@ 2018-01-11 17:18         ` Bart Van Assche
  0 siblings, 0 replies; 22+ messages in thread
From: Bart Van Assche @ 2018-01-11 17:18 UTC (permalink / raw)
  To: hare, snitzer; +Cc: hch, dm-devel, linux-block, tom.leiming, axboe

T24gVGh1LCAyMDE4LTAxLTExIGF0IDEyOjA0IC0wNTAwLCBNaWtlIFNuaXR6ZXIgd3JvdGU6DQo+
IFNvIG15IHY0LCB0aGF0IEknbGwgc2VuZCBvdXQgc2hvcnRseSwgd29uJ3QgYmUgdXNpbmcgdGVz
dF9hbmRfY2xlYXJfYml0KCkNCg0KUGxlYXNlIHVzZSBxdWV1ZV9mbGFnX3NldCgpLCBxdWV1ZV9m
bGFnX2NsZWFyKCksIHF1ZXVlX2ZsYWdfdGVzdF9hbmRfY2xlYXIoKSBhbmQvb3INCnF1ZXVlX2Zs
YWdfdGVzdF9hbmRfc2V0KCkgdG8gbWFuaXB1bGF0ZSBxdWV1ZSBmbGFncy4NCg0KVGhhbmtzLA0K
DQpCYXJ0Lg==

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [for-4.16 PATCH v3 2/3] block: allow gendisk's request_queue registration to be deferred
@ 2018-01-11 17:18         ` Bart Van Assche
  0 siblings, 0 replies; 22+ messages in thread
From: Bart Van Assche @ 2018-01-11 17:18 UTC (permalink / raw)
  To: hare, snitzer; +Cc: hch, dm-devel, linux-block, tom.leiming, axboe

On Thu, 2018-01-11 at 12:04 -0500, Mike Snitzer wrote:
> So my v4, that I'll send out shortly, won't be using test_and_clear_bit()

Please use queue_flag_set(), queue_flag_clear(), queue_flag_test_and_clear() and/or
queue_flag_test_and_set() to manipulate queue flags.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [for-4.16 PATCH v3 2/3] block: allow gendisk's request_queue registration to be deferred
  2018-01-11 17:18         ` Bart Van Assche
  (?)
@ 2018-01-11 17:29         ` Mike Snitzer
  2018-01-11 17:47             ` Bart Van Assche
  -1 siblings, 1 reply; 22+ messages in thread
From: Mike Snitzer @ 2018-01-11 17:29 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: hare, hch, dm-devel, linux-block, tom.leiming, axboe

On Thu, Jan 11 2018 at 12:18pm -0500,
Bart Van Assche <Bart.VanAssche@wdc.com> wrote:

> On Thu, 2018-01-11 at 12:04 -0500, Mike Snitzer wrote:
> > So my v4, that I'll send out shortly, won't be using test_and_clear_bit()
> 
> Please use queue_flag_set(), queue_flag_clear(), queue_flag_test_and_clear() and/or
> queue_flag_test_and_set() to manipulate queue flags.

Can you please expand on this?  My patch is only using test_bit().

So your comment isn't relevant.

Mike

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [for-4.16 PATCH v3 2/3] block: allow gendisk's request_queue registration to be deferred
  2018-01-11 17:29         ` Mike Snitzer
@ 2018-01-11 17:47             ` Bart Van Assche
  0 siblings, 0 replies; 22+ messages in thread
From: Bart Van Assche @ 2018-01-11 17:47 UTC (permalink / raw)
  To: snitzer; +Cc: hch, dm-devel, hare, linux-block, tom.leiming, axboe

T24gVGh1LCAyMDE4LTAxLTExIGF0IDEyOjI5IC0wNTAwLCBNaWtlIFNuaXR6ZXIgd3JvdGU6DQo+
IE9uIFRodSwgSmFuIDExIDIwMTggYXQgMTI6MThwbSAtMDUwMCwNCj4gQmFydCBWYW4gQXNzY2hl
IDxCYXJ0LlZhbkFzc2NoZUB3ZGMuY29tPiB3cm90ZToNCj4gDQo+ID4gT24gVGh1LCAyMDE4LTAx
LTExIGF0IDEyOjA0IC0wNTAwLCBNaWtlIFNuaXR6ZXIgd3JvdGU6DQo+ID4gPiBTbyBteSB2NCwg
dGhhdCBJJ2xsIHNlbmQgb3V0IHNob3J0bHksIHdvbid0IGJlIHVzaW5nIHRlc3RfYW5kX2NsZWFy
X2JpdCgpDQo+ID4gDQo+ID4gUGxlYXNlIHVzZSBxdWV1ZV9mbGFnX3NldCgpLCBxdWV1ZV9mbGFn
X2NsZWFyKCksIHF1ZXVlX2ZsYWdfdGVzdF9hbmRfY2xlYXIoKSBhbmQvb3INCj4gPiBxdWV1ZV9m
bGFnX3Rlc3RfYW5kX3NldCgpIHRvIG1hbmlwdWxhdGUgcXVldWUgZmxhZ3MuDQo+IA0KPiBDYW4g
eW91IHBsZWFzZSBleHBhbmQgb24gdGhpcz8gIE15IHBhdGNoIGlzIG9ubHkgdXNpbmcgdGVzdF9i
aXQoKS4NCg0KSGVsbG8gTWlrZSwNCg0KSSB3YXMgcmVmZXJyaW5nIHRvIHRoZSBmb2xsb3dpbmcg
Y29kZSwgd2hpY2ggYXBwYXJlbmx5IGlzIGV4aXN0aW5nIGNvZGU6DQoNCiAgICAgICAgbXV0ZXhf
bG9jaygmcS0+c3lzZnNfbG9jayk7DQogICAgICAgIHF1ZXVlX2ZsYWdfY2xlYXJfdW5sb2NrZWQo
UVVFVUVfRkxBR19SRUdJU1RFUkVELCBxKTsNCiAgICAgICAgbXV0ZXhfdW5sb2NrKCZxLT5zeXNm
c19sb2NrKTsNCg0KVGhlIGFib3ZlIGNvZGUgaXMgd3JvbmcuIE90aGVyIGNvZGUgdGhhdCBjaGFu
Z2VzIHRoZSBxdWV1ZSBmbGFncyBwcm90ZWN0cw0KdGhlc2UgY2hhbmdlcyB3aXRoIHRoZSB0aGUg
cXVldWUgbG9jay4gVGhlIGFib3ZlIGNvZGUgc2hvdWxkIGJlIGNoYW5nZWQgaW50bw0KdGhlIGZv
bGxvd2luZzoNCg0KCXNwaW5fbG9ja19pcnEocS0+cXVldWVfbG9jayk7DQoJcXVldWVfZmxhZ19j
bGVhcihRVUVVRV9GTEFHX1JFR0lTVEVSRUQsIHEpOw0KCXNwaW5fdW5sb2NrX2lycShxLT5xdWV1
ZV9sb2NrKTsNCg0KVGhlIG9ubHkgZnVuY3Rpb25zIGZyb20gd2hpY2ggaXQgaXMgc2FmZSB0byBj
YWxsIHF1ZXVlX2ZsYWdfKHNldHxjbGVhcilfdW5sb2NrZWQoKQ0Kd2l0aG91dCBob2xkaW5nIHRo
ZSBxdWV1ZSBsb2NrIGFyZSBibGtfYWxsb2NfcXVldWVfbm9kZSgpIGFuZA0KX19ibGtfcmVsZWFz
ZV9xdWV1ZSgpIGJlY2F1c2UgZm9yIHRoZXNlIGZ1bmN0aW9ucyBpdCBpcyBndWFyYW50ZWVkIHRo
YXQgbm8gcXVldWUNCmZsYWcgY2hhbmdlcyBjYW4gaGFwcGVuIGZyb20gYW5vdGhlciBjb250ZXh0
LCBlLmcuIHRocm91Z2ggYSBibGtfc2V0X3F1ZXVlX2R5aW5nKCkNCmNhbGwuDQoNCkJhcnQu

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [for-4.16 PATCH v3 2/3] block: allow gendisk's request_queue registration to be deferred
@ 2018-01-11 17:47             ` Bart Van Assche
  0 siblings, 0 replies; 22+ messages in thread
From: Bart Van Assche @ 2018-01-11 17:47 UTC (permalink / raw)
  To: snitzer; +Cc: hch, dm-devel, hare, linux-block, tom.leiming, axboe

On Thu, 2018-01-11 at 12:29 -0500, Mike Snitzer wrote:
> On Thu, Jan 11 2018 at 12:18pm -0500,
> Bart Van Assche <Bart.VanAssche@wdc.com> wrote:
> 
> > On Thu, 2018-01-11 at 12:04 -0500, Mike Snitzer wrote:
> > > So my v4, that I'll send out shortly, won't be using test_and_clear_bit()
> > 
> > Please use queue_flag_set(), queue_flag_clear(), queue_flag_test_and_clear() and/or
> > queue_flag_test_and_set() to manipulate queue flags.
> 
> Can you please expand on this?  My patch is only using test_bit().

Hello Mike,

I was referring to the following code, which apparenly is existing code:

        mutex_lock(&q->sysfs_lock);
        queue_flag_clear_unlocked(QUEUE_FLAG_REGISTERED, q);
        mutex_unlock(&q->sysfs_lock);

The above code is wrong. Other code that changes the queue flags protects
these changes with the the queue lock. The above code should be changed into
the following:

	spin_lock_irq(q->queue_lock);
	queue_flag_clear(QUEUE_FLAG_REGISTERED, q);
	spin_unlock_irq(q->queue_lock);

The only functions from which it is safe to call queue_flag_(set|clear)_unlocked()
without holding the queue lock are blk_alloc_queue_node() and
__blk_release_queue() because for these functions it is guaranteed that no queue
flag changes can happen from another context, e.g. through a blk_set_queue_dying()
call.

Bart.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [for-4.16 PATCH v3 2/3] block: allow gendisk's request_queue registration to be deferred
  2018-01-11 17:47             ` Bart Van Assche
  (?)
@ 2018-01-11 19:20             ` Mike Snitzer
  2018-01-11 19:32                 ` Bart Van Assche
  -1 siblings, 1 reply; 22+ messages in thread
From: Mike Snitzer @ 2018-01-11 19:20 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: hch, dm-devel, hare, linux-block, tom.leiming, axboe

On Thu, Jan 11 2018 at 12:47pm -0500,
Bart Van Assche <Bart.VanAssche@wdc.com> wrote:

> On Thu, 2018-01-11 at 12:29 -0500, Mike Snitzer wrote:
> > On Thu, Jan 11 2018 at 12:18pm -0500,
> > Bart Van Assche <Bart.VanAssche@wdc.com> wrote:
> > 
> > > On Thu, 2018-01-11 at 12:04 -0500, Mike Snitzer wrote:
> > > > So my v4, that I'll send out shortly, won't be using test_and_clear_bit()
> > > 
> > > Please use queue_flag_set(), queue_flag_clear(), queue_flag_test_and_clear() and/or
> > > queue_flag_test_and_set() to manipulate queue flags.
> > 
> > Can you please expand on this?  My patch is only using test_bit().
> 
> Hello Mike,
> 
> I was referring to the following code, which apparenly is existing code:
> 
>         mutex_lock(&q->sysfs_lock);
>         queue_flag_clear_unlocked(QUEUE_FLAG_REGISTERED, q);
>         mutex_unlock(&q->sysfs_lock);
> 
> The above code is wrong. Other code that changes the queue flags protects
> these changes with the the queue lock. The above code should be changed into
> the following:
> 
> 	spin_lock_irq(q->queue_lock);
> 	queue_flag_clear(QUEUE_FLAG_REGISTERED, q);
> 	spin_unlock_irq(q->queue_lock);
> 
> The only functions from which it is safe to call queue_flag_(set|clear)_unlocked()
> without holding the queue lock are blk_alloc_queue_node() and
> __blk_release_queue() because for these functions it is guaranteed that no queue
> flag changes can happen from another context, e.g. through a blk_set_queue_dying()
> call.

Yes, I noticed the use of sysfs_lock also.  I've fixed it up earlier in
my v4 patchset and build on it.  I'll add your Reported-by too.

Mike

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [for-4.16 PATCH v3 2/3] block: allow gendisk's request_queue registration to be deferred
  2018-01-11 19:20             ` Mike Snitzer
@ 2018-01-11 19:32                 ` Bart Van Assche
  0 siblings, 0 replies; 22+ messages in thread
From: Bart Van Assche @ 2018-01-11 19:32 UTC (permalink / raw)
  To: snitzer; +Cc: hch, dm-devel, hare, linux-block, tom.leiming, axboe

T24gVGh1LCAyMDE4LTAxLTExIGF0IDE0OjIwIC0wNTAwLCBNaWtlIFNuaXR6ZXIgd3JvdGU6DQo+
IFllcywgSSBub3RpY2VkIHRoZSB1c2Ugb2Ygc3lzZnNfbG9jayBhbHNvLiAgSSd2ZSBmaXhlZCBp
dCB1cCBlYXJsaWVyIGluDQo+IG15IHY0IHBhdGNoc2V0IGFuZCBidWlsZCBvbiBpdC4gIEknbGwg
YWRkIHlvdXIgUmVwb3J0ZWQtYnkgdG9vLg0KDQpIZWxsbyBNaWtlLA0KDQpUaGVyZSBhcmUgbWFu
eSBtb3JlIGluY29uc2lzdGVuY2llcyB3aXRoIHJlZ2FyZCB0byBxdWV1ZSBmbGFnIG1hbmlwdWxh
dGlvbg0KaW4gdGhlIGJsb2NrIGxheWVyIGNvcmUgYW5kIGluIGJsb2NrIGRyaXZlcnMuIEknbSB3
b3JraW5nIG9uIGEgc2VyaWVzIHRvDQphZGRyZXNzIGFsbCBpbmNvbnNpc3RlbmNpZXMgaW4gdGhl
IGJsb2NrIGxheWVyIGFuZCBpbiB0aGUgc2QgZHJpdmVyLg0KDQpCYXJ0Lg==

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [for-4.16 PATCH v3 2/3] block: allow gendisk's request_queue registration to be deferred
@ 2018-01-11 19:32                 ` Bart Van Assche
  0 siblings, 0 replies; 22+ messages in thread
From: Bart Van Assche @ 2018-01-11 19:32 UTC (permalink / raw)
  To: snitzer; +Cc: hch, dm-devel, hare, linux-block, tom.leiming, axboe

On Thu, 2018-01-11 at 14:20 -0500, Mike Snitzer wrote:
> Yes, I noticed the use of sysfs_lock also.  I've fixed it up earlier in
> my v4 patchset and build on it.  I'll add your Reported-by too.

Hello Mike,

There are many more inconsistencies with regard to queue flag manipulation
in the block layer core and in block drivers. I'm working on a series to
address all inconsistencies in the block layer and in the sd driver.

Bart.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [for-4.16 PATCH v3 2/3] block: allow gendisk's request_queue registration to be deferred
  2018-01-11 19:32                 ` Bart Van Assche
  (?)
@ 2018-01-11 19:50                 ` Mike Snitzer
  -1 siblings, 0 replies; 22+ messages in thread
From: Mike Snitzer @ 2018-01-11 19:50 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: hch, dm-devel, hare, linux-block, tom.leiming, axboe

On Thu, Jan 11 2018 at  2:32pm -0500,
Bart Van Assche <Bart.VanAssche@wdc.com> wrote:

> On Thu, 2018-01-11 at 14:20 -0500, Mike Snitzer wrote:
> > Yes, I noticed the use of sysfs_lock also.  I've fixed it up earlier in
> > my v4 patchset and build on it.  I'll add your Reported-by too.
> 
> Hello Mike,
> 
> There are many more inconsistencies with regard to queue flag manipulation
> in the block layer core and in block drivers. I'm working on a series to
> address all inconsistencies in the block layer and in the sd driver.

OK, well I needed to fix this anyway to build my changes up properly.
I think my v4 changes are ready for Jens to pick up, so you'll just have
to rebase.

I'll send out v4 shortly.

Thanks,
Mike

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2018-01-11 19:50 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-11  2:12 [for-4.16 PATCH v3 0/3] block/dm: allow DM to defer blk_register_queue() until ready Mike Snitzer
2018-01-11  2:12 ` [for-4.16 PATCH v3 1/3] block: only bdi_unregister() in del_gendisk() if !GENHD_FL_HIDDEN Mike Snitzer
2018-01-11  2:48   ` Ming Lei
2018-01-11  7:40   ` Hannes Reinecke
2018-01-11  2:12 ` [for-4.16 PATCH v3 2/3] block: allow gendisk's request_queue registration to be deferred Mike Snitzer
2018-01-11  2:54   ` Ming Lei
2018-01-11  7:46   ` Hannes Reinecke
2018-01-11 17:04     ` Mike Snitzer
2018-01-11 17:18       ` Bart Van Assche
2018-01-11 17:18         ` Bart Van Assche
2018-01-11 17:29         ` Mike Snitzer
2018-01-11 17:47           ` Bart Van Assche
2018-01-11 17:47             ` Bart Van Assche
2018-01-11 19:20             ` Mike Snitzer
2018-01-11 19:32               ` Bart Van Assche
2018-01-11 19:32                 ` Bart Van Assche
2018-01-11 19:50                 ` Mike Snitzer
2018-01-11  7:56   ` Hannes Reinecke
2018-01-11 16:03     ` Mike Snitzer
2018-01-11  2:12 ` [for-4.16 PATCH v3 3/3] dm: fix awkward and incomplete request_queue initialization Mike Snitzer
2018-01-11  2:56   ` Ming Lei
2018-01-11  7:57   ` Hannes Reinecke

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.