All of lore.kernel.org
 help / color / mirror / Atom feed
* move more work to disk_release v3
@ 2022-03-04 16:03 Christoph Hellwig
  2022-03-04 16:03 ` [PATCH 01/14] blk-mq: do not include passthrough requests in I/O accounting Christoph Hellwig
                   ` (13 more replies)
  0 siblings, 14 replies; 51+ messages in thread
From: Christoph Hellwig @ 2022-03-04 16:03 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Martin K. Petersen, Ming Lei, Bart Van Assche, linux-block, linux-scsi

Hi all,

this series resurrects and forward ports ports larger parts of the
"block: don't drain file system I/O on del_gendisk" series from Ming,
but does not remove the draining in del_gendisk, but instead the one
in the sd driver, which always was a bit ad-hoc.  As part of that sd
and sr are switched to use the new ->free_disk method to avoid having
to clear disk->private_data and the way to lookup the SCSI ULP is
cleaned up as well.

Git branch:

    git://git.infradead.org/users/hch/block.git freeze-5.18

Gitweb:

    http://git.infradead.org/users/hch/block.git/shortlog/refs/heads/freeze-5.18

Changes since v2:
 - handle the weird dm-multipath flush sequence corner case when
   assinging rq->part

Changes since v1:
 - fix a refcounting bug in sd
 - rename a function

Diffstat:
 block/blk-core.c           |    7 --
 block/blk-mq.c             |   10 +--
 block/blk-sysfs.c          |   25 --------
 block/blk.h                |    2 
 block/elevator.c           |    7 +-
 block/genhd.c              |   38 ++++++++++++-
 drivers/scsi/sd.c          |  114 +++++++++------------------------------
 drivers/scsi/sd.h          |   13 +++-
 drivers/scsi/sr.c          |  129 +++++++++------------------------------------
 drivers/scsi/sr.h          |    5 -
 drivers/scsi/st.c          |    1 
 drivers/scsi/st.h          |    1 
 include/scsi/scsi_cmnd.h   |    9 ---
 include/scsi/scsi_driver.h |    9 ++-
 14 files changed, 117 insertions(+), 253 deletions(-)

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH 01/14] blk-mq: do not include passthrough requests in I/O accounting
  2022-03-04 16:03 move more work to disk_release v3 Christoph Hellwig
@ 2022-03-04 16:03 ` Christoph Hellwig
  2022-03-06  1:02   ` Bart Van Assche
                     ` (2 more replies)
  2022-03-04 16:03 ` [PATCH 02/14] blk-mq: handle already freed tags gracefully in blk_mq_free_rqs Christoph Hellwig
                   ` (12 subsequent siblings)
  13 siblings, 3 replies; 51+ messages in thread
From: Christoph Hellwig @ 2022-03-04 16:03 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Martin K. Petersen, Ming Lei, Bart Van Assche, linux-block, linux-scsi

I/O accounting buckets I/O into the read/write/discard categories into
which passthrough I/O does not fit at all.  It also accounts to the
block_device, which may not even exist for passthrough I/O.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-mq.c | 11 ++++++++---
 block/blk.h    |  2 +-
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index a05ce77250316..ab4b646551334 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -883,10 +883,15 @@ static inline void blk_account_io_done(struct request *req, u64 now)
 
 static void __blk_account_io_start(struct request *rq)
 {
-	/* passthrough requests can hold bios that do not have ->bi_bdev set */
-	if (rq->bio && rq->bio->bi_bdev)
+	/*
+	 * All non-passthrough requests are created from a bio with one
+	 * exception: when a flush command that is part of a flush sequence
+	 * generated by the state machine in blk-flush.c is cloned onto the
+	 * lower device by dm-multipath we can get here without a bio.
+	 */
+	if (rq->bio)
 		rq->part = rq->bio->bi_bdev;
-	else if (rq->q->disk)
+	else
 		rq->part = rq->q->disk->part0;
 
 	part_stat_lock();
diff --git a/block/blk.h b/block/blk.h
index ebaa59ca46ca6..6f21859c7f0ff 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -325,7 +325,7 @@ int blk_dev_init(void);
  */
 static inline bool blk_do_io_stat(struct request *rq)
 {
-	return (rq->rq_flags & RQF_IO_STAT) && rq->q->disk;
+	return (rq->rq_flags & RQF_IO_STAT) && !blk_rq_is_passthrough(rq);
 }
 
 void update_io_ticks(struct block_device *part, unsigned long now, bool end);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 02/14] blk-mq: handle already freed tags gracefully in blk_mq_free_rqs
  2022-03-04 16:03 move more work to disk_release v3 Christoph Hellwig
  2022-03-04 16:03 ` [PATCH 01/14] blk-mq: do not include passthrough requests in I/O accounting Christoph Hellwig
@ 2022-03-04 16:03 ` Christoph Hellwig
  2022-03-06  1:05   ` Bart Van Assche
  2022-03-08  3:14   ` Martin K. Petersen
  2022-03-04 16:03 ` [PATCH 03/14] scsi: don't use disk->private_data to find the scsi_driver Christoph Hellwig
                   ` (11 subsequent siblings)
  13 siblings, 2 replies; 51+ messages in thread
From: Christoph Hellwig @ 2022-03-04 16:03 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Martin K. Petersen, Ming Lei, Bart Van Assche, linux-block, linux-scsi

From: Ming Lei <ming.lei@redhat.com>

To simplify further changes allow for double calling blk_mq_free_rqs on
a queue.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
[hch: split out from a larger patch]
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-mq.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index ab4b646551334..6fd0b0f652514 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -3070,6 +3070,9 @@ void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
 	struct blk_mq_tags *drv_tags;
 	struct page *page;
 
+	if (list_empty(&tags->page_list))
+		return;
+
 	if (blk_mq_is_shared_tags(set->flags))
 		drv_tags = set->shared_tags;
 	else
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 03/14] scsi: don't use disk->private_data to find the scsi_driver
  2022-03-04 16:03 move more work to disk_release v3 Christoph Hellwig
  2022-03-04 16:03 ` [PATCH 01/14] blk-mq: do not include passthrough requests in I/O accounting Christoph Hellwig
  2022-03-04 16:03 ` [PATCH 02/14] blk-mq: handle already freed tags gracefully in blk_mq_free_rqs Christoph Hellwig
@ 2022-03-04 16:03 ` Christoph Hellwig
  2022-03-06  1:27   ` Bart Van Assche
                     ` (3 more replies)
  2022-03-04 16:03 ` [PATCH 04/14] sd: rename the scsi_disk.dev field Christoph Hellwig
                   ` (10 subsequent siblings)
  13 siblings, 4 replies; 51+ messages in thread
From: Christoph Hellwig @ 2022-03-04 16:03 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Martin K. Petersen, Ming Lei, Bart Van Assche, linux-block, linux-scsi

Requiring every ULP to have the scsi_drive as first member of the
private data is rather fragile and not necessary anyway.  Just use
the driver hanging off the SCSI device instead.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/sd.c          | 3 +--
 drivers/scsi/sd.h          | 3 +--
 drivers/scsi/sr.c          | 5 ++---
 drivers/scsi/sr.h          | 1 -
 drivers/scsi/st.c          | 1 -
 drivers/scsi/st.h          | 1 -
 include/scsi/scsi_cmnd.h   | 9 ---------
 include/scsi/scsi_driver.h | 9 +++++++--
 8 files changed, 11 insertions(+), 21 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 2d648d27bfd71..2a1e19e871d30 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3515,7 +3515,6 @@ static int sd_probe(struct device *dev)
 	}
 
 	sdkp->device = sdp;
-	sdkp->driver = &sd_template;
 	sdkp->disk = gd;
 	sdkp->index = index;
 	sdkp->max_retries = SD_MAX_RETRIES;
@@ -3548,7 +3547,7 @@ static int sd_probe(struct device *dev)
 	gd->minors = SD_MINORS;
 
 	gd->fops = &sd_fops;
-	gd->private_data = &sdkp->driver;
+	gd->private_data = sdkp;
 
 	/* defaults, until the device tells us otherwise */
 	sdp->sector_size = 512;
diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
index 2e5932bde43d1..303aa1c23aefb 100644
--- a/drivers/scsi/sd.h
+++ b/drivers/scsi/sd.h
@@ -68,7 +68,6 @@ enum {
 };
 
 struct scsi_disk {
-	struct scsi_driver *driver;	/* always &sd_template */
 	struct scsi_device *device;
 	struct device	dev;
 	struct gendisk	*disk;
@@ -131,7 +130,7 @@ struct scsi_disk {
 
 static inline struct scsi_disk *scsi_disk(struct gendisk *disk)
 {
-	return container_of(disk->private_data, struct scsi_disk, driver);
+	return disk->private_data;
 }
 
 #define sd_printk(prefix, sdsk, fmt, a...)				\
diff --git a/drivers/scsi/sr.c b/drivers/scsi/sr.c
index f925b1f1f9ada..569bda76a5175 100644
--- a/drivers/scsi/sr.c
+++ b/drivers/scsi/sr.c
@@ -147,7 +147,7 @@ static void sr_kref_release(struct kref *kref);
 
 static inline struct scsi_cd *scsi_cd(struct gendisk *disk)
 {
-	return container_of(disk->private_data, struct scsi_cd, driver);
+	return disk->private_data;
 }
 
 static int sr_runtime_suspend(struct device *dev)
@@ -692,7 +692,6 @@ static int sr_probe(struct device *dev)
 
 	cd->device = sdev;
 	cd->disk = disk;
-	cd->driver = &sr_template;
 	cd->capacity = 0x1fffff;
 	cd->device->changed = 1;	/* force recheck CD type */
 	cd->media_present = 1;
@@ -713,7 +712,7 @@ static int sr_probe(struct device *dev)
 	sr_vendor_init(cd);
 
 	set_capacity(disk, cd->capacity);
-	disk->private_data = &cd->driver;
+	disk->private_data = cd;
 
 	if (register_cdrom(disk, &cd->cdi))
 		goto fail_minor;
diff --git a/drivers/scsi/sr.h b/drivers/scsi/sr.h
index 1609f02ed29ac..d80af3fcb6f97 100644
--- a/drivers/scsi/sr.h
+++ b/drivers/scsi/sr.h
@@ -32,7 +32,6 @@ struct scsi_device;
 
 
 typedef struct scsi_cd {
-	struct scsi_driver *driver;
 	unsigned capacity;	/* size in blocks                       */
 	struct scsi_device *device;
 	unsigned int vendor;	/* vendor code, see sr_vendor.c         */
diff --git a/drivers/scsi/st.c b/drivers/scsi/st.c
index e869e90e05afe..ebe9412c86f43 100644
--- a/drivers/scsi/st.c
+++ b/drivers/scsi/st.c
@@ -4276,7 +4276,6 @@ static int st_probe(struct device *dev)
 		goto out_buffer_free;
 	}
 	kref_init(&tpnt->kref);
-	tpnt->driver = &st_template;
 
 	tpnt->device = SDp;
 	if (SDp->scsi_level <= 2)
diff --git a/drivers/scsi/st.h b/drivers/scsi/st.h
index c0ef0d9aaf8a2..7a68eaba7e810 100644
--- a/drivers/scsi/st.h
+++ b/drivers/scsi/st.h
@@ -117,7 +117,6 @@ struct scsi_tape_stats {
 
 /* The tape drive descriptor */
 struct scsi_tape {
-	struct scsi_driver *driver;
 	struct scsi_device *device;
 	struct mutex lock;	/* For serialization */
 	struct completion wait;	/* For SCSI commands */
diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h
index 6794d7322cbde..e3a4c67794b14 100644
--- a/include/scsi/scsi_cmnd.h
+++ b/include/scsi/scsi_cmnd.h
@@ -13,7 +13,6 @@
 #include <scsi/scsi_request.h>
 
 struct Scsi_Host;
-struct scsi_driver;
 
 /*
  * MAX_COMMAND_SIZE is:
@@ -159,14 +158,6 @@ static inline void *scsi_cmd_priv(struct scsi_cmnd *cmd)
 	return cmd + 1;
 }
 
-/* make sure not to use it with passthrough commands */
-static inline struct scsi_driver *scsi_cmd_to_driver(struct scsi_cmnd *cmd)
-{
-	struct request *rq = scsi_cmd_to_rq(cmd);
-
-	return *(struct scsi_driver **)rq->q->disk->private_data;
-}
-
 void scsi_done(struct scsi_cmnd *cmd);
 
 extern void scsi_finish_command(struct scsi_cmnd *cmd);
diff --git a/include/scsi/scsi_driver.h b/include/scsi/scsi_driver.h
index 6dffa8555a390..4ce1988b2ba01 100644
--- a/include/scsi/scsi_driver.h
+++ b/include/scsi/scsi_driver.h
@@ -4,11 +4,10 @@
 
 #include <linux/blk_types.h>
 #include <linux/device.h>
+#include <scsi/scsi_cmnd.h>
 
 struct module;
 struct request;
-struct scsi_cmnd;
-struct scsi_device;
 
 struct scsi_driver {
 	struct device_driver	gendrv;
@@ -31,4 +30,10 @@ extern int scsi_register_interface(struct class_interface *);
 #define scsi_unregister_interface(intf) \
 	class_interface_unregister(intf)
 
+/* make sure not to use it with passthrough commands */
+static inline struct scsi_driver *scsi_cmd_to_driver(struct scsi_cmnd *cmd)
+{
+	return to_scsi_driver(cmd->device->sdev_gendev.driver);
+}
+
 #endif /* _SCSI_SCSI_DRIVER_H */
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 04/14] sd: rename the scsi_disk.dev field
  2022-03-04 16:03 move more work to disk_release v3 Christoph Hellwig
                   ` (2 preceding siblings ...)
  2022-03-04 16:03 ` [PATCH 03/14] scsi: don't use disk->private_data to find the scsi_driver Christoph Hellwig
@ 2022-03-04 16:03 ` Christoph Hellwig
  2022-03-06  1:38   ` Bart Van Assche
                     ` (3 more replies)
  2022-03-04 16:03 ` [PATCH 05/14] sd: call sd_zbc_release_disk before releasing the scsi_device reference Christoph Hellwig
                   ` (9 subsequent siblings)
  13 siblings, 4 replies; 51+ messages in thread
From: Christoph Hellwig @ 2022-03-04 16:03 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Martin K. Petersen, Ming Lei, Bart Van Assche, linux-block, linux-scsi

dev is very hard to grab for.  Give the field a more descriptive name and
documents it's purpose.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/sd.c | 22 +++++++++++-----------
 drivers/scsi/sd.h | 10 ++++++++--
 2 files changed, 19 insertions(+), 13 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 2a1e19e871d30..7479e7cb36b43 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -672,7 +672,7 @@ static struct scsi_disk *scsi_disk_get(struct gendisk *disk)
 	if (disk->private_data) {
 		sdkp = scsi_disk(disk);
 		if (scsi_device_get(sdkp->device) == 0)
-			get_device(&sdkp->dev);
+			get_device(&sdkp->disk_dev);
 		else
 			sdkp = NULL;
 	}
@@ -685,7 +685,7 @@ static void scsi_disk_put(struct scsi_disk *sdkp)
 	struct scsi_device *sdev = sdkp->device;
 
 	mutex_lock(&sd_ref_mutex);
-	put_device(&sdkp->dev);
+	put_device(&sdkp->disk_dev);
 	scsi_device_put(sdev);
 	mutex_unlock(&sd_ref_mutex);
 }
@@ -3529,14 +3529,14 @@ static int sd_probe(struct device *dev)
 					     SD_MOD_TIMEOUT);
 	}
 
-	device_initialize(&sdkp->dev);
-	sdkp->dev.parent = get_device(dev);
-	sdkp->dev.class = &sd_disk_class;
-	dev_set_name(&sdkp->dev, "%s", dev_name(dev));
+	device_initialize(&sdkp->disk_dev);
+	sdkp->disk_dev.parent = get_device(dev);
+	sdkp->disk_dev.class = &sd_disk_class;
+	dev_set_name(&sdkp->disk_dev, "%s", dev_name(dev));
 
-	error = device_add(&sdkp->dev);
+	error = device_add(&sdkp->disk_dev);
 	if (error) {
-		put_device(&sdkp->dev);
+		put_device(&sdkp->disk_dev);
 		goto out;
 	}
 
@@ -3577,7 +3577,7 @@ static int sd_probe(struct device *dev)
 
 	error = device_add_disk(dev, gd, NULL);
 	if (error) {
-		put_device(&sdkp->dev);
+		put_device(&sdkp->disk_dev);
 		goto out;
 	}
 
@@ -3628,7 +3628,7 @@ static int sd_remove(struct device *dev)
 	sdkp = dev_get_drvdata(dev);
 	scsi_autopm_get_device(sdkp->device);
 
-	device_del(&sdkp->dev);
+	device_del(&sdkp->disk_dev);
 	del_gendisk(sdkp->disk);
 	sd_shutdown(dev);
 
@@ -3636,7 +3636,7 @@ static int sd_remove(struct device *dev)
 
 	mutex_lock(&sd_ref_mutex);
 	dev_set_drvdata(dev, NULL);
-	put_device(&sdkp->dev);
+	put_device(&sdkp->disk_dev);
 	mutex_unlock(&sd_ref_mutex);
 
 	return 0;
diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
index 303aa1c23aefb..7625a90b0fa69 100644
--- a/drivers/scsi/sd.h
+++ b/drivers/scsi/sd.h
@@ -69,7 +69,13 @@ enum {
 
 struct scsi_disk {
 	struct scsi_device *device;
-	struct device	dev;
+
+	/*
+	 * This device is mostly just used to show a bunch of attributes in a
+	 * weird place.  In doubt don't add any new users, and most importantly
+	 * don't use if for any actual refcounting.
+	 */
+	struct device	disk_dev;
 	struct gendisk	*disk;
 	struct opal_dev *opal_dev;
 #ifdef CONFIG_BLK_DEV_ZONED
@@ -126,7 +132,7 @@ struct scsi_disk {
 	unsigned	security : 1;
 	unsigned	ignore_medium_access_errors : 1;
 };
-#define to_scsi_disk(obj) container_of(obj,struct scsi_disk,dev)
+#define to_scsi_disk(obj) container_of(obj, struct scsi_disk, disk_dev)
 
 static inline struct scsi_disk *scsi_disk(struct gendisk *disk)
 {
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 05/14] sd: call sd_zbc_release_disk before releasing the scsi_device reference
  2022-03-04 16:03 move more work to disk_release v3 Christoph Hellwig
                   ` (3 preceding siblings ...)
  2022-03-04 16:03 ` [PATCH 04/14] sd: rename the scsi_disk.dev field Christoph Hellwig
@ 2022-03-04 16:03 ` Christoph Hellwig
  2022-03-06  1:44   ` Bart Van Assche
                     ` (2 more replies)
  2022-03-04 16:03 ` [PATCH 06/14] sd: delay calling free_opal_dev Christoph Hellwig
                   ` (8 subsequent siblings)
  13 siblings, 3 replies; 51+ messages in thread
From: Christoph Hellwig @ 2022-03-04 16:03 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Martin K. Petersen, Ming Lei, Bart Van Assche, linux-block, linux-scsi

sd_zbc_release_disk accesses disk->device, so ensure that actually still has
a valid reference.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/sd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 7479e7cb36b43..7bfebf5b2832d 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3672,9 +3672,9 @@ static void scsi_disk_release(struct device *dev)
 
 	disk->private_data = NULL;
 	put_disk(disk);
-	put_device(&sdkp->device->sdev_gendev);
 
 	sd_zbc_release_disk(sdkp);
+	put_device(&sdkp->device->sdev_gendev);
 
 	kfree(sdkp);
 }
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 06/14] sd: delay calling free_opal_dev
  2022-03-04 16:03 move more work to disk_release v3 Christoph Hellwig
                   ` (4 preceding siblings ...)
  2022-03-04 16:03 ` [PATCH 05/14] sd: call sd_zbc_release_disk before releasing the scsi_device reference Christoph Hellwig
@ 2022-03-04 16:03 ` Christoph Hellwig
  2022-03-06  1:45   ` Bart Van Assche
                     ` (3 more replies)
  2022-03-04 16:03 ` [PATCH 07/14] sd: make use of ->free_disk to simplify refcounting Christoph Hellwig
                   ` (7 subsequent siblings)
  13 siblings, 4 replies; 51+ messages in thread
From: Christoph Hellwig @ 2022-03-04 16:03 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Martin K. Petersen, Ming Lei, Bart Van Assche, linux-block, linux-scsi

Call free_opal_dev from scsi_disk_release as the opal_dev field is access
from the ioctl handler, which isn't synchronized vs sd_release and thus
can be accesses during or after sd_release was called.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/sd.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 7bfebf5b2832d..346b8d62de7d1 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3632,8 +3632,6 @@ static int sd_remove(struct device *dev)
 	del_gendisk(sdkp->disk);
 	sd_shutdown(dev);
 
-	free_opal_dev(sdkp->opal_dev);
-
 	mutex_lock(&sd_ref_mutex);
 	dev_set_drvdata(dev, NULL);
 	put_device(&sdkp->disk_dev);
@@ -3675,6 +3673,7 @@ static void scsi_disk_release(struct device *dev)
 
 	sd_zbc_release_disk(sdkp);
 	put_device(&sdkp->device->sdev_gendev);
+	free_opal_dev(sdkp->opal_dev);
 
 	kfree(sdkp);
 }
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 07/14] sd: make use of ->free_disk to simplify refcounting
  2022-03-04 16:03 move more work to disk_release v3 Christoph Hellwig
                   ` (5 preceding siblings ...)
  2022-03-04 16:03 ` [PATCH 06/14] sd: delay calling free_opal_dev Christoph Hellwig
@ 2022-03-04 16:03 ` Christoph Hellwig
  2022-03-06  2:03   ` Bart Van Assche
                     ` (2 more replies)
  2022-03-04 16:03 ` [PATCH 08/14] sr: implement ->free_disk Christoph Hellwig
                   ` (6 subsequent siblings)
  13 siblings, 3 replies; 51+ messages in thread
From: Christoph Hellwig @ 2022-03-04 16:03 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Martin K. Petersen, Ming Lei, Bart Van Assche, linux-block, linux-scsi

Implement the ->free_disk method to to put struct scsi_disk when the last
gendisk reference count goes away.  This removes the need to clear
->private_data and thus freeze the queue on unbind.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/sd.c | 90 +++++++++--------------------------------------
 1 file changed, 16 insertions(+), 74 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 346b8d62de7d1..498e6fdcf6cfe 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -121,11 +121,6 @@ static void scsi_disk_release(struct device *cdev);
 
 static DEFINE_IDA(sd_index_ida);
 
-/* This semaphore is used to mediate the 0->1 reference get in the
- * face of object destruction (i.e. we can't allow a get on an
- * object after last put) */
-static DEFINE_MUTEX(sd_ref_mutex);
-
 static struct kmem_cache *sd_cdb_cache;
 static mempool_t *sd_cdb_pool;
 static mempool_t *sd_page_pool;
@@ -663,33 +658,6 @@ static int sd_major(int major_idx)
 	}
 }
 
-static struct scsi_disk *scsi_disk_get(struct gendisk *disk)
-{
-	struct scsi_disk *sdkp = NULL;
-
-	mutex_lock(&sd_ref_mutex);
-
-	if (disk->private_data) {
-		sdkp = scsi_disk(disk);
-		if (scsi_device_get(sdkp->device) == 0)
-			get_device(&sdkp->disk_dev);
-		else
-			sdkp = NULL;
-	}
-	mutex_unlock(&sd_ref_mutex);
-	return sdkp;
-}
-
-static void scsi_disk_put(struct scsi_disk *sdkp)
-{
-	struct scsi_device *sdev = sdkp->device;
-
-	mutex_lock(&sd_ref_mutex);
-	put_device(&sdkp->disk_dev);
-	scsi_device_put(sdev);
-	mutex_unlock(&sd_ref_mutex);
-}
-
 #ifdef CONFIG_BLK_SED_OPAL
 static int sd_sec_submit(void *data, u16 spsp, u8 secp, void *buffer,
 		size_t len, bool send)
@@ -1418,17 +1386,15 @@ static bool sd_need_revalidate(struct block_device *bdev,
  **/
 static int sd_open(struct block_device *bdev, fmode_t mode)
 {
-	struct scsi_disk *sdkp = scsi_disk_get(bdev->bd_disk);
-	struct scsi_device *sdev;
+	struct scsi_disk *sdkp = scsi_disk(bdev->bd_disk);
+	struct scsi_device *sdev = sdkp->device;
 	int retval;
 
-	if (!sdkp)
+	if (scsi_device_get(sdev))
 		return -ENXIO;
 
 	SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp, "sd_open\n"));
 
-	sdev = sdkp->device;
-
 	/*
 	 * If the device is in error recovery, wait until it is done.
 	 * If the device is offline, then disallow any access to it.
@@ -1473,7 +1439,7 @@ static int sd_open(struct block_device *bdev, fmode_t mode)
 	return 0;
 
 error_out:
-	scsi_disk_put(sdkp);
+	scsi_device_put(sdkp->device);
 	return retval;	
 }
 
@@ -1502,7 +1468,7 @@ static void sd_release(struct gendisk *disk, fmode_t mode)
 			scsi_set_medium_removal(sdev, SCSI_REMOVAL_ALLOW);
 	}
 
-	scsi_disk_put(sdkp);
+	scsi_device_put(sdkp->device);
 }
 
 static int sd_getgeo(struct block_device *bdev, struct hd_geometry *geo)
@@ -1616,7 +1582,7 @@ static int media_not_present(struct scsi_disk *sdkp,
  **/
 static unsigned int sd_check_events(struct gendisk *disk, unsigned int clearing)
 {
-	struct scsi_disk *sdkp = scsi_disk_get(disk);
+	struct scsi_disk *sdkp = disk->private_data;
 	struct scsi_device *sdp;
 	int retval;
 	bool disk_changed;
@@ -1679,7 +1645,6 @@ static unsigned int sd_check_events(struct gendisk *disk, unsigned int clearing)
 	 */
 	disk_changed = sdp->changed;
 	sdp->changed = 0;
-	scsi_disk_put(sdkp);
 	return disk_changed ? DISK_EVENT_MEDIA_CHANGE : 0;
 }
 
@@ -1887,6 +1852,13 @@ static const struct pr_ops sd_pr_ops = {
 	.pr_clear	= sd_pr_clear,
 };
 
+static void scsi_disk_free_disk(struct gendisk *disk)
+{
+	struct scsi_disk *sdkp = disk->private_data;
+
+	put_device(&sdkp->disk_dev);
+}
+
 static const struct block_device_operations sd_fops = {
 	.owner			= THIS_MODULE,
 	.open			= sd_open,
@@ -1898,6 +1870,7 @@ static const struct block_device_operations sd_fops = {
 	.unlock_native_capacity	= sd_unlock_native_capacity,
 	.report_zones		= sd_zbc_report_zones,
 	.get_unique_id		= sd_get_unique_id,
+	.free_disk		= scsi_disk_free_disk,
 	.pr_ops			= &sd_pr_ops,
 };
 
@@ -3623,54 +3596,23 @@ static int sd_probe(struct device *dev)
  **/
 static int sd_remove(struct device *dev)
 {
-	struct scsi_disk *sdkp;
+	struct scsi_disk *sdkp = dev_get_drvdata(dev);
 
-	sdkp = dev_get_drvdata(dev);
 	scsi_autopm_get_device(sdkp->device);
 
 	device_del(&sdkp->disk_dev);
 	del_gendisk(sdkp->disk);
 	sd_shutdown(dev);
 
-	mutex_lock(&sd_ref_mutex);
-	dev_set_drvdata(dev, NULL);
-	put_device(&sdkp->disk_dev);
-	mutex_unlock(&sd_ref_mutex);
-
+	put_disk(sdkp->disk);
 	return 0;
 }
 
-/**
- *	scsi_disk_release - Called to free the scsi_disk structure
- *	@dev: pointer to embedded class device
- *
- *	sd_ref_mutex must be held entering this routine.  Because it is
- *	called on last put, you should always use the scsi_disk_get()
- *	scsi_disk_put() helpers which manipulate the semaphore directly
- *	and never do a direct put_device.
- **/
 static void scsi_disk_release(struct device *dev)
 {
 	struct scsi_disk *sdkp = to_scsi_disk(dev);
-	struct gendisk *disk = sdkp->disk;
-	struct request_queue *q = disk->queue;
 
 	ida_free(&sd_index_ida, sdkp->index);
-
-	/*
-	 * Wait until all requests that are in progress have completed.
-	 * This is necessary to avoid that e.g. scsi_end_request() crashes
-	 * due to clearing the disk->private_data pointer. Wait from inside
-	 * scsi_disk_release() instead of from sd_release() to avoid that
-	 * freezing and unfreezing the request queue affects user space I/O
-	 * in case multiple processes open a /dev/sd... node concurrently.
-	 */
-	blk_mq_freeze_queue(q);
-	blk_mq_unfreeze_queue(q);
-
-	disk->private_data = NULL;
-	put_disk(disk);
-
 	sd_zbc_release_disk(sdkp);
 	put_device(&sdkp->device->sdev_gendev);
 	free_opal_dev(sdkp->opal_dev);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 08/14] sr: implement ->free_disk
  2022-03-04 16:03 move more work to disk_release v3 Christoph Hellwig
                   ` (6 preceding siblings ...)
  2022-03-04 16:03 ` [PATCH 07/14] sd: make use of ->free_disk to simplify refcounting Christoph Hellwig
@ 2022-03-04 16:03 ` Christoph Hellwig
  2022-03-06  4:01   ` Ming Lei
                     ` (2 more replies)
  2022-03-04 16:03 ` [PATCH 09/14] block: move blkcg initialization/destroy into disk allocation/release handler Christoph Hellwig
                   ` (5 subsequent siblings)
  13 siblings, 3 replies; 51+ messages in thread
From: Christoph Hellwig @ 2022-03-04 16:03 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Martin K. Petersen, Ming Lei, Bart Van Assche, linux-block, linux-scsi

Simplify the refcounting and remove the need to clear disk->private_data
by implementing the ->free_disk method.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/sr.c | 124 ++++++++++------------------------------------
 drivers/scsi/sr.h |   4 --
 2 files changed, 26 insertions(+), 102 deletions(-)

diff --git a/drivers/scsi/sr.c b/drivers/scsi/sr.c
index 569bda76a5175..11fbdc75bb711 100644
--- a/drivers/scsi/sr.c
+++ b/drivers/scsi/sr.c
@@ -109,11 +109,6 @@ static DEFINE_SPINLOCK(sr_index_lock);
 
 static struct lock_class_key sr_bio_compl_lkclass;
 
-/* This semaphore is used to mediate the 0->1 reference get in the
- * face of object destruction (i.e. we can't allow a get on an
- * object after last put) */
-static DEFINE_MUTEX(sr_ref_mutex);
-
 static int sr_open(struct cdrom_device_info *, int);
 static void sr_release(struct cdrom_device_info *);
 
@@ -143,8 +138,6 @@ static const struct cdrom_device_ops sr_dops = {
 	.capability		= SR_CAPABILITIES,
 };
 
-static void sr_kref_release(struct kref *kref);
-
 static inline struct scsi_cd *scsi_cd(struct gendisk *disk)
 {
 	return disk->private_data;
@@ -163,38 +156,6 @@ static int sr_runtime_suspend(struct device *dev)
 		return 0;
 }
 
-/*
- * The get and put routines for the struct scsi_cd.  Note this entity
- * has a scsi_device pointer and owns a reference to this.
- */
-static inline struct scsi_cd *scsi_cd_get(struct gendisk *disk)
-{
-	struct scsi_cd *cd = NULL;
-
-	mutex_lock(&sr_ref_mutex);
-	if (disk->private_data == NULL)
-		goto out;
-	cd = scsi_cd(disk);
-	kref_get(&cd->kref);
-	if (scsi_device_get(cd->device)) {
-		kref_put(&cd->kref, sr_kref_release);
-		cd = NULL;
-	}
- out:
-	mutex_unlock(&sr_ref_mutex);
-	return cd;
-}
-
-static void scsi_cd_put(struct scsi_cd *cd)
-{
-	struct scsi_device *sdev = cd->device;
-
-	mutex_lock(&sr_ref_mutex);
-	kref_put(&cd->kref, sr_kref_release);
-	scsi_device_put(sdev);
-	mutex_unlock(&sr_ref_mutex);
-}
-
 static unsigned int sr_get_events(struct scsi_device *sdev)
 {
 	u8 buf[8];
@@ -522,15 +483,13 @@ static void sr_revalidate_disk(struct scsi_cd *cd)
 
 static int sr_block_open(struct block_device *bdev, fmode_t mode)
 {
-	struct scsi_cd *cd;
-	struct scsi_device *sdev;
+	struct scsi_cd *cd = cd = scsi_cd(bdev->bd_disk);
+	struct scsi_device *sdev = cd->device;
 	int ret = -ENXIO;
 
-	cd = scsi_cd_get(bdev->bd_disk);
-	if (!cd)
-		goto out;
+	if (scsi_device_get(cd->device))
+		return -ENXIO;
 
-	sdev = cd->device;
 	scsi_autopm_get_device(sdev);
 	if (bdev_check_media_change(bdev))
 		sr_revalidate_disk(cd);
@@ -541,9 +500,7 @@ static int sr_block_open(struct block_device *bdev, fmode_t mode)
 
 	scsi_autopm_put_device(sdev);
 	if (ret)
-		scsi_cd_put(cd);
-
-out:
+		scsi_device_put(cd->device);
 	return ret;
 }
 
@@ -555,7 +512,7 @@ static void sr_block_release(struct gendisk *disk, fmode_t mode)
 	cdrom_release(&cd->cdi, mode);
 	mutex_unlock(&cd->lock);
 
-	scsi_cd_put(cd);
+	scsi_device_put(cd->device);
 }
 
 static int sr_block_ioctl(struct block_device *bdev, fmode_t mode, unsigned cmd,
@@ -595,18 +552,24 @@ static int sr_block_ioctl(struct block_device *bdev, fmode_t mode, unsigned cmd,
 static unsigned int sr_block_check_events(struct gendisk *disk,
 					  unsigned int clearing)
 {
-	unsigned int ret = 0;
-	struct scsi_cd *cd;
+	struct scsi_cd *cd = disk->private_data;
 
-	cd = scsi_cd_get(disk);
-	if (!cd)
+	if (atomic_read(&cd->device->disk_events_disable_depth))
 		return 0;
+	return cdrom_check_events(&cd->cdi, clearing);
+}
 
-	if (!atomic_read(&cd->device->disk_events_disable_depth))
-		ret = cdrom_check_events(&cd->cdi, clearing);
+static void sr_free_disk(struct gendisk *disk)
+{
+	struct scsi_cd *cd = disk->private_data;
 
-	scsi_cd_put(cd);
-	return ret;
+	spin_lock(&sr_index_lock);
+	clear_bit(MINOR(disk_devt(disk)), sr_index_bits);
+	spin_unlock(&sr_index_lock);
+
+	unregister_cdrom(&cd->cdi);
+	mutex_destroy(&cd->lock);
+	kfree(cd);
 }
 
 static const struct block_device_operations sr_bdops =
@@ -617,6 +580,7 @@ static const struct block_device_operations sr_bdops =
 	.ioctl		= sr_block_ioctl,
 	.compat_ioctl	= blkdev_compat_ptr_ioctl,
 	.check_events	= sr_block_check_events,
+	.free_disk	= sr_free_disk,
 };
 
 static int sr_open(struct cdrom_device_info *cdi, int purpose)
@@ -660,8 +624,6 @@ static int sr_probe(struct device *dev)
 	if (!cd)
 		goto fail;
 
-	kref_init(&cd->kref);
-
 	disk = __alloc_disk_node(sdev->request_queue, NUMA_NO_NODE,
 				 &sr_bio_compl_lkclass);
 	if (!disk)
@@ -727,10 +689,8 @@ static int sr_probe(struct device *dev)
 	sr_revalidate_disk(cd);
 
 	error = device_add_disk(&sdev->sdev_gendev, disk, NULL);
-	if (error) {
-		kref_put(&cd->kref, sr_kref_release);
-		goto fail;
-	}
+	if (error)
+		goto unregister_cdrom;
 
 	sdev_printk(KERN_DEBUG, sdev,
 		    "Attached scsi CD-ROM %s\n", cd->cdi.name);
@@ -738,6 +698,8 @@ static int sr_probe(struct device *dev)
 
 	return 0;
 
+unregister_cdrom:
+	unregister_cdrom(&cd->cdi);
 fail_minor:
 	spin_lock(&sr_index_lock);
 	clear_bit(minor, sr_index_bits);
@@ -1009,36 +971,6 @@ static int sr_read_cdda_bpc(struct cdrom_device_info *cdi, void __user *ubuf,
 	return ret;
 }
 
-
-/**
- *	sr_kref_release - Called to free the scsi_cd structure
- *	@kref: pointer to embedded kref
- *
- *	sr_ref_mutex must be held entering this routine.  Because it is
- *	called on last put, you should always use the scsi_cd_get()
- *	scsi_cd_put() helpers which manipulate the semaphore directly
- *	and never do a direct kref_put().
- **/
-static void sr_kref_release(struct kref *kref)
-{
-	struct scsi_cd *cd = container_of(kref, struct scsi_cd, kref);
-	struct gendisk *disk = cd->disk;
-
-	spin_lock(&sr_index_lock);
-	clear_bit(MINOR(disk_devt(disk)), sr_index_bits);
-	spin_unlock(&sr_index_lock);
-
-	unregister_cdrom(&cd->cdi);
-
-	disk->private_data = NULL;
-
-	put_disk(disk);
-
-	mutex_destroy(&cd->lock);
-
-	kfree(cd);
-}
-
 static int sr_remove(struct device *dev)
 {
 	struct scsi_cd *cd = dev_get_drvdata(dev);
@@ -1046,11 +978,7 @@ static int sr_remove(struct device *dev)
 	scsi_autopm_get_device(cd->device);
 
 	del_gendisk(cd->disk);
-	dev_set_drvdata(dev, NULL);
-
-	mutex_lock(&sr_ref_mutex);
-	kref_put(&cd->kref, sr_kref_release);
-	mutex_unlock(&sr_ref_mutex);
+	put_disk(cd->disk);
 
 	return 0;
 }
diff --git a/drivers/scsi/sr.h b/drivers/scsi/sr.h
index d80af3fcb6f97..1175f2e213b56 100644
--- a/drivers/scsi/sr.h
+++ b/drivers/scsi/sr.h
@@ -18,7 +18,6 @@
 #ifndef _SR_H
 #define _SR_H
 
-#include <linux/kref.h>
 #include <linux/mutex.h>
 
 #define MAX_RETRIES	3
@@ -51,9 +50,6 @@ typedef struct scsi_cd {
 
 	struct cdrom_device_info cdi;
 	struct mutex lock;
-	/* We hold gendisk and scsi_device references on probe and use
-	 * the refs on this kref to decide when to release them */
-	struct kref kref;
 	struct gendisk *disk;
 } Scsi_CD;
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 09/14] block: move blkcg initialization/destroy into disk allocation/release handler
  2022-03-04 16:03 move more work to disk_release v3 Christoph Hellwig
                   ` (7 preceding siblings ...)
  2022-03-04 16:03 ` [PATCH 08/14] sr: implement ->free_disk Christoph Hellwig
@ 2022-03-04 16:03 ` Christoph Hellwig
  2022-03-07  3:18   ` Chaitanya Kulkarni
  2022-03-04 16:03 ` [PATCH 10/14] block: don't remove hctx debugfs dir from blk_mq_exit_queue Christoph Hellwig
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 51+ messages in thread
From: Christoph Hellwig @ 2022-03-04 16:03 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Martin K. Petersen, Ming Lei, Bart Van Assche, linux-block, linux-scsi

From: Ming Lei <ming.lei@redhat.com>

blkcg works on FS bio level, so it is reasonable to make both blkcg and
gendisk sharing same lifetime. Meantime there won't be any FS IO when
releasing disk, so safe to move blkcg initialization/destroy into disk
allocation/release handler

Long term, we can move blkcg into gendisk completely.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-core.c  | 5 -----
 block/blk-sysfs.c | 7 -------
 block/genhd.c     | 8 ++++++++
 3 files changed, 8 insertions(+), 12 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 94bf37f8e61d2..b2f2c65774812 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -496,17 +496,12 @@ struct request_queue *blk_alloc_queue(int node_id, bool alloc_srcu)
 				PERCPU_REF_INIT_ATOMIC, GFP_KERNEL))
 		goto fail_stats;
 
-	if (blkcg_init_queue(q))
-		goto fail_ref;
-
 	blk_queue_dma_alignment(q, 511);
 	blk_set_default_limits(&q->limits);
 	q->nr_requests = BLKDEV_DEFAULT_RQ;
 
 	return q;
 
-fail_ref:
-	percpu_ref_exit(&q->q_usage_counter);
 fail_stats:
 	blk_free_queue_stats(q->stats);
 fail_split:
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 241ded62f458f..220085109d7f0 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -751,13 +751,6 @@ static void blk_exit_queue(struct request_queue *q)
 		ioc_clear_queue(q);
 		elevator_exit(q);
 	}
-
-	/*
-	 * Remove all references to @q from the block cgroup controller before
-	 * restoring @q->queue_lock to avoid that restoring this pointer causes
-	 * e.g. blkcg_print_blkgs() to crash.
-	 */
-	blkcg_exit_queue(q);
 }
 
 /**
diff --git a/block/genhd.c b/block/genhd.c
index 54f60ded2ee6f..073e93f2fc40b 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1120,9 +1120,12 @@ static void disk_release(struct device *dev)
 
 	blk_mq_cancel_work_sync(disk->queue);
 
+	blkcg_exit_queue(disk->queue);
+
 	disk_release_events(disk);
 	kfree(disk->random);
 	xa_destroy(&disk->part_tbl);
+
 	disk->queue->disk = NULL;
 	blk_put_queue(disk->queue);
 
@@ -1328,6 +1331,9 @@ struct gendisk *__alloc_disk_node(struct request_queue *q, int node_id,
 	if (xa_insert(&disk->part_tbl, 0, disk->part0, GFP_KERNEL))
 		goto out_destroy_part_tbl;
 
+	if (blkcg_init_queue(q))
+		goto out_erase_part0;
+
 	rand_initialize_disk(disk);
 	disk_to_dev(disk)->class = &block_class;
 	disk_to_dev(disk)->type = &disk_type;
@@ -1340,6 +1346,8 @@ struct gendisk *__alloc_disk_node(struct request_queue *q, int node_id,
 #endif
 	return disk;
 
+out_erase_part0:
+	xa_erase(&disk->part_tbl, 0);
 out_destroy_part_tbl:
 	xa_destroy(&disk->part_tbl);
 	disk->part0->bd_disk = NULL;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 10/14] block: don't remove hctx debugfs dir from blk_mq_exit_queue
  2022-03-04 16:03 move more work to disk_release v3 Christoph Hellwig
                   ` (8 preceding siblings ...)
  2022-03-04 16:03 ` [PATCH 09/14] block: move blkcg initialization/destroy into disk allocation/release handler Christoph Hellwig
@ 2022-03-04 16:03 ` Christoph Hellwig
  2022-03-04 16:03 ` [PATCH 11/14] block: move q_usage_counter release into blk_queue_release Christoph Hellwig
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 51+ messages in thread
From: Christoph Hellwig @ 2022-03-04 16:03 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Martin K. Petersen, Ming Lei, Bart Van Assche, linux-block, linux-scsi

From: Ming Lei <ming.lei@redhat.com>

The queue's top debugfs dir is removed from blk_release_queue(), so all
hctx's debugfs dirs are removed from there. Given blk_mq_exit_queue()
is only called from blk_cleanup_queue(), it isn't necessary to remove
hctx debugfs from blk_mq_exit_queue().

So remove it from blk_mq_exit_queue().

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-mq.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 6fd0b0f652514..b38e125a710c9 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -3434,7 +3434,6 @@ static void blk_mq_exit_hw_queues(struct request_queue *q,
 	queue_for_each_hw_ctx(q, hctx, i) {
 		if (i == nr_queue)
 			break;
-		blk_mq_debugfs_unregister_hctx(hctx);
 		blk_mq_exit_hctx(q, set, hctx, i);
 	}
 }
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 11/14] block: move q_usage_counter release into blk_queue_release
  2022-03-04 16:03 move more work to disk_release v3 Christoph Hellwig
                   ` (9 preceding siblings ...)
  2022-03-04 16:03 ` [PATCH 10/14] block: don't remove hctx debugfs dir from blk_mq_exit_queue Christoph Hellwig
@ 2022-03-04 16:03 ` Christoph Hellwig
  2022-03-04 16:03 ` [PATCH 12/14] block: move blk_exit_queue into disk_release Christoph Hellwig
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 51+ messages in thread
From: Christoph Hellwig @ 2022-03-04 16:03 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Martin K. Petersen, Ming Lei, Bart Van Assche, linux-block, linux-scsi

From: Ming Lei <ming.lei@redhat.com>

After blk_cleanup_queue() returns, disk may not be released yet, so
probably bio may still be submitted and ->q_usage_counter may be
touched, so far this way seems safe, but not good from API's viewpoint.

Move the release q_usage_counter into blk_queue_release().

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-core.c  | 2 --
 block/blk-sysfs.c | 2 ++
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index b2f2c65774812..a8c59913dd78d 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -342,8 +342,6 @@ void blk_cleanup_queue(struct request_queue *q)
 		blk_mq_sched_free_rqs(q);
 	mutex_unlock(&q->sysfs_lock);
 
-	percpu_ref_exit(&q->q_usage_counter);
-
 	/* @q is and will stay empty, shutdown and put */
 	blk_put_queue(q);
 }
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 220085109d7f0..af5a6d86073f1 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -780,6 +780,8 @@ static void blk_release_queue(struct kobject *kobj)
 
 	might_sleep();
 
+	percpu_ref_exit(&q->q_usage_counter);
+
 	if (q->poll_stat)
 		blk_stat_remove_callback(q, q->poll_cb);
 	blk_stat_free_callback(q->poll_cb);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 12/14] block: move blk_exit_queue into disk_release
  2022-03-04 16:03 move more work to disk_release v3 Christoph Hellwig
                   ` (10 preceding siblings ...)
  2022-03-04 16:03 ` [PATCH 11/14] block: move q_usage_counter release into blk_queue_release Christoph Hellwig
@ 2022-03-04 16:03 ` Christoph Hellwig
  2022-03-06 21:21   ` Bart Van Assche
  2022-03-04 16:03 ` [PATCH 13/14] block: do more work in elevator_exit Christoph Hellwig
  2022-03-04 16:03 ` [PATCH 14/14] block: move rq_qos_exit() into disk_release() Christoph Hellwig
  13 siblings, 1 reply; 51+ messages in thread
From: Christoph Hellwig @ 2022-03-04 16:03 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Martin K. Petersen, Ming Lei, Bart Van Assche, linux-block, linux-scsi

From: Ming Lei <ming.lei@redhat.com>

There can't be FS IO in disk_release(), so move blk_exit_queue() there.

We still need to freeze queue here since the request is freed after the
bio is completed and passthrough request rely on scheduler tags as well.

The disk can be released before or after queue is cleaned up, and we have
to free the scheduler request pool before blk_cleanup_queue returns,
while the static request pool has to be freed before exiting the
I/O scheduler.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
[hch: rebased]
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-sysfs.c | 16 ----------------
 block/genhd.c     | 32 +++++++++++++++++++++++++++++++-
 2 files changed, 31 insertions(+), 17 deletions(-)

diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index af5a6d86073f1..85c4ba006671d 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -739,20 +739,6 @@ static void blk_free_queue_rcu(struct rcu_head *rcu_head)
 	kmem_cache_free(blk_get_queue_kmem_cache(blk_queue_has_srcu(q)), q);
 }
 
-/* Unconfigure the I/O scheduler and dissociate from the cgroup controller. */
-static void blk_exit_queue(struct request_queue *q)
-{
-	/*
-	 * Since the I/O scheduler exit code may access cgroup information,
-	 * perform I/O scheduler exit before disassociating from the block
-	 * cgroup controller.
-	 */
-	if (q->elevator) {
-		ioc_clear_queue(q);
-		elevator_exit(q);
-	}
-}
-
 /**
  * blk_release_queue - releases all allocated resources of the request_queue
  * @kobj: pointer to a kobject, whose container is a request_queue
@@ -786,8 +772,6 @@ static void blk_release_queue(struct kobject *kobj)
 		blk_stat_remove_callback(q, q->poll_cb);
 	blk_stat_free_callback(q->poll_cb);
 
-	blk_exit_queue(q);
-
 	blk_free_queue_stats(q->stats);
 	kfree(q->poll_stat);
 
diff --git a/block/genhd.c b/block/genhd.c
index 073e93f2fc40b..73705a749ea92 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -29,6 +29,7 @@
 #include "blk.h"
 #include "blk-mq-sched.h"
 #include "blk-rq-qos.h"
+#include "blk-cgroup.h"
 
 static struct kobject *block_depr;
 
@@ -1097,6 +1098,34 @@ static const struct attribute_group *disk_attr_groups[] = {
 	NULL
 };
 
+static void disk_release_mq(struct request_queue *q)
+{
+	blk_mq_cancel_work_sync(q);
+
+	/*
+	 * There can't be any non non-passthrough bios in flight here, but
+	 * requests stay around longer, including passthrough ones so we
+	 * still need to freeze the queue here.
+	 */
+	blk_mq_freeze_queue(q);
+
+	/*
+	 * Since the I/O scheduler exit code may access cgroup information,
+	 * perform I/O scheduler exit before disassociating from the block
+	 * cgroup controller.
+	 */
+	if (q->elevator) {
+		ioc_clear_queue(q);
+
+		mutex_lock(&q->sysfs_lock);
+		blk_mq_sched_free_rqs(q);
+		elevator_exit(q);
+		mutex_unlock(&q->sysfs_lock);
+	}
+
+	__blk_mq_unfreeze_queue(q, true);
+}
+
 /**
  * disk_release - releases all allocated resources of the gendisk
  * @dev: the device representing this disk
@@ -1118,7 +1147,8 @@ static void disk_release(struct device *dev)
 	might_sleep();
 	WARN_ON_ONCE(disk_live(disk));
 
-	blk_mq_cancel_work_sync(disk->queue);
+	if (queue_is_mq(disk->queue))
+		disk_release_mq(disk->queue);
 
 	blkcg_exit_queue(disk->queue);
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 13/14] block: do more work in elevator_exit
  2022-03-04 16:03 move more work to disk_release v3 Christoph Hellwig
                   ` (11 preceding siblings ...)
  2022-03-04 16:03 ` [PATCH 12/14] block: move blk_exit_queue into disk_release Christoph Hellwig
@ 2022-03-04 16:03 ` Christoph Hellwig
  2022-03-04 16:03 ` [PATCH 14/14] block: move rq_qos_exit() into disk_release() Christoph Hellwig
  13 siblings, 0 replies; 51+ messages in thread
From: Christoph Hellwig @ 2022-03-04 16:03 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Martin K. Petersen, Ming Lei, Bart Van Assche, linux-block, linux-scsi

Move the calls to ioc_clear_queue and blk_mq_sched_free_rqs into
elevator_exit.  Except for one call where we know we can't have io_cq
structures yet these always go together, and that extra call in an
error path is harmless.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/elevator.c | 6 +++---
 block/genhd.c    | 3 ---
 2 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/block/elevator.c b/block/elevator.c
index a842e4b8ebc66..20a4e7c7c7745 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -192,6 +192,9 @@ void elevator_exit(struct request_queue *q)
 {
 	struct elevator_queue *e = q->elevator;
 
+	ioc_clear_queue(q);
+	blk_mq_sched_free_rqs(q);
+
 	mutex_lock(&e->sysfs_lock);
 	blk_mq_exit_sched(q, e);
 	mutex_unlock(&e->sysfs_lock);
@@ -596,8 +599,6 @@ int elevator_switch_mq(struct request_queue *q,
 
 	if (q->elevator) {
 		elv_unregister_queue(q);
-		ioc_clear_queue(q);
-		blk_mq_sched_free_rqs(q);
 		elevator_exit(q);
 	}
 
@@ -608,7 +609,6 @@ int elevator_switch_mq(struct request_queue *q,
 	if (new_e) {
 		ret = elv_register_queue(q, true);
 		if (ret) {
-			blk_mq_sched_free_rqs(q);
 			elevator_exit(q);
 			goto out;
 		}
diff --git a/block/genhd.c b/block/genhd.c
index 73705a749ea92..857e0a54da7dd 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1115,10 +1115,7 @@ static void disk_release_mq(struct request_queue *q)
 	 * cgroup controller.
 	 */
 	if (q->elevator) {
-		ioc_clear_queue(q);
-
 		mutex_lock(&q->sysfs_lock);
-		blk_mq_sched_free_rqs(q);
 		elevator_exit(q);
 		mutex_unlock(&q->sysfs_lock);
 	}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 14/14] block: move rq_qos_exit() into disk_release()
  2022-03-04 16:03 move more work to disk_release v3 Christoph Hellwig
                   ` (12 preceding siblings ...)
  2022-03-04 16:03 ` [PATCH 13/14] block: do more work in elevator_exit Christoph Hellwig
@ 2022-03-04 16:03 ` Christoph Hellwig
  2022-03-06 20:51   ` Bart Van Assche
  13 siblings, 1 reply; 51+ messages in thread
From: Christoph Hellwig @ 2022-03-04 16:03 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Martin K. Petersen, Ming Lei, Bart Van Assche, linux-block, linux-scsi

From: Ming Lei <ming.lei@redhat.com>

There can't be FS IO in disk_release(), so it is safe to move rq_qos_exit()
there.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/genhd.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index 857e0a54da7dd..56f66c6fee943 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -627,7 +627,6 @@ void del_gendisk(struct gendisk *disk)
 
 	blk_mq_freeze_queue_wait(q);
 
-	rq_qos_exit(q);
 	blk_sync_queue(q);
 	blk_flush_integrity();
 	/*
@@ -1119,7 +1118,7 @@ static void disk_release_mq(struct request_queue *q)
 		elevator_exit(q);
 		mutex_unlock(&q->sysfs_lock);
 	}
-
+	rq_qos_exit(q);
 	__blk_mq_unfreeze_queue(q, true);
 }
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH 01/14] blk-mq: do not include passthrough requests in I/O accounting
  2022-03-04 16:03 ` [PATCH 01/14] blk-mq: do not include passthrough requests in I/O accounting Christoph Hellwig
@ 2022-03-06  1:02   ` Bart Van Assche
  2022-03-07  3:12   ` Chaitanya Kulkarni
  2022-03-08  3:13   ` Martin K. Petersen
  2 siblings, 0 replies; 51+ messages in thread
From: Bart Van Assche @ 2022-03-06  1:02 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Martin K. Petersen, Ming Lei, linux-block, linux-scsi

On 3/4/22 08:03, Christoph Hellwig wrote:
> I/O accounting buckets I/O into the read/write/discard categories into
> which passthrough I/O does not fit at all.  It also accounts to the
> block_device, which may not even exist for passthrough I/O.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 02/14] blk-mq: handle already freed tags gracefully in blk_mq_free_rqs
  2022-03-04 16:03 ` [PATCH 02/14] blk-mq: handle already freed tags gracefully in blk_mq_free_rqs Christoph Hellwig
@ 2022-03-06  1:05   ` Bart Van Assche
  2022-03-08  3:14   ` Martin K. Petersen
  1 sibling, 0 replies; 51+ messages in thread
From: Bart Van Assche @ 2022-03-06  1:05 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Martin K. Petersen, Ming Lei, linux-block, linux-scsi

On 3/4/22 08:03, Christoph Hellwig wrote:
> From: Ming Lei <ming.lei@redhat.com>
> 
> To simplify further changes allow for double calling blk_mq_free_rqs on
> a queue.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 03/14] scsi: don't use disk->private_data to find the scsi_driver
  2022-03-04 16:03 ` [PATCH 03/14] scsi: don't use disk->private_data to find the scsi_driver Christoph Hellwig
@ 2022-03-06  1:27   ` Bart Van Assche
  2022-03-06  3:09   ` Ming Lei
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 51+ messages in thread
From: Bart Van Assche @ 2022-03-06  1:27 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Martin K. Petersen, Ming Lei, linux-block, linux-scsi

On 3/4/22 08:03, Christoph Hellwig wrote:
> Requiring every ULP to have the scsi_drive as first member of the
> private data is rather fragile and not necessary anyway.  Just use
> the driver hanging off the SCSI device instead.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 04/14] sd: rename the scsi_disk.dev field
  2022-03-04 16:03 ` [PATCH 04/14] sd: rename the scsi_disk.dev field Christoph Hellwig
@ 2022-03-06  1:38   ` Bart Van Assche
  2022-03-06  8:40     ` Christoph Hellwig
  2022-03-06  3:31   ` Ming Lei
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 51+ messages in thread
From: Bart Van Assche @ 2022-03-06  1:38 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Martin K. Petersen, Ming Lei, linux-block, linux-scsi

On 3/4/22 08:03, Christoph Hellwig wrote:
> +	/*
> +	 * This device is mostly just used to show a bunch of attributes in a
> +	 * weird place.  In doubt don't add any new users, and most importantly
> +	 * don't use if for any actual refcounting.
> +	 */
> +	struct device	disk_dev;

Isn't "weird place" subjective? How about mentioning the sysfs path explicitly 
(/sys/class/scsi_disk/H:C:I:L)? How about explaining why no new sysfs 
attributes should be added to that device instance?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 05/14] sd: call sd_zbc_release_disk before releasing the scsi_device reference
  2022-03-04 16:03 ` [PATCH 05/14] sd: call sd_zbc_release_disk before releasing the scsi_device reference Christoph Hellwig
@ 2022-03-06  1:44   ` Bart Van Assche
  2022-03-06  3:33   ` Ming Lei
  2022-03-08  3:25   ` Martin K. Petersen
  2 siblings, 0 replies; 51+ messages in thread
From: Bart Van Assche @ 2022-03-06  1:44 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Martin K. Petersen, Ming Lei, linux-block, linux-scsi

On 3/4/22 08:03, Christoph Hellwig wrote:
> sd_zbc_release_disk accesses disk->device, so ensure that actually still has
> a valid reference.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 06/14] sd: delay calling free_opal_dev
  2022-03-04 16:03 ` [PATCH 06/14] sd: delay calling free_opal_dev Christoph Hellwig
@ 2022-03-06  1:45   ` Bart Van Assche
  2022-03-06  3:41   ` Ming Lei
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 51+ messages in thread
From: Bart Van Assche @ 2022-03-06  1:45 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Martin K. Petersen, Ming Lei, linux-block, linux-scsi

On 3/4/22 08:03, Christoph Hellwig wrote:
> Call free_opal_dev from scsi_disk_release as the opal_dev field is access
> from the ioctl handler, which isn't synchronized vs sd_release and thus
> can be accesses during or after sd_release was called.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 07/14] sd: make use of ->free_disk to simplify refcounting
  2022-03-04 16:03 ` [PATCH 07/14] sd: make use of ->free_disk to simplify refcounting Christoph Hellwig
@ 2022-03-06  2:03   ` Bart Van Assche
  2022-03-06  8:46     ` Christoph Hellwig
  2022-03-06  3:54   ` Ming Lei
  2022-03-08  3:29   ` Martin K. Petersen
  2 siblings, 1 reply; 51+ messages in thread
From: Bart Van Assche @ 2022-03-06  2:03 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Martin K. Petersen, Ming Lei, linux-block, linux-scsi

On 3/4/22 08:03, Christoph Hellwig wrote:
>   error_out:
> -	scsi_disk_put(sdkp);
> +	scsi_device_put(sdkp->device);
>   	return retval;	
>   }

Hmm ... why is the above scsi_device_put() call passed sdkp->device? Wouldn't 
it be more symmetric to pass 'sdev' to that function?

> @@ -1502,7 +1468,7 @@ static void sd_release(struct gendisk *disk, fmode_t mode)
>   			scsi_set_medium_removal(sdev, SCSI_REMOVAL_ALLOW);
>   	}
>   
> -	scsi_disk_put(sdkp);
> +	scsi_device_put(sdkp->device);
>   }

Same question here - why to pass sdkp->device instead of sdev?

> +static void scsi_disk_free_disk(struct gendisk *disk)
> +{
> +	struct scsi_disk *sdkp = disk->private_data;
> +
> +	put_device(&sdkp->disk_dev);
> +}

Can the body of the above function be written as 
put_device(&scsi_disk(disk)->disk_dev) ? I'm asking this because other parts of 
this patch use scsi_disk() instead of using disk->private_data directly.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 03/14] scsi: don't use disk->private_data to find the scsi_driver
  2022-03-04 16:03 ` [PATCH 03/14] scsi: don't use disk->private_data to find the scsi_driver Christoph Hellwig
  2022-03-06  1:27   ` Bart Van Assche
@ 2022-03-06  3:09   ` Ming Lei
  2022-03-07  3:14   ` Chaitanya Kulkarni
  2022-03-08  3:16   ` Martin K. Petersen
  3 siblings, 0 replies; 51+ messages in thread
From: Ming Lei @ 2022-03-06  3:09 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Martin K. Petersen, Bart Van Assche, linux-block, linux-scsi

On Fri, Mar 04, 2022 at 05:03:20PM +0100, Christoph Hellwig wrote:
> Requiring every ULP to have the scsi_drive as first member of the
> private data is rather fragile and not necessary anyway.  Just use
> the driver hanging off the SCSI device instead.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Ming Lei <ming.lei@redhat.com>

-- 
Ming


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 04/14] sd: rename the scsi_disk.dev field
  2022-03-04 16:03 ` [PATCH 04/14] sd: rename the scsi_disk.dev field Christoph Hellwig
  2022-03-06  1:38   ` Bart Van Assche
@ 2022-03-06  3:31   ` Ming Lei
  2022-03-06  8:43     ` Christoph Hellwig
  2022-03-07  3:16   ` Chaitanya Kulkarni
  2022-03-08  3:25   ` Martin K. Petersen
  3 siblings, 1 reply; 51+ messages in thread
From: Ming Lei @ 2022-03-06  3:31 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Martin K. Petersen, Bart Van Assche, linux-block, linux-scsi

On Fri, Mar 04, 2022 at 05:03:21PM +0100, Christoph Hellwig wrote:
> dev is very hard to grab for.  Give the field a more descriptive name and
> documents it's purpose.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  drivers/scsi/sd.c | 22 +++++++++++-----------
>  drivers/scsi/sd.h | 10 ++++++++--
>  2 files changed, 19 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index 2a1e19e871d30..7479e7cb36b43 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -672,7 +672,7 @@ static struct scsi_disk *scsi_disk_get(struct gendisk *disk)
>  	if (disk->private_data) {
>  		sdkp = scsi_disk(disk);
>  		if (scsi_device_get(sdkp->device) == 0)
> -			get_device(&sdkp->dev);
> +			get_device(&sdkp->disk_dev);
>  		else
>  			sdkp = NULL;
>  	}
> @@ -685,7 +685,7 @@ static void scsi_disk_put(struct scsi_disk *sdkp)
>  	struct scsi_device *sdev = sdkp->device;
>  
>  	mutex_lock(&sd_ref_mutex);
> -	put_device(&sdkp->dev);
> +	put_device(&sdkp->disk_dev);
>  	scsi_device_put(sdev);
>  	mutex_unlock(&sd_ref_mutex);
>  }
> @@ -3529,14 +3529,14 @@ static int sd_probe(struct device *dev)
>  					     SD_MOD_TIMEOUT);
>  	}
>  
> -	device_initialize(&sdkp->dev);
> -	sdkp->dev.parent = get_device(dev);
> -	sdkp->dev.class = &sd_disk_class;
> -	dev_set_name(&sdkp->dev, "%s", dev_name(dev));
> +	device_initialize(&sdkp->disk_dev);
> +	sdkp->disk_dev.parent = get_device(dev);
> +	sdkp->disk_dev.class = &sd_disk_class;
> +	dev_set_name(&sdkp->disk_dev, "%s", dev_name(dev));
>  
> -	error = device_add(&sdkp->dev);
> +	error = device_add(&sdkp->disk_dev);
>  	if (error) {
> -		put_device(&sdkp->dev);
> +		put_device(&sdkp->disk_dev);
>  		goto out;
>  	}
>  
> @@ -3577,7 +3577,7 @@ static int sd_probe(struct device *dev)
>  
>  	error = device_add_disk(dev, gd, NULL);
>  	if (error) {
> -		put_device(&sdkp->dev);
> +		put_device(&sdkp->disk_dev);
>  		goto out;
>  	}
>  
> @@ -3628,7 +3628,7 @@ static int sd_remove(struct device *dev)
>  	sdkp = dev_get_drvdata(dev);
>  	scsi_autopm_get_device(sdkp->device);
>  
> -	device_del(&sdkp->dev);
> +	device_del(&sdkp->disk_dev);
>  	del_gendisk(sdkp->disk);
>  	sd_shutdown(dev);
>  
> @@ -3636,7 +3636,7 @@ static int sd_remove(struct device *dev)
>  
>  	mutex_lock(&sd_ref_mutex);
>  	dev_set_drvdata(dev, NULL);
> -	put_device(&sdkp->dev);
> +	put_device(&sdkp->disk_dev);
>  	mutex_unlock(&sd_ref_mutex);
>  
>  	return 0;
> diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
> index 303aa1c23aefb..7625a90b0fa69 100644
> --- a/drivers/scsi/sd.h
> +++ b/drivers/scsi/sd.h
> @@ -69,7 +69,13 @@ enum {
>  
>  struct scsi_disk {
>  	struct scsi_device *device;
> -	struct device	dev;
> +
> +	/*
> +	 * This device is mostly just used to show a bunch of attributes in a
> +	 * weird place.  In doubt don't add any new users, and most importantly
> +	 * don't use if for any actual refcounting.
> +	 */

The device looks partner of gendisk, I think it could just be a
private data of gendisk, and the attributes can be added to gendisk.

But scsi has the tradition of adding class device of scsi_host,
scsi_device, scsi_disk and scsi_generic.

Adding such device makes things complicated, such as refcounting
in open/close disk. But looks scsi_disk isn't part of sysfs ABI, maybe it
can be removed, anyway:

Reviewed-by: Ming Lei <ming.lei@redhat.com>


Thanks,
Ming


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 05/14] sd: call sd_zbc_release_disk before releasing the scsi_device reference
  2022-03-04 16:03 ` [PATCH 05/14] sd: call sd_zbc_release_disk before releasing the scsi_device reference Christoph Hellwig
  2022-03-06  1:44   ` Bart Van Assche
@ 2022-03-06  3:33   ` Ming Lei
  2022-03-08  3:25   ` Martin K. Petersen
  2 siblings, 0 replies; 51+ messages in thread
From: Ming Lei @ 2022-03-06  3:33 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Martin K. Petersen, Bart Van Assche, linux-block, linux-scsi

On Fri, Mar 04, 2022 at 05:03:22PM +0100, Christoph Hellwig wrote:
> sd_zbc_release_disk accesses disk->device, so ensure that actually still has
> a valid reference.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Ming Lei <ming.lei@redhat.com>

-- 
Ming


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 06/14] sd: delay calling free_opal_dev
  2022-03-04 16:03 ` [PATCH 06/14] sd: delay calling free_opal_dev Christoph Hellwig
  2022-03-06  1:45   ` Bart Van Assche
@ 2022-03-06  3:41   ` Ming Lei
  2022-03-07  3:17   ` Chaitanya Kulkarni
  2022-03-08  3:26   ` Martin K. Petersen
  3 siblings, 0 replies; 51+ messages in thread
From: Ming Lei @ 2022-03-06  3:41 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Martin K. Petersen, Bart Van Assche, linux-block, linux-scsi

On Fri, Mar 04, 2022 at 05:03:23PM +0100, Christoph Hellwig wrote:
> Call free_opal_dev from scsi_disk_release as the opal_dev field is access
> from the ioctl handler, which isn't synchronized vs sd_release and thus
> can be accesses during or after sd_release was called.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Ming Lei <ming.lei@redhat.com>

-- 
Ming


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 07/14] sd: make use of ->free_disk to simplify refcounting
  2022-03-04 16:03 ` [PATCH 07/14] sd: make use of ->free_disk to simplify refcounting Christoph Hellwig
  2022-03-06  2:03   ` Bart Van Assche
@ 2022-03-06  3:54   ` Ming Lei
  2022-03-08  3:29   ` Martin K. Petersen
  2 siblings, 0 replies; 51+ messages in thread
From: Ming Lei @ 2022-03-06  3:54 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Martin K. Petersen, Bart Van Assche, linux-block, linux-scsi

On Fri, Mar 04, 2022 at 05:03:24PM +0100, Christoph Hellwig wrote:
> Implement the ->free_disk method to to put struct scsi_disk when the last
> gendisk reference count goes away.  This removes the need to clear
> ->private_data and thus freeze the queue on unbind.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Nice cleanup:

Reviewed-by: Ming Lei <ming.lei@redhat.com>

-- 
Ming


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 08/14] sr: implement ->free_disk
  2022-03-04 16:03 ` [PATCH 08/14] sr: implement ->free_disk Christoph Hellwig
@ 2022-03-06  4:01   ` Ming Lei
  2022-03-06 21:44   ` Bart Van Assche
  2022-03-08  3:31   ` Martin K. Petersen
  2 siblings, 0 replies; 51+ messages in thread
From: Ming Lei @ 2022-03-06  4:01 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Martin K. Petersen, Bart Van Assche, linux-block, linux-scsi

On Fri, Mar 04, 2022 at 05:03:25PM +0100, Christoph Hellwig wrote:
> Simplify the refcounting and remove the need to clear disk->private_data
> by implementing the ->free_disk method.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Ming Lei <ming.lei@redhat.com>

-- 
Ming


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 04/14] sd: rename the scsi_disk.dev field
  2022-03-06  1:38   ` Bart Van Assche
@ 2022-03-06  8:40     ` Christoph Hellwig
  2022-03-06 20:34       ` Bart Van Assche
  0 siblings, 1 reply; 51+ messages in thread
From: Christoph Hellwig @ 2022-03-06  8:40 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Christoph Hellwig, Jens Axboe, Martin K. Petersen, Ming Lei,
	linux-block, linux-scsi

On Sat, Mar 05, 2022 at 05:38:40PM -0800, Bart Van Assche wrote:
> On 3/4/22 08:03, Christoph Hellwig wrote:
>> +	/*
>> +	 * This device is mostly just used to show a bunch of attributes in a
>> +	 * weird place.  In doubt don't add any new users, and most importantly
>> +	 * don't use if for any actual refcounting.
>> +	 */
>> +	struct device	disk_dev;
>
> Isn't "weird place" subjective? How about mentioning the sysfs path 
> explicitly (/sys/class/scsi_disk/H:C:I:L)? How about explaining why no new 
> sysfs attributes should be added to that device instance?

Well, weird place means that all normale drivers would just use
attributes on the gendisk for it, but sd creates a completely pointless
device under the gendisk device for it.  If you have a better wording
I can change it.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 04/14] sd: rename the scsi_disk.dev field
  2022-03-06  3:31   ` Ming Lei
@ 2022-03-06  8:43     ` Christoph Hellwig
  0 siblings, 0 replies; 51+ messages in thread
From: Christoph Hellwig @ 2022-03-06  8:43 UTC (permalink / raw)
  To: Ming Lei
  Cc: Christoph Hellwig, Jens Axboe, Martin K. Petersen,
	Bart Van Assche, linux-block, linux-scsi

On Sun, Mar 06, 2022 at 11:31:33AM +0800, Ming Lei wrote:
> The device looks partner of gendisk, I think it could just be a
> private data of gendisk, and the attributes can be added to gendisk.

Yes, that would be the normal way to do it.

> 
> But scsi has the tradition of adding class device of scsi_host,
> scsi_device, scsi_disk and scsi_generic.
> 
> Adding such device makes things complicated, such as refcounting
> in open/close disk. But looks scsi_disk isn't part of sysfs ABI, maybe it
> can be removed, anyway:

Unfortunatly it is in a major way:

root@testvm:~# ls /sys/class/scsi_disk/0\:0\:0\:0
FUA	       manage_start_stop	   protection_mode    uevent
allow_restart  max_medium_access_timeouts  protection_type    zeroing_mode
app_tag_own    max_retries		   provisioning_mode  zoned_cap
cache_type     max_write_same_blocks	   subsystem
device	       power			   thin_provisioning

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 07/14] sd: make use of ->free_disk to simplify refcounting
  2022-03-06  2:03   ` Bart Van Assche
@ 2022-03-06  8:46     ` Christoph Hellwig
  0 siblings, 0 replies; 51+ messages in thread
From: Christoph Hellwig @ 2022-03-06  8:46 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Christoph Hellwig, Jens Axboe, Martin K. Petersen, Ming Lei,
	linux-block, linux-scsi

On Sat, Mar 05, 2022 at 06:03:39PM -0800, Bart Van Assche wrote:
>> -	scsi_disk_put(sdkp);
>> +	scsi_device_put(sdkp->device);
>>   	return retval;	
>>   }
>
> Hmm ... why is the above scsi_device_put() call passed sdkp->device? 
> Wouldn't it be more symmetric to pass 'sdev' to that function?
>
>> @@ -1502,7 +1468,7 @@ static void sd_release(struct gendisk *disk, fmode_t mode)
>>   			scsi_set_medium_removal(sdev, SCSI_REMOVAL_ALLOW);
>>   	}
>>   -	scsi_disk_put(sdkp);
>> +	scsi_device_put(sdkp->device);
>>   }
>
> Same question here - why to pass sdkp->device instead of sdev?

Yes, we can just pass sdev in both cases as that is a bit cleaner.

>
>> +static void scsi_disk_free_disk(struct gendisk *disk)
>> +{
>> +	struct scsi_disk *sdkp = disk->private_data;
>> +
>> +	put_device(&sdkp->disk_dev);
>> +}
>
> Can the body of the above function be written as 
> put_device(&scsi_disk(disk)->disk_dev) ? I'm asking this because other 
> parts of this patch use scsi_disk() instead of using disk->private_data 
> directly.

The scsi_disk() helper is a bit pointless now, but I could use it
here for now.  In the long run we should probably just remove
scsi_disk() entirely.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 04/14] sd: rename the scsi_disk.dev field
  2022-03-06  8:40     ` Christoph Hellwig
@ 2022-03-06 20:34       ` Bart Van Assche
  0 siblings, 0 replies; 51+ messages in thread
From: Bart Van Assche @ 2022-03-06 20:34 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Martin K. Petersen, Ming Lei, linux-block, linux-scsi

On 3/6/22 00:40, Christoph Hellwig wrote:
> On Sat, Mar 05, 2022 at 05:38:40PM -0800, Bart Van Assche wrote:
>> On 3/4/22 08:03, Christoph Hellwig wrote:
>>> +	/*
>>> +	 * This device is mostly just used to show a bunch of attributes in a
>>> +	 * weird place.  In doubt don't add any new users, and most importantly
>>> +	 * don't use if for any actual refcounting.
>>> +	 */
>>> +	struct device	disk_dev;
>>
>> Isn't "weird place" subjective? How about mentioning the sysfs path
>> explicitly (/sys/class/scsi_disk/H:C:I:L)? How about explaining why no new
>> sysfs attributes should be added to that device instance?
> 
> Well, weird place means that all normale drivers would just use
> attributes on the gendisk for it, but sd creates a completely pointless
> device under the gendisk device for it.  If you have a better wording
> I can change it.

It's not that important. I wish it would be possible to get rid of this 
struct device instance. I think this instance was introduced in 2006 by 
patch "[SCSI] allow displaying and setting of cache type via sysfs".

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 14/14] block: move rq_qos_exit() into disk_release()
  2022-03-04 16:03 ` [PATCH 14/14] block: move rq_qos_exit() into disk_release() Christoph Hellwig
@ 2022-03-06 20:51   ` Bart Van Assche
  2022-03-07  2:50     ` Ming Lei
  0 siblings, 1 reply; 51+ messages in thread
From: Bart Van Assche @ 2022-03-06 20:51 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Martin K. Petersen, Ming Lei, linux-block, linux-scsi

On 3/4/22 08:03, Christoph Hellwig wrote:
> From: Ming Lei <ming.lei@redhat.com>
> 
> There can't be FS IO in disk_release(), so it is safe to move rq_qos_exit()
> there.

The commit message only explains why it is safe to move rq_qos_exit() 
but not why moving that function call is useful. Please add an 
explanation of why moving that function call is useful and/or necessary.

> diff --git a/block/genhd.c b/block/genhd.c
> index 857e0a54da7dd..56f66c6fee943 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -627,7 +627,6 @@ void del_gendisk(struct gendisk *disk)
>   
>   	blk_mq_freeze_queue_wait(q);
>   
> -	rq_qos_exit(q);
>   	blk_sync_queue(q);
>   	blk_flush_integrity();
>   	/*
> @@ -1119,7 +1118,7 @@ static void disk_release_mq(struct request_queue *q)
>   		elevator_exit(q);
>   		mutex_unlock(&q->sysfs_lock);
>   	}
> -
> +	rq_qos_exit(q);
>   	__blk_mq_unfreeze_queue(q, true);
>   }

Commit 8e141f9eb803 ("block: drain file system I/O on del_gendisk") 
removed the rq_qos_exit() call from blk_cleanup_queue(). This patch 
series does not restore the rq_qos_exit() call in blk_cleanup_queue(). I 
think that call should be restored since rq_qos_add() can be called 
before add_disk() is called. I'm referring to the following call chain:

__alloc_disk_node()
   blkcg_init_queue()
     blk_iolatency_init()
       rq_qos_add()

sd_probe() is one of the functions that can take an error path after 
__alloc_disk_node() has returned and before device_add_disk() is called.

Thanks,

Bart.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 12/14] block: move blk_exit_queue into disk_release
  2022-03-04 16:03 ` [PATCH 12/14] block: move blk_exit_queue into disk_release Christoph Hellwig
@ 2022-03-06 21:21   ` Bart Van Assche
  0 siblings, 0 replies; 51+ messages in thread
From: Bart Van Assche @ 2022-03-06 21:21 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Martin K. Petersen, Ming Lei, linux-block, linux-scsi

On 3/4/22 08:03, Christoph Hellwig wrote:
> From: Ming Lei <ming.lei@redhat.com>
> 
> There can't be FS IO in disk_release(), so move blk_exit_queue() there.
> 
> We still need to freeze queue here since the request is freed after the
> bio is completed and passthrough request rely on scheduler tags as well.
> 
> The disk can be released before or after queue is cleaned up, and we have
> to free the scheduler request pool before blk_cleanup_queue returns,
> while the static request pool has to be freed before exiting the
> I/O scheduler.

The above explains why it is safe to make this change but not why this 
change is made. Please add an explanation of why this change is made. 
Anyway:

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 08/14] sr: implement ->free_disk
  2022-03-04 16:03 ` [PATCH 08/14] sr: implement ->free_disk Christoph Hellwig
  2022-03-06  4:01   ` Ming Lei
@ 2022-03-06 21:44   ` Bart Van Assche
  2022-03-08  3:31   ` Martin K. Petersen
  2 siblings, 0 replies; 51+ messages in thread
From: Bart Van Assche @ 2022-03-06 21:44 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Martin K. Petersen, Ming Lei, linux-block, linux-scsi

On 3/4/22 08:03, Christoph Hellwig wrote:
> Simplify the refcounting and remove the need to clear disk->private_data
> by implementing the ->free_disk method.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 14/14] block: move rq_qos_exit() into disk_release()
  2022-03-06 20:51   ` Bart Van Assche
@ 2022-03-07  2:50     ` Ming Lei
  0 siblings, 0 replies; 51+ messages in thread
From: Ming Lei @ 2022-03-07  2:50 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Christoph Hellwig, Jens Axboe, Martin K. Petersen, linux-block,
	linux-scsi

On Sun, Mar 06, 2022 at 12:51:37PM -0800, Bart Van Assche wrote:
> On 3/4/22 08:03, Christoph Hellwig wrote:
> > From: Ming Lei <ming.lei@redhat.com>
> > 
> > There can't be FS IO in disk_release(), so it is safe to move rq_qos_exit()
> > there.
> 
> The commit message only explains why it is safe to move rq_qos_exit() but
> not why moving that function call is useful. Please add an explanation of
> why moving that function call is useful and/or necessary.
> 
> > diff --git a/block/genhd.c b/block/genhd.c
> > index 857e0a54da7dd..56f66c6fee943 100644
> > --- a/block/genhd.c
> > +++ b/block/genhd.c
> > @@ -627,7 +627,6 @@ void del_gendisk(struct gendisk *disk)
> >   	blk_mq_freeze_queue_wait(q);
> > -	rq_qos_exit(q);
> >   	blk_sync_queue(q);
> >   	blk_flush_integrity();
> >   	/*
> > @@ -1119,7 +1118,7 @@ static void disk_release_mq(struct request_queue *q)
> >   		elevator_exit(q);
> >   		mutex_unlock(&q->sysfs_lock);
> >   	}
> > -
> > +	rq_qos_exit(q);
> >   	__blk_mq_unfreeze_queue(q, true);
> >   }
> 
> Commit 8e141f9eb803 ("block: drain file system I/O on del_gendisk") removed
> the rq_qos_exit() call from blk_cleanup_queue(). This patch series does not
> restore the rq_qos_exit() call in blk_cleanup_queue(). I think that call
> should be restored since rq_qos_add() can be called before add_disk() is
> called. I'm referring to the following call chain:
> 
> __alloc_disk_node()
>   blkcg_init_queue()
>     blk_iolatency_init()
>       rq_qos_add()
> 
> sd_probe() is one of the functions that can take an error path after
> __alloc_disk_node() has returned and before device_add_disk() is called.

blkcg_exit_queue() is called in disk_release(), so this error handing is
covered since put_disk() should be called once __alloc_disk_node()
is successful, no matter disk is added or not.

We move rq_qos_exit() to disk_release() too, then every FS IO related
resources are released in disk_release().


Thanks,
Ming


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 01/14] blk-mq: do not include passthrough requests in I/O accounting
  2022-03-04 16:03 ` [PATCH 01/14] blk-mq: do not include passthrough requests in I/O accounting Christoph Hellwig
  2022-03-06  1:02   ` Bart Van Assche
@ 2022-03-07  3:12   ` Chaitanya Kulkarni
  2022-03-08  3:13   ` Martin K. Petersen
  2 siblings, 0 replies; 51+ messages in thread
From: Chaitanya Kulkarni @ 2022-03-07  3:12 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Martin K. Petersen, Ming Lei, Bart Van Assche, linux-block, linux-scsi

On 3/4/22 08:03, Christoph Hellwig wrote:
> I/O accounting buckets I/O into the read/write/discard categories into
> which passthrough I/O does not fit at all.  It also accounts to the
> block_device, which may not even exist for passthrough I/O.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---


Looks good.

Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 03/14] scsi: don't use disk->private_data to find the scsi_driver
  2022-03-04 16:03 ` [PATCH 03/14] scsi: don't use disk->private_data to find the scsi_driver Christoph Hellwig
  2022-03-06  1:27   ` Bart Van Assche
  2022-03-06  3:09   ` Ming Lei
@ 2022-03-07  3:14   ` Chaitanya Kulkarni
  2022-03-08  3:16   ` Martin K. Petersen
  3 siblings, 0 replies; 51+ messages in thread
From: Chaitanya Kulkarni @ 2022-03-07  3:14 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Martin K. Petersen, Ming Lei, Bart Van Assche, linux-block, linux-scsi

On 3/4/22 08:03, Christoph Hellwig wrote:
> Requiring every ULP to have the scsi_drive as first member of the
> private data is rather fragile and not necessary anyway.  Just use
> the driver hanging off the SCSI device instead.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---


Looks good.

Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 04/14] sd: rename the scsi_disk.dev field
  2022-03-04 16:03 ` [PATCH 04/14] sd: rename the scsi_disk.dev field Christoph Hellwig
  2022-03-06  1:38   ` Bart Van Assche
  2022-03-06  3:31   ` Ming Lei
@ 2022-03-07  3:16   ` Chaitanya Kulkarni
  2022-03-08  3:25   ` Martin K. Petersen
  3 siblings, 0 replies; 51+ messages in thread
From: Chaitanya Kulkarni @ 2022-03-07  3:16 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Martin K. Petersen, Ming Lei, Bart Van Assche, linux-block, linux-scsi

On 3/4/22 08:03, Christoph Hellwig wrote:
> dev is very hard to grab for.  Give the field a more descriptive name and
> documents it's purpose.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---


Looks good.

Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 06/14] sd: delay calling free_opal_dev
  2022-03-04 16:03 ` [PATCH 06/14] sd: delay calling free_opal_dev Christoph Hellwig
  2022-03-06  1:45   ` Bart Van Assche
  2022-03-06  3:41   ` Ming Lei
@ 2022-03-07  3:17   ` Chaitanya Kulkarni
  2022-03-08  3:26   ` Martin K. Petersen
  3 siblings, 0 replies; 51+ messages in thread
From: Chaitanya Kulkarni @ 2022-03-07  3:17 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Martin K. Petersen, Ming Lei, Bart Van Assche, linux-block, linux-scsi

On 3/4/22 08:03, Christoph Hellwig wrote:
> Call free_opal_dev from scsi_disk_release as the opal_dev field is access
> from the ioctl handler, which isn't synchronized vs sd_release and thus
> can be accesses during or after sd_release was called.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---


Looks good.

Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 09/14] block: move blkcg initialization/destroy into disk allocation/release handler
  2022-03-04 16:03 ` [PATCH 09/14] block: move blkcg initialization/destroy into disk allocation/release handler Christoph Hellwig
@ 2022-03-07  3:18   ` Chaitanya Kulkarni
  0 siblings, 0 replies; 51+ messages in thread
From: Chaitanya Kulkarni @ 2022-03-07  3:18 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Martin K. Petersen, Ming Lei, Bart Van Assche, linux-block, linux-scsi

On 3/4/22 08:03, Christoph Hellwig wrote:
> From: Ming Lei <ming.lei@redhat.com>
> 
> blkcg works on FS bio level, so it is reasonable to make both blkcg and
> gendisk sharing same lifetime. Meantime there won't be any FS IO when
> releasing disk, so safe to move blkcg initialization/destroy into disk
> allocation/release handler
> 
> Long term, we can move blkcg into gendisk completely.
> 
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> Reviewed-by: Bart Van Assche <bvanassche@acm.org>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---


Looks good.

Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 01/14] blk-mq: do not include passthrough requests in I/O accounting
  2022-03-04 16:03 ` [PATCH 01/14] blk-mq: do not include passthrough requests in I/O accounting Christoph Hellwig
  2022-03-06  1:02   ` Bart Van Assche
  2022-03-07  3:12   ` Chaitanya Kulkarni
@ 2022-03-08  3:13   ` Martin K. Petersen
  2 siblings, 0 replies; 51+ messages in thread
From: Martin K. Petersen @ 2022-03-08  3:13 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Martin K. Petersen, Ming Lei, Bart Van Assche,
	linux-block, linux-scsi


Christoph,

> I/O accounting buckets I/O into the read/write/discard categories into
> which passthrough I/O does not fit at all.  It also accounts to the
> block_device, which may not even exist for passthrough I/O.

Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 02/14] blk-mq: handle already freed tags gracefully in blk_mq_free_rqs
  2022-03-04 16:03 ` [PATCH 02/14] blk-mq: handle already freed tags gracefully in blk_mq_free_rqs Christoph Hellwig
  2022-03-06  1:05   ` Bart Van Assche
@ 2022-03-08  3:14   ` Martin K. Petersen
  1 sibling, 0 replies; 51+ messages in thread
From: Martin K. Petersen @ 2022-03-08  3:14 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Martin K. Petersen, Ming Lei, Bart Van Assche,
	linux-block, linux-scsi


Christoph,

> To simplify further changes allow for double calling blk_mq_free_rqs
> on a queue.

Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 03/14] scsi: don't use disk->private_data to find the scsi_driver
  2022-03-04 16:03 ` [PATCH 03/14] scsi: don't use disk->private_data to find the scsi_driver Christoph Hellwig
                     ` (2 preceding siblings ...)
  2022-03-07  3:14   ` Chaitanya Kulkarni
@ 2022-03-08  3:16   ` Martin K. Petersen
  3 siblings, 0 replies; 51+ messages in thread
From: Martin K. Petersen @ 2022-03-08  3:16 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Martin K. Petersen, Ming Lei, Bart Van Assche,
	linux-block, linux-scsi


Christoph,

> Requiring every ULP to have the scsi_drive as first member of the
> private data is rather fragile and not necessary anyway.  Just use
> the driver hanging off the SCSI device instead.

Looks fine.

Acked-by: Martin K. Petersen <martin.petersen@oracle.com>

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 04/14] sd: rename the scsi_disk.dev field
  2022-03-04 16:03 ` [PATCH 04/14] sd: rename the scsi_disk.dev field Christoph Hellwig
                     ` (2 preceding siblings ...)
  2022-03-07  3:16   ` Chaitanya Kulkarni
@ 2022-03-08  3:25   ` Martin K. Petersen
  3 siblings, 0 replies; 51+ messages in thread
From: Martin K. Petersen @ 2022-03-08  3:25 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Martin K. Petersen, Ming Lei, Bart Van Assche,
	linux-block, linux-scsi


Christoph,

> dev is very hard to grab for.

grep?

> Give the field a more descriptive name and documents it's purpose.

its

> +
> +	/*
> +	 * This device is mostly just used to show a bunch of attributes in a
> +	 * weird place.  In doubt don't add any new users, and most importantly
> +	 * don't use if for any actual refcounting.
> +	 */
> +	struct device	disk_dev;

I agree with Bart that this should be more explicit about the
/sys/class/scsi_disk location. Otherwise it is not clear which device
the comment refers to.

Otherwise OK.

Acked-by: Martin K. Petersen <martin.petersen@oracle.com>

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 05/14] sd: call sd_zbc_release_disk before releasing the scsi_device reference
  2022-03-04 16:03 ` [PATCH 05/14] sd: call sd_zbc_release_disk before releasing the scsi_device reference Christoph Hellwig
  2022-03-06  1:44   ` Bart Van Assche
  2022-03-06  3:33   ` Ming Lei
@ 2022-03-08  3:25   ` Martin K. Petersen
  2 siblings, 0 replies; 51+ messages in thread
From: Martin K. Petersen @ 2022-03-08  3:25 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Martin K. Petersen, Ming Lei, Bart Van Assche,
	linux-block, linux-scsi


Christoph,

> sd_zbc_release_disk accesses disk->device, so ensure that actually
> still has a valid reference.

Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 06/14] sd: delay calling free_opal_dev
  2022-03-04 16:03 ` [PATCH 06/14] sd: delay calling free_opal_dev Christoph Hellwig
                     ` (2 preceding siblings ...)
  2022-03-07  3:17   ` Chaitanya Kulkarni
@ 2022-03-08  3:26   ` Martin K. Petersen
  3 siblings, 0 replies; 51+ messages in thread
From: Martin K. Petersen @ 2022-03-08  3:26 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Martin K. Petersen, Ming Lei, Bart Van Assche,
	linux-block, linux-scsi


Christoph,

> Call free_opal_dev from scsi_disk_release as the opal_dev field is
> access

accessed?

> from the ioctl handler, which isn't synchronized vs sd_release and thus
> can be accesses during or after sd_release was called.

accessed?

Otherwise fine.

Acked-by: Martin K. Petersen <martin.petersen@oracle.com>

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 07/14] sd: make use of ->free_disk to simplify refcounting
  2022-03-04 16:03 ` [PATCH 07/14] sd: make use of ->free_disk to simplify refcounting Christoph Hellwig
  2022-03-06  2:03   ` Bart Van Assche
  2022-03-06  3:54   ` Ming Lei
@ 2022-03-08  3:29   ` Martin K. Petersen
  2 siblings, 0 replies; 51+ messages in thread
From: Martin K. Petersen @ 2022-03-08  3:29 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Martin K. Petersen, Ming Lei, Bart Van Assche,
	linux-block, linux-scsi


Christoph,

> Implement the ->free_disk method to to put struct scsi_disk when the
> last gendisk reference count goes away.  This removes the need to
> clear ->private_data and thus freeze the queue on unbind.

Looks fine.

Acked-by: Martin K. Petersen <martin.petersen@oracle.com>

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 08/14] sr: implement ->free_disk
  2022-03-04 16:03 ` [PATCH 08/14] sr: implement ->free_disk Christoph Hellwig
  2022-03-06  4:01   ` Ming Lei
  2022-03-06 21:44   ` Bart Van Assche
@ 2022-03-08  3:31   ` Martin K. Petersen
  2 siblings, 0 replies; 51+ messages in thread
From: Martin K. Petersen @ 2022-03-08  3:31 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Martin K. Petersen, Ming Lei, Bart Van Assche,
	linux-block, linux-scsi


Christoph,

> Simplify the refcounting and remove the need to clear
> disk->private_data by implementing the ->free_disk method.

Acked-by: Martin K. Petersen <martin.petersen@oracle.com>

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH 13/14] block: do more work in elevator_exit
  2022-03-08  5:51 move more work to disk_release v4 Christoph Hellwig
@ 2022-03-08  5:51 ` Christoph Hellwig
  0 siblings, 0 replies; 51+ messages in thread
From: Christoph Hellwig @ 2022-03-08  5:51 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Martin K. Petersen, Ming Lei, Bart Van Assche, linux-block, linux-scsi

Move the calls to ioc_clear_queue and blk_mq_sched_free_rqs into
elevator_exit.  Except for one call where we know we can't have io_cq
structures yet these always go together, and that extra call in an
error path is harmless.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/elevator.c | 6 +++---
 block/genhd.c    | 3 ---
 2 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/block/elevator.c b/block/elevator.c
index a842e4b8ebc66..20a4e7c7c7745 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -192,6 +192,9 @@ void elevator_exit(struct request_queue *q)
 {
 	struct elevator_queue *e = q->elevator;
 
+	ioc_clear_queue(q);
+	blk_mq_sched_free_rqs(q);
+
 	mutex_lock(&e->sysfs_lock);
 	blk_mq_exit_sched(q, e);
 	mutex_unlock(&e->sysfs_lock);
@@ -596,8 +599,6 @@ int elevator_switch_mq(struct request_queue *q,
 
 	if (q->elevator) {
 		elv_unregister_queue(q);
-		ioc_clear_queue(q);
-		blk_mq_sched_free_rqs(q);
 		elevator_exit(q);
 	}
 
@@ -608,7 +609,6 @@ int elevator_switch_mq(struct request_queue *q,
 	if (new_e) {
 		ret = elv_register_queue(q, true);
 		if (ret) {
-			blk_mq_sched_free_rqs(q);
 			elevator_exit(q);
 			goto out;
 		}
diff --git a/block/genhd.c b/block/genhd.c
index 73705a749ea92..857e0a54da7dd 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1115,10 +1115,7 @@ static void disk_release_mq(struct request_queue *q)
 	 * cgroup controller.
 	 */
 	if (q->elevator) {
-		ioc_clear_queue(q);
-
 		mutex_lock(&q->sysfs_lock);
-		blk_mq_sched_free_rqs(q);
 		elevator_exit(q);
 		mutex_unlock(&q->sysfs_lock);
 	}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 13/14] block: do more work in elevator_exit
  2022-02-27 17:21 move more work to disk_release v2 Christoph Hellwig
@ 2022-02-27 17:21 ` Christoph Hellwig
  0 siblings, 0 replies; 51+ messages in thread
From: Christoph Hellwig @ 2022-02-27 17:21 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Martin K. Petersen, Ming Lei, linux-block, linux-scsi

Move the calls to ioc_clear_queue and blk_mq_sched_free_rqs into
elevator_exit.  Except for one call where we know we can't have io_cq
structures yet these always go together, and that extra call in an
error path is harmless.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/elevator.c | 7 +++----
 block/genhd.c    | 3 ---
 2 files changed, 3 insertions(+), 7 deletions(-)

diff --git a/block/elevator.c b/block/elevator.c
index 6847ab6e7aa50..4664cae50da86 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -192,6 +192,9 @@ void elevator_exit(struct request_queue *q)
 {
 	struct elevator_queue *e = q->elevator;
 
+	ioc_clear_queue(q);
+	blk_mq_sched_free_rqs(q);
+
 	mutex_lock(&e->sysfs_lock);
 	blk_mq_exit_sched(q, e);
 	mutex_unlock(&e->sysfs_lock);
@@ -595,9 +598,6 @@ int elevator_switch_mq(struct request_queue *q,
 	if (q->elevator) {
 		if (q->elevator->registered)
 			elv_unregister_queue(q);
-
-		ioc_clear_queue(q);
-		blk_mq_sched_free_rqs(q);
 		elevator_exit(q);
 	}
 
@@ -608,7 +608,6 @@ int elevator_switch_mq(struct request_queue *q,
 	if (new_e) {
 		ret = elv_register_queue(q, true);
 		if (ret) {
-			blk_mq_sched_free_rqs(q);
 			elevator_exit(q);
 			goto out;
 		}
diff --git a/block/genhd.c b/block/genhd.c
index a92641911bc1b..5368ec88e485f 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1117,10 +1117,7 @@ static void disk_release_mq(struct request_queue *q)
 	 * cgroup controller.
 	 */
 	if (q->elevator) {
-		ioc_clear_queue(q);
-
 		mutex_lock(&q->sysfs_lock);
-		blk_mq_sched_free_rqs(q);
 		elevator_exit(q);
 		mutex_unlock(&q->sysfs_lock);
 	}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 51+ messages in thread

end of thread, other threads:[~2022-03-08  5:53 UTC | newest]

Thread overview: 51+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-04 16:03 move more work to disk_release v3 Christoph Hellwig
2022-03-04 16:03 ` [PATCH 01/14] blk-mq: do not include passthrough requests in I/O accounting Christoph Hellwig
2022-03-06  1:02   ` Bart Van Assche
2022-03-07  3:12   ` Chaitanya Kulkarni
2022-03-08  3:13   ` Martin K. Petersen
2022-03-04 16:03 ` [PATCH 02/14] blk-mq: handle already freed tags gracefully in blk_mq_free_rqs Christoph Hellwig
2022-03-06  1:05   ` Bart Van Assche
2022-03-08  3:14   ` Martin K. Petersen
2022-03-04 16:03 ` [PATCH 03/14] scsi: don't use disk->private_data to find the scsi_driver Christoph Hellwig
2022-03-06  1:27   ` Bart Van Assche
2022-03-06  3:09   ` Ming Lei
2022-03-07  3:14   ` Chaitanya Kulkarni
2022-03-08  3:16   ` Martin K. Petersen
2022-03-04 16:03 ` [PATCH 04/14] sd: rename the scsi_disk.dev field Christoph Hellwig
2022-03-06  1:38   ` Bart Van Assche
2022-03-06  8:40     ` Christoph Hellwig
2022-03-06 20:34       ` Bart Van Assche
2022-03-06  3:31   ` Ming Lei
2022-03-06  8:43     ` Christoph Hellwig
2022-03-07  3:16   ` Chaitanya Kulkarni
2022-03-08  3:25   ` Martin K. Petersen
2022-03-04 16:03 ` [PATCH 05/14] sd: call sd_zbc_release_disk before releasing the scsi_device reference Christoph Hellwig
2022-03-06  1:44   ` Bart Van Assche
2022-03-06  3:33   ` Ming Lei
2022-03-08  3:25   ` Martin K. Petersen
2022-03-04 16:03 ` [PATCH 06/14] sd: delay calling free_opal_dev Christoph Hellwig
2022-03-06  1:45   ` Bart Van Assche
2022-03-06  3:41   ` Ming Lei
2022-03-07  3:17   ` Chaitanya Kulkarni
2022-03-08  3:26   ` Martin K. Petersen
2022-03-04 16:03 ` [PATCH 07/14] sd: make use of ->free_disk to simplify refcounting Christoph Hellwig
2022-03-06  2:03   ` Bart Van Assche
2022-03-06  8:46     ` Christoph Hellwig
2022-03-06  3:54   ` Ming Lei
2022-03-08  3:29   ` Martin K. Petersen
2022-03-04 16:03 ` [PATCH 08/14] sr: implement ->free_disk Christoph Hellwig
2022-03-06  4:01   ` Ming Lei
2022-03-06 21:44   ` Bart Van Assche
2022-03-08  3:31   ` Martin K. Petersen
2022-03-04 16:03 ` [PATCH 09/14] block: move blkcg initialization/destroy into disk allocation/release handler Christoph Hellwig
2022-03-07  3:18   ` Chaitanya Kulkarni
2022-03-04 16:03 ` [PATCH 10/14] block: don't remove hctx debugfs dir from blk_mq_exit_queue Christoph Hellwig
2022-03-04 16:03 ` [PATCH 11/14] block: move q_usage_counter release into blk_queue_release Christoph Hellwig
2022-03-04 16:03 ` [PATCH 12/14] block: move blk_exit_queue into disk_release Christoph Hellwig
2022-03-06 21:21   ` Bart Van Assche
2022-03-04 16:03 ` [PATCH 13/14] block: do more work in elevator_exit Christoph Hellwig
2022-03-04 16:03 ` [PATCH 14/14] block: move rq_qos_exit() into disk_release() Christoph Hellwig
2022-03-06 20:51   ` Bart Van Assche
2022-03-07  2:50     ` Ming Lei
  -- strict thread matches above, loose matches on Subject: below --
2022-03-08  5:51 move more work to disk_release v4 Christoph Hellwig
2022-03-08  5:51 ` [PATCH 13/14] block: do more work in elevator_exit Christoph Hellwig
2022-02-27 17:21 move more work to disk_release v2 Christoph Hellwig
2022-02-27 17:21 ` [PATCH 13/14] block: do more work in elevator_exit Christoph Hellwig

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.